Tag Archive for: Perform


It is often necessary to return only the values that are common to R data structures such as vectors, lists, and dataframes.

In this article, we will discuss how to perform the intersect() operations in vector, list and dataframe.

In a Vector

The intersect() method is used to return the common values from the two vectors.

There are three ways to use the intersect() method:

1. We can directly use the intersect() method using the following command:
Syntax:

intersect(vector_object1,vector_object2)

2. Alternatively, we can load the dplyr library, which also provides an intersect() method.
Syntax:

library(dplyr)
intersect(vector_object1,vector_object2)

It takes two vectors as the parameters.

3. Use the intersect() method with Reduce.

In this case, we can perform the intersect operation on multiple vectors.
Reduce takes two parameters. The first parameter is the intersect method and the second parameter takes the multiple vectors inside the list() function.

Syntax:
Reduce(intersect,list(vector_object1,vector_object2,……….))

Parameters:

  1. intersect is the method to perform intersection.
  2. vector_object represents the vector.
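The folding that Reduce performs can be sketched in Python for readers who find that easier to trace. This is only an illustration under stated assumptions: the intersect helper below is a hypothetical stand-in that mimics R's behaviour (common values, each kept once, in the order of the first argument), and functools.reduce plays the role of Reduce.

```python
from functools import reduce

def intersect(a, b):
    """Hypothetical stand-in for R's intersect(): common values, each kept once."""
    seen = set()
    common = []
    for value in a:
        if value in b and value not in seen:
            seen.add(value)
            common.append(value)
    return common

apple = [23, 43, 45, 43, 34]
mango = [23, 43, 67, 43, 56]
papaya = [45, 43, 34]
peas = [23, 43]

# reduce applies intersect pairwise, left to right:
# intersect(intersect(intersect(apple, mango), papaya), peas)
common_all = reduce(intersect, [apple, mango, papaya, peas])
print(common_all)
```

Each step narrows the running result, so the final value contains only what appears in every input.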

Example 1:

In this example, we perform an intersect() operation directly on two vectors.

#create apple price vector
apple=c(23,43,45,43,34)

#create mango price vector
mango=c(23,43,67,43,56)

#display
print(apple)
print(mango)

#do the intersection operation on apple and mango vectors
print("Intersection of apple and mango:")

print(intersect(apple,mango))

Result:

In both vectors, 23 and 43 are common. Although 43 occurs twice in each vector, intersect() returns each common value only once.

Example 2:

In this example, we perform an intersect() operation from the dplyr library on two vectors.

library(dplyr)

#create apple price vector
apple=c(23,43,45,43,34)

#create mango price vector
mango=c(23,43,67,43,56)

#display
print(apple)
print(mango)

#do the intersection operation on apple and mango vectors
print("Intersection of apple and mango:")

print(intersect(apple,mango))

Result:

In both vectors, 23 and 43 are common.

Example 3:

In this example, we perform an intersect() operation using Reduce on four vectors.

#create apple price vector
apple=c(23,43,45,43,34)
 
#create mango price vector
mango=c(23,43,67,43,56)

#create papaya price vector
papaya=c(45,43,34)
 
#create peas price vector
peas=c(23,43)
 
 
#display
print(apple)
print(mango)
print(papaya)
print(peas)
 
#do the intersection operation on four vectors
print("Intersection of apple,mango,papaya,peas:")
 
print(Reduce(intersect,list(apple,mango,papaya,peas)))

Result:

In the four vectors, 43 is common.

In a List

The intersect() method is used to return the common values from two lists.

There are three ways to use the intersect() method:

1. We can directly use intersect() method using the following command:
Syntax:

intersect(list_object1,list_object2)

2. We need to load the dplyr library that supports the intersect() method.
Syntax:

library(dplyr)
intersect(list_object1,list_object2)

It takes two lists as parameters.

3. Use the intersect() method with Reduce.

In this case, we can perform the intersect operation on multiple lists.
Reduce takes two parameters. The first parameter is the intersect method and the second parameter takes the multiple lists inside the list() function.

Syntax:

Reduce(intersect,list(list_object1,list_object2,……….))

Parameters:

  1. intersect is the method to perform intersection
  2. list_object represents the list

Example 1:

In this example, we perform an intersect() operation directly on two lists.

#create apple price list
apple=list(23,43,45,43,34)
 
#create mango price list
mango=list(23,43,67,43,56)
 
 
#do the intersection operation on apple and mango list
print("Intersection of apple and mango:")
 
print(intersect(apple,mango))

Result:

In both lists, 23 and 43 are common.

Example 2:

In this example, we perform an intersect() operation from the dplyr library on two lists.

library(dplyr)

#create apple price list
apple=list(23,43,45,43,34)
 
#create mango price list
mango=list(23,43,67,43,56)
 
 
#do the intersection operation on apple and mango list
print("Intersection of apple and mango:")
 
print(intersect(apple,mango))

Result:

In both lists, 23 and 43 are common.

Example 3:

In this example, we perform an intersect() operation using Reduce on four lists.

#create apple price list
apple=list(23,43,45,43,34)
 
#create mango price list
mango=list(23,43,67,43,56)

#create papaya price list
papaya=list(45,43,34)
 
#create peas price list
peas=list(23,43)
 
 
#do the intersection operation on four lists
print("Intersection of apple,mango,papaya,peas:")
 
print(Reduce(intersect,list(apple,mango,papaya,peas)))

Result:

In the four lists, 43 is common.

In a Dataframe

The intersect() method is used to return the common values from the two dataframes.

There are three ways to use the intersect() method:

1. We can directly use intersect() method using the following command:
Syntax:

intersect(dataframe_object1,dataframe_object2)

2. We need to load the dplyr library that supports the intersect() method.
Syntax:

library(dplyr)
intersect(dataframe_object1,dataframe_object2)

It takes two dataframes as parameters.

3. Use the intersect() method with Reduce.

In this case, we can perform the intersection operation on multiple dataframes.

Reduce takes two parameters. The first parameter is the intersect method and the second parameter takes the multiple dataframes inside the list() function.

Syntax:

Reduce(intersect,list(dataframe_object1,dataframe_object2,……….))

Parameters:

  1. intersect is the method to perform intersection
  2. dataframe_object represents the dataframe

Example 1:

In this example, we perform an intersect() operation directly on two dataframes.

#create a dataframe-market1 that has 3 rows and 5 columns.
market1=data.frame(market_id=c(1,2,3), market_name=c('M1','M2','M4'), market_place=c('India','USA','India'), market_type=c('bar','grocery','restaurent'), market_squarefeet=c(120,342,220))

#create a dataframe-market2 that has 4 rows and 5 columns.
market2=data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('bar','bar','grocery','restaurent'), market_squarefeet=c(120,342,220,110))

#perform intersection on market1 and market2
print("intersection on market1 and market2")
print(intersect(market1,market2))

Result:

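In both dataframes, only the 1st row is common. Note that this row-wise comparison is what dplyr's intersect() does; base R's intersect() on its own treats a dataframe as a list of columns, so load dplyr if the call does not return the common rows.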

Example 2:

In this example, we perform an intersect() operation from the dplyr library on two dataframes.

library(dplyr)
 
#create a dataframe-market1 that has 3 rows and 5 columns.
market1=data.frame(market_id=c(1,2,3), market_name=c('M1','M2','M4'), market_place=c('India','USA','India'), market_type=c('bar','grocery','restaurent'), market_squarefeet=c(120,342,220))

#create a dataframe-market2 that has 4 rows and 5 columns.
market2=data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('bar','bar','grocery','restaurent'), market_squarefeet=c(120,342,220,110))

#perform intersection on market1 and market2
print("intersection on market1 and market2")
print(intersect(market1,market2))

Result:

In both dataframes, only the 1st row is common.

Example 3:

In this example, we perform an intersect() operation using Reduce on three dataframes.

#create a dataframe-market1 that has 3 rows and 5 columns.
market1=data.frame(market_id=c(1,2,3), market_name=c('M1','M2','M4'), market_place=c('India','USA','India'), market_type=c('bar','grocery','restaurent'), market_squarefeet=c(120,342,220))

#create a dataframe-market2 that has 4 rows and 5 columns.
market2=data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('bar','bar','grocery','restaurent'), market_squarefeet=c(120,342,220,110))

#create a dataframe-market3 that has 4 rows and 5 columns.
market3=data.frame(market_id=c(1,2,3,4), market_name=c('M1','M2','M3','M4'), market_place=c('India','USA','India','Australia'), market_type=c('bar','bar','grocery','restaurent'), market_squarefeet=c(120,342,220,110))

#perform intersection on market1, market2 and market3
print("intersection on market1,market2 and market3")
print(Reduce(intersect,list(market1,market2,market3)))

Result:

In the three dataframes, only the 1st row is common.

Conclusion

In this R tutorial, we saw the different ways to perform the intersect() operation on vectors, lists, and dataframes. If you want to perform the intersection on more than two objects, you can use the Reduce() function.





PyTorch is an open-source framework for the Python programming language. We can process the data in PyTorch in the form of a Tensor.

A tensor is a multidimensional array that is used to store data. To use a tensor, we have to import the torch module.

To create a tensor, use the torch.tensor() function.

Syntax:
torch.tensor(data)

where data is a multi-dimensional array.

torch.asin()

torch.asin() in PyTorch returns the inverse sine, in radians, of every element in a tensor. It takes the input tensor as its only required parameter. Elements outside the domain [-1, 1] return nan.

Syntax:
torch.asin(tensor_object)

Parameter:
tensor_object is the input tensor
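Because the inverse sine is only defined on [-1, 1], it helps to see which inputs produce a value and which produce nan. The following is a plain-Python sketch of that element-wise behaviour using the standard math module rather than PyTorch; safe_asin is a hypothetical helper name introduced only for illustration.

```python
import math

def safe_asin(x):
    """Return asin(x) in radians, or nan when x lies outside [-1, 1]."""
    if -1.0 <= x <= 1.0:
        return math.asin(x)
    return math.nan  # mirrors the nan that an out-of-domain element yields

values = [23, 45, 67, 10, 0]
results = [safe_asin(v) for v in values]
print(results)
```

Only the last element, 0, is inside the domain, so it is the only one that yields a number.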

Example 1:

Let’s create a one-dimensional tensor – data1 and return inverse sine values by applying torch.asin() on it.

#import torch module
import torch
 
#create a 1D tensor – data1 with 5 numeric values.
data1 = torch.tensor([23,45,67,10,0])
 
#display
print("Tensor: ", data1)

#perform asin() on above tensor
print("Inverse sine values: ", torch.asin(data1))

Output:

Tensor:  tensor([23, 45, 67, 10,  0])
Inverse sine values:  tensor([nan, nan, nan, nan, 0.])

Only 0 lies within the domain [-1, 1] of the inverse sine, so its result is 0. The other elements are outside the domain and return nan.

Example 2:

Let’s create a two-dimensional tensor – data1 and return inverse sine values by applying torch.asin() on it.

#import torch module
import torch
 
#create a 2D tensor – data1 with 5 numeric values in each row.
data1 = torch.tensor([[23,45,67,10,0],[65,78,90,120,180]])
 
#display
print("Tensor: ", data1)

#perform asin() on above tensor
print("Inverse sine values: ", torch.asin(data1))

Output:

Tensor:  tensor([[ 23,  45,  67,  10,   0],
        [ 65,  78,  90, 120, 180]])
Inverse sine values:  tensor([[nan, nan, nan, nan, 0.],
        [nan, nan, nan, nan, nan]])

Again, only the element 0 returns a value. Every other element is outside [-1, 1] and returns nan.

torch.acos()

torch.acos() in PyTorch returns the inverse cosine, in radians, of every element in a tensor. It takes the input tensor as its only required parameter. Elements outside the domain [-1, 1] return nan.

Syntax:
torch.acos(tensor_object)

Parameter:
tensor_object is the input tensor

Example 1:

Let’s create a one-dimensional tensor – data1 and return inverse cosine values by applying torch.acos() on it.

#import torch module
import torch
 
#create a 1D tensor – data1 with 5 numeric values.
data1 = torch.tensor([23,45,67,10,0])
 
#display
print("Tensor: ", data1)

#perform acos() on above tensor
print("Inverse cosine values: ", torch.acos(data1))

Output:

Tensor:  tensor([23, 45, 67, 10,  0])
Inverse cosine values:  tensor([   nan,    nan,    nan,    nan, 1.5708])

Only 0 lies within the domain [-1, 1], and its inverse cosine is 1.5708 (π/2). The other elements return nan.

Example 2:

Let’s create a two-dimensional tensor – data1 and return inverse cosine values by applying torch.acos() on it.

#import torch module
import torch
 
#create a 2D tensor – data1 with 5 numeric values in each row.
data1 = torch.tensor([[23,45,67,10,0],[65,78,90,120,180]])
 
#display
print("Tensor: ", data1)

#perform acos() on above tensor
print("Inverse cosine values: ", torch.acos(data1))

Output:

Tensor:  tensor([[ 23,  45,  67,  10,   0],
        [ 65,  78,  90, 120, 180]])
Inverse cosine values:  tensor([[   nan,    nan,    nan,    nan, 1.5708],
        [   nan,    nan,    nan,    nan,    nan]])

Again, only the element 0 returns a value (1.5708). Every other element is outside [-1, 1] and returns nan.

torch.atan()

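torch.atan() in PyTorch returns the inverse tangent, in radians, of every element in a tensor. It takes the input tensor as its only required parameter. The inverse tangent is defined for all real numbers, so no element returns nan.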

Syntax:
torch.atan(tensor_object)

Parameter:
tensor_object is the input tensor

Example 1:

Let’s create a one-dimensional tensor – data1 and return inverse tangent values by applying torch.atan() on it.

#import torch module
import torch
 
#create a 1D tensor – data1 with 5 numeric values.
data1 = torch.tensor([23,45,67,10,0])
 
#display
print("Tensor: ", data1)

#perform atan() on above tensor
print("Inverse tangent values: ", torch.atan(data1))

Output:

Tensor:  tensor([23, 45, 67, 10,  0])
Inverse tangent values:  tensor([1.5273, 1.5486, 1.5559, 1.4711, 0.0000])

We can see that inverse tangent values were returned.

Example 2:

Let’s create a two-dimensional tensor – data1 and return inverse tangent values by applying torch.atan() on it.

#import torch module
import torch
 
#create a 2D tensor – data1 with 5 numeric values in each row.
data1 = torch.tensor([[23,45,67,10,0],[65,78,90,120,180]])
 
#display
print("Tensor: ", data1)

#perform atan() on above tensor
print("Inverse Tangent  values: ", torch.atan(data1))

Output:

Tensor:  tensor([[ 23,  45,  67,  10,   0],
        [ 65,  78,  90, 120, 180]])
Inverse Tangent  values:  tensor([[1.5273, 1.5486, 1.5559, 1.4711, 0.0000],
        [1.5554, 1.5580, 1.5597, 1.5625, 1.5652]])

We can see that inverse tangent values were returned.

torch.asinh()

torch.asinh() in PyTorch returns the inverse hyperbolic sine of every element in a tensor. It takes the input tensor as its only required parameter. The function is defined for all real numbers.

Syntax:
torch.asinh(tensor_object)

Parameter:
tensor_object is the input tensor

Example 1:

Let’s create a one-dimensional tensor – data1 and return inverse hyperbolic sine values by applying torch.asinh() on it.

#import torch module
import torch
 
#create a 1D tensor – data1 with 5 numeric values.
data1 = torch.tensor([0,1,45,10,23])
 
#display
print("Tensor: ", data1)

#perform asinh() on above tensor
print("Inverse hyperbolic sine values: ", torch.asinh(data1))

Output:

Tensor:  tensor([ 0,  1, 45, 10, 23])
Inverse hyperbolic sine values:  tensor([0.0000, 0.8814, 4.4999, 2.9982, 3.8291])

We can see that inverse hyperbolic sine values were returned.

Example 2:

Let’s create a two-dimensional tensor – data1 and return inverse hyperbolic sine values by applying torch.asinh() on it.

#import torch module
import torch
 
#create a 2D tensor – data1 with 5 numeric values in each row.
data1 = torch.tensor([[23,45,67,10,0],[65,78,90,120,180]])
 
#display
print("Tensor: ", data1)

#perform asinh() on above tensor
print("Inverse hyperbolic sine values: ", torch.asinh(data1))

Output:

Tensor:  tensor([[ 23,  45,  67,  10,   0],
        [ 65,  78,  90, 120, 180]])
Inverse hyperbolic sine values:  tensor([[3.8291, 4.4999, 4.8979, 2.9982, 0.0000],
        [4.8676, 5.0499, 5.1930, 5.4807, 5.8861]])

We can see that inverse hyperbolic sine values were returned.

torch.acosh()

torch.acosh() in PyTorch returns the inverse hyperbolic cosine of every element in a tensor. It takes the input tensor as its only required parameter. Elements smaller than 1 are outside the domain and return nan.

Syntax:
torch.acosh(tensor_object)

Parameter:
tensor_object is the input tensor
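The domain restriction follows from the closed form acosh(x) = ln(x + sqrt(x² − 1)), where the square root is only real for x ≥ 1. The plain-Python sketch below evaluates that formula with the standard math module; acosh_or_nan is a hypothetical helper name, and the code is an illustration rather than how PyTorch computes the result internally.

```python
import math

def acosh_or_nan(x):
    """Evaluate ln(x + sqrt(x^2 - 1)), returning nan when x < 1."""
    if x < 1:
        return math.nan  # sqrt(x^2 - 1) would not be real here
    return math.log(x + math.sqrt(x * x - 1))

values = [23, 45, 67, 10, 0]
results = [acosh_or_nan(v) for v in values]
print(results)
```

The formula agrees with math.acosh for the in-domain values, and the element 0 is the only one that falls outside the domain.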

Example 1:

Let’s create a one-dimensional tensor – data1 and return inverse hyperbolic cosine values by applying torch.acosh() on it.

#import torch module
import torch
 
#create a 1D tensor – data1 with 5 numeric values.
data1 = torch.tensor([23,45,67,10,0])
 
#display
print("Tensor: ", data1)

#perform acosh() on above tensor
print("Inverse hyperbolic cosine values: ", torch.acosh(data1))

Output:

Tensor:  tensor([23, 45, 67, 10,  0])
Inverse hyperbolic cosine values:  tensor([3.8282, 4.4997, 4.8978, 2.9932,    nan])

The inverse hyperbolic cosine is defined only for values of 1 or greater, so the element 0 returns nan while the other elements return valid results.

Example 2:

Let’s create a two-dimensional tensor – data1 and return inverse hyperbolic cosine values by applying torch.acosh() on it.

#import torch module
import torch
 
#create a 2D tensor – data1 with 5 numeric values in each row.
data1 = torch.tensor([[23,45,67,10,0],[65,78,90,120,180]])
 
#display
print("Tensor: ", data1)

#perform acosh() on above tensor
print("Inverse hyperbolic cosine values: ", torch.acosh(data1))

Output:

Tensor:  tensor([[ 23,  45,  67,  10,   0],
        [ 65,  78,  90, 120, 180]])
Inverse hyperbolic cosine values:  tensor([[3.8282, 4.4997, 4.8978, 2.9932,    nan],
        [4.8675, 5.0498, 5.1929, 5.4806, 5.8861]])

Again, the element 0 returns nan because it is smaller than 1, while all other elements return valid results.

torch.atanh()

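torch.atanh() in PyTorch returns the inverse hyperbolic tangent of every element in a tensor. It takes the input tensor as its only required parameter. The domain is the open interval (-1, 1); elements with an absolute value greater than 1 return nan.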

Syntax:
torch.atanh(tensor_object)

Parameter:
tensor_object is the input tensor
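To see results other than nan, the inputs need to lie strictly between -1 and 1. The plain-Python sketch below uses the standard math module to mimic the element-wise behaviour and to check it against the closed form atanh(x) = 0.5 · ln((1 + x) / (1 − x)). The helper name atanh_or_nan and the sample inputs are assumptions made for illustration.

```python
import math

def atanh_or_nan(x):
    """Inverse hyperbolic tangent: finite on (-1, 1), infinite at ±1, nan beyond."""
    if -1.0 < x < 1.0:
        return math.atanh(x)
    if abs(x) == 1.0:
        return math.copysign(math.inf, x)  # the function diverges at the endpoints
    return math.nan

in_domain = 0.5
by_formula = 0.5 * math.log((1 + in_domain) / (1 - in_domain))

results = [atanh_or_nan(v) for v in [23, 0, 0.5, 1, -1]]
print(results)
```

Values such as 23 fall outside the domain, which is why the examples in this section return nan for every element except 0.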

Example 1:

Let’s create a one-dimensional tensor – data1 and return inverse hyperbolic tangent values by applying torch.atanh() on it.

#import torch module
import torch
 
#create a 1D tensor – data1 with 5 numeric values.
data1 = torch.tensor([23,45,67,10,0])
 
#display
print("Tensor: ", data1)

#perform atanh() on above tensor
print("Inverse hyperbolic tangent values: ", torch.atanh(data1))

Output:

Tensor:  tensor([23, 45, 67, 10,  0])
Inverse hyperbolic tangent values:  tensor([nan, nan, nan, nan, 0.])

Only 0 lies within the domain (-1, 1) of the inverse hyperbolic tangent, so its result is 0. The other elements are outside the domain and return nan.

Example 2:

Let’s create a two-dimensional tensor – data1 and return inverse hyperbolic tangent values by applying torch.atanh() on it.

#import torch module
import torch
 
#create a 2D tensor – data1 with 5 numeric values in each row.
data1 = torch.tensor([[23,45,67,10,0],[65,78,90,120,180]])
 
#display
print("Tensor: ", data1)

#perform atanh() on above tensor
print("Inverse hyperbolic tangent values: ", torch.atanh(data1))

Output:

Tensor:  tensor([[ 23,  45,  67,  10,   0],
        [ 65,  78,  90, 120, 180]])
Inverse hyperbolic tangent values:  tensor([[nan, nan, nan, nan, 0.],
        [nan, nan, nan, nan, nan]])

Again, only the element 0 returns a value. Every other element is outside (-1, 1) and returns nan.

Conclusion

In this PyTorch lesson, we saw how to compute inverse trigonometric functions in PyTorch. We discussed three inverse trigonometric functions: asin(), acos(), and atan(). If you need the inverse hyperbolic functions, you can use asinh(), acosh(), and atanh().





In this R tutorial, we will see how to perform the aggregation operations by grouping the data and returning the median in the grouped rows.

This operation has to be performed on a dataframe. Let’s create the dataframe with seven rows and five columns.

#create a dataframe-market that has 7 rows and 5 columns.
market=data.frame(
  market_id=c(1,2,1,4,3,4,5),
  market_name=c('M1','M2','M3','M4','M3','M4','M3'),
  market_place=c('India','USA','India','Australia','USA','India','Australia'),
  market_type=c('grocery','bar','grocery','restaurent','grocery','bar','grocery'),
  market_squarefeet=c(120,342,220,110,342,220,110))

#display the market dataframe
print(market)

Result

Now, we will return the median in a column by grouping the similar values in another column.

Method 1: aggregate()

Here, we use the aggregate() function that takes three parameters.

Syntax

aggregate(dataframe_object$grouped, list(dataframe_object$grouping), FUN=median)

Parameters

  1. The first parameter is the column to summarize (grouped); the median is computed over its values within each group.
  2. The second parameter is a list of one or more grouping columns (grouping); rows that share the same values in these columns form a group.
  3. The third parameter, FUN, is the function applied to each group; here it is median.
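The grouping logic behind aggregate() can be traced with a short sketch. It is written in Python with only the standard library, purely to make the steps explicit; it is not how R implements aggregate(). The two lists copy the market_place and market_squarefeet columns from the dataframe above.

```python
from collections import defaultdict
from statistics import median

market_place = ["India", "USA", "India", "Australia", "USA", "India", "Australia"]
market_squarefeet = [120, 342, 220, 110, 342, 220, 110]

# Step 1: collect the values of the summarized column under each group key
groups = defaultdict(list)
for place, area in zip(market_place, market_squarefeet):
    groups[place].append(area)

# Step 2: apply the summary function (here the median) to each group
medians = {place: median(areas) for place, areas in sorted(groups.items())}
print(medians)
```

Swapping median for another function, such as sum, changes only the second step, which is the role FUN plays in aggregate().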

Example 1
In this example, we group the rows by the market_place column and get the median of the market_squarefeet column for each group.

#create a dataframe-market that has 7 rows and 5 columns.
market=data.frame(
  market_id=c(1,2,1,4,3,4,5),
  market_name=c('M1','M2','M3','M4','M3','M4','M3'),
  market_place=c('India','USA','India','Australia','USA','India','Australia'),
  market_type=c('grocery','bar','grocery','restaurent','grocery','bar','grocery'),
  market_squarefeet=c(120,342,220,110,342,220,110))
 
#get the median of square feet in group by grouping market_place
print(aggregate(market$market_squarefeet, list(market$market_place), FUN=median))

Result

The rows are grouped by the values in the market_place column (Australia, India, and USA), and the median of the market_squarefeet column is returned for each group: 110, 220, and 342, respectively.

Example 2
In this example, we group the values in the market_type column and get the median in the market_squarefeet column grouped by the market_type column.

#create a dataframe-market that has 7 rows and 5 columns.
market=data.frame(
  market_id=c(1,2,1,4,3,4,5),
  market_name=c('M1','M2','M3','M4','M3','M4','M3'),
  market_place=c('India','USA','India','Australia','USA','India','Australia'),
  market_type=c('grocery','bar','grocery','restaurent','grocery','bar','grocery'),
  market_squarefeet=c(120,342,220,110,342,220,110))
 
#get the median of square feet in group by grouping market_type
print(aggregate(market$market_squarefeet, list(market$market_type), FUN=median))

Result

The rows are grouped by the values in the market_type column (bar, grocery, and restaurent), and the median of the market_squarefeet column is returned for each group: 281, 170, and 110, respectively.

Example 3
In this example, we group the values in the market_type and market_place columns and get the median in the market_squarefeet column grouped by the market_type and market_place columns.

#create a dataframe-market that has 7 rows and 5 columns.
market=data.frame(
  market_id=c(1,2,1,4,3,4,5),
  market_name=c('M1','M2','M3','M4','M3','M4','M3'),
  market_place=c('India','USA','India','Australia','USA','India','Australia'),
  market_type=c('grocery','bar','grocery','restaurent','grocery','bar','grocery'),
  market_squarefeet=c(120,342,220,110,342,220,110))
 
#get the median of square feet in group by grouping market_place and market_type
print(aggregate(market$market_squarefeet, list(market$market_place,market$market_type), FUN=median))

Result

The rows are grouped by each combination of market_place and market_type, and the median of the market_squarefeet column is returned for each combination.

Method 2: dplyr

Here, we use the group_by() function together with the summarise_at() function, both available in the dplyr library, to compute the median per group.

Syntax

dataframe_object%>% group_by(grouping) %>% summarise_at(vars(grouped), list(name = median))

Where:

  1. group_by() takes one parameter, i.e. the grouping column.
  2. summarise_at() takes two parameters:
     1. The first parameter takes the variable column (grouped), wrapped in vars(), for which the median is returned per group.
     2. The second parameter takes the median function through a named list; the list name (name) becomes the name of the result column.

The %>% operator passes the dataframe to group_by() and then passes the grouped result to summarise_at(), which computes the median for each group.

It returns a tibble.

Example 1
In this example, we group the values in the market_place column and get the median in the market_squarefeet column grouped by the market_place column.

library("dplyr")
 
#get the median  of square feet in group by grouping market_place
print(market %>% group_by(market_place) %>% summarise_at(vars(market_squarefeet), list(name = median)))

Result

The rows are grouped by the values in the market_place column (Australia, India, and USA), and the median of the market_squarefeet column is returned for each group: 110, 220, and 342, respectively.

Example 2
In this example, we group the rows by the market_type column and get the median of the market_squarefeet column for each group.

library("dplyr")
 
#get the median  of square feet in group by grouping market_type
print(market %>% group_by(market_type) %>% summarise_at(vars(market_squarefeet), list(name = median)))

Result

The rows are grouped by the values in the market_type column (bar, grocery, and restaurent), and the median of the market_squarefeet column is returned for each group: 281, 170, and 110, respectively.

Conclusion

It is possible to group the rows by one or more columns and return the median of a numeric column for each group using the aggregate() function. Similarly, we can use the group_by() function with the summarise_at() function to group the similar values in a column and return the median of another column for each group.





In this R tutorial, we will see how to perform the aggregation operations by grouping the data and returning the total sum for the grouped rows.

This operation has to be performed on a dataframe. Let’s create the dataframe with seven rows and five columns.

#create a dataframe-market that has 7 rows and 5 columns.
market=data.frame(
  market_id=c(1,2,1,4,3,4,5),
  market_name=c('M1','M2','M3','M4','M3','M4','M3'),
  market_place=c('India','USA','India','Australia','USA','India','Australia'),
  market_type=c('grocery','bar','grocery','restaurent','grocery','bar','grocery'),
  market_squarefeet=c(120,342,220,110,342,220,110))

#display the market dataframe
print(market)

Result

Now, we will return the total sum of a column by grouping the similar values in another column.

Method 1: aggregate()

Here, we use the aggregate() function that takes three parameters.

Syntax

aggregate(dataframe_object$grouped, list(dataframe_object$grouping), FUN=sum)

Parameters

  1. The first parameter is the column to summarize (grouped); its values are summed within each group.
  2. The second parameter is a list of one or more grouping columns (grouping); rows that share the same values in these columns form a group.
  3. The third parameter, FUN, is the function applied to each group; here it is sum.
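When the list holds more than one grouping column, each distinct combination of values becomes its own group. The sketch below illustrates this in Python with only the standard library, using tuples as group keys; it is an illustration of the idea rather than R's implementation, and the lists copy the columns of the market dataframe above.

```python
from collections import defaultdict

market_place = ["India", "USA", "India", "Australia", "USA", "India", "Australia"]
market_type = ["grocery", "bar", "grocery", "restaurent", "grocery", "bar", "grocery"]
market_squarefeet = [120, 342, 220, 110, 342, 220, 110]

# Each (place, type) pair is one group key; areas with the same key are added up
totals = defaultdict(int)
for place, kind, area in zip(market_place, market_type, market_squarefeet):
    totals[(place, kind)] += area

for key in sorted(totals):
    print(key, totals[key])
```

Only the India and grocery combination occurs more than once in this data, so it is the only group whose total combines two rows.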

Example 1
In this example, we group the values in the market_place column and get the sum of the values in the market_squarefeet column grouped by the market_place column.

#create a dataframe-market that has 7 rows and 5 columns.
market=data.frame(
  market_id=c(1,2,1,4,3,4,5),
  market_name=c('M1','M2','M3','M4','M3','M4','M3'),
  market_place=c('India','USA','India','Australia','USA','India','Australia'),
  market_type=c('grocery','bar','grocery','restaurent','grocery','bar','grocery'),
  market_squarefeet=c(120,342,220,110,342,220,110))
 
#get the sum of square feet in group by grouping market_place
print(aggregate(market$market_squarefeet, list(market$market_place), FUN=sum))

Result

The rows are grouped by the values in the market_place column (Australia, India, and USA), and the sum of the market_squarefeet column is returned for each group: 220, 560, and 684, respectively.

Example 2
In this example, we group the rows by the market_type column and get the sum of the market_squarefeet column for each group.

#create a dataframe-market that has 7 rows and 5 columns.
market=data.frame(
  market_id=c(1,2,1,4,3,4,5),
  market_name=c('M1','M2','M3','M4','M3','M4','M3'),
  market_place=c('India','USA','India','Australia','USA','India','Australia'),
  market_type=c('grocery','bar','grocery','restaurent','grocery','bar','grocery'),
  market_squarefeet=c(120,342,220,110,342,220,110))
 
#get the sum of square feet in group by grouping market_type
print(aggregate(market$market_squarefeet, list(market$market_type), FUN=sum))

Result

The rows are grouped by the values in the market_type column (bar, grocery, and restaurent), and the sum of the market_squarefeet column is returned for each group: 562, 792, and 110, respectively.

Example 3
In this example, we group the values in the market_type and market_place columns and get the sum of the values in the market_squarefeet column grouped by the market_type and market_place columns.

#create a dataframe-market that has 7 rows and 5 columns.
market=data.frame(
  market_id=c(1,2,1,4,3,4,5),
  market_name=c('M1','M2','M3','M4','M3','M4','M3'),
  market_place=c('India','USA','India','Australia','USA','India','Australia'),
  market_type=c('grocery','bar','grocery','restaurent','grocery','bar','grocery'),
  market_squarefeet=c(120,342,220,110,342,220,110))
 
#get the sum of square feet in group by grouping market_place and market_type
print(aggregate(market$market_squarefeet, list(market$market_place,market$market_type), FUN=sum))

Result

The rows are grouped by each combination of market_place and market_type, and the sum of the market_squarefeet column is returned for each combination.

Method 2: dplyr

Here, we use the group_by() function together with the summarise_at() function, both available in the dplyr library, to compute the sum per group.

Syntax

dataframe_object%>% group_by(grouping) %>% summarise_at(vars(grouped), list(name = sum))

Where:

  1. group_by() takes one parameter, i.e. the grouping column.
  2. summarise_at() takes two parameters:
     1. The first parameter takes the variable column (grouped), wrapped in vars(), for which the sum is returned per group.
     2. The second parameter takes the sum function through a named list; the list name (name) becomes the name of the result column.

The %>% operator passes the dataframe to group_by() and then passes the grouped result to summarise_at(), which computes the sum for each group.

It returns a tibble.

Example 1
In this example, we group the values in the market_place column and get the sum of the values in the market_squarefeet column grouped by the market_place column.

library("dplyr")
 
#get the sum of square feet in group by grouping market_place
print(market %>% group_by(market_place) %>%
summarise_at(vars(market_squarefeet), list(name = sum)))

Result

The rows are grouped by the values in the market_place column (Australia, India, and USA), and the sum of the market_squarefeet column is returned for each group: 220, 560, and 684, respectively.

Example 2
In this example, we group the rows by the market_type column and get the sum of the market_squarefeet column for each group.

library("dplyr")
 
#get the sum of square feet in group by grouping market_type
print(market %>% group_by(market_type) %>%
summarise_at(vars(market_squarefeet), list(name = sum)))

Result

The rows are grouped by the values in the market_type column (bar, grocery, and restaurent), and the sum of the market_squarefeet column is returned for each group: 562, 792, and 110, respectively.

Conclusion

It is possible to group the rows by one or more columns and return the sum of a numeric column for each group using the aggregate() function. Similarly, we can use the group_by() function with the summarise_at() function to group the similar values in a column and return the sum of another column for each group.





In this R tutorial, we will see how to perform the aggregation operations by grouping the data and returning the mean values for the grouped rows.

This operation has to be performed on a dataframe. Let’s create the dataframe with seven rows and five columns.

#create a dataframe-market that has 7 rows and 5 columns.
market=data.frame(
  market_id=c(1,2,1,4,3,4,5),
  market_name=c('M1','M2','M3','M4','M3','M4','M3'),
  market_place=c('India','USA','India','Australia','USA','India','Australia'),
  market_type=c('grocery','bar','grocery','restaurent','grocery','bar','grocery'),
  market_squarefeet=c(120,342,220,110,342,220,110))

#display the market dataframe
print(market)

Result

Now, we return the mean values of a column by grouping the similar values in another column.

Method 1: aggregate()

Here, we use the aggregate() function that takes three parameters.

Syntax

aggregate(dataframe_object$grouped, list(dataframe_object$grouping), FUN=mean)

Parameters

  1. The first parameter is the column to summarize (grouped); the mean is computed over its values within each group.
  2. The second parameter is a list of one or more grouping columns (grouping); rows that share the same values in these columns form a group.
  3. The third parameter, FUN, is the function applied to each group; here it is mean.

Example 1
In this example, we group the rows by the market_place column and get the mean of the market_squarefeet column for each group.

#create a dataframe-market that has 7 rows and 5 columns.
market=data.frame(
  market_id=c(1,2,1,4,3,4,5),
  market_name=c('M1','M2','M3','M4','M3','M4','M3'),
  market_place=c('India','USA','India','Australia','USA','India','Australia'),
  market_type=c('grocery','bar','grocery','restaurent','grocery','bar','grocery'),
  market_squarefeet=c(120,342,220,110,342,220,110))

#get the mean of square feet in group by grouping market_place
print(aggregate(market$market_squarefeet, list(market$market_place), FUN=mean))

Result

We can see that the similar values (Australia, India, and USA) in the market_place column are grouped, and the mean of the market_squarefeet values is returned for each group.
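When the grouping list is unnamed, aggregate() labels the result columns Group.1 and x, which can be hard to read. The following is a minimal sketch, using a reduced two-column copy of the market dataframe, that shows how naming the list element gives the grouping column a readable name. The variable names default_names and by_place are only illustrative.

```r
# Reduced copy of the market dataframe (only the two columns needed here)
market <- data.frame(
  market_place = c("India", "USA", "India", "Australia", "USA", "India", "Australia"),
  market_squarefeet = c(120, 342, 220, 110, 342, 220, 110)
)

# Unnamed list: the result columns are called Group.1 and x
default_names <- aggregate(market$market_squarefeet,
                           list(market$market_place),
                           FUN = mean)
print(names(default_names))

# Named list element: the grouping column is called market_place
by_place <- aggregate(market$market_squarefeet,
                      list(market_place = market$market_place),
                      FUN = mean)
print(by_place)
```

The mean column is still called x in both cases; it can be renamed afterwards with names() if needed.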

Example 2
In this example, we group the rows by the values in the market_type column and get the mean of the market_squarefeet column for each group.

#create a dataframe-market that has 7 rows and 5 columns.
market=data.frame(market_id=c(1,2,1,4,3,4,5),market_name=c('M1','M2','M3',
'M4','M3','M4','M3'),market_place=c('India','USA','India','Australia','USA',
'India','Australia'),market_type=c('grocery','bar','grocery','restaurent',
'grocery','bar','grocery'),market_squarefeet=c(120,342,220,110,342,220,110))

#get the mean of square feet for each group in market_type
print(aggregate(market$market_squarefeet, list(market$market_type), FUN=mean))

Result

We can see that the similar values (bar, grocery, and restaurent) in the market_type column are grouped, and the mean of the market_squarefeet values is returned for each group.

Example 3
In this example, we group the rows by the combined values of the market_place and market_type columns and get the mean of the market_squarefeet column for each combination.

#create a dataframe-market that has 7 rows and 5 columns.
market=data.frame(market_id=c(1,2,1,4,3,4,5),market_name=c('M1','M2','M3',
'M4','M3','M4','M3'),market_place=c('India','USA','India','Australia','USA',
'India','Australia'),market_type=c('grocery','bar','grocery','restaurent',
'grocery','bar','grocery'),market_squarefeet=c(120,342,220,110,342,220,110))

#get the mean of square feet for each combination of market_place and market_type
print(aggregate(market$market_squarefeet, list(market$market_place,market$market_type), FUN=mean))

Result

We can see that the rows sharing the same values in both columns are grouped, and the mean of the market_squarefeet values is returned for each group.
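The same two-column grouping can also be written with the formula interface of aggregate(), which some readers find easier to scan. This is a sketch of an alternative spelling, not the form used in the examples above; it uses a reduced copy of the market dataframe, and the variable name by_place_type is only illustrative.

```r
# Reduced copy of the market dataframe (only the three columns needed here)
market <- data.frame(
  market_place = c("India", "USA", "India", "Australia", "USA", "India", "Australia"),
  market_type = c("grocery", "bar", "grocery", "restaurent", "grocery", "bar", "grocery"),
  market_squarefeet = c(120, 342, 220, 110, 342, 220, 110)
)

# Left of ~ : the column to average; right of ~ : the grouping columns
by_place_type <- aggregate(market_squarefeet ~ market_place + market_type,
                           data = market,
                           FUN = mean)
print(by_place_type)
```

With the formula interface, the result columns keep the original column names instead of Group.1, Group.2, and x.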

Method 2: Dplyr

Here, we use group_by() with summarise_at(), both of which are available in the dplyr library, to group the rows and compute the mean of each group.

Syntax

dataframe_object %>% group_by(grouping) %>% summarise_at(vars(grouped), list(name = mean))

Where:

group_by() takes one parameter, i.e. the grouping column.

summarise_at() takes two parameters:

  1. The first parameter, vars(grouped), selects the column whose mean is returned per group.
  2. The second parameter takes the mean function wrapped in a named list.

The %>% operator passes the dataframe object to group_by(), which groups the rows, and then passes the grouped data to summarise_at(), which computes the mean for each group.

It returns a tibble.
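Recent dplyr releases mark summarise_at() as superseded, although it continues to work. As a hedged alternative, the same grouped mean can be written with plain summarise(). The sketch below assumes dplyr is installed and uses a reduced copy of the market dataframe; the column name mean_squarefeet and the variable result are only illustrative.

```r
library(dplyr)

# Reduced copy of the market dataframe (only the two columns needed here)
market <- data.frame(
  market_place = c("India", "USA", "India", "Australia", "USA", "India", "Australia"),
  market_squarefeet = c(120, 342, 220, 110, 342, 220, 110)
)

# Group by market_place, then compute the mean of market_squarefeet per group
result <- market %>%
  group_by(market_place) %>%
  summarise(mean_squarefeet = mean(market_squarefeet))

print(result)
```

As with summarise_at(), the result is a tibble with one row per group.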

Example 1
In this example, we group the rows by the values in the market_place column and get the mean of the market_squarefeet column for each group.

library("dplyr")

#get the mean of square feet for each group in market_place
print(market %>% group_by(market_place) %>%
summarise_at(vars(market_squarefeet), list(name = mean)))

Result

We can see that the similar values (Australia, India, and USA) in the market_place column are grouped, and the mean of the market_squarefeet values is returned for each group.

Example 2
In this example, we group the rows by the values in the market_type column and get the mean of the market_squarefeet column for each group.

library("dplyr")

#get the mean of square feet for each group in market_type
print(market %>% group_by(market_type) %>%
summarise_at(vars(market_squarefeet), list(name = mean)))

Result

We can see that the similar values (bar, grocery, and restaurent) in the market_type column are grouped, and the mean of the market_squarefeet values is returned for each group.
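The dplyr examples in this method group by a single column, but group_by() also accepts several columns, which mirrors the two-column aggregate() example. The sketch below is an illustration under stated assumptions: it uses summarise() rather than summarise_at(), it relies on the .groups argument that requires dplyr 1.0.0 or later, and the names mean_squarefeet and by_place_type are hypothetical.

```r
library(dplyr)

# Reduced copy of the market dataframe (only the three columns needed here)
market <- data.frame(
  market_place = c("India", "USA", "India", "Australia", "USA", "India", "Australia"),
  market_type = c("grocery", "bar", "grocery", "restaurent", "grocery", "bar", "grocery"),
  market_squarefeet = c(120, 342, 220, 110, 342, 220, 110)
)

# Group by both columns, then compute the mean per combination.
# .groups = "drop" returns an ungrouped tibble (assumes dplyr >= 1.0.0).
by_place_type <- market %>%
  group_by(market_place, market_type) %>%
  summarise(mean_squarefeet = mean(market_squarefeet), .groups = "drop")

print(by_place_type)
```

Only the combinations that actually occur in the data appear in the result.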

Conclusion

It is possible to group single or multiple columns with other numeric columns and return the mean of the numeric column using the aggregate() function. Similarly, we can use the group_by() function with the summarise_at() function to group the similar values in a column and return the mean of the grouped values with respect to another column.


