The method for using group_by in conjunction with summarise and summarise_all is as follows:

  1. First, use the group_by function to group the data by one or more variables.

  2. Then, use the summarise function to calculate summary statistics for each group. You choose the statistic by calling a function such as mean(), sum(), or n() inside summarise; n() counts the rows in each group.

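  3. Alternatively, you can use the summarise_all function to apply the same summary function to every non-grouping column in the grouped data. In dplyr 1.0.0 and later, summarise_all is superseded by summarise combined with across(), although it still works.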

  4. Finally, use the %>% operator to chain these functions together.
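As a rough sketch of step 3, here is summarise_all applied to a small made-up data frame, next to the across() form. The data frame "scores" and its columns are hypothetical and exist only for illustration; the code assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data; names and values are illustrative only
scores <- data.frame(
  group = c("a", "a", "b", "b"),
  x = c(1, 3, 5, 7),
  y = c(10, 20, 30, 40)
)

# Mean of every non-grouping column, one row per group
by_group_means <- scores %>%
  group_by(group) %>%
  summarise_all(mean)

# Same result using across(), the newer form in dplyr >= 1.0.0
by_group_means_across <- scores %>%
  group_by(group) %>%
  summarise(across(everything(), mean))
```

Both results should have one row per group, with the columns group, x and y. The grouping column is left out of the calculation in both forms.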


Let's say we have a dataset called "sales" with columns for date, region, product, revenue and cost. To calculate the total revenue and cost by region, we can use the following code:

    sales %>%
      group_by(region) %>%
      summarise(
        total_revenue = sum(revenue),
        total_cost = sum(cost)
      )

This will group the sales data by region and then calculate the total revenue and cost for each region using the summarise function.
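If you want to try this without real data, the following is a minimal self-contained sketch. The rows in this "sales" data frame are invented for illustration, and the extra n_sales column is an assumption added to show n(); only the column names date, region, product, revenue and cost come from the example above.

```r
library(dplyr)

# Invented sample rows; only the column names follow the example above
sales <- data.frame(
  date = as.Date(c("2023-01-01", "2023-01-02", "2023-01-01", "2023-01-02")),
  region = c("East", "East", "West", "West"),
  product = c("A", "B", "A", "B"),
  revenue = c(100, 150, 200, 80),
  cost = c(60, 90, 120, 40)
)

# One output row per region, with totals and a row count
totals <- sales %>%
  group_by(region) %>%
  summarise(
    total_revenue = sum(revenue),
    total_cost = sum(cost),
    n_sales = n()  # number of rows in each region (illustrative addition)
  )

print(totals)
```

With these invented rows, East should come out with a total revenue of 250 and total cost of 150, and West with 280 and 160, each from 2 rows.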