To split a DataFrame into smaller ones, one per unique value of a single column, use the groupby() method in pandas.

  1. First, import the pandas library and create a DataFrame containing the column you want to group by:
import pandas as pd

data = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'C', 'B', 'A'],
    'Value': [10, 20, 15, 25, 30, 5]
})
  2. Call groupby() on that column to create a GroupBy object with one group per unique value:
grouped_data = data.groupby('Category')
  3. Apply an aggregation function such as sum(), mean(), max(), or min() to the groups. For example, to get the total value for each category:
summed_data = grouped_data.sum()

This returns a new DataFrame, indexed by Category, holding the sum of Value for each category. You can also iterate over the groups with a for loop; each iteration yields the group name and a smaller DataFrame containing only that group's rows. For example, to print the maximum and minimum values for each category:

for name, group in grouped_data:
    print(f"Category {name} has a max value of {group['Value'].max()} and a min value of {group['Value'].min()}")

This will print:

Category A has a max value of 15 and a min value of 5
Category B has a max value of 30 and a min value of 20
Category C has a max value of 25 and a min value of 25
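
If you want the smaller DataFrames themselves rather than printed summaries, one possible approach is to collect the groups into a dictionary keyed by category. The sketch below reuses the example data from above; the names frames and summary are made up for illustration, and resetting the index is an optional choice, not a requirement.

```python
import pandas as pd

data = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'C', 'B', 'A'],
    'Value': [10, 20, 15, 25, 30, 5]
})

# Build one DataFrame per unique Category value, keyed by that value.
# `frames` is a hypothetical name; reset_index(drop=True) renumbers
# each piece from 0 and can be dropped to keep the original row labels.
frames = {
    name: group.reset_index(drop=True)
    for name, group in data.groupby('Category')
}

print(frames['A'])

# The same per-category max and min as the loop above, in a single call.
summary = data.groupby('Category')['Value'].agg(['max', 'min'])
print(summary)
```

Each entry in frames is an ordinary DataFrame, so frames['B'] can be filtered, saved, or passed on like any other.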