To split a dataframe into smaller ones based on the unique values of a single variable, use the groupby() method in pandas. Start by importing pandas and creating a sample dataframe:
import pandas as pd
data = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'C', 'B', 'A'],
    'Value': [10, 20, 15, 25, 30, 5]
})
Use the groupby() method to create a group object keyed by the unique values of the variable:
grouped_data = data.groupby('Category')
You can then apply aggregation functions to the group object, such as sum(), mean(), max(), min(), etc. For example, to get the total value for each category:
summed_data = grouped_data.sum()
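If you need several of these aggregates at once, one option is to pass a list of function names to agg(). The snippet below is a minimal sketch that reuses the sample dataframe from this answer; the variable name summary is only illustrative.

```python
import pandas as pd

data = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'C', 'B', 'A'],
    'Value': [10, 20, 15, 25, 30, 5]
})

# Compute several aggregates of 'Value' per category in a single call.
# The result has one row per category and one column per function name.
summary = data.groupby('Category')['Value'].agg(['sum', 'mean', 'max', 'min'])
print(summary)
```

Selecting the 'Value' column before calling agg() keeps the resulting column labels flat (sum, mean, max, min) rather than nested under 'Value'.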
This will create a new dataframe with the sum of the values for each unique category. You can also iterate through the groups using a for loop and apply different functions to each group. For example, to get the maximum and minimum values for each category:
for name, group in grouped_data:
    print(f"Category {name} has a max value of {group['Value'].max()} and a min value of {group['Value'].min()}")
This will print:
Category A has a max value of 15 and a min value of 5
Category B has a max value of 30 and a min value of 20
Category C has a max value of 25 and a min value of 25
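If the goal is to keep the smaller dataframes themselves rather than only summarize them, the same iteration can be used to collect each group. This is a minimal sketch under the same sample data; the dictionary name frames is a hypothetical choice, not something pandas provides.

```python
import pandas as pd

data = pd.DataFrame({
    'Category': ['A', 'B', 'A', 'C', 'B', 'A'],
    'Value': [10, 20, 15, 25, 30, 5]
})

# Build one sub-dataframe per unique category, keyed by the category value.
frames = {name: group for name, group in data.groupby('Category')}

# Each entry is an ordinary dataframe holding only that category's rows.
print(frames['A'])

# get_group() fetches a single group without building the whole dictionary.
print(data.groupby('Category').get_group('B'))
```

Each sub-dataframe keeps the row index from the original dataframe; call reset_index(drop=True) on a group if you want it renumbered from zero.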