You can obtain the rows in a dataframe that have identical values by using the duplicated method in pandas. The duplicated method returns a boolean Series indicating whether each row is a duplicate of another row. By default (keep='first') only the second and later occurrences are marked True; passing keep=False marks every occurrence, including the first. You can then use this boolean Series to filter the dataframe and return only the rows whose values are identical.
Here is an example:
import pandas as pd
# create a sample dataframe in which rows 0 and 3 hold the same values
df = pd.DataFrame({
    'col1': ['a', 'b', 'c', 'a'],
    'col2': [1, 2, 3, 1],
    'col3': [True, False, True, True]
})
# mark every row whose values also appear in another row
duplicate_mask = df.duplicated(keep=False)
duplicates = df[duplicate_mask]
# print the rows with identical values
print(duplicates)
Output:
  col1  col2  col3
0    a     1  True
3    a     1  True
In this example, duplicate_mask is a boolean Series that is True for every row whose values also appear in another row. Because keep=False is passed, all members of a duplicate group are marked, not only the later occurrences. Filtering the original dataframe with this mask then returns the rows that have identical values (in this case, rows 0 and 3).
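Note that duplicated compares every column unless told otherwise, so two rows that differ in even one column are not reported. The following is a minimal sketch of restricting the comparison with the subset parameter, assuming only col1 and col2 matter for the match. The sample data and the variable names (full_mask, subset_mask, subset_duplicates, later_only) are illustrative, not part of the original answer.

```python
import pandas as pd

# Illustrative data: rows 0 and 3 agree on col1 and col2 but differ in col3.
df = pd.DataFrame({
    'col1': ['a', 'b', 'c', 'a'],
    'col2': [1, 2, 3, 1],
    'col3': [True, False, True, False],
})

# Comparing whole rows: no two rows match in every column, so the mask is all False.
full_mask = df.duplicated(keep=False)

# Comparing only col1 and col2: rows 0 and 3 now count as duplicates of each other.
subset_mask = df.duplicated(subset=['col1', 'col2'], keep=False)
subset_duplicates = df[subset_mask]

# With the default keep='first', only the later occurrence (row 3) is flagged.
later_only = df[df.duplicated(subset=['col1', 'col2'])]

print(subset_duplicates)
print(later_only)
```

If filtering on all columns returns an empty dataframe when matches were expected, a column outside the intended comparison is the likely cause, and passing subset is one way to exclude it.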