You can find rows in a dataframe that have identical values with the pandas duplicated method. It returns a boolean Series with one entry per row. By default (keep='first') a row is marked True only if it repeats an earlier row; passing keep=False marks every occurrence, including the first. All columns are compared unless you pass subset to restrict the comparison to specific columns. You can then use the boolean Series to filter the dataframe and return only the matching rows.

Here is an example:

import pandas as pd

# create a sample dataframe
df = pd.DataFrame({
    'col1': ['a', 'b', 'c', 'a'],
    'col2': [1, 2, 3, 1],
    'col3': [True, False, True, False]
})

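# identify rows with identical values; rows 0 and 3 differ in col3,
# so compare only col1 and col2 (without subset, all columns are compared
# and no row would be flagged)
duplicate_mask = df.duplicated(subset=['col1', 'col2'], keep=False)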
duplicates = df[duplicate_mask]

# print the rows with identical values
print(duplicates)

Output:

  col1  col2   col3
0    a     1   True
3    a     1  False

In this example, duplicate_mask is a boolean Series that is True for every row whose compared values also appear in another row. Because keep=False is passed, all occurrences are marked, not just the later ones. Filtering the original dataframe with this mask returns those rows (in this case, rows 0 and 3, which share the same col1 and col2 values).
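
To see how the keep and subset arguments change the result, here is a minimal sketch that reuses the same sample data. The variable names (key_cols, later_repeats and so on) are illustrative only and not part of the original answer.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'col1': ['a', 'b', 'c', 'a'],
    'col2': [1, 2, 3, 1],
    'col3': [True, False, True, False]
})

# Assumption: these are the columns that define "identical" for this data
key_cols = ['col1', 'col2']

# Default keep='first': only rows that repeat an earlier row are flagged
later_repeats = df.duplicated(subset=key_cols)

# keep='last': every occurrence except the last one is flagged
earlier_repeats = df.duplicated(subset=key_cols, keep='last')

# keep=False: every row that has a match anywhere is flagged
all_matches = df.duplicated(subset=key_cols, keep=False)

# No subset: all columns are compared, so the differing col3 prevents a match
full_row_matches = df.duplicated(keep=False)

print(later_repeats.tolist())     # [False, False, False, True]
print(earlier_repeats.tolist())   # [True, False, False, False]
print(all_matches.tolist())       # [True, False, False, True]
print(full_row_matches.tolist())  # [False, False, False, False]
```

If you want whole rows to match, drop the subset argument and make sure the rows really agree in every column.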