How can I eliminate duplicate entries following the merging of two dataframes using an inner join?

answered 2022-02-20 07:00:00 +0000

nofretete

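To remove duplicate entries after merging two dataframes with an inner join, call the drop_duplicates() function in pandas on the merged result. Duplicates typically appear when the join key occurs more than once in one or both of the input dataframes.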

Here is an example:

import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3, 4], 'Name': ['John', 'Jane', 'Alice', 'Bob']})
df2 = pd.DataFrame({'ID': [1, 2, 3, 5], 'Age': [30, 25, 40, 35]})

# Inner join: keep only the IDs that appear in both dataframes
df_merge = pd.merge(df1, df2, on='ID', how='inner')
# Keep the first row for each ID and drop any later repeats
df_merge = df_merge.drop_duplicates(subset=['ID'], keep='first')
print(df_merge)

In this example, we merge two dataframes, df1 and df2, using an inner join on the 'ID' column. After merging, we drop any rows that repeat an 'ID' value using the drop_duplicates() function. The subset parameter specifies which columns to check for duplicates, and the keep parameter specifies which of the duplicate rows to keep (in this case, the first occurrence).

The output of this code will be:

   ID   Name  Age
0   1   John   30
1   2   Jane   25
2   3  Alice   40

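Note that the rows with ID=4 and ID=5 are missing because the inner join keeps only IDs present in both dataframes, not because they were duplicates. No ID repeats in this sample data, so drop_duplicates() leaves the merged result unchanged here; it only removes rows when the merge produces more than one row with the same ID.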
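
To see drop_duplicates() actually remove something, here is a minimal sketch where the join key repeats. The df2 below is a hypothetical variation of the one above (ID 2 appears twice with different ages), and the variable names merged and deduped are only for illustration. Adding reset_index(drop=True) is optional; it just renumbers the rows after the duplicates are dropped.

```python
import pandas as pd

# Same df1 as in the example above.
df1 = pd.DataFrame({'ID': [1, 2, 3, 4], 'Name': ['John', 'Jane', 'Alice', 'Bob']})
# Hypothetical df2: ID 2 appears twice, so the join will repeat that ID.
df2 = pd.DataFrame({'ID': [1, 2, 2, 3, 5], 'Age': [30, 25, 26, 40, 35]})

# Inner join on 'ID'. ID 2 matches two rows in df2, so the result has
# four rows, two of them with ID 2.
merged = pd.merge(df1, df2, on='ID', how='inner')

# Keep the first row for each ID, then renumber the index from 0.
deduped = (
    merged
    .drop_duplicates(subset=['ID'], keep='first')
    .reset_index(drop=True)
)

print(merged)
print(deduped)
```

Which row survives depends on keep: 'first' keeps the earliest match, 'last' keeps the latest, and False drops every row whose ID is repeated. If the repeated rows carry different data, as the two ages do here, it is worth deciding which one is correct before dropping anything.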
