How can we divide a combined column in a dataframe with certain null values into separate columns using Python, based on whitespace?

asked 2023-06-13 12:22:05 +0000

woof

1 Answer

answered 2023-06-13 12:48:02 +0000

qstack

One way to split a combined column into separate columns on whitespace, even when some values are missing, is to use the str.split() method in pandas.

Here's an example:

import pandas as pd

# create sample dataframe
df = pd.DataFrame({'combined_column': ['John Smith 25', 'Jane Doe 30', 'Joe    Bloggs    40', 'Bill 45']})

# divide combined_column into separate columns based on whitespace
df[['First Name', 'Last Name', 'Age']] = df['combined_column'].str.split(expand=True)

# drop the original combined_column column
df.drop(columns=['combined_column'], inplace=True)

print(df)

Output:

  First Name Last Name Age
0       John     Smith  25
1       Jane       Doe  30
2        Joe    Bloggs  40
3       Bill        45 NaN

In this example, str.split() is called without a separator, so it splits each value on runs of whitespace (which is why the extra spaces in 'Joe    Bloggs    40' do no harm). The expand=True argument makes it return the pieces as a DataFrame with one column per piece. Assigning that result to df[['First Name', 'Last Name', 'Age']] adds the three new columns, and drop() then removes the original combined_column.
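
The question also mentions null values in the combined column. As a rough sketch (the sample data and the names parts and result are made up for illustration), a null entry simply comes back as a row of missing values, because the .str methods skip nulls instead of raising:

```python
import pandas as pd

# hypothetical sample data with one null entry
df = pd.DataFrame({"combined_column": ["John Smith 25", None, "Jane Doe 30"]})

# split on runs of whitespace; the null row becomes a row of missing values
parts = df["combined_column"].str.split(expand=True)
parts.columns = ["First Name", "Last Name", "Age"]

# attach the new columns and drop the original one
result = df.join(parts).drop(columns=["combined_column"])

# optional: turn Age into a numeric column (missing values become NaN)
result["Age"] = pd.to_numeric(result["Age"])

print(result)
```

Keep in mind that naming exactly three columns only works when the split produces exactly three columns. A row with four or more words would make pandas raise a ValueError.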

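Note that in the sample dataframe, the last row contains only two values. Because str.split() fills the new columns from left to right, 45 ends up in the Last Name column and Age is left as a missing value (shown as None or NaN depending on the pandas version).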
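
If rows like 'Bill 45' should keep the number under Age, one possible alternative (not part of the approach above) is str.extract() with a regular expression in which the last name is optional. This is only a sketch: the pattern and the group names first, last and age are assumptions about what the data looks like, namely one or two words followed by a whole number.

```python
import pandas as pd

df = pd.DataFrame({"combined_column": ["John Smith 25", "Jane Doe 30",
                                       "Joe    Bloggs    40", "Bill 45"]})

# assumed layout: first name, optional last name, then an integer age
pattern = r"^\s*(?P<first>\S+)(?:\s+(?P<last>\S+))?\s+(?P<age>\d+)\s*$"

# each named group becomes a column; an unmatched optional group is left missing
out = df["combined_column"].str.extract(pattern)
out = out.rename(columns={"first": "First Name", "last": "Last Name", "age": "Age"})

print(out)
```

With this pattern the last row gets Bill as First Name, a missing Last Name, and 45 as Age. Rows that do not match the pattern at all (including nulls) come back with all three columns missing.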
