How can pandas convert the data in a column into serial numbers?

asked 2023-02-21 11:00:00 +0000

djk


1 Answer


answered 2022-09-11 06:00:00 +0000

nofretete

Pandas can convert the values in a column into serial numbers with the function "pd.factorize()". It encodes each distinct value in the column as an integer code, numbering the values in the order in which they first appear. Here is an example:

import pandas as pd

# create a dataframe with a categorical column
df = pd.DataFrame({'fruit': ['apple', 'banana', 'apple', 'banana', 'orange']})

# use pd.factorize() to convert the categorical column 'fruit' into numerical values
df['fruit_id'] = pd.factorize(df['fruit'])[0]

# display the new dataframe with serial numbers in 'fruit_id' column
print(df)

Output:

    fruit  fruit_id
0   apple         0
1  banana         1
2   apple         0
3  banana         1
4  orange         2

In this example, the values in the "fruit" column are encoded as serial numbers in the "fruit_id" column: "apple", "banana", and "orange" become 0, 1, and 2, respectively. Note that "pd.factorize()" returns a tuple of two elements, the array of integer codes and the unique values they stand for. We are only interested in the codes, so we take the first element of the tuple with [0].
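
If you also need the second element of the tuple, or want the numbering to follow sorted order rather than order of first appearance, something like the following sketch may help. The sample data here (including the missing value) is made up for illustration and is not from the question. The sketch assumes the default behaviour of "pd.factorize()", where missing values get the code -1.

```python
import pandas as pd

# Hypothetical sample data with one missing value
df = pd.DataFrame({'fruit': ['banana', 'apple', None, 'banana', 'orange']})

# pd.factorize() returns (codes, uniques); unpack both
codes, uniques = pd.factorize(df['fruit'])
print(list(codes))    # [0, 1, -1, 0, 2]  (missing value -> -1)
print(list(uniques))  # ['banana', 'apple', 'orange']

# sort=True numbers the values in sorted order instead of first appearance
sorted_codes, sorted_uniques = pd.factorize(df['fruit'], sort=True)
print(list(sorted_codes))    # [1, 0, -1, 1, 2]
print(list(sorted_uniques))  # ['apple', 'banana', 'orange']

# Store the sorted codes as the serial-number column
df['fruit_id'] = sorted_codes
print(df)
```

Keeping "uniques" around lets you look up which original value a given code stands for, for example uniques[0] for code 0.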


Stats

Asked: 2023-02-21 11:00:00 +0000

Seen: 13 times

Last updated: Sep 11 '22