4

What is an efficient way to complete missing rows in a pandas dataframe?

asked 2023-05-15 05:42:50 +0000

huitzilopochtli

1 Answer

3

answered 2023-05-15 05:53:01 +0000

bukephalos

One efficient way to complete missing rows in a pandas DataFrame is the .reindex() method. You pass it the complete index you want, and it returns a new DataFrame in which every label missing from the original appears as a row filled with NaN values.
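As a minimal sketch of the idea on a plain integer index (the frame and the names `sparse`, `full_index`, and `completed` are made up for illustration):

```python
import pandas as pd

# Hypothetical frame whose integer index skips the labels 2 and 4.
sparse = pd.DataFrame({"value": [10, 30, 50]}, index=[1, 3, 5])

# The complete set of labels the result should have.
full_index = list(range(1, 6))

# reindex returns a new DataFrame; labels absent from `sparse` become NaN rows.
completed = sparse.reindex(full_index)
print(completed)
```

The original `sparse` frame is left unchanged, so assign the result if you want to keep it.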

For example, if you have a dataframe with a datetime index and there are missing dates, you can complete the missing rows with NaN values using the following code:

import pandas as pd

# create a sample dataframe
dates = pd.date_range('2021-01-01', '2021-01-10', freq='D')
data = {'col1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        'col2': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']}
df = pd.DataFrame(data, index=dates)

# create a new index with all dates
new_index = pd.date_range('2021-01-01', '2021-01-15', freq='D')

# reindex the dataframe with the new index
df = df.reindex(new_index)

print(df)

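The output is a DataFrame with 5 added rows (2021-01-11 through 2021-01-15) filled with NaN values. Note that col1 is now displayed as float, because NaN cannot be stored in an integer column: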

            col1 col2
2021-01-01   1.0    a
2021-01-02   2.0    b
2021-01-03   3.0    c
2021-01-04   4.0    d
2021-01-05   5.0    e
2021-01-06   6.0    f
2021-01-07   7.0    g
2021-01-08   8.0    h
2021-01-09   9.0    i
2021-01-10  10.0    j
2021-01-11   NaN  NaN
2021-01-12   NaN  NaN
2021-01-13   NaN  NaN
2021-01-14   NaN  NaN
2021-01-15   NaN  NaN
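If the gaps fall inside the existing date range, and you would rather not get NaN, something along these lines may help. This is only a sketch under assumptions: the frame `gappy` is invented for illustration, the full index is derived from the first and last dates present, and whether a constant fill or a forward fill is appropriate depends on your data.

```python
import pandas as pd

# Hypothetical frame with 2021-01-03 and 2021-01-04 missing.
gappy = pd.DataFrame(
    {"col1": [1, 2, 5]},
    index=pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-05"]),
)

# Build the complete daily index from the first and last dates present.
full_index = pd.date_range(gappy.index.min(), gappy.index.max(), freq="D")

# Fill the added rows with a constant instead of NaN.
filled_zero = gappy.reindex(full_index, fill_value=0)

# Or carry the last known value forward (requires a sorted index).
filled_ffill = gappy.reindex(full_index, method="ffill")

print(filled_zero)
print(filled_ffill)
```

Both `fill_value` and `method` are parameters of DataFrame.reindex; with `fill_value=0` the column keeps its integer dtype because no NaN is introduced.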

Stats

Asked: 2023-05-15 05:42:50 +0000

Seen: 15 times

Last updated: May 15 '23