There are several ways to accomplish this task using Python. One way is to use the pandas library to read the Excel file, split it into smaller chunks, and then write the smaller chunks to separate files.
Here's an example code snippet:
import pandas as pd
# read the Excel file into a pandas dataframe
df = pd.read_excel('large_file.xlsx', sheet_name='Sheet1')
# specify the number of rows to include in each smaller file
chunk_size = 10000
# calculate the number of chunks, rounding up so that leftover rows get
# their own file and an exact multiple does not produce an empty extra file
num_chunks = (len(df) + chunk_size - 1) // chunk_size
# loop through each chunk and write it to a separate file
for i in range(num_chunks):
    start = i * chunk_size
    end = (i + 1) * chunk_size
    # iloc slicing stops at the end of the dataframe, so the last chunk may be shorter
    chunk = df.iloc[start:end]
    file_name = f'chunk_{i}.xlsx'
    chunk.to_excel(file_name, index=False)
In this example, the Excel file is read into a pandas dataframe called df. The code then specifies a chunk_size of 10,000 rows per file and calculates the total number of chunks required based on the size of the dataframe.
The code then loops through each chunk of data, which is simply a slice of the original dataframe based on the start and end index positions. The code creates a new file name for each chunk and writes the chunk to a separate Excel file using the to_excel method.
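One detail of the file naming step: names such as chunk_2.xlsx and chunk_10.xlsx do not sort in numeric order in a file listing. A minimal sketch of one way around that, assuming a hypothetical helper called chunk_file_name (it is not part of the answer above, and the prefix and extension defaults are only illustrative), pads the index with zeros:

```python
def chunk_file_name(index, num_chunks, prefix="chunk", ext="xlsx"):
    """Build a zero-padded file name so chunk files sort in numeric order.

    Hypothetical helper for illustration only; the defaults mirror the
    file names used in the example above.
    """
    # Width of the largest index, e.g. 12 chunks -> indices 0..11 -> width 2
    width = len(str(max(num_chunks - 1, 0)))
    return f"{prefix}_{index:0{width}d}.{ext}"


names = [chunk_file_name(i, 12) for i in range(12)]
print(names[0], names[-1])
```

Inside the loop, file_name = chunk_file_name(i, num_chunks) would replace the f-string if sorted file names matter for your workflow.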
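Note that you may need to adjust the chunk_size to suit your data. Leftover rows that do not fit into an exact multiple of the chunk size need no additional logic, because iloc slicing stops at the end of the dataframe and the last file simply contains the remainder.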
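The boundary arithmetic can be checked on its own, without reading or writing any Excel files. The following is a minimal sketch, assuming a hypothetical generator called chunk_bounds (not part of the answer above) that yields the start and end positions for each chunk:

```python
def chunk_bounds(total_rows, chunk_size):
    """Yield (start, end) row positions for each chunk; end is exclusive.

    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total_rows, chunk_size):
        # min() clips the final chunk to the rows that remain
        yield start, min(start + chunk_size, total_rows)


# 25,000 rows leave a shorter final chunk of 5,000 rows
print(list(chunk_bounds(25_000, 10_000)))
# 20,000 rows divide evenly, so there is no empty trailing chunk
print(list(chunk_bounds(20_000, 10_000)))
```

Each (start, end) pair could then be passed to df.iloc[start:end] in place of the manual start and end calculation.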
Asked: 2023-07-10 02:50:58 +0000
Last updated: Jul 10 '23