Assuming the file names are stored in a DataFrame column, you can use the pandas string methods (the .str accessor) to extract part of each file name into a new column. For example, to extract the date from file names of the form "yyyymmdd_filename.ext":

import pandas as pd

# Create a dataframe with file names
df = pd.DataFrame({'file_name': ['20211010_file1.txt', '20211011_file2.csv', '20211012_file3.txt']})

# Use the str.extract method to extract the date from the file name
# (raw string so that \d is passed to the regex engine unchanged)
df['date'] = df['file_name'].str.extract(r'(\d{8})')

# Output the result
print(df)

This creates a new column called "date" in the DataFrame containing the date extracted from each file name. The regular expression "\d{8}" matches the first run of eight consecutive digits, and the surrounding parentheses form the capture group that str.extract returns. The extracted values are strings, and rows where the pattern does not match get NaN. Adjust the regular expression to match the specific format of your file names.
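
If you need real dates rather than strings, one possible follow-up is sketched below. It assumes the date is always at the start of the name and is followed by an underscore. The sample file names (including 'notes.txt', added here to show a non-matching row) and the variable name date_str are illustrative only.

```python
import pandas as pd

# Sample data; 'notes.txt' is a hypothetical name with no date prefix
df = pd.DataFrame({'file_name': ['20211010_file1.txt', '20211011_file2.csv', 'notes.txt']})

# ^ anchors the match at the start of the name; expand=False returns a Series
date_str = df['file_name'].str.extract(r'^(\d{8})_', expand=False)

# Parse yyyymmdd strings; errors='coerce' turns missing or invalid values into NaT
df['date'] = pd.to_datetime(date_str, format='%Y%m%d', errors='coerce')

print(df)
```

With a datetime column you can sort or filter by date directly, and rows without a date prefix end up as NaT rather than raising an error.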