Assuming that the file names are in a column of a DataFrame, you can use the string methods of Pandas to extract a portion of the file name and save it to a new column. For example, if you want to extract the date from a file name that is in the format "yyyymmdd_filename.ext":
import pandas as pd
# Create a dataframe with file names
df = pd.DataFrame({'file_name': ['20211010_file1.txt', '20211011_file2.csv', '20211012_file3.txt']})
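# Use the str.extract method to extract the date from the file name.
# The raw string (r'...') keeps \d from being read as a Python escape;
# expand=False returns a Series rather than a one-column DataFrame.
df['date'] = df['file_name'].str.extract(r'(\d{8})', expand=False)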
# Output the result
print(df)
This creates a new column called "date" in the DataFrame containing the date extracted from each file name. The regular expression "\d{8}" matches the first run of eight consecutive digits, which corresponds to the yyyymmdd date at the start of the file name. The pattern is not anchored, so it would also match eight digits appearing elsewhere in a name; rows with no match get NaN. You can adjust the regular expression to match the specific format of your file names.
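If the date should only be taken from the start of the name and then used as an actual date, one possible variation is sketched below. It assumes the same "file_name" column as above; the `^` anchor, the trailing underscore in the pattern, the `date_str` column name, the extra `notes.txt` row, and the choice of `errors='coerce'` are illustrative assumptions, not requirements.

```python
import pandas as pd

# Sample data; 'notes.txt' is a hypothetical name with no date prefix
df = pd.DataFrame({'file_name': ['20211010_file1.txt', '20211011_file2.csv', 'notes.txt']})

# Anchor the match at the start of the name and require the underscore
# that follows the date. Rows that do not match become NaN.
df['date_str'] = df['file_name'].str.extract(r'^(\d{8})_', expand=False)

# Parse the eight digits as a date; missing or invalid values become NaT
df['date'] = pd.to_datetime(df['date_str'], format='%Y%m%d', errors='coerce')

print(df)
```

Keeping the raw string in its own column makes it easy to see which rows failed to match before the conversion to a datetime.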