How can Python be used to locate multiple strings within a directory of text files, and then extract the respective row of each string along with the following 50 rows and save them to a separate file?

asked 2022-10-28 11:00:00 +0000

woof

1 Answer

answered 2022-08-08 03:00:00 +0000

scrum

To accomplish this in Python, you can use the following steps:

  1. Import the necessary libraries: os, re
  2. Define the directory where the text files are stored
  3. Define the list of strings to search for
  4. Define a function to search for the strings, extract the respective rows and save them to a separate file
  5. Loop through all the text files in the directory and call the function to search for the strings in each file

Here's an example code snippet that implements these steps:

import os
import re

dir_path = 'path/to/directory'  # directory containing the .txt files
search_strings = ['string1', 'string2', 'string3']  # strings to search for
output_dir = 'path/to/output/directory'  # where the extracted blocks are written

os.makedirs(output_dir, exist_ok=True)  # create the output directory if it is missing

def search_strings_in_file(filename):
    with open(filename, 'r') as f:
        lines = f.readlines()
    # collect one list of extracted lines per search string
    blocks = {s: [] for s in search_strings}
    for index, line in enumerate(lines):
        for s in search_strings:
            # re.escape makes the string match literally, not as a regex pattern
            if re.search(re.escape(s), line):
                # matched line plus the following 50 lines (51 lines in total)
                blocks[s].extend(lines[index:index + 51])
    # write one output file per string that matched at least once
    for s, block in blocks.items():
        if block:
            output_file = os.path.join(output_dir, s + '_' + os.path.basename(filename))
            with open(output_file, 'w') as out:
                out.writelines(block)

# loop through all .txt files in the directory and search each one
for filename in os.listdir(dir_path):
    if filename.endswith('.txt'):
        search_strings_in_file(os.path.join(dir_path, filename))

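This code searches each .txt file in the directory for the listed strings. When a string is found, the matched line and the following 50 lines (51 lines in total, or fewer near the end of the file) are saved to a separate file in the output directory. The output file name combines the matching string and the original file name. Because the search string becomes part of the file name, it should not contain characters that are invalid in file names, such as a slash.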
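The slicing behaviour is easy to check in isolation. The following is a minimal sketch rather than part of the answer's code: `extract_blocks` and its `context` parameter are hypothetical names introduced here for illustration, and the function works on an in-memory list of lines instead of real files.

```python
def extract_blocks(lines, search_strings, context=50):
    """Map each search string to the lines extracted for it.

    extract_blocks and context are illustrative names, not from the answer.
    """
    blocks = {s: [] for s in search_strings}
    for index, line in enumerate(lines):
        for s in search_strings:
            if s in line:  # plain substring test, no regex involved
                # matched line plus up to `context` following lines
                blocks[s].extend(lines[index:index + context + 1])
    return blocks


sample = [f"row {n}\n" for n in range(1, 11)]  # ten numbered rows
sample[2] = "row 3 alpha\n"
sample[8] = "row 9 beta\n"

result = extract_blocks(sample, ["alpha", "beta", "gamma"], context=2)
print(result["alpha"])
print(result["beta"])
print(result["gamma"])
```

With `context=2`, the "alpha" match yields three lines, while the "beta" match near the end of the list yields only two, because a slice that runs past the end of a list is simply cut short. A string that never matches ("gamma") maps to an empty list.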


Last updated: Aug 08 '22