How can regex in Python be used to extract the text located between numerical values?

asked 2022-03-14 11:00:00 +0000 by scrum

1 Answer

answered 2021-07-24 13:00:00 +0000 by plato

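One way to extract the text between numerical values with regex in Python is to use lookbehind and lookahead assertions. The pattern assumes each number is separated from the surrounding text by a single whitespace character. Here is an example: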

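import re

text = "123 this is some text 456 that we want to extract 789"
# Lazily match text preceded by "digit + whitespace" and followed by "whitespace + digit"
pattern = r"(?<=\d\s).*?(?=\s\d)"

# findall returns every non-overlapping match as a list of strings
result = re.findall(pattern, text)
print(result)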

Explanation:

  • (?<=\d\s) is a positive lookbehind assertion: it requires the match to be immediately preceded by a digit and a whitespace character, but does not include them in the result.
  • .*? matches any character (except a newline) zero or more times, but as few times as possible (non-greedy).
  • (?=\s\d) is a positive lookahead assertion: it requires the match to be immediately followed by a whitespace character and a digit, but does not include them in the result.
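
To see why the non-greedy quantifier matters, here is a small sketch that runs the same pattern in lazy and greedy form on the example text. The variable names (lazy_matches, greedy_matches, spans) are only illustrative. The spans also show that the lookarounds are zero-width: the digits and the spaces next to them are never part of a match.

```python
import re

text = "123 this is some text 456 that we want to extract 789"

# Lazy version (the pattern from the answer): stops at the first " <digit>"
lazy = re.compile(r"(?<=\d\s).*?(?=\s\d)")
# Greedy version: runs on to the last " <digit>" in the string
greedy = re.compile(r"(?<=\d\s).*(?=\s\d)")

lazy_matches = lazy.findall(text)
greedy_matches = greedy.findall(text)

# Start/end offsets of each lazy match; the surrounding numbers are excluded
spans = [(m.start(), m.end()) for m in lazy.finditer(text)]

print(lazy_matches)    # ['this is some text', 'that we want to extract']
print(greedy_matches)  # ['this is some text 456 that we want to extract']
print(spans)           # [(4, 21), (26, 49)]
```

With the greedy .* the middle number is swallowed into a single long match, which is usually not what you want here.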

The re.findall function returns a list of all non-overlapping matches of the pattern in the text. In this case, the result will be:

['this is some text', 'that we want to extract']
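
If the numbers are not always separated from the text by exactly one whitespace character, the lookaround pattern above will miss matches. A possible alternative, sketched here under the assumption that a "numerical value" means any run of digits, is to split on the numbers instead. The helper name text_between_numbers is hypothetical, not part of the re module.

```python
import re

def text_between_numbers(text):
    """Hypothetical helper: return the stripped text found between runs of digits."""
    parts = re.split(r"\d+", text)
    # parts[0] precedes the first number and parts[-1] follows the last one,
    # so neither is "between" two numbers; keep only the inner pieces.
    inner = parts[1:-1]
    # Drop pieces that are empty or whitespace-only after stripping
    return [p.strip() for p in inner if p.strip()]

print(text_between_numbers("123 this is some text 456 that we want to extract 789"))
# ['this is some text', 'that we want to extract']
print(text_between_numbers("12abc34"))
# ['abc']
```

Note that this treats every run of digits as a separator, so a digit inside a word (for example "v2") or a decimal such as "3.14" would also split the text. Adjust the split pattern if your numbers look different.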