Summary: in this tutorial, you’ll learn about the Python regex greedy mode and how to change the mode from greedy to non-greedy.

By default, all quantifiers work in greedy mode. This means that each quantifier tries to match its preceding element as many times as possible.

Let’s start with an example to understand how the regex greedy mode works.

The unexpected result with greedy mode

Suppose you have the following HTML fragment that represents a button element:
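And you want to match the text within the quotes. To do that, you may come up with the following pattern that includes the quote ("), the dot (.), and the quantifier (+):

".+"

The meaning of the pattern is as follows:

" matches the opening quote.
. matches any character except a newline.
+ matches the preceding element one or more times.
" matches the closing quote.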
The following program searches the string for matches of the pattern:
The program displays the following result:
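As a minimal sketch of such a program and its output, assume a hypothetical button fragment stored in s and re.finditer() as the matching function (both are illustrative choices, not taken from the text above):

```python
import re

# Hypothetical button fragment, assumed for illustration only.
s = '<button type="submit" class="btn">Send</button>'

# Greedy pattern: a quote, one or more of any character, then a quote.
pattern = '".+"'

# Collect every match the greedy pattern finds in the string.
greedy_matches = [m.group() for m in re.finditer(pattern, s)]

for text in greedy_matches:
    print(text)
# prints: "submit" class="btn"
```

With this input, the pattern produces a single match that spans both attribute values rather than one match per quoted value.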
The result is not what you expected. By default, the quantifier (+) runs in greedy mode, in which it tries to match the preceding element (.) as many times as possible.

How Python regex greedy mode works

First, the regex engine starts matching from the first character in the string.

Next, because the first character is not a quote ("), the regex engine moves on through the string until it reaches the first quote.

Then, the regex engine examines the pattern and matches the string with the next rule, .+

Because the .+ rule matches any character except a newline one or more times, the regex engine matches every remaining character until it reaches the end of the string.

After that, the regex engine examines the last rule in the pattern, which is a quote ("). However, it has already reached the end of the string, so there are no more characters to match. The quantifier was too greedy and went too far.

Finally, the regex engine goes back from the end of the string, one character at a time, until it finds a quote ("). This step is called backtracking.

As a result, the match runs from the first quote to the last quote in the string, which is not what we expected.
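The effect of these steps can be checked with a short sketch. It assumes a hypothetical button fragment (not taken from the text above) and compares the span of the greedy match with the positions of the first and last quotes in the string:

```python
import re

# Hypothetical input, assumed for illustration only.
s = '<button type="submit" class="btn">Send</button>'

m = re.search('".+"', s)

first_quote = s.index('"')   # where the engine finds the opening quote
last_quote = s.rindex('"')   # where backtracking from the end finds a quote

# The greedy match starts at the first quote and ends just past the last one.
span_matches_quotes = (m.start(), m.end()) == (first_quote, last_quote + 1)

print(m.span(), first_quote, last_quote)
print(span_matches_quotes)  # prints: True
```

This only observes the final result of the match. The engine's internal steps are not exposed by the re module, so the walk-through above remains a conceptual description.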
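To fix this issue, you need to instruct the quantifier (+) to use non-greedy (lazy) mode instead of greedy mode. To do that, you add a question mark (?) right after the quantifier, so that .+ becomes .+? in the pattern.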
The following program returns the expected result:
Output:
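A minimal sketch of the non-greedy version, again assuming a hypothetical button fragment and re.finditer() as the matching function:

```python
import re

# Hypothetical button fragment, assumed for illustration only.
s = '<button type="submit" class="btn">Send</button>'

# The ? after + switches the quantifier to non-greedy (lazy) mode,
# so each match stops at the nearest closing quote.
pattern = '".+?"'

lazy_matches = [m.group() for m in re.finditer(pattern, s)]

for text in lazy_matches:
    print(text)
# prints:
# "submit"
# "btn"
```

In non-greedy mode the quantifier matches as few characters as it can while still letting the rest of the pattern succeed, which is why each quoted value becomes its own match.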
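Summary

By default, all quantifiers use greedy mode and match their preceding elements as many times as possible.
Add a question mark (?) after a quantifier to make it non-greedy.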