String replace with regex python

I need some help on declaring a regex. My inputs are like the following:

this is a paragraph with<[1> in between</[1> and then there are cases ... where the<[99> number ranges from 1-100</[99>. 
and there are many other lines in the txt files
with<[3> such tags </[3>

The required output is:

this is a paragraph with in between and then there are cases ... where the number ranges from 1-100. 
and there are many other lines in the txt files
with such tags

I've tried this:

#!/usr/bin/python
import os, glob

for infile in glob.glob(os.path.join(os.getcwd(), '*.txt')):
    # iterate over the lines of each .txt file in the current directory
    with open(infile) as reader:
        for line in reader:
            line2 = line.replace('<[1> ', '')
            line = line2.replace('</[1> ', '')
            line2 = line.replace('<[1>', '')
            line = line2.replace('</[1>', '')

            print(line, end='')

I've also tried this (but it seems like I'm using the wrong regex syntax):

        line2 = line.replace('<[*> ', '')
        line = line2.replace('</[*> ', '')
        line2 = line.replace('<[*>', '')
        line = line2.replace('</[*>', '')

I don't want to hard-code a replace for every number from 1 to 99.
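One way to avoid hard-coding the numbers is re.sub() with a pattern that matches any such tag. This is a minimal sketch, assuming every tag has the form <[N> or </[N> where N is one or more digits. The names TAG_PATTERN and strip_tags are made up for this illustration.

```python
import re

# Matches an opening tag like <[12> or a closing tag like </[12>.
# "/?" makes the slash optional, "\[" is a literal bracket, "\d+" is the number.
TAG_PATTERN = re.compile(r'</?\[\d+>')

def strip_tags(line):
    """Return the line with every <[N> and </[N> tag removed (hypothetical helper)."""
    return TAG_PATTERN.sub('', line)

sample = 'this is a paragraph with<[1> in between</[1> and then'
print(strip_tags(sample))  # this is a paragraph with in between and then
```

Note that str.replace() treats '<[*>' as literal text, which is why the second attempt changes nothing: the wildcard only has meaning inside a regular expression.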

String replace with regex python

If you want to replace text that matches a regular expression rather than an exact literal string, use the sub() function of the re module.

re.sub() is part of Python's standard library. It accepts up to five arguments and returns a new string with the matches replaced.

You can also use the str.replace() method, but it only replaces exact, literal occurrences of the old substring; it does not interpret patterns.

Python regex sub()

The re.sub() function replaces substrings that match a pattern.

To use it, first import the re module.

See the syntax of the re.sub() method.

Syntax

re.sub(pattern, repl, string, count=0, flags=0)

See the following code.

import re

str = ''

print(re.sub('[a-z]*@', 'ApD@', str))

In the above example, the pattern [a-z]*@ matches any run of lowercase letters from a to z immediately followed by the @ character, and each match is replaced with ApD@.

See the output.

From the output, we can see that we have successfully updated the email addresses.
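As a runnable sketch of the same call, the snippet below uses made-up sample addresses (they are an assumption for illustration, not the article's original data).

```python
import re

# Hypothetical sample data; these addresses are not from the original article.
text = 'alice@example.com bob@example.org carol@example.net'

# [a-z]*@ matches a run of lowercase letters followed by "@".
result = re.sub('[a-z]*@', 'ApD@', text)
print(result)  # ApD@example.com ApD@example.org ApD@example.net
```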

Specify the count

re.sub() accepts an optional count parameter.

It caps the number of replacements: at most count matches are replaced, working from left to right. The default, count=0, replaces every match.

import re

str = ''

print(re.sub('[a-z]*@', 'ApD@', str, count=2))

Output

  

From the output, you can see that only the first two email addresses were replaced, and the last address is left unchanged.
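A small sketch of the count behaviour, again with hypothetical sample addresses chosen only for illustration:

```python
import re

# Hypothetical sample data; these addresses are not from the original article.
text = 'alice@example.com bob@example.org carol@example.net'

# count=2 stops after the first two matches, scanning left to right.
limited = re.sub('[a-z]*@', 'ApD@', text, count=2)
print(limited)  # ApD@example.com ApD@example.org carol@example.net
```

Passing count as a keyword argument keeps the call readable and avoids confusing it with flags.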

Replace multiple substrings with the same string

Enclose a set of characters in [ ] to match any single one of them. This lets you replace several different characters with the same string.

import re

str = '  '

print(re.sub('[a-z]*@', 'info@', str))

Output

  

You can see that it replaces multiple different substrings with the same string.
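To illustrate a character class on its own, here is a minimal sketch that replaces three different separator characters with a space. The input string is an invented example.

```python
import re

# Hypothetical input, chosen only to show three different separators.
text = '2024-01_15.log'

# Inside [ ], a leading "-" and the "." are literal characters, so this class
# matches a single hyphen, underscore, or dot.
cleaned = re.sub('[-_.]', ' ', text)
print(cleaned)  # 2024 01 15 log
```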

When patterns are separated by |, a match on any one of them counts. Each alternative may use regular expression special characters, but plain literal strings work as well.

This can be used to replace multiple different strings with the same string.

import re

str = '  '

print(re.sub('gmail|hotmail|apple', 'appdividend', str))

In the above code, we have a string that consists of three substrings.

Now, we replace any of the three substrings with appdividend. Every alternative that is found is replaced: if all three occur, all three are replaced; if only two occur, those two are replaced; if only one occurs, only that one is replaced.

Output

  

You can see that all three substrings are replaced with appdividend.

Let’s see a scenario in which only one pattern matches.

import re

str = '  '

print(re.sub('gmail|hotmail|apple', 'appdividend', str))

Output

  

You can see that the two addresses matching none of the alternatives are unchanged.

Only the matching substring is replaced with appdividend.
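Both scenarios can be sketched as follows. The addresses are hypothetical stand-ins, not the article's original data.

```python
import re

pattern = 'gmail|hotmail|apple'

# Hypothetical sample data: every address matches one of the alternatives.
text_all = 'alice@gmail.com bob@hotmail.com carol@apple.com'
replaced_all = re.sub(pattern, 'appdividend', text_all)
print(replaced_all)
# alice@appdividend.com bob@appdividend.com carol@appdividend.com

# Hypothetical sample data: only the first address matches an alternative.
text_one = 'alice@gmail.com bob@example.org carol@example.net'
replaced_one = re.sub(pattern, 'appdividend', text_one)
print(replaced_one)
# alice@appdividend.com bob@example.org carol@example.net
```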

Replace using the matched part

To reuse part of the match in the replacement, enclose that part of the pattern in ( ) and refer to it from the replacement string. See the following code.

import re

str = '  '

print(re.sub(r'([a-z]*)@', r'\1 19-@', str))

Output

    

The \1 refers to the text matched by the first ( ) group. If there are multiple groups, refer to them as \2, \3, and so on.

In a regular string surrounded by '' or "", the backslash must be escaped as \\1; otherwise Python reads \1 as a control character. In a raw string, written with r at the beginning like r'', you can write \1 directly.
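The difference between raw and regular strings can be checked with a short sketch. The sample address is hypothetical; the \g<1> form is an alternative spelling of the same group reference that re.sub() also accepts.

```python
import re

# Hypothetical sample data; not from the original article.
text = 'alice@example.com'

with_raw = re.sub(r'([a-z]*)@', r'\1 19-@', text)      # raw string: \1 is group 1
with_escaped = re.sub('([a-z]*)@', '\\1 19-@', text)   # regular string: escape the backslash
with_g = re.sub(r'([a-z]*)@', r'\g<1> 19-@', text)     # explicit group syntax

# Without r'' or escaping, '\1' is the single character chr(1), not a group reference.
unescaped = re.sub('([a-z]*)@', '\1 19-@', text)

print(with_raw)         # alice 19-@example.com
print(repr(unescaped))  # '\x01 19-@example.com'
```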

Replace string by position using a slice

Strings have no method for replacing text at a given position. Instead, slice the string around that position and concatenate the pieces with the new text to build a new string.

text = 'albertodelrio'

# keep the first four characters and everything from index 7 onward
print(text[:4] + 'HANAYA' + text[7:])

Output

albeHANAYAdelrio

In the above code example, the characters at indices 4 through 6 ('rto') are replaced with 'HANAYA'; index 7 and everything after it is kept.
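If this is needed in several places, the slicing can be wrapped in a small helper. The function name replace_span and its parameters are hypothetical, introduced only for this sketch.

```python
def replace_span(text, start, end, new):
    """Return text with the characters in [start, end) replaced by new.

    Hypothetical helper: strings are immutable, so a new string is built
    from the slice before start, the new text, and the slice from end onward.
    """
    return text[:start] + new + text[end:]

result = replace_span('albertodelrio', 4, 7, 'HANAYA')
print(result)  # albeHANAYAdelrio
```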

Conclusion

In this tutorial, we saw how to replace text in a Python string using regular expressions. We imported the re module and used re.sub() to replace a single pattern, several alternative patterns, and text built from the matched part. We also replaced text by position using slices.

That’s it for this tutorial.

Can I use regex in replace Python?

Yes. Regular expressions can be used for many tasks in Python: search-and-replace operations, replacing patterns in text, and checking whether a string contains a specific pattern.

Can you use regex to replace string?

In .NET, the Regex.Replace(String, String, MatchEvaluator, RegexOptions) method is useful when the replacement string cannot readily be specified by a regular expression replacement pattern. In Python, the equivalent is to pass a function as the repl argument of re.sub().
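A minimal sketch of a computed replacement in Python: re.sub() calls the function once per match and uses its return value as the replacement. The function name double_number and the input string are made up for illustration.

```python
import re

def double_number(match):
    """Hypothetical callback: return the matched number doubled, as a string."""
    return str(int(match.group()) * 2)

# \d+ matches each run of digits; the callback computes its replacement.
doubled = re.sub(r'\d+', double_number, 'a1 b20 c300')
print(doubled)  # a2 b40 c600
```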

How do you replace all occurrences of a regex pattern in a string in Python?

By default, re.sub() replaces all occurrences of the pattern in the target string. Setting count=1 in the re.sub() call replaces only the first occurrence. Set count to the number of replacements you want to perform.

How do you replace all occurrences of a regex pattern in a string?

In JavaScript, the replaceAll() method returns a new string with all matches of a pattern replaced by a replacement. The pattern can be a string or a RegExp, and the replacement can be a string or a function called for each match. In Python, re.sub() already replaces all matches by default.