Replace multiple spaces with single space python pandas

How do I remove multiple spaces between two words in a string in Python?

For example, I want to turn

"Bertug        Mete"

into

"Bertug Mete"

Input is read from an Excel file. I have tried using split(), but it doesn't work as expected.

import re
import string
import pandas as pd

dataFrame = pd.read_excel("C:\\Users\\Bertug\\Desktop\\example.xlsx")

#names1 =  ''.join(dataFrame.Name.to_string().split()) 

print(type(dataFrame.Name))

#print(dataFrame.Name.str.split())

Let me know what I'm doing wrong.

sai

asked Mar 28, 2017 at 13:50

I think you can use replace:

df.Name = df.Name.replace(r'\s+', ' ', regex=True)

Sample:

df = pd.DataFrame({'Name':['Bertug     Mete','a','Joe    Black']})
print (df)
              Name
0  Bertug     Mete
1                a
2     Joe    Black

df.Name = df.Name.replace(r'\s+', ' ', regex=True)
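# similar solution (regex=True is required in pandas 2.0+, where str.replace defaults to literal matching)
# df.Name = df.Name.str.replace(r'\s+', ' ', regex=True)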
print (df)
          Name
0  Bertug Mete
1            a
2    Joe Black
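
Another option, sketched here on the assumption that the column holds only strings, is to split each value on whitespace through the .str accessor and join the pieces back with a single space. Unlike the regex version, this also drops leading and trailing whitespace. The sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data, similar to the frame used above
df = pd.DataFrame({'Name': ['Bertug     Mete', 'a', '  Joe    Black ']})

# Series.str.split() with no pattern splits on runs of whitespace;
# Series.str.join(' ') glues each resulting list back together
df['Name'] = df['Name'].str.split().str.join(' ')

print(df['Name'].tolist())
```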

answered Mar 28, 2017 at 13:51

jezrael

Replace multiple spaces with a single space in Python

To replace multiple spaces with a single space:

  1. Use the str.split() method to split the string on runs of whitespace.
  2. Use the str.join() method to join the list of strings with a space.
  3. The words in the new string will be separated by a single space.

import re

my_str = 'avocado    banana   kiwi  apricot'

# ✅ replace multiple whitespace characters with a single space
result = " ".join(my_str.split())
print(repr(result))  # 👉️ 'avocado banana kiwi apricot'

# ----------------------------------------------

# ✅ replace multiple spaces with a single space
result_2 = re.sub(' +', ' ', my_str)
print(repr(result_2))  # 👉️ 'avocado banana kiwi apricot'

The first example uses the str.split() and str.join() methods to replace multiple whitespace characters with a single space.

The str.split() method splits the string into a list of substrings using a delimiter.

When the str.split() method is called without a separator, it considers consecutive whitespace characters as a single separator.

my_str = 'avocado    banana   kiwi  apricot'

# 👇️ ['avocado', 'banana', 'kiwi', 'apricot']
print(my_str.split())

When called without an argument, the str.split() method splits on runs of any whitespace characters (e.g. \t and \n), not only spaces.
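
The difference only shows up when the input contains whitespace other than spaces. A minimal sketch with a hypothetical input that mixes spaces, a tab and newlines:

```python
import re

# Hypothetical input mixing spaces, a tab and newlines
text = 'avocado  \t banana\n\nkiwi   apricot'

# split()/join() collapses every run of whitespace, whatever the characters
collapsed_all = ' '.join(text.split())

# ' +' only matches literal spaces, so the tab and newlines survive
collapsed_spaces = re.sub(' +', ' ', text)

print(repr(collapsed_all))
print(repr(collapsed_spaces))
```

Which behavior is preferable depends on whether tabs and line breaks in the input carry meaning.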

The next step is to use the str.join() method to join the list of strings with a space separator.

my_str = 'avocado    banana   kiwi  apricot'

result = " ".join(my_str.split())
print(repr(result))  # 👉️ 'avocado banana kiwi apricot'

# 👇️ 'avocado banana kiwi apricot'
print(' '.join(['avocado', 'banana', 'kiwi', 'apricot']))

The str.join method takes an iterable as an argument and returns a string which is the concatenation of the strings in the iterable.

The string the method is called on is used as the separator between the elements.

An alternative approach is to use the re.sub() method.

Use the re.sub() method to replace multiple spaces with a single space, e.g. result = re.sub(' +', ' ', my_str). Every run of one or more spaces in the string is replaced with a single space.

import re

my_str = 'avocado    banana   kiwi  apricot'

result = re.sub(' +', ' ', my_str)
print(repr(result))  # 👉️ 'avocado banana kiwi apricot'

The re.sub method returns a new string that is obtained by replacing the occurrences of the pattern with the provided replacement.

If the pattern isn't found, the string is returned as is.

The first argument we passed to the re.sub method is a regular expression.

In our regex, we have a space and a plus +.

The plus + is used to match the preceding character (the space) 1 or more times.

In its entirety, the example replaces 1 or more consecutive spaces with a single space.

Note that the re.sub() method returns a new string. It doesn't mutate the original, because strings are immutable in Python.
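
To make the immutability point concrete, the sketch below checks that the original string is untouched after the call. It also shows the r'\s+' pattern for cases where tabs and newlines should be collapsed as well. The variable names are illustrative.

```python
import re

original = 'avocado   banana \t kiwi'

# re.sub returns a new string; the name `original` still refers to the old one
spaces_only = re.sub(' +', ' ', original)
any_whitespace = re.sub(r'\s+', ' ', original)

print(repr(original))        # unchanged input
print(repr(spaces_only))     # the tab is kept
print(repr(any_whitespace))  # the tab is collapsed too
```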

How do I replace multiple spaces with a single space in Python?

Use the re.sub() method to replace multiple spaces with a single space, e.g. result = re.sub(' +', ' ', my_str).

How do you get rid of multiple spaces in Python?

The str.strip() method removes leading and trailing whitespace only; it does not touch repeated spaces inside the string. To remove whitespace from one side only, use lstrip() or rstrip() instead.
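
A short sketch of what strip() does and does not do, using a hypothetical padded string. Collapsing the interior run of spaces takes a second step, here a regex substitution:

```python
import re

text = '   Bertug     Mete   '

# strip() trims both ends but leaves the interior run of spaces alone
trimmed = text.strip()

# Collapsing the interior needs a second step
normalized = re.sub(r'\s+', ' ', trimmed)

print(repr(trimmed))
print(repr(normalized))
```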

How do I remove multiple spaces from a string?

Using string split() and join(): you can also use the split() and join() string methods to remove multiple spaces from a string. Note that split() splits the string on whitespace characters by default.

How do you get rid of spaces in pandas?

Pandas provides three methods for handling whitespace (including newlines) in text data. As the names suggest, str.lstrip() removes whitespace from the left side of a string, str.rstrip() removes it from the right side, and str.
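
For a whole column, trimming and collapsing can be chained through the .str accessor. The following is a sketch with a hypothetical Series, not the only way to do it:

```python
import pandas as pd

# Hypothetical column with padding and repeated interior spaces
names = pd.Series(['  Bertug     Mete ', 'a', 'Joe    Black\n'])

# Trim both ends, then collapse interior whitespace runs;
# regex=True is passed explicitly because str.replace treats the
# pattern as a literal string by default in pandas 2.0 and later
cleaned = names.str.strip().str.replace(r'\s+', ' ', regex=True)

print(cleaned.tolist())
```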