Split a Sentence into Words in Python

Given a sentence, write a Python program that converts it into a list of words.

Examples: 

Input : 'Hello World'
Output : ['Hello', 'World']

Method 1: Split a sentence into a list using split()

The simplest way to convert a sentence into a list of words is the split() method. It splits a string into a list in which each word is a separate item. Called with no arguments, it splits on whitespace. The methods below show other ways of using it to get the same result.

Python3

sentence = "Geeks For geeks"
print(sentence.split())

Output: 

['Geeks', 'For', 'geeks']

Method 2: Split a sentence into a list using for loop 

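We can also loop over the result of split(), here with a list comprehension, and collect each word into a new list. The same pattern is useful when there is more than one string to process, since the function can be applied to each string in turn.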

Python3

def convert(sentence):
    # Collect each word produced by split() into a new list
    return [word for word in sentence.split()]

sentence = 'Geeksforgeeks is a portal for geeks'
print(convert(sentence))

Output: 

['Geeksforgeeks', 'is', 'a', 'portal', 'for', 'geeks']
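
To illustrate the case of more than one string, here is a minimal sketch that uses an explicit for loop. The function name split_each and the sample sentences are illustrative assumptions, not part of the original example.

```python
def split_each(sentences):
    """Split every sentence in a list into its own list of words."""
    result = []
    for sentence in sentences:
        # split() with no arguments breaks the sentence on whitespace
        result.append(sentence.split())
    return result


sentences = ['Geeksforgeeks is a portal', 'for geeks']
print(split_each(sentences))
# [['Geeksforgeeks', 'is', 'a', 'portal'], ['for', 'geeks']]
```

Each input sentence keeps its own list here, so the result is a list of lists rather than one flat list of words.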

Method 3: Split a sentence into a list using join() 

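We can also pass the input through join() and then split the result. For a single string, ''.join() returns the string unchanged, so the output is the same as calling split() directly. The same function also accepts a list holding a single string. For a list of several strings, join with ' ' instead of '' so that words at the boundaries do not run together.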

Python3

def convert(lst):
    # ''.join() leaves a string unchanged; split() then breaks it on whitespace
    return ''.join(lst).split()

lst = 'Hello Geeks for geeks'
print(convert(lst))

Output: 

['Hello', 'Geeks', 'for', 'geeks']
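
For the list-of-strings case, the following sketch shows why the separator passed to join() matters. The helper name words_from_sentences is hypothetical and the input list is made up for illustration.

```python
def words_from_sentences(sentences):
    """Flatten a list of sentence strings into one list of words."""
    # Joining with ' ' (not '') keeps the last word of one sentence
    # from fusing with the first word of the next.
    return ' '.join(sentences).split()


parts = ['Hello Geeks', 'for geeks']
print(words_from_sentences(parts))
# ['Hello', 'Geeks', 'for', 'geeks']

# With an empty separator, 'Geeks' and 'for' run together:
print(''.join(parts).split())
# ['Hello', 'Geeksfor', 'geeks']
```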

Method 4: Split a sentence into a list using nltk

For this task, the nltk library's word_tokenize() function can also be used. It takes a string as input and divides it into a list of tokens (words and punctuation marks).

Python3

import nltk

nltk.download('punkt')

string = "This is a sentence"

lst = nltk.word_tokenize(string)

print(lst)

Output:

['This', 'is', 'a', 'sentence']

Another approach: split the text on whitespace, then trim punctuation. This removes punctuation from the edges of words without harming apostrophes inside words such as we're.

>>> text
"'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'"

>>> text.split()
["'Oh,", 'you', "can't", 'help', "that,'", 'said', 'the', 'Cat:', "'we're", 'all', 'mad', 'here.', "I'm", 'mad.', "You're", "mad.'"]

>>> import string
>>> [word.strip(string.punctuation) for word in text.split()]
['Oh', 'you', "can't", 'help', 'that', 'said', 'the', 'Cat', "we're", 'all', 'mad', 'here', "I'm", 'mad', "You're", 'mad']
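
The split-then-strip idea can be wrapped in a small function. This is a minimal sketch: the name split_words is an assumption, and the extra filtering step (dropping pieces that consist only of punctuation) is an addition that the session above does not need.

```python
import string


def split_words(text):
    """Split on whitespace, then trim punctuation from both edges of each piece."""
    stripped = (piece.strip(string.punctuation) for piece in text.split())
    # A piece made only of punctuation (e.g. a standalone '--') strips
    # down to an empty string, so drop those.
    return [word for word in stripped if word]


print(split_words("Well -- that's odd, isn't it?"))
# ['Well', "that's", 'odd', "isn't", 'it']
```

Because strip() only touches the two ends of a string, punctuation inside a word, such as an apostrophe or a hyphen, is left alone.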


Splitting a Sentence into Words: .split()

Below, mary is a single string. Even though it is a sentence, the words are not represented as discrete units. For that, you need a different data type: a list of strings where each string corresponds to a word. .split() is the method to use:

 
>>> mary = 'Mary had a little lamb'
>>> mary.split() 
['Mary', 'had', 'a', 'little', 'lamb'] 

.split() splits mary on whitespace, and the returned result is a list of the words in mary. This list contains 5 items, as the len() function demonstrates. len() on mary, by contrast, returns the number of characters in the string (including the spaces).

 
>>> mwords = mary.split() 
>>> mwords
['Mary', 'had', 'a', 'little', 'lamb'] 
>>> len(mwords)                # number of items in mwords
5 
>>> len(mary)                  # number of characters
22 

Whitespace characters include space ' ', the newline character '\n', and tab '\t', among others. .split() separates on any combined sequence of those characters:

 
>>> chom = ' colorless     green \n\tideas\n'       # ' ', '\n', '\t' bunched up
>>> print(chom)
 colorless     green 
	ideas
 
>>> chom.split()
['colorless', 'green', 'ideas'] 

Splitting on a Specific Substring

By providing an optional parameter, .split('x') can be used to split a string on a specific substring 'x'. Without 'x' specified, .split() simply splits on all whitespace, as seen above.

 
>>> mary = 'Mary had a little lamb'
>>> mary.split('a')                 # splits on 'a'
['M', 'ry h', 'd ', ' little l', 'mb'] 
>>> hi = 'Hello mother,\nHello father.'
>>> print(hi)
Hello mother,
Hello father. 
>>> hi.split()                # no parameter given: splits on whitespace
['Hello', 'mother,', 'Hello', 'father.'] 
>>> hi.split('\n')                 # splits on '\n' only
['Hello mother,', 'Hello father.'] 
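
One difference worth seeing in code: .split() with no argument treats a run of whitespace as one separator, while .split(' ') splits at every single space and therefore produces empty strings. The example string below is made up for illustration.

```python
spaced = 'Mary  had a   lamb'   # two spaces after 'Mary', three after 'a'

# No argument: any run of whitespace counts as a single separator
print(spaced.split())
# ['Mary', 'had', 'a', 'lamb']

# Explicit ' ': every space is a separator, so empty strings appear
print(spaced.split(' '))
# ['Mary', '', 'had', 'a', '', '', 'lamb']
```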

String into a List of Characters: list()

But what if you want to split a string into a list of characters? In Python, characters are simply strings of length 1. The list() function turns a string into a list of its individual characters, spaces included:

 
>>> list('hello world')
['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd'] 

More generally, list() is a built-in function that turns any iterable Python object into a list. When a string is given, what's returned is a list of the characters in it. When other iterable types are given, the specifics vary but the returned type is always a list.
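
A few quick cases show how list() behaves on types other than strings. The sample values are chosen only for illustration.

```python
from_tuple = list((1, 2, 3))          # tuple: items kept in order
from_range = list(range(4))           # range: each number it yields
from_dict = list({'a': 1, 'b': 2})    # dict: keys only, in insertion order

print(from_tuple)   # [1, 2, 3]
print(from_range)   # [0, 1, 2, 3]
print(from_dict)    # ['a', 'b']
```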

Joining a List of Strings: .join()

If you have a list of words, how do you put them back together into a single string? .join() is the method to use. Called on a "separator" string 'x', 'x'.join(y) joins every element in the list y separated by 'x'. Below, words in mwords are joined back into the sentence string with a space in between:

 
>>> mwords
['Mary', 'had', 'a', 'little', 'lamb'] 
>>> ' '.join(mwords)
'Mary had a little lamb' 

Joining can be done on any separator string. Below, '--' and the tab character '\t' are used.

 
>>> '--'.join(mwords)
'Mary--had--a--little--lamb' 
>>> '\t'.join(mwords)
'Mary\thad\ta\tlittle\tlamb' 
>>> print('\t'.join(mwords))
Mary    had     a       little  lamb 

The method can also be called on the empty string '' as the separator. The effect is the elements in the list joined together with nothing in between. Below, a list of characters is put back together into the original string:

 
>>> hi = 'hello world'
>>> hichars = list(hi)
>>> hichars
['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd'] 
>>> ''.join(hichars)
'hello world' 
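
Combining .split() and .join() gives a common idiom for tidying whitespace. The sketch below reuses the chom string from earlier on this page; the function name normalize_spaces is an assumption.

```python
def normalize_spaces(text):
    """Collapse every run of whitespace into one space and trim both ends."""
    # split() discards all whitespace; join() puts single spaces back
    return ' '.join(text.split())


chom = ' colorless     green \n\tideas\n'
print(normalize_spaces(chom))
# colorless green ideas
```

Note that the round trip does not restore the original string: the newlines, tabs, and repeated spaces are gone.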

How do you split text into words in Python?

A string can be split into substrings with the split() method, which every string object has. Its separator argument is optional: by default it splits on whitespace, but you can pass a specific string or character to split on instead. Called on a sentence with no arguments, it returns the list of words.

How do you split a sentence into letters in Python?

Use list() to split a string into a list of characters, e.g. my_list = list(my_str). Spaces and punctuation marks in the string become list items too.

How do you split a paragraph into a list of words in Python?

Call split() with no arguments on the paragraph string. It splits on any run of whitespace, including newlines, and returns a list in which each word is a separate item.

What does the split() method return?

The string method split(sep, maxsplit) returns a list of the substrings of the string, using sep as the separator (it splits on runs of whitespace if sep is left unspecified) and optionally limiting the number of splits to maxsplit.
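
The optional limit on the number of splits (the maxsplit argument in Python 3) can be illustrated with the mary string used earlier; the remainder of the string is returned unsplit as the last item.

```python
mary = 'Mary had a little lamb'

# At most 2 splits on ' ': three items, the last holds the rest
print(mary.split(' ', 2))
# ['Mary', 'had', 'a little lamb']

# maxsplit also works with the default whitespace separator
print(mary.split(maxsplit=1))
# ['Mary', 'had a little lamb']
```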