Given a sentence, write a Python program to convert it into a list of words.
Examples:
Input : 'Hello World'
Output : ['Hello', 'World']
Method 1: Split a sentence into a list using split()
The simplest way Python provides to convert a sentence into a list of words is the split() method. It splits a string into a list in which each word is a separate item; called with no arguments, it splits on runs of whitespace. The methods that follow are alternative ways to use it to reach the same output.
Python3
lst = "Geeks For geeks"
print(lst.split())
Output:
['Geeks', 'For', 'geeks']
Method 2: Split a sentence into a list using for loop
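We can also iterate over the words with a Python for loop, written here as a list comprehension over lst.split(). On its own this returns the same list as split(); the loop form is useful when each word needs further processing or filtering as it is collected.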
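As a minimal sketch, an explicit for loop version might look like this. The name convert_with_loop and the skip parameter are hypothetical, added only to show where per-word processing could go.

```python
def convert_with_loop(sentence, skip=()):
    """Collect the words of a sentence one at a time (illustrative helper)."""
    words = []
    for word in sentence.split():  # split() breaks on runs of whitespace
        if word in skip:           # hypothetical filter; empty by default
            continue
        words.append(word)
    return words


print(convert_with_loop('Geeksforgeeks is a portal for geeks'))
print(convert_with_loop('Geeksforgeeks is a portal for geeks', skip={'a', 'for'}))
```

With the default empty skip, the result is the same list that split() returns by itself.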
Python3
def convert(lst):
    # Build the list of words one item at a time
    return [i for i in lst.split()]

lst = 'Geeksforgeeks is a portal for geeks'
print(convert(lst))
Output:
['Geeksforgeeks', 'is', 'a', 'portal', 'for', 'geeks']
Method 3: Split a sentence into a list using join()
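We can also call join() on the input and then split() the result. For a plain string, ''.join() returns the string unchanged, so the output matches split(); the join step only matters when the sentence is wrapped in a list, such as a list of strings or a single string inside a list.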
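To see how the separator affects list input, here is a minimal sketch. The name convert_joined and its sep parameter are hypothetical; the point is that an empty separator fuses the last word of one string with the first word of the next, so a space is the safer default for lists of several strings.

```python
def convert_joined(sentences, sep=' '):
    """Join a string or a list of strings, then split into words (illustrative)."""
    if isinstance(sentences, str):
        sentences = [sentences]  # treat a bare string as a one-item list
    return sep.join(sentences).split()


print(convert_joined('Hello Geeks for geeks'))
print(convert_joined(['Hello Geeks for geeks']))
print(convert_joined(['Hello Geeks', 'for geeks']))

# With an empty separator, 'Geeks' and 'for' run together into one word
fused = ''.join(['Hello Geeks', 'for geeks']).split()
print(fused)
```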
Python3
def convert(lst):
    # Join the input into one string, then split it on whitespace
    return ''.join(lst).split()

lst = 'Hello Geeks for geeks'
print(convert(lst))
Output:
['Hello', 'Geeks', 'for', 'geeks']
Method 4: Split a sentence into a list using nltk
For this task, the word_tokenize() function from the nltk library can also be used. It takes a string as input and returns a list of tokens; unlike split(), it also separates punctuation marks into their own tokens.
Python3
import nltk
# Tokenizer data; newer NLTK releases may ask for 'punkt_tab' instead
nltk.download('punkt')
string = "This is a sentence"
lst = nltk.word_tokenize(string)
print(lst)
Output:
['This', 'is', 'a', 'sentence']
How about this algorithm? Split text on whitespace, then trim punctuation. This carefully removes punctuation from the edges of words, without harming apostrophes inside words such as we're.
>>> text = "'Oh, you can't help that,' said the Cat: 'we're all mad here. I'm mad. You're mad.'"
>>> text.split()
["'Oh,", 'you', "can't", 'help', "that,'", 'said', 'the', 'Cat:', "'we're", 'all', 'mad', 'here.', "I'm", 'mad.', "You're", "mad.'"]
>>> import string
>>> [word.strip(string.punctuation) for word in text.split()]
['Oh', 'you', "can't", 'help', 'that', 'said', 'the', 'Cat', "we're", 'all', 'mad', 'here', "I'm", 'mad', "You're", 'mad']
On this page: .split(), .join(), and list().
Splitting a Sentence into Words: .split()
Below, mary is a single string. Even though it is a sentence, the words are not represented as discrete units. For that, you need a different data type: a list of strings where each string corresponds to a word. .split() is the method to use:
>>> mary = 'Mary had a little lamb'
>>> mary.split()
['Mary', 'had', 'a', 'little', 'lamb']
>>> mwords = mary.split()
>>> mwords
['Mary', 'had', 'a', 'little', 'lamb']
>>> len(mwords)    # number of items in mwords
5
>>> len(mary)      # number of characters
22
>>> chom = ' colorless green \n\tideas\n'    # ' ', '\n', '\t' bunched up
>>> print(chom)
 colorless green
        ideas

>>> chom.split()
['colorless', 'green', 'ideas']
Splitting on a Specific Substring
By providing an optional parameter, .split('x') can be used to split a string on a specific substring 'x'. Without 'x' specified, .split() simply splits on all whitespace, as seen above.
>>> mary = 'Mary had a little lamb'
>>> mary.split('a')    # splits on 'a'
['M', 'ry h', 'd ', ' little l', 'mb']
>>> hi = 'Hello mother,\nHello father.'
>>> print(hi)
Hello mother,
Hello father.
>>> hi.split()    # no parameter given: splits on whitespace
['Hello', 'mother,', 'Hello', 'father.']
>>> hi.split('\n')    # splits on '\n' only
['Hello mother,', 'Hello father.']
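One detail worth checking by hand is that .split(' ') is not the same as .split() when spaces are repeated. The short sketch below uses a made-up sample string to compare the two, and also shows the optional second argument (maxsplit) that limits the number of splits.

```python
line = 'Mary  had a   little lamb'  # note the doubled and tripled spaces

by_whitespace = line.split()    # no argument: a run of whitespace is one separator
by_space = line.split(' ')      # explicit ' ': every single space is a separator
first_only = line.split(' ', 1) # maxsplit=1: split once, leave the rest intact

print(by_whitespace)
print(by_space)
print(first_only)
```

With an explicit separator, consecutive separators produce empty strings in the result, which is usually not wanted when splitting a sentence into words.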
String into a List of Characters: list()
But what if you want to split a string into a list of characters? In Python, characters are simply strings of length 1. The list() function turns a string into a list of individual characters, spaces included:
>>> list('hello world')
['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
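For contrast, this small sketch puts list() and .split() side by side on the same string. The variable names are only illustrative.

```python
text = 'hello world'

chars = list(text)    # one item per character, the space included
words = text.split()  # one item per whitespace-separated word

print(len(chars), len(words))
print(all(len(c) == 1 for c in chars))  # every item is a length-1 string
```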
Joining a List of Strings: .join()
If you have a list of words, how do you put them back together into a single string? .join() is the method to use. Called on a "separator" string 'x', 'x'.join(y) joins every element in the list y separated by 'x'. Below, words in mwords are joined back into the sentence string with a space in between:
>>> mwords
['Mary', 'had', 'a', 'little', 'lamb']
>>> ' '.join(mwords)
'Mary had a little lamb'
>>> '--'.join(mwords)
'Mary--had--a--little--lamb'
>>> '\t'.join(mwords)
'Mary\thad\ta\tlittle\tlamb'
>>> print('\t'.join(mwords))
Mary    had     a       little  lamb
>>> hi = 'hello world'
>>> hichars = list(hi)
>>> hichars
['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
>>> ''.join(hichars)
'hello world'
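Combining the two methods gives a common idiom for tidying whitespace: split on whitespace, then join with a single space. The sketch below assumes a hypothetical helper name, normalize_spaces, and a sample string with irregular spacing.

```python
def normalize_spaces(text):
    """Collapse every run of whitespace into a single space (illustrative)."""
    return ' '.join(text.split())


messy = '  colorless   green \n\tideas\n'
clean = normalize_spaces(messy)
print(repr(clean))
```

Note that this round trip does not restore the original string when its spacing was irregular, because .split() discards the exact whitespace it splits on.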