Convert text to dataframe python

code:

import pandas as pd

df = [
    'Timestamp;T;Pressure [bar];Input line pressure [bar];Speed [rpm];Angular Position [degree];Wheel speed [rpm];Wheel angular position [degree];',
    ';1;5,281;5,303;219,727;10,283;216,363;45;',
    ';1;5,273;5,277;219,727;11,602;216,363;45;',
    ';1;5,288;5,293;205,078;12,832;216,363;45;',
    ';1;5,316;5,297;219,727;14,15;216,363;45;',
    ';1;5,314;5,307;219,727;15,469;216,363;45;',
    ';1;5,288;5,3;219,727;16,787;216,363;45;',
    ';1;5,318000000000001;5,31;219,727;18,105;216,363;45;',
    ';1;5,304;5,3;219,727;19,424;216,388;56,25;',
    ';1;5,291;5,29;219,947;20,742;216,388;56,25;',
    ';1;5,316;5,297;219,507;22,061;216,388;56,25;']

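# Split every line on ";" to get a list of rows (the first row is the header).
mat = [n.split(';') for n in df]
print(mat)
newdf1 = pd.DataFrame(mat)
# Promote the first row to column labels, then drop it from the data.
newdf1.columns = newdf1.iloc[0]
newdf1 = newdf1.reindex(newdf1.index.drop(0))
# newdf2 = pd.DataFrame.from_dict(df)
print(newdf1)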

output:

0  Timestamp  T     Pressure [bar] Input line pressure [bar] Speed [rpm]  \
1             1              5,281                     5,303     219,727   
2             1              5,273                     5,277     219,727   
3             1              5,288                     5,293     205,078   
4             1              5,316                     5,297     219,727   
5             1              5,314                     5,307     219,727   
6             1              5,288                       5,3     219,727   
7             1  5,318000000000001                      5,31     219,727   
8             1              5,304                       5,3     219,727   
9             1              5,291                      5,29     219,947   
10            1              5,316                     5,297     219,507   

0  Angular Position [degree] Wheel speed [rpm]  \
1                     10,283           216,363   
2                     11,602           216,363   
3                     12,832           216,363   
4                      14,15           216,363   
5                     15,469           216,363   
6                     16,787           216,363   
7                     18,105           216,363   
8                     19,424           216,388   
9                     20,742           216,388   
10                    22,061           216,388   

0  Wheel angular position [degree]    
1                               45    
2                               45    
3                               45    
4                               45    
5                               45    
6                               45    
7                               45    
8                            56,25    
9                            56,25    
10                           56,25 
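
The list-splitting approach above leaves every value as a string with a decimal comma. As a hedged alternative sketch, the same text can be handed to read_csv() with sep=";" and decimal=",", both of which are documented read_csv() parameters. The names raw_text and parsed are illustrative, and only two of the data rows are repeated here to keep the example short.

```python
import io
import pandas as pd

# Header plus two sample rows copied from the list above, joined into one text blob.
raw_text = "\n".join([
    "Timestamp;T;Pressure [bar];Input line pressure [bar];Speed [rpm];"
    "Angular Position [degree];Wheel speed [rpm];Wheel angular position [degree];",
    ";1;5,281;5,303;219,727;10,283;216,363;45;",
    ";1;5,304;5,3;219,727;19,424;216,388;56,25;",
])

# sep=";" splits the fields; decimal="," makes "5,281" parse as the float 5.281.
parsed = pd.read_csv(io.StringIO(raw_text), sep=";", decimal=",")

# The trailing ";" on every line yields one extra, completely empty column.
# Dropping all-NaN columns removes it (and the empty Timestamp column too).
parsed = parsed.dropna(axis=1, how="all")

print(parsed.dtypes)
print(parsed)
```

With this approach the measurement columns come back as numeric dtypes, so they can be used in calculations without a separate conversion step.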

Load Data From Text File in Pandas

Created: March-19, 2020 | Updated: December-10, 2020

  1. read_csv() Method to Load Data From Text File
  2. read_fwf() Method to Load Width-Formatted Text File to Pandas DataFrame
  3. read_table() Method to Load Text File to Pandas DataFrame

We will introduce the methods for loading data from a text file into a pandas DataFrame, and go through the options they provide.

First, we will create a simple text file called sample.txt and add the following lines to the file:

45 apple orange banana mango
12 orange kiwi onion tomato

We need to save it in the directory from which the Python script will be run.

read_csv() Method to Load Data From Text File

read_csv() is a convenient way to load a text file into a pandas DataFrame. We need to set header=None because the file created above has no header row. Empty fields are read as NaN by default; we can set keep_default_na=False inside the method if we want them kept as empty strings instead.

Example code:

# python 3.x
import pandas as pd
df = pd.read_csv('sample.txt', sep=" ", header=None)
print(df)

Output:

    0       1       2       3       4
0  45   apple  orange  banana   mango
1  12  orange    kiwi   onion  tomato

We set sep=" " because a single space separates the values. Similarly, we can set sep="," to read a comma-separated file. Replace the spaces inside sample.txt with commas, delete onion so that one field is left empty, and then run the code with sep="," in place of sep=" ".

sample.txt:

45,apple,orange,banana,mango
12,orange,kiwi,,tomato

Code:

# python 3.x
import pandas as pd
df = pd.read_csv('sample.txt', sep=",", header=None)
print(df)

Output:

    0       1       2       3       4
0  45   apple  orange  banana   mango
1  12  orange    kiwi     NaN  tomato
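
To make the effect of keep_default_na concrete, here is a small sketch that reads the same two comma-separated lines twice. It uses io.StringIO in place of sample.txt so that it runs on its own; the variable names are illustrative.

```python
import io
import pandas as pd

csv_text = "45,apple,orange,banana,mango\n12,orange,kiwi,,tomato\n"

# Default behaviour: the empty field in the second row is read as NaN.
with_nan = pd.read_csv(io.StringIO(csv_text), header=None)

# keep_default_na=False: the empty field is kept as an empty string.
without_nan = pd.read_csv(
    io.StringIO(csv_text), header=None, keep_default_na=False
)

print(repr(with_nan.iloc[1, 3]))
print(repr(without_nan.iloc[1, 3]))
```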

read_fwf() Method to Load Width-Formatted Text File to Pandas DataFrame

read_fwf() is helpful for loading a fixed-width formatted text file, in which columns are aligned by padding with spaces. A single-character sep does not work here because the number of spaces between values varies. Consider the following text file:

sample.txt:

45 apple  orange banana mango
12 orange kiwi   onion  tomato

In sample.txt, the number of spaces between values is not the same everywhere, so read_fwf() is the right tool here.

Code:

# python 3.x
import pandas as pd
df = pd.read_fwf('sample.txt', header=None)
print(df)

Output:

    0       1       2       3       4
0  45   apple  orange  banana   mango
1  12  orange    kiwi   onion  tomato
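
As a hedged side-by-side sketch, the same aligned text can be read either with read_fwf(), which infers column boundaries by default, or with read_csv() and the regular expression sep=r"\s+", which treats any run of whitespace as one delimiter. The second option is an alternative not covered in the text above, and it only works when no value itself contains a space. io.StringIO stands in for sample.txt.

```python
import io
import pandas as pd

fixed_width_text = (
    "45 apple  orange banana mango\n"
    "12 orange kiwi   onion  tomato\n"
)

# Option 1: read_fwf infers the column boundaries from the aligned text.
inferred = pd.read_fwf(io.StringIO(fixed_width_text), header=None)

# Option 2: split on runs of whitespace instead of on fixed positions.
by_whitespace = pd.read_csv(
    io.StringIO(fixed_width_text), sep=r"\s+", header=None
)

print(inferred)
print(by_whitespace)
```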

read_table() Method to Load Text File to Pandas DataFrame

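read_table() is another way to load data from a text file into a pandas DataFrame. It behaves like read_csv(), except that its default separator is a tab, so we still pass sep=" " for this file.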

sample.txt:

45 apple orange banana mango
12 orange kiwi onion tomato

Code:

# python 3.x
import pandas as pd
df = pd.read_table('sample.txt', header=None, sep=" ")
print(df)

Output:

    0       1       2       3       4
0  45   apple  orange  banana   mango
1  12  orange    kiwi   onion  tomato
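
The difference between read_table() and read_csv() only shows up with tab-separated input. The sketch below uses a made-up tab-separated sample (not the file from this article) to show that read_table() needs no sep argument in that case.

```python
import io
import pandas as pd

# Hypothetical tab-separated sample, used only for this illustration.
tab_text = "45\tapple\torange\n12\torange\tkiwi\n"

# read_table splits on tabs by default, so no sep argument is needed.
from_table = pd.read_table(io.StringIO(tab_text), header=None)

# The equivalent read_csv call has to name the tab delimiter explicitly.
from_csv = pd.read_csv(io.StringIO(tab_text), sep="\t", header=None)

print(from_table)
```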

How do I import a text file into a DataFrame?

Method 1: Using read_csv().
filename.txt: the name of the text file from which we want to read data.
sep: the field separator.
header: an optional field.
names: column names to assign while importing the text file.
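
A minimal sketch of the names argument, assuming the space-separated sample from earlier in this article. The column labels in column_names are hypothetical and chosen only for illustration.

```python
import io
import pandas as pd

text = "45 apple orange banana mango\n12 orange kiwi onion tomato\n"

# Hypothetical column labels; the file itself has no header row.
column_names = ["count", "item_1", "item_2", "item_3", "item_4"]

# Passing names without header tells pandas that the first line is data.
named = pd.read_csv(io.StringIO(text), sep=" ", names=column_names)

print(named)
```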

What are the ways to store text data in pandas?

There are two ways to store text data in pandas: an object-dtype NumPy array and the StringDtype extension type.
You can accidentally store a mixture of strings and non-strings in an object-dtype array.
The object dtype breaks dtype-specific operations like DataFrame...
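
The two storage options can be compared with a short sketch. The Series contents are made up for illustration.

```python
import pandas as pd

# A mix of strings and non-strings is stored with the generic object dtype.
mixed = pd.Series(["apple", 45, None])

# Requesting dtype="string" stores the values with the StringDtype extension type.
text_only = pd.Series(["apple", "kiwi", None], dtype="string")

print(mixed.dtype)
print(text_only.dtype)

# String methods work on the StringDtype Series; the missing value stays missing.
upper = text_only.str.upper()
print(upper)
```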

How do I rename a column in DF?

One way of renaming the columns in a pandas DataFrame is the rename() method. It is useful when only some columns need new names, because we specify a mapping only for the columns to be renamed.
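
A minimal sketch of rename(), using two column labels borrowed from the sensor data at the top of this page. The new label pressure_bar is a hypothetical choice.

```python
import pandas as pd

df = pd.DataFrame({
    "Pressure [bar]": [5.281, 5.273],
    "Speed [rpm]": [219.727, 219.727],
})

# Only the column named in the mapping is renamed; the other is left alone.
renamed = df.rename(columns={"Pressure [bar]": "pressure_bar"})

print(renamed.columns.tolist())
```

rename() returns a new DataFrame by default, so df keeps its original labels.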

What is the use of to_string() in pandas?

DataFrame.to_string() renders a DataFrame as console-friendly tabular output. Its buf parameter is the buffer to write to; if it is None, the output is returned as a string.
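
Both behaviours of buf can be seen in a short sketch. The DataFrame contents are made up for illustration.

```python
import io
import pandas as pd

df = pd.DataFrame({"fruit": ["apple", "kiwi"], "count": [45, 12]})

# With buf left as None, to_string returns the rendered table as a str.
as_text = df.to_string(index=False)
print(as_text)

# With a buffer supplied, the table is written to it and None is returned.
buffer = io.StringIO()
result = df.to_string(buf=buffer, index=False)
print(result is None)
```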