How do I open a text file with a delimiter in Python?

What is the best and easiest way to read a tab-delimited text file in Python? I want to convert the first column of the file into a list, skipping the first line (the header).

import csv
with open ('data.txt', 'r') as f:
    first_row = [column[0] for column in csv.reader(f,delimiter='\t')]
    print (first_row)

The code above returns every element of the first column, including the header. How can I skip the first line (the header)?

asked Jun 12, 2013 at 1:51

lisa


Maybe I'm missing something in the question, but why not just slice off the first element of the list?

import csv
with open ('data.txt', 'r') as f:
    first_column = [row[0] for row in csv.reader(f,delimiter='\t')]
    print (first_column[1:])
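
For a large file, building the whole column and then slicing it creates a throwaway copy. A minimal sketch of an alternative, assuming the first line is always the header: advance the reader once with next() before collecting the column. The in-memory sample and the names sample and first_column are illustrative only and stand in for data.txt.

```python
import csv
import io

# Hypothetical stand-in for data.txt: a header line followed by two records.
sample = io.StringIO("name\tage\nalice\t30\nbob\t25\n", newline='')

reader = csv.reader(sample, delimiter='\t')
next(reader, None)  # consume the header row; the None default avoids StopIteration on an empty file
first_column = [row[0] for row in reader if row]  # blank lines come back as [], so skip them

print(first_column)
```

With a real file, replace the StringIO object with open('data.txt', newline='') inside a with block.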

answered Jun 12, 2013 at 2:04

Dave Costa


With pandas, read_csv treats the first line as the header by default, so once you load the file you can access data by column name. In this example, FirstColName is the name of the first column in the loaded file.

import pandas as pd
import numpy as np

file = pd.read_csv(r"C:\Users\hydro\a.txt", sep='\t')
firstCol = np.asarray(file.FirstColName)
print (firstCol)

answered Dec 12, 2017 at 14:51

Subhashi


Introduction

A tab-delimited file is a well-known and widely used text format for data exchange. By using a structure similar to that of a spreadsheet, it also allows users to present information in a way that is easy to understand and share across applications - including relational database management systems.

The IANA standard for tab-separated values requires the first line of the file to contain the field names. Additionally, other lines (which represent separate records) must have the same number of columns.

Other formats, such as comma-separated values, often pose the challenge of having to escape commas, which are frequent within text (as opposed to tabs).

Opening Files with Python

Before we dive into processing tab-separated values, we will review how to read and write files with Python. The following example uses the open() built-in function to open a file named players.txt located in the current directory:

with open('players.txt') as players_data:
    players_data.read()

The open() function accepts an optional parameter that indicates how the file will be used. If not present, read-only mode is assumed. Other alternatives include, but are not limited to, 'w' (open for writing in truncate mode) and 'a' (open for writing in append mode).

After pressing Enter twice to execute the above suite, we will see tabs (\t) between fields, and new line breaks (\n) as record separators in Fig. 1:

[Fig. 1: interpreter output of players_data.read(), with \t between fields and \n between records]

Although we will be primarily concerned with extracting data from files, we can also write to them. Again, note the use of \n at the beginning to indicate a new record and \t to separate fields:

with open('players.txt', 'a') as players_data:
    players_data.write('\n{}\t{}\t{}\t{}\t{}\t{}\t{}'.format('Trey', 'Burke', '23', '1.85', '2013', '79.4', '23.2'))

Although the format() method helps with readability, the standard library's csv module offers more robust ways to handle both reading and writing. This is particularly important if we are dealing with large files.
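
As a rough sketch of what writing with that module might look like, the example below appends one record with csv.writer instead of a hand-built format string. It assumes each record ends with a newline (rather than starting with one, as in the write() call above), and it uses a hypothetical scratch file in place of players.txt. The lineterminator argument is set explicitly because the writer defaults to '\r\n'.

```python
import csv
import os
import tempfile

# Hypothetical scratch file standing in for players.txt.
path = os.path.join(tempfile.mkdtemp(), 'players.txt')

with open(path, 'a', newline='') as players_data:
    writer = csv.writer(players_data, delimiter='\t', lineterminator='\n')
    # writerow() inserts the tabs and the line terminator for us.
    writer.writerow(['Trey', 'Burke', '23', '1.85', '2013', '79.4', '23.2'])

# Read the file back untranslated to inspect exactly what was written.
with open(path, newline='') as players_data:
    content = players_data.read()

print(repr(content))
```

A side benefit of the writer is that it quotes any field that happens to contain a tab or a newline, which a plain format string would not do.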

Introducing the CSV Module

Although it was named after comma-separated values, the csv module can parse files regardless of the field delimiter - be it tabs, vertical bars, or just about anything else. Additionally, this module provides two classes that read rows into Python dictionaries and write dictionaries out as rows (DictReader and DictWriter, respectively). In this guide we will focus on the former exclusively.

First off, we will import the csv module:

import csv

Next, we will open the file in read-only mode, instantiate a CSV reader object, and use it to read one row at a time:

with open('nba_games_november2018_visitor_wins.txt', newline='') as games:
    game_reader = csv.reader(games, delimiter='\t')
    for game in game_reader:
        print(game)

Although it is not strictly necessary in our case, we will pass newline = '' as an argument to the open() function as per the module documentation. If our file contains newlines inside quoted fields, this ensures that they will be processed correctly.
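
To make the effect of newline='' concrete, here is a small sketch that round-trips a record whose quoted field contains a line break. The file name, column names, and values are made up for illustration. Reading with newline='' returns the field unchanged, while reading with the default setting lets Python translate the embedded '\r\n' into '\n'.

```python
import csv
import os
import tempfile

# Hypothetical scratch file and data; the second field spans two lines.
path = os.path.join(tempfile.mkdtemp(), 'notes.txt')
rows = [['game', 'note'], ['1', 'first line\r\nsecond line']]

with open(path, 'w', newline='') as notes:
    csv.writer(notes, delimiter='\t').writerows(rows)

# newline='' hands the raw line endings to the csv module.
with open(path, newline='') as notes:
    round_trip = list(csv.reader(notes, delimiter='\t'))

# Default newline handling translates '\r\n' to '\n' before csv sees it.
with open(path) as notes:
    translated = list(csv.reader(notes, delimiter='\t'))

print(round_trip[1][1] == rows[1][1])
print(repr(translated[1][1]))
```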

Fig. 2 shows that each row was read into a list after the above suite was executed:

[Fig. 2: each row of the file printed as a list]

Although this undoubtedly looks much better than our previous version where tabs and new lines were mixed with the actual content, there is still room for improvement.

The DictReader Class

To begin, we will create an empty list where we will store each game as a separate dictionary:

games_list = []

Next, we will repeat the same code as above with only a minor change: instead of printing each row, we will append it to games_list. In Python 3.6 and 3.7, DictReader returns each row as an OrderedDict, so dict() is used to turn it into a regular dictionary for better readability and easier manipulation. In Python 3.5 and older, and again in Python 3.8 and newer, rows are already regular dictionaries, so you can omit dict() and use games_list.append(game) instead.

with open('nba_games_november2018_visitor_wins.txt', newline='') as games:
    game_reader = csv.DictReader(games, delimiter='\t')
    for game in game_reader:
        games_list.append(dict(game))

We can go one step further and use a list comprehension to keep only those games where the visitor score was greater than 130. The following statement creates a new list called visitor_big_score_games and populates it with each game inside games_list for which the condition is true:

visitor_big_score_games = [game for game in games_list if int(game['Visitor score']) > 130]
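
The whole pipeline can be tried without the original data file. The sketch below feeds DictReader a small in-memory sample and applies the same filter. Only the 'Visitor score' column name comes from this guide. The other column names, the team names, and the scores are invented for illustration. Note that DictReader returns every value as a string, which is why int() is needed before the comparison.

```python
import csv
import io

# Hypothetical sample; only the 'Visitor score' header is taken from the guide.
sample = io.StringIO(
    "Visitor\tVisitor score\tHome\tHome score\n"
    "Team A\t131\tTeam B\t120\n"
    "Team C\t110\tTeam D\t105\n"
    "Team E\t140\tTeam F\t139\n",
    newline='',
)

# The first line supplies the dictionary keys for every following row.
games_list = [dict(game) for game in csv.DictReader(sample, delimiter='\t')]

visitor_big_score_games = [
    game for game in games_list if int(game['Visitor score']) > 130
]

for game in visitor_big_score_games:
    print(game['Visitor'], game['Visitor score'])
```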

Now that we have a list of dictionaries, we can write it to a spreadsheet as explained in Importing Data from Microsoft Excel Files with Python or manipulate it otherwise. Another option consists of serializing the list with the json module into a plain text file named visitor_big_score_games.json for distribution in JSON format:

import json

with open('visitor_big_score_games.json', 'w') as games:
    json.dump(visitor_big_score_games, games, indent=4)

Note that passing str(visitor_big_score_games) to write() would store Python's own representation of the list, with single-quoted strings, which is not valid JSON. json.dump() produces output that any JSON parser can read.

If you just want to view the list, not turn it into a spreadsheet or a JSON file, you can alternatively use pprint() to display it in a user-friendly format as shown in Fig. 3:

import pprint as pp
pp.pprint(visitor_big_score_games)

[Fig. 3: visitor_big_score_games displayed with pprint()]

As you can see, the possibilities are endless and the only limit is our imagination!

Summary

In this guide we learned how to import and manipulate data from tab-delimited files with Python. This is a highly valuable skill not only for data scientists but also for web developers and other IT professionals.

How do you use a delimiter in a text file?

Field separator character:
Comma-separated values (.csv) – Commas are used to delimit the fields in each record.
Tab-separated values (.tsv) – Tabs are used to delimit the fields in each record.
Text files (.txt) – Commas, tabs, or another field separator character are used to delimit the fields in each record.

How do I read a separated file in Python?

The solution below contains five main steps:
Step 1: Open the text file using the open() function. ...
Step 2: Read through the file one line at a time using a for loop.
Step 3: Split the line into a list. ...
Step 4: Output the content of each field using the print() function.
Step 5: Once the for loop is completed, close the file using the close() method.
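
The five steps above can be sketched roughly as follows, assuming a tab as the delimiter. The scratch file, its contents, and the records list are hypothetical and exist only so the example runs on its own.

```python
import os
import tempfile

# Hypothetical scratch file with two tab-separated records.
path = os.path.join(tempfile.mkdtemp(), 'data.txt')
with open(path, 'w') as f:
    f.write("alice\t30\nbob\t25\n")

records = []
data_file = open(path)  # Step 1: open the file
try:
    for line in data_file:  # Step 2: read one line at a time
        fields = line.rstrip('\n').split('\t')  # Step 3: split the line into a list
        print(fields)  # Step 4: output the fields
        records.append(fields)
finally:
    data_file.close()  # Step 5: close the file, even if an error occurred
```

In everyday code, a with block does the job of the try/finally pair and closes the file automatically.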

How do you change the delimiter in a text file in Python?

How to change the delimiter in a CSV file:
Create a new Python file in the location where your CSV file is saved. ...
Open up an Anaconda Prompt instance. ...
Type python change_delimiter.py (replacing change_delimiter.py with the name of your Python file), then press Enter.
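
The steps above do not show what the script itself contains. One plausible version, offered only as a sketch, reads the rows with one delimiter and writes them back with another using the csv module. The file names input.csv and output.tsv and the sample contents are assumptions made for the example.

```python
import csv
import os
import tempfile

# Hypothetical input and output paths in a scratch directory.
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, 'input.csv')
target_path = os.path.join(workdir, 'output.tsv')

# Create a small comma-separated file; one field contains a quoted comma.
with open(source_path, 'w', newline='') as f:
    f.write('name,city\nalice,"Paris, France"\n')

# Parse with the old delimiter and re-emit every row with the new one.
with open(source_path, newline='') as src, open(target_path, 'w', newline='') as dst:
    reader = csv.reader(src, delimiter=',')
    writer = csv.writer(dst, delimiter='\t', lineterminator='\n')
    writer.writerows(reader)

with open(target_path, newline='') as f:
    converted = f.read()

print(repr(converted))
```

Going through csv.reader and csv.writer, rather than calling str.replace(',', '\t') on the raw text, keeps commas that belong to quoted fields intact.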

How do I use a tab delimiter in Python?

A tab-delimited file uses just two punctuation rules to encode the data:
Each row is delimited by an ordinary newline character. This is usually the standard \n. ...
Within a row, columns are delimited by a single character, often \t.
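
These two rules can be illustrated with plain string methods. The sketch below assumes that no field contains a tab or a newline itself, and the sample rows are invented. When that assumption does not hold, the csv module is the safer choice.

```python
# Hypothetical sample data: a header row and two records.
rows = [['name', 'age'], ['alice', '30'], ['bob', '25']]

# Encode: tabs between columns, newlines between rows.
encoded = '\n'.join('\t'.join(row) for row in rows)

# Decode: apply the two rules in reverse order.
decoded = [line.split('\t') for line in encoded.split('\n')]

print(repr(encoded))
print(decoded == rows)
```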