How to read a CSV file in Python with pandas


Read CSV Files

A simple way to store large data sets is to use CSV (comma-separated values) files.

CSV files contain plain text in a well-known format that most tools, including Pandas, can read.

In our examples we will be using a CSV file called 'data.csv'.


Example

Load the CSV into a DataFrame:

import pandas as pd

df = pd.read_csv('data.csv')

print(df.to_string())


Tip: use to_string() to print the entire DataFrame.

If the DataFrame has more rows than the display limit, printing it shows only the first 5 rows and the last 5 rows:

Example

Print the DataFrame without the to_string() method:

import pandas as pd

df = pd.read_csv('data.csv')

print(df)



max_rows

The number of rows displayed is controlled by the Pandas display options.

You can check your system's maximum number of rows with the pd.options.display.max_rows option.

Example

Check the number of maximum returned rows:

import pandas as pd

print(pd.options.display.max_rows)


On my system the number is 60 (the pandas default), which means that if the DataFrame contains more than 60 rows, print(df) shows only the headers and the first and last 5 rows.

You can change the maximum number of rows by assigning a new value to the same option.

Example

Increase the maximum number of rows to display the entire DataFrame:

import pandas as pd

pd.options.display.max_rows = 9999

df = pd.read_csv('data.csv')

print(df)
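If the higher limit is only needed for a single print, the change can be scoped instead of set globally. The following is a minimal sketch, assuming a small stand-in DataFrame built in memory rather than data.csv; it uses pd.option_context, which restores the previous setting when the block exits.

```python
import pandas as pd

# Stand-in for data.csv: 100 rows built in memory (an assumption for this sketch).
df = pd.DataFrame({"value": range(100)})

previous = pd.get_option("display.max_rows")

# Inside the block, max_rows=None means "no limit", so every row is rendered.
with pd.option_context("display.max_rows", None):
    full_text = repr(df)

# Outside the block the earlier setting applies again.
truncated_text = repr(df)
restored = pd.get_option("display.max_rows")

print(len(full_text.splitlines()), len(truncated_text.splitlines()))
```

This keeps the rest of the session on the default limit, which is usually what you want when working with large files.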





In this post, we’ll go over how to import a CSV file into Python.


Short Answer

The easiest way to do this:

import pandas as pd
df = pd.read_csv('file_name.csv')
print(df)

If you want to import a subset of columns, simply add usecols=['column_name']:

pd.read_csv('file_name.csv', usecols=['column_name1', 'column_name2'])

If you want to use another separator, simply add sep='\t'. The default separator is ','.

pd.read_csv('file_name.csv', sep='\t')

Recap on Pandas DataFrame

A pandas DataFrame is an Excel-like data structure with labeled axes (rows and columns). Here is an example of a pandas DataFrame that we will use below:

Code to generate DataFrame:
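A minimal stand-in for the sample data, assuming three hypothetical columns (Name, Age, City) whose names and values are illustrative only and not taken from the original example:

```python
import pandas as pd

# Hypothetical sample data; column names and values are illustrative only.
df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob", "Carol"],
        "Age": [34, 27, 45],
        "City": ["Paris", "Lima", "Oslo"],
    }
)

# to_csv() with no path returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a file name to to_csv(), for example df.to_csv('file_name.csv', index=False), would write the same text to disk so it can be read back with read_csv().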

Importing a CSV file into the DataFrame

The pandas read_csv() function imports a CSV file into a DataFrame.

Here are some options:

filepath_or_buffer: this is the file name or file path

pd.read_csv('file_name.csv') # relative path
pd.read_csv('C:/Users/abc/Desktop/file_name.csv') # absolute path

header: specifies which row is used as the column names of the DataFrame. It expects an int or a list of ints.

The default behaves like header=0, which means the first row of the CSV file is treated as the column names.

If your file doesn’t have a header, simply set header=None.

pd.read_csv('file_name.csv', header=None) # no header

With header=None, pandas labels the columns with integers starting from 0 and reads the first line of the file as data.
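The effect of header=None can be checked with a small in-memory file. This is a sketch under the assumption of a two-column CSV without a header row; the values and the labels passed to names are hypothetical.

```python
import io

import pandas as pd

# Hypothetical CSV text with no header row.
csv_text = "Alice,34\nBob,27\n"

# header=None keeps the first line as data and numbers the columns 0, 1, ...
df = pd.read_csv(io.StringIO(csv_text), header=None)
print(df)

# names= can supply column labels when the file has none.
named = pd.read_csv(io.StringIO(csv_text), header=None, names=["Name", "Age"])
print(named)
```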


sep: specifies a custom delimiter for the CSV input. The default is a comma.

pd.read_csv('file_name.csv', sep='\t') # use a tab as the separator
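To see why the separator matters, the sketch below parses the same tab-separated text twice, once with sep='\t' and once with the default comma. The sample text is a hypothetical stand-in for a real file.

```python
import io

import pandas as pd

# Hypothetical tab-separated text standing in for a file on disk.
tsv_text = "Name\tAge\nAlice\t34\nBob\t27\n"

# With sep="\t" the two fields on each line become two columns.
df = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# With the default comma separator each line is read as one single field.
wrong = pd.read_csv(io.StringIO(tsv_text))

print(df.shape, wrong.shape)
```

A DataFrame with a single unexpectedly wide column is a common sign that the separator does not match the file.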

index_col: sets which column(s) to use as the index of the DataFrame. The default value is None, in which case pandas creates a default integer index starting from 0.

It can be set to a column name or a column position, and that column is then used as the index.

pd.read_csv('file_name.csv', index_col='Name') # use the 'Name' column as the index
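A short sketch of the difference, assuming a hypothetical two-column file with a 'Name' column: the default gives an integer index, while index_col makes rows addressable by name.

```python
import io

import pandas as pd

# Hypothetical CSV text; 'Name' and 'Age' are illustrative column names.
csv_text = "Name,Age\nAlice,34\nBob,27\n"

# Default: pandas builds an integer index 0, 1, ...
default_index = pd.read_csv(io.StringIO(csv_text))

# Use a column as the index, selected by name or by position.
by_name = pd.read_csv(io.StringIO(csv_text), index_col="Name")
by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Rows can now be looked up by label.
print(by_name.loc["Bob", "Age"])
```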

nrows: the number of rows to read from the start of the file. It expects an int.
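nrows is handy for previewing a large file without loading all of it. A minimal sketch, assuming a generated 1000-row CSV text in place of a real file:

```python
import io

import pandas as pd

# Generated stand-in for a large file: a header plus 1000 data rows.
csv_text = "id,value\n" + "".join(f"{i},{i * i}\n" for i in range(1000))

# Only the first 3 data rows are parsed; the header row is not counted.
preview = pd.read_csv(io.StringIO(csv_text), nrows=3)
print(preview)
```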

usecols: specifies which columns to import into the DataFrame. It can be a list of int positions or a list of column names.

pd.read_csv('file_name.csv', usecols=[1, 2, 3]) # reads only the columns at positions 1, 2 and 3; the column at position 0 is ignored
pd.read_csv('file_name.csv', usecols=['Name']) # reads only the 'Name' column; other columns are ignored

converters: a dict that maps column names or positions to functions. Each function is applied to the values of its column while the file is read.
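As an illustration, the sketch below cleans two columns during import. The column names, the sample values and the parse_price helper are hypothetical and only serve the example.

```python
import io

import pandas as pd

# Hypothetical CSV text with padded names and prices written with a "$" sign.
csv_text = "Name,Price\n alice ,$1.50\n bob ,$20.00\n"


def parse_price(text):
    """Hypothetical helper: drop the leading "$" and convert to float."""
    return float(text.lstrip("$"))


# Each converter receives the raw string of a cell and returns the parsed value.
df = pd.read_csv(
    io.StringIO(csv_text),
    converters={"Name": str.strip, "Price": parse_price},
)
print(df)
```

Converters receive the raw cell text as a string, so each function has to handle its own parsing.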


na_values: additional strings to treat as missing values (NaN), on top of the ones pandas already recognizes such as empty fields, 'NA' and 'NaN'. A list of strings is the typical input.

pd.read_csv('file_name.csv', na_values=['a', 'b']) # 'a' and 'b' are treated as NaN after importing into the DataFrame
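A small sketch of the effect, assuming a hypothetical file in which missing scores are written as 'missing' or '-':

```python
import io

import pandas as pd

# Hypothetical CSV text using custom markers for missing scores.
csv_text = "Name,Score\nAlice,90\nBob,missing\nCarol,-\n"

# Without na_values the markers are kept as ordinary text.
raw = pd.read_csv(io.StringIO(csv_text))

# With na_values both markers become NaN and the column is numeric.
df = pd.read_csv(io.StringIO(csv_text), na_values=["missing", "-"])

print(raw["Score"].isna().sum(), df["Score"].isna().sum())
```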

How do I read a CSV file in Python using pandas?

Read CSV Files:
Load the CSV into a DataFrame: import pandas as pd, df = pd.read_csv('data.csv') ...
Print the DataFrame without the to_string() method: import pandas as pd ...
Check the number of maximum returned rows: import pandas as pd ...
Increase the maximum number of rows to display the entire DataFrame: import pandas as pd ...

How do I read a CSV file in Python?

Steps to read a CSV file:
Import the csv library: import csv.
Open the CSV file. The built-in open() function opens a file and returns a file object. ...
Use a csv.reader object to read the CSV file: csvreader = csv.reader(file).
Extract the field names. ...
Extract the rows/records. ...
Close the file.
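The steps above can be sketched with the standard library alone. The example writes a small hypothetical file to a temporary directory first so that it has something to open; the file name and contents are assumptions. A with block is used so the file is closed automatically.

```python
import csv
import os
import tempfile

# Write a small hypothetical file so the example has something to open.
path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(path, "w", newline="") as file:
    file.write("Name,Age\nAlice,34\nBob,27\n")

# Open the file; the with block closes it when the block ends.
with open(path, newline="") as file:
    csvreader = csv.reader(file)
    header = next(csvreader)   # field names from the first line
    rows = list(csvreader)     # remaining lines as lists of strings

print(header)
print(rows)
```

Unlike pandas, csv.reader returns every value as a string, so numeric columns need to be converted by hand.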

How do I read specific columns from a CSV file in Python?

How to Read Specific Columns from a CSV File in Python:
Method 1: Using Pandas, with list-based indexing of a DataFrame.
Method 2: Integer-based indexing with iloc.
Method 3: Name-based indexing with loc.
Method 4: Using the csv module.
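A compact sketch of the four approaches side by side, assuming a hypothetical three-column file (Name, Age, City) held in memory:

```python
import csv
import io

import pandas as pd

# Hypothetical CSV text; column names and values are illustrative only.
csv_text = "Name,Age,City\nAlice,34,Paris\nBob,27,Lima\n"
df = pd.read_csv(io.StringIO(csv_text))

# Method 1: list-based indexing with column names.
by_list = df[["Name", "City"]]

# Method 2: integer-based indexing with iloc (all rows, columns 0 and 2).
by_position = df.iloc[:, [0, 2]]

# Method 3: name-based indexing with loc (all rows, named columns).
by_label = df.loc[:, ["Name", "City"]]

# Method 4: the csv module; DictReader keys each row by the header names.
names = [row["Name"] for row in csv.DictReader(io.StringIO(csv_text))]

print(by_list)
print(names)
```

All three pandas variants select from a DataFrame that has already been loaded in full; passing usecols to read_csv() avoids reading the unwanted columns in the first place.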

How do I read a CSV file with pandas in a Jupyter notebook?

Steps to Import a CSV File into Python using Pandas:
Step 1: Capture the file path. First, capture the full path where your CSV file is stored. ...
Step 2: Apply the Python code. ...
Step 3: Run the code. ...
Optional step: Select a subset of columns.
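These steps might look as follows in a notebook cell. This is a sketch only: it creates a hypothetical file_name.csv in a temporary directory so the path is guaranteed to exist, whereas in practice csv_path would point at your own file.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Step 1: capture the full path (here a hypothetical file in a temp directory).
csv_path = Path(tempfile.mkdtemp()) / "file_name.csv"
csv_path.write_text("Name,Age,City\nAlice,34,Paris\nBob,27,Lima\n")

# Steps 2 and 3: read the file; read_csv accepts a str or a Path object.
df = pd.read_csv(csv_path)
print(df)

# Optional step: import only a subset of the columns.
subset = pd.read_csv(csv_path, usecols=["Name", "Age"])
print(subset)
```

Using a Path object or forward slashes avoids the problem of backslashes in Windows paths being read as escape sequences.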