I am trying to read a macro-enabled Excel worksheet using pandas.read_excel with the xlrd library. It's running fine in local, but when I try to push the same into PCF, I am getting this error:
2020-12-11T21:09:53.441+05:30 [APP/PROC/WEB/0] [ERR] df1=pd.read_excel(os.path.join(APP_PATH, os.path.join("Data", "aug_latest.xlsm")),sheet_name=None) 2020-12-11T21:09:53.441+05:30 [APP/PROC/WEB/0] [ERR] return open_workbook(filepath_or_buffer) 2020-12-11T21:09:53.441+05:30 [APP/PROC/WEB/0] [ERR] File "/home/vcap/deps/0/python/lib/python3.8/site-packages/xlrd/__init__.py", line 170, in open_workbook 2020-12-11T21:09:53.441+05:30 [APP/PROC/WEB/0] [ERR] raise XLRDError(FILE_FORMAT_DESCRIPTIONS[file_format]+'; not supported') 2020-12-11T21:09:53.441+05:30 [APP/PROC/WEB/0] [ERR] xlrd.biffh.XLRDError: Excel xlsx file; not supportedHow can I resolve this error?
asked Dec 11, 2020 at 15:53
1
As noted in the release email, linked to from the release tweet and noted in large orange warning that appears on the front page of the documentation, and less orange, but still present, in the readme on the repository and the release on pypi:
xlrd has explicitly removed support for anything other than xls files.
In your case, the solution is to:
- make sure you are on a recent version of Pandas, at least 1.0.1, and preferably the latest release. 1.2 will make his even clearer.
- install openpyxl: //openpyxl.readthedocs.io/en/stable/
- change your Pandas code to be: df1 = pd.read_excel( os.path.join(APP_PATH, "Data", "aug_latest.xlsm"), engine='openpyxl', )
answered Dec 12, 2020 at 14:49
Chris WithersChris Withers
10.2k4 gold badges30 silver badges47 bronze badges
9
The previous version, xlrd 1.2.0, may appear to work, but it could also expose you to potential security vulnerabilities. With that warning out of the way, if you still want to give it a go, type the following command:
pip install xlrd==1.2.0
answered Dec 11, 2020 at 16:47
tryhardtryhard
2,4811 gold badge2 silver badges3 bronze badges
6
Pandas provide methods to read different file formats using a single line of code. When reading an xlsx file, rd.biffh.XLRDError: Excel xlsx file; not supported error might occur.
You can solve xlrd.biffh.XLRDError: Excel xlsx file; not supported Error by updating the Pandas library version to the latest version.
Now let us see the reason for this error and how to solve it.
Reason For the Error
The Pandas cannot open an Excel (.xlsx) file when you use the read_excel() method available in the Pandas library version earlier than V1.2.0.
Because the versions older than 1.2.0 internally uses the xlrd library to read the excel files. The xlrd library supports ONLY reading the .xls files. It doesn’t support reading the .xlsx or .xlsm files any longer.
Hence, you need to use a different approach to solve this problem.
- Solution 1 — Upgrading Pandas To the Latest Version
- Solution 2 — Using Openpyxl Engine
- Conclusion
- You May Also Like
Solution 1 — Upgrading Pandas To the Latest Version
You can solve the xlrd.biffh.XLRDError: Excel xlsx file; not supported error by upgrading the Pandas version to the latest version.
By default, the latest version uses the openpyxl library. This library supports reading the .xlsx file and .xlsm files.
To know the current Pandas library version, use the below snippet.
Code
pd.__version__Output
'1.1.2'You can upgrade the Pandas library to the latest version using the below statement.
Code
%pip install --upgrade pandasPrefixing the % in PIP lets you update the packages directly from Jupyter.
Output
Successfully installed pandas-1.4.2 Note: you may need to restart the kernel to use updated packages.The Pandas library is upgraded to the latest version, and also the dependent libraries are updated.
With the latest library, you can use the read_excel() method directly to read the excel files.
Use the code below to read the xlsx file or xlsm file.
Code
import pandas as pd df = pd.read_excel( "sample.xlsx” )The file will be read, and the dataframe will be populated. You can print the dataframe to see the values in the excel file.
Code
dfDataframe Will Look Like
NaN | NaN | NaN |
NaN | NaN | NaN |
NaN | NaN | NaN |
NaN | NaN | NaN |
This is how you need to solve the xlrd.biffh.XLRDError: Excel xlsx file; not supported Error.
Solution 2 — Using Openpyxl Engine
You can use the openpyxl engine to read the xlsx file.
This is an alternate way to solve the 8xlrd.biffh.XLRDError: Excel xlsx file; not supported* Error.
If you do not want to upgrade the Pandas library to the latest version, you shall use this solution. However, it is always recommended to use the latest library versions to avoid security threats to your application.
Install the openpyxl library using the below snippet.
Code
%pip install openpyxlPrefixing the % in PIP lets you install the packages directly from Jupyter.
Output
Successfully installed openpyxl v3.0.9 Note: you may need to restart the kernel to use updated packages.Now, specify the openpyxl engine in the pandas read_excel() method to read the xlsx file or an xlsm file.
The code below demonstrates how to specify the openpyxl engine in the read_excel() method.
Code
df = pd.read_excel( "sample.xlsm", engine='openpyxl')The file will be read successfully, and you can print the dataframe to see the values.
dfConclusion
Pandas cannot open an Excel (.xlsx) file using the read_excel() method when you’re using the Pandas version earlier than V1.2.0.
You’ve learned how to solve the xlrd.biffh.XLRDError: Excel xlsx file; not supported error by updating the pandas library or using the alternate engine openpyxl.
It is also recommended to update the libraries to the latest versions to avoid any security threats to your application.
You May Also Like
- How To Solve Truth Value Of A Series Is Ambiguous. Use A.Empty, A.Bool(), A.Item(), A.Any() Or A.All()
- How To Solve Python Pandas Error Tokenizing Data Error?
- How to write pandas dataframe to CSV