I'm using OpenPyXL to generate some Excel downloads from one of my Django applications. In general, I'm pretty happy with how it works. Setting column widths is easy. Figuring out how to do it from the documentation is hard. So here it is, just in case someone else needs to know (or me, a week from now, when I've forgotten).

For some reason, setting the width from inside the py""" block works:
using PyCall
xl = pyimport("openpyxl")
wb = xl.Workbook()
ws = wb.active
# ws.column_dimensions["A"].width = 25 ## ERROR: KeyError: key "A" not found
py"""
$ws.column_dimensions["A"].width = 25
"""
wb.save("foo.xlsx")
Worksheet objects have row_dimensions and column_dimensions attributes that control row heights and column widths. A sheet's row_dimensions and column_dimensions are dictionary-like values; row_dimensions contains RowDimension objects and column_dimensions contains ColumnDimension objects. In row_dimensions, you can access one of the objects using the number of the row (in this case, 1 or 2). In column_dimensions, you can access one of the objects using the letter of the column (in this case, A or B).

Listing 1. A program that sets the size of cells
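A minimal sketch of such a program, assuming openpyxl is installed. The cell contents, the sizes 70 and 20, and the output file name dimensions.xlsx are illustrative choices, not taken from the original listing.

```python
import os
import tempfile

from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws["A1"] = "Tall row"
ws["B2"] = "Wide column"

# row_dimensions is keyed by row number, column_dimensions by column letter.
ws.row_dimensions[1].height = 70      # row height
ws.column_dimensions["B"].width = 20  # column width

# Illustrative output location; any writable path would do.
out_path = os.path.join(tempfile.mkdtemp(), "dimensions.xlsx")
wb.save(out_path)
```

Opening the saved file in a spreadsheet program should show row 1 taller and column B wider than the defaults.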
First of all, make sure the openpyxl library is installed. You can install it by running the following command in your terminal:
pip install openpyxl

To change or modify column width size

To change a column's width, use the column_dimensions attribute of the worksheet object.
Syntax: worksheet.column_dimensions[column_name].width = size

Let us look at this with the example below. Consider an existing Excel file, codespeedy.xlsx. Now let us change the width of column A:
import openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx")

As you can see, we have changed the width of column A to 20 and saved the modified file as codespeedy1.xlsx.

Similarly, you can change the width of several columns at once, as shown here:
import openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
sheet.column_dimensions['C'].width = 20
sheet.column_dimensions['E'].width = 30
worksheet.save("codespeedy1.xlsx")

Isn't it amazing how such significant changes take only a few short lines of code? That in itself is the beauty of Python.

The pandas I/O API is a set of top level reader functions accessed like pandas.read_csv() that generally return a pandas object. The corresponding writer functions are object methods that are accessed like DataFrame.to_csv(). Below is a table containing available readers and writers.
Format Type   Data Description        Reader          Writer
text          CSV                     read_csv        to_csv
text          Fixed-Width Text File   read_fwf
text          JSON                    read_json       to_json
text          HTML                    read_html       to_html
text          LaTeX                                   Styler.to_latex
text          XML                     read_xml        to_xml
text          Local clipboard         read_clipboard  to_clipboard
binary        MS Excel                read_excel      to_excel
binary        OpenDocument            read_excel
binary        HDF5 Format             read_hdf        to_hdf
binary        Feather Format          read_feather    to_feather
binary        Parquet Format          read_parquet    to_parquet
binary        ORC Format              read_orc        to_orc
binary        Stata                   read_stata      to_stata
binary        SAS                     read_sas
binary        SPSS                    read_spss
binary        Python Pickle Format    read_pickle     to_pickle
SQL           SQL                     read_sql        to_sql
SQL           Google BigQuery         read_gbq        to_gbq

Here is an informal performance comparison for some of these IO methods.

Note: For examples that use the StringIO class, make sure you import it with from io import StringIO for Python 3.

CSV & text files

The workhorse function for reading text files (a.k.a. flat files) is read_csv(). See the cookbook for some advanced strategies.

Parsing options
read_csv() accepts the following common arguments:

Basic

filepath_or_buffer : various
    Either a path to a file (a str, pathlib.Path, or py:py._path.local.LocalPath), URL (including http, ftp, and S3 locations), or any object with a read() method (such as an open file or StringIO).
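As a rough illustration of the input kinds listed for filepath_or_buffer, the sketch below reads the same made-up CSV text once from a StringIO buffer and once from a pathlib.Path. The sample data and the file name sample.csv are assumptions for the example only.

```python
import tempfile
from io import StringIO
from pathlib import Path

import pandas as pd

csv_text = "a,b,c\n1,2,3\n4,5,6\n"  # made-up sample data

# 1. Any object with a read() method, such as StringIO.
df_from_buffer = pd.read_csv(StringIO(csv_text))

# 2. A pathlib.Path (a plain str path is handled the same way).
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "sample.csv"
    csv_path.write_text(csv_text)
    df_from_path = pd.read_csv(csv_path)
```

Both calls should yield the same two-row, three-column DataFrame.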
sep : str, defaults to ',' for read_csv(), '\t' for read_table()
    Delimiter to use. If sep is None, the C engine cannot automatically detect the separator, but the Python parsing engine can, meaning the latter will be used and automatically detect the separator by Python's builtin sniffer tool, csv.Sniffer. In addition, separators longer than 1 character and different from '\s+' will be interpreted as regular expressions and will also force the use of the Python parsing engine. Note that regex delimiters are prone to ignoring quoted data. Regex example: '\\r\\t'.
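To make the sniffing step less abstract, here is a small sketch that uses the standard library's csv.Sniffer directly. The sample text and the candidate delimiter list are made up for the illustration; how pandas itself configures the sniffer internally is not shown here.

```python
import csv

# Made-up sample: three lines separated by semicolons.
sample = "name;city;age\nAn;Hanoi;31\nBinh;Hue;27\n"

# Restrict the candidates so ordinary letters are never mistaken for a delimiter.
dialect = csv.Sniffer().sniff(sample, delimiters=";,\t|")
detected = dialect.delimiter

rows = list(csv.reader(sample.splitlines(), dialect))
```

With pandas, passing sep=None together with engine="python" to read_csv() asks for the same kind of detection.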
delimiter : str, default None
    Alternative argument name for sep.

delim_whitespace : boolean, default False
    Specifies whether or not whitespace (e.g. ' ' or '\t') will be used as the delimiter. Equivalent to setting sep='\s+'. If this option is set to True, nothing should be passed in for the delimiter parameter.
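A brief sketch of whitespace-delimited parsing, written with the sep='\s+' spelling that the text describes as equivalent to delim_whitespace=True. The sample text, which mixes runs of spaces and tabs, is invented for the example.

```python
from io import StringIO

import pandas as pd

# Columns separated by a mix of several spaces and single tabs.
text = "x   y\tz\n1   2\t3\n4   5\t6\n"

# Any run of whitespace is treated as one delimiter.
df = pd.read_csv(StringIO(text), sep=r"\s+")
```

The raw string r"\s+" avoids having to escape the backslash in Python source.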
33 parameterColumn and index locations and names#header int or list of ints, default In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
34Row number(s) to use as the column names, and the start of the data. Default behavior is to infer the column names. if no names are passed the behavior is identical to In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
35 and column names are inferred from the first line of the file, if column names are passed explicitly then the behavior is identical to In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
36. Explicitly pass In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
35 to be able to replace existing namesThe header can be a list of ints that specify row locations for a MultiIndex on the columns e. g. In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
38. Intervening rows that are not specified will be skipped (e. g. 2 in this example is skipped). Note that this parameter ignores commented lines and empty lines if In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
39, so header=0 denotes the first line of data rather than the first line of the filetên dạng mảng, mặc định In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
24List of column names to use. Nếu tệp không chứa hàng tiêu đề, thì bạn nên chuyển rõ ràng In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
36. Bản sao trong danh sách này không được phépindex_col int, str, chuỗi int / str hoặc Sai, tùy chọn, mặc định In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
24(Các) cột để sử dụng làm nhãn hàng của In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
43, được cung cấp dưới dạng tên chuỗi hoặc chỉ mục cột. Nếu một chuỗi int / str được đưa ra, Multi Index được sử dụngGhi chú In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
44 có thể được sử dụng để buộc gấu trúc không sử dụng cột đầu tiên làm chỉ mục, e. g. khi bạn có tệp không đúng định dạng với dấu phân cách ở cuối mỗi dòngGiá trị mặc định của In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
24 hướng dẫn gấu trúc đoán. Nếu số trường trong hàng tiêu đề cột bằng với số trường trong phần thân của tệp dữ liệu thì chỉ mục mặc định được sử dụng. Nếu nó lớn hơn, thì các cột đầu tiên được sử dụng làm chỉ mục sao cho số trường còn lại trong phần nội dung bằng với số trường trong tiêu đềHàng đầu tiên sau tiêu đề được sử dụng để xác định số lượng cột sẽ được đưa vào chỉ mục. If the subsequent rows contain less columns than the first row, they are filled with In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
46Điều này có thể tránh được thông qua In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
47. Điều này đảm bảo rằng các cột được lấy nguyên trạng và dữ liệu theo sau bị bỏ quausecols giống như danh sách hoặc có thể gọi được, mặc định là In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
24Return a subset of the columns. If list-like, all elements must either be positional (i. e. integer indices into the document columns) or strings that correspond to column names provided either by the user in In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
49 or inferred from the document header row(s). Nếu In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
49 được cung cấp, (các) hàng tiêu đề tài liệu không được tính đến. For example, a valid list-like In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
47 parameter would be In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
[0, 1, 2] or ['foo', 'bar', 'baz'].
Element order is ignored, so usecols=[0, 1] is the same as [1, 0]. To instantiate a DataFrame from data with element order preserved, use pd.read_csv(data, usecols=['foo', 'bar'])[['foo', 'bar']] for columns in ['foo', 'bar'] order or pd.read_csv(data, usecols=['foo', 'bar'])[['bar', 'foo']] for ['bar', 'foo'] order.
If callable, the callable function will be evaluated against the column names, returning the names for which the callable evaluates to True.
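The usecols behaviour described above can be sketched as follows. This is a minimal illustration, not taken from the original text: the sample data and the variable names (by_name, reordered, b_columns) are assumptions made up for the example.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data, used only for illustration.
data = "foo,bar,baz\n1,2,3\n4,5,6"

# List-like usecols: the order inside the list is ignored, file order wins.
by_name = pd.read_csv(StringIO(data), usecols=["baz", "foo"])

# Index the result afterwards to get a specific column order.
reordered = pd.read_csv(StringIO(data), usecols=["foo", "baz"])[["baz", "foo"]]

# Callable usecols: keep the columns whose name makes the callable return True.
b_columns = pd.read_csv(StringIO(data), usecols=lambda name: name.startswith("b"))

print(list(by_name.columns))
print(list(reordered.columns))
print(list(b_columns.columns))
```

Indexing after the read is what fixes the column order; usecols itself only selects.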
Using this parameter results in much faster parsing time and lower memory usage when using the C engine. The Python engine loads the data first before deciding which columns to drop.
squeeze boolean, default False
If the parsed data contains only one column, return a Series.
Deprecated since version 1.4.0: Append .squeeze("columns") to the call to read_csv to squeeze the data.
prefix str, default None
Prefix to add to column numbers when there is no header, e.g. 'X' for X0, X1, ...
Deprecated since version 1.4.0: Use a list comprehension on the DataFrame's columns after calling read_csv.
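The two deprecation notes above suggest replacements for squeeze and prefix. A minimal sketch of both follows, assuming small made-up inputs; the variable names single and headerless are illustrative only.

```python
from io import StringIO

import pandas as pd

# Replacement for squeeze=True: read normally, then squeeze the single column
# into a Series.
single = pd.read_csv(StringIO("value\n1\n2\n3")).squeeze("columns")

# Replacement for prefix="X": read without a header, then rename the default
# integer column labels with a list comprehension.
headerless = pd.read_csv(StringIO("1,2,3\n4,5,6"), header=None)
headerless.columns = [f"X{i}" for i in headerless.columns]

print(type(single).__name__)
print(list(headerless.columns))
```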
mangle_dupe_cols boolean, default True
Duplicate columns will be specified as 'X', 'X.1', ..., 'X.N', rather than 'X', ..., 'X'. Passing in False will cause data to be overwritten if there are duplicate names in the columns.
Deprecated since version 1.5.0: The argument was never implemented, and a new argument where the renaming pattern can be specified will be added instead.
General parsing configuration
dtype Type name or dict of column -> type, default None
Data type for data or columns, e.g. {'a': np.float64, 'b': np.int32, 'c': 'Int64'}. Use str or object together with suitable na_values settings to preserve and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.
New in version 1.5.0: Support for defaultdict was added. Specify a defaultdict as input where the default determines the dtype of the columns which are not explicitly listed.
engine {'c', 'python', 'pyarrow'}
Parser engine to use. The C and pyarrow engines are faster, while the python engine is currently more feature-complete. Multithreading is currently only supported by the pyarrow engine.
New in version 1.4.0: The "pyarrow" engine was added as an experimental engine, and some features are unsupported, or may not work correctly, with this engine.
converters dict, default None
Dict of functions for converting values in certain columns. Keys can either be integers or column labels.
true_values list, default None
Values to consider as True.
false_values list, default None
Values to consider as False.
skipinitialspace boolean, default False
Skip spaces after delimiter.
skiprows list-like or integer, default None
Line numbers to skip (0-indexed) or number of lines to skip (int) at the start of the file.
If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise.
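Several of the parsing options above are easier to see side by side. The sketch below is only an illustration under stated assumptions: the data strings and variable names are made up, and the defaultdict form of dtype assumes pandas 1.5 or later, as the note above says.

```python
from collections import defaultdict
from io import StringIO

import pandas as pd

# Hypothetical sample data, used only for illustration.
data = "a,b,c\n1,2,3\n4,5,6"

# defaultdict dtype (pandas >= 1.5): column "a" is read as Int64, every
# column not listed falls back to the default, float64 here.
typed = pd.read_csv(StringIO(data), dtype=defaultdict(lambda: "float64", a="Int64"))

# converters: a per-column function applied to each raw string value.
converted = pd.read_csv(StringIO(data), converters={"b": lambda s: int(s) * 10})

# true_values / false_values: strings to read as booleans.
flags = pd.read_csv(
    StringIO("flag\nyes\nno\nyes"), true_values=["yes"], false_values=["no"]
)

# Callable skiprows: receives the 0-based line index (the header is line 0)
# and returns True for the lines to skip.
kept = pd.read_csv(StringIO("x\n10\n20\n30\n40"), skiprows=lambda i: i in (2, 4))

print(typed.dtypes.to_dict())
print(converted["b"].tolist())
print(flags["flag"].tolist())
print(kept["x"].tolist())
```

The converter and the dtype are deliberately used in separate calls here, since a converter given for a column takes the place of dtype conversion for that column.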
skipfooter int, default 0
Number of lines at bottom of file to skip (unsupported with engine='c').
nrows int, default None
Number of rows of file to read. Useful for reading pieces of large files.
low_memory boolean, default True
Internally process the file in chunks, resulting in lower memory use while parsing, but possibly mixed type inference. To ensure no mixed types, either set False or specify the type with the dtype parameter. Note that the entire file is read into a single DataFrame regardless; use the chunksize or iterator parameter to return the data in chunks. (Only valid with the C parser.)
memory_map boolean, default False
If a filepath is provided for filepath_or_buffer, map the file object directly onto memory and access the data directly from there. Using this option can improve performance because there is no longer any I/O overhead.
NA and missing data handling
na_values scalar, str, list-like, or dict, default None
Additional strings to recognize as NA/NaN. If a dict is passed, specific per-column NA values. See na values const below for a list of the values interpreted as NaN by default.
keep_default_na boolean, default True
Whether or not to include the default NaN values when parsing the data. Depending on whether na_values is passed in, the behavior is as follows:
If keep_default_na is True and na_values are specified, na_values is appended to the default NaN values used for parsing.
If keep_default_na is True and na_values are not specified, only the default NaN values are used for parsing.
If keep_default_na is False and na_values are specified, only the NaN values specified in na_values are used for parsing.
If keep_default_na is False and na_values are not specified, no strings will be parsed as NaN.
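The four keep_default_na / na_values combinations listed above can be checked with a small sketch. The data and the helper na_mask are hypothetical, introduced only for this illustration; "NA" is one of the default NaN markers, while "missing" is not.

```python
from io import StringIO

import pandas as pd

# Hypothetical data: "NA" is a default NaN marker, "missing" is not.
data = "a,b\nNA,1\nmissing,2\nok,3"


def na_mask(**kwargs):
    """Return, per row, whether column 'a' was parsed as NaN."""
    return pd.read_csv(StringIO(data), **kwargs)["a"].isna().tolist()


defaults_only = na_mask()
defaults_plus_extra = na_mask(na_values=["missing"])
extra_only = na_mask(keep_default_na=False, na_values=["missing"])
nothing = na_mask(keep_default_na=False)

print(defaults_only)
print(defaults_plus_extra)
print(extra_only)
print(nothing)
```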
Note that if pip install openpyxl 1210 is passed in as In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
61, the In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
96 and In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
73 parameters will be ignoredna_filter boolean, default In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
Detect missing value markers (empty strings and the value of na_values). In data without any NAs, passing na_filter=False can improve the performance of reading a large file.

verbose : boolean, default False
Indicate the number of NA values placed in non-numeric columns.

skip_blank_lines : boolean, default True
If True, skip over blank lines rather than interpreting them as NaN values.

Datetime handling

parse_dates : boolean or list of ints or names or list of lists or dict, default False
If True -> try parsing the index.
If [1, 2, 3] -> try parsing columns 1, 2, 3 each as a separate date column.
If [[1, 3]] -> combine columns 1 and 3 and parse as a single date column.
If {'foo': [1, 3]} -> parse columns 1, 3 as date and call the result 'foo'.
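The first two forms above can be sketched as follows. This is a minimal illustration, not taken from the original page: the CSV text and the variable names (csv_text, by_index, by_position, by_name) are made up for the example. The column-combining forms (nested list and dict) are left out because their behavior depends on the installed pandas version.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data with three ISO 8601 date columns.
csv_text = (
    "day,start,end,value\n"
    "2021-01-01,2021-01-02,2021-01-03,1\n"
    "2021-02-01,2021-02-02,2021-02-03,2\n"
)

# parse_dates=True: try to parse the index (here the "day" column) as dates.
by_index = pd.read_csv(StringIO(csv_text), index_col=0, parse_dates=True)

# A list of positions: parse each listed column as its own date column.
by_position = pd.read_csv(StringIO(csv_text), parse_dates=[0, 1, 2])

# A list of names works the same way; unlisted columns are left unparsed.
by_name = pd.read_csv(StringIO(csv_text), parse_dates=["start"])

print(by_index.index)
print(by_position.dtypes)
print(by_name.dtypes)
```

Listing only the columns that really hold dates keeps the other columns untouched, which is usually the safer default.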
Note: A fast-path exists for iso8601-formatted dates.

infer_datetime_format : boolean, default False
If True and parse_dates is enabled for a column, attempt to infer the datetime format to speed up the processing.

keep_date_col : boolean, default False
If True and parse_dates specifies combining multiple columns, then keep the original columns.

date_parser : function, default None
Function to use for converting a sequence of string columns to an array of datetime instances. The default uses dateutil.parser.parser to do the conversion. pandas will try to call date_parser in three different ways, advancing to the next if an exception occurs: 1) pass one or more arrays (as defined by parse_dates) as arguments; 2) concatenate (row-wise) the string values from the columns defined by parse_dates into a single array and pass that; and 3) call date_parser once for each row using one or more strings (corresponding to the columns defined by parse_dates) as arguments.

dayfirst : boolean, default False
DD/MM format dates, international and European format.

cache_dates : boolean, default True
If True, use a cache of unique, converted dates to apply the datetime conversion. This may produce a significant speed-up when parsing duplicate date strings, especially ones with timezone offsets.
New in version 0.25.0.

Iteration

iterator : boolean, default False
Return a TextFileReader object for iteration or for getting chunks with get_chunk().

chunksize : int, default None
Return a TextFileReader object for iteration. See iterating and chunking below.

Quoting, compression, and file format

compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', 'zstd', None, dict}, default 'infer'
For on-the-fly decompression of on-disk data. If 'infer', then use gzip, bz2, zip, xz, or zstandard if filepath_or_buffer is path-like ending in '.gz', '.bz2', '.zip', '.xz', '.zst', respectively, and no decompression otherwise. If using 'zip', the ZIP file must contain only one data file to be read in. Set to None for no decompression. Can also be a dict with key 'method' set to one of {'zip', 'gzip', 'bz2', 'zstd'}; other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, or zstandard.ZstdDecompressor. As an example, the following could be passed for faster compression and to create a reproducible gzip archive: compression={'method': 'gzip', 'compresslevel': 1, 'mtime': 1}.
Changed in version 1.1.0: the dict option was extended to support gzip and bz2.
Changed in version 1.2.0: previous versions forwarded dict entries for 'gzip' to gzip.open.

thousands : str, default None
Thousands separator.

decimal : str, default '.'
Character to recognize as the decimal point. E.g. use ',' for European data.

float_precision : string, default None
Specifies which converter the C engine should use for floating-point values. The options are None for the ordinary converter, high for the high-precision converter, and round_trip for the round-trip converter.

lineterminator : str (length 1), default None
Character to break the file into lines. Only valid with the C parser.

quotechar : str (length 1)
The character used to denote the start and end of a quoted item. Quoted items can include the delimiter and it will be ignored.

quoting : int or csv.QUOTE_* instance, default 0
Control field quoting behavior per csv.QUOTE_* constants. Use one of QUOTE_MINIMAL (0), QUOTE_ALL (1), QUOTE_NONNUMERIC (2) or QUOTE_NONE (3).

doublequote : boolean, default True
When quotechar is specified and quoting is not QUOTE_NONE, indicate whether or not to interpret two consecutive quotechar elements inside a field as a single quotechar element.

escapechar : str (length 1), default None
One-character string used to escape the delimiter when quoting is QUOTE_NONE.

comment : str, default None
Indicates that the remainder of the line should not be parsed. If found at the beginning of a line, the line will be ignored altogether. This parameter must be a single character. Like empty lines (as long as skip_blank_lines=True), fully commented lines are ignored by the parameter header but not by skiprows. For example, if comment='#', parsing '#empty\na,b,c\n1,2,3' with header=0 will result in 'a,b,c' being treated as the header.

encoding : str, default None
Encoding to use for UTF when reading/writing (e.g. 'utf-8'). See the list of Python standard encodings.

dialect : str or csv.Dialect instance, default None
If provided, this parameter will override values (default or not) for the following parameters: delimiter, doublequote, escapechar, skipinitialspace, quotechar, and quoting. If it is necessary to override values, a ParserWarning will be issued. See the csv.Dialect documentation for more details.

Error handling

error_bad_lines : boolean, optional, default None
Lines with too many fields (e.g. a csv line with too many commas) will by default cause an exception to be raised, and no DataFrame will be returned. If False, then these "bad lines" will be dropped from the DataFrame that is returned. See bad lines below.
Deprecated since version 1.3.0: the on_bad_lines parameter should be used instead to specify behavior upon encountering a bad line.

warn_bad_lines : boolean, optional, default None
If error_bad_lines is False, and warn_bad_lines is True, a warning for each "bad line" will be output.
Deprecated since version 1.3.0: the on_bad_lines parameter should be used instead to specify behavior upon encountering a bad line.

on_bad_lines : ('error', 'warn', 'skip'), default 'error'
Specifies what to do upon encountering a bad line (a line with too many fields). Allowed values are:
'error', raise a ParserError when a bad line is encountered.
'warn', print a warning when a bad line is encountered and skip that line.
'skip', skip bad lines without raising or warning when they are encountered.
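A minimal sketch of the 'error' and 'skip' values described above, assuming pandas 1.3 or later. The sample text bad_csv and the names raised and skipped are invented for the illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical data: the second data row has four fields instead of three.
bad_csv = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10\n"

# Default behavior (on_bad_lines="error"): a ParserError is raised.
try:
    pd.read_csv(StringIO(bad_csv))
    raised = False
except pd.errors.ParserError:
    raised = True

# on_bad_lines="skip": the offending row is dropped silently.
skipped = pd.read_csv(StringIO(bad_csv), on_bad_lines="skip")

print(raised)
print(skipped)
```

Using 'warn' instead of 'skip' drops the same row but reports it, which is often preferable while a data source is still being cleaned up.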
New in version 1.3.0.

Specifying column data types

You can indicate the data type for the whole DataFrame or for individual columns:

In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
Fortunately, pandas offers more than one way to ensure that your column(s) contain only one dtype. If you're unfamiliar with these concepts, see the pandas documentation on dtypes and on object conversion.

For instance, you can use the converters argument of read_csv().
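The example that accompanied this sentence did not survive extraction, so the following is only a sketch of the idea. The sample column col_1 and its values are assumptions chosen to mix integers, a quoted letter, and a float.

```python
from io import StringIO

import pandas as pd

# Hypothetical column that mixes integers, a quoted string, and a float.
data = "col_1\n1\n2\n'A'\n4.22"

# converters maps a column name to a function applied to every raw value,
# so passing str keeps each cell as the text that appeared in the file.
df = pd.read_csv(StringIO(data), converters={"col_1": str})

# Count the Python types actually stored in the column.
types = df["col_1"].apply(type).value_counts()
print(types)
```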
Or you can use the to_numeric() function to coerce the dtypes after reading in the data, which will convert all valid parsing to floats, leaving the invalid parsing as NaN.
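The original example for this step is also missing; a sketch under the same assumed sample data might look like this. Note that to_numeric() only turns unparseable values into NaN when errors="coerce" is passed.

```python
from io import StringIO

import pandas as pd

# Same hypothetical mixed column as before.
data = "col_1\n1\n2\n'A'\n4.22"

# Read without any dtype hints, then coerce afterwards.
df2 = pd.read_csv(StringIO(data))
df2["col_1"] = pd.to_numeric(df2["col_1"], errors="coerce")

print(df2)
print(df2["col_1"].dtype)
```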
46Cuối cùng, cách bạn xử lý việc đọc trong các cột có chứa các kiểu dữ liệu hỗn hợp tùy thuộc vào nhu cầu cụ thể của bạn. Trong trường hợp trên, nếu bạn muốn In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
46 loại bỏ các điểm bất thường về dữ liệu, thì import openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx") 0513 có lẽ là lựa chọn tốt nhất của bạn. Tuy nhiên, nếu bạn muốn tất cả dữ liệu được ép buộc, bất kể loại nào, thì việc sử dụng đối số import openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx") 0511 của In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
read_csv() would certainly be worth trying.

Note: In some cases, reading in abnormal data with columns containing mixed dtypes will result in an inconsistent dataset. If you rely on pandas to infer the dtypes of your columns, the parsing engine infers the dtypes for different chunks of the data, rather than for the whole dataset at once. Consequently, you can end up with column(s) with mixed dtypes. For example, a large column that holds mostly integers plus a few strings can come back containing an int dtype for certain chunks of the column and str for others, because of the mixed dtypes in the data that was read in. It is important to note that the overall column will be marked with a dtype of object, which is used for columns with mixed dtypes.

Specifying categorical dtype

Categorical columns can be parsed directly by specifying dtype='category' or dtype=CategoricalDtype(categories, ordered).
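As a minimal sketch of the first form, the snippet below parses every column of a small in-memory CSV as a categorical. The sample data and the variable names are assumptions made up for illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data: three small columns.
data = "col1,col2,col3\na,b,1\na,b,2\nc,d,3"

# Parse every column as a categorical.
df = pd.read_csv(StringIO(data), dtype="category")

print(df.dtypes)
print(df["col1"].cat.categories.tolist())
```

Every column, including the numeric-looking col3, ends up with the category dtype.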
Individual columns can be parsed as a Categorical using a dict specification.

Specifying dtype='category' will result in an unordered Categorical whose categories are the unique values observed in the data. For more control over the categories and order, create a CategoricalDtype ahead of time and pass that for that column's dtype.
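The two options can be sketched as follows, assuming the same made-up sample data; the category list and its order are arbitrary choices for illustration.

```python
from io import StringIO

import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical sample data.
data = "col1,col2,col3\na,b,1\na,b,2\nc,d,3"

# Dict specification: only col1 becomes categorical, the rest are inferred.
df_one = pd.read_csv(StringIO(data), dtype={"col1": "category"})

# Explicit categories and order, created ahead of time and passed for col1.
ordered_dtype = CategoricalDtype(["d", "c", "b", "a"], ordered=True)
df_ordered = pd.read_csv(StringIO(data), dtype={"col1": ordered_dtype})

print(df_one.dtypes)
print(df_ordered["col1"].dtype)
```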
When using dtype=CategoricalDtype, "unexpected" values outside of dtype.categories are treated as missing values. This matches the behavior of Categorical.set_categories().

Note: With dtype='category', the resulting categories will always be parsed as strings (object dtype). If the categories are numeric, they can be converted using the to_numeric() function or, as appropriate, another converter such as to_datetime().

When dtype is a CategoricalDtype with homogeneous categories (all numeric, all datetimes, etc.), the conversion is done automatically.
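A short sketch of these three behaviors, again on made-up sample data; all variable names here are hypothetical.

```python
from io import StringIO

import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical sample data.
data = "col1,col2,col3\na,b,1\na,b,2\nc,d,3"

# "c" is not among the declared categories, so it is read as a missing value.
restricted = CategoricalDtype(["a", "b", "d"])
df_missing = pd.read_csv(StringIO(data), dtype={"col1": restricted})

# With dtype="category" the categories of col3 are the strings "1", "2", "3";
# convert them to numbers after reading.
df_cat = pd.read_csv(StringIO(data), dtype="category")
numeric_categories = pd.to_numeric(df_cat["col3"].cat.categories)
df_cat["col3"] = df_cat["col3"].cat.rename_categories(numeric_categories)

# A CategoricalDtype with numeric categories is converted while parsing.
df_auto = pd.read_csv(StringIO(data), dtype={"col3": CategoricalDtype([1, 2, 3])})

print(df_missing["col1"])
print(df_cat["col3"].cat.categories)
print(df_auto["col3"].cat.categories)
```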
Naming and using columns

Handling column names

A file may or may not have a header row. pandas assumes the first row should be used as the column names.

By specifying the names argument in conjunction with header, you can indicate other names to use and whether or not to throw away the header row (if any).

If the header is in a row other than the first, pass the row number to header. This will skip the preceding rows.
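The header handling described above can be sketched as follows. The sample data and the replacement names foo, bar and baz are assumptions for illustration only.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data with a header row.
data = "a,b,c\n1,2,3\n4,5,6\n7,8,9"

# Default: the first row becomes the column names.
df_default = pd.read_csv(StringIO(data))

# names + header=0: use other names and throw away the existing header row.
df_replace = pd.read_csv(StringIO(data), names=["foo", "bar", "baz"], header=0)

# names + header=None: use other names and keep the first row as data.
df_keep = pd.read_csv(StringIO(data), names=["foo", "bar", "baz"], header=None)

# Header on the second line: pass its row number; the line before it is skipped.
late_header = "skip this skip it\na,b,c\n1,2,3\n4,5,6\n7,8,9"
df_late = pd.read_csv(StringIO(late_header), header=1)

print(df_default.columns.tolist(), df_replace.shape, df_keep.shape)
print(df_late.columns.tolist())
```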
Note: The default behavior is to infer the column names. If no names are passed, the behavior is identical to header=0 and the column names are inferred from the first non-blank line of the file; if column names are passed explicitly, the behavior is identical to header=None.

Duplicate names parsing

Deprecated since version 1.5.0: mangle_dupe_cols was never implemented, and a new argument where the renaming pattern can be specified will be added instead.

If the file or header contains duplicate names, pandas will by default distinguish between them so as to prevent overwriting data.
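A minimal sketch of that default, assuming a made-up header in which the name a appears twice:

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data: the column name "a" is repeated.
data = "a,b,a\n0,1,2\n3,4,5"

df = pd.read_csv(StringIO(data))

# The second "a" is renamed so that neither column overwrites the other.
print(df.columns.tolist())
```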
There is no more duplicate data because mangle_dupe_cols=True by default, which modifies a series of duplicate columns 'X', ..., 'X' to become 'X', 'X.1', ..., 'X.N'.

Filtering columns (usecols)

The usecols argument allows you to select any subset of the columns in a file, using column names, position numbers, or a callable.
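The three selection styles can be sketched like this; the sample data and the particular columns chosen are assumptions for illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data.
data = "a,b,c,d\n1,2,3,foo\n4,5,6,bar\n7,8,9,baz"

# Select by column name.
by_name = pd.read_csv(StringIO(data), usecols=["b", "d"])

# Select by zero-based position.
by_position = pd.read_csv(StringIO(data), usecols=[0, 2, 3])

# Select with a callable that receives each column name.
by_callable = pd.read_csv(StringIO(data), usecols=lambda name: name.upper() in ["A", "C"])

print(by_name.columns.tolist())
print(by_position.columns.tolist())
print(by_callable.columns.tolist())
```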
The usecols argument can also be used to specify which columns not to use in the final result. For example, passing the callable lambda x: x not in ["a", "c"] excludes the "a" and "c" columns from the output.

Comments and empty lines

Ignoring line comments and empty lines

If the comment parameter is specified, then completely commented lines will be ignored. By default, completely blank lines will be ignored as well.

If skip_blank_lines=False, then read_csv will not ignore blank lines.
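A small sketch of both settings, using made-up data that contains one commented line and several blank lines:

```python
from io import StringIO

import pandas as pd

# Hypothetical data: a leading blank line, a commented line, and a blank line.
commented = "\na,b,c\n# commented line\n1,2,3\n\n4,5,6"

# Fully commented lines and blank lines are both dropped.
df_clean = pd.read_csv(StringIO(commented), comment="#")

# Hypothetical data with blank lines only.
with_blanks = "a,b,c\n\n1,2,3\n\n\n4,5,6"

# Keep blank lines: each one becomes a row of missing values.
df_blank = pd.read_csv(StringIO(with_blanks), skip_blank_lines=False)

print(df_clean)
print(df_blank)
```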
Warning: The presence of ignored lines might create ambiguities involving line numbers; the header parameter uses row numbers (ignoring commented and empty lines), while skiprows uses line numbers (including commented and empty lines).

If both header and skiprows are specified, header will be relative to the end of skiprows.
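The difference between the two counting schemes can be sketched as follows. The sample strings are assumptions for illustration.

```python
from io import StringIO

import pandas as pd

# header counts rows after commented lines are ignored, so row 1 is "A,B,C".
data_header = "#comment\na,b,c\nA,B,C\n1,2,3"
by_header = pd.read_csv(StringIO(data_header), comment="#", header=1)

# skiprows counts physical lines, so the first two lines are skipped.
data_skip = "A,B,C\n#comment\na,b,c\n1,2,3"
by_skiprows = pd.read_csv(StringIO(data_skip), comment="#", skiprows=2)

# Both together: header is counted from the end of the skipped lines.
data_both = (
    "# empty\n"
    "# second empty line\n"
    "# third emptyline\n"
    "X,Y,Z\n"
    "1,2,3\n"
    "A,B,C\n"
    "1,2.,4.\n"
    "5.,NaN,10.0\n"
)
both = pd.read_csv(StringIO(data_both), comment="#", skiprows=4, header=1)

print(by_header.columns.tolist())
print(by_skiprows.columns.tolist())
print(both)
```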
Comments

Sometimes comments or metadata may be included in a file. By default, the parser includes the comments in the output. We can suppress the comments using the comment keyword.
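A minimal sketch with trailing comments on each data line; the patient records are invented sample data.

```python
from io import StringIO

import pandas as pd

# Hypothetical data with a comment at the end of every data line.
data = (
    "ID,level,category\n"
    "Patient1,123000,x # really unpleasant\n"
    "Patient2,23000,y # wouldn't take his medicine\n"
    "Patient3,1234018,z # awesome\n"
)

# Default: the comment text stays inside the last field.
raw = pd.read_csv(StringIO(data))

# comment="#": everything from "#" to the end of the line is dropped.
clean = pd.read_csv(StringIO(data), comment="#")

print(raw["category"].tolist())
print(clean["category"].str.strip().tolist())
```

The strip call is there because the space before each "#" remains part of the field.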
Dealing with Unicode data

The encoding argument should be used for encoded unicode data, which will result in byte strings being decoded to unicode in the result.
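As a sketch, the bytes below are encoded as Latin-1 and read back with the matching encoding. The sample words are an assumption for illustration.

```python
from io import BytesIO

import pandas as pd

# Hypothetical Latin-1 encoded bytes containing non-ASCII characters.
text = "word,length\nTr\u00e4umen,7\nGr\u00fc\u00dfe,5"
raw_bytes = text.encode("latin-1")

# Tell the parser how the bytes are encoded so they decode correctly.
df = pd.read_csv(BytesIO(raw_bytes), encoding="latin-1")

print(df["word"].tolist())
```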
Some formats that encode all characters as multiple bytes, such as UTF-16, won't parse correctly at all without specifying the encoding. For the available names, see the full list of Python standard encodings.

Index columns and trailing delimiters

If a file has one more column of data than the number of column names, the first column will be used as the DataFrame's row names. Ordinarily, you can achieve this behavior using the index_col option.

There are some exception cases when a file has been prepared with delimiters at the end of each data line, confusing the parser. To explicitly disable the index column inference and discard the last column, pass index_col=False.
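Both situations can be sketched on made-up data: one file with an extra leading column, and one with a trailing delimiter on every data line.

```python
from io import StringIO

import pandas as pd

# Hypothetical data: three names but four fields per data line.
extra_column = "a,b,c\n4,apple,bat,5.7\n8,orange,cow,10"

# The first field of each line is used as the row label.
implicit = pd.read_csv(StringIO(extra_column))

# Hypothetical data: each data line ends with a stray delimiter.
trailing = "a,b,c\n4,apple,bat,\n8,orange,cow,"

# Disable index inference and discard the empty last column.
fixed = pd.read_csv(StringIO(trailing), index_col=False)

print(implicit)
print(fixed)
```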
If a subset of data is being parsed using the usecols option, the index_col specification is based on that subset, not the original data.
Date Handling

Specifying date columns

To better facilitate working with datetime data, read_csv() uses the keyword arguments parse_dates and date_parser to allow users to specify a variety of columns and date/time formats to turn the input text data into datetime objects.

The simplest case is to just pass in parse_dates=True.
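A minimal sketch of the simplest case, assuming made-up data whose first column holds dates and is used as the index:

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data: the first column holds dates.
data = "date,A,B,C\n20090101,a,1,2\n20090102,b,3,4\n20090103,c,4,5"

# Use the first column as the index and parse it as dates.
df = pd.read_csv(StringIO(data), index_col=0, parse_dates=True)

print(df.index)
```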
It is often the case that we may want to store date and time data separately, or store various date fields separately. The parse_dates keyword can be used to specify a combination of columns to parse the dates and/or times from.

You can specify a list of column lists to parse_dates; the resulting date columns will be prepended to the output (so as to not affect the existing column order) and the new column names will be the concatenation of the component column names.

By default the parser removes the component date columns, but you can choose to retain them via the keep_date_col keyword.

Note that if you wish to combine multiple columns into a single date column, a nested list must be used. In other words, parse_dates=[1, 2] indicates that the second and third columns should each be parsed as separate date columns, while parse_dates=[[1, 2]] means the two columns should be parsed into a single column.

You can also use a dict to specify custom names for the resulting columns.

It is important to remember that if multiple text columns are to be parsed into a single date column, then a new column is prepended to the data. The index_col specification is based on this new set of columns rather than the original data columns.
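Combining columns through parse_dates is deprecated in recent pandas releases (2.2 and later), so the sketch below does not use the nested-list form. It swaps in a different technique that gives a similar result on any version: read the component columns as text and combine them with pd.to_datetime. The column names and sample values are assumptions.

```python
from io import StringIO

import pandas as pd

# Hypothetical data with the date and the time stored in separate columns.
data = "date,time,value\n1999-01-27,19:00:00,0.81\n1999-01-27,20:00:00,0.01\n"

# Read the components as plain text so nothing is parsed prematurely.
df = pd.read_csv(StringIO(data), dtype={"date": str, "time": str})

# Combine the text columns and prepend the parsed result.
combined = pd.to_datetime(df["date"] + " " + df["time"])
df.insert(0, "date_time", combined)

# Drop the component columns, mirroring the parser's default behavior.
df = df.drop(columns=["date", "time"])

print(df)
```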
Note: If a column or index contains an unparsable date, the entire column or index will be returned unaltered as an object data type. For non-standard datetime parsing, use to_datetime() after pd.read_csv.

Note: read_csv has a fast_path for parsing datetime strings in iso8601 format, e.g. "2000-01-01T00:01:02+00:00" and similar variations. If you can arrange for your data to store datetimes in this format, load times will be significantly faster; ~20x has been observed.

Date parsing functions

Finally, the parser allows you to specify a custom date_parser function to take full advantage of the flexibility of the date parsing API.

pandas will try to call the date_parser function in three different ways. If an exception is raised, the next one is tried:

1. date_parser is first called with one or more arrays as arguments, as defined using parse_dates (e.g., date_parser(['2013', '2013'], ['1', '2'])).
2. If #1 fails, date_parser is called with all the columns concatenated row-wise into a single array (e.g., date_parser(['2013 1', '2013 2'])).
3. If #2 fails, date_parser is called once for each row, with one or more strings (corresponding to the columns defined by parse_dates) as arguments.
Note that performance-wise, you should try these methods of parsing dates in order:

1. Try to infer the format using infer_datetime_format=True (see the section below).
2. If you know the format, use pd.to_datetime(): date_parser=lambda x: pd.to_datetime(x, format=...).
3. If you have a really non-standard format, use a custom date_parser function. For optimal performance, this should be vectorized, i.e., it should accept arrays as arguments.
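As a rough illustration of the "known format" option, the sketch below reads the column as plain text and converts it afterwards with pd.to_datetime and an explicit format. The sample data and column names are made up, and converting after the read (instead of passing date_parser) is a simplification chosen here because it should behave the same across pandas versions.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: dates written as day.month.year, which is not ISO 8601.
data = "when,value\n30.12.2011,1\n31.12.2011,2"

df = pd.read_csv(StringIO(data))

# One vectorized call with a known format avoids per-row guessing.
df["when"] = pd.to_datetime(df["when"], format="%d.%m.%Y")

print(df)
```

Because the whole column is converted in a single call, this also follows the advice about keeping the parsing vectorized.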
Parsing a CSV with mixed timezones

pandas cannot natively represent a column or index with mixed timezones. If your CSV file contains columns with a mixture of timezones, the default result will be an object-dtype column with strings, even with parse_dates.

To parse the mixed-timezone values as a datetime column, pass a partially-applied to_datetime() with utc=True as the date_parser.
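A minimal sketch of the mixed-timezone case, assuming made-up sample data. It applies pd.to_datetime with utc=True after the read rather than through date_parser; that is a substitution on my part, chosen because the post-read conversion does not depend on which pandas version is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: the same wall-clock time in two different UTC offsets.
content = "a\n2000-01-01T00:00:00+05:00\n2000-01-01T00:00:00+06:00"

df = pd.read_csv(StringIO(content))

# utc=True converts every value to UTC, giving one timezone-aware column.
df["a"] = pd.to_datetime(df["a"], utc=True)

print(df["a"])
```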
Inferring datetime format

If you have parse_dates enabled for some or all of your columns, and your datetime strings are all formatted the same way, you may get a large speed up by setting infer_datetime_format=True. If set, pandas will attempt to guess the format of your datetime strings, and then use a faster means of parsing the strings. 5-10x parsing speeds have been observed. pandas will fall back to the usual parsing if either the format cannot be guessed or the format that was guessed cannot properly parse the entire column of strings. So in general, infer_datetime_format should not have any negative consequences if enabled.

Examples of datetime strings that can be guessed include "20111230", "2011/12/30" and "12/30/2011 00:00:00" (all representing December 30th, 2011 at 00:00:00).

Note that infer_datetime_format is sensitive to dayfirst. With dayfirst=True, it will guess "01/12/2011" to be December 1st. With dayfirst=False (default) it will guess "01/12/2011" to be January 12th.
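The effect of dayfirst on an ambiguous string can be checked directly with pd.to_datetime. This is only a small sketch using the "01/12/2011" string from the text; the variable names are my own.

```python
import pandas as pd

ambiguous = "01/12/2011"

# dayfirst=True reads the string as day/month/year.
day_first = pd.to_datetime(ambiguous, dayfirst=True)

# dayfirst=False (the default) reads it as month/day/year.
month_first = pd.to_datetime(ambiguous, dayfirst=False)

print(day_first.date(), month_first.date())
```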
International date formats

While US date formats tend to be MM/DD/YYYY, many international formats use DD/MM/YYYY instead. For convenience, a dayfirst keyword is provided.
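Here is a short sketch of the dayfirst keyword on read_csv, assuming a made-up three-row file whose first column holds day/month/year dates.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: "1/6/2000" is 1 June when read day-first.
data = "date,value,cat\n1/6/2000,5,a\n2/6/2000,10,b\n3/6/2000,15,c"

# Default: month comes first, so the first row is 6 January 2000.
month_first = pd.read_csv(StringIO(data), parse_dates=[0])

# dayfirst=True: the first row is 1 June 2000.
day_first = pd.read_csv(StringIO(data), parse_dates=[0], dayfirst=True)

print(month_first["date"][0], day_first["date"][0])
```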
Writing CSVs to binary file objects

New in version 1.2.0.

df.to_csv(..., mode="wb") allows writing a CSV to a file object opened in binary mode. In most cases, it is not necessary to specify mode, as pandas will auto-detect whether the file object is opened in text or binary mode.

Specifying method for floating-point conversion

The parameter float_precision can be specified in order to use a specific floating-point converter during parsing with the C engine. The options are the ordinary converter, the high-precision converter, and the round-trip converter (which is guaranteed to round-trip values after writing to a file).

Thousand separators

For large numbers that have been written with a thousands separator, you can set the thousands keyword to a string of length 1 so that integers will be parsed correctly. By default, numbers with a thousands separator will be parsed as strings.
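To make the difference concrete, here is a small sketch that reads the same made-up pipe-separated data twice, once with the defaults and once with thousands=",".

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: the "level" column uses commas as thousands separators.
data = "ID|level|category\nPatient1|123,000|x\nPatient2|23,000|y\nPatient3|1,234,018|z"

# Default: the separators keep the column from being parsed as numbers.
as_text = pd.read_csv(StringIO(data), sep="|")

# With thousands=",", the separators are stripped and the column becomes integers.
as_int = pd.read_csv(StringIO(data), sep="|", thousands=",")

print(as_text["level"].tolist())
print(as_int["level"].tolist())
```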
The thousands keyword allows integers to be parsed correctly.

NA values

To control which values are parsed as missing values (which are signified by NaN), specify a string in na_values. If you specify a list of strings, then all values in it are considered to be missing values. If you specify a number (a float, like 5.0, or an integer, like 5), the corresponding equivalent values will also imply a missing value (in this case effectively [5.0, 5] are recognized as NaN).

To completely override the default values that are recognized as missing, specify keep_default_na=False.

The default NaN recognized values include the empty string and markers such as "#N/A", "N/A", "NA", "NULL", "null", "NaN" and "nan".

Let us consider some examples:

With na_values=[5], 5 and 5.0 will be recognized as NaN, in addition to the defaults. A string will first be interpreted as a numerical 5, then as a NaN.

With keep_default_na=False and na_values=[""], only an empty field will be recognized as NaN.

With keep_default_na=False and na_values=["NA", "0"], both NA and 0 as strings are NaN.

With na_values=["Nope"], the default values, in addition to the string "Nope", are recognized as NaN.
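The sketch below exercises na_values and keep_default_na on a tiny made-up table. The data and variable names are hypothetical; the calls themselves are ordinary read_csv keywords.

```python
from io import StringIO

import pandas as pd

# Hypothetical data mixing a number, default NA markers and a custom marker.
data = "a,b,c\n5,NA,x\n6,0,Nope\n7,,z"

# Defaults plus the number 5: "5", "NA" and the empty field become NaN.
with_defaults = pd.read_csv(StringIO(data), na_values=[5])

# Defaults switched off: only "Nope" is treated as missing.
only_custom = pd.read_csv(
    StringIO(data), keep_default_na=False, na_values=["Nope"]
)

print(with_defaults)
print(only_custom)
```

Note how "NA" and the empty field survive as ordinary strings in the second read, because the default markers were turned off.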
Infinity

inf-like values will be parsed as np.inf (positive infinity), and -inf as -np.inf (negative infinity). These will ignore the case of the value, meaning Inf will also be parsed as np.inf.
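A quick check of the infinity handling, using a made-up single-column input:

```python
from io import StringIO

import numpy as np
import pandas as pd

# Hypothetical sample: two spellings of positive infinity and one negative.
data = "x\ninf\n-inf\nInf\n1.5"

df = pd.read_csv(StringIO(data))

print(df["x"].tolist())
```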
Returning Series

Using the squeeze keyword, the parser will return output with a single column as a Series.

Deprecated since version 1.4.0: Users should append .squeeze("columns") to the DataFrame returned by read_csv instead.

Boolean values

The common values True, False, TRUE, and FALSE are all recognized as boolean. Occasionally you might want to recognize other values as being boolean. To do this, use the true_values and false_values options, e.g. true_values=["Yes"] together with false_values=["No"].

Handling "bad" lines

Some files may have malformed lines with too few fields or too many. Lines with too few fields will have NA values filled in the trailing fields. Lines with too many fields will raise an error by default.

You can elect to skip bad lines.
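The following sketch shows both behaviours on made-up data: the default error and the skip option. It assumes a pandas version that has the on_bad_lines keyword (1.3 or later, as far as I know).

```python
from io import StringIO

import pandas as pd

# Hypothetical data: the third line has four fields instead of three.
data = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10"

try:
    pd.read_csv(StringIO(data))
    error_message = None
except pd.errors.ParserError as exc:
    # Default behaviour: the extra field aborts the whole read.
    error_message = str(exc)

# on_bad_lines="skip" drops the malformed line and keeps the rest.
skipped = pd.read_csv(StringIO(data), on_bad_lines="skip")

print(error_message)
print(skipped)
```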
Or pass a callable function to handle the bad line if engine="python". The bad line will be a list of strings that was split by the sep.
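A sketch of the callable variant, assuming pandas 1.4 or later and made-up data. The helper keep_first_three is a hypothetical name; it records each bad line and returns a trimmed version so the row is kept.

```python
from io import StringIO

import pandas as pd

# Hypothetical data: the third line has one field too many.
data = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10"

seen = []


def keep_first_three(bad_line):
    """Record the bad line, then keep only its first three fields."""
    seen.append(bad_line)
    return bad_line[:3]


df = pd.read_csv(StringIO(data), on_bad_lines=keep_first_three, engine="python")

print(seen)
print(df)
```

Returning None from the callable instead would drop the line entirely.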
You can also use the usecols parameter to eliminate extraneous column data that appear in some lines but not others.
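For instance, with the same kind of made-up data, restricting the read to the first three columns lets the over-long line through with its extra field ignored:

```python
from io import StringIO

import pandas as pd

# Hypothetical data: the third line carries an extra fourth field.
data = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10"

# Only columns 0, 1 and 2 are kept, so the stray "7" is discarded.
df = pd.read_csv(StringIO(data), usecols=[0, 1, 2])

print(df)
```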
In case you want to keep all data including the lines with too many fields, you can specify a sufficient number of names. This ensures that lines with not enough fields are filled with NaN.
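A small sketch of the names approach on made-up data. Note that, because names is given without a header argument, the original header line is read as an ordinary data row here.

```python
from io import StringIO

import pandas as pd

# Hypothetical data: only the third line has a fourth field.
data = "a,b,c\n1,2,3\n4,5,6,7\n8,9,10"

# Four names make room for the longest line; shorter lines get NaN in "d".
df = pd.read_csv(StringIO(data), names=["a", "b", "c", "d"])

print(df)
```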
Dialect

The dialect keyword gives greater flexibility in specifying the file format. By default it uses the Excel dialect but you can specify either the dialect name or a csv.Dialect instance.

Suppose you had data with unenclosed quotes. By default, read_csv uses the Excel dialect and treats the double quote as the quote character, which causes it to fail when it finds a newline before it finds the closing double quote.

We can get around this using dialect.
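A minimal sketch of the dialect workaround, assuming made-up data with one unclosed quote. It starts from the standard library's csv.excel dialect and switches quote handling off.

```python
import csv
from io import StringIO

import pandas as pd

# Hypothetical data: the quote opened before "a" is never closed.
data = 'label1,label2,label3\nindex1,"a,c,e\nindex2,b,d,f'

# Start from the Excel dialect but turn quote handling off.
dialect = csv.excel()
dialect.quoting = csv.QUOTE_NONE

df = pd.read_csv(StringIO(data), dialect=dialect)

print(df)
```

Since each data row has one more field than the header, pandas uses the first field as the index, and the stray quote stays in the value as a literal character.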
All of the dialect options can be specified separately by keyword arguments.

Another common dialect option is skipinitialspace, to skip any whitespace after a delimiter.
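As a quick sketch with made-up data, compare the column names with and without skipinitialspace:

```python
from io import StringIO

import pandas as pd

# Hypothetical data: every delimiter is followed by a space.
data = "a, b, c\n1, 2, 3\n4, 5, 6"

# Default: the leading spaces become part of the column names.
plain = pd.read_csv(StringIO(data))

# skipinitialspace=True drops the whitespace after each delimiter.
trimmed = pd.read_csv(StringIO(data), skipinitialspace=True)

print(list(plain.columns), list(trimmed.columns))
```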
The parsers make every attempt to "do the right thing" and not be fragile. Type inference is a pretty big deal. If a column can be coerced to integer dtype without altering the contents, the parser will do so. Any non-numeric columns will come through as object dtype as with the rest of pandas objects.

Quoting and Escape Characters

Quotes (and other escape characters) in embedded fields can be handled in any number of ways. One way is to use backslashes; to properly parse this data, you should pass the escapechar option.
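A short sketch of the backslash case, using a made-up quoted field that contains escaped double quotes:

```python
from io import StringIO

import pandas as pd

# Hypothetical data: the quoted field contains backslash-escaped quotes.
data = 'a,b\n"hello, \\"Bob\\", nice to see you",5'

# escapechar tells the parser that a backslash escapes the next character.
df = pd.read_csv(StringIO(data), escapechar="\\")

print(df["a"][0])
```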
Files with fixed width columns

While read_csv() reads delimited data, the read_fwf() function works with data files that have known and fixed column widths. The function parameters to read_fwf are largely the same as read_csv with two extra parameters, and a different usage of the delimiter parameter:

- colspecs: A list of pairs (tuples) giving the extents of the fixed-width fields of each line as half-open intervals (i.e., [from, to)). String value 'infer' can be used to instruct the parser to try detecting the column specifications from the first 100 rows of the data. Default behavior, if not specified, is to infer.
- widths: A list of field widths which can be used instead of 'colspecs' if the intervals are contiguous.
- delimiter: Characters to consider as filler characters in the fixed-width file. Can be used to specify the filler character of the fields if it is not spaces (e.g., '~').
Consider a typical fixed-width data file. In order to parse this file into a DataFrame, we simply need to supply the column specifications to the read_fwf function along with the file name.
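The sketch below uses an in-memory string in place of a file name, with made-up rows laid out so that the fields sit at character positions 0-5, 8-13 and 16-21. The colspecs pairs are half-open, so each end value is one past the last character.

```python
from io import StringIO

import pandas as pd

# Hypothetical fixed-width rows: a 6-character id and two 6-character numbers,
# separated by two spaces each.
data = (
    "id8141  360.50  149.25\n"
    "id1594  444.25  166.75\n"
    "id1849  364.00  183.50"
)

# Half-open intervals [from, to) for the three fields.
colspecs = [(0, 6), (8, 14), (16, 22)]

df = pd.read_fwf(StringIO(data), colspecs=colspecs, header=None)

print(df)
```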
worksheet.save("codespeedy1.xlsx") 241Note how the parser automatically picks column names X. when In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
36 argument is specified. Alternatively, you can supply just the column widths for contiguous columns: import openpyxl
The parser will take care of extra white spaces around the columns, so it's ok to have extra separation between the columns in the file. By default, read_fwf will try to infer the file's colspecs by using the first 100 rows of the file. It can do it only in cases when the columns are aligned and correctly separated by the provided delimiter (default delimiter is whitespace).
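The inference can be sketched by omitting both colspecs and widths. The sample below is hypothetical and is aligned on whitespace, which is the case the inference is described as handling.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample whose columns are aligned and separated by spaces.
data = (
    "id8141  360.24  149.91\n"
    "id1594  444.95  166.98\n"
    "id1849  364.13  183.62\n"
)

# No colspecs or widths given: read_fwf infers the column boundaries.
df = pd.read_fwf(StringIO(data), header=None)
print(df)
```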
read_fwf supports the dtype parameter for specifying the types of parsed columns to be different from the inferred type.
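A minimal sketch of overriding an inferred type, again on a hypothetical sample. The column key 2 is the integer label assigned when header=None is used.

```python
from io import StringIO

import pandas as pd

# Hypothetical fixed-width sample (illustrative values only).
data = (
    "id8141  360.24  149.91\n"
    "id1594  444.95  166.98\n"
)

# Default behaviour: the last column is inferred as float64.
inferred = pd.read_fwf(StringIO(data), header=None)

# Forcing column 2 to object keeps the raw text instead of converting it.
forced = pd.read_fwf(StringIO(data), header=None, dtype={2: "object"})

print(inferred.dtypes)
print(forced.dtypes)
```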
Indexes
Files with an “implicit” index column
Consider a file with one less entry in the header than the number of data columns. In this special case, read_csv assumes that the first column is to be used as the index of the DataFrame. Note that the dates aren't automatically parsed. In that case you would need to do as before and request date parsing explicitly, for example with parse_dates=True.
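The implicit index case can be sketched as follows. The CSV text is a made-up sample whose header has three names while each data row has four fields.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: three header entries, four fields per data row.
data = "A,B,C\n20090101,a,1,2\n20090102,b,3,4\n20090103,c,4,5"

# The extra leading field is used as the index; it stays an integer here.
plain = pd.read_csv(StringIO(data))

# Asking for date parsing converts the index to datetimes.
dated = pd.read_csv(StringIO(data), parse_dates=True)

print(plain.index)
print(dated.index)
```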
Reading an index with a MultiIndex
Suppose you have data indexed by two columns. The index_col argument to read_csv can take a list of column numbers to turn multiple columns into a MultiIndex for the index of the returned object.
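A small sketch of a two-column index, using invented column names and values:

```python
from io import StringIO

import pandas as pd

# Hypothetical data indexed by the first two columns.
data = (
    "year,indiv,zit,xit\n"
    "1977,A,1.2,0.6\n"
    "1977,B,1.5,0.5\n"
    "1978,A,0.2,0.06\n"
    "1978,B,0.7,0.2\n"
)

# Columns 0 and 1 together become a MultiIndex on the rows.
df = pd.read_csv(StringIO(data), index_col=[0, 1])

print(df)
print(df.loc[1977])  # select every row for one value of the outer level
```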
Reading columns with a MultiIndex
By specifying a list of row locations for the header argument, you can read in a MultiIndex for the columns. Specifying non-consecutive rows will skip the intervening rows.
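As a hedged illustration, the sketch below reads two header rows into column levels. The labels are invented and no index column is used.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: rows 0 and 1 both carry column labels.
data = "a,a,b\nq,r,s\n1,2,3\n4,5,6\n"

# header=[0, 1] combines the two rows into a two-level column MultiIndex.
df = pd.read_csv(StringIO(data), header=[0, 1])

print(df.columns.tolist())
print(df[("a", "r")].tolist())
```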
read_csv is also able to interpret a more common format of multi-columns indices.
Note: If an index_col is not specified (e.g. you don't have an index, or wrote it with df.to_csv(..., index=False)), then any names on the columns index will be lost.
Automatically “sniffing” the delimiter
read_csv is capable of inferring delimited (not necessarily comma-separated) files, as pandas uses the csv.Sniffer class of the csv module. For this, you have to specify sep=None.
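A minimal sketch of delimiter sniffing on a colon-separated sample. The data is made up, and engine="python" is passed explicitly because sep=None is handled by the Python parser.

```python
from io import StringIO

import pandas as pd

# Hypothetical colon-delimited sample.
data = "a:b:c\n1:2:3\n4:5:6\n"

# sep=None asks pandas to detect the delimiter with csv.Sniffer.
df = pd.read_csv(StringIO(data), sep=None, engine="python")

print(df)
```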
Reading multiple files to create a single DataFrame
It's best to use concat() to combine multiple files. See the cookbook for an example.
Iterating through files chunk by chunk
Suppose you wish to iterate through a (potentially very large) file lazily rather than reading the entire file into memory.
By specifying a chunksize to read_csv, the return value will be an iterable object of type TextFileReader.
Changed in version 1.2: read_csv/json/sas return a context-manager when iterating through a file.
Specifying iterator=True will also return the TextFileReader object.
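Chunked reading can be sketched like this, assuming pandas 1.2 or later so that the reader can be used as a context manager. The ten-row sample is generated in memory purely for illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical ten-row CSV built in memory.
data = "a,b\n" + "".join(f"{i},{i * 2}\n" for i in range(10))

# With chunksize set, read_csv returns a TextFileReader instead of a DataFrame.
with pd.read_csv(StringIO(data), chunksize=4) as reader:
    # Each iteration yields a DataFrame of at most four rows.
    sizes = [len(chunk) for chunk in reader]

print(sizes)
```

Only one chunk is held in memory at a time, which is the point of the technique for large files.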
Specifying the parser engine
Pandas currently supports three engines: the C engine, the python engine, and an experimental pyarrow engine (requires the pyarrow package). In general, the pyarrow engine is fastest on larger workloads and is equivalent in speed to the C engine on most other workloads. The python engine tends to be slower than the pyarrow and C engines on most workloads. However, the pyarrow engine is much less robust than the C engine, which lacks a few features compared to the Python engine.
Where possible, pandas uses the C parser (specified as engine="c"), but it may fall back to Python if C-unsupported options are specified.
Currently, options unsupported by the C and pyarrow engines include:
sep other than a single character (e.g. regex separators)
skipfooter
sep=None with delim_whitespace=False
Specifying any of the above options will produce a ParserWarning unless the python engine is selected explicitly using engine="python".
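For instance, a regular-expression separator is one of the options the C engine does not handle. The sketch below selects the Python engine explicitly so that no fallback warning is involved; the pipe-delimited sample is an assumption.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample with a pipe delimiter surrounded by optional spaces.
data = "a | b | c\n1 | 2 | 3\n4 | 5 | 6\n"

# A multi-character regex separator requires the Python engine.
df = pd.read_csv(StringIO(data), sep=r"\s*\|\s*", engine="python")

print(df)
```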
Options that are unsupported by the pyarrow engine which are not covered by the list above include float_precision, chunksize, comment, nrows, thousands, memory_map, dialect, warn_bad_lines, error_bad_lines, on_bad_lines, delim_whitespace, quoting, lineterminator, converters, decimal, iterator, dayfirst, infer_datetime_format, verbose, skipinitialspace and low_memory.
Specifying these options with engine="pyarrow" will raise a ValueError.
Reading/writing remote files
You can pass in a URL to read or write remote files to many of pandas' IO functions, for example to read a CSV file directly from an HTTP(s) address.
New in version 1.3.0. A custom header can be sent alongside HTTP(s) requests by passing a dictionary of header key value mappings to the storage_options keyword argument.
All URLs which are not local files or HTTP(s) are handled by fsspec, if installed, and its various filesystem implementations (including Amazon S3, Google Cloud, SSH, FTP, webHDFS…). Some of these implementations will require additional packages to be installed, for example S3 URLs require the s3fs library.
When dealing with remote storage systems, you might need extra configuration with environment variables or config files in special locations. For example, to access data in your S3 bucket, you will need to define credentials in one of the several ways listed in the S3Fs documentation. The same is true for several of the storage backends, and you should follow the links at fsimpl1 for implementations built into fsspec and fsimpl2 for those not included in the main fsspec distribution.
You can also pass parameters directly to the backend driver. For example, if you do not have S3 credentials, you can still access public data by specifying an anonymous connection (new in version 1.2.0), such as storage_options={"anon": True}.
fsspec also allows complex URLs, for accessing data in compressed archives, local caching of files, and more. To locally cache the above example, you would prefix the URL with simplecache:: and pass storage_options={"s3": {"anon": True}}, where we specify that the “anon” parameter is meant for the “s3” part of the implementation, not to the caching implementation. Note that this caches to a temporary directory for the duration of the session only, but you can also specify a permanent store.
Writing out data
Writing to CSV format
The Series and DataFrame objects have an instance method to_csv which allows storing the contents of the object as a comma-separated-values file. The function takes a number of arguments. Only the first is required.
path_or_buf: A string path to the file to write or a file object. If a file object, it must be opened with newline=''
sep: Field delimiter for the output file (default “,”)
na_rep: A string representation of a missing value (default ‘’)
float_format: Format string for floating point numbers
columns: Columns to write (default None)
header: Whether to write out the column names (default True)
index: Whether to write row (index) names (default True)
index_label: Column label(s) for index column(s) if desired. If None (default), and header and index are True, then the index names are used. (A sequence should be given if the DataFrame uses MultiIndex)
mode: Python write mode, default ‘w’
encoding: A string representing the encoding to use if the contents are non-ASCII, for Python versions prior to 3
lineterminator: Character sequence denoting line end (default os.linesep)
quoting: Set quoting rules as in csv module (default csv.QUOTE_MINIMAL). Note that if you have set a float_format then floats are converted to strings and csv.QUOTE_NONNUMERIC will treat them as non-numeric
quotechar: Character used to quote fields (default ‘”’)
doublequote: Control quoting of quotechar in fields (default True)
escapechar: Character used to escape sep and quotechar when appropriate (default None)
chunksize: Number of rows to write at a time
date_format: Format string for datetime objects
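A few of these arguments can be seen together in the sketch below. The frame contents and the label "row" are invented for illustration, and path_or_buf is left out so that the CSV text is returned as a string rather than written to disk.

```python
import pandas as pd

# Hypothetical frame with one missing value.
df = pd.DataFrame({"x": [1.5, None], "y": ["a", "b"]}, index=["r1", "r2"])

# No path given: to_csv returns the CSV text.
text = df.to_csv(
    sep=";",              # field delimiter
    na_rep="NA",          # how missing values are written
    float_format="%.2f",  # format applied to floating point numbers
    index_label="row",    # header used for the index column
)

# splitlines() avoids depending on the platform line terminator.
lines = text.splitlines()
print(lines)
```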
Writing a formatted string#The In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
43 object has an instance method import openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx") 3362 which allows control over the string representation of the object. All arguments are optionalimport openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx") 3363 default None, for example a StringIO objectimport openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx") 3340 default None, which columns to writeimport openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx") 3365 default None, minimum width of each columnimport openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
- na_rep: default NaN, representation of the NA value
- formatters: default None, a dictionary (by column) of functions, each of which takes a single argument and returns a formatted string
- float_format: default None, a function which takes a single (float) argument and returns a formatted string; applied to floats in the DataFrame
- sparsify: default True; set to False for a DataFrame with a hierarchical index to print every MultiIndex key at each row
- index_names: default True, prints the names of the indices
- index: default True, prints the index (i.e., row labels)
- header: default True, prints the column labels
- justify: default left, prints column headers left- or right-justified
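The formatting arguments listed above can be tried out on a small frame. The following is only a minimal sketch: the sample data, the column names and the chosen argument values are assumptions made for illustration and do not come from the original text.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame with one missing value (illustrative only).
df = pd.DataFrame({"price": [1.5, np.nan], "qty": [3, 4]}, index=["a", "b"])

text = df.to_string(
    na_rep="-",                          # shown in place of missing values
    float_format=lambda x: f"{x:.2f}",   # applied to float cells only
    index=False,                         # omit the row labels
    header=True,                         # keep the column labels
    justify="right",                     # right-justify the column headers
)
print(text)
```

Because to_string returns a plain string when no buffer is given, the result can be printed, logged or written to a file as needed.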
The Series object also has a to_string method, but with only the buf, na_rep, and float_format arguments. There is also a length argument which, if set to True, additionally outputs the length of the Series.

JSON

Read and write JSON format files and strings.

Writing JSON

A Series or DataFrame can be converted to a valid JSON string. Use to_json with optional parameters:

- path_or_buf: the pathname or buffer to write the output. This can be None, in which case a JSON string is returned
- orient:
  Series: default is 'index'; allowed values are {'split', 'records', 'index'}
  DataFrame: default is 'columns'; allowed values are {'split', 'records', 'index', 'columns', 'values', 'table'}
  The format of the JSON string:
  'split': dict like {index -> [index], columns -> [columns], data -> [values]}
  'records': list like [{column -> value}, … , {column -> value}]
  'index': dict like {index -> {column -> value}}
  'columns': dict like {column -> {index -> value}}
  'values': just the values array
  'table': adhering to the JSON Table Schema
- date_format: string, type of date conversion; 'epoch' for timestamp, 'iso' for ISO8601
- double_precision: the number of decimal places to use when encoding floating point values, default 10
- force_ascii: force the encoded string to be ASCII, default True
- date_unit: the time unit to encode to; governs timestamp and ISO8601 precision. One of 's', 'ms', 'us' or 'ns' for seconds, milliseconds, microseconds and nanoseconds respectively. Default 'ms'
- default_handler: the handler to call if an object cannot otherwise be converted to a suitable format for JSON. Takes a single argument, which is the object to convert, and returns a serializable object
- lines: if orient is 'records', writes each record on its own line as JSON
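To make the orient values described above concrete, here is a small sketch that serializes the same frame several ways. The frame, its labels and the variable names are assumptions chosen for illustration; only to_json and its orient and lines parameters come from the text.

```python
import json
import pandas as pd

# Hypothetical two-by-two frame with string row labels (illustrative only).
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["x", "y"])

as_columns = df.to_json()                  # DataFrame default: {column -> {index -> value}}
as_index = df.to_json(orient="index")      # {index -> {column -> value}}
as_records = df.to_json(orient="records")  # [{column -> value}, ...], no index labels
as_split = df.to_json(orient="split")      # separate columns, index and data entries
as_values = df.to_json(orient="values")    # nested arrays of values only

# With the records orient, lines=True writes one JSON record per line.
as_lines = df.to_json(orient="records", lines=True)

# Parse each result back with the standard library to inspect its shape.
for name, payload in [("columns", as_columns), ("index", as_index),
                      ("records", as_records), ("split", as_split),
                      ("values", as_values)]:
    print(name, json.loads(payload))
```

Parsing with json.loads rather than comparing raw strings keeps the check independent of how a given pandas version spaces or orders its output.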
Note: NaNs, NaTs and None will be converted to null, and datetime objects will be converted based on the date_format and date_unit parameters.

Orient options

There are a number of different options for the format of the resulting JSON file / string, for both DataFrame and Series:

- Column oriented (the default for DataFrame) serializes the data as nested JSON objects with column labels acting as the primary index.
- Index oriented (the default for Series) is similar to column oriented, but the index labels are now primary.
- Record oriented serializes the data to a JSON array of column -> value records; index labels are not included. This is useful for passing DataFrame data to plotting libraries, for example the JavaScript library d3.js.
- Value oriented is a bare-bones option which serializes to nested JSON arrays of values only; column and index labels are not included.
- Split oriented serializes to a JSON object containing separate entries for values, index and columns. Name is also included for Series.
- Table oriented serializes to the JSON Table Schema, allowing for the preservation of metadata including but not limited to dtypes and index names.

Note: Any orient option that encodes to a JSON object will not preserve the ordering of index and column labels during round-trip serialization. If you wish to preserve label ordering, use the 'split' option, as it uses ordered containers.

Date handling

The date options cover the following cases:

- Writing in ISO date format
- Writing in ISO date format, with microseconds
- Epoch timestamps, in seconds
- Writing to a file, with a date index and a date column
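The date cases listed above can be sketched as follows. The frame, its column names and the in-memory buffer standing in for a file are assumptions made for illustration; the date_format and date_unit parameters are the ones described in the text.

```python
import io
import json
import pandas as pd

# Hypothetical frame with a date index and a date column (illustrative only).
idx = pd.date_range("2013-01-01", periods=2, freq="D")
df = pd.DataFrame(
    {"when": pd.to_datetime(["2013-01-01 12:00:00", "2013-01-02 12:00:00"]),
     "n": [1, 2]},
    index=idx,
)

iso = df.to_json(date_format="iso")                       # ISO 8601, default 'ms' precision
iso_us = df.to_json(date_format="iso", date_unit="us")    # ISO 8601 with microseconds
epoch_s = df.to_json(date_format="epoch", date_unit="s")  # epoch timestamps in seconds

# Writing to a file-like object; a path string can be passed instead.
buf = io.StringIO()
df.to_json(buf, date_format="iso")

print(json.loads(epoch_s)["when"])
```

An in-memory buffer is used here only so the sketch leaves no file behind; passing a filename to to_json writes the same content to disk.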
Fallback behavior

If the JSON serializer cannot handle the container contents directly, it falls back in the following manner:

- If the dtype is unsupported (e.g. np.complex_), then the default_handler, if provided, is called for each value; otherwise an exception is raised.
- If an object is unsupported, it attempts the following:
  1. Check whether the object has defined a toDict method and call it. A toDict method should return a dict, which will then be JSON serialized.
  2. Invoke the default_handler if one was provided.
  3. Convert the object to a dict by traversing its contents. However, this will often fail with an OverflowError or give unexpected results.
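As an illustration of the fallback rules above, the sketch below serializes a column of complex numbers, a dtype the JSON serializer does not handle on its own. The sample data and the choice of str as the handler are assumptions for illustration; any callable that takes one object and returns something serializable could be used instead.

```python
import json
import pandas as pd

# Complex values stand in for an unsupported dtype (illustrative data only).
df = pd.DataFrame([1.0, 2.0, complex(1.0, 2.0)])

# The handler is called for each value the serializer cannot encode itself;
# here every value is simply turned into its string form.
out = df.to_json(default_handler=str)

print(json.loads(out))
```

Using str keeps the sketch short, at the cost of storing the numbers as text; a handler that returns a list such as [real, imag] would keep them numeric.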
In general, the best approach for unsupported objects or dtypes is to provide a default_handler. For example, data that the serializer cannot encode directly can be dealt with by specifying a simple default_handler.

Reading JSON

Reading a JSON string to a pandas object can take a number of parameters. The parser will try to parse a DataFrame if typ is not supplied or is None. To explicitly force Series parsing, pass typ=series.

- filepath_or_buffer: a VALID JSON string or file handle / StringIO. The string could be a URL. Valid URL schemes include http, ftp, S3, and file. For file URLs, a host is expected. For instance, a local file could be file://localhost/path/to/table.json
- typ: type of object to recover (series or frame), default 'frame'
- orient:
  Series: default is 'index'; allowed values are {'split', 'records', 'index'}
  DataFrame: default is 'columns'; allowed values are {'split', 'records', 'index', 'columns', 'values', 'table'}
  The format of the JSON string:
  'split': dict like {index -> [index], columns -> [columns], data -> [values]}
  'records': list like [{column -> value}, … , {column -> value}]
  'index': dict like {index -> {column -> value}}
  'columns': dict like {column -> {index -> value}}
  'values': just the values array
  'table': adhering to the JSON Table Schema
- dtype: if True, infer dtypes; if a dict of column to dtype, then use those; if False, then don't infer dtypes at all. Default is True; applies only to the data
- convert_axes: boolean, try to convert the axes to the proper dtypes, default is True
- convert_dates: a list of columns to parse for dates
- keep_default_dates: boolean, default True. If parsing dates, then parse the default date-like columns
- numpy: direct decoding to NumPy arrays. Default is False; supports numeric data only, although labels may be non-numeric. Also note that the JSON ordering MUST be the same for each term if numpy=True
- precise_float: boolean, default False. Set to enable usage of the higher precision (strtod) function when decoding string to double values. The default (False) is to use fast but less precise builtin functionality
- date_unit: string, the timestamp unit to detect if converting dates. Default None. By default the timestamp precision will be detected; if this is not desired, then pass one of 's', 'ms', 'us' or 'ns' to force timestamp precision to seconds, milliseconds, microseconds or nanoseconds respectively
import openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx") 3516 . reads file as one json object per lineimport openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx") 0562 . The encoding to use to decode py3 bytesIn [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
90 . when used in combination with import openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx") 3587, return a JsonReader which reads in In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
90 lines per iteration
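As a rough illustration of the lines and chunksize entries, the sketch below assumes a small in-memory line-delimited payload (the data and variable names are made up for the example). It shows that combining lines=True with chunksize yields an iterator of DataFrames rather than one frame.

```python
from io import StringIO

import pandas as pd

# Hypothetical line-delimited JSON: one object per line.
jsonl = '{"a": 1, "b": 2}\n{"a": 3, "b": 4}\n{"a": 5, "b": 6}\n'

# With lines=True and chunksize set, read_json returns a JsonReader
# that yields DataFrames of at most `chunksize` rows each.
with pd.read_json(StringIO(jsonl), lines=True, chunksize=2) as reader:
    chunks = list(reader)

chunk_sizes = [len(chunk) for chunk in chunks]
total_a = sum(int(chunk["a"].sum()) for chunk in chunks)
print(chunk_sizes, total_a)
```

Processing one chunk at a time keeps memory use bounded, which is the usual reason to reach for chunksize.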
The parser will raise one of ValueError/TypeError/AssertionError if the JSON is not parseable.
If a non-default orient was used when encoding to JSON, be sure to pass the same option here so that decoding produces sensible results; see Orient Options for an overview.
Data conversion
The default of convert_axes=True, dtype=True, and convert_dates=True will try to parse the axes, and all of the data, into appropriate types, including dates. If you need to override specific dtypes, pass a dict to dtype. convert_axes should only be set to False if you need to preserve string-like numbers (e.g. '1', '2') in an axis.
Note: Large integer values may be converted to dates if convert_dates=True and the data and/or column labels appear 'date-like'. The exact threshold depends on the date_unit specified. 'date-like' means that the column label meets one of the following criteria:
- it ends with '_at'
- it ends with '_time'
- it begins with 'timestamp'
- it is 'modified'
- it is 'date'
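A minimal sketch of the date-like rule, assuming a made-up record whose created_at label ends with '_at' and whose value is an epoch timestamp in milliseconds. date_unit is pinned to 'ms' here only to keep the result deterministic; the payload and variable names are hypothetical.

```python
from io import StringIO

import pandas as pd

# "created_at" has a date-like label; "n" is an ordinary integer column.
payload = '[{"created_at": 1356998400000, "n": 1}]'

# Default conversion, with the timestamp unit forced to milliseconds.
converted = pd.read_json(StringIO(payload), date_unit="ms")

# Switching date conversion off keeps the raw integers.
raw = pd.read_json(StringIO(payload), convert_dates=False)

print(converted.dtypes)
print(raw.dtypes)
```

If a large integer column is being turned into dates unintentionally, convert_dates=False (or renaming the column) is one way to avoid it.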
Warning: When reading JSON data, automatic coercion into dtypes has some quirks:
- an index can be reconstructed in a different order from serialization, that is, the returned order is not guaranteed to be the same as before serialization
- a column that was float data will be converted to integer if it can be done safely, e.g. a column of 1.
- bool columns will be converted to integer on reconstruction
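The float-to-integer quirk can be seen with a short round trip. This is only a sketch with a made-up one-column frame; it checks the float case and makes no claim about the bool case.

```python
from io import StringIO

import pandas as pd

# The column is float64 before serialization, but every value is a whole number.
df = pd.DataFrame({"x": [1.0, 2.0]})

# Round trip through JSON with the default settings (dtype inference on).
restored = pd.read_json(StringIO(df.to_json()))

print(df["x"].dtype, "->", restored["x"].dtype)
```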
Thus there are times where you may want to specify specific dtypes via the dtype keyword argument.
Reading from a JSON string
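The original snippet for reading from a JSON string is not recoverable here, so the following is a minimal sketch with a hypothetical payload. The string is wrapped in StringIO because recent pandas versions expect a file-like object rather than a literal JSON string.

```python
from io import StringIO

import pandas as pd

# Hypothetical JSON in the default "columns" layout: {column: {index: value}}.
json_text = '{"A": {"0": 1, "1": 2}, "B": {"0": "x", "1": "y"}}'

df = pd.read_json(StringIO(json_text))
print(df)
```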
Reading from a file
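For the file case, a small sketch assuming a temporary directory and a hypothetical file name test.json; the frame is written with to_json and read back by path.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.json")  # hypothetical file name
    df.to_json(path)                       # write JSON to disk
    from_file = pd.read_json(path)         # read it back by path

print(from_file)
```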
Don't convert any data (but still convert axes and dates)
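One way to leave the data untouched is to pass dtype=False. The sketch below uses a made-up payload and compares it with the default behaviour; the index labels are still converted because convert_axes stays at its default.

```python
from io import StringIO

import pandas as pd

# Hypothetical payload: whole-number floats with string index labels.
text = '{"x": {"0": 1.0, "1": 2.0}}'

kept = pd.read_json(StringIO(text), dtype=False)  # no dtype inference on the data
coerced = pd.read_json(StringIO(text))            # default: dtype inference on

print(kept["x"].dtype, coerced["x"].dtype)
```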
Specify dtypes for conversion
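A sketch of per-column dtypes, assuming a hypothetical payload with one numeric and one boolean column; the dict maps column names to the dtype each should be cast to.

```python
from io import StringIO

import pandas as pd

text = '{"A": {"0": 1, "1": 2}, "flag": {"0": true, "1": false}}'

# Force specific dtypes instead of relying on inference.
typed = pd.read_json(StringIO(text), dtype={"A": "float32", "flag": "int8"})

print(typed.dtypes)
```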
Preserve string indices
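To keep string labels such as '0' and '1' as strings, convert_axes can be switched off. The frame below is a made-up example.

```python
from io import StringIO

import pandas as pd

# String labels that look like numbers.
df = pd.DataFrame({"v": [10, 20]}, index=["0", "1"])
text = df.to_json()

default_index = pd.read_json(StringIO(text)).index                     # labels become integers
string_index = pd.read_json(StringIO(text), convert_axes=False).index  # labels stay strings

print(list(default_index), list(string_index))
```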
Dates written in nanoseconds need to be read back in nanoseconds
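A minimal round trip for the nanosecond case, assuming a hypothetical frame with a column named date (a date-like label). The same date_unit is passed on both the write and the read side.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame(
    {"date": pd.to_datetime(["2013-01-01", "2013-01-02"]), "v": [1, 2]}
)

# Write epoch timestamps in nanoseconds ...
text = df.to_json(date_unit="ns")

# ... and tell the reader to interpret them as nanoseconds as well.
back = pd.read_json(StringIO(text), date_unit="ns")

print(back)
```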
The Numpy parameter
Note: This param has been deprecated as of version 1.0.0 and will raise a FutureWarning.
This supports numeric data only. Index and column labels may be non-numeric, e.g. strings, dates etc.
If numpy=True is passed to read_json, an attempt will be made to sniff an appropriate dtype during deserialization and to subsequently decode directly to NumPy arrays, bypassing the need for intermediate Python objects.
This can provide speedups if you are deserialising a large amount of numeric data.
The speedup is less noticeable for smaller datasets.
Warning: Direct NumPy decoding makes a number of assumptions and may fail or produce unexpected output if these assumptions are not satisfied:
- data is numeric.
- data is uniform. The dtype is sniffed from the first value decoded. A ValueError may be raised, or incorrect output may be produced, if this condition is not satisfied.
- labels are ordered. Labels are only read from the first container; it is assumed that each subsequent row / column has been encoded in the same order. This should be satisfied if the data was encoded using to_json but may not be the case if the JSON is from another source.
Normalization
pandas provides a utility function to take a dict or list of dicts and normalize this semi-structured data into a flat table.
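A short sketch of normalization, assuming the utility meant here is pandas.json_normalize and using made-up nested records. Nested keys are flattened into dotted column names.

```python
import pandas as pd

# Hypothetical semi-structured records with one level of nesting.
records = [
    {"id": 1, "name": {"first": "Ada", "last": "Hill"}},
    {"id": 2, "name": {"first": "Bo", "last": "Reyes"}},
]

flat = pd.json_normalize(records)
print(flat)
```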
The max_level parameter provides more control over the level at which normalization ends. With max_level=1, normalization stops at the first nesting level of the provided dict.
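To make the effect of max_level concrete, the sketch below normalizes a hypothetical two-level record with and without the limit, again assuming pandas.json_normalize. With max_level=1 the inner address dict is kept as a single value instead of being expanded.

```python
import pandas as pd

# Hypothetical record nested two levels deep.
record = {
    "id": 1,
    "owner": {"name": "A", "address": {"city": "X", "zip": "00000"}},
}

shallow = pd.json_normalize(record, max_level=1)  # stop after the first level
deep = pd.json_normalize(record)                  # flatten everything

print(list(shallow.columns))
print(list(deep.columns))
```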
Line delimited json
pandas is able to read and write line-delimited json files that are common in data processing pipelines using Hadoop or Spark.
For line-delimited json files, pandas can also return an iterator which reads in chunksize lines at a time. This can be useful for large files or to read from a stream.
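A minimal write-and-read sketch for the line-delimited format, using a made-up frame. Writing requires orient="records" together with lines=True; reading only needs lines=True.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({"a": [1, 3], "b": [2, 4]})

# One JSON object per row, one row per line.
jsonl = df.to_json(orient="records", lines=True)

back = pd.read_json(StringIO(jsonl), lines=True)
print(jsonl)
print(back)
```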
Table schema
Table Schema is a spec for describing tabular datasets as a JSON object. The JSON includes information on the field names, types, and other attributes. You can use the orient table to build a JSON string with two fields, schema and data.
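The two top-level fields can be inspected by parsing the output with the standard json module. The frame and its index name idx are made up for this sketch.

```python
import json

import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2], "B": ["a", "b"]},
    index=pd.Index([10, 20], name="idx"),
)

payload = json.loads(df.to_json(orient="table"))

top_level_keys = sorted(payload)  # the two fields of the table orient
field_names = [field["name"] for field in payload["schema"]["fields"]]

print(top_level_keys)
print(field_names)
print(payload["data"])
```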
The schema field contains the fields key, which itself contains a list of column name to type pairs, including the Index or MultiIndex (see below for a list of types). The schema field also contains a primaryKey field if the (Multi)index is unique.
The second field, data, contains the serialized data with the records orient. The index is included, and any datetimes are ISO 8601 formatted, as required by the Table Schema spec.
The full list of supported types is described in the Table Schema spec. This table shows the mapping from pandas types:

pandas type        Table Schema type
int64              integer
float64            number
bool               boolean
datetime64[ns]     datetime
timedelta64[ns]    duration
categorical        any
object             str

A few notes on the generated table schema:
- The schema object contains a pandas_version field. This contains the version of pandas' dialect of the schema, and will be incremented with each revision.
- All dates are converted to UTC when serializing, even timezone-naive values, which are treated as UTC with an offset of 0.
- datetimes with a timezone (before serializing) include an additional field tz with the time zone name (e.g. 'US/Central').
- Periods are converted to timestamps before serialization, and so have the same behavior of being converted to UTC. In addition, periods will contain an additional field freq with the period's frequency, e.g. 'A-DEC'.
- Categoricals use the any type and an enum constraint listing the set of possible values. Additionally, an ordered field is included.
- A primaryKey field, containing an array of labels, is included if the index is unique.
- The primaryKey behavior is the same with MultiIndexes, but in this case the primaryKey is an array.
- The default naming roughly follows these rules:
  - For series, the object.name is used. If that's none, then the name is values.
  - For DataFrames, the stringified version of the column name is used.
  - For Index (not MultiIndex), index.name is used, with a fallback to index if that is None.
  - For MultiIndex, mi.names is used. If any level has no name, then level_<i> is used.
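Several of these notes can be checked directly on a generated schema. The sketch below assumes a made-up frame with a timezone-aware column and a categorical column, and uses build_table_schema from pandas.io.json to look at the schema without serializing the data. UTC is used as the time zone only to keep the example portable.

```python
import pandas as pd
from pandas.io.json import build_table_schema

df = pd.DataFrame(
    {
        "when": pd.date_range("2016-01-01", periods=2, tz="UTC"),
        "grade": pd.Categorical(["a", "b"]),
    }
)

schema = build_table_schema(df)
fields = {field["name"]: field for field in schema["fields"]}

print(fields["when"])    # datetime field carrying a tz entry
print(fields["grade"])   # "any" type with an enum constraint and an ordered flag
print(schema["primaryKey"], "pandas_version" in schema)
```

The unnamed default index shows up under the fallback name index, which matches the naming rules listed above.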
read_json also accepts orient='table' as an argument. This allows for the preservation of metadata such as dtypes and index names in a round-trippable manner.
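A small round-trip sketch with a hypothetical frame and index name, showing that the index name and the float dtype survive when the table orient is used on both sides.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame(
    {"foo": [1, 2], "bar": [0.5, 1.5]},
    index=pd.Index([10, 20], name="idx"),
)

text = df.to_json(orient="table")
restored = pd.read_json(StringIO(text), orient="table")

print(restored)
print(restored.index.name, restored.dtypes.to_dict())
```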
Please note that the literal string 'index' as the name of an Index is not round-trippable, nor are any names beginning with 'level_' within a MultiIndex. These are used by default in DataFrame.to_json() to indicate missing values, and the subsequent read cannot distinguish the intent.
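The caveat can be seen with a small, hypothetical frame whose index is explicitly named 'index'. This is only a sketch; the column name and values are invented.

```python
from io import StringIO

import pandas as pd

# Hypothetical frame whose index carries the reserved name "index".
df = pd.DataFrame({"value": [1, 2]}, index=pd.Index(["a", "b"], name="index"))

restored = pd.read_json(StringIO(df.to_json(orient="table")), orient="table")

print(df.index.name)        # the name set before writing
print(restored.index.name)  # the name after the round trip
```

Choosing any other index name avoids the ambiguity.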
When using orient="table" along with a user-defined ExtensionArray, the generated schema will contain an additional extDtype key in the respective fields element. This extra key is not standard but does enable JSON roundtrips for extension types (e.g. read_json(df.to_json(orient="table"), orient="table")).

The extDtype key carries the name of the extension. If you have properly registered the ExtensionDtype, pandas will use said name to perform a lookup into the registry and re-convert the serialized data into your custom dtype.

HTML

Reading HTML content

Warning: We highly encourage you to read the HTML Table Parsing gotchas below regarding the issues surrounding the BeautifulSoup4/html5lib/lxml parsers.

The top-level read_html() function can accept an HTML string/file/URL and will parse HTML tables into a list of pandas DataFrames. Let's look at a few examples.

Note: read_html returns a list of DataFrame objects, even if there is only a single table contained in the HTML content.

Read a URL with no options.
Note: The data from the above URL changes every Monday, so the resulting data above may be slightly different.

Read in the content of the file from the above URL and pass it to read_html as a string.
You can even pass in an instance of StringIO if you so desire.
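A minimal sketch of the StringIO variant follows. It uses made-up markup rather than a downloaded page, and it assumes that one of the optional parsers (lxml, or bs4 plus html5lib) may or may not be installed, hence the guard.

```python
from io import StringIO

import pandas as pd

# Hypothetical markup standing in for the content of a downloaded page.
html = """
<table>
  <thead><tr><th>bank</th><th>city</th></tr></thead>
  <tbody>
    <tr><td>First Example</td><td>Springfield</td></tr>
    <tr><td>Second Example</td><td>Shelbyville</td></tr>
  </tbody>
</table>
"""

try:
    # The result is always a list, with one DataFrame per <table> found.
    tables = pd.read_html(StringIO(html))
except ImportError:
    tables = []  # no HTML parser (lxml, or bs4 plus html5lib) is installed

for table in tables:
    print(table)
```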
Note: The following examples are not run by the IPython evaluator due to the fact that having so many network-accessing functions slows down the documentation build. If you spot an error or an example that doesn't run, please do not hesitate to report it over on the pandas GitHub issues page.

Read a URL and match a table that contains specific text.
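The matching behaviour can be sketched without network access by parsing an in-memory document that holds two tables. The markup and the search text "Revenue" are assumptions made for this illustration.

```python
from io import StringIO

import pandas as pd

# Two hypothetical tables; only the first one contains the text "Revenue".
html = """
<table><tr><th>metric</th><th>value</th></tr><tr><td>Revenue</td><td>10</td></tr></table>
<table><tr><th>metric</th><th>value</th></tr><tr><td>Headcount</td><td>3</td></tr></table>
"""

try:
    # match accepts a string or regular expression; only matching tables are kept.
    tables = pd.read_html(StringIO(html), match="Revenue")
except ImportError:
    tables = []  # no HTML parser is installed

print(len(tables))
```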
Specify a header row. By default, <th> or <td> elements located within a <thead> are used to form the column index; if multiple rows are contained within <thead>, then a MultiIndex is created. If specified, the header row is taken from the data minus the parsed header elements (<th> elements).
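As a sketch of the header argument, the hypothetical table below has no <thead> and no <th> cells, so header=0 is used to promote its first data row to column labels.

```python
from io import StringIO

import pandas as pd

# Hypothetical table made only of <td> cells, so no header is detected by default.
html = """
<table>
  <tr><td>name</td><td>score</td></tr>
  <tr><td>ada</td><td>90</td></tr>
  <tr><td>bob</td><td>85</td></tr>
</table>
"""

try:
    tables = pd.read_html(StringIO(html), header=0)
except ImportError:
    tables = []  # no HTML parser is installed

for table in tables:
    print(table.columns.tolist())
```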
Specify an index column.
Specify a number of rows to skip.
Specify a number of rows to skip using a list (range works as well).
Specify an HTML attribute.
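One way to illustrate attribute filtering is with two hypothetical tables that differ only in their id attribute. The ids "summary" and "detail" are invented for this sketch.

```python
from io import StringIO

import pandas as pd

# Two hypothetical tables distinguished by their id attribute.
html = """
<table id="summary"><tr><th>k</th></tr><tr><td>1</td></tr></table>
<table id="detail"><tr><th>k</th></tr><tr><td>2</td></tr></table>
"""

try:
    # attrs is a dict of HTML attributes the <table> element must carry.
    tables = pd.read_html(StringIO(html), attrs={"id": "detail"})
except ImportError:
    tables = []  # no HTML parser is installed

for table in tables:
    print(table)
```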
Specify values that should be converted to NaN.
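A small sketch of na_values, assuming a made-up table in which the word "missing" marks an absent reading:

```python
from io import StringIO

import pandas as pd

# Hypothetical table where "missing" is used as a placeholder for no data.
html = """
<table>
  <tr><th>sensor</th><th>reading</th></tr>
  <tr><td>a</td><td>1.5</td></tr>
  <tr><td>b</td><td>missing</td></tr>
</table>
"""

try:
    tables = pd.read_html(StringIO(html), na_values=["missing"])
except ImportError:
    tables = []  # no HTML parser is installed

for table in tables:
    print(table["reading"].isna().tolist())
```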
Specify whether to keep the default set of NaN values.
Specify converters for columns. This is useful for numerical text data that has leading zeros. By default, columns that are numerical are cast to numeric types and the leading zeros are lost. To avoid this, we can convert these columns to strings.
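The leading-zero problem can be sketched with a hypothetical table of postal codes, where the zip_code column is kept as text by passing str as its converter.

```python
from io import StringIO

import pandas as pd

# Hypothetical table whose zip_code values start with zeros.
html = """
<table>
  <tr><th>zip_code</th><th>city</th></tr>
  <tr><td>02134</td><td>Boston</td></tr>
  <tr><td>00501</td><td>Holtsville</td></tr>
</table>
"""

try:
    # Without the converter, zip_code would be parsed as integers.
    tables = pd.read_html(StringIO(html), converters={"zip_code": str})
except ImportError:
    tables = []  # no HTML parser is installed

for table in tables:
    print(table["zip_code"].tolist())
```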
Use some combination of the above.
Read in pandas to_html output (with some loss of floating point precision).
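As a rough sketch of this round trip, a small hypothetical frame is rendered with to_html and parsed back with read_html, using index_col=0 to recover the row labels.

```python
from io import StringIO

import pandas as pd

# Hypothetical frame to render and re-read.
df = pd.DataFrame({"x": [0.1, 0.2], "y": [1.0, 2.0]})
html = df.to_html()

try:
    # The first rendered column holds the row labels, so use it as the index.
    tables = pd.read_html(StringIO(html), index_col=0)
except ImportError:
    tables = []  # no HTML parser is installed

for table in tables:
    print(table)
```

Values are written with the display precision in effect, which is why some floating point digits can be lost along the way.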
The lxml backend will raise an error on a failed parse if that is the only parser you provide. If you only have a single parser, you can provide just a string, but it is considered good practice to pass a list with one string if, for example, the function expects a sequence of strings. You may use a one-element list such as flavor=["lxml"].
Or you could pass flavor="lxml" without a list.
However, if you have bs4 and html5lib installed and pass None or ["lxml", "bs4"], then the parse will most likely succeed. Note that as soon as a parse succeeds, the function will return.
Links can be extracted from cells along with the text using extract_links="all".
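A minimal sketch of link extraction, assuming pandas 1.5.0 or later and a made-up one-cell table. With extract_links="all", each cell comes back as a (text, href) tuple.

```python
from io import StringIO

import pandas as pd

# Hypothetical table with a single hyperlinked cell.
html = """
<table>
  <tr><th>project</th></tr>
  <tr><td><a href="https://pandas.pydata.org">pandas</a></td></tr>
</table>
"""

try:
    tables = pd.read_html(StringIO(html), extract_links="all")
except ImportError:
    tables = []  # no HTML parser is installed

for table in tables:
    print(table.iloc[0, 0])  # a (text, href) tuple for the body cell
```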
New in version 1.5.0.

Writing to HTML files

DataFrame objects have an instance method to_html which renders the contents of the DataFrame as an HTML table. The function arguments are as in the method to_string described above.

Note: Not all of the possible options for DataFrame.to_html are shown here for brevity's sake. See DataFrame.to_html() for the full set of options.

Note: In an HTML-rendering supported environment like a Jupyter Notebook, display(HTML(...)) will render the raw HTML into the environment.
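A minimal sketch of the basic call, using a small frame invented for illustration:

```python
import pandas as pd

# Hypothetical frame to render.
df = pd.DataFrame({"a": [1, 2], "b": [0.25, 0.5]})

# With no path given, to_html returns the markup as a string.
html = df.to_html()
print(html)
```

Passing a file path or buffer as the first argument writes the markup there instead of returning it.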
The columns argument will limit the columns shown.
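For instance, with a hypothetical two-column frame, only column a is rendered in this sketch:

```python
import pandas as pd

# Hypothetical frame with two columns.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Only the listed columns appear in the output.
html = df.to_html(columns=["a"])
print(html)
```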
float_format takes a Python callable to control the precision of floating point values.
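As a sketch, the bound format method of a format string is one convenient callable. The two-decimal precision here is an arbitrary choice for illustration.

```python
import pandas as pd

# Hypothetical frame with a long decimal value.
df = pd.DataFrame({"ratio": [0.123456, 2.5]})

# Each float is passed through the callable before being written.
html = df.to_html(float_format="{:.2f}".format)
print(html)
```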
bold_rows will make the row labels bold by default, but you can turn that off.
The classes argument provides the ability to give the resulting HTML table CSS classes. Note that these classes are appended to the existing 'dataframe' class.
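A short sketch, where the class names "report" and "striped" are placeholders rather than anything pandas defines:

```python
import pandas as pd

# Hypothetical frame to render.
df = pd.DataFrame({"a": [1, 2]})

# The extra classes are appended after the built-in "dataframe" class.
html = df.to_html(classes=["report", "striped"])
print(html.splitlines()[0])
```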
The render_links argument provides the ability to add hyperlinks to cells that contain URLs.
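A minimal sketch with a hypothetical frame that stores a URL as plain text:

```python
import pandas as pd

# Hypothetical frame with a URL stored as a string.
df = pd.DataFrame({"name": ["Python"], "url": ["https://www.python.org/"]})

# Cells that look like URLs are wrapped in an <a> element.
html = df.to_html(render_links=True)
print(html)
```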
Finally, the escape argument allows you to control whether the "<", ">" and "&" characters are escaped in the resulting HTML (by default it is True). So to get the HTML without escaped characters, pass escape=False.
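Both settings can be compared in a short sketch that uses a hypothetical frame containing the three special characters:

```python
import pandas as pd

# Hypothetical frame whose cells hold the characters that HTML treats specially.
df = pd.DataFrame({"symbol": list("&<>"), "rank": [1, 2, 3]})

escaped = df.to_html()                  # default: escape=True
not_escaped = df.to_html(escape=False)  # characters are written as-is

print(escaped)
print(not_escaped)
```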
Escaped:
Not escaped:
Note: Some browsers may not show a difference in the rendering of the previous two HTML tables.

HTML Table Parsing Gotchas

There are some versioning issues surrounding the libraries that are used to parse HTML tables in the top-level pandas io function read_html.

Issues with lxml

Benefits
Drawbacks

lxml does not make any guarantees about the results of its parse unless it is given strictly valid markup.

In light of the above, we have chosen to allow you, the user, to use the lxml backend, but this backend will use html5lib if lxml fails to parse.

It is therefore highly recommended that you install both BeautifulSoup4 and html5lib, so that you will still get a valid result (provided everything else is valid) even if lxml fails.

Issues with BeautifulSoup4 using lxml as a backend

Issues with BeautifulSoup4 using html5lib as a backend

Benefits

html5lib is far more lenient than lxml and consequently deals with real-life markup in a much saner way rather than just, e.g., dropping an element without notifying you.

html5lib generates valid HTML5 markup from invalid markup automatically. This is extremely important for parsing HTML tables, since it guarantees a valid document. However, that does NOT mean that it is "correct", since the process of fixing markup does not have a single definition.

html5lib is pure Python and requires no additional build steps beyond its own installation.

Drawbacks

The biggest drawback to using html5lib is that it is slow as molasses. However, consider the fact that many tables on the web are not big enough for the parsing algorithm runtime to matter. It is more likely that the bottleneck will be in the process of reading the raw text from the URL over the web, i.e., IO (input-output). For very large tables, this might not be true.
LaTeX

New in version 1.3.0.

Currently there are no methods to read from LaTeX, only output methods.

Writing to LaTeX files

Note: DataFrame and Styler objects currently have a to_latex method. We recommend using the Styler.to_latex() method over DataFrame.to_latex() due to the former's greater flexibility with conditional styling, and the latter's possible future deprecation.

Review the documentation for Styler.to_latex, which gives examples of conditional styling and explains the operation of its keyword arguments.

For simple application the following pattern is sufficient.
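A minimal sketch of that pattern is shown below with a made-up frame. Styler depends on the optional jinja2 package, so the call is guarded in case it is not installed.

```python
import pandas as pd

# Hypothetical frame to export.
df = pd.DataFrame([[1, 2], [3, 4]], index=["a", "b"], columns=["c", "d"])

try:
    # df.style builds a Styler; to_latex returns the tabular source as a string.
    latex = df.style.to_latex()
except ImportError:
    latex = None  # Styler needs the optional jinja2 dependency

print(latex)
```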
To format values before output, chain the Styler.format method.
XML

Reading XML

New in version 1.3.0.

The top-level read_xml() function can accept an XML string/file/URL and will parse nodes and attributes into a pandas DataFrame.

Note: Since there is no standard XML structure where design types can vary in many ways, read_xml works best with flatter, shallow versions. If an XML document is deeply nested, use the stylesheet feature to transform XML into a flatter version.

Let's look at a few examples.

Read an XML string.
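As a sketch, the snippet below parses a small, made-up bookstore document held in memory. It wraps the text in StringIO and selects parser="etree" so that only the standard library parser is needed; the default parser is lxml.

```python
from io import StringIO

import pandas as pd

# Hypothetical, shallow XML document: one <book> element per row.
xml = """<bookstore>
  <book category="cooking">
    <title>Everyday Italian</title>
    <price>30.00</price>
  </book>
  <book category="web">
    <title>Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
"""

# Attributes and child elements of each <book> become columns.
df = pd.read_xml(StringIO(xml), parser="etree")
print(df)
```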
Read a URL with no options.
749Read in the content of the “books. xml” file and pass it to In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7000 as a stringIn [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
Read in the content of "books.xml" as an instance of StringIO or BytesIO and pass it to read_xml:
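A hedged sketch of the file-based variants: the snippet writes a small, hypothetical "books.xml" to a temporary directory, then reads its content back as text (StringIO) and as bytes (BytesIO). The file content and the use of parser="etree" are assumptions made to keep the example self-contained.

```python
import tempfile
from io import BytesIO, StringIO
from pathlib import Path

import pandas as pd

# Hypothetical file content; a real books.xml would come from your own data.
xml = """<bookstore>
  <book category="cooking">
    <title>Everyday Italian</title>
    <year>2005</year>
  </book>
  <book category="web">
    <title>Learning XML</title>
    <year>2003</year>
  </book>
</bookstore>"""

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "books.xml"
    path.write_text(xml, encoding="utf-8")

    # Text content wrapped in StringIO
    with open(path, encoding="utf-8") as f:
        df_text = pd.read_xml(StringIO(f.read()), parser="etree")

    # Raw bytes wrapped in BytesIO
    with open(path, "rb") as f:
        df_bytes = pd.read_xml(BytesIO(f.read()), parser="etree")

print(df_text.equals(df_bytes))
```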
Even read XML from AWS S3 buckets such as NIH NCBI PMC Article Datasets providing Biomedical and Life Science Journals:

With lxml as default parser, you access the full-featured XML library that extends Python's ElementTree API. One powerful tool is the ability to query nodes selectively or conditionally with more expressive XPath:
Specify only elements or only attributes to parse:
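The two options can be sketched as follows, assuming a hypothetical document whose rows carry both attributes and child elements. elems_only and attrs_only are read_xml keyword arguments; the sample data and parser="etree" are choices made for this illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: each <book> has two attributes and two child elements.
xml = """<bookstore>
  <book category="cooking" cover="paperback">
    <title>Everyday Italian</title>
    <year>2005</year>
  </book>
  <book category="children" cover="hardback">
    <title>Harry Potter</title>
    <year>2005</year>
  </book>
</bookstore>"""

# Keep only child elements as columns
df_elems = pd.read_xml(StringIO(xml), elems_only=True, parser="etree")

# Keep only row attributes as columns
df_attrs = pd.read_xml(StringIO(xml), attrs_only=True, parser="etree")

print(df_elems)
print(df_attrs)
```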
XML documents can have namespaces with prefixes and default namespaces without prefixes, both of which are denoted with the special attribute xmlns. In order to parse by node under a namespace context, xpath must reference a prefix.

For example, the XML below contains a namespace with prefix doc and URI https://example.com. In order to parse doc:row nodes, namespaces must be used:
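A minimal sketch of the prefixed case, using a small, hypothetical document. Because this example uses parser="etree" to avoid the lxml dependency, the path is written as ".//doc:row", since the standard-library parser does not accept absolute paths; with lxml, "//doc:row" is the usual form.

```python
from io import StringIO

import pandas as pd

# Hypothetical document using the "doc" prefix bound to https://example.com
xml = """<doc:data xmlns:doc="https://example.com">
  <doc:row>
    <doc:shape>square</doc:shape>
    <doc:degrees>360</doc:degrees>
    <doc:sides>4</doc:sides>
  </doc:row>
  <doc:row>
    <doc:shape>triangle</doc:shape>
    <doc:degrees>180</doc:degrees>
    <doc:sides>3</doc:sides>
  </doc:row>
</doc:data>"""

# The namespaces mapping tells the parser which URI the prefix in xpath refers to.
df = pd.read_xml(
    StringIO(xml),
    xpath=".//doc:row",
    namespaces={"doc": "https://example.com"},
    parser="etree",
)
print(df)
```

The resulting column names should not carry the namespace prefix.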
Similarly, an XML document can have a default namespace without prefix. Failing to assign a temporary prefix will return no nodes and raise a ValueError. But assigning any temporary name to the correct URI allows parsing by nodes:
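For the default-namespace case, a sketch under the same assumptions (hypothetical data, parser="etree", relative path). The prefix name "pandas" is arbitrary; any temporary name works as long as it maps to the document's URI.

```python
from io import StringIO

import pandas as pd

# Hypothetical document with a default (unprefixed) namespace
xml = """<data xmlns="https://example.com">
  <row>
    <shape>square</shape>
    <degrees>360</degrees>
  </row>
  <row>
    <shape>circle</shape>
    <degrees>360</degrees>
  </row>
</data>"""

# "pandas" is a temporary prefix chosen for this call only.
df = pd.read_xml(
    StringIO(xml),
    xpath=".//pandas:row",
    namespaces={"pandas": "https://example.com"},
    parser="etree",
)
print(df)
```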
However, if XPath does not reference node names, such as the default /*, then namespaces is not required.

With lxml as parser, you can flatten nested XML documents with an XSLT script, which can also be a string, file, or URL. As background, XSLT is a special-purpose language written in a special XML file that can transform original XML documents into other XML, HTML, or even text (CSV, JSON, etc.) using an XSLT processor.

For example, consider this somewhat nested structure of Chicago "L" Rides where station and rides elements encapsulate data in their own sections. With the XSLT below, lxml can transform the original nested document into a flatter output (as shown below for demonstration) for easier parsing into a DataFrame:
For very large XML files that can range from hundreds of megabytes to gigabytes, pandas.read_xml() supports parsing such sizeable files using lxml's iterparse and etree's iterparse, which are memory-efficient methods to iterate through an XML tree and extract specific elements and attributes without holding the entire tree in memory.

New in version 1.5.0.

To use this feature, you must pass a physical XML file path into read_xml and use the iterparse argument. Files should not be compressed or point to online sources but be stored on local disk. Also, iterparse should be a dictionary where the key is the repeating node in the document (which becomes the rows) and the value is a list of any elements or attributes that are descendants (i.e., child, grandchild) of the repeating node. Since XPath is not used in this method, descendants do not need to share the same relationship with one another. Below is an example of reading in Wikipedia's very large (12 GB+) latest article data dump:
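Since the real dump is far too large to reproduce, here is a small stand-in sketch: it writes a tiny, hypothetical file shaped loosely like a MediaWiki export to a temporary directory and reads it with iterparse. The file content is invented for illustration, parser="etree" is chosen to avoid the lxml dependency, and pandas 1.5.0 or later is assumed.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Tiny hypothetical stand-in for a large export file
xml = """<mediawiki>
  <page>
    <title>Gouria</title>
    <ns>0</ns>
    <id>20898441</id>
  </page>
  <page>
    <title>Lepidosaphes ulmi</title>
    <ns>0</ns>
    <id>20901802</id>
  </page>
</mediawiki>"""

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "pages.xml"
    path.write_text(xml, encoding="utf-8")

    # Key: the repeating node (one row each); value: descendants to extract.
    df = pd.read_xml(
        str(path),
        iterparse={"page": ["title", "ns", "id"]},
        parser="etree",
    )

print(df)
```

Note that a local file path is passed rather than a string or buffer, as the feature requires.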
Writing XML

New in version 1.3.0.

DataFrame objects have an instance method to_xml which renders the contents of the DataFrame as an XML document.

Note
This method does not support special properties of XML including DTD, CData, XSD schemas, processing instructions, comments, and others. Only namespaces at the root level are supported. However, stylesheet allows design changes after initial output.

Let's look at a few examples.

Write an XML without options:
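A minimal sketch of the default output, using a small, hypothetical DataFrame. Passing parser="etree" is an assumption made so the example runs without lxml; the other defaults (root and row element names, index included as an element) are left untouched.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame(
    {
        "shape": ["square", "circle", "triangle"],
        "degrees": [360, 360, 180],
        "sides": [4, 0, 3],
    }
)

# With no path given, to_xml returns the document as a string.
xml_out = df.to_xml(parser="etree")
print(xml_out)
```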
Write an XML with new root and row name:
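As a sketch, root_name and row_name are the to_xml arguments that rename the enclosing and repeating elements. The names "geometry" and "objects" and the sample data are illustrative choices, and parser="etree" is again an assumption to avoid requiring lxml.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"shape": ["square", "triangle"], "degrees": [360, 180]})

# Rename the root element and the per-row element.
xml_out = df.to_xml(root_name="geometry", row_name="objects", parser="etree")
print(xml_out)
```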
Write an attribute-centric XML:
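One way to sketch an attribute-centric document is to list every column in attr_cols, so each row is written as a single element whose attributes hold the values. The sample data, index=False, and parser="etree" are assumptions made for this illustration.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"shape": ["square", "triangle"], "degrees": [360, 180]})

# Every column becomes an attribute of its <row> element; the index is omitted.
xml_out = df.to_xml(attr_cols=df.columns.tolist(), index=False, parser="etree")
print(xml_out)
```

Passing only some columns in attr_cols and the rest in elem_cols would give a mix of attributes and child elements.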
Write a mix of elements and attributes:

Any DataFrames with hierarchical columns will be flattened for XML element names with levels delimited by underscores:

Write an XML with default namespace:

Write an XML with namespace prefix:

Write an XML without declaration or pretty print:

Write an XML and transform with stylesheet:
XML Final Notes

All XML documents adhere to W3C specifications. Both etree and lxml parsers will fail to parse any markup document that is not well-formed or does not follow XML syntax rules. Do be aware that HTML is not an XML document unless it follows XHTML specs. However, other popular markup types including KML, XAML, RSS, MusicML, and MathML are compliant XML schemas.

For the above reason, if your application builds XML prior to pandas operations, use appropriate DOM libraries like etree and lxml to build the necessary document, not string concatenation or regex adjustments. Always remember XML is a special text file with markup rules.

With very large XML files (several hundred MBs to GBs), XPath and XSLT can become memory-intensive operations. Be sure to have enough available RAM for reading and writing large XML files (roughly 5 times the size of the text).

Because XSLT is a programming language, use it with caution since such scripts can pose a security risk in your environment and can run large or infinite recursive operations. Always test scripts on small fragments before a full run.

The etree parser supports all functionality of both read_xml and to_xml except for complex XPath and any XSLT. Though limited in features, etree is still a reliable and capable parser and tree builder. Its performance may trail lxml to a certain degree for larger files, but the difference is relatively unnoticeable on small to medium size files.
Excel files

The read_excel() method can read Excel 2007+ (.xlsx) files using the openpyxl Python module. Excel 2003 (.xls) files can be read using xlrd. Binary Excel (.xlsb) files can be read using pyxlsb. The to_excel() instance method is used for saving a DataFrame to Excel. Generally the semantics are similar to working with csv data. See the cookbook for some advanced strategies.

Warning
The xlwt package for writing old-style .xls excel files is no longer maintained. The xlrd package is now only for reading old-style .xls files.

Before pandas 1.3.0, the default argument engine=None to read_excel() would result in using the
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7039 engine in many cases, including new Excel 2007+ (In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7036) files. pandas will now default to using the openpyxl engineIt is strongly encouraged to install In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7037 to read Excel 2007+ (In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7036) files. Please do not report issues when using ``xlrd`` to read ``. xlsx`` files. This is no longer supported, switch to using In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7037 insteadCố gắng sử dụng công cụ In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7053 sẽ tăng import openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx") 4109 trừ khi tùy chọn In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7055 được đặt thành In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7056. While this option is now deprecated and will also raise a import openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx") 4109, it can be globally set and the warning suppressed. Users are recommended to write In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7036 files using the In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7037 engine insteadReading Excel files#In the most basic use-case, In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7060 takes a path to an Excel file, and the In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7061 indicating which sheet to parseIn [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
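A minimal sketch of the basic call, assuming pandas and openpyxl are installed. The file name, sheet name, and sample data are hypothetical; the workbook is written to a temporary directory first so the snippet can run on its own.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, written out first so the example is self-contained.
sample = pd.DataFrame({"name": ["ann", "bob"], "score": [90, 85]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.xlsx")  # illustrative file name
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        sample.to_excel(writer, sheet_name="Sheet1", index=False)

    # Pass the path and the sheet to parse. The engine is named explicitly
    # here for clarity; openpyxl is the engine discussed above for .xlsx files.
    df = pd.read_excel(path, sheet_name="Sheet1", engine="openpyxl")

print(df)
```

The call returns a single DataFrame because sheet_name refers to exactly one sheet.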
ExcelFile class#
To facilitate working with multiple sheets from the same file, the ExcelFile class can be used to wrap the file and can be passed into read_excel. There will be a performance benefit for reading multiple sheets as the file is read into memory only once.
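The wrapping step might look like the sketch below, assuming pandas and openpyxl are installed. The workbook, its two sheet names, and the column names are made up for illustration.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "two_sheets.xlsx")  # illustrative file name
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        pd.DataFrame({"x": [1, 2]}).to_excel(writer, sheet_name="Sheet1", index=False)
        pd.DataFrame({"y": [3, 4]}).to_excel(writer, sheet_name="Sheet2", index=False)

    # Wrap the file once, then hand the wrapper to read_excel for each sheet.
    xlsx = pd.ExcelFile(path)
    first = pd.read_excel(xlsx, sheet_name="Sheet1")
    second = pd.read_excel(xlsx, sheet_name="Sheet2")
    xlsx.close()  # release the file handle before the directory is removed

print(first)
print(second)
```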
The ExcelFile class can also be used as a context manager.
The sheet_names property will generate a list of the sheet names in the file.

The primary use-case for an ExcelFile is parsing multiple sheets with different parameters.
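As a hedged illustration of that use-case, the sketch below opens an ExcelFile as a context manager, lists its sheets, and parses each sheet with different options. It assumes pandas and openpyxl are installed; the sheet contents and the "missing" marker are hypothetical.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "report.xlsx")  # illustrative file name
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        # "missing" is a made-up placeholder for absent values.
        pd.DataFrame({"a": [1, "missing"]}).to_excel(
            writer, sheet_name="Sheet1", index=False
        )
        pd.DataFrame({"key": ["k1", "k2"], "val": [10, 20]}).to_excel(
            writer, sheet_name="Sheet2", index=False
        )

    data = {}
    with pd.ExcelFile(path) as xls:
        names = xls.sheet_names  # list of sheet names in the file
        # Each sheet gets its own parsing parameters.
        data["Sheet1"] = pd.read_excel(xls, sheet_name="Sheet1", na_values=["missing"])
        data["Sheet2"] = pd.read_excel(xls, sheet_name="Sheet2", index_col=0)

print(names)
print(data["Sheet1"])
print(data["Sheet2"])
```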
Note that if the same parsing parameters are used for all sheets, a list of sheet names can simply be passed to read_excel with no loss in performance.
ExcelFile can also be called with a xlrd.book.Book object as a parameter. This allows the user to control how the excel file is read. For example, sheets can be loaded on demand by calling xlrd.open_workbook() with on_demand=True.
Specifying sheets#
Note
The second argument is sheet_name, not to be confused with ExcelFile.sheet_names.

Note
An ExcelFile's attribute sheet_names provides access to a list of sheets.

The argument sheet_name allows specifying the sheet or sheets to read.
The default value for sheet_name is 0, indicating to read the first sheet.
Pass a string to refer to the name of a particular sheet in the workbook.
Pass an integer to refer to the index of a sheet. Indices follow Python convention, beginning at 0.
Pass a list of either strings or integers, to return a dictionary of specified sheets.
Pass a None to return a dictionary of all available sheets.
Sheets can be selected by using the sheet index, by using all default values, by using None to get all sheets, or by using a list to get multiple sheets.
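The four ways of selecting sheets can be sketched as follows, assuming pandas and openpyxl are installed. The workbook and its sheet names are hypothetical and are created in a temporary directory.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sheets.xlsx")  # illustrative file name
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        pd.DataFrame({"x": [1, 2]}).to_excel(writer, sheet_name="Sheet1", index=False)
        pd.DataFrame({"y": [3, 4]}).to_excel(writer, sheet_name="Sheet2", index=False)

    # Using the sheet index: 0 is the first sheet.
    by_index = pd.read_excel(path, sheet_name=0)

    # Using all default values: also reads the first sheet.
    defaults = pd.read_excel(path)

    # Using None to get all sheets: returns a dict keyed by sheet name.
    all_sheets = pd.read_excel(path, sheet_name=None)

    # Using a list to get multiple sheets: names and positions can be mixed,
    # and the dict keys are the items that were passed in.
    some_sheets = pd.read_excel(path, sheet_name=["Sheet1", 1])

print(list(all_sheets))
print(list(some_sheets))
```

A single name or index gives back one DataFrame, while None or a list gives back a dictionary of DataFrames.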
read_excel can read more than one sheet, by setting sheet_name to either a list of sheet names, a list of sheet positions, or None to read all sheets. Sheets can be specified by sheet index or sheet name, using an integer or string, respectively.

Reading a MultiIndex#
read_excel can read a MultiIndex index, by passing a list of columns to index_col, and a MultiIndex column by passing a list of rows to header. If either the index or columns have serialized level names, those will be read in as well by specifying the rows/columns that make up the levels.

For example, to read in a MultiIndex index without names:
If the index has level names, they will be parsed as well, using the same parameters.
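A small round trip illustrates the named case. This is a sketch that assumes pandas and openpyxl are installed; the level names lvl1 and lvl2, the data, and the file name are invented for the example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical frame with a two-level, named row index.
index = pd.MultiIndex.from_product([["a", "b"], ["c", "d"]], names=["lvl1", "lvl2"])
original = pd.DataFrame({"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]}, index=index)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "multiindex.xlsx")  # illustrative file name
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        original.to_excel(writer)

    # The first two columns of the sheet hold the index levels.
    restored = pd.read_excel(path, index_col=[0, 1])

print(restored)
```

The same index_col=[0, 1] argument is used whether or not the levels carry names; when names are present in the sheet they come back on the index.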
If the source file has both MultiIndex index and columns, lists specifying each should be passed to index_col and header.
783Các giá trị bị thiếu trong các cột được chỉ định trong import openpyxl
worksheet = openpyxl.load_workbook("codespeedy.xlsx")
sheet = worksheet.active
sheet.column_dimensions['A'].width = 20
worksheet.save("codespeedy1.xlsx") 0564 sẽ được điền chuyển tiếp để cho phép thực hiện vòng lặp với In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7095 cho In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7096. To avoid forward filling the missing values use In [13]: import numpy as np
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
set_index after reading the data instead of index_col.

Parsing specific columns

Users often insert columns in Excel to do temporary computations, and you may not want to read those columns in. read_excel() takes a usecols keyword that lets you specify a subset of columns to parse.

Changed in version 1.0.0: Passing an integer for usecols no longer works. Pass a list of ints from 0 to the former usecols value, inclusive, instead.

You can specify a comma-delimited set of Excel columns and ranges as a string, for example usecols="A,C:E".
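The string form described above can be sketched as follows, assuming pandas and openpyxl are installed. The sample data, the column names, and the in-memory buffer are illustrative assumptions that stand in for a real workbook path.

```python
import io

import pandas as pd

# Hypothetical sample sheet occupying Excel columns A..E, built in memory.
source = pd.DataFrame(
    {"id": [1, 2], "tmp": [0, 0], "x": [1.5, 2.5], "y": [3, 4], "z": ["a", "b"]}
)
buffer = io.BytesIO()
source.to_excel(buffer, index=False, engine="openpyxl")
buffer.seek(0)

# Keep Excel column A and the range C through E; column B ("tmp") is skipped.
subset = pd.read_excel(buffer, usecols="A,C:E")
print(list(subset.columns))
```

The letters refer to spreadsheet columns, not to header names, so this form works even when the header row is missing or unreliable.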
If usecols is a list of integers, it is taken to be the file column indices to parse.
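A short sketch of the integer-list form, under the same assumptions as before (pandas and openpyxl installed, hypothetical sample data held in memory). The indices are zero-based positions in the sheet.

```python
import io

import pandas as pd

# Hypothetical sample sheet with five columns at positions 0..4.
source = pd.DataFrame(
    {"id": [1, 2], "tmp": [0, 0], "x": [1.5, 2.5], "y": [3, 4], "z": ["a", "b"]}
)
buffer = io.BytesIO()
source.to_excel(buffer, index=False, engine="openpyxl")
buffer.seek(0)

# Parse only the columns at positions 0, 2 and 3.
by_position = pd.read_excel(buffer, usecols=[0, 2, 3])
print(list(by_position.columns))
```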
Element order is ignored, so usecols=[0, 1] is the same as [1, 0].

If usecols is a list of strings, each string is assumed to correspond to a column name, either provided by the user in names or inferred from the document header row(s). Those strings define which columns will be parsed.
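The name-based form might look like the sketch below. It assumes pandas and openpyxl are installed and uses made-up column names; reading the same bytes twice with the names in a different order is only there to illustrate that element order does not matter.

```python
import io

import pandas as pd

# Hypothetical sample sheet whose header row supplies the column names.
source = pd.DataFrame({"id": [1, 2], "tmp": [0, 0], "x": [1.5, 2.5]})
buffer = io.BytesIO()
source.to_excel(buffer, index=False, engine="openpyxl")
data = buffer.getvalue()

# Select columns by header name; the two calls differ only in element order.
first = pd.read_excel(io.BytesIO(data), usecols=["x", "id"])
second = pd.read_excel(io.BytesIO(data), usecols=["id", "x"])
print(first.equals(second))
```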
Element order is ignored, so usecols=['baz', 'joe'] is the same as ['joe', 'baz'].

If usecols is callable, the function is evaluated against the column names, and the columns for which it returns True are parsed.
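A minimal sketch of the callable form, assuming pandas and openpyxl are installed. The column names are hypothetical; the predicate keeps names made only of letters, so the scratch column "tmp_1" is dropped.

```python
import io

import pandas as pd

# Hypothetical sample sheet; "tmp_1" plays the role of a scratch column.
source = pd.DataFrame({"id": [1, 2], "tmp_1": [0, 0], "x": [1.5, 2.5]})
buffer = io.BytesIO()
source.to_excel(buffer, index=False, engine="openpyxl")
buffer.seek(0)

# The callable receives each column name and returns True to keep it.
alpha_only = pd.read_excel(buffer, usecols=lambda name: name.isalpha())
print(list(alpha_only.columns))
```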
Parsing dates

Datetime-like values are normally converted to the appropriate dtype automatically when reading the Excel file. But if you have a column of strings that look like dates (but are not actually formatted as dates in Excel), you can use the parse_dates keyword to parse those strings to datetimes.
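As an illustration, the sketch below stores ISO-formatted date strings as plain text cells and asks read_excel to parse them. It assumes pandas and openpyxl are installed; the column name date_strings and the values are made up.

```python
import io

import pandas as pd

# Hypothetical sheet where dates were typed as text, not as Excel dates.
source = pd.DataFrame(
    {"date_strings": ["2024-01-15", "2024-02-20"], "value": [1, 2]}
)
buffer = io.BytesIO()
source.to_excel(buffer, index=False, engine="openpyxl")
buffer.seek(0)

# parse_dates names the columns whose strings should become datetimes.
parsed = pd.read_excel(buffer, parse_dates=["date_strings"])
print(parsed.dtypes)
```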
Cell converters

It is possible to transform the contents of Excel cells via the converters option. For instance, passing bool as the converter for a column converts that column to boolean.
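A small sketch of a converter, assuming pandas and openpyxl are installed. The column name MyBools and its 1/0 values are illustrative only.

```python
import io

import pandas as pd

# Hypothetical sheet that stores flags as the integers 1 and 0.
source = pd.DataFrame({"MyBools": [1, 0, 1]})
buffer = io.BytesIO()
source.to_excel(buffer, index=False, engine="openpyxl")
buffer.seek(0)

# converters maps a column name to a function applied to every cell.
flags = pd.read_excel(buffer, converters={"MyBools": bool})
print(flags["MyBools"].tolist())
```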
This option handles missing values and treats exceptions in the converters as missing data. Transformations are applied cell by cell rather than to the column as a whole, so the array dtype is not guaranteed. For instance, a column of integers with missing values cannot be transformed to an array with integer dtype, because NaN is strictly a float. You can manually mask missing data to recover integer dtype.
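One way to mask missing cells is sketched below, assuming pandas and openpyxl are installed. The helper name cfun, the sentinel value -1, and the column names are assumptions made for the example; the check for both an empty string and NaN is deliberately defensive, since how an empty cell reaches the converter may depend on the reader engine.

```python
import io

import pandas as pd

# Hypothetical sheet with one empty cell in the integer column.
source = pd.DataFrame({"name": ["a", "b", "c"], "MyInts": [1, None, 3]})
buffer = io.BytesIO()
source.to_excel(buffer, index=False, engine="openpyxl")
buffer.seek(0)


def cfun(cell):
    """Return the cell as an int, or the sentinel -1 when it is empty."""
    if cell == "" or pd.isna(cell):
        return -1
    return int(cell)


masked = pd.read_excel(buffer, converters={"MyInts": cfun})
print(masked["MyInts"].tolist())
```

The sentinel has to be a value that cannot occur in the real data, otherwise missing and genuine entries become indistinguishable.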
Dtype specifications

As an alternative to converters, the type for an entire column can be specified using the dtype keyword, which takes a dictionary mapping column names to types. To interpret data with no type inference, use the type str or object.
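The dtype mapping might be used as in this sketch, which assumes pandas and openpyxl are installed. The column names MyInts and MyText are hypothetical; MyText holds numbers in the sheet but is read back as text.

```python
import io

import pandas as pd

# Hypothetical sheet: both columns contain numeric cells.
source = pd.DataFrame({"MyInts": [1, 2, 3], "MyText": [10, 20, 30]})
buffer = io.BytesIO()
source.to_excel(buffer, index=False, engine="openpyxl")
buffer.seek(0)

# One dtype per column; str disables numeric inference for MyText.
typed = pd.read_excel(buffer, dtype={"MyInts": "int64", "MyText": str})
print(typed["MyText"].tolist())
```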
Writing Excel files

Writing Excel files to disk

To write a DataFrame object to a sheet of an Excel file, you can use the to_excel instance method. The arguments are largely the same as for to_csv described above: the first argument is the name of the Excel file, and the optional second argument is the name of the sheet to which the DataFrame should be written.
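A minimal round trip to disk could look like this, assuming pandas and openpyxl are installed. The file name, the temporary directory, and the explicit engine="openpyxl" are choices made for the sketch, not requirements.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical target path inside a throwaway directory.
    path = os.path.join(tmp_dir, "path_to_file.xlsx")

    # First argument: file name. sheet_name: the sheet to write to.
    df.to_excel(path, sheet_name="Sheet1", engine="openpyxl")

    # Read the sheet back, using the first column as the index again.
    round_trip = pd.read_excel(path, sheet_name="Sheet1", index_col=0)

print(round_trip)
```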
Files with a .xls extension will be written using xlwt, and those with a .xlsx extension will be written using xlsxwriter (if available) or openpyxl.

The DataFrame will be written in a way that tries to mimic the REPL output. The index_label will be placed in the second row instead of the first. You can place it in the first row by setting the merge_cells option in to_excel() to False.
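The call shape for these two options is sketched below, assuming pandas and openpyxl are installed. The frame, the label text, and the in-memory buffer are illustrative. The sketch only shows how the arguments are passed and where to look afterwards; it uses a single-level index, so it does not demonstrate the row placement difference itself.

```python
import io

import openpyxl
import pandas as pd

df = pd.DataFrame({"a": [1, 2]}, index=["r1", "r2"])

buffer = io.BytesIO()
# index_label names the index column; merge_cells=False disables merged headers.
df.to_excel(buffer, index_label="label", merge_cells=False, engine="openpyxl")
buffer.seek(0)

# Inspect the raw worksheet to see what landed in the first column.
sheet = openpyxl.load_workbook(buffer).active
first_column = [cell.value for cell in sheet["A"]]
print(first_column)
```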
In order to write separate DataFrames to separate sheets in a single Excel file, one can pass an ExcelWriter.
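A sketch of the multi-sheet pattern, assuming pandas and openpyxl are installed. The frame contents, sheet names, and temporary path are made up for the example.

```python
import os
import tempfile

import pandas as pd

df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"b": ["x", "y"]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "path_to_file.xlsx")

    # One writer, several to_excel calls, each targeting its own sheet.
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        df1.to_excel(writer, sheet_name="Sheet1")
        df2.to_excel(writer, sheet_name="Sheet2")

    # List the sheets that ended up in the workbook.
    with pd.ExcelFile(path) as workbook:
        sheet_names = list(workbook.sheet_names)

print(sheet_names)
```

Using the writer as a context manager saves and closes the file when the block exits.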
Writing Excel files to memory

pandas supports writing Excel files to buffer-like objects such as StringIO or BytesIO using ExcelWriter.
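Writing to memory could be sketched as below, assuming pandas and openpyxl are installed. The sketch uses BytesIO because an .xlsx workbook is binary data (a zip archive); the variable names are illustrative.

```python
import io

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

bio = io.BytesIO()
# Passing the buffer instead of a path keeps the workbook in memory.
with pd.ExcelWriter(bio, engine="openpyxl") as writer:
    df.to_excel(writer, sheet_name="Sheet1")

# The finished workbook as bytes, e.g. for an HTTP response body.
workbook_bytes = bio.getvalue()
print(len(workbook_bytes))
```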
Note

engine is optional but recommended. Setting the engine determines the version of workbook produced. Setting engine='xlwt' will produce an Excel 2003-format workbook (xls). Using either 'openpyxl' or 'xlsxwriter' will produce an Excel 2007-format workbook (xlsx). If omitted, an Excel 2007-formatted workbook is produced.

Excel writer engines

Deprecated since version 1.2.0: As the xlwt package is no longer maintained, the xlwt engine will be removed from a future version of pandas. This is the only engine in pandas that supports writing to .xls files.

pandas chooses an Excel writer via two methods:

1. the engine keyword argument
2. the filename extension (via the default specified in config options)

By default, pandas uses XlsxWriter for .xlsx, openpyxl for .xlsm, and xlwt for .xls files. If you have multiple engines installed, you can set the default engine by setting the config options io.excel.xlsx.writer and io.excel.xls.writer. pandas will fall back on openpyxl for .xlsx files if XlsxWriter is not available.

To specify which writer you want to use, you can pass an engine keyword argument to to_excel and to ExcelWriter. The built-in engines are:

openpyxl: version 2.4 or higher is required
xlsxwriter
xlwt
Style and formatting

The look and feel of Excel worksheets created from pandas can be modified using the following parameters on the to_excel method of the DataFrame.

float_format: Format string for floating point numbers (default None).

freeze_panes: A tuple of two integers representing the bottommost row and rightmost column to freeze. Each of these parameters is one-based, so (1, 1) will freeze the first row and first column (default None).
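As a rough illustration of the two formatting parameters, the sketch below wraps to_excel in a small helper. The helper name write_styled_excel and the sample data are hypothetical, and writing requires an Excel writer engine such as openpyxl to be installed.

```python
import io

import pandas as pd


def write_styled_excel(df: pd.DataFrame, target) -> None:
    """Write df to an Excel target with two-decimal floats and frozen panes."""
    df.to_excel(
        target,
        sheet_name="Sheet1",
        float_format="%.2f",  # format string applied to floating point cells
        freeze_panes=(1, 1),  # one-based: freeze the first row and first column
    )


# Hypothetical sample data, used only for illustration.
sample = pd.DataFrame({"price": [1.23456, 2.5], "qty": [3, 4]})
```

Calling write_styled_excel(sample, "styled.xlsx"), or passing an io.BytesIO() buffer as the target, should produce a sheet whose header row and index column stay visible while scrolling.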
Using the Xlsxwriter engine provides many options for controlling the format of an Excel worksheet created with the to_excel method. Excellent examples can be found in the Xlsxwriter documentation here: https://xlsxwriter.readthedocs.io/working_with_pandas.html

OpenDocument Spreadsheets

New in version 0.25.

The read_excel() method can also read OpenDocument spreadsheets using the odfpy module. The semantics and features for reading OpenDocument spreadsheets match what can be done for Excel files using engine='odf'.
Note: Currently pandas only supports reading OpenDocument spreadsheets. Writing is not implemented.

Binary Excel (.xlsb) files

New in version 1.0.0.

The read_excel() method can also read binary Excel files using the pyxlsb module. The semantics and features for reading binary Excel files mostly match what can be done for Excel files using engine='pyxlsb'. pyxlsb does not recognize datetime types in files and will return floats instead.
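One way to avoid remembering which engine goes with which file type is a small lookup helper. This is only a sketch: "odf" and "pyxlsb" are the engine names pandas uses for these formats, while ENGINE_BY_SUFFIX, engine_for and read_spreadsheet are hypothetical names introduced here, and the optional odfpy or pyxlsb package must be installed for the actual read to work.

```python
from pathlib import Path

# Extension-to-engine mapping; the mapping itself is a hypothetical convenience.
ENGINE_BY_SUFFIX = {".ods": "odf", ".xlsb": "pyxlsb"}


def engine_for(path):
    """Return the read_excel engine for a path, or None to let pandas decide."""
    return ENGINE_BY_SUFFIX.get(Path(path).suffix.lower())


def read_spreadsheet(path, **kwargs):
    """Read a spreadsheet, passing an explicit engine for .ods and .xlsb files."""
    import pandas as pd  # imported lazily so engine_for has no dependencies

    return pd.read_excel(path, engine=engine_for(path), **kwargs)
```

With this in place, read_spreadsheet("report.ods") would go through the odf engine, and any other extension falls back to the default engine selection in pandas.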
Note: Currently pandas only supports reading binary Excel files. Writing is not implemented.

Clipboard

A handy way to grab data is to use the read_clipboard() method, which takes the contents of the clipboard buffer and passes them to the read_csv method. For instance, you can copy a small table of text to the clipboard (CTRL-C on many operating systems) and then import the data directly into a DataFrame by calling read_clipboard().
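Because a real clipboard is not always available (for example on a headless server), the sketch below imitates the hand-off to read_csv using a string in place of the clipboard buffer. The sample text and the whitespace separator are assumptions made for illustration; on a desktop session the single call pd.read_clipboard() would take the place of the read_csv line.

```python
from io import StringIO

import pandas as pd

# Stand-in for text copied to the clipboard (hypothetical sample data).
clipboard_text = """\
A B C
x 1 4 p
y 2 5 q
z 3 6 r
"""

# Parse the text as whitespace-separated values; the extra leading field in
# each data row becomes the index.
clipdf = pd.read_csv(StringIO(clipboard_text), sep=r"\s+")
print(clipdf)
```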
The to_clipboard method can be used to write the contents of a DataFrame to the clipboard. You can then paste the clipboard contents into other applications (CTRL-V on many operating systems). Here we illustrate writing a DataFrame into the clipboard and reading it back.
We can see that we got back the same content that we had earlier written to the clipboard.

Note: You may need to install xclip or xsel (with PyQt5, PyQt4 or qtpy) on Linux to use these methods.

Pickling

All pandas objects are equipped with to_pickle methods, which use Python's pickle module to save data structures to disk using the pickle format. The read_pickle function in the pandas namespace can be used to load any pickled pandas object (or any other pickled object) from file.
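A minimal round trip might look like the following. The frame contents and the file name foo.pkl are made up for illustration, and a temporary directory is used so that nothing is left behind.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample frame, used only for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "foo.pkl")
    df.to_pickle(path)               # save to disk in pickle format
    restored = pd.read_pickle(path)  # load it back

print(restored.equals(df))
```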
Warning: Loading pickled data received from untrusted sources can be unsafe. See: https://docs.python.org/3/library/pickle.html

Warning: read_pickle() is only guaranteed to be backwards compatible back to pandas version 0.20.3.

Compressed pickle files

read_pickle(), DataFrame.to_pickle() and Series.to_pickle() can read and write compressed pickle files. The compression types gzip, bz2, xz and zstd are supported for reading and writing. The zip file format only supports reading and must contain only one data file to be read.

The compression type can be an explicit parameter or be inferred from the file extension. If 'infer', then gzip, bz2, zip, xz or zstd is used if the filename ends in '.gz', '.bz2', '.zip', '.xz' or '.zst', respectively.

The compression parameter can also be a dict in order to pass options to the compression protocol. It must have a 'method' key set to the name of the compression protocol, which must be one of {'zip', 'gzip', 'bz2', 'xz', 'zstd'}. All other key-value pairs are passed to the underlying compression library.
Using an explicit compression type.
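A sketch of the explicit form, assuming a made-up frame and a file name whose extension gives no hint about the compression, so the type has to be stated on both the write and the read:

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample frame, used only for illustration.
df = pd.DataFrame({"A": [0.5, 1.5, 2.5], "B": ["foo", "bar", "baz"]})

with tempfile.TemporaryDirectory() as tmp:
    # The ".compress" suffix is not recognised, so compression is given explicitly.
    path = os.path.join(tmp, "data.pkl.compress")
    df.to_pickle(path, compression="gzip")
    restored = pd.read_pickle(path, compression="gzip")

print(restored.equals(df))
```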
Inferring the compression type from the extension. The default is to 'infer'.
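The inferred form can be sketched as below. The file name data.pkl.gz and the frame are assumptions; the check of the first two bytes is only there to show that the file really was written as gzip without any compression argument.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample frame, used only for illustration.
df = pd.DataFrame({"A": [0.5, 1.5, 2.5], "B": ["foo", "bar", "baz"]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.pkl.gz")
    df.to_pickle(path)               # compression defaults to "infer" -> gzip
    with open(path, "rb") as fh:
        magic = fh.read(2)           # gzip streams start with bytes 0x1f 0x8b
    restored = pd.read_pickle(path)  # inferred again on the way back in

print(magic == b"\x1f\x8b", restored.equals(df))
```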
Passing options to the compression protocol in order to speed up compression.
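As a sketch of the dict form, the example below asks gzip for its fastest setting. The compresslevel key is forwarded to Python's gzip module; the frame and file name are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample frame, used only for illustration.
df = pd.DataFrame({"A": range(1000), "B": ["foo"] * 1000})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.pkl.gz")
    # "method" names the protocol; the remaining keys go to the gzip library.
    df.to_pickle(path, compression={"method": "gzip", "compresslevel": 1})
    restored = pd.read_pickle(path)

print(restored.equals(df))
```

A low compresslevel trades file size for speed, which tends to matter most for large frames that are written often.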
msgpack

pandas support for msgpack has been removed in version 1.0.0. It is recommended to use pickle instead. Alternatively, you can also use the Arrow IPC serialization format for on-the-wire transmission of pandas objects. For details, see the pyarrow documentation.

HDF5 (PyTables)

HDFStore is a dict-like object which reads and writes pandas objects using the high-performance HDF5 format, via the excellent PyTables library. See the cookbook for some advanced strategies.

Warning: pandas uses PyTables for reading and writing HDF5 files, which allows serializing object-dtype data with pickle. Loading pickled data received from untrusted sources can be unsafe. See: https://docs.python.org/3/library/pickle.html for more.

Objects can be written to the file just like adding key-value pairs to a dict.

In a current or later Python session, you can retrieve stored objects.

Deletion of the object specified by the key.

Closing a store and using a context manager.
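The four operations above (write, retrieve, delete, close) can be sketched in one helper. The function name hdfstore_demo, the key "df" and the data are hypothetical, and running it requires the optional PyTables package (imported as tables).

```python
import pandas as pd


def hdfstore_demo(path):
    """Exercise the dict-like HDFStore interface; needs the PyTables package."""
    df = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]})  # hypothetical data
    # The context manager closes the store even if an error occurs.
    with pd.HDFStore(path, mode="w") as store:
        store["df"] = df          # write, like assigning a dict item
        loaded = store["df"]      # retrieve the stored object by key
        del store["df"]           # delete the object stored under the key
        remaining = store.keys()  # keys left in the store after deletion
    return loaded, remaining
```

Calling hdfstore_demo("store.h5") should return the frame that was stored together with an empty list of remaining keys.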
Read/write API

HDFStore supports a top-level API using read_hdf for reading and to_hdf for writing, similar to how read_csv and to_csv work.
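As a rough illustration of the top-level API described above, the sketch below writes a small frame with to_hdf and reads it back with read_hdf. The file name, key, and column names are made up for the example, and the guard at the bottom assumes that PyTables (the optional "tables" package) may not be installed.

```python
import importlib.util
import os
import tempfile

import pandas as pd


def roundtrip(path):
    """Write a small frame with to_hdf and read it back with read_hdf."""
    df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})
    df.to_hdf(path, key="df", mode="w")  # mode="w" creates or overwrites the file
    return df, pd.read_hdf(path, key="df")


# PyTables ships as the optional "tables" package, so check for it first.
written = restored = None
if importlib.util.find_spec("tables") is not None:
    with tempfile.TemporaryDirectory() as tmp:
        written, restored = roundtrip(os.path.join(tmp, "store.h5"))
```

Both calls open and close the file themselves, so no HDFStore object has to be managed by hand.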
By default, HDFStore does not drop rows that are entirely missing. This behavior can be changed by setting dropna=True.
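A minimal sketch of the dropna behavior, assuming a hypothetical two-column frame in which only the middle row is entirely missing. The keys "kept" and "dropped" are illustrative names, and the guard again assumes PyTables may be absent.

```python
import importlib.util
import os
import tempfile

import numpy as np
import pandas as pd


def compare_dropna(path):
    """Store the same frame with and without dropna=True and read both back."""
    df = pd.DataFrame(
        {"col1": [0.0, np.nan, 2.0], "col2": [1.0, np.nan, np.nan]}
    )
    df.to_hdf(path, key="kept", format="table", mode="w")
    # Rows whose values are all missing are skipped when dropna=True.
    df.to_hdf(path, key="dropped", format="table", dropna=True)
    return pd.read_hdf(path, key="kept"), pd.read_hdf(path, key="dropped")


kept = dropped = None
if importlib.util.find_spec("tables") is not None:  # PyTables is optional
    with tempfile.TemporaryDirectory() as tmp:
        kept, dropped = compare_dropna(os.path.join(tmp, "missing.h5"))
```

The last row survives in both cases because only one of its two values is missing.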
Fixed format

The examples above show storing with put, which writes the HDF5 to PyTables in a fixed array format, called the fixed format. Stores of this type cannot be appended to once written (though you can simply remove them and rewrite). They are also not queryable; they must be retrieved in their entirety. They also do not support dataframes with non-unique column names. Fixed format stores offer very fast writing and slightly faster reading than table stores. This format is used by default with put or to_hdf, and can be requested explicitly with format='fixed' or format='f'.

Warning: A fixed format will raise a TypeError if you try to retrieve it using a where.
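The warning can be reproduced with a short sketch like the one below. It assumes a small throwaway frame, relies on to_hdf defaulting to the fixed format when no format is given, and skips the demonstration when PyTables is not installed.

```python
import importlib.util
import os
import tempfile

import pandas as pd


def query_fixed_store(path):
    """Return True if querying a fixed-format store raises TypeError."""
    df = pd.DataFrame({"A": range(10), "B": range(10, 20)})
    df.to_hdf(path, key="df", mode="w")  # no format given, so fixed is used
    try:
        pd.read_hdf(path, key="df", where="index > 5")
    except TypeError:
        return True
    return False


raised = None  # stays None when PyTables is not available
if importlib.util.find_spec("tables") is not None:
    with tempfile.TemporaryDirectory() as tmp:
        raised = query_fixed_store(os.path.join(tmp, "fixed.h5"))
```

Reading the same key without a where argument works, since a fixed store is always read in its entirety.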
Table format

HDFStore supports another PyTables format on disk, the table format. Conceptually, a table is shaped very much like a DataFrame, with rows and columns. A table may be appended to in the same or other sessions. In addition, delete and query type operations are supported. This format is specified by passing format='table' or format='t' to append, put, or to_hdf.

This format can also be set as an option, pd.set_option('io.hdf.default_format', 'table'), so that put, append, and to_hdf store in the table format by default.
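The following sketch shows one way the table format might be used: a frame is appended in two chunks, and a second copy is written with put and format="table" so that it can be queried. The keys and column names are illustrative, and the guard assumes PyTables may be missing.

```python
import importlib.util
import os
import tempfile

import pandas as pd


def append_in_chunks(path):
    """Build a table from two appends, then query a table written with put."""
    df = pd.DataFrame({"A": range(8), "B": [x / 2 for x in range(8)]})
    with pd.HDFStore(path, mode="w") as store:
        store.append("df", df.iloc[:4])  # append writes the table format
        store.append("df", df.iloc[4:])  # later chunks extend the same table
        store.put("df_put", df, format="table")
        everything = store.select("df")
        subset = store.select("df_put", where="index >= 6")
    return everything, subset


everything = subset = None
if importlib.util.find_spec("tables") is not None:  # PyTables is optional
    with tempfile.TemporaryDirectory() as tmp:
        everything, subset = append_in_chunks(os.path.join(tmp, "table.h5"))
```

Setting the io.hdf.default_format option to "table" would make the explicit format argument to put unnecessary.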
Note: You can also create a table by passing format='table' or format='t' to a put operation.

Hierarchical keys

Keys to a store can be specified as a string. They can be in a hierarchical path-name like format (e.g. foo/bar/bah), which will generate a hierarchy of sub-stores (or Groups in PyTables parlance). Keys can be specified without the leading '/' and are always absolute (e.g. 'foo' refers to '/foo'). Removal operations can remove everything in the sub-store and below, so be careful.
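A small sketch of hierarchical keys, using made-up key names ("foo/bar/bah", "food/orange", "food/apple"). It lists the keys, then removes the "food" sub-store to show that everything below it disappears. The guard assumes PyTables may not be installed.

```python
import importlib.util
import os
import tempfile

import pandas as pd


def nested_keys(path):
    """Create nested keys, then remove one sub-store and everything below it."""
    df = pd.DataFrame({"A": [1, 2, 3]})
    with pd.HDFStore(path, mode="w") as store:
        store.put("foo/bar/bah", df)
        store.append("food/orange", df)
        store.append("food/apple", df)
        before = sorted(store.keys())
        store.remove("food")  # removes /food/orange and /food/apple as well
        after = store.keys()
    return before, after


before = after = None
if importlib.util.find_spec("tables") is not None:  # PyTables is optional
    with tempfile.TemporaryDirectory() as tmp:
        before, after = nested_keys(os.path.join(tmp, "nested.h5"))
```

The keys come back with a leading slash even though they were given without one, which matches the statement that keys are always absolute.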
You can walk through the group hierarchy using the walk method, which yields a tuple for each group key along with the relative keys of its contents.
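As an illustration, the sketch below rebuilds the absolute keys of a store by joining each group path yielded by walk with the relative names of its leaves. The stored keys are hypothetical, and the guard assumes PyTables may be absent.

```python
import importlib.util
import os
import tempfile

import pandas as pd


def list_keys_by_walking(path):
    """Collect absolute keys by walking the group hierarchy."""
    df = pd.DataFrame({"A": [1, 2]})
    found = []
    with pd.HDFStore(path, mode="w") as store:
        store.put("df", df)
        store.put("foo/bar/bah", df)
        # Each tuple holds the group path, its sub-group names, and its
        # pandas objects (leaves), both relative to that group.
        for group_path, subgroups, leaves in store.walk():
            for leaf in leaves:
                found.append("/".join([group_path, leaf]))
    return sorted(found)


found = None
if importlib.util.find_spec("tables") is not None:  # PyTables is optional
    with tempfile.TemporaryDirectory() as tmp:
        found = list_keys_by_walking(os.path.join(tmp, "walk.h5"))
```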
Warning: Hierarchical keys cannot be retrieved with dotted (attribute) access as described above for items stored under the root node.

Instead, use explicit string-based keys.
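A minimal sketch of string-based access to a nested key, assuming a hypothetical key "foo/bar/bah". It reads the same object once without and once with the leading slash, and skips the demonstration when PyTables is not installed.

```python
import importlib.util
import os
import tempfile

import pandas as pd


def read_nested(path):
    """Read a nested object through explicit string keys."""
    df = pd.DataFrame({"A": [1, 2, 3]})
    with pd.HDFStore(path, mode="w") as store:
        store.put("foo/bar/bah", df)
        relative = store["foo/bar/bah"]  # no leading slash
        absolute = store.get("/foo/bar/bah")  # same node, absolute form
    return relative, absolute


relative = absolute = None
if importlib.util.find_spec("tables") is not None:  # PyTables is optional
    with tempfile.TemporaryDirectory() as tmp:
        relative, absolute = read_nested(os.path.join(tmp, "keys.h5"))
```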
Storing types

Storing mixed types in a table

Storing mixed-dtype data is supported. Strings are stored as fixed-width using the maximum size of the appended column. Subsequent attempts at appending longer strings will raise a ValueError.

Passing min_itemsize={'values': size} as a parameter to append will set a larger minimum for the string columns. Storing floats, strings, ints, bools, and datetime64 is currently supported. For string columns, passing nan_rep = 'nan' to append will change the default nan representation on disk (which converts to/from np.nan); this defaults to nan.
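The fixed-width behavior can be sketched as follows. The frames, the width of 10, and the column names are all assumptions chosen for the example. The strings are held in object-dtype columns, and the demonstration is skipped when PyTables is missing.

```python
import importlib.util
import os
import tempfile

import pandas as pd


def make_frame(name, value):
    """One-row frame with an object-dtype string column and a float column."""
    return pd.DataFrame(
        {"name": pd.Series([name], dtype=object), "value": [value]}
    )


def append_strings(path):
    """Reserve 10 characters for strings, then try to append a longer one."""
    with pd.HDFStore(path, mode="w") as store:
        store.append("df", make_frame("ab", 1.0), min_itemsize={"values": 10})
        store.append("df", make_frame("abcdefgh", 2.0))  # 8 chars, fits
        try:
            store.append("df", make_frame("x" * 20, 3.0))  # 20 chars, too wide
            rejected = False
        except ValueError:
            rejected = True
        rows = len(store.select("df"))
    return rows, rejected


rows = rejected = None
if importlib.util.find_spec("tables") is not None:  # PyTables is optional
    with tempfile.TemporaryDirectory() as tmp:
        rows, rejected = append_strings(os.path.join(tmp, "mixed.h5"))
```

Without min_itemsize, the width would have been fixed at 2 characters by the first append, and the second append would already have failed.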
Storing MultiIndex DataFrames

Storing MultiIndex DataFrames as tables is very similar to storing/selecting from homogeneous index DataFrames.
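A rough sketch of storing a MultiIndex frame as a table and selecting on one of its levels. The level names "year" and "quarter" and the "sales" column are hypothetical, and the guard assumes PyTables may not be installed.

```python
import importlib.util
import os
import tempfile

import pandas as pd


def store_multiindex(path):
    """Append a MultiIndex frame and select rows by a level name."""
    index = pd.MultiIndex.from_product(
        [[2023, 2024], [1, 2]], names=["year", "quarter"]
    )
    df = pd.DataFrame({"sales": [10.0, 11.0, 12.0, 13.0]}, index=index)
    with pd.HDFStore(path, mode="w") as store:
        store.append("df_mi", df)
        everything = store.select("df_mi")
        # Level names can be used in the query like ordinary indexers.
        one_year = store.select("df_mi", where="year == 2024")
    return everything, one_year


everything_mi = one_year = None
if importlib.util.find_spec("tables") is not None:  # PyTables is optional
    with tempfile.TemporaryDirectory() as tmp:
        everything_mi, one_year = store_multiindex(os.path.join(tmp, "mi.h5"))
```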
Note: The index keyword is reserved and cannot be used as a level name.

Querying

Querying a table

The select and delete operations have an optional criterion that can be specified to select or delete only a subset of the data. This allows one to have a very large on-disk table and retrieve only a portion of the data.

A query is specified using the Term class under the hood, as a boolean expression.

- index and columns are supported indexers of DataFrames.
- If data_columns are specified, these can be used as additional indexers.
- Level names in a MultiIndex, with default names level_0, level_1, … if not provided.
Valid comparison operators are: =, ==, !=, >, >=, <, <=

Valid boolean expressions are combined with:

- | : or
- & : and
- ( and ) : for grouping
These rules are similar to how boolean expressions are used in pandas for indexing.

Note:
= will be automatically expanded to the comparison operator ==
~ is the not operator, but can only be used in very limited circumstances
If a list/tuple of expressions is passed, they will be combined via &
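The two rewriting rules in the note above can be pictured with a small plain-Python sketch. The helpers `expand_equals` and `combine_terms` are hypothetical names made up for this illustration; they are not part of pandas and only mimic the described behaviour on simple expressions.

```python
import re

# Hypothetical helpers for illustration only; pandas does this internally
# in its own way. The regex is a simplification and would also rewrite an
# "=" that appears inside a quoted string value.


def expand_equals(expr: str) -> str:
    """Rewrite a lone '=' as '==' while leaving '==', '!=', '>=' and '<=' alone."""
    return re.sub(r"(?<![=!<>])=(?!=)", "==", expr)


def combine_terms(terms) -> str:
    """Join several sub-expressions with '&', parenthesising each one."""
    return " & ".join(f"({expand_equals(term)})" for term in terms)


print(expand_equals("columns = A"))
print(combine_terms(["index >= date", "string = 'bar'"]))
```

Parenthesising each term before joining keeps every sub-expression grouped on its own, which is the role the grouping parentheses play in the rules above.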
The following are valid expressions:

'index >= date'
"columns = ['A', 'D']"
"columns in ['A', 'D']"
'columns = A'
'columns == A'
"~(columns = ['A', 'B'])"
'index > df.index[3] & string = "bar"'
'(index > df.index[3] & index <= df.index[6]) | string = "bar"'
"ts >= Timestamp('2012-02-01')"
"major_axis>=20130101"
The indexers are on the left-hand side of the sub-expression: columns, major_axis, ts.

The right-hand side of the sub-expression (after a comparison operator) can be:

functions that will be evaluated, e.g. Timestamp('2012-02-01')
strings, e.g. "bar"
date-like values, e.g. 20130101, or "20130101"
lists, e.g. "['A', 'B']"
variables that are defined in the local namespace, e.g. date
Note: Passing a string to a query by interpolating it into the query expression is not recommended. Simply assign the string of interest to a variable and use that variable in an expression. For example, with string = "HolyMoly'", do this:

store.select("df", "index == string")

instead of this:

store.select("df", f"index == {string}")

The latter will not work and will raise a SyntaxError. Note that there's a single quote followed by a double quote in the string variable.

If you must interpolate, use the '%r' format specifier:

store.select("df", "index == %r" % string)

which will quote string.
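The difference between the two forms can be checked without a store. The sketch below uses Python's own `ast.parse` as a stand-in for the query parser; that substitution is an assumption made for illustration, but it shows why the raw value leaves an unbalanced quote and why `%r` fixes it.

```python
import ast

string = "HolyMoly'"  # the value itself ends with a single quote

# Interpolating the raw value leaves an unbalanced quote in the expression.
interpolated = f"index == {string}"
try:
    ast.parse(interpolated, mode="eval")
    interpolated_ok = True
except SyntaxError:
    interpolated_ok = False

# %r inserts repr(string), which is a properly quoted string literal.
quoted = "index == %r" % string
ast.parse(quoted, mode="eval")  # parses without raising

print(interpolated, interpolated_ok)
print(quoted)
```

Referring to the variable by name inside the expression avoids the problem altogether, because the value never becomes part of the expression text.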
Here are some examples, where dfq is a frame stored in table format with data_columns=True.

Use boolean expressions, with in-line function evaluation:

store.select("dfq", "index>pd.Timestamp('20130104') & columns=['A', 'B']")

Use inline column reference:

store.select("dfq", where="A>0 or C>0")
The columns keyword can be supplied to select a list of columns to be returned; this is equivalent to passing 'columns=list_of_columns_to_filter':

store.select("df", "columns=['A', 'B']")

The start and stop parameters can be specified to limit the total search space. These are in terms of the total number of rows in a table.

Note:
select will raise a ValueError if the query expression has an unknown variable reference. Usually this means that you are trying to select on a column that is not a data_column.

select will raise a SyntaxError if the query expression is not valid.

Query timedelta64[ns]

You can store and query using the timedelta64[ns] type.
Terms can be specified in the format <float>(<unit>), where float may be signed (and fractional), and unit can be D, s, ms, us, ns for the timedelta. For example, the query "C<'-3D'" matches rows whose timedelta column C is less than minus three days.
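To make the term format concrete, here is a minimal sketch of a parser for it. The name `parse_term` and the regular expression are hypothetical and written for this illustration; the sketch assumes the unit directly follows the number (as in '-3D') and says nothing about how pandas parses terms internally.

```python
import re

# Hypothetical parser for illustration only.
# Signed, optionally fractional number followed by one of the listed units.
TERM = re.compile(r"^([+-]?\d+(?:\.\d+)?)(D|s|ms|us|ns)$")


def parse_term(term: str):
    """Split a timedelta term such as '-3D' into (value, unit)."""
    match = TERM.match(term)
    if match is None:
        raise ValueError(f"not a <float><unit> term: {term!r}")
    return float(match.group(1)), match.group(2)


print(parse_term("-3D"))
print(parse_term("1.5ms"))
```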
Query MultiIndex

Selecting from a MultiIndex can be achieved by using the name of the level.

If the MultiIndex level names are None, the levels are automatically made available via the level_n keyword, with n the level of the MultiIndex you want to select from.
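The level_n naming convention can be sketched in plain Python. Both helpers below are hypothetical and exist only to show how a query string for an unnamed two-level index would be spelled; the values foo and two are made-up labels.

```python
# Hypothetical helpers for illustration only; not part of pandas.


def default_level_names(n_levels: int):
    """Keywords assumed for unnamed levels: level_0, level_1, ..."""
    return [f"level_{n}" for n in range(n_levels)]


def level_query(values) -> str:
    """Build a query that matches one value per unnamed level."""
    names = default_level_names(len(values))
    return " and ".join(f"{name}={value}" for name, value in zip(names, values))


print(default_level_names(2))
print(level_query(["foo", "two"]))
```

The resulting string is the kind of expression that would be passed as the where argument of select.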
Indexing

You can create/modify an index for a table with create_table_index after data is already in the table (after an append/put operation). Creating a table index is highly encouraged. This will speed your queries a great deal when you use a select with the indexed dimension as the where.

Note: Indexes are automagically created on the indexables and any data columns you specify. This behavior can be turned off by passing index=False to append.
Oftentimes when appending large amounts of data to a store, it is useful to turn off index creation for each append, then recreate it at the end.

Then create the index when finished appending.
See the pandas documentation for how to create a completely-sorted-index (CSI) on an existing store.

Query via data columns

You can designate (and index) certain columns that you want to be able to perform queries on (other than the indexable columns, which you can always query). For instance, say you want to perform this common operation on-disk and return just the frame that matches the query. You can specify data_columns=True to force all columns to be data_columns.
There is some performance degradation from making lots of columns into data columns, so it is up to the user to designate these. In addition, you cannot change data columns (nor indexables) after the first append/put operation. (Of course, you can simply read in the data and create a new table.)

Iterator

You can pass iterator=True or chunksize=number_in_a_chunk to select and select_as_multiple to return an iterator on the results. The default is 50,000 rows returned in a chunk.
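As a minimal sketch of chunked iteration, assuming the optional PyTables dependency (the tables package) is installed. The file name, key, and frame are made up for illustration:

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical demo frame with 10 rows.
df = pd.DataFrame({"A": np.arange(10), "B": np.arange(10) * 2.0})
path = os.path.join(tempfile.mkdtemp(), "iter_demo.h5")

with pd.HDFStore(path, mode="w") as store:
    # append() writes the queryable "table" format.
    store.append("df", df)
    # chunksize makes select() return an iterator of DataFrames.
    chunk_sizes = [len(chunk) for chunk in store.select("df", chunksize=4)]

print(chunk_sizes)
```

Each item produced by the iterator is an ordinary DataFrame, so a large table can be processed piece by piece without loading it whole.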
Note: You can also use the iterator with read_hdf, which will open the store and then automatically close it when finished iterating.
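A short sketch of the read_hdf variant, again assuming PyTables is installed. The path and key are hypothetical:

```python
import os
import tempfile

import pandas as pd

# Hypothetical five-row frame written in table format.
df = pd.DataFrame({"A": [1, 2, 3, 4, 5]})
path = os.path.join(tempfile.mkdtemp(), "read_hdf_demo.h5")

with pd.HDFStore(path, mode="w") as store:
    store.append("df", df)

# read_hdf opens the file itself; with chunksize it returns an iterator.
rows_seen = 0
for chunk in pd.read_hdf(path, "df", chunksize=2):
    rows_seen += len(chunk)

print(rows_seen)
```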
Note that the chunksize keyword applies to the source rows. So if you are doing a query, the chunksize will subdivide the total rows in the table and the query will be applied to each subdivision, returning an iterator on potentially unequal-sized chunks. To get equal-sized return chunks, generate the query result first (as coordinates) and then select it in fixed-size slices.
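The difference can be illustrated with a plain-Python model. This is only a sketch of the behaviour, not HDFStore itself: the row list, the matches predicate, and the chunks helper are stand-ins invented for the example.

```python
def chunks(seq, n):
    """Yield consecutive slices of seq holding at most n items each."""
    for i in range(0, len(seq), n):
        yield seq[i:i + n]


source_rows = list(range(1, 11))  # stand-in for the rows of a table


def matches(row):
    # Stand-in for the query.
    return row > 3


# chunksize applied to the source rows: every chunk is filtered afterwards,
# so the returned pieces can differ in size.
unequal = [[r for r in chunk if matches(r)] for chunk in chunks(source_rows, 4)]

# Recipe: resolve the query first, then slice the matching rows.
matching = [r for r in source_rows if matches(r)]
equal = list(chunks(matching, 4))

print([len(c) for c in unequal])
print([len(c) for c in equal])
```

With a real store, the matching rows would come from select_as_coordinates, and each slice would be passed to select through the where argument.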
Advanced queries

Select a single column

To retrieve a single indexable or data column, use the method select_column. This will, for example, enable you to get the index very quickly. It returns a Series of the result, indexed by the row number. It does not currently accept the where selector.
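A minimal sketch of select_column, assuming PyTables is installed. The key, path, and column names are illustrative only:

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.arange(10), "B": np.arange(10) * 2.0})
path = os.path.join(tempfile.mkdtemp(), "select_column_demo.h5")

with pd.HDFStore(path, mode="w") as store:
    # Only "A" is declared a data column, so only "A" and the index
    # can be fetched individually.
    store.append("df", df, data_columns=["A"])
    index_values = store.select_column("df", "index")
    col_a = store.select_column("df", "A")

print(type(col_a).__name__, len(index_values))
```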
Selecting coordinates

Sometimes you want to get the coordinates (a.k.a. the index locations) of your query. This returns an index of the resulting locations. These coordinates can also be passed to subsequent where operations.
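A sketch of that round trip using select_as_coordinates, assuming PyTables is installed. The data and the query on column A are made up for the example:

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.arange(10), "B": np.arange(10) * 2.0})
path = os.path.join(tempfile.mkdtemp(), "coords_demo.h5")

with pd.HDFStore(path, mode="w") as store:
    # "A" must be a data column to be usable in a query.
    store.append("df", df, data_columns=["A"])
    # Row locations that satisfy the query.
    coords = store.select_as_coordinates("df", "A > 5")
    # The locations can be fed back in as the where argument.
    subset = store.select("df", where=coords)

print(list(coords))
```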
Selecting using a where mask

Sometimes your query can involve creating a list of rows to select. Usually this mask would be a resulting index from an indexing operation. For example, you can select the rows of a datetime index whose month is 5.
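The month-5 selection could be sketched as follows, assuming PyTables is installed. The key df_mask and the one-year daily index are illustrative choices:

```python
import os
import tempfile

import numpy as np
import pandas as pd

# One value per day of 2020 (a leap year, hence 366 rows).
idx = pd.date_range("2020-01-01", periods=366, freq="D")
df_mask = pd.DataFrame({"value": np.arange(366)}, index=idx)
path = os.path.join(tempfile.mkdtemp(), "mask_demo.h5")

with pd.HDFStore(path, mode="w") as store:
    store.append("df_mask", df_mask)
    # Fetch only the index column, then build the list of row numbers in May.
    c = store.select_column("df_mask", "index")
    where = c[pd.DatetimeIndex(c).month == 5].index
    may_rows = store.select("df_mask", where=where)

print(len(may_rows))
```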
Storer object

If you want to inspect the stored object, retrieve it via get_storer. You could use this programmatically to, say, get the number of rows in an object.
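A brief sketch of reading the row count through get_storer, assuming PyTables is installed and the object was written in table format. The names are hypothetical:

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.arange(10)})
path = os.path.join(tempfile.mkdtemp(), "storer_demo.h5")

with pd.HDFStore(path, mode="w") as store:
    store.append("df", df)
    # The storer exposes metadata without reading the data itself.
    nrows = store.get_storer("df").nrows

print(nrows)
```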
Multiple table queries

The methods append_to_multiple and select_as_multiple can perform appending/selecting from multiple tables at once. The idea is to have one table (call it the selector table) in which you index most or all of the columns and against which you perform your queries. The other table(s) are data tables with an index matching the selector table's index. You can then perform a very fast query on the selector table, yet get lots of data back. This method is similar to having a very wide table, but enables more efficient queries.

The append_to_multiple method splits a given single DataFrame into multiple tables according to d, a dictionary that maps the table names to a list of 'columns' you want in that table. If None is used in place of a list, that table will have the remaining unspecified columns of the given DataFrame. The argument selector defines which table is the selector table (which you can make queries from). The argument dropna will drop rows from the input DataFrame to ensure tables are synchronized. This means that if a row for one of the tables being written to is entirely NaN, that row will be dropped from all tables.

If dropna is False, THE USER IS RESPONSIBLE FOR SYNCHRONIZING THE TABLES. Remember that entirely NaN rows are not written to the HDFStore, so if you choose to pass dropna=False, some tables may have more rows than others, and therefore select_as_multiple may not work or it may return unexpected results.
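A small sketch of the selector-table pattern, assuming PyTables is installed. The table names df1_mt and df2_mt and the four-column frame are invented for the example:

```python
import os
import tempfile

import pandas as pd

df_mt = pd.DataFrame(
    {
        "A": [1.0, -1.0, 2.0, -2.0],
        "B": [1.0, 1.0, -1.0, 1.0],
        "C": [10.0, 20.0, 30.0, 40.0],
        "D": ["w", "x", "y", "z"],
    }
)
path = os.path.join(tempfile.mkdtemp(), "multi_demo.h5")

with pd.HDFStore(path, mode="w") as store:
    # "df1_mt" holds A and B and acts as the selector table;
    # None sends all remaining columns (C and D) to "df2_mt".
    store.append_to_multiple(
        {"df1_mt": ["A", "B"], "df2_mt": None}, df_mt, selector="df1_mt"
    )
    # Query the narrow selector table, get columns from both tables back.
    result = store.select_as_multiple(
        ["df1_mt", "df2_mt"], where="A > 0", selector="df1_mt"
    )

print(result)
```

The query only touches the selector table, while the returned frame still carries the columns stored in the data table.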
Delete from a table

You can delete from a table selectively by specifying a where. In deleting rows, it is important to understand that PyTables deletes rows by erasing them and then moving the following data. Thus deleting can potentially be a very expensive operation, depending on the orientation of your data. To get optimal performance, it's worthwhile to have the dimension you are deleting be the first of the indexables.

Data is ordered (on the disk) in terms of the indexables. Here's a simple use case. You store panel-type data, with dates in the major_axis and ids in the minor_axis. The data is then interleaved so that all ids for the first date are stored together, followed by all ids for the second date, and so on.

It should be clear that a delete operation on the major_axis will be fairly quick, as one chunk is removed and then the following data is moved. On the other hand, a delete operation on the minor_axis will be very expensive. In this case it would almost certainly be faster to rewrite the table using a where that selects all but the missing data.

Warning: Please note that HDF5 DOES NOT RECLAIM SPACE in the h5 files automatically. Thus, repeatedly deleting (or removing nodes) and adding again WILL TEND TO INCREASE THE FILE SIZE. To repack and clean the file, use ptrepack.

Notes & caveats

Compression
PyTables allows the stored data to be compressed. This applies to all kinds of stores, not just tables. Two parameters are used to control compression: complevel and complib.

complevel specifies if and how hard data is to be compressed. complevel=0 and complevel=None disable compression, and 0<complevel<10 enables compression.

complib specifies which compression library to use. If nothing is specified, the default library zlib is used. A compression library usually optimizes for either good compression rates or speed, and the results will depend on the type of data. Which type of compression to choose depends on your specific needs and data. The list of supported compression libraries:

zlib: The default compression library. A classic in terms of compression; achieves good compression rates but is somewhat slow.
lzo: Fast compression and decompression.
bzip2: Good compression rates.
blosc: Fast compression and decompression.

Support for alternative blosc compressors:

blosc:blosclz: This is the default compressor for blosc.
blosc:lz4: A compact, very popular and fast compressor.
blosc:lz4hc: A tweaked version of LZ4 that produces better compression ratios at the expense of speed.
blosc:snappy: A popular compressor used in many places.
blosc:zlib: A classic; somewhat slower than the previous ones, but achieving better compression ratios.
blosc:zstd: An extremely well balanced codec; it provides the best compression ratios among the others above, and at reasonably fast speed.
If complib is defined as something other than the listed libraries, a ValueError exception is issued.
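The speed-versus-ratio trade-off among compression libraries can be explored outside PyTables with Python's standard zlib and bz2 modules. This is only a rough sketch on a made-up, highly repetitive payload. Results on real data will differ, and the numbers say nothing about how PyTables itself performs:

```python
import bz2
import zlib

# Hypothetical CSV-like payload with a lot of repetition.
payload = b"timestamp,value\n" + b"2020-01-01,0.0\n" * 5000

zlib_fast = zlib.compress(payload, 1)   # favour speed
zlib_small = zlib.compress(payload, 9)  # favour compression ratio
bzip2_small = bz2.compress(payload, 9)

for name, blob in [
    ("zlib level 1", zlib_fast),
    ("zlib level 9", zlib_small),
    ("bzip2 level 9", bzip2_small),
]:
    print(name, len(blob), "bytes from", len(payload))
```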
Note: If the library specified with the complib option is missing on your platform, compression defaults to zlib without further ado.

Enable compression for all objects within the file by passing complevel and complib when opening the store.
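A minimal sketch of store-wide compression, assuming PyTables is installed. The path and key are hypothetical, and zlib is chosen here only because it is the default library:

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.arange(10), "B": np.arange(10) * 2.0})
path = os.path.join(tempfile.mkdtemp(), "store_compressed.h5")

# complevel and complib given here apply to every object written to the file.
with pd.HDFStore(path, mode="w", complevel=9, complib="zlib") as store:
    store.put("df", df)
    restored = store["df"]

print(restored["A"].tolist())
```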
Or use on-the-fly compression (this only applies to tables) in stores where compression is not enabled, by passing complib and complevel to append.
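A sketch of per-table compression in an otherwise uncompressed store, assuming PyTables is installed. The names and the level 5 setting are illustrative:

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.arange(10)})
path = os.path.join(tempfile.mkdtemp(), "on_the_fly.h5")

# The store is opened without any compression settings.
with pd.HDFStore(path, mode="w") as store:
    # Compression is requested for this one table only.
    store.append("df", df, complib="zlib", complevel=5)
    nrows = store.get_storer("df").nrows
    restored = store.select("df")

print(nrows)
```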
ptrepack

PyTables offers better write performance when tables are compressed after they are written, as opposed to turning on compression at the very beginning. You can use the supplied PyTables utility ptrepack. In addition, ptrepack can change compression levels after the fact.
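ptrepack itself is a command-line tool. As a Python-side stand-in (not ptrepack, and not necessarily equivalent in every detail), the sketch below writes a table uncompressed and then uses the `HDFStore.copy` method to produce a compressed copy afterwards. The file names and data are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical source and destination files.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "in.h5")
dst = os.path.join(workdir, "out.h5")
df = pd.DataFrame({"x": np.arange(100.0)})

with pd.HDFStore(src, mode="w") as store:
    store.append("df", df)  # written without compression
    # Copy every key into a new file, compressing after the fact.
    compressed = store.copy(dst, complib="zlib", complevel=9)
    compressed.close()

recompressed = pd.read_hdf(dst, "df")
```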
Furthermore, ptrepack in.h5 out.h5 will repack the file to allow you to reuse previously deleted space. Alternatively, one can simply remove the file and write again, or use the copy method.

Caveats

Warning
HDFStore is not thread-safe for writing. The underlying PyTables only supports concurrent reads (via threading or processes). If you need reading and writing at the same time, you need to serialize these operations in a single thread in a single process. You will corrupt your data otherwise. See (GH2397) for more information.

If you use locks to manage write access between multiple processes, you may want to use fsync() before releasing write locks. For convenience you can use store.flush(fsync=True) to do this for you.

Once a table is created, its columns (DataFrame) are fixed; only exactly the same columns can be appended.

Be aware that timezones (e.g., pytz.timezone('US/Eastern')) are not necessarily equal across timezone versions. So if data is localized to a specific timezone in the HDFStore using one version of a timezone library and that data is updated with another version, the data will be converted to UTC since these timezones are not considered equal. Either use the same version of the timezone library or use tz_convert with the updated timezone definition.
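One way to follow the advice about serializing writes is to let a single dedicated thread own the store while other threads only hand data to it. The sketch below does this with a queue. The queue, the sentinel, the helper names, and the file path are all hypothetical, and this is only one possible arrangement.

```python
import os
import queue
import tempfile
import threading

import numpy as np
import pandas as pd

# Hypothetical path; the queue carries DataFrames to the single writer.
path = os.path.join(tempfile.mkdtemp(), "serialized.h5")
work = queue.Queue()
STOP = None  # sentinel telling the writer to finish


def writer():
    # The only thread that ever touches the store.
    with pd.HDFStore(path, mode="w") as store:
        while True:
            chunk = work.get()
            if chunk is STOP:
                break
            store.append("df", chunk)
        # Flush to disk before the store (or any write lock) is released.
        store.flush(fsync=True)


def producer(start):
    # Producers never write; they only enqueue data.
    work.put(pd.DataFrame({"x": np.arange(start, start + 5, dtype="int64")}))


writer_thread = threading.Thread(target=writer)
writer_thread.start()
producers = [threading.Thread(target=producer, args=(i * 5,)) for i in range(4)]
for t in producers:
    t.start()
for t in producers:
    t.join()
work.put(STOP)
writer_thread.join()

result = pd.read_hdf(path, "df")
```

The order in which chunks arrive is not deterministic, so only the set of stored values should be relied upon.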
Warning

PyTables will show a NaturalNameWarning if a column name cannot be used as an attribute selector. Natural identifiers contain only letters, numbers, and underscores, and may not begin with a number. Other identifiers cannot be used in a where clause and are generally a bad idea.

Data types
HDFStore will map an object dtype to the PyTables underlying dtype. This means the following types are known to work:

Type                                                  Represents missing values
floating: float64, float32, float16                   np.nan
integer: int64, int32, int8, uint64, uint32, uint8
boolean
datetime64[ns]                                        NaT
timedelta64[ns]                                       NaT
categorical: see the section below
object: strings                                       np.nan
unicode columns are not supported, and WILL FAIL.

Categorical data

You can write data that contains category dtypes to a HDFStore. Queries work the same as if it was an object array. However, the category dtyped data is stored in a more efficient manner.
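A small sketch of storing and querying categorical data, assuming a table-format store with the categorical column declared as a data column so that it can be used in a where clause. The frame, key, and path are invented for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical path and data.
path = os.path.join(tempfile.mkdtemp(), "cats.h5")
dfcat = pd.DataFrame(
    {"A": pd.Series(list("aabbcdba")).astype("category"), "B": np.arange(8.0)}
)

with pd.HDFStore(path, mode="w") as store:
    # Table format; "A" is a data column so it can appear in a query.
    store.append("dfcat", dfcat, format="table", data_columns=["A"])
    # The query is written as if "A" were a plain object column.
    result = store.select("dfcat", where="A in ['b', 'c']")
```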
String columns

min_itemsize

The underlying implementation of HDFStore uses a fixed column width (itemsize) for string columns. A string column's itemsize is calculated as the maximum length of the data (for that column) that is passed to the HDFStore in the first append. If a subsequent append introduces a string larger than the column can hold, an Exception will be raised (otherwise these columns could be silently truncated, leading to loss of information). In the future we may relax this and allow a user-specified truncation to occur.

Pass min_itemsize on the first table creation to specify a priori the minimum length of a particular string column. min_itemsize can be an integer, or a dict mapping a column name to an integer. You can pass values as a key to allow all indexables or data_columns to have this min_itemsize.

Passing a min_itemsize dict will cause all passed columns to be created as data_columns automatically.

Note

If you are not passing any data_columns, then the min_itemsize will be the maximum of the length of any string passed.
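The effect of min_itemsize can be sketched as follows. The frames, keys, and the width of 30 are assumptions chosen for illustration. The first table reserves room for longer strings up front; the second relies on the width inferred from the first append and is expected to reject a longer string.

```python
import os
import tempfile

import pandas as pd

# Hypothetical path and data.
path = os.path.join(tempfile.mkdtemp(), "strings.h5")
dfs = pd.DataFrame({"A": "foo", "B": "bar"}, index=range(5))
longer = pd.DataFrame({"A": ["a considerably longer string"], "B": ["bar"]}, index=[5])

with pd.HDFStore(path, mode="w") as store:
    # Reserve 30 characters per string column on the first append.
    store.append("dfs_wide", dfs, min_itemsize=30)
    store.append("dfs_wide", longer)  # fits within the reserved width
    nrows = store.get_storer("dfs_wide").nrows

    # Without min_itemsize the width is fixed by the first append (3 here).
    store.append("dfs_default", dfs)
    try:
        store.append("dfs_default", longer)
        rejected = False
    except ValueError:
        rejected = True
```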
nan_rep

String columns will serialize a np.nan (a missing value) with the nan_rep string representation. This defaults to the string value nan. You could inadvertently turn an actual nan value into a missing value.
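To illustrate the pitfall, the sketch below stores a column that contains the literal string "nan" twice: once with the default representation and once with a custom nan_rep. The marker `"_nan_"`, the keys, and the path are assumptions for illustration; any string that cannot occur in the data would serve.

```python
import os
import tempfile

import pandas as pd

# Hypothetical path and data; the third value is the literal string "nan".
path = os.path.join(tempfile.mkdtemp(), "nanrep.h5")
dfss = pd.DataFrame({"A": ["foo", "bar", "nan"]})

with pd.HDFStore(path, mode="w") as store:
    # Default nan_rep: the literal "nan" collides with the missing-value marker.
    store.append("dfss_default", dfss)
    default_missing = int(store.select("dfss_default")["A"].isna().sum())

    # A distinct marker keeps the literal string intact.
    store.append("dfss_custom", dfss, nan_rep="_nan_")
    custom_values = store.select("dfss_custom")["A"].tolist()
```

Comparing `default_missing` with the original data shows whether a genuine value was read back as missing.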
External compatibility

HDFStore writes table format objects in specific formats suitable for producing loss-less round trips to pandas objects. For external compatibility, HDFStore can read native PyTables format tables.

It is possible to write an HDFStore object that can easily be imported into R using the rhdf5 library (package website). Create a table format store like this:
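A minimal sketch of such a table-format store, assuming every column is declared as a data column so that each one is written as its own field. The file name, key, column names, and random data are hypothetical.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical path and data.
path = os.path.join(tempfile.mkdtemp(), "export.h5")
rng = np.random.default_rng(0)
df_for_r = pd.DataFrame(
    {
        "first": rng.random(100),
        "second": rng.random(100),
        "label": rng.integers(0, 2, 100),
    }
)

with pd.HDFStore(path, mode="w") as store_export:
    # Table format, with every column stored as an individual data column.
    store_export.append("df_for_r", df_for_r, data_columns=list(df_for_r.columns))

back = pd.read_hdf(path, "df_for_r")
```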
In R, this file can be read into a data.frame object using the rhdf5 library. The following example function reads the corresponding column names and data values from the values and assembles them into a data.frame:
Now you can import the DataFrame into R:
Note

The R function lists the entire contents of the HDF5 file and assembles the data.frame object from all matching nodes, so use this only as a starting point if you have stored multiple DataFrame objects in a single HDF5 file.

The table format comes with a writing performance penalty as compared to fixed stores. The benefit is the ability to append/delete and query (potentially very large amounts of data). Write times are generally longer as compared with regular stores. Query times can be quite fast, especially on an indexed axis.

You can pass chunksize=<int> to append, specifying the write chunksize (default is 50000). This will significantly lower your memory usage on writing.

You can pass expectedrows=<int> to the first append, to set the TOTAL number of rows that PyTables will expect. This will optimize read/write performance.

Duplicate rows can be written to tables, but are filtered out in selection (with the last items being selected; thus a table is unique on major, minor pairs).

A
In [14]: data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11"
In [15]: print(data)
a,b,c,d
1,2,3,4
5,6,7,8
9,10,11
In [16]: df = pd.read_csv(StringIO(data), dtype=object)
In [17]: df
Out[17]:
a b c d
0 1 2 3 4
1 5 6 7 8
2 9 10 11 NaN
In [18]: df["a"][0]
Out[18]: '1'
In [19]: df = pd.read_csv(StringIO(data), dtype={"b": object, "c": np.float64, "d": "Int64"})
In [20]: df.dtypes
Out[20]:
a int64
b object
c float64
d Int64
dtype: object
7434 sẽ được nâng lên nếu bạn đang cố lưu trữ các loại sẽ được PyTables chọn (chứ không phải được lưu trữ dưới dạng các loại đặc hữu). Xem tại đây để biết thêm thông tin và một số giải pháp
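The chunksize and expectedrows hints described above can be sketched roughly as follows. This is only an illustration: the file name, the key "df", the column names, and the batch sizes are made up, and it assumes PyTables is installed.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical data: 1,000 rows written in two batches.
df = pd.DataFrame({"x": np.arange(1000), "y": np.arange(1000) * 0.5})
first, second = df.iloc[:600], df.iloc[600:]

path = os.path.join(tempfile.mkdtemp(), "store.h5")  # illustrative path
with pd.HDFStore(path, mode="w") as store:
    # expectedrows on the first append tells PyTables the planned total size.
    store.append("df", first, expectedrows=len(df), chunksize=200)
    # Later appends extend the same table; chunksize caps rows per write.
    store.append("df", second, chunksize=200)
    # Table-format stores can be queried on the indexed axis.
    subset = store.select("df", where="index >= 990")
    nrows = store.get_storer("df").nrows

print(nrows, len(subset))
```

A smaller chunksize trades some write speed for lower peak memory, so the right value depends on the row width of the data being stored.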
Feather

Feather provides binary columnar serialization for data frames. It is designed to make reading and writing data frames efficient, and to make sharing data across data analysis languages easy.

Feather is designed to faithfully serialize and deserialize DataFrames, supporting all of the pandas dtypes, including extension dtypes such as categorical and datetime with tz.

Several caveats:

The format will NOT write an Index or MultiIndex for the DataFrame and will raise an error if a non-default one is provided. You can call .reset_index() to store the index or .reset_index(drop=True) to ignore it.

Duplicate column names and non-string column names are not supported.

Actual Python objects in object dtype columns are not supported. These will raise a helpful error message on an attempt at serialization.
See the full documentation.

Write to a feather file.

Read from a feather file.
Parquet

Apache Parquet provides a partitioned binary columnar serialization for data frames. It is designed to make reading and writing data frames efficient, and to make sharing data across data analysis languages easy. Parquet can use a variety of compression techniques to shrink the file size as much as possible while still maintaining good read performance.

Parquet is designed to faithfully serialize and deserialize DataFrames, supporting all of the pandas dtypes, including extension dtypes such as datetime with tz.

Several caveats:

Duplicate column names and non-string column names are not supported.

The pyarrow engine always writes the index to the output, but fastparquet only writes non-default indexes. This extra column can cause problems for non-pandas consumers that are not expecting it. You can force including or omitting indexes with the index argument, regardless of the underlying engine.

Index level names, if specified, must be strings.

In the pyarrow engine, categorical dtypes for non-string types can be serialized to parquet, but will de-serialize as their primitive dtype.

The pyarrow engine preserves the ordered flag of categorical dtypes with string types. fastparquet does not preserve the ordered flag.

Unsupported types include Interval and actual Python object types. These will raise a helpful error message on an attempt at serialization. The Period type is supported with pyarrow >= 0.16.0.

The pyarrow engine preserves extension data types such as the nullable integer and string data types (this requires pyarrow >= 0.16.0 and requires the extension type to implement the needed protocols; see the extension types documentation).
You can specify an engine to direct the serialization. This can be one of pyarrow, fastparquet, or auto. If the engine is NOT specified, then the pd.options.io.parquet.engine option is checked. See the documentation for pyarrow and fastparquet.

Note

These engines are very similar and should read/write nearly identical parquet format files. pyarrow supports timedelta data, and fastparquet supports timezone aware datetimes. These libraries differ by having different underlying dependencies (fastparquet uses numba, while pyarrow uses a C library).
Write to a parquet file.

Read from a parquet file.

Read only certain columns of a parquet file.
Handling indexes

Serializing a DataFrame to parquet may include the implicit index as one or more columns in the output file. Thus, writing a DataFrame with two columns, a and b, creates a parquet file with three columns if you use pyarrow for serialization: a, b, and __index_level_0__. If you're using fastparquet, the index may or may not be written to the file.

This unexpected extra column causes some databases like Amazon Redshift to reject the file, because that column doesn't exist in the target table.

If you want to omit a dataframe's indexes when writing, pass index=False to to_parquet().
Passing index=False creates a parquet file with just the two expected columns, a and b. If your DataFrame has a custom index, you won't get it back when you load this file into a DataFrame.

Passing index=True will always write the index, even if that's not the underlying engine's default behavior.

Partitioning Parquet files

Parquet supports partitioning of data based on the values of one or more columns.

The path specifies the parent directory to which data will be saved. The partition_cols are the column names by which the dataset will be partitioned. Columns are partitioned in the order they are given. The partition splits are determined by the unique values in the partition columns, so the resulting dataset is a directory tree with one subdirectory per unique value of each partition column.
ORC

New in version 1.0.0.

Similar to the parquet format, the ORC format is a binary columnar serialization for data frames. It is designed to make reading data frames efficient. pandas provides both the reader and the writer for the ORC format, read_orc() and to_orc(). This requires the pyarrow library.

Warning

It is highly recommended to install pyarrow using conda due to some issues caused by pyarrow.

to_orc() requires pyarrow>=7.0.0.

read_orc() and to_orc() are not supported on Windows yet; you can find valid environments on install optional dependencies.

For supported dtypes, please refer to supported ORC features in Arrow.

Currently, timezones in datetime columns are not preserved when a dataframe is converted into ORC files.
Write to an orc file.

Read from an orc file.

Read only certain columns of an orc file.
SQL queries

The pandas.io.sql module provides a collection of query wrappers to both facilitate data retrieval and reduce dependency on DB-specific APIs. Database abstraction is provided by SQLAlchemy if it is installed. In addition, you will need a driver library for your database. Examples of such drivers are psycopg2 for PostgreSQL and pymysql for MySQL. For SQLite, the driver is included in Python's standard library by default. An overview of the supported drivers for each SQL dialect is available in the SQLAlchemy documentation.

If SQLAlchemy is not installed, a fallback is provided only for sqlite (and for mysql for backwards compatibility, but this is deprecated and will be removed in a future version). This mode requires a Python database adapter that respects the Python DB-API.

See also some cookbook examples for some advanced strategies.

The key functions are:
read_sql_table(table_name, con[, schema, ...]): Read a SQL database table into a DataFrame.
read_sql_query(sql, con[, index_col, ...]): Read a SQL query into a DataFrame.
read_sql(sql, con[, index_col, ...]): Read a SQL query or database table into a DataFrame.
DataFrame.to_sql(name, con[, schema, ...]): Write records stored in a DataFrame to a SQL database.

Note: The function read_sql() is a convenience wrapper around read_sql_table() and read_sql_query() (and for backward compatibility) and will delegate to the specific function depending on the provided input (database table name or SQL query). Table names do not need to be quoted if they have special characters.

The examples in this section use the SQLite SQL database engine. You can use a temporary SQLite database where data are stored in memory.

To connect with SQLAlchemy, you use the create_engine() function to create an engine object from a database URI. You only need to create the engine once per database you are connecting to. For more information on create_engine() and the URI formatting, see the examples below and the SQLAlchemy documentation.
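As a minimal sketch of the connection step, assuming SQLAlchemy is installed, an engine for a temporary in-memory SQLite database might be created like this (the variable name engine is only a convention):

```python
from sqlalchemy import create_engine

# "sqlite:///:memory:" is the SQLAlchemy URI for a temporary in-memory SQLite database.
# The engine is created once and then reused for every read and write.
engine = create_engine("sqlite:///:memory:")
```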
If you want to manage your own connections, you can pass one of those instead of an engine. A connection opened with a Python context manager is closed automatically after the block has completed. See the SQLAlchemy documentation for an explanation of how the database connection is handled.
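A self-managed connection could look roughly like the sketch below. It assumes SQLAlchemy is installed; the table name demo and the sample frame are invented for illustration, and the query is wrapped in text() so that it is accepted across SQLAlchemy versions.

```python
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine("sqlite:///:memory:")
frame = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})  # made-up data

# The context manager closes the connection when the block exits.
with engine.connect() as conn:
    frame.to_sql("demo", conn, index=False)
    result = pd.read_sql_query(text("SELECT * FROM demo"), conn)
```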
Warning: When you open a connection to a database, you are also responsible for closing it. Side effects of leaving a connection open may include locking the database or other breaking behaviour.

Writing DataFrames

Assuming the following data is in a DataFrame named data, we can insert it into the database using to_sql().

id   Date         Col_1   Col_2   Col_3
26   2012-10-18   X       25.7    True
42   2012-10-19   Y       -12.4   False
63   2012-10-20   Z       5.73    True
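The insertion can be sketched as below. To keep the sketch dependency-free it uses a plain sqlite3 connection (the fallback mode mentioned earlier) rather than a SQLAlchemy engine; the table name data is an assumption.

```python
import sqlite3

import pandas as pd

# The sample rows listed in the text above.
data = pd.DataFrame(
    {
        "id": [26, 42, 63],
        "Date": pd.to_datetime(["2012-10-18", "2012-10-19", "2012-10-20"]),
        "Col_1": ["X", "Y", "Z"],
        "Col_2": [25.7, -12.4, 5.73],
        "Col_3": [True, False, True],
    }
)

con = sqlite3.connect(":memory:")
data.to_sql("data", con, index=False)  # creates the table and inserts the rows
n_rows = con.execute("SELECT COUNT(*) FROM data").fetchone()[0]
con.close()
```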
With some databases, writing large DataFrames can result in errors because packet size limits are exceeded. This can be avoided by setting the chunksize parameter when calling to_sql(). For example, passing chunksize=1000 writes data to the database in batches of 1000 rows at a time.
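A batched write might look like the following sketch. The frame big and the table name are invented, and an in-memory sqlite3 connection stands in for a real database server.

```python
import sqlite3

import pandas as pd

big = pd.DataFrame({"n": range(2500)})  # made-up data, larger than one batch

con = sqlite3.connect(":memory:")
# Rows are sent to the database 1000 at a time instead of all at once.
big.to_sql("big", con, index=False, chunksize=1000)
written = con.execute("SELECT COUNT(*) FROM big").fetchone()[0]
con.close()
```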
SQL data types

to_sql() will try to map your data to an appropriate SQL data type based on the dtype of the data. When you have columns of dtype object, pandas will try to infer the data type.

You can always override the default type by specifying the desired SQL type of any of the columns using the dtype argument. This argument needs a dictionary mapping column names to SQLAlchemy types (or strings for the sqlite3 fallback mode). For example, you can specify the SQLAlchemy String type instead of the default Text type for string columns.
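The sketch below overrides a column type in the sqlite3 fallback mode, where the dtype values are plain strings; with a SQLAlchemy engine one would pass a SQLAlchemy type such as sqlalchemy.types.String instead. The table name and the VARCHAR(10) type are chosen only for illustration.

```python
import sqlite3

import pandas as pd

data = pd.DataFrame({"Col_1": ["X", "Y", "Z"], "Col_2": [25.7, -12.4, 5.73]})

con = sqlite3.connect(":memory:")
# In fallback mode the dtype mapping takes SQL type names as strings.
data.to_sql("data_dtype", con, index=False, dtype={"Col_1": "VARCHAR(10)"})

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
declared = {row[1]: row[2] for row in con.execute("PRAGMA table_info(data_dtype)")}
con.close()
```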
Note: Due to the limited support for timedelta in the different database flavors, columns with type timedelta64 will be written as integer values in nanoseconds to the database, and a warning will be raised.

Note: Columns of category dtype will be converted to the dense representation as you would get with np.asarray(categorical) (e.g. for string categories this gives an array of strings). Because of this, reading the database table back in does not generate a categorical.

Datetime data types

Using SQLAlchemy, to_sql() is capable of writing datetime data that is timezone naive or timezone aware. However, the resulting data stored in the database ultimately depends on the supported data type for datetime data of the database system being used.

The following table lists supported data types for datetime data for some common databases. Other database dialects may have different data types for datetime data.

Database     SQL Datetime Types                      Timezone Support
SQLite       TEXT                                    No
MySQL        TIMESTAMP or DATETIME                   No
PostgreSQL   TIMESTAMP or TIMESTAMP WITH TIME ZONE   Yes

When writing timezone aware data to databases that do not support timezones, the data will be written as timezone naive timestamps that are in local time with respect to the timezone.
read_sql_table() is also capable of reading datetime data that is timezone aware or naive. When reading TIMESTAMP WITH TIME ZONE types, pandas will convert the data to UTC.

Insertion method

The parameter method controls the SQL insertion clause used. Possible values are:
None: Uses the standard SQL INSERT clause (one per row).
'multi': Passes multiple values in a single INSERT clause. It uses a special SQL syntax not supported by all backends. This usually provides better performance for analytic databases like Presto and Redshift, but has worse performance for traditional SQL backends if the table contains many columns. For more information, check the SQLAlchemy documentation.
callable with signature (pd_table, conn, keys, data_iter): This can be used to implement a more performant insertion method based on specific backend dialect features.
One example of such a callable is an insertion method built on the PostgreSQL COPY clause.
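A callable of that shape might be sketched as follows. It assumes a SQLAlchemy connection backed by psycopg2, whose cursors provide copy_expert(); the function name psql_insert_copy is only a label, and the sketch has not been run against a live PostgreSQL server here.

```python
import csv
from io import StringIO


def psql_insert_copy(table, conn, keys, data_iter):
    """Insert rows with PostgreSQL COPY instead of row-by-row INSERT.

    table: the pandas table object (provides .name and .schema)
    conn: SQLAlchemy connection; .connection is the raw DBAPI connection
    keys: list of column names
    data_iter: iterable of row tuples to insert
    """
    dbapi_conn = conn.connection
    with dbapi_conn.cursor() as cur:
        # Serialise the rows to an in-memory CSV buffer.
        buf = StringIO()
        csv.writer(buf).writerows(data_iter)
        buf.seek(0)

        columns = ", ".join(f'"{k}"' for k in keys)
        table_name = f"{table.schema}.{table.name}" if table.schema else table.name
        sql = f"COPY {table_name} ({columns}) FROM STDIN WITH CSV"
        cur.copy_expert(sql=sql, file=buf)
```

With a PostgreSQL engine available, the callable would be passed as the method argument, for example df.to_sql("table_name", engine, method=psql_insert_copy), where the table name is a placeholder.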
Reading tables

read_sql_table() will read a database table given the table name and optionally a subset of columns to read.

Note: In order to use read_sql_table(), you must have the SQLAlchemy optional dependency installed.
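Reading a table back could be sketched like this, assuming SQLAlchemy is installed. The table name data and its contents are illustrative and are written first so that the sketch is self-contained.

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("sqlite:///:memory:")
pd.DataFrame(
    {"id": [26, 42, 63], "Col_1": ["X", "Y", "Z"], "Col_2": [25.7, -12.4, 5.73]}
).to_sql("data", engine, index=False)

# Read the whole table by name.
whole = pd.read_sql_table("data", engine)

# Read only a subset of the columns.
some = pd.read_sql_table("data", engine, columns=["Col_1", "Col_2"])
```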
Note: pandas infers column dtypes from query outputs, and not by looking up data types in the physical database schema. For example, assume userid is an integer column in a table. Then, intuitively, select userid ... will return an integer-valued series, while select cast(userid as text) ... will return an object-valued (str) series. Accordingly, if the query output is empty, then all resulting columns will be returned as object-valued (since they are most general). If you foresee that your query will sometimes generate an empty result, you may want to explicitly typecast afterwards to ensure dtype integrity.

You can also specify the name of the column as the DataFrame index, and specify a subset of columns to be read.
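The dtype inference and the index option can be illustrated with the sketch below. It uses read_sql_query() over a plain sqlite3 connection rather than read_sql_table(), so that SQLAlchemy is not needed; the users table and its rows are hypothetical.

```python
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (userid INTEGER, name TEXT)")  # hypothetical table
con.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob")])

# The dtype follows the query output, not the declared column type.
as_int = pd.read_sql_query("SELECT userid FROM users", con)
as_text = pd.read_sql_query("SELECT CAST(userid AS TEXT) AS userid FROM users", con)

# An empty result carries no values to infer from, so cast explicitly afterwards.
empty = pd.read_sql_query("SELECT userid FROM users WHERE userid < 0", con)
empty_typed = empty.astype({"userid": "int64"})

# Use one of the returned columns as the DataFrame index.
indexed = pd.read_sql_query("SELECT userid, name FROM users", con, index_col="userid")
con.close()
```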
You can also explicitly force columns to be parsed as dates using the parse_dates argument.
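A minimal sketch of forcing date parsing follows. It again uses read_sql_query() with sqlite3 to stay dependency-free (read_sql_table() accepts the same parse_dates argument); the events table and its values are invented.

```python
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (id INTEGER, Date TEXT)")  # hypothetical table
con.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(26, "2012-10-18"), (42, "2012-10-19")],
)

# Without parse_dates the Date column would come back as text.
parsed = pd.read_sql_query("SELECT * FROM events", con, parse_dates=["Date"])
con.close()
```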
If needed, you can explicitly specify a format string, or a dict of arguments to pass to pandas.to_datetime().
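The two forms can be sketched as follows, again against a hypothetical in-memory SQLite table. The "Day" column gets a plain format string, while "Stamp" gets a dict of keyword arguments that is forwarded to the datetime conversion; the column names and formats are illustrative assumptions.

```python
import sqlite3

import pandas as pd

# Hypothetical table with two differently formatted text date columns.
con = sqlite3.connect(":memory:")
pd.DataFrame(
    {
        "Day": ["18/10/2010", "19/10/2010"],
        "Stamp": ["2010-10-18 12:30:00", "2010-10-19 08:00:00"],
    }
).to_sql("data", con, index=False)

parsed = pd.read_sql_query(
    "SELECT * FROM data",
    con,
    parse_dates={
        "Day": "%d/%m/%Y",  # format string
        "Stamp": {"format": "%Y-%m-%d %H:%M:%S"},  # dict of arguments
    },
)

print(parsed)
con.close()
```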
You can check whether a table exists using has_table().

Schema support

Reading from and writing to different schemas is supported through the schema keyword in the read_sql_table() and to_sql() functions. Note, however, that this depends on the database flavor (sqlite does not have schemas).
Querying

You can query using raw SQL in the read_sql_query() function. In this case, you must use the SQL variant appropriate for your database. When using SQLAlchemy, you can also pass SQLAlchemy Expression Language constructs, which are database-agnostic.
Of course, you can specify a more "complex" query.
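As an illustration, the sketch below runs a raw SQL query with a column list and a WHERE clause. The table, its columns and the filter value are hypothetical, and SQLite is used only so that the example is self-contained.

```python
import sqlite3

import pandas as pd

# Hypothetical sample table.
con = sqlite3.connect(":memory:")
pd.DataFrame(
    {"id": [26, 42, 63], "Col_1": ["X", "Y", "Z"], "Col_2": [27.5, -12.5, 5.73]}
).to_sql("data", con, index=False)

# Raw SQL in the dialect of the target database (SQLite here).
selected = pd.read_sql_query(
    "SELECT id, Col_1, Col_2 FROM data WHERE id = 42;", con
)

print(selected)
con.close()
```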
The read_sql_query() function supports a chunksize argument. Specifying it returns an iterator over chunks of the query result.
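A minimal sketch of chunked reading, assuming a hypothetical 20-row table named data_chunks in an in-memory SQLite database. Each item produced by the iterator is an ordinary DataFrame holding at most chunksize rows.

```python
import sqlite3

import pandas as pd

# Hypothetical 20-row table.
con = sqlite3.connect(":memory:")
pd.DataFrame({"a": range(20), "b": range(20, 40)}).to_sql(
    "data_chunks", con, index=False
)

# With chunksize set, the call returns an iterator of DataFrames.
chunk_lengths = []
for chunk in pd.read_sql_query("SELECT * FROM data_chunks", con, chunksize=5):
    chunk_lengths.append(len(chunk))

print(chunk_lengths)
con.close()
```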
You can also run a plain query without creating a DataFrame with execute(). This is useful for queries that do not return values, such as INSERT. It is functionally equivalent to calling execute on the SQLAlchemy engine or db connection object. Again, you must use the SQL syntax variant appropriate for your database.
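The sketch below shows the connection-level variant, calling execute directly on a DB-API connection (sqlite3 here) for statements that return no rows, then reading the result back with pandas. The table and values are hypothetical, and this is the equivalent route described above rather than the pandas helper itself.

```python
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")

# Statements that return no rows are run on the connection itself.
con.execute("CREATE TABLE data (id INTEGER, Col_1 TEXT)")
con.execute("INSERT INTO data VALUES (1, 'X')")
con.commit()

# The inserted row is then visible to pandas queries.
result = pd.read_sql_query("SELECT * FROM data", con)

print(result)
con.close()
```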
Engine connection examples

To connect with SQLAlchemy, you use the create_engine() function to create an engine object from a database URI. You only need to create the engine once per database you are connecting to.
For more information, see the examples in the SQLAlchemy documentation.

Advanced SQLAlchemy queries

You can use SQLAlchemy constructs to describe your query.

Use sqlalchemy.text() to specify query parameters in a backend-neutral way.
If you have an SQLAlchemy description of your database, you can express where conditions using SQLAlchemy expressions.
You can combine SQLAlchemy expressions with parameters passed to read_sql() using sqlalchemy.bindparam().
Sqlite fallback

The use of sqlite is supported without using SQLAlchemy. This mode requires a Python database adapter that respects the Python DB-API.

You can create such a connection and then issue queries through it.
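A minimal sketch of the fallback mode, assuming the standard-library sqlite3 module as the DB-API adapter and an in-memory database. The DataFrame contents and the table name are hypothetical.

```python
import sqlite3

import pandas as pd

# Create a DB-API connection; no SQLAlchemy engine is involved.
con = sqlite3.connect(":memory:")

# Write a DataFrame to a table, then query it back.
data = pd.DataFrame({"id": [1, 2, 3], "value": [0.5, 1.5, 2.5]})
data.to_sql("data", con, index=False)
result = pd.read_sql_query("SELECT * FROM data", con)

print(result)
con.close()
```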
Google BigQuery

Warning: Starting in 0.20.0, pandas has split off Google BigQuery support into the separate package pandas-gbq. You can pip install pandas-gbq to get it.

The pandas-gbq package provides functionality to read from and write to Google BigQuery.

pandas integrates with this external package. If pandas-gbq is installed, you can use the pandas methods pd.read_gbq and DataFrame.to_gbq, which will call the respective functions from pandas-gbq.

Full documentation can be found in the pandas-gbq documentation.

Stata format

Writing to Stata format

The method to_stata() will write a DataFrame into a .dta file. The format version of this file is always 115 (Stata 12).
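A short sketch of writing a .dta file and reading it back to check the round trip. The file location (a temporary directory), the column names and the write_index=False choice are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical data with an integer and a floating-point column.
df = pd.DataFrame(
    {"x": np.arange(5, dtype=np.int32), "y": np.linspace(0.0, 1.0, 5)}
)

# Write to a temporary .dta file without storing the index as a column.
path = os.path.join(tempfile.mkdtemp(), "stata.dta")
df.to_stata(path, write_index=False)

# Read the file back to confirm what was stored.
roundtrip = pd.read_stata(path)
print(roundtrip)
```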
Stata data files have limited data type support. Additionally, Stata reserves certain values to represent missing data. Exporting a non-missing value that is outside the permitted range in Stata for a particular data type will retype the variable to the next larger size. For example, int8 values are restricted to lie between -127 and 100 in Stata, so variables with values above 100 will trigger a conversion to int16. nan values in floating-point data types are stored as the basic missing data type (. in Stata).

Note: It is not possible to export missing data values for integer data types.

The Stata writer gracefully handles other data types, including int64, bool, uint8, uint16 and uint32, by casting to the smallest supported type that can represent the data. For example, data with a type of uint8 will be cast to int8 if all values are less than 100 (the upper bound for non-missing int8 data in Stata) or, if values are outside this range, the variable is cast to int16.

Warning: Conversion from int64 to float64 may result in a loss of precision if int64 values are larger than 2**53.
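The retyping rules for small integers can be observed with a round trip, sketched below under the assumption that the file is written to a temporary directory. The column names are hypothetical: one int8 column stays within Stata's range, one int8 column exceeds 100, and one uint8 column holds small values.

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "i8_small": np.array([1, 50, 100], dtype=np.int8),  # within -127..100
        "i8_large": np.array([1, 50, 120], dtype=np.int8),  # exceeds 100
        "u8_col": np.array([1, 2, 3], dtype=np.uint8),  # unsigned, small values
    }
)

path = os.path.join(tempfile.mkdtemp(), "types.dta")
df.to_stata(path, write_index=False)

# Inspect the dtypes that come back after the writer's casting.
back = pd.read_stata(path)
print(back.dtypes)
```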
Warning: StataWriter and to_stata() only support fixed-width strings containing up to 244 characters, a limitation imposed by the version 115 dta file format. Attempting to write Stata dta files with strings longer than 244 characters raises a ValueError.

Reading from Stata format

The top-level function read_stata will read a dta file and return either a DataFrame or a StataReader that can be used to read the file incrementally.
Specifying a chunksize yields a StataReader instance that can be used to read chunksize lines from the file at a time. The StataReader object can be used as an iterator.
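As a rough sketch of chunked reading, the example below writes a small throwaway .dta file and reads it back three rows at a time. The file name, column names, and chunk size are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Illustrative data: ten rows written to a temporary .dta file.
df = pd.DataFrame({"x": range(10), "y": [v * 0.5 for v in range(10)]})
path = os.path.join(tempfile.mkdtemp(), "stata.dta")
df.to_stata(path, write_index=False)

# Passing chunksize returns a StataReader, which can be iterated directly.
chunk_lengths = []
with pd.read_stata(path, chunksize=3) as reader:
    for chunk in reader:
        chunk_lengths.append(len(chunk))

print(chunk_lengths)
```

Each iteration yields an ordinary DataFrame, so the last chunk is simply shorter when the row count is not a multiple of chunksize.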
For more fine-grained control, use iterator=True and specify chunksize with each call to read().
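A minimal sketch of this pattern, assuming a small temporary file created just for the example (the file and column names are hypothetical):

```python
import os
import tempfile

import pandas as pd

# Illustrative data written to a temporary .dta file.
df = pd.DataFrame({"x": range(10)})
path = os.path.join(tempfile.mkdtemp(), "stata.dta")
df.to_stata(path, write_index=False)

# iterator=True returns a StataReader; each read(n) call returns the next n rows.
with pd.read_stata(path, iterator=True) as reader:
    first = reader.read(5)
    second = reader.read(5)

print(first["x"].tolist())
print(second["x"].tolist())
```

Because the number of rows is passed to each read() call, successive calls can use different sizes.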
Currently the index is retrieved as a column.

The parameter convert_categoricals indicates whether value labels should be read and used to create a Categorical variable from them. Value labels can also be retrieved by the function value_labels, which requires read() to be called before use.

The parameter convert_missing indicates whether missing value representations in Stata should be preserved. If False (the default), missing values are represented as np.nan. If True, missing values are represented using StataMissingValue objects, and columns containing missing values will have object data type.
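The behaviours described above can be sketched with a small round trip. The file name and the grade and score columns are invented for the example. It assumes the default to_stata behaviour of writing the index alongside the data.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Illustrative frame: one labeled (categorical) column, one float column with a gap.
df = pd.DataFrame(
    {
        "grade": pd.Categorical(["low", "high", "low"]),
        "score": [1.5, np.nan, 3.0],
    }
)
path = os.path.join(tempfile.mkdtemp(), "stata.dta")
df.to_stata(path)  # the index is written too (default write_index=True)

# Default read: the index comes back as a regular column, missing values as NaN.
plain = pd.read_stata(path)
print(plain.columns.tolist())

# convert_missing=True keeps Stata's missing-value objects; the column becomes object.
with_missing = pd.read_stata(path, convert_missing=True)
print(with_missing["score"].dtype)

# Value labels are available from the reader once read() has been called.
with pd.read_stata(path, iterator=True) as reader:
    reader.read()
    labels = reader.value_labels()
print(labels)
```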
Note: read_stata() and StataReader support .dta formats 113-115 (Stata 10-12), 117 (Stata 13), and 118 (Stata 14).

Note: Setting preserve_dtypes=False will upcast to the standard pandas data types: int64 for all integer types and float64 for floating point data. By default, the Stata data types are preserved when importing.

Categorical data

Categorical data can be exported to Stata data files as value labeled data. The exported data consists of the underlying category codes as integer data values and the categories as value labels. Stata does not have an explicit equivalent to a Categorical, and information about whether the variable is ordered is lost when exporting.

Warning: Stata only supports string value labels, and so str is called on the categories when exporting data. Exporting Categorical variables with non-string categories produces a warning, and can result in a loss of information if the str representations of the categories are not unique.

Labeled data can similarly be imported from Stata data files as Categorical variables using the keyword argument convert_categoricals (True by default). The keyword argument order_categoricals (True by default) determines whether imported Categorical variables are ordered.

Note: When importing categorical data, the values of the variables in the Stata data file are not preserved, since Categorical variables always use integer data types between -1 and n-1, where n is the number of categories. If the original values in the Stata data file are required, these can be imported by setting convert_categoricals=False, which will import the original data (but not the variable labels). The original values can be matched to the imported categorical data since there is a simple mapping between the original Stata data values and the category codes of imported Categorical variables: missing values are assigned code -1, the smallest original value is assigned 0, the second smallest is assigned 1, and so on until the largest original value is assigned the code n-1.
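The following sketch illustrates the categorical round trip described above. The tshirt column and the file name are hypothetical; the keyword arguments are the ones named in the text.

```python
import os
import tempfile

import pandas as pd

# Illustrative ordered categorical with an explicit category order.
sizes = pd.Categorical(
    ["small", "large", "medium", "small"],
    categories=["small", "medium", "large"],
    ordered=True,
)
df = pd.DataFrame({"tshirt": sizes})
path = os.path.join(tempfile.mkdtemp(), "categorical.dta")
df.to_stata(path, write_index=False)

# Default import: value labels become categories, treated as ordered.
labeled = pd.read_stata(path)
# order_categoricals=False imports the same labels as an unordered Categorical.
unordered = pd.read_stata(path, order_categoricals=False)
# convert_categoricals=False imports the stored integer values instead of labels.
raw = pd.read_stata(path, convert_categoricals=False)

print(labeled["tshirt"].cat.categories.tolist())
print(labeled["tshirt"].cat.ordered, unordered["tshirt"].cat.ordered)
print(raw["tshirt"].tolist())
```

The integers in raw are the category codes that were written on export, which is what makes it possible to match them back to the labeled column.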
Note: Stata supports partially labeled series. These series have value labels for some but not all data values. Importing a partially labeled series will produce a Categorical with string categories for the values that are labeled and numeric categories for values with no label.

SAS formats

The top-level function read_sas() can read (but not write) SAS XPORT (.xpt) and (since v0.18.0) SAS7BDAT (.sas7bdat) format files.

SAS files only contain two value types: ASCII text and floating point values (usually 8 bytes but sometimes truncated). For xport files, there is no automatic type conversion to integers, dates, or categoricals. For SAS7BDAT files, the format codes may allow date variables to be automatically converted to dates. By default the whole file is read and returned as a DataFrame.
Specify a chunksize or use iterator=True to obtain reader objects (XportReader or SAS7BDATReader) for incrementally reading the file. The reader objects also have attributes that contain additional information about the file and its variables.

A SAS7BDAT file can be read in a single call, and an XPORT file can be read through an iterator, for example 100,000 lines at a time.
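A hedged sketch of both reading styles follows. pandas cannot write SAS files, so the two helpers (read_sas_whole and iter_sas_chunks, both names invented here) take the path of an existing file and are only defined, not called.

```python
import pandas as pd


def read_sas_whole(path):
    """Read an entire SAS file (.sas7bdat or .xpt) into one DataFrame."""
    return pd.read_sas(path)


def iter_sas_chunks(path, chunksize=100_000):
    """Yield DataFrames of at most `chunksize` rows from a SAS file."""
    # With chunksize set, read_sas returns a reader object instead of a DataFrame.
    with pd.read_sas(path, chunksize=chunksize) as reader:
        for chunk in reader:
            yield chunk
```

A caller would loop over iter_sas_chunks("some_file.xpt") and process each chunk in turn, where the file name stands in for a real XPORT file.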
The specification for the xport file format is available from the SAS web site.

No official documentation is available for the SAS7BDAT format.

SPSS formats

New in version 0.25.0.

The top-level function read_spss() can read (but not write) SPSS SAV (.sav) and ZSAV (.zsav) format files.

SPSS files contain column names. By default the whole file is read, categorical columns are converted into pd.Categorical, and a DataFrame with all columns is returned.

Specify the usecols parameter to obtain a subset of columns. Specify convert_categoricals=False to avoid converting categorical columns into pd.Categorical.
An SPSS file can be read in full, or a subset of the columns named in usecols can be extracted from it without converting categorical columns into pd.Categorical.
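As a sketch only: the helper below wraps the two parameters mentioned above. The function name is invented, and the path and column names a caller would pass are placeholders. read_spss relies on the optional pyreadstat package, so the helper is defined here but not run.

```python
import pandas as pd


def read_spss_subset(path, columns):
    """Read only `columns` from an SPSS file, keeping categorical columns unconverted."""
    return pd.read_spss(path, usecols=columns, convert_categoricals=False)
```

Leaving out both keyword arguments, as in pd.read_spss(path), reads every column and converts the categorical ones.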
More information about the SAV and ZSAV file formats is available here.

Other file formats

pandas itself only supports IO with a limited set of file formats that map cleanly to its tabular data model. For reading and writing other file formats into and from pandas, we recommend these packages from the broader community.

netCDF

xarray provides data structures inspired by the pandas DataFrame for working with multi-dimensional datasets, with a focus on the netCDF file format and easy conversion to and from pandas.

Performance considerations

This is an informal comparison of various IO methods, using pandas 0.24.2. Timings are machine dependent and small differences should be ignored.
A set of test functions is used to compare the performance of several IO methods.
When writing, the top three functions in terms of speed are test_feather_write, test_hdf_fixed_write and test_hdf_fixed_write_compress.
When reading, the top three functions in terms of speed are test_feather_read, test_pickle_read and test_hdf_fixed_read.