programming python

Sắp xếp nhiều danh sách python

Việc bản sao hoặc tham chiếu được trả lại cho thao tác cài đặt có thể phụ thuộc vào ngữ cảnh. Điều này đôi khi được gọi là In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 06 và nên tránh. Nhìn thấy

Xem một số chiến lược nâng cao

Lập chỉ mục phân cấp (MultiIndex)

Lập chỉ mục phân cấp / đa cấp rất thú vị vì nó mở ra cơ hội cho một số thao tác và phân tích dữ liệu khá phức tạp, đặc biệt là để làm việc với dữ liệu nhiều chiều hơn. Về bản chất, nó cho phép bạn lưu trữ và thao tác dữ liệu với số lượng kích thước tùy ý trong các cấu trúc dữ liệu có chiều thấp hơn như In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 07 (1d) và In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 08 (2d)

Trong phần này, chúng tôi sẽ chỉ ra chính xác ý nghĩa của việc lập chỉ mục “phân cấp” và cách nó tích hợp với tất cả các chức năng lập chỉ mục của gấu trúc được mô tả ở trên và trong các phần trước. Sau này, khi thảo luận và , chúng ta sẽ chỉ ra các ứng dụng không tầm thường để minh họa cách nó hỗ trợ trong việc cấu trúc dữ liệu để phân tích

Xem một số chiến lược nâng cao

Tạo đối tượng MultiIndex (chỉ mục phân cấp)

Đối tượng là đối tượng tương tự phân cấp của đối tượng tiêu chuẩn thường lưu trữ các nhãn trục trong đối tượng gấu trúc. Bạn có thể nghĩ về In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 như một mảng các bộ trong đó mỗi bộ là duy nhất. Một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 có thể được tạo từ một danh sách các mảng (sử dụng ), một mảng các bộ dữ liệu (sử dụng ), một tập hợp các lần lặp chéo (sử dụng ) hoặc một (sử dụng ). Hàm tạo In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 10 sẽ cố gắng trả về một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 khi nó được truyền vào một danh sách các bộ dữ liệu. Các ví dụ sau minh họa các cách khác nhau để khởi tạo MultiIndexes

In [1]: arrays = [ ...: ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"], ...: ["one", "two", "one", "two", "one", "two", "one", "two"], ...: ] ...: In [2]: tuples = list(zip(*arrays)) In [3]: tuples Out[3]: [('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')] In [4]: index = pd.MultiIndex.from_tuples(tuples, names=["first", "second"]) In [5]: index Out[5]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) In [6]: s = pd.Series(np.random.randn(8), index=index) In [7]: s Out[7]: first second bar one 0.469112 two -0.282863 baz one -1.509059 two -1.135632 foo one 1.212112 two -0.173215 qux one 0.119209 two -1.044236 dtype: float64

Khi bạn muốn mọi cặp phần tử trong hai lần lặp, có thể sử dụng phương thức này dễ dàng hơn

In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second'])

Bạn cũng có thể trực tiếp xây dựng một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 từ một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 08, sử dụng phương pháp. Đây là phương pháp bổ trợ cho

Để thuận tiện, bạn có thể chuyển trực tiếp danh sách các mảng vào In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 07 hoặc In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 08 để tự động xây dựng một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09

Tất cả các hàm tạo In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 đều chấp nhận đối số In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 29 lưu trữ tên chuỗi cho chính các mức. Nếu không cung cấp tên, In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 30 sẽ được chỉ định

In [17]: df.index.names Out[17]: FrozenList([None, None])

Chỉ mục này có thể sao lưu bất kỳ trục nào của đối tượng gấu trúc và số lượng cấp độ của chỉ mục tùy thuộc vào bạn

In [18]: df = pd.DataFrame(np.random.randn(3, 8), index=["A", "B", "C"], columns=index) In [19]: df Out[19]: first bar baz .. foo qux second one two one .. two one two A 0.895717 0.805244 -1.206412 .. 1.340309 -1.170299 -0.226169 B 0.410835 0.813850 0.132003 .. -1.187678 1.130127 -1.436737 C -1.413681 1.607920 1.024180 .. -2.211372 0.974466 -2.006747 [3 rows x 8 columns] In [20]: pd.DataFrame(np.random.randn(6, 6), index=index[:6], columns=index[:6]) Out[20]: first bar baz foo second one two one two one two first second bar one -0.410001 -0.078638 0.545952 -1.219217 -1.226825 0.769804 two -1.281247 -0.727707 -0.121306 -0.097883 0.695775 0.341734 baz one 0.959726 -1.110336 -0.619976 0.149748 -0.732339 0.687738 two 0.176444 0.403310 -0.154951 0.301624 -2.179861 -1.369849 foo one -0.954208 1.462696 -1.743161 -0.826591 -0.345352 1.314232 two 0.690579 0.995761 2.396780 0.014871 3.357427 -0.317441

Chúng tôi đã “phân tán” các mức chỉ mục cao hơn để làm cho đầu ra của bảng điều khiển dễ nhìn hơn một chút. Lưu ý rằng cách hiển thị chỉ mục có thể được kiểm soát bằng cách sử dụng tùy chọn In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 31 trong In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 32

In [21]: with pd.option_context("display.multi_sparse", False): ....: df ....:

Cần lưu ý rằng không có gì ngăn cản bạn sử dụng các bộ làm nhãn nguyên tử trên một trục

In [22]: pd.Series(np.random.randn(8), index=tuples) Out[22]: (bar, one) -1.236269 (bar, two) 0.896171 (baz, one) -0.487602 (baz, two) -0.082240 (foo, one) -2.182937 (foo, two) 0.380396 (qux, one) 0.084844 (qux, two) 0.432390 dtype: float64

Lý do mà In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 quan trọng là nó có thể cho phép bạn thực hiện các hoạt động nhóm, lựa chọn và định hình lại như chúng tôi sẽ mô tả bên dưới và trong các phần tiếp theo của tài liệu. Như bạn sẽ thấy trong các phần sau, bạn có thể thấy mình đang làm việc với dữ liệu được lập chỉ mục theo thứ bậc mà không cần tự mình tạo một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 một cách rõ ràng. Tuy nhiên, khi tải dữ liệu từ một tệp, bạn có thể muốn tạo In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 của riêng mình khi chuẩn bị tập dữ liệu

Xây dựng lại các nhãn mức

Phương thức sẽ trả về một vectơ nhãn cho từng vị trí ở một cấp độ cụ thể

In [23]: index.get_level_values(0) Out[23]: Index(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'], dtype='object', name='first') In [24]: index.get_level_values("second") Out[24]: Index(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'], dtype='object', name='second')

Lập chỉ mục cơ bản trên trục với MultiIndex

Một trong những tính năng quan trọng của lập chỉ mục phân cấp là bạn có thể chọn dữ liệu theo nhãn “một phần” xác định một nhóm con trong dữ liệu. Lựa chọn một phần "làm giảm" các mức của chỉ mục phân cấp dẫn đến kết quả theo cách hoàn toàn tương tự với việc chọn một cột trong DataFrame thông thường

In [25]: df["bar"] Out[25]: second one two A 0.895717 0.805244 B 0.410835 0.813850 C -1.413681 1.607920 In [26]: df["bar", "one"] Out[26]: A 0.895717 B 0.410835 C -1.413681 Name: (bar, one), dtype: float64 In [27]: df["bar"]["one"] Out[27]: A 0.895717 B 0.410835 C -1.413681 Name: one, dtype: float64 In [28]: s["qux"] Out[28]: one -1.039575 two 0.271860 dtype: float64

Xem để biết cách chọn ở cấp độ sâu hơn

mức xác định

Giữ tất cả các mức được xác định của một chỉ mục, ngay cả khi chúng không thực sự được sử dụng. Khi cắt một chỉ mục, bạn có thể nhận thấy điều này. Ví dụ

Điều này được thực hiện để tránh tính toán lại các cấp độ nhằm làm cho việc cắt lát có hiệu suất cao. Nếu bạn chỉ muốn xem các cấp độ đã sử dụng, bạn có thể sử dụng phương pháp

Để xây dựng lại In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 chỉ với các mức đã sử dụng, có thể sử dụng phương pháp

Căn chỉnh dữ liệu và sử dụng In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 41

Các hoạt động giữa các đối tượng được lập chỉ mục khác nhau có In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 trên các trục sẽ hoạt động như bạn mong đợi;

Phương thức của In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 07/_______0_______45 có thể được gọi với một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 khác, hoặc thậm chí là một danh sách hoặc mảng các bộ dữ liệu

Lập chỉ mục nâng cao với chỉ mục phân cấp

Việc tích hợp cú pháp In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 trong lập chỉ mục nâng cao với In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 48 hơi khó khăn, nhưng chúng tôi đã cố gắng hết sức để làm như vậy. Nói chung, các khóa MultiIndex có dạng bộ. Ví dụ: các công việc sau đây như bạn mong đợi

Lưu ý rằng In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 49 cũng sẽ hoạt động trong ví dụ này, nhưng ký hiệu tốc ký này có thể dẫn đến sự mơ hồ nói chung

Nếu bạn cũng muốn lập chỉ mục một cột cụ thể với In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 48, bạn phải sử dụng một bộ dữ liệu như thế này

Bạn không cần phải chỉ định tất cả các cấp của In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 bằng cách chỉ chuyển các phần tử đầu tiên của bộ dữ liệu. Ví dụ: bạn có thể sử dụng lập chỉ mục “một phần” để lấy tất cả các phần tử có In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 52 ở cấp độ đầu tiên như sau

Đây là cách viết tắt của ký hiệu dài dòng hơn một chút In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 53 (tương đương với In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 54 trong ví dụ này)

Cắt “một phần” cũng hoạt động khá độc đáo

Bạn có thể cắt với một 'phạm vi' giá trị, bằng cách cung cấp một lát các bộ dữ liệu

Truyền danh sách nhãn hoặc bộ dữ liệu hoạt động tương tự như lập chỉ mục lại

In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 0

Ghi chú

Điều quan trọng cần lưu ý là các bộ dữ liệu và danh sách không được xử lý giống hệt nhau trong gấu trúc khi lập chỉ mục. Trong khi một bộ được hiểu là một khóa đa cấp, một danh sách được sử dụng để chỉ định một số khóa. Hay nói cách khác, các bộ dữ liệu đi theo chiều ngang (các cấp độ duyệt qua), các danh sách đi theo chiều dọc (các cấp độ quét)

Điều quan trọng là, một danh sách các bộ lập chỉ mục cho một số khóa In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 hoàn chỉnh, trong khi một bộ danh sách đề cập đến một số giá trị trong một mức

Sử dụng máy thái

Bạn có thể cắt một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 bằng cách cung cấp nhiều bộ chỉ mục

Bạn có thể cung cấp bất kỳ bộ chọn nào như thể bạn đang lập chỉ mục theo nhãn, xem , bao gồm các lát cắt, danh sách nhãn, nhãn và bộ chỉ mục boolean

Bạn có thể sử dụng In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 57 để chọn tất cả nội dung của cấp độ đó. Bạn không cần chỉ định tất cả các cấp độ sâu hơn, chúng sẽ được ngụ ý là In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 57

Như thường lệ, cả hai mặt của máy cắt được bao gồm vì đây là chỉ mục nhãn

Cảnh báo

Bạn nên chỉ định tất cả các trục trong bộ xác định In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 48, nghĩa là bộ chỉ mục cho chỉ mục và cho các cột. Có một số trường hợp không rõ ràng trong đó trình chỉ mục được thông qua có thể bị hiểu sai là lập chỉ mục cho cả hai trục, thay vì nói ____0__09 cho các hàng

Bạn nên làm điều này

bạn không nên làm điều này

Cắt Multi Index cơ bản bằng cách sử dụng các lát, danh sách và nhãn

Bạn có thể sử dụng để tạo điều kiện cho một cú pháp tự nhiên hơn bằng cách sử dụng In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 62, thay vì sử dụng In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 57

Có thể thực hiện các lựa chọn khá phức tạp bằng phương pháp này trên nhiều trục cùng một lúc

Sử dụng bộ chỉ mục boolean, bạn có thể cung cấp lựa chọn liên quan đến các giá trị

Bạn cũng có thể chỉ định đối số In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 64 thành In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 48 để diễn giải các slicer đã truyền trên một trục

Hơn nữa, bạn có thể đặt các giá trị bằng các phương pháp sau

Bạn cũng có thể sử dụng mặt phải của đối tượng có thể căn chỉnh

mặt cắt ngang

Phương pháp của In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 08 cũng sử dụng đối số cấp độ để giúp việc chọn dữ liệu ở cấp độ cụ thể của In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 dễ dàng hơn

Bạn cũng có thể chọn trên các cột có In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 69, bằng cách cung cấp đối số trục

Bạn có thể vượt qua In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 71 đến In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 69 để giữ nguyên cấp độ đã chọn

So sánh ở trên với kết quả bằng cách sử dụng In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 73 (giá trị mặc định)

Lập chỉ mục lại và căn chỉnh nâng cao

Sử dụng tham số In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 74 trong và các phương thức của đối tượng gấu trúc rất hữu ích để phát các giá trị trên một cấp độ. Ví dụ

In [17]: df.index.names Out[17]: FrozenList([None, None]) 0

Hoán đổi cấp độ với In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 77

Phương pháp có thể chuyển đổi thứ tự của hai cấp độ

In [17]: df.index.names Out[17]: FrozenList([None, None]) 1

Sắp xếp lại các cấp độ với In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 79

Phương thức tổng quát hóa phương thức In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 77, cho phép bạn hoán vị các mức chỉ số phân cấp trong một bước

In [17]: df.index.names Out[17]: FrozenList([None, None]) 2

Đổi tên của một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 10 hoặc In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09

Phương pháp này được sử dụng để đổi tên nhãn của In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 và thường được sử dụng để đổi tên các cột của In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 08. Đối số In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 87 của In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 88 cho phép chỉ định một từ điển chỉ bao gồm các cột bạn muốn đổi tên

In [17]: df.index.names Out[17]: FrozenList([None, None]) 3

Phương pháp này cũng có thể được sử dụng để đổi tên các nhãn cụ thể của chỉ mục chính của In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 08

In [17]: df.index.names Out[17]: FrozenList([None, None]) 4

Phương thức được sử dụng để đổi tên của một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 10 hoặc In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09. Đặc biệt, có thể chỉ định tên của các mức của một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09, điều này rất hữu ích nếu sau này sử dụng In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 94 để di chuyển các giá trị từ In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 sang một cột

In [17]: df.index.names Out[17]: FrozenList([None, None]) 5

Lưu ý rằng các cột của In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 08 là một chỉ mục, vì vậy việc sử dụng In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 97 với đối số In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 87 sẽ thay đổi tên của chỉ mục đó

In [17]: df.index.names Out[17]: FrozenList([None, None]) 6

Cả In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 88 và In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 97 đều hỗ trợ chỉ định từ điển, In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 07 hoặc chức năng ánh xạ để ánh xạ nhãn/tên thành giá trị mới

Khi làm việc trực tiếp với một đối tượng In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 10, thay vì thông qua một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 08, có thể được sử dụng để thay đổi tên

In [17]: df.index.names Out[17]: FrozenList([None, None]) 7

Bạn không thể đặt tên của MultiIndex qua một mức

In [17]: df.index.names Out[17]: FrozenList([None, None]) 8

sử dụng thay thế

Sắp xếp một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09

Đối với các đối tượng -ed được lập chỉ mục và cắt hiệu quả, chúng cần được sắp xếp. Như với bất kỳ chỉ mục nào, bạn có thể sử dụng

In [17]: df.index.names Out[17]: FrozenList([None, None]) 9

Bạn cũng có thể chuyển tên cấp độ cho In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 09 nếu cấp độ In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 được đặt tên

Trên các đối tượng có chiều cao hơn, bạn có thể sắp xếp bất kỳ trục nào khác theo cấp độ nếu chúng có In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09

Lập chỉ mục sẽ hoạt động ngay cả khi dữ liệu không được sắp xếp, nhưng sẽ khá kém hiệu quả (và hiển thị In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 12). Nó cũng sẽ trả về một bản sao của dữ liệu chứ không phải là một dạng xem

Hơn nữa, nếu bạn cố lập chỉ mục thứ gì đó chưa được sắp xếp từ vựng hoàn toàn, điều này có thể làm tăng

Phương pháp In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 13 trên In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 hiển thị nếu chỉ mục được sắp xếp

Và bây giờ lựa chọn hoạt động như mong đợi

Thực hiện các phương pháp

Tương tự như NumPy ndarrays, gấu trúc In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 10, In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 07 và In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 08 cũng cung cấp phương thức truy xuất các phần tử dọc theo một trục nhất định tại các chỉ số đã cho. Các chỉ số đã cho phải là một danh sách hoặc một dãy các vị trí chỉ mục số nguyên. In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 19 cũng sẽ chấp nhận số nguyên âm làm vị trí tương đối cho đến cuối đối tượng

Đối với DataFrames, các chỉ mục đã cho phải là danh sách 1d hoặc ndarray chỉ định vị trí hàng hoặc cột

Điều quan trọng cần lưu ý là phương thức In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 19 trên các đối tượng gấu trúc không nhằm mục đích hoạt động trên các chỉ số boolean và có thể trả về kết quả không mong muốn

In [21]: with pd.option_context("display.multi_sparse", False): ....: df ....: 0

Cuối cùng, như một lưu ý nhỏ về hiệu suất, bởi vì phương pháp In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 19 xử lý phạm vi đầu vào hẹp hơn, nên nó có thể mang lại hiệu suất nhanh hơn nhiều so với lập chỉ mục ưa thích

In [21]: with pd.option_context("display.multi_sparse", False): ....: df ....: 1

In [21]: with pd.option_context("display.multi_sparse", False): ....: df ....: 2

Các loại chỉ mục

Chúng ta đã thảo luận khá nhiều về In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 09 trong các phần trước. Tài liệu về In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 23 và In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 24 được hiển thị và tài liệu về In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 25 được tìm thấy

Trong các phần phụ sau đây, chúng tôi sẽ nêu bật một số loại chỉ mục khác

Phân loại Index

là một loại chỉ mục hữu ích cho việc hỗ trợ lập chỉ mục với các mục trùng lặp. Đây là vùng chứa xung quanh a và cho phép lập chỉ mục và lưu trữ hiệu quả một chỉ mục có số lượng lớn các phần tử trùng lặp

In [21]: with pd.option_context("display.multi_sparse", False): ....: df ....: 3

Đặt chỉ mục sẽ tạo ra một In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 26

In [21]: with pd.option_context("display.multi_sparse", False): ....: df ....: 4

Lập chỉ mục với In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 29 hoạt động tương tự như một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 10 với các bản sao. Người lập chỉ mục phải nằm trong danh mục nếu không hoạt động sẽ tăng In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 31

In [21]: with pd.option_context("display.multi_sparse", False): ....: df ....: 5

In [21]: with pd.option_context("display.multi_sparse", False): ....: df ....: 6

Việc sắp xếp chỉ mục sẽ sắp xếp theo thứ tự của các danh mục (nhớ lại rằng chúng tôi đã tạo chỉ mục với In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 33, vì vậy thứ tự được sắp xếp là In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 34)

In [21]: with pd.option_context("display.multi_sparse", False): ....: df ....: 7

Hoạt động nhóm trên chỉ mục cũng sẽ bảo vệ bản chất chỉ mục

In [21]: with pd.option_context("display.multi_sparse", False): ....: df ....: 8

Các hoạt động lập chỉ mục lại sẽ trả về một chỉ mục kết quả dựa trên loại của trình chỉ mục đã chuyển. Vượt qua một danh sách sẽ trả về một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 10 đơn giản cũ; . Điều này cho phép một người tùy ý lập chỉ mục những giá trị này ngay cả với các giá trị không có trong danh mục, tương tự như cách bạn có thể lập chỉ mục lại bất kỳ chỉ mục gấu trúc nào

In [21]: with pd.option_context("display.multi_sparse", False): ....: df ....: 9

Cảnh báo

Các thao tác Định hình lại và So sánh trên một In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 26 phải có cùng danh mục nếu không một In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 40 sẽ được nâng lên

Int64 Index và Range Index

Không dùng nữa kể từ phiên bản 1. 4. 0. Trong gấu trúc 2. 0, sẽ trở thành loại chỉ mục mặc định cho các loại số thay vì In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 42, In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 43 và In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 44 và các loại chỉ mục đó do đó không được dùng nữa và sẽ bị xóa trong phiên bản tương lai. In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 45 sẽ không bị xóa vì nó đại diện cho phiên bản được tối ưu hóa của chỉ mục số nguyên.

là một chỉ số cơ bản cơ bản trong gấu trúc. Đây là một mảng bất biến thực hiện một tập hợp có thể cắt được sắp xếp theo thứ tự

là một lớp con của In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 42 cung cấp chỉ mục mặc định cho tất cả các đối tượng In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 49. In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 45 là phiên bản được tối ưu hóa của In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 42 có thể biểu diễn một tập hợp có thứ tự đơn điệu. Chúng tương tự như Python

Float64Index

Không dùng nữa kể từ phiên bản 1. 4. 0. sẽ trở thành loại chỉ mục mặc định cho các loại số trong tương lai thay vì In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 42, In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 43 và In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 44 và các loại chỉ mục đó do đó không được dùng nữa và sẽ bị xóa trong phiên bản tương lai của Pandas. In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 45 sẽ không bị xóa vì nó đại diện cho phiên bản được tối ưu hóa của chỉ mục số nguyên.

Theo mặc định, a sẽ được tạo tự động khi chuyển các giá trị nổi hoặc số nguyên hỗn hợp trong tạo chỉ mục. Điều này cho phép mô hình cắt lát dựa trên nhãn thuần túy làm cho In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 58 lập chỉ mục vô hướng và cắt lát hoạt động giống hệt nhau

Lựa chọn vô hướng cho In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 59 sẽ luôn dựa trên nhãn. Một số nguyên sẽ khớp với một chỉ số float bằng nhau (e. g. In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 60 tương đương với In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 61)

Lập chỉ mục vị trí duy nhất là thông qua In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 62

Một chỉ số vô hướng không được tìm thấy sẽ tăng In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 31. Cắt lát chủ yếu dựa trên các giá trị của chỉ mục khi sử dụng In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 58 và luôn có vị trí khi sử dụng In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 62. Ngoại lệ là khi lát cắt là boolean, trong trường hợp đó, nó sẽ luôn ở vị trí

Trong các chỉ mục float, cho phép cắt bằng float

Trong các chỉ mục không nổi, việc cắt bằng cách sử dụng số float sẽ tăng In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 40

Đây là trường hợp sử dụng điển hình để sử dụng loại chỉ mục này. Hãy tưởng tượng rằng bạn có sơ đồ lập chỉ mục giống như timedelta hơi bất thường, nhưng dữ liệu được ghi dưới dạng float. Ví dụ, đây có thể là độ lệch mili giây

Sau đó, các phép toán lựa chọn sẽ luôn hoạt động trên cơ sở giá trị, cho tất cả các toán tử lựa chọn

Bạn có thể truy xuất 1 giây đầu tiên (1000 ms) dữ liệu như vậy

Nếu bạn cần lựa chọn dựa trên số nguyên, bạn nên sử dụng In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 62

Chỉ số khoảng thời gian

cùng với dtype của riêng nó, In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 69 cũng như kiểu vô hướng, cho phép hỗ trợ hạng nhất trong gấu trúc cho ký hiệu khoảng

Lập chỉ mục với một In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 68

Có thể sử dụng In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 68 trong In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 07 và trong In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 08 làm chỉ mục

Lập chỉ mục dựa trên nhãn thông qua In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 48 dọc theo các cạnh của khoảng thời gian hoạt động như bạn mong đợi, chọn khoảng thời gian cụ thể đó

Nếu bạn chọn một nhãn có trong một khoảng thời gian, điều này cũng sẽ chọn khoảng thời gian

Chọn bằng cách sử dụng In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 70 sẽ chỉ trả về kết quả khớp chính xác (bắt đầu từ pandas 0. 25. 0)

Việc cố gắng chọn một In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 70 không có chính xác trong In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 68 sẽ làm tăng In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 31

Việc chọn tất cả In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 83 chồng lên một In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 70 đã cho có thể được thực hiện bằng phương pháp tạo bộ chỉ mục boolean

Kết hợp dữ liệu với In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 86 và In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 87

và cả hai đều trả về một đối tượng In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 27 và các thùng mà chúng tạo được lưu trữ dưới dạng một In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 68 trong thuộc tính In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 92 của nó

cũng chấp nhận một In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 68 cho đối số In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 95 của nó, cho phép một thành ngữ pandas hữu ích. Đầu tiên, chúng tôi gọi với một số dữ liệu và In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 95 được đặt thành một số cố định, để tạo các thùng. Sau đó, chúng tôi chuyển các giá trị của In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 92 làm đối số In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 95 trong các lệnh gọi tiếp theo tới , cung cấp dữ liệu mới sẽ được chuyển vào cùng một thùng

Bất kỳ giá trị nào nằm ngoài tất cả các ngăn sẽ được gán giá trị In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 01

Tạo phạm vi khoảng

Nếu chúng ta cần các khoảng thời gian với tần suất đều đặn, chúng ta có thể sử dụng hàm để tạo một In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 68 bằng cách sử dụng các kết hợp khác nhau của In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 04, In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 05 và In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 06. Tần suất mặc định cho In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 07 là 1 cho các khoảng thời gian số và ngày theo lịch cho các khoảng thời gian giống như ngày tháng

Tham số In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 08 có thể được sử dụng để chỉ định các tần số không mặc định và có thể sử dụng nhiều loại với các khoảng thời gian giống như ngày giờ

Ngoài ra, tham số In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 09 có thể được sử dụng để chỉ định (các) bên mà các khoảng được đóng trên. Khoảng thời gian được đóng ở phía bên phải theo mặc định

Chỉ định In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 04, In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 05 và In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 06 sẽ tạo ra một phạm vi các khoảng cách đều nhau từ In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 04 đến bao gồm cả In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 05, với số lượng phần tử In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 06 trong kết quả là In [10]: df = pd.DataFrame( ....: [["bar", "one"], ["bar", "two"], ["foo", "one"], ["foo", "two"]], ....: columns=["first", "second"], ....: ) ....: In [11]: pd.MultiIndex.from_frame(df) Out[11]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('foo', 'one'), ('foo', 'two')], names=['first', 'second']) 68

Câu hỏi thường gặp về lập chỉ mục khác

lập chỉ mục số nguyên

Lập chỉ mục dựa trên nhãn với nhãn trục số nguyên là một chủ đề gai góc. Nó đã được thảo luận rất nhiều trên các danh sách gửi thư và giữa các thành viên khác nhau của cộng đồng khoa học Python. Ở gấu trúc, quan điểm chung của chúng tôi là nhãn quan trọng hơn vị trí số nguyên. Do đó, với chỉ mục trục số nguyên, chỉ có thể lập chỉ mục dựa trên nhãn bằng các công cụ tiêu chuẩn như In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 48. Đoạn mã sau sẽ tạo ra ngoại lệ

Quyết định có chủ ý này được đưa ra để ngăn chặn sự mơ hồ và các lỗi tinh vi (nhiều người dùng đã báo cáo việc tìm thấy lỗi khi thay đổi API được thực hiện để ngăn chặn tình trạng “ngã ngửa” khi lập chỉ mục dựa trên vị trí)

Các chỉ mục không đơn điệu yêu cầu khớp chính xác

Nếu chỉ số của một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 07 hoặc In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 08 tăng hoặc giảm đơn điệu, thì giới hạn của một lát cắt dựa trên nhãn có thể nằm ngoài phạm vi của chỉ mục, giống như lát cắt lập chỉ mục một Python bình thường In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 20. Tính đơn điệu của một chỉ mục có thể được kiểm tra bằng các thuộc tính và

Mặt khác, nếu chỉ mục không đơn điệu, thì cả hai giới hạn của lát cắt phải là thành viên duy nhất của chỉ mục

In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 23 và In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 24 chỉ kiểm tra xem một chỉ mục có đơn điệu yếu không. Để kiểm tra tính đơn điệu nghiêm ngặt, bạn có thể kết hợp một trong số đó với thuộc tính

Điểm cuối được bao gồm

So với việc cắt chuỗi Python tiêu chuẩn trong đó không bao gồm điểm cuối của lát cắt, thì việc cắt dựa trên nhãn trong gấu trúc được bao gồm. Lý do chính cho điều này là thường không thể dễ dàng xác định “phần tử kế tiếp” hoặc phần tử tiếp theo sau một nhãn cụ thể trong một chỉ mục. Ví dụ, hãy xem xét điều sau In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 07

Giả sử chúng ta muốn cắt từ In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 27 thành In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 28, sử dụng số nguyên, điều này sẽ được thực hiện như vậy

Tuy nhiên, nếu bạn chỉ có In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 27 và In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 28, việc xác định phần tử tiếp theo trong chỉ mục có thể hơi phức tạp. Ví dụ: những điều sau đây không hoạt động

Một trường hợp sử dụng rất phổ biến là giới hạn chuỗi thời gian bắt đầu và kết thúc vào hai ngày cụ thể. Để kích hoạt tính năng này, chúng tôi đã đưa ra lựa chọn thiết kế để tạo tính năng cắt dựa trên nhãn bao gồm cả hai điểm cuối

Đây chắc chắn là một thứ "tính thực tế đánh bại độ tinh khiết", nhưng nó là điều cần chú ý nếu bạn mong đợi phép cắt dựa trên nhãn hoạt động chính xác theo cách mà phép cắt số nguyên tiêu chuẩn của Python hoạt động

Việc lập chỉ mục có khả năng thay đổi kiểu Sê-ri bên dưới

Hoạt động lập chỉ mục khác nhau có khả năng thay đổi dtype của một In [8]: iterables = [["bar", "baz", "foo", "qux"], ["one", "two"]] In [9]: pd.MultiIndex.from_product(iterables, names=["first", "second"]) Out[9]: MultiIndex([('bar', 'one'), ('bar', 'two'), ('baz', 'one'), ('baz', 'two'), ('foo', 'one'), ('foo', 'two'), ('qux', 'one'), ('qux', 'two')], names=['first', 'second']) 07

Điều này là do các hoạt động lập chỉ mục (lại) ở trên âm thầm chèn In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 32 và thay đổi In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 33 tương ứng. Điều này có thể gây ra một số vấn đề khi sử dụng In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 34 In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 35 chẳng hạn như In [12]: arrays = [ ....: np.array(["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"]), ....: np.array(["one", "two", "one", "two", "one", "two", "one", "two"]), ....: ] ....: In [13]: s = pd.Series(np.random.randn(8), index=arrays) In [14]: s Out[14]: bar one -0.861849 two -2.104569 baz one -0.494929 two 1.071804 foo one 0.721555 two -0.706771 qux one -1.039575 two 0.271860 dtype: float64 In [15]: df = pd.DataFrame(np.random.randn(8, 4), index=arrays) In [16]: df Out[16]: 0 1 2 3 bar one -0.424972 0.567020 0.276232 -1.087401 two -0.673690 0.113648 -1.478427 0.524988 baz one 0.404705 0.577046 -1.715002 -1.039268 two -0.370647 -1.157892 -1.344312 0.844885 foo one 1.075770 -0.109050 1.643563 -1.469388 two 0.357021 -0.674600 -1.776904 -0.968914 qux one -1.294524 0.413738 0.276662 -0.472035 two -0.013960 -0.362543 -0.006154 -0.923061 36