How do you find duplicate words in a list in Python?

I can see where you are going with sorting, since you can tell when you have hit a new word and keep a running count for each unique word. However, what you really want is a hash (a dictionary) to keep track of the counts, because dictionary keys are unique. For example:

words = sentence.split()
counts = {}
for word in words:
    if word not in counts:
        counts[word] = 0
    counts[word] += 1

This gives you a dictionary in which each key is a word and each value is the number of times that word appears. You can simplify it with collections.defaultdict(int), which lets you increment the count without first checking whether the key exists:

counts = collections.defaultdict(int)
for word in words:
    counts[word] += 1

But there is something even better: collections.Counter takes your list of words and turns it into a dictionary (a subclass of dict, in fact) containing the counts.

counts = collections.Counter(words)
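Because Counter is a dict subclass, it can be queried like any dictionary, with two conveniences worth knowing: a missing key yields 0 instead of raising KeyError, and most_common() returns the entries ordered by count. A minimal sketch, assuming the example sentence used on this page (the lookup word "physics" is just an illustration):

```python
import collections

sentence = ("As far as the laws of mathematics refer to reality they are not "
            "certain as far as they are certain they do not refer to reality")
words = sentence.split()
counts = collections.Counter(words)

# A word that never appears yields 0 rather than raising KeyError.
missing = counts["physics"]

# most_common(n) returns the n highest counts as (word, count) tuples.
top_two = counts.most_common(2)

print(missing)
print(top_two)
```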

From there, you want the words in sorted order together with their counts so that you can print them. items() gives you the (word, count) pairs as tuples, and sorted() sorts them (by default) by the first item of each tuple (the word, in this case), which is exactly what you want.
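Putting these pieces together, a complete program might look like the following sketch. The line breaks inside the sentence are an assumption; they do not affect the result, since split() with no arguments treats any run of whitespace as a separator.

```python
import collections

sentence = """As far as the laws of mathematics refer to reality
they are not certain as far as they are certain
they do not refer to reality"""

words = sentence.split()
word_counts = collections.Counter(words)

# sorted() orders the (word, count) tuples by word; uppercase letters
# sort before lowercase ones, so "As" comes first.
for word, count in sorted(word_counts.items()):
    print('"%s" is repeated %d time%s.' % (word, count, "s" if count > 1 else ""))
```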

Output

"As" is repeated 1 time.
"are" is repeated 2 times.
"as" is repeated 3 times.
"certain" is repeated 2 times.
"do" is repeated 1 time.
"far" is repeated 2 times.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 times.
"of" is repeated 1 time.
"reality" is repeated 2 times.
"refer" is repeated 2 times.
"the" is repeated 1 time.
"they" is repeated 3 times.
"to" is repeated 2 times.

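The word-count output above treats "As" and "as" as different words, and it also lists words that occur only once. If the goal is to report only the words that are actually repeated, the counts can be filtered, optionally after normalizing case. A hedged sketch follows; the function name find_duplicates and its ignore_case parameter are illustrative and not part of the original answer:

```python
import collections


def find_duplicates(words, ignore_case=False):
    """Return a dict of the words that occur more than once, with their counts."""
    if ignore_case:
        # casefold() is a more aggressive lower() intended for caseless matching.
        words = [word.casefold() for word in words]
    counts = collections.Counter(words)
    return {word: count for word, count in counts.items() if count > 1}


sentence = ("As far as the laws of mathematics refer to reality they are not "
            "certain as far as they are certain they do not refer to reality")
words = sentence.split()

print(find_duplicates(words))
print(find_duplicates(words, ignore_case=True))
```

With ignore_case=True, "As" and "as" are merged into a single entry, so the count for "as" grows by one.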

Sometimes, while working with Python lists, we need to remove duplicate words from each string in a list of strings. This can have applications in the data domain. Let's discuss some ways in which this task can be performed.

Method #1: Using set() + split() + loop

The combination of the above methods can be used to perform this task. We first split each string in the list into its words and then apply set() to remove the duplicates.

Python3

# Reconstructed from the method description and the output shown below;
# the variable names are assumptions.

# Initialize the list of strings.
test_list = ['gfg, best, gfg', 'I, am, I', 'two, two, three']

print("The original list is : " + str(test_list))

# Split each string into words, then use set() to drop the duplicates.
res = []
for strs in test_list:
    res.append(set(strs.split(", ")))

print("The list after duplicate words removal is : " + str(res))

Output:

The original list is : ['gfg, best, gfg', 'I, am, I', 'two, two, three']
The list after duplicate words removal is : [{'best', 'gfg'}, {'I', 'am'}, {'three', 'two'}]
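Note that a set has no defined order, and string hashing is randomized between interpreter runs by default, so the elements inside each set may print in a different order than shown. When reproducible output matters, one option is to convert each set to a sorted list. A minimal sketch, where test_list and res are assumed names chosen to match the printed output:

```python
test_list = ['gfg, best, gfg', 'I, am, I', 'two, two, three']

res = []
for strs in test_list:
    # Split on ", ", drop duplicates with set(), then sort for a stable order.
    res.append(sorted(set(strs.split(", "))))

print(res)
```

The trade-off is that the result is a list of lists rather than a list of sets, and the words come out alphabetically rather than in their original order.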

Method #2: Using list comprehension + set() + split()

This method is similar to the one above. The difference is that we use a list comprehension instead of a loop to perform the iteration.

Python3

# Reconstructed from the method description and the output shown below;
# the variable names are assumptions.

# Initialize the list of strings.
test_list = ['gfg, best, gfg', 'I, am, I', 'two, two, three']

print("The original list is : " + str(test_list))

# Same idea as Method #1, written as a list comprehension.
res = [set(strs.split(", ")) for strs in test_list]

print("The list after duplicate words removal is : " + str(res))

Output:

The original list is : ['gfg, best, gfg', 'I, am, I', 'two, two, three']
The list after duplicate words removal is : [{'best', 'gfg'}, {'I', 'am'}, {'three', 'two'}]

Method #3: Using sort() + index() + split()

# Reconstructed from the method name and the output shown below;
# the variable names are assumptions.

# Initialize the list of strings.
test_list = ['gfg, best, gfg', 'I, am, I', 'two, two, three']

res = []
for strs in test_list:
    words = strs.split(", ")
    # Drop duplicates, then restore first-occurrence order with index().
    unique = list(set(words))
    unique.sort(key=words.index)
    res.append(" ".join(unique))

print(" ".join(res))

Output

gfg best I am two three
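The index()-based approach searches the word list again for every word it sorts, which makes it quadratic in the number of words per string. Since dictionaries preserve insertion order (guaranteed from Python 3.7), dict.fromkeys offers a linear-time way to drop duplicates while keeping the first occurrence of each word. A sketch under the same assumed input; the helper name dedupe_words is hypothetical:

```python
test_list = ['gfg, best, gfg', 'I, am, I', 'two, two, three']


def dedupe_words(text, sep=", "):
    """Remove repeated words from one string, keeping first-occurrence order."""
    # dict keys are unique and keep insertion order, so duplicates collapse.
    return list(dict.fromkeys(text.split(sep)))


res = [" ".join(dedupe_words(strs)) for strs in test_list]
print(" ".join(res))
```

For short strings like these the difference is negligible; the dictionary-based version mainly pays off on long inputs.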
