Learn several methods to remove duplicate values from a Python list

In today's article, you will learn several different ways to remove duplicate values from a Python list. We will consider two types of scenarios: a sorted list and an unsorted list.
Let's get started!

Table of Contents:
· Removing Duplicates From a Sorted List
· Removing Duplicates From an Unsorted List

Removing Duplicates From a Sorted List

We have an advantage if the list is sorted: duplicates always appear next to each other, so we only need to compare each value with its neighbor. Consider the list below:

lst = [1, 1, 2, 3, 4, 4, 4, 4, 5, 5, 7, 11, 11, 11, 21, 21]

To remove duplicates from this list, we loop through the entire list, compare each element with the one next to it, and store the unique elements in another list. After one pass, the new list holds each value once, with the duplicates removed.

Removing Duplicates From an Unsorted List

If the list is not sorted, we can’t compare neighboring values, because duplicates may appear anywhere in the list. In that case, several other methods are available.
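Both scenarios can be sketched in a few lines. This is a minimal illustration rather than the article's original listing, and the function names `dedupe_sorted` and `dedupe_unsorted` are hypothetical. The first applies the neighbor comparison to the sorted list above; the second checks membership in the result list, which also works when duplicates are scattered.

```python
def dedupe_sorted(sorted_values):
    """Remove duplicates from a sorted list by comparing neighbors."""
    unique = []
    for value in sorted_values:
        # In sorted input, a duplicate always equals the last value kept.
        if not unique or value != unique[-1]:
            unique.append(value)
    return unique


def dedupe_unsorted(values):
    """Remove duplicates from any list, keeping first occurrences."""
    unique = []
    for value in values:
        # The membership test scans the result list, so this is O(n^2) overall.
        if value not in unique:
            unique.append(value)
    return unique


lst = [1, 1, 2, 3, 4, 4, 4, 4, 5, 5, 7, 11, 11, 11, 21, 21]
print(dedupe_sorted(lst))                 # [1, 2, 3, 4, 5, 7, 11, 21]
print(dedupe_unsorted([4, 1, 4, 21, 1]))  # [4, 1, 21]
```

The sorted version does a single pass with constant-time comparisons, which is the advantage of sorted input; the unsorted version trades speed for generality.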
Using for loop

The approach is to insert elements into a temporary list one by one. Before inserting a value, we check whether it is already in the temporary list. If it is, we do not insert it.

Using list comprehension, we can do the same in fewer lines of code.

Using set

Sets in Python have a special characteristic: they only contain unique values. If we try to insert a duplicate into a set, it is discarded automatically. So the trick is to copy the given list into a set, which removes the duplicates, and then copy the values of the set back into a list. We can also do this in one line.

Using OrderedDict

We can also remove duplicates from the given list by using an OrderedDict.

Using Numpy
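A minimal sketch for this section, assuming `numpy.unique` is the function intended (that choice is an assumption, not confirmed by the text). `np.unique` returns the distinct values in sorted order, so the original order is lost. The `OrderedDict.fromkeys` line is included only as an order-preserving contrast, and the input list is hypothetical.

```python
from collections import OrderedDict

import numpy as np

lst = [21, 4, 1, 4, 11, 1, 21]  # hypothetical unsorted input

# np.unique returns the sorted distinct values as an array.
unique_sorted = np.unique(lst).tolist()

# OrderedDict.fromkeys keeps the first occurrence of each value, in order.
unique_in_order = list(OrderedDict.fromkeys(lst))

print(unique_sorted)    # [1, 4, 11, 21]
print(unique_in_order)  # [21, 4, 1, 11]
```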
Using Pandas
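A minimal sketch for this section, assuming the list is wrapped in a `pandas.Series` and cleaned with `drop_duplicates` (an assumed choice; other pandas calls would also work). The input list is hypothetical. This approach keeps the first occurrence of each value, so the original order is preserved.

```python
import pandas as pd

lst = [21, 4, 1, 4, 11, 1, 21]  # hypothetical unsorted input

# drop_duplicates keeps the first occurrence of each value by default.
unique_in_order = pd.Series(lst).drop_duplicates().tolist()

print(unique_in_order)  # [21, 4, 1, 11]
```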
In today's article, I discussed different ways to remove duplicates from a Python list. If you look closely, you will see that some methods preserve the order of the original list and others do not. Take set, for example: a set is unordered, so if you use one to remove duplicates from a list, the original order is not guaranteed to survive. So you need to decide which method suits your needs.

And that’s it for today. I hope you find it helpful. Thanks for reading.

I have a df that looks like the following:
Sorted by time event ascending, I'd like to delete any back-to-back (w/r/t time_event) repeat occurrences where the screen, user_id, and time_install are the same.

asked May 9, 2021 at 3:45
To keep the earliest time_event, you can first sort the df by time_event and then use keep='first' in drop_duplicates().
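A minimal sketch of those two steps, with the assumptions labeled: the sample rows are invented, `sort_values` is assumed as the sorting call, and the `subset` columns are taken from the question. Note that `drop_duplicates` removes every later repeat of a key, not only back-to-back ones, so a `shift`-based variant is shown for the strictly consecutive case; that variant is not part of the answer.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 1],
    "screen": ["home", "home", "cart", "home"],
    "time_install": ["2021-01-01"] * 4,
    "time_event": pd.to_datetime([
        "2021-05-01 10:00", "2021-05-01 10:05",
        "2021-05-01 10:10", "2021-05-01 10:15",
    ]),
})

keys = ["screen", "user_id", "time_install"]

# Sort by time_event, then keep the earliest row for each key combination.
df_sorted = df.sort_values("time_event")
earliest = df_sorted.drop_duplicates(subset=keys, keep="first")

# Variant for strictly back-to-back repeats: drop a row only when every key
# column equals the value in the immediately preceding row.
same_as_previous = (df_sorted[keys] == df_sorted[keys].shift()).all(axis=1)
consecutive_only = df_sorted[~same_as_previous]

print(earliest["screen"].tolist())          # ['home', 'cart']
print(consecutive_only["screen"].tolist())  # ['home', 'cart', 'home']
```

In the sample, the final "home" row survives the `shift` variant because a "cart" event sits between it and the earlier "home" rows, whereas `drop_duplicates` discards it.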
answered May 9, 2021 at 4:04
Shubham Periwal