An in-place methodThis method is quadratic, because we have a linear lookup into the list for every element of the list (to that we have to add the cost of rearranging the list because of the Show
That said, it is possible to operate in place if we start from the end of the list and proceed toward the origin removing each term that is present in the sub-list at its left This idea in code is simply
A simple test of the implementation Removing duplicates from a list is pretty simple. You can do it with a Python one-liner: >>> initial = [1, 1, 9, 1, 9, 6, 9, 7] >>> result = list(set(initial)) >>> result [1, 7, 9, 6] Python set elements have to be unique so converting a list into a set and back again achieves the desired result. What if the original order of the list is important though? That makes things a bit more complicated because sets are unordered, so once you’ve finished the conversion the order of the list will be lost. Fortunately, there are several ways to overcome this issue. In this article we’ll look at a range of different solutions to the problem and consider their relative merits.
Method 1 – For LoopA basic way to achieve the required result is with a for loop: >>> initial = [1, 1, 9, 1, 9, 6, 9, 7] >>> result = [] >>> for item in initial: if item not in result: result.append(item) >>> result [1, 9, 6, 7]
That might not be a problem with this simple example, but the time overhead will become increasingly evident if the list gets very large. Method 2 – List ComprehensionHow to Remove Duplicates From a Python List? One alternative is to use a list comprehension: >>> initial = [1, 1, 9, 1, 9, 6, 9, 7] >>> result = [] >>> [result.append(item) for item in initial if item not in result] [None, None, None, None] >>> result [1, 9, 6, 7] List comprehensions are handy and very powerful Python tools that enable you to combine variables, for loops and if statements. They make it possible to create a list with a single line of code (but you can split them into multiple lines to improve readability too!). Although shorter and still fairly clear, using a list comprehension in this instance is not a very good idea. That’s because it takes the same inefficient approach to membership testing that we saw in Method 1. It also relies on the side effects of the comprehension to build the result list, which many consider to be bad practice. To explain further, even if it’s not assigned to a variable for later use, a list comprehension still creates a list object. So, in the process of
appending items from the initial list to the Python functions return the value [None, None, None, None] A for loop is clearer and does not rely on side effects so is the better method of the two on this occasion. Method 3 – Sorted SetWe can’t simply convert our list to a set to remove duplicates if we want to preserve order. However, using this approach in conjunction with the sorted function is another potential way forward: >>> initial = [1, 1, 9, 1, 9, 6, 9, 7] >>> result = sorted(set(initial), key=initial.index) >>> result [1, 9, 6, 7]
The problem is that although it’s pretty easy to understand it’s not much faster than the basic for loop shown in Method 1. Method 4 – Dictionary fromkeys()A seriously quick approach is to use a dictionary: >>> initial = [1, 1, 9, 1, 9, 6, 9, 7] >>> result = list(dict.fromkeys(initial)) >>> result [1, 9, 6, 7]
Python dictionary keys are unique by default so converting our list into a dictionary will remove duplicates automatically. The Once this has been done with our initial list, converting the dictionary back to a list gives the result we’re looking for. Dictionaries only became ordered in all python implementations when Python 3.7 was released (this was also an implementation detail of CPython 3.6). So, if you’re using an older version of Python, you will need to import the >>> from collections import OrderedDict >>> initial = [1, 1, 9, 1, 9, 6, 9, 7] >>> result = list(OrderedDict.fromkeys(initial)) >>> result [1, 9, 6, 7] This approach might not be as fast as using a standard dictionary, but it’s still very speedy! Exercise: Run the code. Does it work? Method 5 – more-itertoolsUp to this point, we’ve only looked at lists containing immutable items. But what if your list contains mutable data types such as lists, sets or dictionaries? It’s still possible to use the basic for loop shown in Method 1, but that won’t cut the mustard if speed is of the essence. Also, if we try to use A great answer to this conundrum comes in the form of a library called more-itertools. It’s not part of the Python standard library so you’ll need to pip install it. With that done, you can import and use its >>> from more_itertools import unique_everseen >>> mutables = [[1, 2, 3], [2, 3, 4], [1, 2, 3]] >>> result = list(unique_everseen(mutables)) >>> result [[1, 2, 3], [2, 3, 4]] The library The function The function also provides a way to remove duplicates even more quickly from a list of lists: ... >>> result = list(unique_everseen(mutables, key=tuple)) >>> result [[1, 2, 3], [2, 3, 4]] This works well because it converts the unhashable lists into hashable tuples to speed things up further. If you want to apply this trick to a list of sets, you can use frozenset as the key: ... >>> mutables = [{1, 2, 3}, {2, 3, 4}, {1, 2, 3}] >>> result = list(unique_everseen(mutables, key=frozenset)) >>> result [{1, 2, 3}, {2, 3, 4}] Specifying a key with a list of dictionaries is a little more complicated, but can still be achieved with the help of a lambda function: ... >>> mutables = [{'one': 1}, {'two': 2}, {'one': 1}] >>> result = list( unique_everseen(mutables, key=lambda x: frozenset(x.items())) ) >>> result [{'one': 1}, {'two': 2}] The function Method 6 – NumPy unique()If you’re working with numerical data, the third-party library numpy is an option too: >>> import numpy as np >>> initial = np.array([1, 1, 9, 1, 9, 6, 9, 7]) >>> _, idx = np.unique(initial, return_index=True) >>> result = initial[np.sort(idx)] >>> result [1 9 6 7] The index values of the unique items can be stored by using the These can then be passed to Technically this method could be applied to a standard list by first converting it into a numpy array and then converting it back to list format at the end. However, this would be an overcomplicated and inefficient way of achieving the result. Using these kinds of techniques only really makes sense if you are also utilizing some of numpy’s powerful features for other reasons. Method 7 – pandas unique()Another third-party library we could use is pandas: >>> import pandas as pd >>> initial = pd.Series([1, 1, 9, 1, 9, 6, 9, 7]) >>> result = pd.unique(initial) >>> result [1 9 6 7]
As with the numpy method, it would be perfectly possible to convert the result to a standard list at the end. Again though, unless you’re employing the amazing data analysis tools provided by pandas for another purpose, there is no obvious reason to choose this approach over the even faster option utilizing Python’s built-in dictionary data type (Method 4). SummaryAs we’ve seen, there are a wide range of ways to solve this problem and the decision about which one to select should be driven by your particular circumstances. If you’re writing a quick script and your list isn’t huge, you may opt to use a simple for loop for the sake of clarity. However, if efficiency is a factor and your lists don’t contain mutable items then going with Alternatively, if you’re using an older version of Python, If you need to work with lists that contain mutable items, importing more-itertools so you can take advantage of the brilliant Lastly, if you’re doing some serious number crunching with numpy or manipulating data with pandas, it would probably be wise to go with the methods built into those tools for this purpose. The choice is of course yours, and I hope this article has provided some useful insights that will help you pick the right approach for the job at hand. How do you remove duplicates from a list without changing the order in Python?Python List: Remove Duplicates and Keep the Order. Method 1 – For Loop.. Method 2 – List Comprehension.. Method 3 – Sorted Set.. Method 4 – Dictionary fromkeys(). Method 5 – more-itertools.. Method 6 – NumPy unique(). Method 7 – pandas unique(). Summary.. How do you remove duplicates from a list whilst preserving order?Preserving Order: Use OrderedDict
fromkeys(list) to obtain a dictionary having duplicate elements removed, while still maintaining order. We can then easily convert it into a list using the list() method. NOTE: If you have Python 3.7 or later, we can use the built in dict. fromkeys(list) instead.
How do you remove duplicates from a consecutive list in Python?Using the groupby function, we can group the together occurring elements as one and can remove all the duplicates in succession and just let one element be in the list. This function can be used to keep the element and delete the successive elements with the use of slicing.
How do I remove multiple duplicates from a list in Python?5 Ways to Remove Duplicates from a List in Python. Method 1: Naïve Method.. Method 2: Using a list comprehensive.. Method 3: Using set(). Method 4: Using list comprehensive + enumerate(). Method 5: Using collections. OrderedDict. fromkeys(). |