Removing Duplicates in Python Lists

Learn how to efficiently remove duplicates from lists in Python using various methods, including sets, dictionaries, and more. …

Updated July 21, 2023

Lists are a core data structure in most Python programs. Sometimes a list contains duplicate values that need to be removed so that each value appears only once. This article walks through several techniques for removing duplicates from Python lists.

What Are Duplicates in Lists?

Duplicates refer to identical elements present multiple times within a list. For example:

my_list = [1, 2, 3, 2, 4, 5, 5]

In this case, the numbers 2 and 5 each appear twice in the list.
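Before removing anything, it can help to see which values actually repeat. The following is a minimal sketch using collections.Counter from the standard library; the variable names counts and duplicates are chosen here for illustration only.

```python
from collections import Counter

my_list = [1, 2, 3, 2, 4, 5, 5]

# Count how many times each value occurs in the list
counts = Counter(my_list)

# Keep only the values that occur more than once
duplicates = [value for value, count in counts.items() if count > 1]

print(duplicates)  # Output: [2, 5]
```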

Step-by-Step Explanation: Removing Duplicates Using Sets

One of the most efficient ways to remove duplicates from a list is by converting it into a set. A set in Python is an unordered collection of unique elements:

my_list = [1, 2, 3, 2, 4, 5, 5]

# Convert the list to a set (removing duplicates)
unique_values = set(my_list)

print(unique_values)  # Output: {1, 2, 3, 4, 5}

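However, keep in mind that a set is unordered and does not support indexing, so the original order of the list is not guaranteed to survive the conversion. The result is also a set rather than a list; call list() on it if you need a list again.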
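If a list is needed afterwards, the set can be converted back. The sketch below assumes the items are hashable and sortable; it uses sorted() to get a predictable order, which is not necessarily the original order of the list.

```python
my_list = [3, 1, 2, 3, 1]

# Round-trip through a set to get a list back; the resulting order is not guaranteed
unique_list = list(set(my_list))

# Sort the unique values when a predictable order matters
sorted_unique = sorted(set(my_list))

print(sorted_unique)  # Output: [1, 2, 3]
```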

Step-by-Step Explanation: Removing Duplicates Using Dicts

You can use a dictionary to remove duplicates from a list while preserving the original order, because dictionaries keep their keys in insertion order (guaranteed since Python 3.7). The key idea is to create an empty dictionary and then iterate through your list, adding each value as a key with a placeholder value of None:

my_list = [1, 2, 3, 2, 4, 5, 5]

# Create an empty dictionary to store unique values
unique_dict = {}

for item in my_list:
    # If the item is not already in the dict, add it
    if item not in unique_dict:
        unique_dict[item] = None

# Convert the keys back into a list (order preserved)
unique_values = list(unique_dict.keys())

print(unique_values)  # Output: [1, 2, 3, 4, 5]
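The same idea can be written more compactly with the built-in dict.fromkeys() class method. This is a sketch of an equivalent one-liner, assuming Python 3.7 or later so that key order is guaranteed; the sample list of strings is made up here to make the preserved order visible.

```python
my_list = ["b", "a", "b", "c", "a"]

# dict.fromkeys builds a dict whose keys are the list items (values default to None);
# repeated items collapse into a single key that stays at its first position
unique_values = list(dict.fromkeys(my_list))

print(unique_values)  # Output: ['b', 'a', 'c']
```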

Step-by-Step Explanation: Using a Loop and a Second List

If you want to stick with using lists only and don’t mind sacrificing some efficiency for simplicity, consider the following:

my_list = [1, 2, 3, 2, 4, 5, 5]

# Remove duplicates by iterating through the list and adding each value if it's not already there
unique_values = []
for item in my_list:
    if item not in unique_values:
        unique_values.append(item)

print(unique_values)  # Output: [1, 2, 3, 4, 5]

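This approach is less efficient than the set or dictionary methods for large lists, because every `in` check scans the result list from the start. It is still intuitive and straightforward to understand for smaller datasets, and it also works when the items are not hashable (for example, a list of lists).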
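An order-preserving version can also be written as an actual list comprehension. The sketch below relies on a helper set, named seen here purely for illustration, and assumes the items are hashable; it trades some readability for faster membership checks.

```python
my_list = [1, 2, 3, 2, 4, 5, 5]

seen = set()  # values encountered so far (illustrative helper name)

# seen.add() returns None, so for a new item the `or` branch records it
# and the whole condition stays falsy, letting the item through exactly once
unique_values = [item for item in my_list if not (item in seen or seen.add(item))]

print(unique_values)  # Output: [1, 2, 3, 4, 5]
```

The side effect inside the condition is easy to misread, so the explicit loop may be the better choice when clarity matters more than brevity.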

Conclusion