Sets in Python

Python has some great data types that make it a pleasure to use, but sets are often overlooked. Sets are a powerful tool for writing orderly code in which each object is unique within a collection, while still working with the efficiency and speed of Python’s dictionaries.

Basics

Sets have syntax that looks a lot like that of dictionaries:

my_set = {1,2,3,4}

The similarity between sets and dictionaries is no coincidence. From the outside, sets can be thought of as dictionaries that have keys but no values. In fact, the code behind sets shares a lot in common with the code behind dictionaries.

You can also create a set with the set() built-in, which takes any iterable:

my_set = set([1,2,3,4])

Set members can be of any hashable type: simply put, any object whose hash value is guaranteed not to change over its lifetime. Numbers, strings, and (by default) instances of user-defined classes are good examples. Mutable containers such as lists and dictionaries are not hashable.

Redundant members, however, are removed automatically, with the member defined first taking priority. For instance, if we defined my_set as {1,2,3,2,4,5}, the result would be {1,2,3,4,5}.
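As a small sketch of the "first member wins" rule, consider 1 and 1.0: they compare equal and share a hash, so a set treats them as a single member and keeps whichever was written first. The variable names below are only illustrative, and the printed results reflect current CPython behavior.

```python
# 1 and 1.0 are equal and hash the same, so only one of them is kept.
int_first = {1, 1.0}
print(int_first)             # {1}

float_first = {1.0, 1}
print(float_first)           # {1.0}

# The duplicate 2 in the article's example collapses the same way.
my_set = {1, 2, 3, 2, 4, 5}
print(len(my_set))           # 5
```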

How to Use Python Sets

list_1 = [1,2,3,4,3,4,2,4,5,3]
set_1 = set(list_1)
# yields {1,2,3,4,5}

Converting a list to a set, as above, strips out the duplicates in one step, which saves you from writing code to hunt for them by hand. This works for any iterable. If you do this with a string, for instance, you’ll get a set that contains all the unique characters in the string:

s1="Hello there"
set(s1)
# yields {' ', 'r', 'l', 't', 'e', 'h', 'o', 'H'} (sets are unordered, so the order may differ)

All the objects need to be hashable, however, or you will get a TypeError. Passing an object to a set’s .add() method is a quick way to test whether that object is hashable. When in doubt, test it out.
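One way to "test it out" is sketched below. The helper name is_hashable is hypothetical, not something from the standard library; it simply tries .add() on a throwaway set and reports whether a TypeError was raised.

```python
def is_hashable(obj):
    """Return True if obj can be stored in a set (hypothetical helper)."""
    try:
        set().add(obj)      # .add() has to hash obj before storing it
    except TypeError:
        return False
    return True

print(is_hashable("text"))      # True
print(is_hashable((1, 2)))      # True
print(is_hashable([1, 2]))      # False: lists are mutable
print(is_hashable((1, [2])))    # False: the tuple contains a list
```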

Another common use for sets is to quickly test whether a small collection of objects is present within a larger collection, or vice versa, using the superset/subset methods described below. This pays off most when the larger of the two collections can be converted to a set once and then tested against many times; for a single check, the cost of converting a list to a set may outweigh the gain from the faster lookup. In general, testing membership in a set is faster than iterating through a list and comparing each object manually.
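The "convert once, test many times" pattern might look like the sketch below. The data (known_ids, incoming) is made up purely for illustration.

```python
# Hypothetical data: a large list of known IDs and a batch of IDs to check.
known_ids = list(range(100_000))
incoming = [5, 99_999, 250_000, -1]

# Pay the conversion cost once...
known_set = set(known_ids)

# ...then reuse the set for every lookup.
matches = [i for i in incoming if i in known_set]
print(matches)   # [5, 99999]
```

For a single lookup, checking the list directly avoids building the set at all.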

Adding and removing members of Python sets

Adding and removing members from sets is as easy as using the .add() and .remove() methods. For example, my_set.add(3) would update my_set to include 3, and my_set.remove(3) would remove 3 if it were present.

If you try to .remove() something from a set that isn’t there, you’ll get a KeyError, the same as if you try to reference a key in a dictionary that doesn’t exist. To remove something without the risk of raising an error if it isn’t there, use .discard() instead of .remove().
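A minimal sketch of the difference between the two methods:

```python
my_set = {1, 2, 3}

my_set.discard(10)            # 10 is absent: nothing happens, no error

try:
    my_set.remove(10)         # 10 is absent: raises KeyError
except KeyError as err:
    print("KeyError:", err)   # KeyError: 10

my_set.remove(3)              # 3 is present, so this succeeds
print(my_set == {1, 2})       # True
```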

To drop all elements from a set, you can use .clear(), or reassign the variable to an empty set:

my_set = set()
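The two approaches are not quite interchangeable when more than one name refers to the same set, as the sketch below tries to show (the names original and alias are illustrative). Note also that {} creates an empty dictionary, not an empty set, which is why set() is used here.

```python
original = {1, 2, 3}
alias = original          # second name for the same set object

original.clear()          # empties the shared object in place
print(alias)              # set()

original = {1, 2, 3}
alias = original
original = set()          # rebinds the name; the old set is untouched
print(alias == {1, 2, 3}) # True

print(type({}).__name__)  # dict
```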

Unions and intersections with Python sets

Sets support a number of operations where you take two or more sets and generate new ones from them. A union of two sets combines the two into a single set, removing any duplicates:

set_1 = {1,2,3}
set_2 = {4,5,6}
set_3 = set_1.union(set_2)
# yields {1,2,3,4,5,6}

You can also use the pipe operator to perform a union:

set_3 = set_1 | set_2

Again, this is a nifty way to merge multiple collections of items while filtering out duplicates.
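One difference between the two spellings is worth a quick sketch: the .union() method accepts any number of iterables, while the | operator only works between sets. The list named extra is just an example.

```python
set_1 = {1, 2, 3}
set_2 = {3, 4}
extra = [4, 5, 5]                    # a plain list, duplicates included

combined = set_1.union(set_2, extra)
print(combined == {1, 2, 3, 4, 5})   # True

try:
    set_1 | extra                    # the operator insists on sets
except TypeError as err:
    print("TypeError:", err)
```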

Intersection is another operation that generates a new set, this time from only the elements common to multiple sets:

set_1 = {1,2,3}
set_2 = {2,3,4}
set_3 = set_1.intersection(set_2)
# yields {2,3}

The & operator can also be used to intersect two sets:

set_3 = set_1 & set_2

Many set operations can be expressed with operators, which we’ll illustrate below.
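Intersections also extend to more than two collections. Here is a short sketch using made-up tag data; the method form takes any iterables, while the operator form needs every operand to be a set.

```python
# Hypothetical tag collections for three articles.
tags_a = {"python", "sets", "tutorial"}
tags_b = {"python", "sets", "performance"}
tags_c = ["sets", "python", "python"]      # a list, with a duplicate

shared = tags_a.intersection(tags_b, tags_c)
print(shared == {"python", "sets"})        # True

# Operator version: convert the list to a set first.
print(tags_a & tags_b & set(tags_c) == shared)   # True
```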

Differences with Python sets

To find the members of one set that are not in another, you can use the difference() method:

set_1 = {1,2,3}
set_2 = {4,5,6}
set_3 = set_1.difference(set_2)
# yields {1,2,3}
set_3 = set_1 - set_2
# different way to express same operation

By contrast, if we used set_3 = set_2.difference(set_1), the results would be {4,5,6}.
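The example above uses two sets with nothing in common, so each difference simply returns the first set unchanged. A sketch with overlapping sets may make the effect easier to see:

```python
set_1 = {1, 2, 3, 4}
set_2 = {3, 4, 5, 6}

# Members of set_1 that are not in set_2:
print(set_1 - set_2 == {1, 2})              # True

# Swapping the operands gives a different answer:
print(set_2.difference(set_1) == {5, 6})    # True
```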

The symmetric difference operation returns the elements that are in one set or the other, but not both:

set_1 = {1,2,3,4}
set_2 = {4,5,6,7}
set_3 = set_1.symmetric_difference(set_2)
# yields {1, 2, 3, 5, 6, 7}
set_3 = set_1 ^ set_2
# operator version
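Symmetric difference can be rebuilt from the operations covered earlier, which is one way to check your understanding of it:

```python
set_1 = {1, 2, 3, 4}
set_2 = {4, 5, 6, 7}

sym = set_1 ^ set_2

# Everything in either set, minus what they share:
print(sym == (set_1 | set_2) - (set_1 & set_2))   # True

# Or the two one-sided differences joined together:
print(sym == (set_1 - set_2) | (set_2 - set_1))   # True
```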

Supersets and subsets in Python

You’re probably familiar by now with Python’s in operator, which you can use to search for the presence of a character in a string or an object in a list. It is also useful with sets:

set_1 = {1,2,3,4}
1 in set_1 # this is True
5 in set_1 # this is False

What if you wanted to test for the presence of all the elements of one set inside another set? You can’t use in for that — Python will think you’re testing for the presence of the entire set object, not its individual elements. Fortunately, Python does provide ways to check such things with other set methods:

set_1 = {1,2,3,4}
set_2 = {1,2}

# Tests if members of set_2 are in set_1:
set_2.issubset(set_1)
# Operator version:
set_2 <= set_1

# Tests if set_1 contains all members of set_2:
set_1.issuperset(set_2)
# Operator version:
set_1 >= set_2
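A few edge cases, sketched with the same two sets: the strict operators < and > test for a proper subset or superset, and the method forms accept any iterable.

```python
set_1 = {1, 2, 3, 4}
set_2 = {1, 2}

print(set_2 <= set_1)      # True: every member of set_2 is in set_1
print(set_2 < set_1)       # True: a subset, and the two sets are not equal
print(set_1 <= set_1)      # True: a set is a subset of itself
print(set_1 < set_1)       # False: but not a proper subset of itself

print(set_2 in set_1)      # False: asks whether the set itself is a member

print(set_2.issubset([1, 2, 3]))   # True: the method accepts a list
```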

Set updates in Python

Up until now we’ve only explored how to generate new sets from unions, intersections, or differences of existing sets. Python also lets you update a set in place with the same operations:

# In-place union of set_1 with set_2:
set_1 |= set_2

# In-place intersection of set_1 with set_2:
set_1 &= set_2

# In-place difference of set_1 with set_2:
set_1 -= set_2

# In-place symmetric difference of set_1 with set_2:
set_1 ^= set_2

In-place updates are handy when you’re dealing with a very large set, and you don’t want to create an entirely new instance of the set. Instead, you can make the changes directly to the existing set, which is more efficient.
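To see that an in-place update really does reuse the existing object, you can keep a second reference to the set and compare identities. This is only a sketch; the name alias is illustrative.

```python
set_1 = {1, 2, 3}
set_2 = {3, 4}
alias = set_1                    # second reference to the same object

set_1 |= set_2                   # in-place union
print(set_1 is alias)            # True: still the same object
print(alias == {1, 2, 3, 4})     # True: the change shows through the alias

set_1 = set_1 | {5}              # builds a new set, then rebinds the name
print(set_1 is alias)            # False
print(alias == {1, 2, 3, 4})     # True: the old object was left alone
```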

Frozen Sets in Python

I mentioned before how sets can only be made of things that are hashable. Since sets are mutable, they can’t themselves be used as set elements or dictionary keys. But there is a variety of set called the frozen set that isn’t mutable, and so can be used as a set element, as a dictionary key, or in any other context where you need a hashable type.

To create a frozen set, just use frozenset() to generate one from an existing set or iterable:

set_1 = {1,2,3,4}
f_set = frozenset(set_1)
set_2 = {f_set,2,3,4}

Note that once you create a frozen set, it can’t be altered. The .add() and .remove() methods won’t work on a frozen set, and neither will the in-place update methods such as .update() or .intersection_update(). You can still use a frozen set to generate unions, intersections, or differences, because those operations return a new object instead of changing the original.
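Here is a sketch of a frozen set used as a dictionary key. The scenario (a distance stored per unordered pair of names) and the names pair and distances are invented for illustration.

```python
# Hypothetical example: store one value per unordered pair of names.
pair = frozenset({"ana", "ben"})
distances = {pair: 12.5}

# The lookup works whichever order the names are given in.
print(distances[frozenset(["ben", "ana"])])   # 12.5

try:
    pair.add("cara")                          # frozen sets have no .add()
except AttributeError as err:
    print("AttributeError:", err)

# Non-mutating operations return a new frozenset.
bigger = pair | {"cara"}
print(type(bigger).__name__)                  # frozenset
print(len(pair), len(bigger))                 # 2 3
```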

Carlos Fernandez

Full Stack Developer with a background in Natural Resource Management and Leadership
