Python Sets: Unordered, Unique Collections
Sets are a fundamental data structure in Python, offering a powerful way to manage collections of unique items. Unlike lists or tuples, sets do not maintain any specific order, and each element within a set must be unique. This makes them ideal for tasks involving membership testing, removing duplicates, and performing mathematical set operations like union, intersection, and difference.
Creating Sets
You can create sets in Python using curly braces {} or the set() constructor. An empty set must be created with set(), because empty curly braces {} create an empty dictionary.
Example of creating sets:
    # Using curly braces
    my_set = {1, 2, 3, 4, 5}
    print(my_set)

    # With duplicates (duplicates are removed)
    duplicate_set = {1, 2, 2, 3, 3, 3}
    print(duplicate_set)

    # Using the set() constructor with a list
    list_data = [10, 20, 30, 20, 10]
    set_from_list = set(list_data)
    print(set_from_list)
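One pitfall with the two creation styles is worth a quick sketch: empty curly braces produce a dictionary, so an empty set needs the constructor. The variable names below are illustrative only.

```python
# Empty curly braces create a dict, not a set
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# set() accepts any iterable; a string is split into its characters
letters = set("hello")
print(len(letters))  # 4 unique characters: h, e, l, o

# A set comprehension builds a set from an expression
squares = {n * n for n in [-2, -1, 0, 1, 2]}
print(sorted(squares))  # [0, 1, 4]
```

The results are sorted before printing because a set has no guaranteed display order.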
Key Set Operations
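Sets support operations that mirror mathematical set theory. Each is available both as an operator and as a method, and because sets are hash-based, membership tests take constant time on average, which keeps these operations efficient for data manipulation, especially in data science contexts.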
Operation | Syntax (Operator) | Syntax (Method) | Description |
---|---|---|---|
Union | A \| B | A.union(B) | Returns a new set with elements from both sets. |
Intersection | A & B | A.intersection(B) | Returns a new set with elements common to both sets. |
Difference | A - B | A.difference(B) | Returns a new set with elements in A but not in B. |
Symmetric Difference | A ^ B | A.symmetric_difference(B) | Returns a new set with elements in either A or B, but not both. |
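The four operations in the table can be tried on two small sets. This is a minimal sketch with made-up values for A and B; it also shows one practical difference between the two syntaxes, namely that the method forms accept any iterable while the operators need a set on both sides.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

print(sorted(A | B))  # [1, 2, 3, 4, 5, 6]
print(sorted(A & B))  # [3, 4]
print(sorted(A - B))  # [1, 2]
print(sorted(A ^ B))  # [1, 2, 5, 6]

# The method forms return the same results as the operators
same_results = (
    A.union(B) == A | B
    and A.intersection(B) == A & B
    and A.difference(B) == A - B
    and A.symmetric_difference(B) == A ^ B
)
print(same_results)  # True

# Methods accept any iterable, such as a list
union_with_list = A.union([7, 8])
print(sorted(union_with_list))  # [1, 2, 3, 4, 7, 8]

# Operators require sets on both sides; a list raises TypeError
try:
    A | [7, 8]
    operator_rejected_list = False
except TypeError:
    operator_rejected_list = True
print(operator_rejected_list)  # True
```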
Modifying Sets
Sets are mutable, meaning you can add or remove elements. Methods like add(), update(), remove(), discard(), and pop() are used for this purpose.
What is the difference between remove() and discard() when removing an element from a set? remove() raises a KeyError if the element is not found, while discard() does nothing if the element is not present.
Example of modifying sets:
    my_set = {1, 2, 3}

    # Add an element
    my_set.add(4)
    print(my_set)  # Output: {1, 2, 3, 4}

    # Add multiple elements from an iterable
    my_set.update([5, 6, 3])
    print(my_set)  # Output: {1, 2, 3, 4, 5, 6}

    # Remove an element (will raise error if not present)
    my_set.remove(5)
    print(my_set)  # Output: {1, 2, 3, 4, 6}

    # Discard an element (no error if not present)
    my_set.discard(7)
    print(my_set)  # Output: {1, 2, 3, 4, 6}

    # Remove and return an arbitrary element
    removed_element = my_set.pop()
    print(removed_element)  # e.g., 1
    print(my_set)  # e.g., {2, 3, 4, 6}
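The error behavior of the removal methods is easiest to see when the element is actually missing. The following sketch uses a hypothetical colors set; it also shows that pop() raises a KeyError when the set is empty.

```python
colors = {"red", "green"}

# discard() is silent when the element is missing
colors.discard("blue")

# remove() raises KeyError when the element is missing
try:
    colors.remove("blue")
except KeyError as exc:
    missing = exc.args[0]
    print(f"remove() failed, missing element: {missing}")

# pop() on an empty set also raises KeyError
empty = set()
try:
    empty.pop()
    pop_failed = False
except KeyError:
    pop_failed = True
print(pop_failed)  # True
```

A reasonable rule of thumb: use remove() when a missing element indicates a bug you want to hear about, and discard() when absence is an expected case.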
Sets for Data Science and AI
In data science and AI, sets are invaluable for tasks such as:
- Finding unique values: Quickly identify distinct categories or features in a dataset.
- Data cleaning: Efficiently remove duplicate records or entries.
- Feature engineering: Create new features based on set operations (e.g., finding common attributes between two groups of data).
- Algorithm implementation: Express set-theoretic computations directly, such as measuring the overlap between two groups of items.
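A few of the tasks listed above can be sketched with plain sets. All data here is invented for illustration, and the jaccard helper is a hypothetical function written for this example (Jaccard similarity is one common set-based overlap measure, chosen here as an assumption, not something the list above prescribes).

```python
# Hypothetical sample data for illustration
raw_labels = ["cat", "dog", "cat", "bird", "dog"]

# Finding unique values
unique_labels = set(raw_labels)
print(sorted(unique_labels))  # ['bird', 'cat', 'dog']

# Data cleaning: drop duplicates while keeping first-seen order,
# using a set to track what has already been seen
seen = set()
deduplicated = []
for label in raw_labels:
    if label not in seen:
        seen.add(label)
        deduplicated.append(label)
print(deduplicated)  # ['cat', 'dog', 'bird']

# Feature engineering: attributes common to two groups
group_a = {"age", "income", "region"}
group_b = {"income", "region", "tenure"}
shared = group_a & group_b
print(sorted(shared))  # ['income', 'region']


def jaccard(a, b):
    """Return |a & b| / |a | b|, treating two empty sets as identical."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


print(jaccard(group_a, group_b))  # 0.5
```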
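Remember that set elements must be hashable, which in practice means immutable values such as numbers, strings, and tuples whose items are themselves hashable. You cannot have mutable types like lists, dictionaries, or other sets as elements within a set; frozenset provides an immutable set that can be used as an element.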
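The restriction on element types can be checked directly. This is a small sketch with illustrative names; it shows a tuple being accepted, a list being rejected with a TypeError, and frozenset (the built-in immutable set type) used to nest sets.

```python
# Tuples of hashable items are valid elements
points = {(0, 0), (1, 2)}
points.add((1, 2))  # already present, so the set is unchanged
print(len(points))  # 2

# A list is unhashable, so building the set raises TypeError
try:
    bad = {[1, 2], [3, 4]}
    list_rejected = False
except TypeError:
    list_rejected = True
print(list_rejected)  # True

# frozenset is immutable and hashable, so it can be an element
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))  # 2, because frozenset({1, 2}) == frozenset({2, 1})
```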
Visualizing Set Operations: Imagine two overlapping circles representing sets A and B. The union (A | B) is the total area covered by either circle. The intersection (A & B) is the overlapping area. The difference (A - B) is the part of circle A that does not overlap with B. The symmetric difference (A ^ B) is the area covered by exactly one of the circles, excluding their overlap.
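The two-circle picture can be checked in code by computing the three regions of the diagram separately. A minimal sketch, with arbitrary example values for A and B:

```python
A = {"a", "b", "c"}
B = {"b", "c", "d"}

only_a = A - B   # part of circle A outside the overlap
only_b = B - A   # part of circle B outside the overlap
overlap = A & B  # the overlapping area

# The three regions do not share any elements
regions_disjoint = (
    only_a.isdisjoint(overlap)
    and only_b.isdisjoint(overlap)
    and only_a.isdisjoint(only_b)
)
print(regions_disjoint)  # True

# Together the three regions make up the union
print((only_a | overlap | only_b) == (A | B))  # True

# The symmetric difference is the union minus the overlap
print((A ^ B) == ((A | B) - (A & B)))  # True
print((A ^ B) == (only_a | only_b))    # True
```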