Learn about Data structures: Lists, Tuples, Dictionaries, Sets as part of Python Data Science and Machine Learning

Python Data Structures: Lists, Tuples, Dictionaries, and Sets

In Python, data structures are fundamental building blocks for organizing and manipulating data. For data science and machine learning, understanding these structures is crucial for efficient data handling, analysis, and algorithm implementation. We will explore four core Python data structures: Lists, Tuples, Dictionaries, and Sets.

Lists: Ordered, Mutable Collections

Lists are ordered, mutable sequences of items. This means they maintain the order of elements and can be changed after creation (elements can be added, removed, or modified). Lists are defined using square brackets [].

Lists are versatile, ordered, and changeable collections of items.

Lists are created with square brackets [] and can hold items of different data types. Their order is preserved, and you can modify them after creation.

Lists are one of the most commonly used data structures in Python. They are defined by enclosing comma-separated values within square brackets. For example, my_list = [1, 'hello', 3.14, True]. Lists are zero-indexed, meaning the first element is at index 0. You can access elements using their index, slice lists to get sub-lists, and perform operations like appending, inserting, removing, and sorting elements. Their mutability makes them ideal for scenarios where data needs to be dynamically updated.
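The operations named in the paragraph above can be sketched as follows. The starting value of my_list comes from the text; the values added and the scores list are illustrative assumptions.

```python
# Mixed-type list from the text; index 0 is the first element.
my_list = [1, 'hello', 3.14, True]
first = my_list[0]            # first element
last = my_list[-1]            # negative indices count from the end
middle = my_list[1:3]         # slice: the stop index is exclusive

# Lists are mutable, so these calls change my_list in place.
my_list.append('new')         # add to the end
my_list.insert(1, 'second')   # insert before index 1
my_list.remove(3.14)          # remove the first matching value
my_list[0] = 100              # overwrite an element by index

# In-place sorting needs mutually comparable items,
# so this uses a separate, purely numeric list.
scores = [3, 1, 2]
scores.sort()
```

Note that append, insert, remove, and sort all modify the list itself and return None, so their results should not be assigned back to the variable.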

What are the two primary characteristics of Python lists?

Ordered and mutable.

Tuples: Ordered, Immutable Collections

Tuples are similar to lists in that they are ordered collections of items. However, tuples are immutable, meaning once a tuple is created, its contents cannot be changed. Tuples are defined using parentheses ().

Tuples are fixed, ordered collections of items.

Tuples are created with parentheses () and, unlike lists, cannot be modified after creation. This immutability makes them suitable for data that should not change.

Tuples are created by enclosing comma-separated values within parentheses. For instance, my_tuple = (10, 'world', False). Like lists, tuples are zero-indexed and support slicing. However, any attempt to assign to an element (e.g., my_tuple[0] = 5) raises a TypeError. Tuples are often used for returning multiple values from a function, or for representing fixed collections of related data, such as coordinates or database records. A tuple whose elements are all hashable is itself hashable, so it can be used as a dictionary key; a tuple that contains a mutable object such as a list cannot.
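A short sketch of the tuple behaviours described above. The value of my_tuple comes from the text; the min_max helper and the labels dictionary are hypothetical names introduced only for illustration.

```python
my_tuple = (10, 'world', False)
head = my_tuple[0]        # indexing works as with lists
tail = my_tuple[1:]       # slicing a tuple returns a new tuple

# Item assignment on a tuple raises TypeError.
try:
    my_tuple[0] = 5
    modified = True
except TypeError:
    modified = False

# Returning several values from a function packs them into one tuple.
def min_max(values):
    """Return the smallest and largest item of a non-empty sequence."""
    return min(values), max(values)

lowest, highest = min_max([4, 9, 1])   # tuple unpacking

# A tuple of hashable items can serve as a dictionary key.
# The coordinates and labels are made-up sample data.
labels = {(0, 0): 'origin', (1, 2): 'point A'}
origin_label = labels[(0, 0)]
```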

Feature    | List                                             | Tuple
Mutability | Mutable (can be changed)                         | Immutable (cannot be changed)
Syntax     | Square brackets []                               | Parentheses ()
Use Cases  | Dynamic data, collections that need modification | Fixed data, function return values, dictionary keys

Dictionaries: Mutable Key-Value Pairs

Dictionaries are mutable collections of key-value pairs, written with curly braces {}. Since Python 3.7 they preserve insertion order as a language guarantee; in earlier versions the order was not guaranteed. Each key must be unique and hashable (in practice an immutable value such as a string, a number, or a tuple of hashable items), while values can be of any data type.

Dictionaries store data as key-value pairs for efficient lookups.

Dictionaries use curly braces {} and map unique keys to values. They are ideal for retrieving data quickly using its associated key.

Dictionaries are defined with keys and values separated by a colon :, with each pair separated by a comma, all enclosed in curly braces. For example, my_dict = {'name': 'Alice', 'age': 30, 'city': 'New York'}. You access values by their keys, like my_dict['name'] which returns 'Alice'. Dictionaries are highly efficient for searching, adding, and removing items based on their keys. They are widely used to represent structured data, configuration settings, and mappings.
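The dictionary operations described above can be sketched like this. The contents of my_dict come from the text; the 'country' and 'email' keys and their values are illustrative assumptions.

```python
my_dict = {'name': 'Alice', 'age': 30, 'city': 'New York'}
name = my_dict['name']                 # look up a value by key

# Square brackets raise KeyError for a missing key;
# get() returns a default instead.
country = my_dict.get('country', 'unknown')

# Dictionaries are mutable: update, add, and delete entries in place.
my_dict['age'] = 31                    # overwrite an existing key
my_dict['email'] = 'alice@example.com' # add a new key (hypothetical value)
del my_dict['city']                    # remove a key

has_city = 'city' in my_dict           # membership tests check keys

# items() yields (key, value) pairs in insertion order (Python 3.7+).
pairs = list(my_dict.items())
```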

What is the primary way to access data in a Python dictionary?

Using its unique key.

Sets: Unordered, Mutable Collections of Unique Elements

Sets are unordered collections of unique elements. They are mutable, meaning you can add or remove elements, but they do not allow duplicate values. Sets are defined using curly braces {} or the set() constructor.

Sets store unique items and are useful for membership testing and mathematical set operations.

Sets use curly braces {} (or set()) and automatically discard duplicates. They are optimized for checking if an item exists and for operations like union, intersection, and difference.

A set literal looks like a dictionary literal without the values: my_set = {1, 2, 3, 3, 4} produces the set {1, 2, 3, 4}, because the duplicate 3 is discarded. If you need an empty set, you must use set(), as {} creates an empty dictionary. Sets are highly efficient for membership testing (checking if an element is present) and for performing set operations like union (|), intersection (&), difference (-), and symmetric difference (^). They are well suited to tasks like removing duplicates from a list or finding common elements between collections.
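A brief sketch of the set behaviour described above. The value of my_set comes from the text; the sets a and b and the readings list are made-up sample data.

```python
my_set = {1, 2, 3, 3, 4}      # the duplicate 3 is discarded
empty_set = set()             # {} would create an empty dict instead
empty_dict = {}

# Sets are mutable: add and discard change the set in place.
my_set.add(5)
my_set.discard(1)             # discard does nothing if the item is absent
has_three = 3 in my_set       # membership test

# Mathematical set operations via operators.
a = {1, 2, 3}
b = {3, 4, 5}
union = a | b                 # items in either set
intersection = a & b          # items in both sets
difference = a - b            # items in a but not in b
symmetric = a ^ b             # items in exactly one of the sets

# Removing duplicates from a list; sets are unordered,
# so sort the result if a stable order matters.
readings = [2, 7, 2, 9, 7]
unique_sorted = sorted(set(readings))
```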

Visualizing the core differences: Lists are like ordered shopping lists where you can add or cross off items. Tuples are like a fixed recipe card – the ingredients and steps are set. Dictionaries are like a phone book, where you look up a name (key) to find a number (value). Sets are like a bag of unique marbles, where duplicates are automatically removed, and you can easily check if you have a specific marble.

Choosing the Right Data Structure

The choice of data structure depends on the specific requirements of your task:

  • Lists: Use when you need an ordered collection that might change.
  • Tuples: Use for fixed, ordered collections, or when you need hashable items (like dictionary keys).
  • Dictionaries: Use when you need to associate keys with values for fast lookups.
  • Sets: Use when you need to store unique items and perform set operations or fast membership tests.
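As a rough illustration of the guidelines above, the following sketch uses all four structures on one small task. The rows data and every variable name are hypothetical; this is one reasonable arrangement, not the only one.

```python
# Hypothetical sample records: each row is a fixed (city, temperature)
# pair, so a tuple fits; the collection of rows may grow, so it is a list.
rows = [('Oslo', 4.0), ('Cairo', 31.5), ('Oslo', 6.0), ('Lima', 19.0)]

# List: temperatures in their original order.
temps = [temp for _, temp in rows]

# Set: the distinct cities, with fast membership tests.
cities = {city for city, _ in rows}

# Dictionary: look up all temperatures for a city by key.
by_city = {}
for city, temp in rows:
    by_city.setdefault(city, []).append(temp)

mean_oslo = sum(by_city['Oslo']) / len(by_city['Oslo'])
```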

Understanding these fundamental data structures is a cornerstone of effective Python programming for data science. Mastering their properties will significantly enhance your ability to write efficient and readable code.
