Other Key Databases: UniProt, PDB

Exploring Key Biological Databases: UniProt and PDB

Beyond the foundational sequence databases, a rich ecosystem of specialized biological databases provides critical information about proteins, structures, and their functions. This module delves into two of the most influential: UniProt for protein sequences and functional annotations, and the Protein Data Bank (PDB) for 3D structural data.

UniProt: The Universal Protein Resource

UniProt is a comprehensive, high-quality, and freely accessible resource of protein sequence and functional information. It is a collaboration between the European Bioinformatics Institute (EMBL-EBI), the Swiss Institute of Bioinformatics (SIB), and the Protein Information Resource (PIR). UniProt aims to provide a central hub for protein-centric data, integrating information from various sources.

UniProt is the go-to resource for detailed protein information, including sequences, functions, and related data.

UniProt offers curated entries with extensive functional annotations, cross-references to other databases, and experimental data. It's essential for understanding a protein's role in biological systems.

The UniProt Knowledgebase (UniProtKB) is the core of the UniProt resource. It consists of two main sections: Swiss-Prot, which contains reviewed, manually annotated entries with a high level of detail, and TrEMBL, which contains unreviewed, automatically annotated entries and supplements Swiss-Prot. UniProt entries include information on protein names, synonyms, sequence, post-translational modifications, active sites, domains, protein families, biological pathways, and interactions. Extensive cross-referencing lets users follow links to related data in other databases, such as PDB, GO (Gene Ontology), and InterPro.
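The Swiss-Prot/TrEMBL split is visible in UniProtKB FASTA headers, where the identifier starts with `sp` for Swiss-Prot and `tr` for TrEMBL. The following is a minimal sketch of reading that prefix; the function and class names are hypothetical, and the second header is a made-up example rather than a real entry.

```python
from typing import NamedTuple


class UniProtHeader(NamedTuple):
    section: str      # "Swiss-Prot" (reviewed) or "TrEMBL" (unreviewed)
    accession: str    # e.g. "P69905"
    entry_name: str   # e.g. "HBA_HUMAN"


# Prefixes used at the start of UniProtKB FASTA identifiers.
SECTION_BY_PREFIX = {"sp": "Swiss-Prot", "tr": "TrEMBL"}


def parse_uniprot_fasta_header(header: str) -> UniProtHeader:
    """Split a header such as '>sp|P69905|HBA_HUMAN ...' into its parts."""
    identifier = header.lstrip(">").split(" ", 1)[0]
    fields = identifier.split("|")
    if len(fields) != 3 or fields[0] not in SECTION_BY_PREFIX:
        raise ValueError(f"not a UniProtKB FASTA header: {header!r}")
    prefix, accession, entry_name = fields
    return UniProtHeader(SECTION_BY_PREFIX[prefix], accession, entry_name)


reviewed = parse_uniprot_fasta_header(
    ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens"
)
# Made-up header, used only to illustrate the 'tr' prefix.
unreviewed = parse_uniprot_fasta_header(
    ">tr|A0A0A0AAA0|A0A0A0AAA0_9ZZZZ Made-up unreviewed protein"
)

print(reviewed.section, reviewed.accession)
print(unreviewed.section, unreviewed.accession)
```

Checking the section this way is a quick filter when a pipeline should use only manually reviewed entries.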

What are the two main sections of the UniProt Knowledgebase (UniProtKB)?

Swiss-Prot (manually curated) and TrEMBL (automatically annotated).

The Protein Data Bank (PDB): A Repository of 3D Structures

The Protein Data Bank (PDB) is the single global archive of experimentally determined three-dimensional structures of large biological molecules, such as proteins and nucleic acids. These structures are determined by experimental methods like X-ray crystallography, NMR spectroscopy, and cryo-electron microscopy.

The PDB provides atomic-level insight into how biological molecules work. Each PDB entry, identified by a unique 4-character ID such as 1CRN, contains 3D coordinates for the atoms resolved in the experiment, allowing researchers to visualize the molecule's shape, locate active sites, and see how it interacts with other molecules. This structural information is crucial for understanding protein function, designing drugs, and engineering new proteins. Visualizing these complex 3D structures also helps in understanding concepts like protein folding, enzyme mechanisms, and molecular recognition.
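To make "coordinates for each atom" concrete, here is a minimal sketch that reads `ATOM` records in the legacy fixed-column PDB text format and measures the distance between two atoms. The three sample lines are illustrative values laid out like real records, and the `Atom` class and helper names are hypothetical. Real files contain many more record types, and large structures are distributed in mmCIF format instead.

```python
import math
from typing import List, NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    residue: str
    chain: str
    residue_number: int
    x: float
    y: float
    z: float


# Illustrative ATOM records in the fixed-column PDB layout (coordinates in angstroms).
SAMPLE_PDB_TEXT = """\
ATOM      1  N   THR A   1      17.047  14.099   3.625  1.00 13.79           N
ATOM      2  CA  THR A   1      16.967  12.784   4.338  1.00 10.80           C
ATOM      3  C   THR A   1      15.685  12.755   5.133  1.00  9.19           C
"""


def parse_atoms(pdb_text: str) -> List[Atom]:
    """Extract ATOM records using the column positions of the PDB format."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK, and every other record type
        atoms.append(Atom(
            serial=int(line[6:11]),           # columns 7-11
            name=line[12:16].strip(),         # columns 13-16
            residue=line[17:20].strip(),      # columns 18-20
            chain=line[21],                   # column 22
            residue_number=int(line[22:26]),  # columns 23-26
            x=float(line[30:38]),             # columns 31-38
            y=float(line[38:46]),             # columns 39-46
            z=float(line[46:54]),             # columns 47-54
        ))
    return atoms


def distance(a: Atom, b: Atom) -> float:
    """Euclidean distance between two atoms, in angstroms."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))


atoms = parse_atoms(SAMPLE_PDB_TEXT)
n_to_ca = distance(atoms[0], atoms[1])
print(f"{atoms[0].name}-{atoms[1].name} distance: {n_to_ca:.2f} A")
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields in PDB files can run together without a separating space.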

Understanding the relationship between a protein's sequence (from UniProt) and its 3D structure (from PDB) is a cornerstone of modern bioinformatics. For example, a mutation in a protein sequence might alter its 3D conformation, leading to a loss of function or a change in its interaction partners. The PDB is essential for structure-based drug discovery, protein engineering, and understanding disease mechanisms at a molecular level.

What experimental methods are commonly used to determine structures deposited in the PDB?

X-ray crystallography, NMR spectroscopy, and cryo-electron microscopy.

Connecting UniProt and PDB

UniProt entries often contain direct links to corresponding PDB entries. This integration allows researchers to seamlessly move from a protein's sequence and functional annotation to its experimentally determined 3D structure. This linkage is vital for a holistic understanding of protein biology.
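In the UniProtKB flat-file (text) format, these links appear as `DR` lines whose first field names the target database. The sketch below collects the PDB cross-references from such text. The sample entry is made up for illustration (the IDs, resolution, and chain ranges are not from a live record), and the function name is hypothetical.

```python
from typing import Dict, List, Optional

# Made-up excerpt laid out like UniProtKB flat-file DR lines.
SAMPLE_FLAT_FILE = """\
DR   PDB; 1ABC; X-ray; 1.80 A; A/B=1-120.
DR   PDB; 2XYZ; NMR; -; A=1-120.
DR   PDBsum; 1ABC; -.
DR   InterPro; IPR000000; Example_domain.
"""


def pdb_cross_references(flat_file_text: str) -> List[Dict[str, Optional[str]]]:
    """Return one dict per 'DR   PDB;' line found in the entry text."""
    refs = []
    for line in flat_file_text.splitlines():
        # Match 'PDB;' exactly so that 'PDBsum;' lines are not picked up.
        if not line.startswith("DR   PDB;"):
            continue
        fields = [f.strip() for f in line[5:].rstrip(".").split(";")]
        if len(fields) != 5:
            continue  # unexpected layout; skip rather than guess
        _, pdb_id, method, resolution, chains = fields
        refs.append({
            "pdb_id": pdb_id,
            "method": method,
            # A '-' means no resolution is reported (typical for NMR).
            "resolution": None if resolution == "-" else resolution,
            "chains": chains,
        })
    return refs


refs = pdb_cross_references(SAMPLE_FLAT_FILE)
for ref in refs:
    print(ref["pdb_id"], ref["method"], ref["resolution"])
```

The resulting IDs can then be used to look up the matching structures in the PDB.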

| Feature | UniProt | PDB |
| --- | --- | --- |
| Primary Data Type | Protein Sequences & Functional Annotations | 3D Atomic Coordinates of Macromolecules |
| Key Information | Sequence, function, modifications, pathways, interactions | Molecular shape, atomic positions, experimental method |
| Main Purpose | Understanding protein function and biology | Understanding protein structure and molecular interactions |
| Data Source | Literature curation, computational analysis | Experimental determination (X-ray, NMR, Cryo-EM) |

Think of UniProt as the protein's biography, detailing its life story and roles, while PDB is its detailed physical blueprint, showing its exact shape and how it's built.
