Data Sources

Learn about Data Sources as part of Terraform Infrastructure as Code Mastery

Terraform Data Sources: Accessing Existing Infrastructure

Terraform's power lies not only in provisioning new infrastructure but also in interacting with existing resources. Data sources are a fundamental mechanism for this interaction, allowing your Terraform configurations to read information about infrastructure that was provisioned outside of your current Terraform configuration or by a different Terraform configuration.

What are Terraform Data Sources?

Data sources enable Terraform to fetch information about infrastructure resources that are not managed by the current Terraform configuration. This is crucial for several reasons: referencing existing resources, discovering dynamic values, and integrating with infrastructure managed by other teams or tools.

Data sources allow Terraform to read information about existing infrastructure.

Think of data sources as Terraform's way of asking questions about your environment. They let you query for details about resources that Terraform didn't create itself, like an existing VPC, a specific AMI ID, or the current state of a database.

Data sources are declared in Terraform configuration files using the data block. They specify the type of resource to query and arguments to filter or identify the specific resource. Once retrieved, the attributes of the data source can be referenced in your Terraform configuration, just like attributes of resources you manage.
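As a minimal sketch of this syntax, assume the AWS provider is already configured and that a VPC tagged `Name = "shared-vpc"` exists. The tag value, the local names, and the subnet arithmetic are illustrative assumptions, not part of the original material.

```hcl
# Look up an existing VPC by tag. Terraform reads it but does not manage it.
data "aws_vpc" "selected" {
  tags = {
    Name = "shared-vpc" # hypothetical tag value
  }
}

# Attributes are referenced as data.<TYPE>.<NAME>.<ATTRIBUTE>.
resource "aws_subnet" "example" {
  vpc_id     = data.aws_vpc.selected.id
  cidr_block = cidrsubnet(data.aws_vpc.selected.cidr_block, 4, 1)
}
```

The `data.` prefix is what distinguishes a data source reference from a managed resource reference such as `aws_subnet.example.id`.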

Common Use Cases for Data Sources

Data sources are incredibly versatile. Here are some common scenarios where they shine:

<ul><li><b>Referencing existing infrastructure:</b> Querying for an existing VPC, subnet, or security group to attach new resources to.</li><li><b>Discovering dynamic values:</b> Fetching the latest Amazon Machine Image (AMI) ID for a specific operating system or the public IP address of an existing server.</li><li><b>Cross-configuration dependencies:</b> Accessing information from infrastructure managed by a different Terraform state or workspace.</li><li><b>Provider-specific information:</b> Retrieving details about cloud provider regions, availability zones, or service endpoints.</li></ul>
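Two of these scenarios can be sketched together. The example below assumes an S3 backend holding another configuration's state and assumes that configuration exports an output named `vpc_id`. The bucket, key, region, and CIDR values are hypothetical placeholders.

```hcl
# Provider-specific information: availability zones usable in the configured region.
data "aws_availability_zones" "available" {
  state = "available"
}

# Cross-configuration dependency: read the outputs of another configuration's state.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "example-terraform-state"   # hypothetical bucket name
    key    = "network/terraform.tfstate" # hypothetical state key
    region = "us-east-1"                 # hypothetical region
  }
}

resource "aws_subnet" "app" {
  # Assumes the other configuration declares an output named "vpc_id".
  vpc_id            = data.terraform_remote_state.network.outputs.vpc_id
  cidr_block        = "10.0.1.0/24" # illustrative value
  availability_zone = data.aws_availability_zones.available.names[0]
}
```

Only root-level outputs of the other configuration are exposed through `terraform_remote_state`, so the producing team has to declare an output for every value consumers need.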

How Data Sources Work

During planning, Terraform reads each data source whose arguments are already known by querying the relevant infrastructure provider (e.g., AWS, Azure, GCP). If a data source's arguments depend on values that are not known until apply, such as the ID of a resource that has not been created yet, Terraform defers the read to the apply phase. If a data source cannot find the requested resource or encounters an error, Terraform halts the operation. Once the data is successfully retrieved, Terraform uses these values to configure the resources it manages.
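The timing of a read depends on whether the data source's arguments are known yet. The sketch below is an assumed scenario, not taken from the original text: it looks up the default security group of a VPC that the same configuration creates. On the first run the VPC ID is unknown during planning, so Terraform can only perform the lookup during apply.

```hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16" # illustrative value
}

# vpc_id is unknown until the VPC exists, so on the first run
# this data source is read during apply rather than during plan.
data "aws_security_group" "default" {
  name   = "default"
  vpc_id = aws_vpc.main.id
}

output "default_security_group_id" {
  value = data.aws_security_group.default.id
}
```

On later runs the VPC ID is already in state, so the same data source can be read during planning.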

What is the primary purpose of a Terraform data source?

To read information about existing infrastructure that is not managed by the current Terraform configuration.

Example: Fetching an AWS AMI ID

Let's illustrate with a common example: finding the latest Amazon Linux 2 AMI ID for use in an EC2 instance. This avoids hardcoding AMI IDs, which can quickly become outdated.

The `data` block defines a data source. Its first label is the data source type (e.g., `aws_ami`) and its second label is a local name (e.g., `amazon_linux`). Setting `most_recent = true` selects the newest image when several match, and `owners` restricts the search to images published by Amazon. Each `filter` block narrows the results: here, the image name must match the pattern `amzn2-ami-hvm-*-x86_64-gp2` and the virtualization type must be `hvm`. The `aws_instance` resource then references the `id` attribute of the retrieved AMI.

In this example:

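```hcl
# Find the newest Amazon Linux 2 AMI published by Amazon.
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  # Match on the AMI name; the wildcard covers the release version.
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

# Launch an instance from the AMI returned by the data source.
resource "aws_instance" "example" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t2.micro"

  tags = {
    Name = "HelloWorld"
  }
}
```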

Terraform will look up the most recent Amazon Linux 2 AMI and use its ID to launch the `aws_instance` resource.
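To see which AMI the lookup selected, one option is to expose the ID through an output. This is a small addition to the example above, and the output name is an arbitrary choice.

```hcl
# Print the AMI ID chosen by the data source after plan or apply.
output "amazon_linux_ami_id" {
  description = "ID of the AMI selected by the aws_ami data source"
  value       = data.aws_ami.amazon_linux.id
}
```

Because `most_recent = true` can resolve to a newer image over time, checking this output in a plan is one way to notice when the instance would be replaced.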

Key Attributes and Considerations

When using data sources, remember:

<ul><li><b>Provider-Specific:</b> Data sources are tied to specific Terraform providers. You'll use `aws_vpc` for AWS, `azurerm_resource_group` for Azure, etc.</li><li><b>Read-Only:</b> Data sources do not modify infrastructure; they only retrieve information.</li><li><b>Dependencies:</b> Terraform automatically creates implicit dependencies. If a resource depends on a data source, Terraform will fetch the data before attempting to create the resource.</li><li><b>State Management:</b> Data sources are not tracked in the Terraform state file as managed resources. Terraform records the values it read, but it queries the provider again on each plan, or during apply when an argument is not known until then.</li></ul>
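The read-only nature is easiest to see with a data source that takes no arguments. The sketch below assumes a configured AWS provider and uses `aws_caller_identity`, which reports the credentials Terraform is running with and changes nothing. The local name and output names are arbitrary.

```hcl
# Reads the identity behind the current AWS credentials; nothing is created.
data "aws_caller_identity" "current" {}

output "account_id" {
  value = data.aws_caller_identity.current.account_id
}

output "caller_arn" {
  value = data.aws_caller_identity.current.arn
}
```

Running `terraform destroy` against a configuration containing only data sources and outputs removes no infrastructure, since there are no managed resources in it.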

Data sources are essential for building robust and dynamic Terraform configurations that adapt to your existing environment.

Summary

Terraform data sources are a powerful feature that allows your infrastructure code to be aware of and interact with existing resources. By fetching information about infrastructure not managed by the current configuration, you can create more flexible, dynamic, and integrated infrastructure deployments.
