
Manage Pipelines with Terraform

Note

This guide assumes that you are comfortable using Terraform. To get familiar with Terraform, this tutorial is a good place to start.

Caution
The Terraform provider is currently in preview

The provider is in preview and therefore likely contains bugs and rough edges. Resource syntax is also likely to change without notice.

You can use Etleap’s Terraform provider to manage your connections, pipelines, dbt schedules, models and teams.

Why manage Etleap with Terraform?

While managing Etleap connections and pipelines via the UI or the API is often sufficient, using Terraform helps with the following use cases:

  1. Simplified management of multiple pipelines. With our Terraform provider you can easily manage a large number of pipelines with minimal code duplication.

  2. Duplicating pipelines. Simplify the deployment of similar pipelines to multiple environments (e.g. staging vs. production) by having a unified definition for both environments.

  3. Detecting configuration drift. Terraform not only creates resources; it also detects configuration changes that were applied manually, without your having to check the UI or query the API. It can also help bring your pipelines back in sync if changes were made accidentally.

  4. Understanding relationships. With Terraform, you can easily identify dependencies between connections, pipelines, models, and dbt schedules (see the commands sketched after this list).
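For example, points 3 and 4 map onto standard Terraform commands; the sketch below is plain Terraform CLI usage, with nothing Etleap-specific, and the output filename is illustrative:

terraform plan -detailed-exitcode   # exit code 2 means changes were detected, i.e. configuration drift
terraform graph > pipelines.dot     # renders resource dependencies in DOT format

The graph output can be visualized with any DOT-compatible tool to see how connections, pipelines, and models relate.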

This guide walks you through how to set up the provider and create an example Postgres pipeline.

For detailed information about available resources and properties, please consult our provider documentation.

Step 1. Set Up the Provider

In order to set up the provider, please use the following snippet:

terraform {
  required_providers {
    etleap = {
      source  = "etleap/etleap"
      version = ">= 0.1.3"
    }
  }
}

You can find the latest version of the provider here.

To initialize the provider for your deployments, please use the following snippet:

provider "etleap" { server_url = "https://<your-local-deployment-hostname>/api/v2" username = "<access_key>" password = "<secret_key>" }
Note

If you’re accessing Etleap at app.etleap.com, omit the server_url argument.

In order to generate an access key and a secret key for your user, please follow these steps.
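If you would rather not commit the keys in the provider block, one option is to pass them in as sensitive Terraform variables. This is plain Terraform rather than anything Etleap-specific, and the variable names below are illustrative:

variable "etleap_access_key" {
  type      = string
  sensitive = true
}

variable "etleap_secret_key" {
  type      = string
  sensitive = true
}

provider "etleap" {
  # Values can be supplied via terraform.tfvars or the TF_VAR_etleap_access_key
  # and TF_VAR_etleap_secret_key environment variables.
  username = var.etleap_access_key
  password = var.etleap_secret_key
}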

Step 2. Create or Import a Connection

Specify a Connection

To specify a Postgres connection, please follow the example below:

resource "etleap_connection_postgres" "test_postgres" { type = "POSTGRES" name = "Postgres source" address = "postgres.example.com" port = "5432" username = "user" password = "pass" database = "example" schema = "public" lifecycle { ignore_changes = [ password ] } }

Once the connection is created, you can override the password field with a random value (e.g. <redacted>) without affecting the connection, since the lifecycle block tells Terraform to ignore changes to it. This prevents you from checking a secret into version control.
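For example, after the first apply the connection definition can be rewritten as follows. Because password is listed under ignore_changes, Terraform will not attempt to push the placeholder value to Etleap; this is a sketch of the pattern described above:

resource "etleap_connection_postgres" "test_postgres" {
  type     = "POSTGRES"
  name     = "Postgres source"
  address  = "postgres.example.com"
  port     = "5432"
  username = "user"
  password = "<redacted>" # placeholder; the real secret was provided at creation time
  database = "example"
  schema   = "public"

  lifecycle {
    ignore_changes = [
      password
    ]
  }
}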

Import an Existing Connection

To import an existing connection you will need the connection’s ID, which you can get from the URL, e.g. https://<deployment-url>/#/connections/edit/<ID>/POSTGRES/<name>/schedules

Note

For Standard deployments, the deployment hostname is app.etleap.com or app.eu.etleap.com, depending on the region your account is in.

In order to import the resource defined above, you’ll need to run:

terraform import etleap_connection_postgres.test_postgres <ID>
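After the import, it’s a good idea to run a plan and confirm that your configuration matches the imported state; an empty plan means the resource definition above is in sync with the connection in Etleap:

terraform plan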

Step 3. Create or Import a Pipeline

Specify a Pipeline

In order to specify a pipeline via Terraform, please follow the example below.

resource "etleap_pipeline" "postgres_test" { name = "TF Postgres Test" source = { postgres = { type = "POSTGRES" connection_id = etleap_connection_postgres.test_postgres.id schema = "public" table = "payments" primary_key_columns = ["id"] last_updated_column = "updated_at" } } destination = { redshift = { type = "REDSHIFT" connection_id = "ZnKPEhPr" schema = "public" table = "payments" automatic_schema_changes = true } } }

Import an Existing Pipeline

In order to import an existing pipeline, you’ll need to use its ID. This can be retrieved from the URL: https://<deployment-url>/#/pipeline/<ID>/overview

Note

For Standard deployments, the deployment hostname is app.etleap.com or app.eu.etleap.com, depending on the region your account is in.

To import the resource defined above, run:

terraform import etleap_pipeline.postgres_test <ID>

Managing Multiple Pipelines

To reduce code duplication, you can iterate over defined variables.

Example 1: Pipelines with Uniform Primary Key and Update Timestamp

locals {
  tables = toset(["table1", "table2"])
}

resource "etleap_pipeline" "postgres_test" {
  for_each = local.tables

  name = "Postgres ${each.value}"

  source = {
    postgres = {
      type                = "POSTGRES"
      connection_id       = etleap_connection_postgres.test_postgres.id
      schema              = "public"
      table               = each.value
      primary_key_columns = ["id"]
      last_updated_column = "updated_at"
    }
  }

  destination = {
    redshift = {
      type                     = "REDSHIFT"
      connection_id            = "ZnKPEhPr"
      schema                   = "public"
      table                    = each.value
      automatic_schema_changes = true
    }
  }
}

The local.tables variable contains the set of tables to create pipelines for.

To import an existing pipeline for a table, run:

terraform import 'etleap_pipeline.postgres_test["<table_name>"]' <ID>

Example 2: Different Primary Keys and/or Update Timestamps

locals {
  tables = {
    "table1" = {
      "primary_keys" = ["id"]
      # No update timestamp for this table; an append-only pipeline will be created.
      "last_updated" = null
    }
    "table2" = {
      "primary_keys" = ["id"]
      "last_updated" = "updated_time"
    }
  }
}

resource "etleap_pipeline" "postgres_test" {
  for_each = local.tables

  name = "Postgres ${each.key}"

  source = {
    postgres = {
      type                = "POSTGRES"
      connection_id       = etleap_connection_postgres.test_postgres.id
      schema              = "public"
      table               = each.key
      primary_key_columns = each.value.primary_keys
      last_updated_column = each.value.last_updated
    }
  }

  destination = {
    redshift = {
      type                     = "REDSHIFT"
      connection_id            = "ZnKPEhPr"
      schema                   = "public"
      table                    = each.key
      automatic_schema_changes = true
    }
  }
}

As before, importing a resource for an existing table is done via:

terraform import 'etleap_pipeline.postgres_test["<table_name>"]' <ID>

Known Issues

  1. After importing an existing pipeline into Terraform, the subsequent plan might show a remove operation on the primary_keys property of the destination. It is safe to apply this suggested change: it has no effect on the pipeline and is due to an inconsistency when importing the resource.