Glossary

B

Batch

A subset of data that the pipeline processes as a single unit. See How a Pipeline Works.

C

Catch-up Phase

The initial phase of pipeline processing, during which the pipeline ingests all existing data from the source as quickly as possible. See How a Pipeline Works.

Change Data Capture

A method of extracting data from database sources, usually by reading changes from a log on the source instance. This can be explicitly enabled on some database source connections in Etleap. The pipelines that ingest from those connections will capture all changes, including deletes, from the source tables by reading from these logs, instead of using SQL queries to extract data.
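A minimal sketch of the idea behind log-based extraction: the destination state is kept current by replaying change events (inserts, updates, and deletes) rather than re-querying the source. The event shape and field names here are illustrative, not Etleap's actual log format.

```python
# Hypothetical CDC sketch: apply change-log events to a table's state,
# keyed by a primary key "id". Event format is illustrative only.

def apply_change(table: dict, event: dict) -> None:
    """Apply one change event to the in-memory table state."""
    op = event["op"]
    row = event["row"]
    if op in ("insert", "update"):
        table[row["id"]] = row      # upsert the changed row
    elif op == "delete":
        table.pop(row["id"], None)  # deletes are captured too

table = {}
events = [
    {"op": "insert", "row": {"id": 1, "name": "a"}},
    {"op": "update", "row": {"id": 1, "name": "b"}},
    {"op": "delete", "row": {"id": 1}},
]
for event in events:
    apply_change(table, event)

print(table)  # {} — the delete removed the row
```

Note how the delete is visible to the pipeline; a query-based extraction that only selects current rows would miss it.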

D

Destination

A destination is a data warehouse or lake where Etleap saves your data after it passes through a pipeline. Etleap currently supports three destinations:

  • Amazon Redshift
  • Amazon S3, optionally with schema information in AWS Glue
  • Snowflake

E

Exhaustive Activity

An extraction, transformation, or load for a pipeline that processed all remaining data available in the source.

Extraction

The part of pipeline processing that fetches data from the source. See How a Pipeline Works.

End-to-End Latency

How up-to-date the destination is relative to the source. See Latency.

H

High Watermark

A high watermark identifies which data is included in a batch. See How a Pipeline Works.
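As an illustration, incremental extraction can be sketched as selecting rows newer than the last watermark and then advancing the watermark. The column and variable names below are assumptions for the example, not Etleap specifics.

```python
# Minimal high-watermark sketch, assuming rows carry a monotonically
# increasing "updated_at" value. Names are illustrative.

def extract_batch(rows, watermark):
    """Return rows newer than the watermark, plus the new high watermark."""
    batch = [r for r in rows if r["updated_at"] > watermark]
    new_watermark = max((r["updated_at"] for r in batch), default=watermark)
    return batch, new_watermark

rows = [
    {"id": 1, "updated_at": 10},
    {"id": 2, "updated_at": 20},
    {"id": 3, "updated_at": 30},
]
batch, wm = extract_batch(rows, watermark=10)
print(len(batch), wm)  # 2 30
```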

I

Incremental Phase

The phase of pipeline processing during which the pipeline ingests only the data that was updated since the previous batch. See How a Pipeline Works.

L

Latency

Latency is how old the data in the destination is compared to the source for a pipeline.
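Conceptually, latency can be pictured as the gap between the newest data in the source and the newest data in the destination. The timestamps below are made up for illustration.

```python
# Illustrative latency sketch: the age of destination data relative
# to the source, using hypothetical timestamps.
from datetime import datetime

source_latest = datetime(2024, 1, 1, 12, 0)       # newest data at the source
destination_latest = datetime(2024, 1, 1, 9, 30)  # newest data loaded so far

latency = source_latest - destination_latest
print(latency)  # 2:30:00
```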

Latent

A classification for pipelines whose current latency is more than four hours past their update schedule.

Load

The part of pipeline processing that adds the batch data to the destination table. See How a Pipeline Works.

M

Model

A model is a way of persisting the results of an arbitrary query in a data warehouse: a materialized view, in database terms. In Etleap, a model can be updated incrementally, and its update frequency can be configured in the UI. See Introduction to Models.

Model Dependency

Model Dependencies are models or pipelines that “feed into” your Etleap model. Etleap allows updating the model when a dependency changes, and also ensures the model query is updated if a dependent table is renamed. See Model Dependencies.

P

Parsing Errors

Rows in your pipeline that are invalid because the source data does not match either the type defined for the column or the operation applied to it.

Pipeline

A data pipeline is a process that regularly moves data between a source and a destination. It may or may not include transformations that change the data while moving it.

R

Refresh

A refresh is when Etleap either re-extracts all the data from the source or reprocesses the previously extracted data for a pipeline.

S

Script

A script is a sequence of transformations that are applied to the data in a data pipeline.
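The idea of a script as an ordered sequence of transformation steps can be sketched as follows; the two example steps are hypothetical, not built-in Etleap transforms.

```python
# Illustrative sketch: a "script" as an ordered list of transformation
# steps applied to each row. The steps shown are made up for the example.

def uppercase_name(row):
    """Step 1: upper-case the 'name' field."""
    return {**row, "name": row["name"].upper()}

def drop_email(row):
    """Step 2: remove the 'email' field."""
    return {k: v for k, v in row.items() if k != "email"}

script = [uppercase_name, drop_email]  # the script: steps, in order

def run_script(row):
    for step in script:
        row = step(row)
    return row

print(run_script({"name": "ada", "email": "a@x.io"}))  # {'name': 'ADA'}
```

Because steps run in order, each one sees the output of the previous step, which is why reordering steps in a script can change the result.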

Script Editor

The script editor is the part of the Wrangler where you can search for transforms (transformation steps), customize them, and apply them to your data to build a pipeline.

Source

A source is a service that provides the data you want to extract. It can be a database, file server, web-based or on-premise application, or event stream.

Source Latency

How long ago the source was last updated with new data. See Latency.

Step

A step is a single operation in the Wrangler. It performs only one transformation that can be customized in the script editor. A step is an instantiation of a transform.

T

Transform

A transform is a customizable operator that allows you to specify how the Wrangler should transform your data.

Transformation

The part of pipeline processing that applies the script to the batch data. See How a Pipeline Works.

W

Wrangler

The Wrangler is Etleap’s tool for data transformations. It stands out due to its unique preview feature that allows you to see the result of an applied transformation in real-time. The Wrangler helps you parse and structure data while building a data pipeline.