Latency

This document defines pipeline latency, how it is calculated, and how latency is monitored.

There are two types of latency:

End-to-End Latency is how far behind the data in the destination is compared to the source.
Source Latency is how long it has been since the source was updated with new data.

When the term latency is used on its own, it usually refers to End-to-End Latency. Precise definitions of both latencies, and how each is monitored, are given below.

End-to-End Latency

End-to-End Latency is how up-to-date the destination is relative to the source. For example, if the latency is 47 minutes, then the destination data matches the data that was in the source 47 minutes ago.

In the first example below, the latest loaded batch was extracted at 4:13PM, so the latency is the time between ‘now’ and 4:13PM (e.g. at 5:00PM the latency would be 47 minutes). The batch that started at 4:18PM is not counted towards the latency metric because it has not finished loading.

Loading batch example — Loading pipeline overview

In the second example below, there was no new data found when Etleap last checked 59 minutes ago, so the latency is 59 minutes: the destination is known to match the source as of that check.
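The two examples above can be sketched in a few lines of Python. This is only an illustration of the definition, not Etleap's implementation: the `Batch` class, the `end_to_end_latency` function, and the rule that an empty check is ignored while a batch is still loading are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Batch:
    # Hypothetical record of one batch moving through a pipeline.
    extracted_at: datetime  # when the batch was read from the source
    loaded: bool            # True once the batch has finished loading


def end_to_end_latency(
    now: datetime,
    batches: List[Batch],
    last_empty_check: Optional[datetime] = None,
) -> Optional[timedelta]:
    """Time since the latest moment the destination is known to match the source."""
    # Only batches that have finished loading count towards latency.
    candidates = [b.extracted_at for b in batches if b.loaded]
    # Assumption: a check that found no new data confirms the destination
    # is current as of that check, unless a batch is still in flight.
    in_flight = any(not b.loaded for b in batches)
    if last_empty_check is not None and not in_flight:
        candidates.append(last_empty_check)
    if not candidates:
        return None  # no reference point available
    return now - max(candidates)


now = datetime(2021, 6, 14, 17, 0)
first = end_to_end_latency(
    now,
    [
        Batch(datetime(2021, 6, 14, 16, 13), loaded=True),
        Batch(datetime(2021, 6, 14, 16, 18), loaded=False),  # still loading
    ],
)
second = end_to_end_latency(now, [], last_empty_check=now - timedelta(minutes=59))
```

In the first call the batch extracted at 4:18PM is ignored because it has not finished loading, so the result is measured from 4:13PM. In the second call the only reference point is the check that found no new data.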

Monitoring End-to-End Latency

You can see the end-to-end latency for a pipeline in:

Pipelines page
Pipeline overview page, in which you’ll see the current latency as well as its trend over a selected time period

Note

Latency is defined only for pipelines that are in the Incremental Phase. Pipelines that are still catching up do not have all the source data yet, so their latency is N/A.

Etleap support engineers begin investigating a pipeline when its end-to-end latency is more than four hours beyond the pipeline's Update Schedule. At this point, the pipeline is considered to be latent. For example, if a pipeline has an update schedule of every 4 hours and has data to be extracted, the pipeline is latent if no data has been loaded after 8 hours.

For pipelines that are not yet caught up and therefore don’t have their end-to-end latency defined yet, support engineers will investigate if it takes more than four hours for a pipeline to complete an individual extraction, transformation, or load.
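For pipelines whose end-to-end latency is defined, the "latent" rule is simple arithmetic. The sketch below restates it in Python; the names `SUPPORT_GRACE` and `is_latent` are hypothetical and chosen only for this illustration.

```python
from datetime import timedelta

# "More than four hours beyond the Update Schedule", as described above.
SUPPORT_GRACE = timedelta(hours=4)


def is_latent(latency: timedelta, update_interval: timedelta) -> bool:
    """Return True when latency exceeds the update interval plus the grace period."""
    return latency > update_interval + SUPPORT_GRACE


# A pipeline scheduled every 4 hours becomes latent once latency passes 8 hours.
just_over = is_latent(timedelta(hours=8, minutes=1), timedelta(hours=4))
within = is_latent(timedelta(hours=7), timedelta(hours=4))
```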

Source Latency

The source latency is how long it has been since new data was added to the source. If the source latency is 32 minutes, then the source hasn’t been updated with new data for 32 minutes.

The exact definition varies depending on the source type:

For file-based sources, the source latency is how long ago a file was added or modified in the source.
For other sources, the source latency is how long ago Etleap last successfully extracted data from the source.

In the first example below, the source latency is the time between now and 4:18PM because that was the last time that Etleap successfully extracted for this pipeline, despite the fact that the data was not yet loaded.

In the second example below, the source latency is 34 days because the data in the source has not changed since Etleap last extracted 34 days ago.

The third example below shows the source files in S3 for a pipeline. The most recent file was modified on June 14th 2021, so the source latency is the time between June 14th 2021 and now.
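The per-source-type definition and the examples above can be summarized in a short Python sketch. It is an illustration under stated assumptions rather than Etleap's actual logic: the function name `source_latency` and its parameters are invented here, and the file timestamps are made-up sample values apart from the June 14th 2021 date used in the third example.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional


def source_latency(
    now: datetime,
    file_based: bool,
    file_modified_times: Iterable[datetime] = (),
    last_successful_extraction: Optional[datetime] = None,
) -> Optional[timedelta]:
    """Time since the source was last known to receive new data."""
    if file_based:
        # File-based sources: the most recent add or modify time of any file.
        times = list(file_modified_times)
        reference = max(times) if times else None
    else:
        # Other sources: the last time an extraction succeeded.
        reference = last_successful_extraction
    return None if reference is None else now - reference


now = datetime(2021, 6, 20, 12, 0)

# File-based source whose newest file was modified on June 14th 2021.
s3_latency = source_latency(
    now,
    file_based=True,
    file_modified_times=[datetime(2021, 6, 1, 9, 0), datetime(2021, 6, 14, 12, 0)],
)

# Non-file source whose last successful extraction was 34 days ago.
db_latency = source_latency(
    now,
    file_based=False,
    last_successful_extraction=now - timedelta(days=34),
)
```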

Monitoring Source Latency

Etleap supports automatic notifications when the source latency exceeds a threshold. These notifications alert you to potential issues with your data producers.

For example, let’s say your source is a replica of a frequently updated database. If Etleap does not detect any new changes for 3 hours, that suggests the source replica is not updating correctly.

For another example, let’s say your source is an S3 bucket where you expect a data producer to add new files a few times a day. If you set the threshold to 24 hours, then Etleap will alert you if no new files have been added for a whole day, indicating that the data producer is not sending new files.
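The alert condition in the S3 example reduces to a single comparison. The sketch below is a hypothetical illustration; `exceeds_latency_threshold` is not an Etleap function, and treating "exceeds" as a strict greater-than is an assumption.

```python
from datetime import timedelta


def exceeds_latency_threshold(source_latency: timedelta, threshold: timedelta) -> bool:
    """Return True when a notification would be warranted under the rule above."""
    # Assumption: "exceeds" means strictly greater than the threshold.
    return source_latency > threshold


threshold = timedelta(hours=24)
quiet_for_a_day = exceeds_latency_threshold(timedelta(hours=25), threshold)
recently_updated = exceeds_latency_threshold(timedelta(hours=3), threshold)
```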

You can set the source latency threshold in the pipeline’s source settings:

Search for the pipeline name.
Click on the Settings tab.
In the Source settings, edit the Latency Threshold.

Edit latency threshold source setting — Edit Latency Threshold Summary