Load Manifest Files
Etleap can generate a manifest file each time it loads data to an S3 destination. To enable manifest generation for a destination, select the option when creating your S3 Destination connection.
A manifest file describes the data being loaded and how recent that data is. The sourceHighWatermark property holds the date and time of the last extraction. The columns property describes the data that was loaded: each column's name and type and, optionally, its precision and scale. The primaryKeys property lists the primary key columns contained in the data, and the dataFiles property lists all the S3 files that were generated during the load. All of these files share the same data format, as described by columns.
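As a rough illustration of how these properties might be consumed, the sketch below parses a manifest and renders each entry in columns as a short type signature. Every value in the sample manifest (the version string, watermark format, column names, and file name) is made up for illustration, and column_signature is a hypothetical helper, not part of Etleap.

```python
import json

# Hypothetical manifest content; every value below is invented for illustration.
MANIFEST_TEXT = """
{
  "version": "1.0",
  "loadType": "INCREMENTAL_UPDATE",
  "exhaustive": false,
  "sourceHighWatermark": "2024-01-01T00:00:00Z",
  "sourceHighWatermarkType": "TIMESTAMP",
  "columns": [
    {"name": "id", "type": "INTEGER"},
    {"name": "amount", "type": "NUMERIC", "precision": 10, "scale": 2}
  ],
  "primaryKeys": ["id"],
  "dataFiles": ["part-0000.csv"]
}
"""


def column_signature(column):
    """Render one columns entry as 'name TYPE', adding precision/scale if present."""
    name, type_name = column["name"], column["type"]
    precision = column.get("precision")  # optional in the manifest
    scale = column.get("scale")          # optional in the manifest
    if precision is not None and scale is not None:
        return f"{name} {type_name}({precision}, {scale})"
    if precision is not None:
        return f"{name} {type_name}({precision})"
    return f"{name} {type_name}"


manifest = json.loads(MANIFEST_TEXT)
signatures = [column_signature(c) for c in manifest["columns"]]
print(signatures)
print("primary keys:", manifest["primaryKeys"])
print("data files:", len(manifest["dataFiles"]))
```

Because precision and scale are optional, the helper reads them with dict.get rather than indexing, so columns that omit them are still handled.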
If the extraction supports deletes, the manifest includes the optional deletesFiles field (an array of paths, per the schema below). It contains the path to the file that lists the deleted records. That file is a CSV with one column for each entry in primaryKeys, and it is placed in the same directory as the manifest file.
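A minimal sketch of reading such a deletes file is shown here. It assumes the CSV has no header row and that its columns appear in the same order as primaryKeys; neither detail is stated above, so verify both against a real file. The function name read_deleted_keys is hypothetical.

```python
import csv
import io


def read_deleted_keys(csv_text, primary_keys):
    """Return one dict per deleted record, keyed by primary key column name.

    Assumes no header row and that CSV columns follow the order of primary_keys.
    """
    deleted = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        if len(row) != len(primary_keys):
            raise ValueError(
                f"expected {len(primary_keys)} values per row, got {len(row)}"
            )
        deleted.append(dict(zip(primary_keys, row)))
    return deleted


# Hypothetical deletes file content for a table keyed on (id, region).
sample = "1,us\n2,eu\n"
deleted_keys = read_deleted_keys(sample, ["id", "region"])
print(deleted_keys)
```

All values come back as strings; converting them to the types declared in columns is left to the caller.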
The manifest is generated as _load.manifest and is placed in the same directory as the files being loaded. The manifest file has the following schema:
{
  "type": "object",
  "id": "urn:jsonschema:etleap:load:1.0",
  "properties": {
    "version": {
      "type": "string",
      "required": true
    },
    "loadType": {
      "type": "string",
      "required": true,
      "enum": ["FULL", "INCREMENTAL_APPEND", "INCREMENTAL_UPDATE"]
    },
    "exhaustive": {
      "type": "boolean",
      "required": true
    },
    "sourceHighWatermark": {
      "type": "string",
      "required": true
    },
    "sourceHighWatermarkType": {
      "type": "string",
      "required": true,
      "enum": ["TIMESTAMP", "STRING", "EXTERNAL_ID"]
    },
    "columns": {
      "type": "array",
      "required": true,
      "items": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "required": true
          },
          "type": {
            "type": "string",
            "required": true,
            "enum": ["INTEGER", "TIMESTAMP", "DATE", "BOOLEAN", "VARCHAR", "NUMERIC"]
          },
          "scale": {
            "type": "integer"
          },
          "precision": {
            "type": "integer"
          }
        }
      }
    },
    "primaryKeys": {
      "type": "array",
      "required": true,
      "items": {
        "type": "string"
      }
    },
    "dataFiles": {
      "type": "array",
      "required": true,
      "items": {
        "type": "string"
      }
    },
    "deletesFiles": {
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  }
}
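The required properties and enum values in the schema can be checked with a few lines of standard-library Python. This is a hand-written sketch rather than a full JSON Schema validator, and manifest_problems is a hypothetical helper; the sample manifest values are invented for illustration.

```python
# Required top-level properties and enum values, copied from the schema above.
REQUIRED = (
    "version", "loadType", "exhaustive", "sourceHighWatermark",
    "sourceHighWatermarkType", "columns", "primaryKeys", "dataFiles",
)
LOAD_TYPES = {"FULL", "INCREMENTAL_APPEND", "INCREMENTAL_UPDATE"}
WATERMARK_TYPES = {"TIMESTAMP", "STRING", "EXTERNAL_ID"}
COLUMN_TYPES = {"INTEGER", "TIMESTAMP", "DATE", "BOOLEAN", "VARCHAR", "NUMERIC"}


def manifest_problems(manifest):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [
        f"missing required property: {key}"
        for key in REQUIRED if key not in manifest
    ]
    if "loadType" in manifest and manifest["loadType"] not in LOAD_TYPES:
        problems.append(f"unknown loadType: {manifest['loadType']}")
    if ("sourceHighWatermarkType" in manifest
            and manifest["sourceHighWatermarkType"] not in WATERMARK_TYPES):
        problems.append(
            f"unknown sourceHighWatermarkType: {manifest['sourceHighWatermarkType']}"
        )
    for index, column in enumerate(manifest.get("columns", [])):
        for key in ("name", "type"):
            if key not in column:
                problems.append(f"columns[{index}] is missing {key}")
        if "type" in column and column["type"] not in COLUMN_TYPES:
            problems.append(f"columns[{index}] has unknown type: {column['type']}")
    return problems


# Hypothetical manifest used only to exercise the checks.
example = {
    "version": "1.0",
    "loadType": "FULL",
    "exhaustive": True,
    "sourceHighWatermark": "2024-01-01T00:00:00Z",
    "sourceHighWatermarkType": "TIMESTAMP",
    "columns": [{"name": "id", "type": "INTEGER"}],
    "primaryKeys": ["id"],
    "dataFiles": ["part-0000.csv"],
}
print(manifest_problems(example))
```

The check only covers presence and enum membership. It does not verify JSON types (for example, that exhaustive is a boolean), which a real schema validator would.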