Data Format
Etleap supports two output formats for your data lake: CSV and Parquet. You choose the format in the last step of pipeline setup. Parquet files are compressed with Snappy.

Due to limitations with Glue, the CSV output format is not supported if the pipeline destination connection has a Glue database defined.
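
The format rule above can be expressed as a small pre-flight check. This is only an illustrative sketch: `validate_output_format` and its parameters are hypothetical names and are not part of any Etleap API. It assumes a connection's Glue database is represented as an optional string.

```python
from typing import Optional

# The two output formats named in this section.
SUPPORTED_FORMATS = {"CSV", "PARQUET"}


def validate_output_format(output_format: str, glue_database: Optional[str]) -> str:
    """Return the normalized format name, or raise if the combination is not allowed.

    Hypothetical helper: mirrors the documented rule that CSV output is not
    supported when the destination connection has a Glue database defined.
    """
    fmt = output_format.strip().upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported output format: {output_format!r}")
    if fmt == "CSV" and glue_database:
        raise ValueError("CSV output is not supported when a Glue database is defined")
    return fmt
```

For example, `validate_output_format("csv", None)` returns `"CSV"`, while passing a non-empty Glue database name together with `"csv"` raises `ValueError`.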
Parquet Type Mappings
The following outlines how Etleap data types map to Parquet data types when loading to a data lake destination.
Etleap Type | Parquet Physical Type | Parquet Logical Type | Notes |
---|---|---|---|
INT | int64 | null | |
BIGINT | binary | null | Width exceeds int64 range |
BOOLEAN | boolean | null | |
NUMBER(s,p) | fixed_len_byte_array | DECIMAL(s,p) | |
NUMBER | double | null | |
DATE | int32 | DATE | |
DATETIME | int64 | TIMESTAMP_MILLIS | Milliseconds are used because AWS Athena and Glue do not currently support microsecond resolution for timestamps |
STRING | binary | STRING | |
JSON | binary | JSON | |
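
To make the physical types in the table more concrete, the sketch below shows how a few values would be represented under the Parquet format's definitions of the DATE, TIMESTAMP_MILLIS, and DECIMAL logical types: days since the Unix epoch, milliseconds since the Unix epoch, and a big-endian two's complement unscaled integer, respectively. These encodings come from the Parquet format specification rather than from Etleap itself, and the function names are hypothetical. Treating naive datetimes as UTC is an assumption of this sketch.

```python
from datetime import date, datetime, timezone
from decimal import Decimal

EPOCH_DATE = date(1970, 1, 1)
EPOCH_DATETIME = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_date(d: date) -> int:
    """DATE on int32: number of days since 1970-01-01."""
    return (d - EPOCH_DATE).days


def encode_timestamp_millis(dt: datetime) -> int:
    """TIMESTAMP_MILLIS on int64: milliseconds since the Unix epoch.

    Assumption: naive datetimes are interpreted as UTC. Sub-millisecond
    precision is dropped, matching the millisecond resolution in the table.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    delta = dt - EPOCH_DATETIME
    return delta.days * 86_400_000 + delta.seconds * 1_000 + delta.microseconds // 1_000


def encode_decimal(value: Decimal, scale: int, byte_length: int) -> bytes:
    """DECIMAL on fixed_len_byte_array: unscaled value, big-endian two's complement."""
    scaled = value.scaleb(scale)
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{value} has more than {scale} fractional digits")
    return int(scaled).to_bytes(byte_length, byteorder="big", signed=True)
```

For instance, `encode_date(date(2024, 1, 1))` yields `19723`, and `encode_decimal(Decimal("123.45"), scale=2, byte_length=4)` yields the four bytes `00 00 30 39`, which is the unscaled integer 12345.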