
Data Format

Etleap supports two output formats for data lake destinations: CSV and Parquet. You choose the format in the last step of pipeline setup. Parquet files are compressed with Snappy.

Selecting output format (CSV or Parquet) at pipeline creation

Due to limitations with Glue, the CSV output format is not supported if the pipeline destination connection has a Glue database defined.
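
The format rule above can be expressed as a small pre-flight check. This is only an illustrative sketch: `validate_output_format` and its `glue_database` parameter are hypothetical names, not part of any Etleap API, and the check simply restates the two constraints described on this page.

```python
from typing import Optional

# The two output formats described on this page.
SUPPORTED_FORMATS = {"CSV", "PARQUET"}


def validate_output_format(output_format: str, glue_database: Optional[str]) -> str:
    """Return the normalized format name, or raise ValueError if it is not allowed.

    Hypothetical helper: glue_database stands in for the Glue database
    configured on the destination connection (None or "" if not defined).
    """
    fmt = output_format.strip().upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported output format: {output_format!r}")
    if fmt == "CSV" and glue_database:
        # CSV is not supported when the connection has a Glue database defined.
        raise ValueError("CSV output is not supported with a Glue database defined")
    return fmt
```

For example, `validate_output_format("csv", None)` returns `"CSV"`, while the same call with a Glue database name raises `ValueError`.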

Parquet Type Mappings

The following table outlines how Etleap data types map to Parquet data types when loading to a data lake destination.

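| Etleap Type | Parquet Physical Type | Parquet Logical Type | Notes |
|-------------|-----------------------|----------------------|-------|
| INT         | int64                 | null                 |       |
| BIGINT      | binary                | null                 | Width exceeds int64 range |
| BOOLEAN     | boolean               | null                 |       |
| NUMBER(s,p) | fixed_len_byte_array  | DECIMAL(s,p)         |       |
| NUMBER      | double                | null                 |       |
| DATE        | int32                 | DATE                 |       |
| DATETIME    | int64                 | TIMESTAMP_MILLIS     | Milliseconds are used since AWS Athena and Glue do not currently support microseconds resolution for timestamps |
| STRING      | binary                | STRING               |       |
| JSON        | binary                | JSON                 |       |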
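
To make the mappings concrete, here is a minimal Python sketch that restates the table as a lookup and shows how `DATE` and `DATETIME` values are represented under the Parquet `DATE` (days since 1970-01-01) and `TIMESTAMP_MILLIS` (milliseconds since the Unix epoch) logical types. The names `PARQUET_TYPE_MAP`, `date_to_parquet_days`, and `datetime_to_parquet_millis` are illustrative only, and treating naive datetimes as UTC is an assumption of this sketch rather than documented Etleap behavior.

```python
from datetime import date, datetime, timedelta, timezone

# Etleap type -> (Parquet physical type, Parquet logical type), as listed above.
# None corresponds to a "null" logical type in the table.
PARQUET_TYPE_MAP = {
    "INT": ("int64", None),
    "BIGINT": ("binary", None),
    "BOOLEAN": ("boolean", None),
    "NUMBER(s,p)": ("fixed_len_byte_array", "DECIMAL(s,p)"),
    "NUMBER": ("double", None),
    "DATE": ("int32", "DATE"),
    "DATETIME": ("int64", "TIMESTAMP_MILLIS"),
    "STRING": ("binary", "STRING"),
    "JSON": ("binary", "JSON"),
}

EPOCH_DATE = date(1970, 1, 1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_to_parquet_days(d: date) -> int:
    """Parquet DATE: number of days since the Unix epoch, stored as int32."""
    return (d - EPOCH_DATE).days


def datetime_to_parquet_millis(dt: datetime) -> int:
    """Parquet TIMESTAMP_MILLIS: milliseconds since the Unix epoch, stored as int64.

    Assumption: naive datetimes are interpreted as UTC. Sub-millisecond
    precision is dropped by the floor division.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return (dt - EPOCH) // timedelta(milliseconds=1)
```

Under these assumptions, a `DATETIME` carrying microseconds loses the last three fractional digits when written, which is the practical consequence of the millisecond resolution noted in the table.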