MySQL
There are two methods Etleap uses to extract data from a MySQL database: Query-based extractions and Change Data Capture (CDC). With CDC, Etleap will take an initial snapshot of the table using SQL queries and capture incremental changes from the binlog. With Query-based extractions, all data is extracted with SQL queries.
Source Setup
Step 1. Create a Database User
To create a user that will allow Etleap to connect to your database, follow the steps below:
- Create a new user for the Etleap connection. The
<user_name>
and<password>
values will be needed when creating the connection in Etleap.CREATE USER '<user_name>'@'%' IDENTIFIED BY '<password>';
- Grant Read Permission to the database or tables you would like to extract data from.
GRANT SELECT, SHOW VIEW ON <database_name>.* TO '<user_name>'@'%';
You can now use this user to set up a new connection in Etleap.
Step 2. Whitelist Etleap’s IP Addresses
Etleap’s IP addresses must be whitelisted in MySQL. There are different addresses to whitelist depending on your deployment model. Review the IP Whitelisting article for more information on which to use.
Prefer using an API? Go here and select MySQL under the Body header to create your connection via API.
Change Data Capture (CDC) Setup
If you want to enable CDC with MySQL, you will need to first assign permissions and configure the binlog on your MySQL database. The setup steps for both self-managed MySQL instances and RDS hosted MySQL instances are detailed below.
Etleap will use the binlog to read transaction data when extracting changes.
Step 1. Assign Permissions to the User
The user that Etleap uses to connect to MySQL needs the following privileges:
REPLICATION CLIENT
REPLICATION SLAVE
SUPER
– this privilege is required only in MySQL versions before 5.6.6.SELECT
privileges for the source tables designated for replication.
- Grant replication privileges to the user.
GRANT REPLICATION CLIENT ON *.* TO <user_name>@'%'; GRANT REPLICATION SLAVE ON *.* TO <user_name>@'%';
- If the MySQL source instance version is before 5.6.6, grant the
SUPER
privilege to the user.GRANT SUPER ON *.* TO <user_name>@'%';
- Grant
SELECT
privileges for the source tables to be replicated.Choose one of the statements below to assign select permissions at a certain levelGRANT SELECT ON <database_name>.<table_name> TO '<user_name>'@'%'; GRANT SELECT ON <database_name>.* TO '<user_name>'@'%'; GRANT SELECT ON *.* TO '<user_name>'@'%';
Step 2. Configure the Binlog
Self-Hosted Databases
To enable binary logging, configure the following parameters in MySQL’s my.ini
(Windows) or my.cnf
(UNIX) file:
server_id
- Set this parameter to a value of 1 or greater.log-bin
- Set the path to the binary log file, such aslog-bin=E:\MySql_Logs\BinLog
. Don’t include the file extension.binlog_format
- Set this parameter toROW
.expire_logs_days
- Set this parameter to a value of 1 or greater. To prevent overuse of disk space, we recommend that you don’t use the default value of 0.binlog_row_image
- Set this parameter toFULL
.log_slave_updates
- Set this parameter toTRUE
if you are using a read-replica as a source.
CDC Limitations
- Change data capture (CDC) isn’t supported for Amazon RDS MySQL 5.5 or lower. For Amazon RDS MySQL, you must use version 5.6 or 5.7 to enable CDC. CDC is supported for self-managed MySQL 5.5 sources.
- CDC isn’t supported when a table name contains uppercase and lowercase characters, and the source engine is hosted on an operating system with case-insensitive file names. An example is Microsoft Windows or OS X using HFS+.
- The database password must not contain the following invalid character :
;
. - Capturing changes when the binary logs aren’t stored on standard block storage isn’t supported. For example, CDC doesn’t work when the binary logs are stored on Amazon S3.
- For Aurora MySQL databases, only Writer instances are supported
- Temporal data tables or system—versioned tables are not supported on MariaDB databases
- No support for compressed transaction log payloads introduced in MySQL 8.0.20.
- No support for XA transactions.
- No support for using GTID for replication.
- Altering partitions is not supported.
- Any partitioned-table-related DDLs will be ignored, but the task will continue processing subsequent binary log changes.
- For example, row deletions that result from a
DROP PARTITION
operation will not be captured by Etleap.
- For example, row deletions that result from a
- To maintain accurate data in the destination, we recommend deleting rows from the source table before dropping the partition.
- Any partitioned-table-related DDLs will be ignored, but the task will continue processing subsequent binary log changes.
CDC Error Guides
Type Mapping
The table below shows how MySQL data types map to Etleap data types.
Mapped Values
MySQL Type | Etleap Type |
---|---|
BIT | BOOOLEAN |
BIGINT | INTEGER |
BIGINT UNSIGNED | INTEGER |
BOOL | BOOLEAN |
DATE | DATE |
DECIMAL | NUMBER |
DOUBLE PRECISION | NUMBER |
FLOAT | NUMBER |
INTEGER | INTEGER |
INTEGER UNSIGNED | INTEGER |
LONGTEXT | STRING |
MEDIUMINT | INTEGER |
MEDIUMINT UNSIGNED | INTEGER |
MEDIUMTEXT | STRING |
NUMERIC | NUMBER |
REAL | NUMBER |
SMALLINT | INTEGER |
SMALLINT UNSIGNED | INTEGER |
TEXT | STRING |
TINYTEXT | STRING |
TINYINT | INTEGER |
TINYINT UNSIGNED | INTEGER |
YEAR | DATE |
Unmapped Values
Each type listed below does not have a specified mapping and will default to STRING
in Etleap.
- BINARY
- BLOB
- CHAR
- DATETIME
- ENUM
- JSON
- LONGBLOB
- LONG VARBINARY
- LONG VARCHAR
- MEDIUMBLOB
- SET
- TIME
- TIMESTAMP
- TINYBLOB
- VARBINARY
- VARCHAR
Arrays
Arrays of any type are treated as comma-delimited strings in Etleap.