Apply Custom Transforms

The transforms provided by Etleap’s Wrangler cover the vast majority of standard use cases. Sometimes, however, business-specific logic requires a custom transform. Etleap currently supports custom transforms implemented in Java, Scala, JavaScript, Python, and Ruby. This guide describes how to create custom transforms and use them in the Wrangler.

Prerequisites

Custom transforms are hosted in a GitHub repository. You can use your own repository or we can host one for you.

  1. Contact Etleap support to request that a GitHub repository be created and linked to your Etleap organization.
  2. Provide the GitHub usernames of the users who should have write access to the repository.

Create a Custom Transform

Check your function in to the root of the GitHub repository. Below are examples of how to create custom transforms in Python and Java.

The example below is a Python file called test_function.py. The rules for Python transforms are as follows:

  • The file must define a top-level function called apply that takes one or more arguments. The arguments correspond to the values of the input fields.
  • The apply function should return a dictionary whose keys are the output column names.
  • The function can raise errors, which are caught and handled by Etleap’s data parsing error logic.
  • Any standard library can be used. If you need a custom library, please contact support.
import math
import re

def apply(x):
    return {"col1": "Demo!" + x,
            "col2": math.cos(math.pi / 4.0),
            "col3": None,
            "col4": "asf"}
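The rules above can also be illustrated with a transform that takes several input fields and raises an error on bad input. This is only a sketch: the field names (sku, quantity, unit_price), the output column names, and the SKU pattern are hypothetical, and it assumes the input values may arrive as strings, which is why they are converted explicitly.

```python
import re

# Hypothetical pattern for illustration: three capital letters, a dash, four digits.
_SKU_PATTERN = re.compile(r"^[A-Z]{3}-\d{4}$")


def apply(sku, quantity, unit_price):
    # One argument per input field selected for the transform (names are hypothetical).
    if sku is None or not _SKU_PATTERN.match(sku):
        # Raising lets the caller's error handling deal with the bad row.
        raise ValueError("unexpected SKU format: %r" % (sku,))

    # Assumption: values may be strings, so convert before doing arithmetic.
    units = int(quantity)
    total = units * float(unit_price)

    # Keys of the returned dictionary are the output column names.
    return {
        "sku_prefix": sku.split("-")[0],
        "line_total": round(total, 2),
        "is_bulk": units >= 100,
    }
```

Before checking the file in, the function can be called directly in a local Python session, for example apply("ABC-1234", "3", "2.50"), to confirm that it returns a dictionary with the expected keys and that malformed input raises an error.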

Using a Custom Transform in the Wrangler

In the Wrangler, under Add Step, pick Custom. You’ll have access to the current version of any custom transform you have checked in to the master branch of the GitHub repository. Configure the transformation by picking the column(s) to apply it to, then click Add.

Configuring a custom transform

The script is pinned to the version identified by the commit SHA shown in square brackets at the end of the custom transform’s name. If you push a new version of the custom transform to the master branch in GitHub, you will need to re-wrangle the pipeline and select the new version for your pipeline to use the change.