How to secure and manage context configuration files using Talend Integration Cloud custom connections and Implicit Context Load

Managing context configuration files across different environments can be quite challenging. Talend Integration Cloud (TIC) has a feature that makes this easier: you can combine implicit context load with the connection parameters in TIC to run jobs with different context variables on a single remote engine.

This blog post describes how you can manage your contexts on remote engines and explains how this approach improves security. It covers three steps:

  • Configuration in Talend Studio: setting the implicit context load property
  • Configuration in TIC: defining a custom connection
  • Building and using this custom connection within your Talend flows

I.   Configuration in Talend Studio

  1. In the object browser of Talend Studio, add a context group named connection_configFilePath. In this group, add a context variable named connection_custom_configFilePath. The name must start with the “connection_custom_” prefix: this prefix is how TIC recognizes the variable as a custom connection.

  2. Configure the Implicit Context Load by ticking the Implicit tContextLoad checkbox in Project settings. Once this checkbox is ticked, more options will appear. Enter the context variable configured in the previous step in the From File text field. You also have to fill in a Field Separator; in our example we set the field separator to “=”.
  3. Next, link the context group from the first step to the job(s) you want to publish to the cloud. If you skip this, the jobs will fail on the implicit context load because they cannot find the context variable.
  4. Now (re-)publish your Talend jobs to the cloud by right-clicking each job and selecting ‘Publish to Cloud’.

II.    Configuration in Talend Integration Cloud

  1. Open the Integration Cloud environment. From the Management page, select the workspace in which you want to create the connection. Click on the Add connection button to create a new connection.

  2. Click Custom under the applications list.

  3. Enter the name of the connection. In our example we called it FileLoc_[CUSTOM NAME].

  4. Add one parameter by clicking the Add parameter button. This parameter must have the same name as the context variable you defined in Talend Studio, omitting the ‘connection_custom_’ prefix (configFilePath in our example). This naming convention tells TIC which context variable to update with this value.
  5. Then enter the file path of your context configuration file as the value of the configFilePath parameter. The screenshot shows the placeholder “value”; this is where you enter the path. Note: The file path is the location of the configuration file on the remote engine.
  6. Press Create to save your connection. You can repeat this process for every context configuration file you want to use.
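
The naming convention between the Studio context variable and the TIC parameter can be illustrated with a few lines of Java. This is only a sketch of the rule described above, not Talend code: the class and method names are hypothetical, and only the ‘connection_custom_’ prefix and the configFilePath example come from this post.

```java
// Illustrative sketch only: these helpers are hypothetical and not part of Talend.
final class CustomConnectionNaming {

    // Prefix that marks a context variable as a custom connection (from this post).
    static final String PREFIX = "connection_custom_";

    // True when the context variable name carries the prefix and a non-empty suffix.
    static boolean isCustomConnectionVariable(String contextVariable) {
        return contextVariable.startsWith(PREFIX)
                && contextVariable.length() > PREFIX.length();
    }

    // Name to use for the parameter in the TIC custom connection:
    // the context variable name without the prefix.
    static String toParameterName(String contextVariable) {
        if (!isCustomConnectionVariable(contextVariable)) {
            throw new IllegalArgumentException(
                    "Not a custom connection variable: " + contextVariable);
        }
        return contextVariable.substring(PREFIX.length());
    }

    // Reverse direction: the context variable that a TIC parameter maps to.
    static String toContextVariable(String parameterName) {
        return PREFIX + parameterName;
    }

    public static void main(String[] args) {
        System.out.println(toParameterName("connection_custom_configFilePath"));
    }
}
```

With the variable from this post, the parameter to create in TIC is therefore configFilePath.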

III.   Using the connections in your flows

Now that you have created the context in Talend Studio and defined your connections in TIC, you can use the custom connection in your flows. You do this by ‘building’ your newly published jobs.

  1. Go to your flow and press the builder button, then press live and select a connection.

How does this mechanism work behind the scenes?
When a job is executed from the Talend Integration Cloud, the file path you defined in the custom connection is passed to the job. Since you configured the implicit context load of the job to read from this file path context variable, the path is used to locate the context configuration file. Talend then loads all the context values from this file. The file with the context values is only visible and available to the remote engine that executes the job.
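
To make this flow concrete, here is a minimal Java sketch of the idea. It does not reproduce the code Talend generates; it only mimics the behaviour described above under a few assumptions: the context file contains one key and value per line, split on the configured field separator (“=” in our example), and loaded values override the job's defaults. The class name, method names, and the db_host and db_port variables are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: not the code Talend generates.
final class ImplicitContextLoadSketch {

    // Split each line on the first occurrence of the field separator.
    // Blank lines and lines without a key are skipped (an assumption of this sketch).
    static Map<String, String> parse(List<String> lines, String separator) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            int idx = line.indexOf(separator);
            if (line.isEmpty() || idx <= 0) {
                continue;
            }
            String key = line.substring(0, idx).trim();
            String value = line.substring(idx + separator.length()).trim();
            values.put(key, value);
        }
        return values;
    }

    // The path arrives through the custom connection; the values loaded from
    // the file then override the defaults the job was published with.
    static Map<String, String> run(Map<String, String> defaults,
                                   String configFilePath,
                                   String separator) throws IOException {
        Map<String, String> context = new LinkedHashMap<>(defaults);
        context.put("connection_custom_configFilePath", configFilePath);
        List<String> lines =
                Files.readAllLines(Paths.get(configFilePath), StandardCharsets.UTF_8);
        context.putAll(parse(lines, separator));
        return context;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a context file that lives on the remote engine.
        Path file = Files.createTempFile("context", ".cfg");
        Files.write(file,
                Arrays.asList("db_host=prod-db.internal", "db_port=5432"),
                StandardCharsets.UTF_8);

        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("db_host", "localhost");

        System.out.println(run(defaults, file.toString(), "="));
        Files.delete(file);
    }
}
```

Pointing a different custom connection at a different file changes every loaded value without touching the job itself, which is what makes it possible to reuse one published job across environments.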

Conclusion

This way of implementing contexts allows you to run the same job with different parameters, with only a few clicks in the TIC environment and without any extra development in Talend Studio. It is also secure because you don’t need to store sensitive connectivity information in the cloud; all the connection data is stored in the secured environment behind the firewall, where the remote engine is installed.