
Data Imports

Actito gives you several options to import data into the tables of your data model.

You can of course manually import profiles or custom table entries yourself in the interface, but if you want to automate these data flows, you are most likely using ETL synchronizations or API imports set up by your technical operators or by the Actito teams.

While you can receive an execution report, it is also useful to visualize these synchronizations directly in the Actito platform. That's what the 'Manage imports' app is here for!

images/download/thumbnails/667255520/image2023-5-17_12-17-12.png

Just like the 'Manage exports' app allows you to visualize your data flows OUT of Actito, the 'Manage imports' app helps you visualize data flows INTO Actito.

You can reach this app from the Catalog (Profiles > Manage imports) or from the 'Manage exports' app.

Several tabs and filters allow you to review past, ongoing, and future executions of your imports, depending on their type.

tip

For ETL synchronizations, it's important to make the distinction between 'ETL definition' and 'ETL execution'.

  • The first is the definition of the flow, set up by API, which contains all the parameters, including the frequency.
  • The second is the periodic execution of what has been defined.

images/download/attachments/667255520/image2023-5-17_12-22-57.png

Understanding the filters

In the top left corner, you can choose to display only the imports on a specific table.

As data can be imported in profiles or custom tables, first choose the type of table, then select the table name in the dropdown list.

images/download/thumbnails/667255520/image2023-5-17_12-27-50.png

tip

ETL synchronizations can be multi-file, meaning that the same definition triggers imports in several tables.

In this case, the import will appear when you select any impacted table in the filter.

You can also filter on the type of import:

  • One shot imports are all imports for which the frequency is not defined in Actito: manual profile or custom table imports, and mass imports by API. API imports might be programmed on your side by your developers, and therefore be scheduled in some sense, but this scheduling is not defined within Actito, so they count as one shot. The file transfer type is manual if the file is uploaded in the interface or provided directly in the API call, or it can be in the cloud if the API call is programmed to retrieve the file from an FTPS server.

  • Scheduled imports are all imports for which the frequency is defined in Actito, meaning that as soon as an execution is finished, the next execution is already scheduled. This includes only ETL synchronizations. For such imports, the file is always retrieved from the cloud.

tip

For scheduled imports, the technical name of the ETL synchronization will be displayed in the Name column.

One shot imports, whether made manually in the interface or by API, are not given a name: the Name column will remain empty.

images/download/thumbnails/667255520/image2023-5-17_15-46-40.png

Understanding the statuses

Navigate through the Draft, Scheduled, In progress and Finished tabs to see the different statuses of executions.

status

Draft executions

This tab only contains ETL synchronizations that have been paused.

Scheduled executions

This tab only contains scheduled imports with files transferred in the cloud, that is, ETL synchronizations.

This allows you to easily check when the next synchronization is going to run.

info

As soon as a daily execution has finished running, the execution of the next day is created.

This tab only displays the very next execution of a synchronization.


Click on the 'Stop' button to pause an ETL synchronization. This is useful if you have doubts about the file that was uploaded to the cloud location and need to double check, or if you temporarily need to stop processing synchronizations.

'Paused' ETLs are found in the 'Draft' tab, where they can be resumed.

View import definition

Click on the 'View import definition' button to get an overview of the different steps of the import, as defined during the creation of the ETL synchronization.

This is very useful to get information about what is expected at each step. You can get a similar breakdown in the execution details of 'Finished' executions, with an added status for each step (see 'Finished execution' for detailed explanations about each step).

Update frequency

Click on 'More', then on 'Update frequency'.


You can update the frequency of your ETL so that it synchronizes:

  • Every day at HH:MM
  • Every week (one or several days) at HH:MM
  • Every month on the [number] at HH:MM


info

If you choose the frequency 'Every month', the chosen number will correspond to the day of each month when the execution should take place.

caution

If you select the frequency 'Every month' and you choose day 31, your ETL will not be synchronized every month as there are not 31 days in each month.
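To see which months are affected by the choice of a day number, here is a minimal Python sketch. It assumes the usual CRON behaviour, where an execution is simply skipped in months that do not contain the chosen day. The function name months_with_day is hypothetical and not part of Actito.

```python
import calendar


def months_with_day(year: int, day: int) -> list[int]:
    """Return the month numbers of `year` that contain the given day of the month."""
    return [
        month
        for month in range(1, 13)
        # monthrange returns (weekday of the first day, number of days in the month)
        if calendar.monthrange(year, month)[1] >= day
    ]


print(months_with_day(2023, 31))  # [1, 3, 5, 7, 8, 10, 12]
print(months_with_day(2023, 28))  # all twelve months
```

Under that assumption, choosing day 31 gives seven executions per year, while any day up to 28 gives one execution every month.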

If the frequency update you want to set is more complex, you can also use the 'Expert mode'.


This mode enables you to choose the CRON expression of your choice. The constraints for this expression are the same as the ones from the ETL definition through API.

info

By default, the frequency set on the ETL definition is displayed. If the CRON expression is not every day, every week or every month, then the expert mode is displayed.
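As an illustration of this rule, the Python sketch below classifies a CRON expression as 'every day', 'every week', 'every month' or 'expert'. It assumes a 6-field layout (seconds, minutes, hours, day of month, month, day of week), inferred from the example expression 0 00 16 * * ? used in this documentation. The exact detection rules applied by Actito are not documented here, so the function classify_frequency is a hypothetical approximation.

```python
DAY_NAMES = {"SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT"}


def _is_day_list(value: str) -> bool:
    """True for a comma-separated list of weekdays such as 'MON,THU' or '2,5'."""
    return all(p.isdigit() or p.upper() in DAY_NAMES for p in value.split(","))


def classify_frequency(cron: str) -> str:
    """Guess which frequency mode matches a 6-field CRON expression (assumed layout)."""
    fields = cron.split()
    if len(fields) != 6:
        return "expert"
    second, minute, hour, day_of_month, month, day_of_week = fields
    # The simple modes all run at one fixed HH:MM, in every month.
    fixed_time = second.isdigit() and minute.isdigit() and hour.isdigit()
    if not fixed_time or month != "*":
        return "expert"
    if day_of_month in ("*", "?") and day_of_week in ("*", "?"):
        return "every day"
    if day_of_month == "?" and _is_day_list(day_of_week):
        return "every week"
    if day_of_month.isdigit() and day_of_week == "?":
        return "every month"
    return "expert"


print(classify_frequency("0 00 16 * * ?"))       # every day
print(classify_frequency("0 30 8 ? * MON,THU"))  # every week
print(classify_frequency("0 0 6 15 * ?"))        # every month
print(classify_frequency("0 0/15 * * * ?"))      # expert
```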

Update report recipients

When clicking on 'More', you also have the possibility to click on 'Update report recipients'.


This button lets you update the report recipient list of the selected ETL. You can:

  • Add a new recipient
  • Modify an existing recipient
  • Delete an existing recipient


It also allows you to empty an existing list, as filling a recipient list is not mandatory.

info

If the same email address appears several times when you update the report recipients, the system automatically keeps only one occurrence.

The definition of the ETL is updated once you click on 'Validate'.

info

A newly added recipient will receive their first report after the next execution.

In progress executions

This tab contains imports that are currently running. While bigger files may take longer to integrate, you will only see data in this tab temporarily, just after the start of an execution.

This allows you to easily check if an import or a synchronization is still running and has not finished yet.

Admin and advanced users can directly retrieve the file that is being integrated into Actito with the 'Download input files' button.

images/download/attachments/667255520/image2023-5-17_16-53-16.png

Thanks to the 'View execution details' button, you can also get an overview of each step of the import. As the execution is ongoing, the final status of each step might not be available (see 'Finished execution' for detailed explanations about each step).

Finished executions

This tab contains all the imports that have finished running: this includes both past executions of scheduled synchronizations and one shot imports.

info

The 'Finished' executions tab keeps 15 days of history for scheduled imports (ETLs) and 5 days for one shot imports.

This allows you to check when an import has been completed and when all the data have been integrated into Actito. More importantly, you can check the Import result column to confirm whether the import has succeeded or fallen into error.

tip

For scheduled imports, the technical name of the ETL synchronization is displayed in the Name column. You can use the 'Search' function to quickly find the executions of a specific ETL.

Check the 'Started on' date to find the execution on a specific day.

images/download/attachments/667255520/image2023-5-17_17-23-57.png

Admin and advanced users have the possibility to retrieve the file that was imported as well as the output files: depending on the result, it can be a result file, or an error file.

tip

The output files of ETL synchronizations are only generated if the generateErrorFiles and generateResultFiles parameters have been set to true in the definition.

The error file can help you find the issue with failed executions to let you correct them.

For ETL synchronizations that fell into error (possibly because the file was not available in the cloud), it is possible to relaunch the execution by API.

The relaunch requires knowing the original execution id. You can easily retrieve it by adding the "id" column in the top right corner.

images/download/thumbnails/667255520/image2023-5-19_11-22-41.png

Viewing the execution details

Click on the 'View execution details' button to get a detailed overview with the results of each step.

On the left panel, you can see the schedule, start and end date of the execution, the frequency that determines the schedule, the description of the synchronization and the list of recipients of the execution report.

tip

The frequency is defined through a CRON expression. It can be read more easily by looking at the 'Scheduled on' moment. In the example below, 0 00 16 * * ? translates to 'every day at 16:00'.

You can also see its global status, which can be:

  • SUCCESS: all the files were correctly retrieved and integrated into your license, without a single line encountering an error.
  • IN ERROR: the import encountered a global error and didn't go through, meaning that not a single line was correctly integrated. This is usually related to the absence or the format of the files.
  • WARNING: all (mandatory) files were correctly retrieved and they were partly integrated into Actito, but at least one line encountered an issue because it contained an invalid value.
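The three global statuses can be summarized as a simple decision rule. The Python sketch below is only an illustration of the descriptions above: the function global_status and its parameters are hypothetical, and the real status is computed by Actito.

```python
def global_status(global_error: bool, rejected_lines: int) -> str:
    """Derive the global status of an import from its outcome (illustrative only)."""
    if global_error:
        # Nothing was integrated, typically because a file was absent or badly formatted.
        return "IN ERROR"
    if rejected_lines > 0:
        # Files were integrated, but at least one line contained an invalid value.
        return "WARNING"
    return "SUCCESS"


print(global_status(global_error=False, rejected_lines=0))  # SUCCESS
print(global_status(global_error=False, rejected_lines=3))  # WARNING
print(global_status(global_error=True, rejected_lines=0))   # IN ERROR
```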

images/data-imports/view-executions-details.png

Each of the five steps has its own status.

Click on one of the steps to see its details.

Input files transfer

In this step, you can see the remote location from which the file has been retrieved (in the case of an ETL).

You also have details about the files expected in the definition of an ETL, such as:

  • The expected name pattern
  • The name of the file for a specific execution
  • Whether the file is mandatory for the execution of the import
  • Whether the file was present in this specific execution.

This step will encounter an ERROR if the remote location (such as an SFTP or FTPS server) was not available at the time, if the file was not found at the location (including if the file for the date pattern of the execution was missing) or if the file found in a zipped archive was incorrect.

images/data-imports/input-files-transfer.png

If a non-mandatory file is missing, this step will be marked as a SUCCESS.
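The role of the 'mandatory' flag in this step can be sketched as follows. This is an illustrative Python model, not Actito code: the ExpectedFile class, the transfer_step_status function and the file name patterns are hypothetical, and the zipped archive check is left out.

```python
from dataclasses import dataclass


@dataclass
class ExpectedFile:
    name_pattern: str  # hypothetical pattern, for illustration only
    mandatory: bool
    present: bool


def transfer_step_status(location_available: bool, files: list[ExpectedFile]) -> str:
    """Return the status of the 'Input files transfer' step (simplified model)."""
    if not location_available:
        return "ERROR"
    # Only a missing mandatory file makes the step fail.
    if any(f.mandatory and not f.present for f in files):
        return "ERROR"
    return "SUCCESS"


files = [
    ExpectedFile("customers_*.csv", mandatory=True, present=True),
    ExpectedFile("orders_*.csv", mandatory=False, present=False),
]
print(transfer_step_status(True, files))  # SUCCESS: only an optional file is missing
```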

Files format validation

In this step, you can see the format of the files, as defined in the parameters of the ETL.

This includes:

  • The separator of the CSV file: while typically semi-colons, commas or tabulations are used, other characters can be defined as separators.
  • The encoding: the character set used by the file. It can be UTF-8 (default value) or ISO-8859-1.
  • The enclosing and escaping characters, used to escape data when the value contains the separator or enclosing character.

format validation

Data transformations

This step gives you an overview of the transformations applied to the data.

This step can only encounter an error if an input value does not match the value defined in the transformation.

images/data-imports/data-transformations.png

The 'data transformations' step is only present in ETL synchronizations where transformations have been defined. It will always be greyed out for manual or bulk API imports.

Data loadings

This step is the most important one in the import: the actual writing of data in the license. In the case of a multi-file ETL, you will have a status for each file.

You can first see the definition of the step:

  • Click on the 'Mapping' icon to see the mapping between the headers of the input file and the name of the attributes in the table. You can also see the behaviour in case of empty, existing or invalid values, as well as multi-value attributes.
  • The 'Parameters' toggle allows you to see the Writing mode (CREATE, UPDATE, CREATE/UPDATE, DELETE), and whether error and result files will be generated for this step.

The integration results give you information about the number of lines integrated into the table.

  • The number of lines 'read' is the number of lines found in the files.
  • The number of 'rejected' lines is the number of lines that contain an invalid value for the corresponding attribute (such as an invalid e-mail address or an invalid language code). If there is at least 1 rejected line, the global status of the import will be WARNING. You can download the error file to check the validation errors (provided that the generateErrorFiles parameter has been set to true).
  • The number of 'inserted' lines is the number of lines that did not exist in Actito and that were created by the import.
  • The number of 'updated' lines is the number of lines that already existed in Actito but for which a modification of data was found in the file. If you import a line that is identical to an already existing one, it will not count as an update. Hence the sum of rejected + inserted + updated lines may be lower than the number of lines read, because some lines already existed without any update to their values.
  • The number of 'deleted' lines is only applicable for DELETE type ETLs, which can only delete data without creating any new record.
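The relationship between these counters can be checked with a small calculation. The Python sketch below assumes a CREATE, UPDATE or CREATE/UPDATE import in which every line read is either rejected, inserted, updated or left unchanged. The function unchanged_lines and the figures are hypothetical.

```python
def unchanged_lines(read: int, rejected: int, inserted: int, updated: int) -> int:
    """Number of lines that already existed in the table with identical values."""
    unchanged = read - (rejected + inserted + updated)
    if unchanged < 0:
        raise ValueError("counters are inconsistent: more lines processed than read")
    return unchanged


# 1000 lines read, of which 12 rejected, 300 inserted and 450 updated.
print(unchanged_lines(read=1000, rejected=12, inserted=300, updated=450))  # 238
```

In this example, 238 lines were identical to existing records, which explains the gap between the lines read and the sum of the other counters.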

images/data-imports/data-loading.png

Output files transfer

The details of this step give you information about the possible output files generated, including:

  • The remote location (FTPS, SFTP, Transferbox) on which the files have been dropped.
  • The name of the files.

The output files of ETL synchronizations are only generated if the generateErrorFiles and generateResultFiles parameters have been set to true in the definition.

https://res.cloudinary.com/dmn1io5db/image/upload/v1686240087/execDetails5_auq8cb.png

tip

The execution details appear in a side panel. Click on the cross in the top left corner of your screen to exit it.

Retrieving output files

Admin and advanced users have the possibility to retrieve the file that was imported as well as the output files: depending on the result, it can be a result file, or an error file.

The output files of ETL synchronizations are only generated if the generateErrorFiles and generateResultFiles parameters have been set to true in the definition.

The error file can help you find the issue with failed executions to let you correct them.

It contains a replica of each failed row, but with 2 extra columns:

  • "errorCode": This is the error code, which details the reason for failure.
  • "errorColumn": This indicates which column contains the error.

If several columns fell into error for the same row, this row will be repeated once for each error.

The possible error codes are:

  • "INVALID_FIELD_VALUE": The row value for the field indicated in the 'errorColumn' column is not valid, because the format is not compatible.
  • "DATA_ALREADY_EXISTS": The error occurs in 'createOnly' mode when a row of the import file refers to a business key that already exists in the table.
  • "UNKNOWN_DATA": The error occurs in 'updateOnly' mode when a row of the import file refers to a business key that does not exist in the table.
  • "DUPLICATE_OBJECT": The error occurs because the new record contains an existing value for a unique attribute which is not the business key.
  • "MISSING_FIELD_VALUE": The error occurs because the value for a mandatory attribute is missing.
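When an error file contains many rows, counting the errors per code and per column gives a quick picture of what went wrong. The Python sketch below assumes a semi-colon separated file whose header row includes the "errorCode" and "errorColumn" columns described above. The function summarize_error_file, the attribute names and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def summarize_error_file(text: str, separator: str = ";") -> Counter:
    """Count the rows of an error file per (errorCode, errorColumn) pair."""
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=separator)
    counts: Counter = Counter()
    for row in reader:
        counts[(row["errorCode"], row["errorColumn"])] += 1
    return counts


# A row with two invalid columns is repeated once for each error.
sample = (
    "emailAddress;language;errorCode;errorColumn\n"
    "not-an-email;XX;INVALID_FIELD_VALUE;emailAddress\n"
    "not-an-email;XX;INVALID_FIELD_VALUE;language\n"
    ";EN;MISSING_FIELD_VALUE;emailAddress\n"
)
for (code, column), count in summarize_error_file(sample).items():
    print(code, column, count)
```

Because a row is repeated once per error, these counts are numbers of errors, not numbers of distinct rejected rows.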

Viewing the report

Click on the 'View report' button to access a copy of the execution report, identical to the one received by e-mail by the recipients defined in the parameters of the import.

It also displays the list of recipients.

https://res.cloudinary.com/dmn1io5db/image/upload/v1686318291/reportImport_bb9eta.png

tip

The execution report appears in a side panel. Click on the cross in the top left corner of your screen to exit it.