Data Imports

Actito gives you several options to import data in the tables of your data model.

You can of course manually import profiles or custom table entries yourself in the interface, but if you want to automate these data flows, you are most likely using ETL synchronizations or API imports set up by your technical operators or by the Actito teams.

While you can receive an execution report, it is also useful to visualize these synchronizations directly in the Actito platform. That's what the 'Imports' app is here for!

Imports

Just like the 'Exports' app allows you to visualize your data flows OUT of Actito, the 'Imports' app helps you visualize data flows INTO Actito.

You can reach this app from the Apps center (Apps > Data > Imports), from the 'Datamart Studio' or from the 'Exports' app.

Navigating the Imports app

Several tabs and filters allow you to review past, ongoing, and future executions of your imports, depending on their type.

tip

For ETL synchronizations, it's important to make the distinction between 'ETL definition' and 'ETL execution'.

The first is the definition of the flow, set up by API, which includes all the parameters, including the frequency.
The second is the periodic execution of what has been defined.

Creating an import

By clicking on the "Create an import" button, you can create a new manual, one shot import, which means uploading a file directly in the Actito interface.

Depending on the type of table you are importing data into, please follow the relevant guide:

alt text

Understanding the filters

In the top left corner, you can choose to display only specific imports, based on:

a particular data model, if you have several
a type of table (Profiles, Repository, Linked data, Interactions)
a specific table
a type of import

As data can be imported in profiles or custom tables, first choose the type of table, then select the table name in the dropdown list.

images/download/thumbnails/667255520/image2023-5-17_12-27-50.png

tip

ETL synchronizations can be multi-file, meaning that the same definition triggers imports in several tables.

In this case, the import will appear when you select any impacted table in the filter.

Import type

One shot imports are all imports for which the frequency is not defined in Actito: manual profile or custom table imports, and mass imports by API. API imports might be coded and scheduled on your side by your developers, but because this scheduling is not defined within Actito, they count as one shot. The file transfer type is 'manual' if the import is made in the interface, or 'API' if the import was triggered by a web service call.
Automated imports are ETL synchronizations that have been programmed by the Actito team or your developers to automatically retrieve a file from a cloud location. Once they have been defined, they do not require additional development. As such, they are ideal if you have limited technical resources on your side. There are 2 types of automated imports, with different file retrieval modes:
- Automated (Scheduled) imports are ETL synchronizations for which the frequency is defined in Actito and which run at a specific time every day. This means that they depend on your upstream processes: the file must be present on the cloud location at the defined time. However, a 'retry policy' can be defined, which is handy in case of delays. As soon as an execution is finished, the next execution is scheduled. For such imports, the file is always retrieved from the cloud.
- Automated (Triggered) imports are ETL synchronizations for which active polling has been set up. Such imports do not depend on a frequency: every 5 minutes, Actito checks whether a file with the correct name pattern has been dropped on the cloud location. This means that several imports can be triggered per day. The polling on the cloud location is defined through a file synchronization.

tip

For automated imports, the technical name of the ETL synchronization will be displayed in the Name column.

One shot imports, whether made manually in the interface or by API, are not given a name: the Name column will remain empty.

Understanding the statuses

Navigate through the Draft, Active, Scheduled, In progress and Finished tabs to see the different statuses of executions.

Draft imports

This tab only contains ETL synchronizations that have been paused.

There, you can click on the 'View import definition' button to get an overview of the different steps of the import or Resume it to reactivate or reschedule it.

From the 'More' button, it is also possible to Update its frequency or Update the report recipients.

Active imports

This tab only contains automated triggered imports, so ETL synchronizations that rely on a polling on a cloud location to check if a file matching the defined name pattern has been uploaded.

You can click on the 'View import definition' button to get an overview of the different steps of the import. However, the 'Triggered on', 'Started on' and 'Ended on' dates to the left will always remain empty, as once the polling finds a synchronized file, an 'in progress' execution is created.

This tab therefore only contains the definition of the import. Executions triggered in the past are found in the 'Finished' tab.

Scheduled executions

This tab only contains automated scheduled imports with files transferred in the cloud, so ETL synchronizations that run at a defined time every day.

This lets you easily check when the next synchronization is going to run.

info

As soon as a daily execution has finished running, the execution of the next day is created.

This tab only displays the very next execution of a synchronization.

Click on the 'Stop' button to pause an ETL synchronization. This is useful if you have doubts about the file that was uploaded to the cloud location and need to double check, or if you temporarily need to stop processing synchronizations.

'Paused' ETLs are found in the 'Draft' tab, where they can be resumed.

View import definition

Click on the 'View import definition' button to get an overview of the different steps of the import, as defined during the creation of the ETL synchronization.

This is very useful to get information about what is expected at each step. You can get a similar breakdown in the execution details of 'Finished' executions, with an added status for each step (see 'Finished execution' for detailed explanations about each step).

Update frequency

When clicking on 'More', you have the possibility to click on 'Update frequency'.

You can update the frequency of your ETL so that it synchronizes:

Every day at HH:MM
Every week (one or several days) at HH:MM
Every month on the [number] at HH:MM

info

If you choose the frequency 'Every month', the chosen number will correspond to the day of each month when the execution should take place.

caution

If you select the frequency 'Every month' and you choose day 31, your ETL will not be synchronized every month, because not every month has 31 days.
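
To check in advance which months a given day number actually covers, here is a minimal sketch using Python's standard `calendar` module. The function name `months_with_day` is hypothetical, and the snippet only illustrates the calendar arithmetic, not how Actito schedules executions.

```python
import calendar


def months_with_day(year: int, day: int) -> list[int]:
    """Return the month numbers (1-12) of `year` that contain the given day."""
    return [
        month
        for month in range(1, 13)
        # monthrange returns (weekday of the first day, number of days in the month)
        if calendar.monthrange(year, month)[1] >= day
    ]


print(months_with_day(2023, 31))  # [1, 3, 5, 7, 8, 10, 12]
print(len(months_with_day(2023, 29)))  # 11 (February 2023 has 28 days)
```

With day 31, only seven months of the year would see an execution; days 1 to 28 are present in every month.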

If the frequency update you want to set is more complex, you can also use the 'Expert mode'.

This mode enables you to choose the CRON expression of your choice. The constraints for this expression are the same as the ones from the ETL definition through API.
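
As an illustration, the three simple frequencies can be written as CRON expressions. The sketch below assumes the six-field layout (seconds, minutes, hours, day of month, month, day of week) suggested by the example `0 00 16 * * ?` used later in this guide. The helper names are hypothetical, and the exact constraints accepted by Actito are those of the ETL definition API, which this snippet does not verify.

```python
# Hypothetical helpers: they only build strings in the six-field layout
# (seconds minutes hours day-of-month month day-of-week) assumed above.
WEEKDAYS = ("MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN")


def _check_time(hour: int, minute: int) -> None:
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("time must be between 00:00 and 23:59")


def every_day(hour: int, minute: int) -> str:
    _check_time(hour, minute)
    return f"0 {minute:02d} {hour} * * ?"


def every_week(days: list[str], hour: int, minute: int) -> str:
    _check_time(hour, minute)
    if not days or any(day not in WEEKDAYS for day in days):
        raise ValueError(f"days must be chosen among {WEEKDAYS}")
    # '?' in the day-of-month field means "no specific value"
    return f"0 {minute:02d} {hour} ? * {','.join(days)}"


def every_month(day_of_month: int, hour: int, minute: int) -> str:
    _check_time(hour, minute)
    if not 1 <= day_of_month <= 31:
        raise ValueError("day_of_month must be between 1 and 31")
    return f"0 {minute:02d} {hour} {day_of_month} * ?"


print(every_day(16, 0))                    # 0 00 16 * * ?
print(every_week(["MON", "THU"], 8, 30))   # 0 30 8 ? * MON,THU
print(every_month(1, 6, 15))               # 0 15 6 1 * ?
```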

info

By default, the frequency set on the ETL definition is displayed. If its CRON expression does not correspond to a simple 'every day', 'every week' or 'every month' frequency, the expert mode is displayed.

Update report recipients

When clicking on 'More', you also have the possibility to click on 'Update report recipients'.

This button will enable you to update the selected ETL's report recipient list. You will have the opportunity to:

Add a new recipient
Modify an existing recipient
Delete an existing recipient

It also allows you to empty an existing list, as filling a recipient list is not mandatory.

info

If the same email address is entered several times when you update the report recipients, the system automatically keeps only one occurrence.

The definition of the ETL is updated once you click on 'Validate'.

info

A newly added recipient will receive their first report after the next execution.

In progress executions

This tab contains imports that are currently running. While bigger files may take longer to integrate, you will only see data in this tab temporarily, just after the start of an execution.

This allows you to easily check if an import or a synchronization is still running and has not finished yet.

Admin and advanced users can directly retrieve the file that is being integrated into Actito with the 'Download input files' button.

images/download/attachments/667255520/image2023-5-17_16-53-16.png

Thanks to the 'View execution details' button, you can also get an overview of each step of the import. As the execution is ongoing, the final status of each step might not be available (see 'Finished execution' for detailed explanations about each step).

Finished executions

This tab contains all the imports that have finished running: this includes both past executions of automated synchronizations and one shot imports.

info

The 'Finished' executions tab keeps 15 days of history for automated imports (ETLs) and 5 days for one shot imports.

This allows you to check when an import has been completed and when all the data have been integrated into Actito. More importantly, you can check the 'Import result' column to confirm whether the import succeeded or fell into error.

tip

For scheduled imports, the technical name of the ETL synchronization is displayed in the Name column. You can use the 'Search' function to quickly find the executions of a specific ETL.

Check the 'Started on' date to find the execution on a specific day.

Admin and advanced users have the possibility to retrieve the file that was imported as well as the output files: depending on the result, it can be a result file, or an error file.

tip

The output files of ETL synchronizations are only generated if the generateErrorFiles and generateResultFiles parameters have been set to true in the definition.

The error file can help you find the issue with failed executions to let you correct them.

For ETL synchronizations that fell in error (possibly because the file was not available in the cloud), it is possible to make a RELAUNCH by API.

The relaunch requires knowing the original execution id. You can easily retrieve it by adding the "id" column in the top right corner.

images/download/thumbnails/667255520/image2023-5-19_11-22-41.png

Viewing the execution details

Click on the 'View execution details' button to get a detailed overview with the results of each step.

On the left panel, you can see the schedule, start and end date of the execution, the frequency that determines the schedule, the description of the synchronization and the list of recipients of the execution report.

tip

The frequency is defined through a CRON expression. It can be read more easily by looking at the 'Scheduled on' moment. In the example below, 0 00 16 * * ? translates to 'every day at 16:00'.

You can also see its global status, which can be:

SUCCESS: all the files were correctly retrieved and integrated into your license, without a single line encountering an error.
IN ERROR: the import encountered a global error and didn't go through, meaning that not a single line was correctly integrated. This is usually related to the absence or the format of the files.
WARNING: all (mandatory) files were correctly retrieved and they were partly integrated into Actito, but at least one line encountered an issue because it contained an invalid value.

images/data-imports/view-executions-details.png

Each of the five steps has its own status.

Click on one of the steps to see its details.

Input files transfer

In this step, you can see the remote location from which the file was retrieved (in the case of an ETL).

You also have details about the files expected in the definition of an ETL, such as:

The expected name pattern
The name of the file for a specific execution
Whether the file is mandatory for the execution of the import
Whether the file was present in this specific execution.

images/data-imports/input-files-transfer.png

This step will encounter an ERROR if the remote location (such as an SFTP or FTPS server) was not available at the time, if the file was not found on the location (including if the file for the date pattern of the execution was missing) or if the file found in a zipped archive was incorrect.

If a retry policy has been defined in the ETL, Actito will continue trying to retrieve the file at a fixed interval (defined by the minimumInterval parameter). If no file has been found after the period defined by the giveUpAfter parameter (max 8 hours), the execution will definitively fall in error.

tip

The ETL falls in error as soon as the 'give up after' time period has been reached, even if it doesn't coincide with a last attempt.

The number of attempts and the time of the last attempt are also displayed in the execution details of this step.
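
The timeline of a retry policy can be sketched as follows. This is only a simulation under stated assumptions: the first attempt is taken to happen at the scheduled time, attempts are spaced by exactly `minimumInterval`, and the function name `retry_timeline` is hypothetical. Actito's actual timing may differ.

```python
from datetime import datetime, timedelta


def retry_timeline(scheduled_on: datetime,
                   minimum_interval: timedelta,
                   give_up_after: timedelta) -> tuple[list[datetime], datetime]:
    """Return the assumed attempt times and the moment the execution falls in error."""
    if give_up_after > timedelta(hours=8):
        raise ValueError("giveUpAfter cannot exceed 8 hours")
    error_at = scheduled_on + give_up_after
    attempts = []
    attempt = scheduled_on
    # Attempts stop once the give-up deadline is reached, even if the
    # deadline falls between two attempts.
    while attempt < error_at:
        attempts.append(attempt)
        attempt += minimum_interval
    return attempts, error_at


attempts, error_at = retry_timeline(
    datetime(2023, 5, 17, 16, 0),
    minimum_interval=timedelta(minutes=45),
    give_up_after=timedelta(hours=2),
)
print([t.strftime("%H:%M") for t in attempts])  # ['16:00', '16:45', '17:30']
print(error_at.strftime("%H:%M"))               # 18:00
```

In this example the execution falls in error at 18:00, half an hour after the last attempt and before the attempt that would otherwise have happened at 18:15.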

tip

If a non-mandatory file is missing, this step will be marked as a SUCCESS and no retries will be attempted.

Files format validation

In this step, you can see the format of the files, as defined in the parameters of the ETL.

This includes:

The separator of the CSV file: while typically semi-colons, commas or tabulations are used, other characters can be defined as separators.
The encoding: the character set used by the file. It can be UTF-8 (default value) or ISO-8859-1.
The enclosing and escaping characters, used to escape data when the value contains the separator or enclosing character.
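
To check locally that a file matches these format parameters before dropping it on the cloud location, a minimal sketch with Python's standard `csv` module could look like this. The sample data, the column names and the `read_rows` helper are hypothetical; the default values simply mirror the parameters listed above.

```python
import csv
import io


def read_rows(data: bytes, separator: str = ";", encoding: str = "utf-8",
              enclosing: str = '"', escaping: str = "\\") -> list[dict]:
    """Decode a CSV file and parse it with the given format parameters."""
    text = data.decode(encoding)  # raises UnicodeDecodeError on a wrong encoding
    reader = csv.DictReader(
        io.StringIO(text, newline=""),
        delimiter=separator,
        quotechar=enclosing,
        escapechar=escaping,
    )
    return list(reader)


# The third value contains the separator, so it is wrapped in the enclosing character.
utf8_file = 'email;firstName;comment\n"ann@example.com";"Ann";"likes tea; hates coffee"\n'.encode("utf-8")
rows = read_rows(utf8_file)
print(rows[0]["comment"])  # likes tea; hates coffee

# The same helper reads an ISO-8859-1 file when the encoding is stated explicitly.
latin1_file = 'firstName\n"Zoé"\n'.encode("iso-8859-1")
print(read_rows(latin1_file, encoding="iso-8859-1")[0]["firstName"])  # Zoé
```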

Data transformations

This step gives you an overview of the transformations applied to the data.

It can only encounter errors if the value of the input does not match the value defined in the transformation.

images/data-imports/data-transformations.png

The 'data transformations' step is only present in ETL synchronization where transformations have been defined. It will always be greyed out for manual or bulk API imports.

Data loadings

This step is the most important one in the import: the actual writing of data in the license. In the case of a multi-file ETL, you will have a status for each file.

You can first see the definition of the step:

Click on the 'Mapping' icon to see the mapping between the headers of the input file and the name of the attributes in the table. You can also see the behaviour in case of empty, existing or invalid values, as well as multi-value attributes.
The 'Parameters' toggle allows you to see the Writing mode (CREATE, UPDATE, CREATE/UPDATE, DELETE), and whether error and result files will be generated for this step.

The integration results give you information about the number of lines integrated into the table.

The number of lines 'read' is the number of lines found in the files.
The number of 'rejected' lines is the number of lines that contain an invalid value for the corresponding attribute (such as an invalid e-mail address, an invalid language code, ...). If there is at least 1 rejected line, the global status of the import will be WARNING. You can download the error file to check the validation errors (provided that the generateErrorFiles parameter has been set to true).
The number of 'inserted' lines is the number of lines that did not exist in Actito and that were created by the import.
The number of 'updated' lines is the number of lines that already existed in Actito but for which a modification of data was found in the file. If you import a line that is identical to an already existing one, it will not count as an update. Hence, the sum of rejected + inserted + updated lines may be lower than the number of lines read, because some lines already existed without any update to their values.
The number of 'deleted' lines is only applicable for DELETE type ETLs, which can only delete data without creating any new record.
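
The relationship between these counters can be written as simple arithmetic. The sketch below applies to imports that create or update data; the `LoadingResult` class and its figures are hypothetical and only restate the rules described above.

```python
from dataclasses import dataclass


@dataclass
class LoadingResult:
    read: int
    rejected: int
    inserted: int
    updated: int

    @property
    def unchanged(self) -> int:
        """Lines that already existed with identical values (not counted as updates)."""
        return self.read - (self.rejected + self.inserted + self.updated)

    @property
    def has_warning(self) -> bool:
        """At least one rejected line puts the import in WARNING."""
        return self.rejected >= 1


result = LoadingResult(read=1000, rejected=12, inserted=300, updated=450)
print(result.unchanged)    # 238
print(result.has_warning)  # True
```

Here 238 lines were read but neither rejected, inserted nor updated: they matched existing records exactly.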

Output files transfer

The details of this step give you information about the possible output files generated, including:

The remote location (FTPS, SFTP, Transferbox) on which the files have been dropped.
The name of the files.

The output files of ETL synchronizations are only generated if the generateErrorFiles and generateResultFiles parameters have been set to true in the definition.

tip

The execution details appear in a side panel. Click on the cross in the top left corner of your screen to exit it.

Retrieving output files

Admin and advanced users have the possibility to retrieve the file that was imported as well as the output files: depending on the result, it can be a result file, or an error file.

The output files of ETL synchronizations are only generated if the generateErrorFiles and generateResultFiles parameters have been set to true in the definition.

The error file can help you find the issue with failed executions to let you correct them.

It contains a replica of each failed row, but with 2 extra columns:

"errorCode": This is the error code, which details the reason for failure.
"errorColumn": This indicates which column contains the error.

If several columns fell into error for the same row, this row will be repeated once for each error.

The possible error codes are:

"INVALID_FIELD_VALUE": The row value for the field indicated in the 'errorColumn' column is not valid, because the format is not compatible.
"DATA_ALREADY_EXISTS": The error occurs in 'createOnly' mode when a row of the import file refers to a business key that already exists in the table.
"UNKNOWN_DATA": The error occurs in 'updateOnly' mode when a row of the import file refers to a business key that does not exist in the table.
"DUPLICATE_OBJECT": The error occurs because the new record contains an existing value for a unique attribute which is not the business key.
"MISSING_FIELD_VALUE": The error occurs because the value for a mandatory attribute is missing.
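
To get a quick overview of a large error file, you can count the errors per code and per column. The sketch below is a minimal example: the sample content, the semicolon separator and the position of the two extra columns are assumptions made for illustration, and `summarize_errors` is a hypothetical helper. Only the "errorCode" and "errorColumn" headers come from the description above.

```python
import csv
import io
from collections import Counter

# Hypothetical error file: the second input row failed on two columns,
# so it appears twice, once per error.
sample_error_file = (
    "email;language;errorCode;errorColumn\n"
    "not-an-email;XX;INVALID_FIELD_VALUE;email\n"
    "not-an-email;XX;INVALID_FIELD_VALUE;language\n"
    "bob@example.com;;MISSING_FIELD_VALUE;language\n"
)


def summarize_errors(error_file_text: str, separator: str = ";") -> tuple[Counter, Counter]:
    """Count the error lines per error code and per column in error."""
    reader = csv.DictReader(io.StringIO(error_file_text), delimiter=separator)
    by_code, by_column = Counter(), Counter()
    for row in reader:
        by_code[row["errorCode"]] += 1
        by_column[row["errorColumn"]] += 1
    return by_code, by_column


by_code, by_column = summarize_errors(sample_error_file)
print(dict(by_code))    # {'INVALID_FIELD_VALUE': 2, 'MISSING_FIELD_VALUE': 1}
print(dict(by_column))  # {'email': 1, 'language': 2}
```

Because a row is repeated once per error, these counts reflect the number of errors, not the number of distinct rejected rows.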

Viewing the report

Click on a 'View report' button to access a copy of the execution report, identical to the one received by e-mail by the recipients defined in the parameters of the import.

It also displays the list of recipients.

tip

The execution report appears in a side panel. Click on the cross in the top left corner of your screen to exit it.
