Datahub - Automatic imports

Working with customer relations tools such as Marketing Automation often requires large amounts of data, which must be fed, or imported, into the system. Users with administrative rights can import new contacts directly into Marketing Automation, but contacts are only one data type.

To address this need, Splio offers the automatic import feature. With automatic imports, you can feed Splio various kinds of data in an organized way. All you have to do is prepare files and upload them into a designated folder, from which an automatic script will pick them up and import the data for you.

Prerequisites

  • The automatic import feature must first be configured by your Project Manager from Splio.
  • You need the ability to open and save files using the CSV format and UTF-8 encoding.
  • You must be able to move, copy, and rename files locally.
  • An SFTP connection is required to upload import files.

Import steps

Follow these steps to make Splio import your data files automatically:

  • Prepare CSV files – follow the guidelines below to create and format files containing your data.
  • Upload files – connect to the SFTP repository and upload the files. Your Project Manager will provide you with connection details if you need them.
  • Check status – Splio processes imported files on schedule and sends alerts and results to the email addresses configured for this purpose. You can also examine the log files in the repository to learn more details.

Guidelines for data files

Follow the guidelines below when preparing data files for upload.

  • Each file must contain data belonging to a single scope.
  • Files must use the CSV format, with no multi-line data.
  • UTF-8 encoding without BOM must be used.
  • Filenames must indicate the universe, scope, subsection, and date.
  • Import priority depends on scopes and can be overridden by grouping files.
  • Optional identifiers can be used to tell your files apart.

These guidelines are explained in more detail below.

📘

To make handling easier, the import files can be uploaded as archives.

Scopes

Each import file must belong to a data type, or "scope", which Splio can recognize. All scope names are predefined, lowercase, and case-sensitive.

All available scopes are listed below, with short descriptions. Each scope name is a link you can click to open the corresponding guide.

  • contacts – individuals, customers, or contacts in your database;
  • stores – stores and POS;
  • products – products in the catalog;
  • abandonedcarts – abandoned carts, or the orders which were never finalized or paid;
  • orders – orders placed by customers;
  • ordersitems – individual products within orders and abandonedcarts (like individual items on an order).

Consult the dedicated guides to learn which data fields (columns) you need and can include for each scope.

Deletion scopes

Splio supports the deletion of records from the database using the automatic imports feature. Should you need to delete records, please refer to the deletion guide.

Loyalty scopes

There are a number of additional scopes used to import data for loyalty programs. Follow the links to learn more about the particular scopes:

  • cardcode – to manage individual membership in loyalty programs;
  • creditpoints – to credit (add) points to loyalty program members;
  • events – to associate events with individuals;
  • masterreward, earnreward, and burnreward – to create master reward entries, assign them to program members (earn), and allow them to claim (burn) the rewards.

File format

Import files use the Comma-Separated Values format, so they are usually referred to as CSV files or *.csv files. CSV files are text files in which the content is arranged into columns and rows, as in a table. Despite the format's name, columns in Splio import files are separated by semicolons (";"), and each line in the file is a row. CSV is a popular format that can be edited in any text editor and exported by most database and spreadsheet software.
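As an illustration of the format described above, here is a minimal sketch that produces semicolon-separated text with Python's standard csv module. The function names rows_to_csv_text and write_import_file are hypothetical helpers, not part of any Splio tooling, and the columns you pass in must be the ones valid for your scope.

```python
import csv
import io


def rows_to_csv_text(header, rows):
    """Render a header and rows as semicolon-separated text, one row per line."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter=";",               # columns are separated by semicolons
        quoting=csv.QUOTE_MINIMAL,   # quote values only when needed
        lineterminator="\n",
    )
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def write_import_file(path, header, rows):
    """Write the CSV text to disk as UTF-8 (the "utf-8" codec adds no BOM)."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(rows_to_csv_text(header, rows))
```

Values containing line breaks are not rejected by this sketch, so make sure your source data has none before writing.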

UTF-8 encoding

All import files must be saved using the UTF-8 encoding without BOM. This is a popular system which allows characters from different languages, e.g., French and Chinese, to co-exist in the same file.

Practically all modern text file editors and spreadsheet software can save UTF-8 encoded CSV files, and UTF-8 has become a default in many systems.

📘

The byte order mark (BOM) is a special character that some editors add at the beginning of a file. Make sure your import files do not contain it.
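If you are unsure whether a file meets the encoding requirement, a quick local check can help. The sketch below uses only the Python standard library; the helper names are illustrative, and the check is something you would run yourself before uploading, not a description of how Splio validates files.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the raw file content starts with the UTF-8 BOM."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the content without a leading UTF-8 BOM, if one is present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the content decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Read the file in binary mode (open(path, "rb")) so that the bytes are checked exactly as they would be uploaded.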

Header

The header is the first line of the CSV file. This line informs Splio how many columns there are in the data file and what each column represents.

The header must always indicate columns appropriate for the scope of the file and must contain the required columns. If you put a column name which is unknown in the scope or omit a required column, the file will not be imported.

Lines in the file

All lines in the file must contain the same number of columns as the header. Splio will skip all lines which fail to meet this requirement.

Each line in the data file represents a single item in the scope. Multiline data is not supported: it will cause Splio to skip all involved lines.

You have the option to enclose textual values in double quotes. However, quoting cannot be used to provide values spanning multiple lines.

To represent decimal numbers, use the dot (".") as the decimal separator or point. Do not use any separator for thousands. Use "-" to mark negative numbers and no sign for positive numbers.
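The line rules above can be pre-checked locally before upload. The following is a minimal sketch, assuming the whole file content is available as text; find_bad_lines is a hypothetical helper and only approximates the checks described here (column count and multi-line values), not Splio's actual validation.

```python
import csv


def find_bad_lines(text, delimiter=";"):
    """Return 1-based numbers of lines that would likely be skipped.

    A line is flagged when its column count differs from the header's,
    or when it has an odd number of double quotes, which suggests a
    quoted value that continues on another line.
    """
    lines = text.splitlines()
    header = next(csv.reader([lines[0]], delimiter=delimiter))
    bad = []
    for number, line in enumerate(lines[1:], start=2):
        unbalanced_quotes = line.count('"') % 2 != 0
        # Each physical line is parsed on its own: multi-line values are not allowed.
        row = next(csv.reader([line], delimiter=delimiter))
        if unbalanced_quotes or len(row) != len(header):
            bad.append(number)
    return bad
```

Parsing each physical line separately is a deliberate choice: it mirrors the rule that one line is one item, so a value broken across two lines flags both of them.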

Example

Mike Cole received a contacts file to import. The file is small, so he decides to take a look at it. He opens it in his text editor and looks at the first two lines:

email;firstname;lastname;cellphone;subscriptions 
"[email protected]";"Jean";"Boulanger";"33120202020";"-5" 

Mike reads the header, which contains 5 contacts columns. He recognizes all the columns and can tell they are correct.

He now goes to the second line. He counts 5 columns, which is good. The first is a text value in double quotes and is a properly constructed email address. The second and third are also text values in double quotes. Then comes the phone number, a numerical value, and finally a negative number, which is normal for a subscription column.

Using archives and compressed files

For convenience, you can pack your CSV files into archives to reduce size and decrease the number of uploaded files. Splio supports the following archive types, by filename extension: .zip, .tar, .gz, .bz2.

However, all files within archives need to be organized into groups. This is done by adding a special prefix to the filename according to the file sequencing rules described below. This allows Splio to process the files within the archive in a clear preset order.

Note that grouping/sequencing is part of the naming rules. Archived import files still need to follow all the rules for naming (below), format and encoding (above). On the other hand, there are no specific rules for naming the archives. Splio will examine and unpack them, then process the contents.
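Packing files into an archive can be scripted. Below is a minimal sketch using Python's standard zipfile module; build_archive is a hypothetical helper, and it assumes the files you pass in already carry a valid group prefix and follow the naming rules, because the sketch does not check them.

```python
import io
import zipfile


def build_archive(files):
    """Pack import files into a .zip archive and return its bytes.

    `files` maps each filename (already prefixed for its group) to the
    file's content as bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(files):
            archive.writestr(name, files[name])
    return buffer.getvalue()
```

The returned bytes can be written to a file with any name ending in .zip, since archive names themselves are not constrained.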

File naming guidelines and treatment order

When picking up files to import, Splio relies on a very specific naming convention. The way in which the filenames are constructed must be followed precisely, otherwise the import will either not start or fail mid-way. On the other hand, it is enough to follow the rules below to make sure that your import files are processed.

🚧

If you use the same file name for subsequent imports, only the last file will be kept in the archives directory. This means that the content of previous files will be lost and no information about it can be given.

Basic file naming

All files must be named according to the following scheme:

<universe>_<scope>_<subsection>_<date>.csv 

The chevrons surround items which need to be replaced by an actual value and which are obligatory.

There are four required parts, as explained below. The filename can also be extended with additional information at the beginning (called a prefix) and directly before the ".csv" extension (a suffix). Prefixes and suffixes are optional and are discussed in the sections below.

  • <universe> is the name of your Splio universe, and all files you intend to import automatically must be marked with it;
  • <scope> is one of the scopes described in the scopes section above;
  • <subsection> is used to identify a specific import or import settings, e.g., being the name of the source or allowing you to choose to update existing records or not. Email addresses to receive alerts and results are configured by your Project Manager for each subsection;
  • <date> is the date of the file, used to determine the chronological order of imports within each scope. The date must follow this format: "yyyymmdd", like "20180315".

For instance, in a Splio universe "myuniverse", a "toystore" subsection file containing contacts and bearing the date above would be named myuniverse_contacts_toystore_20180315.csv.
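The naming scheme is easy to generate programmatically. Here is a minimal sketch; basic_filename is a hypothetical helper, and it assumes the universe, scope, and subsection values contain no underscores of their own.

```python
import datetime


def basic_filename(universe, scope, subsection, date):
    """Build <universe>_<scope>_<subsection>_<date>.csv with a yyyymmdd date."""
    if scope != scope.lower():
        raise ValueError("scope names are lowercase and case-sensitive")
    return f"{universe}_{scope}_{subsection}_{date:%Y%m%d}.csv"


name = basic_filename("myuniverse", "contacts", "toystore", datetime.date(2018, 3, 15))
# name == "myuniverse_contacts_toystore_20180315.csv"
```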

Remember that, in order for the files to be imported, all scope and name pairs need to be defined in the configuration file. This file is managed with the help of your Project Manager.

File treatment order

Splio always follows the same order when importing files, one scope at a time, moving to the next scope only after all imports within the current one have finished. The files are taken up from the FTP/SFTP repository in the following order:

If you don't use Loyalty:

  1. Groups of files (see below)
  2. contacts
  3. stores
  4. products
  5. orders
  6. abandonedcarts
  7. ordersitems
  8. batch

If you use Loyalty, you can add afterwards:

  1. cardcode
  2. earnreward
  3. burnreward
  4. creditpoints
  5. deletecontacts
  6. events
  7. tierchange
  8. masterreward

Files within the same scope are ordered by the dates contained in the filenames, from the earliest dates to the latest.
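To predict the order in which a set of non-grouped files would be taken up, you can sort them by scope priority and then by date. The sketch below is one reading of the lists above, with the Loyalty scopes placed after the standard ones. The helper names are hypothetical, and filenames are assumed to follow the basic scheme with no prefix and no underscores inside the individual parts.

```python
SCOPE_ORDER = [
    "contacts", "stores", "products", "orders",
    "abandonedcarts", "ordersitems", "batch",
    # Loyalty scopes, assumed to follow the standard ones
    "cardcode", "earnreward", "burnreward", "creditpoints",
    "deletecontacts", "events", "tierchange", "masterreward",
]


def sort_key(filename):
    """Return (scope priority, date) for a basic import filename."""
    stem = filename.rsplit(".", 1)[0]
    universe, scope, subsection, date = stem.split("_")[:4]
    return (SCOPE_ORDER.index(scope), date)


def default_order(filenames):
    """Sort files by scope priority, then from the earliest date to the latest."""
    return sorted(filenames, key=sort_key)
```

Because the date uses the yyyymmdd format, comparing the date strings gives the same result as comparing the dates themselves.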

Advanced file naming: File sequencing with groups

While the default import order is fairly clear, it may become insufficient. You and other users from your organization will frequently need to import files in a specific order, e.g., contacts or stores before the sales data (orders and orders items) which rely on these contacts. If the orders are placed in the repository and are taken up for import before the relevant contacts are imported, Splio will skip all lines referring to the contacts which have not yet been imported.

This serious risk, however, can be easily avoided by employing groups. Groups give you control over the order in which files are imported.

Groups are created by adding a special prefix <group_id>.YY-ZZ_ at the beginning of the filename. This prefix consists of:

  • a group identifier, or <group_id>, followed by
  • a dot ".",
  • a 2-digit number of the current file in the sequence, represented as YY,
  • a dash "-",
  • a 2-digit number of the last file in the sequence, represented as ZZ, and
  • an underscore "_".

The <group_id> is a character string which is the name of the group. The YY and ZZ numbers can take values between 01 and 99. Files within a sequence must receive consecutive numbers, beginning with 01 and going up to ZZ.

📘

The ZZ value tells Splio how many files there are in the sequence. Do not omit any number within a sequence, or Splio will wait for the missing file.
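The prefix rules can be applied mechanically. The sketch below builds the prefix and numbers a list of basic filenames in the order given; group_prefix and grouped_names are hypothetical helpers written for illustration only.

```python
def group_prefix(group_id, current, last):
    """Build the <group_id>.YY-ZZ_ prefix for one file in a sequence."""
    if not 1 <= current <= last <= 99:
        raise ValueError("YY and ZZ must satisfy 01 <= YY <= ZZ <= 99")
    return f"{group_id}.{current:02d}-{last:02d}_"


def grouped_names(group_id, basic_names):
    """Prefix basic filenames so they are imported in the order given."""
    last = len(basic_names)
    return [
        group_prefix(group_id, position, last) + name
        for position, name in enumerate(basic_names, start=1)
    ]
```

Deriving ZZ from the length of the list means no number in the sequence can be skipped by accident.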

Names of grouped files with examples

The complete filename can be summarized as:

<group_id>.YY-ZZ_<universe>_<scope>_<subsection>_<date>.csv

As you can see, the part beginning with <universe> is the same basic filename you already know. Take a look at the examples below:

daily.01-03_myuniverse_contacts_toystore_20180315.csv 
daily.02-03_myuniverse_products_toystore_20180315.csv 
daily.03-03_myuniverse_orders_toystore_20180315.csv 
weekly.02-02_myuniverse_stores_toystore_20180315.csv 

There are two groups: "daily" is complete, with 3 files, "weekly" is still incomplete, awaiting a file with the weekly.01-02_ prefix to be imported.

File sequencing: How groups are processed

Splio attempts to import all groups (sequenced files) before importing any individual (non-sequenced) files. The order in which they are imported can be found above.

Groups are analyzed one by one, in alphabetical order according to group_id.

First, Splio checks if the group is complete, that is, if there is at least one file in the group for each number from 01 to ZZ. If one or more files are missing, Splio passes to the next group. The files forming the incomplete group are neither removed nor processed; they stay in place until the remaining file(s) are delivered.

When working with a group, Splio imports files one by one in the order dictated by the number within the group (YY). If more than one file with the same number is found, the one with the earlier date is imported first. This allows the automatic import to proceed even with files delivered at various time intervals. For instance, you may retain control over the order of import even if one of your coworkers delivers files three times a week, and another does so daily.
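The processing rules above can be simulated to check what would happen to a given set of grouped files. This is a sketch of the described behavior, not Splio's implementation: the regular expression and the group_import_order helper are assumptions, and group identifiers are assumed to contain no dots.

```python
import re
from itertools import groupby

NAME = re.compile(
    r"^(?P<group_id>[^.]+)\.(?P<current>\d{2})-(?P<last>\d{2})_"
    r"(?P<universe>[^_]+)_(?P<scope>[^_]+)_(?P<subsection>[^_]+)_(?P<date>\d{8})"
)


def group_import_order(filenames):
    """Return grouped files in import order, leaving out incomplete groups."""
    parsed = [match for match in map(NAME.match, filenames) if match]
    # Groups alphabetically, then sequence number, then earliest date first.
    parsed.sort(key=lambda m: (m["group_id"], m["current"], m["date"]))
    ordered = []
    for _, members in groupby(parsed, key=lambda m: m["group_id"]):
        members = list(members)
        present = {int(m["current"]) for m in members}
        last = int(members[0]["last"])
        if present >= set(range(1, last + 1)):  # every number 01..ZZ is present
            ordered.extend(m.string for m in members)
    return ordered
```

Files from incomplete groups are simply absent from the result, which corresponds to them waiting in the repository.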

🚧

Be aware that once the group import begins, files within the group are treated sequentially. If any sequenced file is rejected (e.g., an obligatory column is missing or, conversely, there is an illegal column), Splio will abort the import of the group and pass to the next one.

Advanced file naming: Optional identifier

When uploading multiple files daily, you may feel the need to additionally mark some files: add a comment, some identifier, or perhaps just a number. With a large amount of similarly named files, being able to distinguish them at a glance and find them more quickly can be invaluable.

To meet this need, Splio offers you the optional identifier in the form of an _<id> suffix added to filenames. The suffix is always added at the end of the filename, immediately before the .csv extension, just as prefixes are added at the beginning. It is composed of an underscore followed by alphanumeric characters (letters and digits). A filename including it can be summarized as follows:

<universe>_<scope>_<subsection>_<date>_<id>.csv 

All other components of this filename are explained above in the basic file naming section. The examples below show three filenames. Note how the optional suffix helps you tell them apart.

myuniverse_contacts_toystore_20180315_updates.csv 
myuniverse_contacts_toystore_20180315_noon.csv 
myuniverse_orders_toystore_20180228_daily.csv 

The optional identifier is not processed by Splio, that is, no special actions will be taken in response to it. However, Splio will keep this identifier intact when moving files to other directories. It will also mark the corresponding log files with it.
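Putting the naming rules together, a filename can be split back into its parts with a regular expression. The pattern below is an assumption based on the rules in this guide (for example, it presumes that the universe and subsection contain no underscores and that group identifiers contain neither dots nor underscores); parse_filename is a hypothetical helper.

```python
import re

FILENAME = re.compile(
    r"^(?:(?P<group_id>[^._]+)\.(?P<current>\d{2})-(?P<last>\d{2})_)?"  # optional group prefix
    r"(?P<universe>[^_]+)_(?P<scope>[a-z]+)_(?P<subsection>[^_]+)_(?P<date>\d{8})"
    r"(?:_(?P<id>[A-Za-z0-9]+))?"                                       # optional identifier
    r"\.csv$"
)


def parse_filename(name):
    """Return the filename parts as a dict, or None if the name does not match."""
    match = FILENAME.match(name)
    return match.groupdict() if match else None
```

Parts that are absent, such as the group prefix or the optional identifier, come back as None in the resulting dictionary.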

SFTP/FTP Repository at Splio

The repository you have seen in the graph at the beginning of the article is created as part of your Splio universe setup. Your contact at Splio will provide you with the address of your FTP repository, a user name, and a password. If you do not know how to configure an FTP client yourself, ask an IT specialist at your organization.

🚧

If, after logging in, you find that your repository is empty, contact your Project Manager or Splio support. Your auto import feature may require activation.

Directory structure

The FTP repository contains a set structure of directories used by the import automation.

Within the repository, you will find the "imports" directory. It contains all the other directories. This directory is the location where you need to place your configuration file as well as all files you want to be imported.

Once the import is completed, the import file is removed from the imports directory and, depending on the outcome of the import, placed in one of the subfolders. Then a report is sent to the email addresses indicated in the configuration file.

  • "archives" is the default location for processed files. If your import succeeded, your import file will be moved there.
  • "badlines" is where the files containing rejected lines go (e.g., lines from contacts files with neither email nor cellphone, from stores files without store_id, etc.).
  • "bogus" is the directory where Splio moves the files it considers illegal. If you forgot to include an obligatory column or inserted a non-existent one, this is where your CSV file will be moved.
  • "logs" is the location for the text files containing the monitoring data regarding your imports (such files are called log files or logs). Please refer to the logging section below.
  • "notprocessed" is where Splio puts files it did not process.
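If you list the repository with a script, the directory in which a file ends up tells you how the import went. The sketch below maps paths to outcomes; the outcome wording, the import_outcomes helper, and the assumption that the subfolders sit directly under "imports" are illustrative, and the names of files placed in "badlines" may differ from your original filenames.

```python
import posixpath

OUTCOME_BY_DIRECTORY = {
    "archives": "processed",
    "badlines": "some lines were rejected",
    "bogus": "file rejected as illegal",
    "notprocessed": "not processed",
}


def import_outcomes(remote_paths):
    """Map each file name found in the repository to a short outcome."""
    outcomes = {}
    for path in remote_paths:
        directory = posixpath.basename(posixpath.dirname(path))
        name = posixpath.basename(path)
        if directory == "imports":
            # Still in the upload location: not picked up yet.
            outcomes.setdefault(name, "waiting for import")
        elif directory in OUTCOME_BY_DIRECTORY:
            outcomes[name] = OUTCOME_BY_DIRECTORY[directory]
    return outcomes
```

Files in the "logs" directory are ignored here because they describe imports rather than being import files themselves.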

Logging

When performing the automatic import, Splio prepares special monitoring files (called log files, or logs for short) which list errors and events that occurred during import. A separate file is created for each .csv file in the /imports/logs/ directory.

The table below lists the messages you can find in a log file, with a short description of each message and, where available, suggestions for what can be done to improve the situation. The placeholders in angle brackets will be replaced with actual names and data in the log.

| Log message | Description |
| --- | --- |
| can't find universe "" | The filename or entry in the configuration file references a Splio universe which does not exist. Either the universe needs to be created or you should verify that you are using the correct universe name. |
| can't find import with id | The import has not been set up yet. |
| can't find ftp account with id | Splio is unable to pick up files from the FTP repository. The import has not been set up correctly or has yet to be configured. |
| can't open working directory "" | The directory is missing (e.g., has been deleted) or there is a system error. |
| can't open file "<workdir/filename>" | The file is missing or corrupted: try to validate the import file and upload it again. This log message may also indicate a system error. |
| invalid JSON in "_imports_config.json" | If you see this error in your log, please report it to a Splio operative as soon as possible. |
| no configuration available for imports | Your configuration file does not appear to contain a configuration for this scope. If you see this error in your log, please report it to a Splio operative as soon as possible. |
| can't find config for scope import named "" | If you see this error in your log, please report it to a Splio operative as soon as possible. |
| missing mandatory configuration value "" | If you see this error in your log, please report it to a Splio operative as soon as possible. |
| wrong format for configuration value "<$field>" | If you see this error in your log, please report it to a Splio operative as soon as possible. |
| can't parse file header | This error usually means that the encoding of the import file is incorrect. Make sure that your file is encoded in UTF-8 and upload it again. |
| (:) can't parse line | Indicates a problem with the specific line of the file. You should be able to open the file and check the indicated line. |
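When many log files need to be reviewed, the messages can be matched against their fixed text to suggest a next step. This is only a sketch: the exact layout of a log line is not specified in this guide, so the code assumes that each message appears somewhere within a line, and advice_for is a hypothetical helper whose advice texts paraphrase the descriptions above.

```python
REPORT = "Report this error to a Splio operative as soon as possible."

ADVICE = [
    ("can't find universe", "Check the universe name in the filename and configuration."),
    ("can't find import with id", "The import has not been set up yet."),
    ("can't find ftp account", "The import is not configured correctly or not yet configured."),
    ("can't open working directory", "The directory is missing or there is a system error."),
    ("can't open file", "Validate the import file and upload it again."),
    ("invalid JSON", REPORT),
    ("no configuration available", REPORT),
    ("can't find config for scope", REPORT),
    ("missing mandatory configuration value", REPORT),
    ("wrong format for configuration value", REPORT),
    ("can't parse file header", "Make sure the file is encoded in UTF-8 and upload it again."),
    ("can't parse line", "Open the file and check the indicated line."),
]


def advice_for(log_line):
    """Return a suggested action for a log line, or None if it is not recognized."""
    for fragment, advice in ADVICE:
        if fragment in log_line:
            return advice
    return None
```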