Lattice Atlas allows you to track all the data manipulations you perform on your tenant on the Data Processing and Analysis tab of the Jobs page. Every task you submit is listed on this page. Each task is an Action, and related tasks are sometimes logically grouped into a single Action. All the Actions are grouped together into one job, called the Process and Analyze job. Hence, you will need to keep two levels in mind: Actions and the Job.
Each Job kicks off a workflow behind the scenes that is divided into four separate steps. Only after all the steps in the workflow complete successfully do the Actions you took come into effect and your data is refreshed. You can access this page by clicking the Jobs icon.
Job and Actions
Let us look at the different components of the Data Processing and Analysis Job. We will reference the picture below to understand each part of the Job:
- The number of Actions that will be processed in a single job. This block also indicates whether any Actions were unsuccessful; in that case, you will see a Partial Success. The job will not fail; it will process the successful Actions when kicked off.
- The picture shows a Job in the Ready state. There are five different states:
- Ready, meaning the job is ready to be kicked off, either by its schedule or by clicking the Run Now button
- Running, meaning the data processing is in progress
- Blocked, meaning the job cannot be run because there is a job already running. You will have to wait until the running job completes
- Successful, meaning the data was successfully refreshed
- Failed, meaning one of the workflow steps failed, causing the job to fail. When a job fails, all the Actions in the job also fail
Callout: Certain actions, such as Segment Edits, Attribute Activation and Curated Attribute configuration, are saved and will be picked up automatically in the next job. Actions that involve importing and deleting data will be reverted, and you will need to repeat them in the next job.
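The five states above can be summarized as a small state model. The sketch below is purely illustrative; Lattice Atlas does not expose such an API, and the names here are assumptions:

```python
from enum import Enum

class JobState(Enum):
    READY = "Ready"            # can be kicked off by schedule or Run Now
    RUNNING = "Running"        # data processing is in progress
    BLOCKED = "Blocked"        # another job is already running
    SUCCESSFUL = "Successful"  # data was successfully refreshed
    FAILED = "Failed"          # a workflow step failed, so the job failed

def can_run_now(state: JobState, another_job_running: bool) -> bool:
    """A job can only be kicked off from Ready, and only while no other
    job is running (otherwise it would be Blocked)."""
    return state is JobState.READY and not another_job_running
```

For example, clicking Run Now while another job is in progress corresponds to `can_run_now(JobState.READY, True)` returning `False`.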
- The Run Now button allows you to manually kick off the job.
- Example Action showing Attribute Activation.
- Example Action showing manual scoring kickoff.
- Example Action showing the configuration of a curated attribute.
- Example Action showing Import Data.
- Example Action showing segments edited.
- A column showing whether the Action was successful, partially successful or failed.
- A table showing the user who took the Action and file-upload statistics if the Action was file-related, such as an Import.
The work done in the job is divided into four steps. The picture below shows an example of a completed job (expanded view). You can see the four steps in the sequence in which they run.
Merging, De-duping & Matching
Several things happen in this step. First, all the Import and Delete Actions are merged together.
Callout: Delete will always run before any Import, even if they are part of the same job, regardless of the sequence in which you provide them.
If there are multiple Import Actions on the same entity, such as Accounts, the list is first de-duped using the dedupe key provided. Finally, the imported data is matched to the Lattice Data Cloud.
Analyzing
In this step, all the Actions are used to bring the changes into effect on all the relevant objects in your tenant. For example, if the Action was an Attribute Activation, this step analyzes the attribute, generates buckets and makes the new attribute visible on the My Data page. As another example, if you loaded new Accounts, those Accounts are distributed into their respective segments, segment counts are re-calculated and Plays are updated.
Publishing
In this step, all the data generated in the previous steps is loaded into the database in the cloud.
Scoring
In this step, all the new Accounts get Ratings and Scores. If the job changes more than 30% of the data, all the historical data is also re-scored.
Callout: Keep in mind that there is a Score Segment feature that can be used to manually trigger a scoring job; that job runs regardless of this Scoring step.
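The 30% re-score rule above can be expressed as a simple check. The function name and signature below are assumptions for illustration; only the threshold comes from the behavior described in this step:

```python
def needs_full_rescore(changed_records: int, total_records: int,
                       threshold: float = 0.30) -> bool:
    """Re-score all historical data when a job changes more than 30% of it.
    New Accounts are always scored; this only decides the historical re-score."""
    if total_records == 0:
        return False
    return changed_records / total_records > threshold
```

So a job that updates 31,000 of 100,000 accounts would trigger a full historical re-score, while one that updates 30,000 would score only the new records.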
What are some best practices to speed up loading data into the platform?
- Incremental updates to accounts and contacts are faster than a full refresh, which should be avoided as much as possible. For example, if you have 100,000 accounts and 100 new accounts to add, providing only the 100 new accounts is much faster than uploading the entire data set.
- Combine as many actions as possible in a single job. The platform is designed to process large volumes of data, and grouping these actions in a single job is faster than splitting them across multiple jobs.
- Combine smaller files into larger files. If you have multiple small files containing account and contact data to be loaded, group them into larger files. This reduces the pre-processing and hence improves the overall time taken to load the data.
- Adding new columns takes more time than updating existing data. Try to group the actions of adding new attributes into a single job rather than spreading them over time.
- If you are adding new columns to capture transient information (e.g. account ratings from models managed outside of the platform), pre-create the columns ahead of time and populate them with actual values as the need arises. Pre-creating columns groups the expensive processing into a single job, improving the performance of future updates.
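The "combine smaller files" practice above can be done locally before you upload. Here is one minimal sketch, assuming CSV files that all share the same header row; the file-name pattern is hypothetical:

```python
import csv
import glob

def combine_csvs(pattern: str, out_path: str) -> None:
    """Concatenate all CSV files matching `pattern` into one larger file,
    writing the shared header row only once."""
    paths = sorted(glob.glob(pattern))
    with open(out_path, "w", newline="") as out:
        writer = None
        for path in paths:
            with open(path, newline="") as f:
                reader = csv.reader(f)
                header = next(reader)          # every file repeats the header
                if writer is None:
                    writer = csv.writer(out)
                    writer.writerow(header)    # keep only the first copy
                writer.writerows(reader)

# Example usage (hypothetical file names):
# combine_csvs("accounts_part*.csv", "accounts_combined.csv")
```

Uploading the single combined file means the platform pre-processes one file instead of many, which is the point of this best practice.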
Why should I run a job when I edit a segment or change a model?
Creating or updating a segment or a job
Are there any limits on executing Jobs?