How to Build a Future Events Pipeline
Vidora Cortex is an easy-to-use platform that enables anyone to automate Machine Learning Pipelines from continuous streams of event data. In this guide, we’ll show you how to predict the future behavior of your users using Future Events pipelines in Cortex.
What are Future Events pipelines?
Predictions from a Future Events pipeline answer the question: how likely is each user to perform a certain action, or set of actions, in a specified time range? The actions available to predict come directly from the data being ingested into Cortex. Every event action that is being tracked and sent to Cortex can be used to power a Future Events pipeline prediction.
Note that while we’ll be using the example of predicting future user behavior, your Cortex account can be configured to make predictions about any type of object tied to your event data (e.g. commerce items, media content, home listings, etc.).
When Should I Use Future Events pipelines?
The key feature of Future Events pipelines is their ability to predict event-based outcomes for each object in the future. This differs from Cortex’s other pipelines, which instead predict attributes of those objects today. Any time you are trying to predict the future behavior of a user, whether over the next few days or the next few months, use a Future Events pipeline.
The following diagram will help explain which pipeline type is best suited for different predictions.
What are Examples of Future Events pipelines?
- What is the probability that each user purchases from category “shoes” within 7 days?
- What is the probability that each user will take any action within the next 30 days?
- What is the probability each Visitor to our site will Subscribe?
- What is the probability a Lapsed User will Resubscribe?
How do I build these pipelines in Cortex?
Step 1: Choose Pipeline Type
Select ‘Create New Pipeline’ from within your Cortex account, and choose the Future Events pipeline type.
Step 2: Define Events
In this step, you select the event, or events, that you want to use as the basis of your prediction. The events and event conditions available to choose from are based on the data set that has been ingested into Cortex.
Future Events pipelines are used to predict the probability that some event happens in the future for each of your data points. Defining that event means providing four pieces of information to Cortex:
(A) happens (B) or more times where (C) within (D) days.
- (A) Event Type: The type of event that Cortex should predict. These event types are based on the data being ingested into Cortex.
- (B) Frequency: The minimum number of times the event occurs. The default value is “1”, i.e. you are predicting that a user will take this action at least once in the specified time range. If you want to predict whether users will perform an event multiple times (e.g. how likely each user is to make 5 purchases), increase the event frequency.
- (C) Event Conditions: Conditions (if any) under which the event occurs. These event conditions are based on the data being ingested into Cortex.
- (D) Prediction Window: The future window of time over which the event occurs. Common values here are 7 and 30 days if you are looking to predict behavior over the next week or month.
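To make the four-part template concrete, here is a minimal Python sketch that stores (A) through (D) and renders them as the sentence shown above. The class and field names (`EventClause`, `EventDefinition`, `min_count`, `window_days`) are hypothetical and chosen only for illustration. Events are defined in the Cortex interface, and this is not a Cortex API.

```python
from dataclasses import dataclass, field


@dataclass
class EventClause:
    """Hypothetical container for parts (A), (B), and (C) of an event."""
    event_type: str                                  # (A) Event Type
    min_count: int = 1                               # (B) Frequency, defaults to 1
    conditions: dict = field(default_factory=dict)   # (C) attribute -> required value

    def describe(self) -> str:
        text = f"{self.event_type} happens {self.min_count} or more times"
        if self.conditions:
            conds = " and ".join(
                f"{name} is equal to {value}"
                for name, value in self.conditions.items()
            )
            text += f" where {conds}"
        return text


@dataclass
class EventDefinition:
    """Hypothetical container pairing a clause with (D), the prediction window."""
    clause: EventClause
    window_days: int                                 # (D) Prediction Window

    def describe(self) -> str:
        return f"{self.clause.describe()} within {self.window_days} days"


shoes = EventDefinition(
    EventClause("Purchase", min_count=1, conditions={"Category": "Shoes"}),
    window_days=7,
)
print(shoes.describe())
# Purchase happens 1 or more times where Category is equal to Shoes within 7 days
```

Leaving `conditions` empty drops the “where” part of the sentence, which mirrors an event defined with no conditions.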
You can build a Future Events pipeline to predict any event that you can describe in this way using your data. Add any number of conditions to an event, or chain any number of distinct events together using AND/OR/IF operators. No matter how complex, your events will always read like an English sentence. Here are a few examples based on e-commerce use cases:
- “Purchase happens 1 or more times OR Rent happens 1 or more times within 7 days”. This pipeline predicts the probability that each user will either purchase or rent within the next week. The OR clause may be useful if you plan to use the predictions to power a marketing newsletter which promotes movies available for both purchase and rental.
- “Purchase happens 1 or more times where Category is equal to Shoes AND Purchase happens 1 or more times where Category is equal to Pants within 14 days”. This pipeline predicts the probability that each user will purchase both shoes and pants within the next two weeks. The AND clause may be useful if you plan to use the predictions to power a marketing newsletter which promotes both shoes and pants.
- “Purchase happens 1 or more times IF Login were to also happen 1 or more times within 7 days”. This pipeline predicts the probability that each user will purchase within the next week, assuming that the user first logs on and is active within that week. The IF clause may be useful if you plan to use the predictions to dynamically vary each user’s onsite experience. You don’t want the prediction about a user’s likelihood of purchasing to reflect their likelihood of being active in the first place, since the user is already active by the time you’re deciding which experience to show. Read this post for more info about these types of conditional goals.
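For intuition about how the OR, AND, and IF operators differ, the sketch below evaluates each of the three examples against a small, made-up list of historical events. Everything here is an assumption for illustration: the event tuples, the `happened` helper, and in particular the reading of IF as “only consider users who met the condition”. It shows one plausible way to think about the outcomes, not how Cortex computes them internally.

```python
from datetime import datetime, timedelta

# Hypothetical event records: (user_id, event_type, attributes, timestamp)
events = [
    ("u1", "Login", {}, datetime(2024, 1, 2)),
    ("u1", "Purchase", {"Category": "Shoes"}, datetime(2024, 1, 3)),
    ("u2", "Rent", {}, datetime(2024, 1, 5)),
    ("u3", "Login", {}, datetime(2024, 1, 4)),
    ("u4", "Purchase", {"Category": "Pants"}, datetime(2024, 1, 20)),
]


def happened(user, event_type, start, window_days, min_count=1, conditions=None):
    """True if `user` performed `event_type` at least `min_count` times,
    matching all `conditions`, within `window_days` days of `start`."""
    conditions = conditions or {}
    end = start + timedelta(days=window_days)
    count = sum(
        1
        for uid, etype, attrs, ts in events
        if uid == user
        and etype == event_type
        and start <= ts < end
        and all(attrs.get(k) == v for k, v in conditions.items())
    )
    return count >= min_count


start = datetime(2024, 1, 1)
users = ["u1", "u2", "u3", "u4"]

# OR: either event within 7 days counts as a positive outcome.
or_labels = {
    u: happened(u, "Purchase", start, 7) or happened(u, "Rent", start, 7)
    for u in users
}

# AND: both events must occur within 14 days.
and_labels = {
    u: happened(u, "Purchase", start, 14, conditions={"Category": "Shoes"})
    and happened(u, "Purchase", start, 14, conditions={"Category": "Pants"})
    for u in users
}

# IF (assumed reading): keep only users who logged in during the window,
# then record whether they purchased.
if_labels = {
    u: happened(u, "Purchase", start, 7)
    for u in users
    if happened(u, "Login", start, 7)
}

print(or_labels)   # {'u1': True, 'u2': True, 'u3': False, 'u4': False}
print(and_labels)  # {'u1': False, 'u2': False, 'u3': False, 'u4': False}
print(if_labels)   # {'u1': True, 'u3': False}
```

Under this reading, users who never logged in (u2 and u4) drop out of the IF outcome entirely, so the result reflects purchasing among active users rather than the likelihood of being active.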
Step 3: Define Groups
Sometimes you want a prediction for every user in your system, while other times you want a prediction only for a certain group of users. By default, Future Events pipelines are set up to make a prediction for All Users. If you want to predict for All Users, simply click Next to proceed to the next step.
If you are looking to limit the prediction group, you can either include a specific group or exclude a specific group. Like events in step 2, the user attribute values available come directly from the data being ingested by Cortex.
Note that specifying an Include or Exclude group also limits model training to only those users, ensuring that the behavior of users outside of the group doesn’t affect the model.
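The effect of an Include or Exclude group can be pictured as a filter over user attributes that is applied before both training and prediction. The sketch below is a rough illustration only. The `select_group` function and the `plan` and `country` attributes are hypothetical, and in Cortex the available attributes come from your own ingested data.

```python
# Hypothetical user records with made-up attributes.
users = [
    {"user_id": "u1", "plan": "free", "country": "US"},
    {"user_id": "u2", "plan": "paid", "country": "US"},
    {"user_id": "u3", "plan": "free", "country": "CA"},
]


def select_group(users, include=None, exclude=None):
    """Return the users a pipeline would train on and predict for.

    With no include/exclude, every user is kept (the "All Users" default).
    `include` keeps only users matching all given attribute values;
    `exclude` drops users matching all given attribute values.
    """
    selected = []
    for user in users:
        if include and not all(user.get(k) == v for k, v in include.items()):
            continue
        if exclude and all(user.get(k) == v for k, v in exclude.items()):
            continue
        selected.append(user)
    return selected


all_users = select_group(users)
free_only = select_group(users, include={"plan": "free"})
non_paid = select_group(users, exclude={"plan": "paid"})

print([u["user_id"] for u in free_only])  # ['u1', 'u3']
```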
Step 4: Specify Settings
Specify settings such as your pipeline’s name, schedule, tags, and more. Finally, review that everything looks correct and begin training your pipeline!
Every time your pipeline runs, Cortex goes through the end-to-end process of generating fresh predictions from the latest data that’s been ingested. If you’d like to power automation based on predictions that are always up-to-date, make sure your pipeline is set to run repeatedly. If you’re just testing things out or building a pipeline for one-time use, your pipeline should only run once.
Note that within More Options you can also create Custom Features, though this is not typically necessary.
Step 5: Review
The final step is to review your pipeline and ensure all settings look accurate. If anything needs to be updated, simply go ‘Back’ in the workflow and update the relevant step.
Continuous Training Over Time
Since Cortex receives and ingests data continuously over time, Future Events pipelines can be set up to automatically re-run with new data at a set frequency (including both re-training and re-predicting). Many pipelines are set to run daily or weekly, while others may only need to be run manually on an ad-hoc basis.
- Future Events Performance
- How to Build a Look Alike Pipeline
- How to Build a Classification Pipeline
- How to Build a Regression Pipeline
Still have questions? Reach out to email@example.com for more info!