
Sending Data into Vidora Cortex

Vidora Cortex is an easy-to-use platform that enables anyone to automate machine learning predictions from continuous streams of event data.

Events, defined by an object completing an action at a point in time, are collected by businesses in a wide variety of industries. Whether it’s commerce purchase behavior, media content consumption, or factory equipment failures, event data streams are critical for any business looking to form competitive moats from their unique first-party data.

Cortex extends the value of these datasets by making it easy to leverage them for machine learning. Traditionally, events are difficult to use in an ML context due to their raw form, massive scale, and continuously updating nature. Cortex vastly simplifies the equation by connecting all the hard parts of ML (including data wrangling and feature engineering) into end-to-end pipelines that anyone in your business can run.

This document provides an overview of the two types of datasets that Cortex supports, and how your business can import those datasets into the platform.

Events (required)
A stream of actions taken by or on uniquely identified objects at particular points in time.

Example: Customer A completes a purchase event on item B at time C.

Object Attributes (optional)
A record of relatively static attributes associated with the objects tied to your events.

Example #1: Customer A has job title X.
Example #2: Item B belongs to category Y.

 

Event Data (required)

What are events?

Events are the required building blocks of Cortex’s predictions. At minimum, Cortex expects each event to contain three* pieces of information –

  1. Object ID: A unique identifier for the object which completed the event (e.g. user ID). Your ML pipelines will generate predictions for each object ID contained in your events.
  2. Timestamp: Time at which the event was recorded (in Unix or ISO 8601 format).
  3. Type: Type of event completed (e.g. purchase).

*Optionally, your events can also include any other information that contextualizes the object completing the event, or the conditions of the event itself. The more information you include in your events, the wider the set of predictions you can make, and the more accurate those predictions are likely to be.

*NOTE: If you’d like to generate personalized item recommendations for each user, your events must also contain an item ID which can be tied to an item catalogue that is also sent to Cortex (see the below section on object attributes).
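Before sending events, it can help to check locally that each one carries the three required pieces of information. The following is a minimal sketch, not part of Cortex itself: the field names (user_id, timestamp, type) are assumed from the examples in this document, and the helper names are hypothetical.

```python
from datetime import datetime, timezone

# Field names assumed from the sample events in this document.
REQUIRED_FIELDS = ("user_id", "timestamp", "type")


def parse_timestamp(value):
    """Return a datetime for a Unix timestamp (in seconds) or an ISO 8601 string."""
    if isinstance(value, bool):
        raise ValueError("timestamp must be a number or a string")
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    if isinstance(value, str):
        text = value.strip()
        if text.isdigit():
            # Unix seconds stored as text, e.g. when read from a CSV file.
            return datetime.fromtimestamp(int(text), tz=timezone.utc)
        if text.endswith("Z"):
            # Older Python versions do not accept a trailing "Z".
            text = text[:-1] + "+00:00"
        return datetime.fromisoformat(text)
    raise ValueError("timestamp must be a number or a string")


def validate_event(event):
    """Return a list of problems; an empty list means the event looks usable."""
    problems = []
    for name in REQUIRED_FIELDS:
        if event.get(name) in (None, ""):
            problems.append(f"missing field: {name}")
    if event.get("timestamp") not in (None, ""):
        try:
            parse_timestamp(event["timestamp"])
        except (ValueError, OverflowError, OSError):
            problems.append("unparseable timestamp")
    return problems
```

Calling validate_event on each record before an upload gives a quick list of what needs fixing, for example a missing type or a timestamp in an unexpected format.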

How do I send event data into Cortex?

1) Real-time APIs
Cortex offers a set of scalable APIs to stream live event data into Cortex as it occurs. These APIs can be deployed either server-side or client-side (including integrations with Google Tag Manager and Tealium), with or without authentication.

The below example shows a sample API POST which includes a customer purchase event. The event contains the three required fields, along with optional parameters which describe the user who completed the event, the item that was purchased, and details of the event itself.

POST https://a.vidora.com/v1/validate?api_key=<YOUR_KEY>
Content-Type: application/json

{"data":[{"user_id":"ABC","type":"purchase","timestamp":1563492745,"device":"mobile","user_geo":"usa","item_id":"XYZ","sale_price":49.99}]}
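For server-side use, the sample request above could be assembled in Python roughly as follows. This is a sketch using only the standard library: the URL and payload shape are copied from the sample, while the function names, the timeout, and the assumption that a plain HTTP status is all you need back are illustrative rather than documented behavior.

```python
import json
import urllib.request
from urllib.parse import urlencode

# URL taken from the sample request in this document.
ENDPOINT = "https://a.vidora.com/v1/validate"


def build_event_request(api_key, events):
    """Build (but do not send) a POST request carrying a list of events."""
    url = f"{ENDPOINT}?{urlencode({'api_key': api_key})}"
    body = json.dumps({"data": events}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_events(api_key, events, timeout=10):
    """Send the request and return the HTTP status code (hypothetical helper)."""
    request = build_event_request(api_key, events)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from sending makes it easy to inspect the exact URL and body before any live traffic is generated.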

2) Batch File Uploads
Alternatively, transfer batches of event data into Cortex from your data lake or analytics vendor. To use this method, schedule a recurring file upload into a specified directory hosted by either you or Vidora (e.g. AWS S3 bucket). Your Cortex account will point to this directory and automatically ingest the event data contained inside any uploaded file (CSV or JSON preferred).

The below example shows the first few rows of a CSV file containing eCommerce events. Note the presence of the three required fields, plus four optional fields which provide additional context. Optional fields may be left empty for events where they do not apply.

user_id,type,timestamp,device,user_geo,item_id,sale_price
ABC,purchase,1563492745,mobile,usa,XYZ,49.99
DEF,click,1563492779,desktop,usa,,
GHI,add_to_cart,1563492801,tablet,usa,JKL,
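A file like the one above can be produced from in-memory event records with Python's csv module. The sketch below assumes the column names from the example and leaves missing optional fields empty; the function name is hypothetical, and transferring the resulting file to your upload directory is left out.

```python
import csv
import io

# Column order assumed from the example file in this document.
COLUMNS = ["user_id", "type", "timestamp", "device", "user_geo", "item_id", "sale_price"]


def events_to_csv(events):
    """Serialize event dicts to CSV text, leaving absent optional fields empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(events)
    return buffer.getvalue()
```

Because DictWriter rejects keys that are not listed in COLUMNS, an unexpected field raises an error instead of silently shifting columns.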

 

Attribute Data (optional)

What are object attributes?

Attribute data provides additional information about the objects tied to your events. These attributes can be included in the events themselves (recommended), but may also be split out into a separate data source. Cortex will then automatically join these attributes with your events based on a shared object ID that exists in both data streams.

Attribute data can include any descriptors of your objects. For example, if you are tracking customer purchase behavior via an events stream, you may choose to supplement that data with separate records of customer attributes (e.g. age, gender, loyalty status) and/or item attributes (e.g. category, price, brand).
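To make the idea of joining on a shared object ID concrete, here is a small illustration. It is not a description of how Cortex performs the join internally, which this document does not specify; the function name and the choice to let event fields take precedence are assumptions made for the example.

```python
def join_attributes(events, attributes, key="user_id"):
    """Left-join attribute records onto events using a shared object ID."""
    by_id = {record[key]: record for record in attributes}
    joined = []
    for event in events:
        # Start from the object's attributes, if any were supplied.
        merged = dict(by_id.get(event[key], {}))
        # Event fields win when the same name appears in both sources.
        merged.update(event)
        joined.append(merged)
    return joined
```

Events whose object ID has no matching attribute record pass through unchanged, so a missing attribute never drops an event.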

How do I send attribute data into Cortex?

The simplest way to import object attributes into Cortex is to include them as fields within your events (e.g. user_geo in previous examples). Alternatively, you can upload recurring files (CSV, JSON, or Parquet) containing up-to-date records of each object’s attributes.

The below example shows a CSV file containing records of each user’s geographic region, subscriber status, gender, and age.

user_id,geo,subscriber_status,gender,age
ABC,usa,free,male,27
DEF,usa,premium,female,41
GHI,usa,plus,female,53

 

The below example shows a JSON record containing attributes for a particular item.

{
  "item_id": "XYZ",
  "price": 49.99,
  "categories": ["cat1", "cat2", "cat3"]
}


Still have questions? Reach out to support@vidora.com for more info!
