Pipelines API

Cortex’s Pipelines API provides programmatic access to any Machine Learning Pipeline built in your Cortex account. The API allows you to build automated workflows for managing, updating, and deploying your ML Pipelines. This means smoother integrations between Cortex and your business.

The following sections describe the various requests that may be made to the Pipelines API. Each description includes the optional and required parameters that should be submitted with the request.

All API requests should be made to api.vidora.com, using the methods listed below. Note that authentication is required for each of these methods.

Manage your ML Pipelines

The below methods allow you to list, filter, and fetch details for the ML pipelines that have been built in your Cortex account. This makes it easy for you to manage up to hundreds of unique pipelines.

List ML Pipelines

Returns an array of pipelines built in your Cortex account, ordered by how recently each pipeline was run. Optionally, the response can be filtered by the parameters listed below.

Request

Method URL
GET /v1/api/pipelines

Parameters

Param Required? Type Description
recurring No Boolean Whether or not the pipeline is set to run on a recurring schedule.
active No Boolean Whether or not the pipeline is currently active.

Example Request

http://api.vidora.com/v1/api/pipelines?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Example Response

[
  { 
    id: "f90819314266a344",                      # Unique ID for the pipeline
    name: "Sample Classification Pipeline",      # Name of the pipeline
    type: "Classification",                      # Pipeline type
    active: true,                                # Whether the pipeline is active
    recurring: true,                             # Whether it runs on recurring schedule
    status: "complete",                          # Status ("running|complete|failed")
    created_at: "2020-04-05T00:00:00",           # When the pipeline was created
    last_trained_at: "2020-04-05T05:00:00",      # When the pipeline last trained
    last_prediction_at: "2020-04-05T05:00:00"    # When the pipeline last made predictions
  },
  { 
    id: "f8d544d30e43a550", 
    name: "Sample Look Alike Pipeline",
    type: "Look Alike",
    active: true,
    recurring: true,
    status: "running", 
    created_at: "2020-04-05T00:00:00",
    last_trained_at: "2020-04-05T05:00:00",
    last_prediction_at: "2020-04-05T05:00:00"
  }
]

The above example shows a signed GET request to return all pipelines. The response indicates that there are two pipelines in the account, and includes details for each pipeline such as id, name, type, and more.
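As a rough illustration, the signed URL for this request could be assembled as follows. This is a minimal sketch: `generate_signature` follows the Ruby upload example later in this document, while `list_pipelines_url` is a hypothetical helper name. It assumes that GET requests are signed with an empty body and that filter parameters are part of the signed URL params.

```ruby
require "digest/sha2"
require "base64"
require "uri"

# Signature scheme taken from the Ruby label-upload example in this document.
def generate_signature(secret, http_method, request_path, params = {}, body = nil)
  string_to_sign = [
    secret,
    http_method,
    request_path,
    params.sort_by { |key, _| key.to_s }.map { |k, v| "#{k}=#{v}" }.join("&"),
    body,
  ].join("\n")
  Base64.strict_encode64(Digest::SHA256.digest(string_to_sign))[0, 43].chomp("=")
end

# Hypothetical helper: builds a signed URL for GET /v1/api/pipelines.
# Assumption: optional filters are signed together with api_key and expires.
def list_pipelines_url(api_key, secret, expires, filters = {})
  path = "/v1/api/pipelines"
  params = { api_key: api_key, expires: expires }.merge(filters)
  params[:signature] = generate_signature(secret, "GET", path, params)
  "https://api.vidora.com#{path}?#{URI.encode_www_form(params)}"
end

url = list_pipelines_url("<YOUR_KEY>", "<API_SECRET>", "2020-06-01T00:00",
                         { active: true, recurring: true })
puts url
```

The helper only builds the URL, so it can be checked locally before any request is sent.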

Get ML Pipeline

Returns details for a given pipeline.

Request

Method URL
GET /v1/api/pipelines/<PIPELINE_ID>

Parameters

Param Required? Type Description
pipeline_id Yes string Unique ID for the pipeline.

Example Request

http://api.vidora.com/v1/api/pipelines/<PIPELINE_ID>?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Example Response

{
  id: "f90819314266a344",                       # Unique ID for the pipeline    
  name: "Sample Classification Pipeline",       # Name of the pipeline
  type: "Classification",                       # Pipeline type
  active: true,                                 # Whether the pipeline is active
  recurring: true,                              # Whether it runs on recurring schedule
  status: "complete",                           # Status ("running|complete|failed")
  created_at: "2020-04-05T00:00:00",            # When the pipeline was created
  last_trained_at: "2020-04-05T05:00:00",       # When the pipeline last trained
  last_prediction_at: "2020-04-05T05:00:00",    # When the pipeline last made predictions
  active_label_set_id: "j8daw02sz57hjwze"       # Unique ID for the active label set
}

The above example shows a signed GET request to fetch details for pipeline <PIPELINE_ID>. If the pipeline requires uploaded labels (Classification, Look Alike, or Regression), these details will include a unique ID for the most recently uploaded label set assigned to the pipeline.

Update your ML Pipelines

The below methods allow you to update details of an existing pipeline. Most notably, you can add a new set of labels to trigger an automatic retraining of any Classification, Look Alike, or Regression pipeline. This process involves a POST request to create a new label set (Create Label Set), and a PUT request to assign that label set to a specific pipeline (Update Pipeline).

Create Label Set

Creates a new label set. Once a label set has been successfully validated, you may assign it to a pipeline (via the Update Pipeline method) in order to automatically begin retraining that pipeline with the new labels.

Request

Method URL
POST /v1/api/label_sets

Parameters

Param Required? Type Description
label_type Yes string Pipeline type that the labels are intended for. Valid values are “classification”, “look alike”, or “regression”.
name Yes string A descriptive name for the label set.
labels Yes file A file containing the training labels. Uploaded files are required to be in CSV format (optionally gzipped), with an id column (string) and a label column (boolean if label_type is “classification” or “look alike”, float if label_type is “regression”). Sample file formats:

Classification:
"id","label"
"A",1
"B",0
"C",1

Look Alike:
"id","label"
"A",1
"B",1
"C",1

Regression:
"id","label"
"A",15.3
"B",91.7
"C",54.8

start_date No date Earliest date for which the uploaded labels are valid (e.g. “2020-02-01”). Default value is 90 days before today. Along with end_date, this value will define the training window for your pipeline.
end_date No date Latest date for which the uploaded labels are valid (e.g. “2020-04-01”). Default value is today. Along with start_date, this value will define the training window for your pipeline.
ancestor_set_id No string The ID of another label set to compare labels with. If your new label set is meant to refresh a previously-defined label set, it is useful to pass in an ancestor_set_id to ensure at least 80% of your labels overlap.
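
Before uploading, it may help to check a labels file locally against the format described for the labels parameter. The following is a minimal sketch, not part of the Cortex API: `check_labels_csv` is a hypothetical helper, and it assumes that boolean labels are written as 0 or 1, as in the sample files.

```ruby
require "csv"

# Hypothetical pre-upload check. Returns an array of problems found;
# an empty array means the file matches the documented format.
# Assumption: boolean labels are encoded as 0/1, as in the sample files.
def check_labels_csv(csv_text, label_type)
  rows = CSV.parse(csv_text, headers: true)
  return ["missing id or label column"] unless (%w[id label] - rows.headers).empty?

  problems = []
  rows.each_with_index do |row, index|
    line = index + 2 # line 1 holds the header
    problems << "line #{line}: empty id" if row["id"].to_s.strip.empty?
    label = row["label"].to_s.strip
    valid =
      if label_type == "regression"
        !Float(label, exception: false).nil? # any parseable float
      else
        %w[0 1].include?(label)              # classification, look alike
      end
    problems << "line #{line}: invalid label #{label.inspect}" unless valid
  end
  problems
end

sample = %("id","label"\n"A",1\n"B",0\n"C",maybe\n)
puts check_labels_csv(sample, "classification")
```

The server-side validation remains authoritative; this only catches obvious formatting mistakes before the upload.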

Example Request

Creating a new label set requires uploading a file, which must be signed in the POST body like any other Cortex API request. However, since you’re uploading a file, you cannot send JSON as the payload. Instead, the file must be signed and sent as form data. Below is an example which shows how to sign the request.

Payload of Form-Data

Just like signatures for any other Cortex API request, your POST body needs to be included in the signature generation process. The POST body for this request is sent as form-data, which requires a boundary to be defined between each parameter. In this example, we’ll use --BOUNDARY as the boundary delimiter. If we were to send a common CSV as the labels file, the payload would look like the following:

body =  "--BOUNDARY\r\n"                                                                  \
        "Content-Disposition: form-data; name=\"labels\"; filename=\"my_file.csv\"\r\n"   \
        "Content-Type: text/csv\r\n"                                                      \
        "\r\n"                                                                            \
        "#{file.read}\r\n"                                                                \
        "--BOUNDARY--\r\n"

A few things to note about the above:

  • To end the form-data, append “--” to the end of your boundary. In this example, the end of the form-data is marked by --BOUNDARY--.
  • The Content-Disposition is form-data.
  • The file being uploaded is a common CSV, so the Content-Type is text/csv.
  • The file contents are read directly into the payload so it can be signed. In this example, file contents are read using Ruby code.
  • IMPORTANT: the form data must have the correct line breaks which includes both \r and \n.
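
The notes above can be folded into a small helper that assembles the payload. This is a sketch under stated assumptions: `build_form_data` and its arguments are hypothetical names, and the layout mirrors the payload examples in this section (file part first, other parameters after it, then the closing delimiter).

```ruby
# Hypothetical helper: builds a multipart/form-data body as one string.
# `file` is a Hash with :name, :filename, :content_type and :content;
# `fields` maps additional parameter names to their values.
def build_form_data(boundary, fields, file)
  lines = [
    "--#{boundary}",
    %(Content-Disposition: form-data; name="#{file[:name]}"; filename="#{file[:filename]}"),
    "Content-Type: #{file[:content_type]}",
    "",
    file[:content],
  ]
  fields.each do |name, value|
    lines.push("--#{boundary}", %(Content-Disposition: form-data; name="#{name}"), "", value.to_s)
  end
  lines.push("--#{boundary}--", "") # closing delimiter plus final line break
  # Join as raw bytes so gzipped content is passed through untouched.
  lines.map { |line| line.to_s.b }.join("\r\n")
end

body = build_form_data(
  "BOUNDARY",
  { label_type: "classification", name: "My Classification Label Set" },
  { name: "labels", filename: "my_file.csv", content_type: "text/csv",
    content: %("id","label"\n"A",1\n"B",0\n) }
)
puts body
```

Building the body as a single string keeps it available for signature generation before the request is sent.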

Sending a gzipped file

It’s likely that you will want to gzip your labels file to speed up the upload. If sending a gzipped file, the content type must be set to application/gzip or application/x-gzip like the below:

body =  "--BOUNDARY\r\n"                                                                    \
        "Content-Disposition: form-data; name=\"labels\"; filename=\"my_file.csv.gz\"\r\n"  \
        "Content-Type: application/gzip\r\n"                                                \
        "\r\n"                                                                              \
        "#{file.read}\r\n"                                                                  \
        "--BOUNDARY--\r\n"

Adding more params to the payload

When adding other parameters to the request, you can either put them in the form data or append them in the URL. Below is an example of how you would add more parameters to the form data. Note that each parameter is separated by the --BOUNDARY delimiter from above.

body =  "--BOUNDARY\r\n"                                                                                  \
        "Content-Disposition: form-data; name=\"labels\"; filename=\"my_file.csv\"\r\n"                   \
        "Content-Type: text/csv\r\n"                                                                      \
        "\r\n"                                                                                            \
        "#{file.read}\r\n"                                                                                \
        "--BOUNDARY\r\n"                                                                                  \
        "Content-Disposition: form-data; name=\"label_type\"\r\n"                                         \
        "\r\n"                                                                                            \
        "classification\r\n"                                                                              \
        "--BOUNDARY\r\n"                                                                                  \
        "Content-Disposition: form-data; name=\"name\"\r\n"                                               \
        "\r\n"                                                                                            \
        "My Classification Label Set\r\n"                                                                 \
        "--BOUNDARY--\r\n"

If you were to append the additional parameters to the URL, they would look like the below. Keep in mind that the labels file would still be sent in the payload as form-data.

POST 

https://api.vidora.com/v1/api/label_sets?label_type=classification&name=My%20Classification%20Label%20Set

Generating the signature

Please see the example code in your Cortex account for how to generate a signature for your upload. Like any signed request, you must join the following to create the signature:

  • Secret Key
  • Method
  • Path
  • URL Params
  • Body

Sending the request

Since label uploads are sent as form-data, you cannot send a request header of Content-Type: application/json. Instead, it must be sent as form-data that defines the boundary. An example request header for the examples above would specify the following:

"Content-Type": "multipart/form-data; boundary=BOUNDARY"

Ruby gzip example

The below example shows how to create a label set with a gzipped file using Ruby. Ask your account manager for examples using other languages (e.g. Bash).

require "digest/sha2"
require "base64"
require "active_support/time"                      # Provides 2.days
require "active_support/core_ext/object/to_query"  # Provides Hash#to_query
require "rest-client"

# Function for generating signatures
def generate_signature(secret, http_method, request_path, params = {}, body = nil)
  # Join secret, method, path, sorted URL params, and body with newlines
  string_to_sign = [
    secret,
    http_method,
    request_path,
    params.sort { |pair1, pair2| pair1[0] <=> pair2[0] }.map { |k, v| "#{k}=#{v}" }.join("&"),
    body,
  ].join("\n")

  Base64.strict_encode64(Digest::SHA256.digest(string_to_sign))[0, 43].chomp("=")
end

# Access your labels file on disk and create body
labels_file = File.new("<GZIPPED_FILE>")
body = "--BOUNDARY\r\nContent-Disposition: form-data; name=\"labels\"; filename=\"<GZIPPED_FILE>\"\r\n" \
       "Content-Type: application/gzip\r\n\r\n#{labels_file.read}\r\n--BOUNDARY--\r\n"

params = {
  api_key:    "<API_KEY>",
  expires:    (Time.now.utc + 2.days.to_i).strftime("%Y-%m-%dT%H:%M"),
  name:       "<LABEL_SET_NAME>",
  label_type: "classification",
}

# Sign the request (the form-data body is part of the signature)
secret = "<API_SECRET>"
params["signature"] = generate_signature(secret, "POST", "/v1/api/label_sets", params, body)

# Send the body as form-data with the matching boundary
url = "https://api.vidora.com/v1/api/label_sets?#{params.to_query}"
request_headers = { "Content-Type": "multipart/form-data; boundary=BOUNDARY" }
response = RestClient.post(url, body, request_headers)
puts response

Example Response

{ 
  id: "0tbeb8mg9ljyod4h",                         # Unique ID for the label set
  name: "My New Label Set",                       # Name of the label set
  status: "validating",                           # Status ("validating|validated|failed")
  label_type: "classification",                   # Type of pipeline the labels will be applied to
  filename: "subscribers-data-2019.csv",          # Name of the uploaded file
  updated_at: "2020-04-05T00:00:00",              # When the label set was last updated
  start_date: "2020-02-01",                       # Earliest date for which the labels are valid
  end_date: "2020-04-01"                          # Last date for which the labels are valid
}

The response includes details about the new label set, including a status field which indicates that the set is currently being validated. When the status changes to “validated”, you may assign the label set to a pipeline via the Update Pipeline method. If the status shows “failed”, an error field in the response will show more information about the validation error.

Get Label Set

Returns details for a given label set. This is useful for checking the status of a label set after you’ve created it.

Request

Method URL
GET /v1/api/label_sets/<LABEL_SET_ID>

Parameters

Param Required? Type Description
label_set_id Yes string Unique ID for the label set.

Example Request

http://api.vidora.com/v1/api/label_sets/<LABEL_SET_ID>?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Example Response

{ 
  id: "0tbeb8mg9ljyod4h",                       # Unique ID for the label set
  name: "My New Label Set",                     # Name of the label set
  status: "validated",                          # Status ("validating|validated|failed")
  label_type: "classification",                 # Type of pipeline the labels will be applied to
  filename: "subscribers-data-2019.csv",        # Name of the uploaded file 
  updated_at: "2020-04-05T00:00:00",            # When the label set was last updated
  start_date: "2020-02-01",                     # Earliest date for which the labels are valid
  end_date: "2020-04-01",                       # Last date for which the labels are valid
  download_url: "http://s3...",                 # URL to download the file
  errors: [],                                   # Array of validation error messages, if any
  warnings: []                                  # Array of validation warning messages, if any
}

The above example shows a signed GET request to fetch details for label set <LABEL_SET_ID>. The response includes details such as the set’s name, label_type, and more.
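
Because validation is asynchronous, a client will typically poll this method until the status changes. Below is a minimal polling sketch. `wait_for_validation` is a hypothetical helper, and `fetch_label_set` stands in for whatever code performs the signed GET request and returns the parsed response as a Hash. The interval and attempt limit are arbitrary illustrative values.

```ruby
# Hypothetical helper: polls until the label set leaves the "validating" state.
# `fetch_label_set` is a callable that returns the parsed Get Label Set response.
# `sleeper` is injectable so the wait can be skipped in tests.
def wait_for_validation(fetch_label_set, interval: 30, max_attempts: 20,
                        sleeper: ->(seconds) { sleep(seconds) })
  max_attempts.times do
    label_set = fetch_label_set.call
    case label_set["status"]
    when "validated" then return label_set
    when "failed"    then raise "Label set validation failed: #{label_set["errors"].inspect}"
    end
    sleeper.call(interval)
  end
  raise "Label set still validating after #{max_attempts} attempts"
end

# Usage with stubbed responses instead of real API calls:
responses = [
  { "status" => "validating" },
  { "status" => "validated", "id" => "0tbeb8mg9ljyod4h" },
]
stub = -> { responses.shift }
result = wait_for_validation(stub, interval: 0)
puts result["id"]
```

Once the helper returns, the label set ID can be passed to the Update Pipeline method.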

Update Label Set

Updates details for a given label set. A label set may only be updated if it has not yet been assigned to a pipeline. If the label set has already been assigned to a pipeline, you should create a new label set (via the Create Label Set method) instead.

Request

Method URL
PUT /v1/api/label_sets/<LABEL_SET_ID>

Parameters

Param Required? Type Description
label_set_id Yes string Unique ID for the label set that you would like to update.
label_type Yes string Pipeline type that the labels are intended for. Valid values are “classification”, “look alike”, or “regression”.
name Yes string A descriptive name for the label set.
labels Yes file A file containing the training labels. Uploaded files are required to be in CSV format (optionally gzipped), with an id column (string) and a label column (boolean if label_type is “classification” or “look alike”, float if label_type is “regression”). Sample file formats:

Classification:
"id","label"
"A",1
"B",0
"C",1

Look Alike:
"id","label"
"A",1
"B",1
"C",1

Regression:
"id","label"
"A",15.3
"B",91.7
"C",54.8

start_date No date Earliest date for which the uploaded labels are valid (e.g. “2020-02-01”). Default value is 90 days before today. Along with end_date, this value will define the training window for your pipeline.
end_date No date Latest date for which the uploaded labels are valid (e.g. “2020-04-01”). Default value is today. Along with start_date, this value will define the training window for your pipeline.
ancestor_set_id No string The ID of another label set to compare labels with. If your new label set is meant to refresh a previously-defined label set, it is useful to pass in an ancestor_set_id to ensure at least 80% of your labels overlap.

Example Request

PUT http://api.vidora.com/v1/api/label_sets/<LABEL_SET_ID>?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Content-Type: application/json
{ "start_date": "2020-03-01" }

Example Response

{ 
  id: "0tbeb8mg9ljyod4h",                       # Unique ID for the label set
  name: "My New Label Set",                     # Name of the label set
  status: "validating",                         # Status ("validating|validated|failed")
  label_type: "classification",                 # Type of pipeline the labels will be applied to
  filename: "subscribers-data-2019.csv",        # Name of the uploaded file
  updated_at: "2020-04-05T00:00:00",            # When the label set was last updated
  start_date: "2020-03-01",                     # Earliest date for which the labels are valid
  end_date: "2020-04-01"                        # Last date for which the labels are valid
}

The above example shows a signed PUT request to update the start date for label set <LABEL_SET_ID>. The response includes details for the label set such as its name, status, and more.

List Label Sets

Returns an array of label sets. Optionally, the response can be filtered by the parameters listed below.

Request

Method URL
GET /v1/api/label_sets

Parameters

Param Required? Type Description
status No string Status of the label set (“validating”, “validated”, or “failed”).
label_type No string Pipeline type that the labels are intended for (“classification”, “look alike”, or “regression”).

Example Request

http://api.vidora.com/v1/api/label_sets?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Example Response

[
  { 
    id: "0tbeb8mg9ljyod4h",                     # Unique ID for the label set
    name: "My New Label Set",                   # Name of the label set
    status: "validating",                       # Status ("validating|validated|failed")
    label_type: "classification",               # Type of pipeline the labels will be applied to
    filename: "subscribers-data-2019.csv",      # Name of the uploaded file
    updated_at: "2020-04-05T00:00:00"           # When the label set was last updated
  },
  { 
    id: "p54avfxnd8soj96g", 
    name: "Look Alike Label Set",
    status: "validated",
    label_type: "look alike",
    filename: "survey-data-2020.csv",
    updated_at: "2020-04-07T00:00:00"
  }
]

The above example shows a signed GET request to return all label sets in the account. The response indicates that there are two label sets, and includes details such as each set’s id, name, label_type, and more.

Update Pipeline

Updates details for a given pipeline. Editable information includes the pipeline’s name, and its active label set. If the active label set is updated, the pipeline will automatically begin retraining using the new label set.

Request

Method URL
PUT /v1/api/pipelines/<PIPELINE_ID>

Parameters

Param Required? Type Description
pipeline_id Yes string Unique ID for the pipeline.
name No string Edits the name of the pipeline.
active_label_set_id No string Sets the active label set for the pipeline and triggers an automatic retraining. This option applies only to Classification, Look Alike, and Regression pipelines.

Example Request

PUT 
http://api.vidora.com/v1/api/pipelines/<PIPELINE_ID>?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Content-Type: application/json
{ "active_label_set_id": "0tbeb8mg9ljyod4h" }

Example Response

{
  id: "f90819314266a344",                       # Unique ID for the pipeline
  name: "Sample Classification Pipeline",       # Name of the pipeline
  type: "Classification",                       # Pipeline type
  active: true,                                 # Whether the pipeline is active
  recurring: true,                              # Whether it runs on recurring schedule
  status: "running",                            # Status ("running|complete|failed")
  created_at: "2020-04-05T00:00:00",            # When the pipeline was created
  last_trained_at: "2020-04-05T05:00:00",       # When the pipeline last trained
  last_prediction_at: "2020-04-05T05:00:00",    # When the pipeline last made predictions
  active_label_set_id: "0tbeb8mg9ljyod4h"       # Unique ID for the active label set
}

The above example shows a signed PUT request to update the active label set for pipeline <PIPELINE_ID>. The response includes details for the pipeline such as its name, type, status, and more.
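
A request like this could be prepared in Ruby roughly as follows. This is a sketch, not an official client: `build_update_pipeline_request` is a hypothetical helper, `generate_signature` follows the Ruby upload example earlier in this document, and it assumes the JSON body is signed exactly as it is sent. The request is built but not sent.

```ruby
require "digest/sha2"
require "base64"
require "json"
require "net/http"
require "uri"

# Signature scheme taken from the Ruby label-upload example in this document.
def generate_signature(secret, http_method, request_path, params = {}, body = nil)
  string_to_sign = [
    secret,
    http_method,
    request_path,
    params.sort_by { |key, _| key.to_s }.map { |k, v| "#{k}=#{v}" }.join("&"),
    body,
  ].join("\n")
  Base64.strict_encode64(Digest::SHA256.digest(string_to_sign))[0, 43].chomp("=")
end

# Hypothetical helper: builds (but does not send) a signed PUT request
# that assigns a validated label set to a pipeline.
def build_update_pipeline_request(api_key, secret, expires, pipeline_id, label_set_id)
  path = "/v1/api/pipelines/#{pipeline_id}"
  body = JSON.generate({ "active_label_set_id" => label_set_id })
  params = { api_key: api_key, expires: expires }
  params[:signature] = generate_signature(secret, "PUT", path, params, body)
  uri = URI("https://api.vidora.com#{path}?#{URI.encode_www_form(params)}")
  request = Net::HTTP::Put.new(uri)
  request["Content-Type"] = "application/json"
  request.body = body # the same string that was signed
  [uri, request]
end

uri, request = build_update_pipeline_request(
  "<YOUR_KEY>", "<API_SECRET>", "2020-06-01T00:00", "f90819314266a344", "0tbeb8mg9ljyod4h"
)
puts request.method, uri.path, request.body
# To send it:
# Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
```

Serializing the body once and reusing that string for both the signature and the request avoids mismatches caused by differing JSON formatting.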

Deploying Pipeline Predictions

The below methods allow you to access prediction files that you have exported from any existing pipeline. Most notably, you can fetch a download link for a given set of exported predictions, allowing you to deploy any pipeline in an automated workflow.

List Prediction Exports for a Pipeline

Returns an array of prediction exports for a given pipeline.

The ID for a pipeline can be fetched by making a request to the List ML Pipelines method.

Request

Method URL
GET /v1/api/pipelines/<PIPELINE_ID>/prediction_exports

Parameters

Param Required? Type Description
pipeline_id Yes string Unique ID for the pipeline.
recurring No boolean Whether the prediction export is set to run on a recurring schedule.
exported_since No integer A Unix timestamp which limits the response to Prediction Exports that have been exported at or after this point in time.

Example Request

http://api.vidora.com/v1/api/pipelines/<PIPELINE_ID>/prediction_exports?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Example Response

[
  {
    id: "vrhmjwf10b66zl9k",                     # Unique ID for the prediction export
    name: "Sample Recurring Export",            # Name of the prediction export
    status: "complete"                          # Status ("exporting|complete|failed")
  },
  {
    id: "q86yqblw2ejp39bd", 
    name: "Sample One-Time Export",
    status: "exporting"
  }
]

The above example shows a signed GET request to return all prediction exports for pipeline <PIPELINE_ID>. The response indicates that there are two such exports, and includes details for each one such as its name and status.

Get Prediction Export

Returns details for a given prediction export from a given pipeline.

The ID for a pipeline can be fetched by making a request to the List ML Pipelines method. The ID for a prediction export can be fetched by making a request to the List Prediction Exports for a Pipeline method.

Request

Method URL
GET /v1/api/pipelines/<PIPELINE_ID>/prediction_exports/<EXPORT_ID>

Parameters

Param Required? Type Description
pipeline_id Yes string Unique ID for the pipeline.
export_id Yes string Unique ID for the prediction export.
recurring No boolean Whether the prediction export is set to run on a recurring schedule.

Example Request

http://api.vidora.com/v1/api/pipelines/<PIPELINE_ID>/prediction_exports/<EXPORT_ID>?api_key=<YOUR_KEY>&expires=2020-06-01T00%3A00&signature=<YOUR_SIGNATURE>

Example Response

{
  id: "vrhmjwf10b66zl9k",                          # Unique ID for the prediction export
  name: "Sample Recurring Export",                 # Name of the prediction export
  recurring: true,                                 # Whether the export runs on a recurring basis
  status: "complete",                              # Status ("exporting|complete|failed")
  last_exported: "2020-04-05T00:00:00",            # When the export last ran
  columns: ["User ID", "Conversion Probability"],  # Columns included in the exported file
  total: 2581,                                     # Number of predictions exported
  download_url: "http://s3..."                     # Download URL for accessing the exported file
}

The above example shows a signed GET request to fetch details for prediction export <EXPORT_ID> from pipeline <PIPELINE_ID>. The response includes details for the export such as its name, number of predictions, and a link to download the exported file.
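
In an automated workflow, the two prediction export methods are usually combined: list the exports, keep those that are complete, then fetch each one's details to obtain its download link. The sketch below illustrates that flow. `completed_download_urls` is a hypothetical helper, and `list_exports` and `get_export` stand in for code that performs the signed GET requests and returns parsed responses.

```ruby
# Hypothetical helper: returns download URLs for all completed exports.
# `list_exports` returns the parsed List Prediction Exports response (an Array);
# `get_export` takes an export ID and returns the parsed Get Prediction Export response.
def completed_download_urls(list_exports, get_export)
  list_exports.call
              .select { |export| export["status"] == "complete" }
              .map { |export| get_export.call(export["id"])["download_url"] }
              .compact # skip exports whose details carry no download link
end

# Usage with stubbed responses instead of real API calls:
listing = [
  { "id" => "vrhmjwf10b66zl9k", "name" => "Sample Recurring Export", "status" => "complete" },
  { "id" => "q86yqblw2ejp39bd", "name" => "Sample One-Time Export", "status" => "exporting" },
]
details = {
  "vrhmjwf10b66zl9k" => { "id" => "vrhmjwf10b66zl9k", "download_url" => "http://s3..." },
}
urls = completed_download_urls(-> { listing }, ->(id) { details.fetch(id) })
puts urls
```

Exports that are still running are skipped, so the helper can be called again on a later run to pick them up.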

Response Codes

Response Description
200 OK The request was successful.
400 Bad Request The request was invalid, possibly due to malformed parameters.
401 Unauthorized The api_key and/or signature was invalid.
404 Not Found The id used in the request was not found or the request URI does not exist.
500 Internal Error There was a server side error, and we cannot serve the request at the current time.
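
A client might translate these codes into follow-up actions along the lines of the sketch below. The helper name and the action symbols are illustrative assumptions, not part of the API. In particular, treating 500 as retryable is a design choice rather than documented behavior.

```ruby
# Hypothetical helper: maps a Pipelines API status code to a coarse next step.
def classify_response(status_code)
  case status_code
  when 200 then :ok
  when 400 then :fix_request       # malformed parameters
  when 401 then :check_credentials # invalid api_key and/or signature
  when 404 then :check_id_or_path  # unknown id or request URI
  when 500 then :retry_later       # server-side error
  else :unexpected
  end
end

[200, 401, 500].each { |code| puts "#{code}: #{classify_response(code)}" }
```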

Still have questions? Reach out to support@mparticle.com for more info!