Predict the Current Value of a House

In this use case example, we walk through how to predict a numeric attribute for house listings on a real estate website using a type of Machine Learning Pipeline called Regression. Specifically, we’ll cover how to predict each house’s current value based on a list of currently known house values.

What data do I need for this prediction?

Predictions from a Regression pipeline answer the question: what is the value of a numeric attribute for each object being tracked (where an object can be a User_ID, or in this case a House_ID)? For Cortex to make this particular prediction, you need to upload a list of House IDs with their currently known values. In this example, the upload is a two-column file consisting of a House ID and the known value of that house during a specific time period. Cortex will then analyze these listings and use that information to predict the current value of every other house.

While the list of House IDs and known values is the only information required when setting up our Regression pipeline, additional information about these houses is needed in order to make accurate predictions. This information is used to build features for our pipeline:

  • Additional Actions Tracked for Each Listing: with online real estate listings, it is useful to track not only static attributes but also event-based interactions. These could range from visitors viewing the listing to offers and sell events. This information is especially valuable because only you have it, whereas housing attributes like square footage or number of bedrooms are often publicly known. Cortex will automatically combine these house actions and attributes for use in the prediction.
  • House Attributes: in this use case we are using an uploaded list of current house values to help predict the value of other houses, but to do that more precisely we should also know more about each house we are predicting for. You can track as many attributes as you need for each house listing, and Cortex will use that information to make its predictions more accurate.
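
To make the idea of regression concrete, here is a toy sketch in Python. It is not Cortex's model, which this article does not describe. The single feature (square footage), its values, and the function names fit_line and predict are assumptions made up for illustration. The two house IDs and values match the sample upload file in Step 2.

```python
# Toy illustration of regression; NOT how Cortex builds features or models.
# Assumption: one hypothetical feature (square footage) per house.

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict a value for a house whose feature value is x."""
    return slope * x + intercept


# Houses with known values: id -> (square footage, known value).
# The square footage figures are hypothetical.
known = {
    "abc123": (1000, 100000),
    "xyz987": (5000, 500000),
}

slope, intercept = fit_line(
    [sqft for sqft, _ in known.values()],
    [value for _, value in known.values()],
)

# Estimate the value of a house that was not in the known list.
estimate = predict(slope, intercept, 2000)
```

A real pipeline would learn from many houses and many attributes and actions at once, but the shape is the same: learn from houses with known values, then estimate the rest.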

How do I predict the current value of a house?

Step 1: Choose Pipeline Type

Select ‘Create New Pipeline’ from your Cortex account, and choose the Regression pipeline type.

Step 2: Upload Sets

This is where we upload our list of House IDs with their known values. In this example, this could be a list of known sell prices for recent listings. The file should be a .csv or .csv.gz file consisting of two columns: id and value, where the value is the known value for the corresponding House ID (shown as id and label in the example below).

id label
abc123 100000
xyz987 500000
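
If your house values live in another system, a short script can produce this file. The following is a minimal sketch in Python, assuming a comma-delimited file with a header row named as in the example above. The write_labels function name and the output file name are hypothetical, and the exact header and delimiter Cortex expects should be confirmed against your account's upload instructions.

```python
import csv
import gzip


def write_labels(path, labels):
    """Write (house_id, value) pairs as a two-column CSV.

    The file is gzip-compressed when the path ends with ".gz".
    Assumption: comma delimiter and an "id,label" header row.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "wt", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "label"])
        for house_id, value in labels:
            writer.writerow([house_id, value])


if __name__ == "__main__":
    # Hypothetical output file name; values taken from the example above.
    write_labels("house_values.csv.gz", [("abc123", 100000), ("xyz987", 500000)])
```

Passing a path that ends in .csv instead writes the same rows uncompressed.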

Step 3: Define Dates

Traits can change over time, and in our example house values can and will change continuously. It is therefore important to specify the date range in which the data in the previous upload was known to be correct. Otherwise, the pipeline may train on incorrect labels, which would lead to less accurate predictions.

In this example we are choosing the default range, but this can be any date range in which the uploaded labels for the house values were known to be correct.

Step 4: Specify Settings

In this step, we will name our Pipeline ‘Predict Current House Value’, have it rerun weekly on Sundays, and tag the pipeline with the House Value tag. Setting a weekly schedule means that your pipeline will use the latest available data to re-generate up-to-date predictions on a weekly basis.

Step 5: Review

The final step is to review your pipeline and ensure all settings look accurate! If anything needs to be updated, simply go ‘Back’ in the workflow and update any step. Otherwise, click ‘Start Training’ and sit back while Cortex generates the predictions.

Step 6: Update Labels Over Time (Optional)

If you’re collecting new house values over time, you can import these extra labels into Cortex so that your pipelines are always learning from the most recent information. To upload new labels, hit the “Edit” button on your pipeline (next to “Export Predictions”).

Still have questions? Reach out to support@mparticle.com for more info!