Predict which Properties will Go On-Sale

In this use case example, we will walk through how to predict an attribute of a real estate property using a type of Machine Learning Pipeline called Look Alikes. Specifically, we’ll cover how to predict the probability that properties not currently for sale will go on sale.

What data do I need for this prediction?

This prediction answers the question: based on a list of properties that recently went on sale, how likely is it that each property not currently for sale will go on sale? The list of recently listed properties is uploaded to Cortex as a CSV containing the Property ID of each listing.

While the list of Property IDs is the only information required when setting up our Look Alike pipeline, additional information is needed in order to make accurate predictions. This information is used to build features for our pipeline:

  • Property Attributes: this represents information about the property listing itself. Information could include asking price, size, location, etc.
  • User Behaviors: because listings can be viewed and interacted with online, any events and behaviors tracked for a property can be used in the prediction. These events can include a user viewing the listing, favoriting it, contacting the owner or agent, etc.

How do I predict the Likelihood a Property will Go On-Sale?

Step 1: Choose Pipeline Type

Select ‘Create New Pipeline’ from your Cortex account, and choose the Look Alikes pipeline type.

Step 2: Upload Sets

This is where we upload our list of Property IDs representing recent listings. The file should be a .csv or .csv.gz file consisting of two columns: id and label. Each Property ID in the upload is given a label value of 1.

id label
abc123 1
xyz987 1
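As a rough illustration of preparing this upload, the sketch below writes a two-column file from a list of Property IDs. It assumes the columns are comma-separated with an id,label header row; the function names (build_label_csv, write_label_file) and the output path are hypothetical and are not part of Cortex.

```python
import csv
import gzip
import io


def build_label_csv(property_ids):
    """Return CSV text with an id,label header and label 1 for each unique ID."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "label"])
    seen = set()
    for raw_id in property_ids:
        property_id = raw_id.strip()
        # Skip blank entries and IDs that were already written.
        if not property_id or property_id in seen:
            continue
        seen.add(property_id)
        writer.writerow([property_id, 1])
    return buffer.getvalue()


def write_label_file(property_ids, path):
    """Write the label CSV to path, gzip-compressed if the name ends in .gz."""
    text = build_label_csv(property_ids)
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "wt", encoding="utf-8", newline="") as handle:
        handle.write(text)


if __name__ == "__main__":
    # Hypothetical file name; use whatever name suits your upload.
    write_label_file(["abc123", "xyz987"], "recent_listings.csv.gz")
```

Dropping duplicate and blank IDs before writing is a design choice of this sketch, not a documented Cortex requirement.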

Step 3: Define Dates

Patterns can change over time, so it is important to specify the date range in which these listings came on the market. This date range lets the Machine Learning Pipeline learn from the data and listings from that specific period.

In this example we are choosing the default range, but this can be any date range from which the new listings were collected.

Step 4: Specify Settings

In this step, we will name our Pipeline ‘Predict New Property Listings’, have it rerun weekly on Sundays, and tag the pipeline with the ‘Listing’ tag. Setting a weekly schedule means that your pipeline will use the latest available data to re-generate up-to-date predictions on a weekly basis.

Step 5: Review

The final step is to review your pipeline and ensure all settings look accurate! If anything needs to be updated, simply go ‘Back’ in the workflow and update any step. Otherwise, click ‘Start Training’ and sit back while Cortex generates the predictions.

Step 6: Update Labels Over Time (Optional)

If you’re collecting new Property IDs of newly listed properties over time, you can import these extra labels into Cortex so that your pipelines are always learning from the most recent information. To upload new labels, hit the “Edit” button on your pipeline (next to “Export Predictions”).
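If you keep a local copy of your label file, one way to prepare a refreshed upload is to merge newly collected Property IDs into it. The sketch below is only an illustration: it assumes a comma-separated id,label file, and the function name merge_label_csv is hypothetical. This article does not state whether Cortex expects only the new labels or the full set, so treat the merged output as a local master list.

```python
import csv
import io


def merge_label_csv(existing_csv, new_property_ids):
    """Append new IDs (label 1) to existing id,label CSV text, skipping duplicates."""
    reader = csv.DictReader(io.StringIO(existing_csv))
    merged = []
    seen = set()
    # Keep existing rows in their original order, dropping repeated IDs.
    for row in reader:
        if row["id"] not in seen:
            seen.add(row["id"])
            merged.append((row["id"], row["label"]))
    # Add only IDs that are not already present.
    for raw_id in new_property_ids:
        property_id = raw_id.strip()
        if property_id and property_id not in seen:
            seen.add(property_id)
            merged.append((property_id, "1"))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "label"])
    writer.writerows(merged)
    return out.getvalue()


if __name__ == "__main__":
    existing = "id,label\nabc123,1\n"
    print(merge_label_csv(existing, ["xyz987", "abc123"]))
```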

Still have questions? Reach out to support@mparticle.com for more info!
