To Predict Churn, You Shouldn’t Predict Churn

Minimizing churn is one of the keys to success for a subscription media or commerce business. Loyal customers create predictable and compounding revenue for your business. If you can’t predict and prevent churn, you’ll spend a fortune trying to constantly acquire new users.

Machine learning is enabling businesses to develop more forward-looking retention strategies than ever before, and subscription-driven companies are increasingly turning to it to predict and prevent churn. In this post we look at how best to model user churn, and propose that modeling a user’s activity is often a better course of action than modeling the user’s actual churn date.

A User’s Engagement & Your Churn

A successful business aims to have as many healthy users as possible, not just as many subscriptions as possible. Healthy users are more likely to renew, stay with your business for longer, and act as advocates for it. For a media or commerce subscription business, healthy users are those who engage with your product and derive value from it on a consistent basis. Users who are less engaged (or worse, not engaged at all) are often at high risk of unsubscribing. In fact, a Vidora analysis showed that once a user has been inactive for a certain amount of time, it is a matter of pure chance when they actually churn: likely on the date they realize they are still a paying subscriber.

Through our experience modeling dozens of churn problems, we learned that for every subscription business, there exists a threshold of user inactivity after which it is very difficult, if not impossible, to win that user back. We have also found that building models to predict this point of inactivity yields more accurate leading indicators of churn than building models to predict actual churn itself.

Importantly, using inactivity as a proxy in our churn models still results in accurate predictions of true churn.

Inactivity and its Relationship to True Churn

We constructed a churn model using inactivity as the target label (i.e. the metric that is being predicted). Cortex allows our partners to build models to predict various labels or combinations of labels including whether a user will have any activity in the next ‘X’ number of days. These models are built based on behavioral data streamed into Cortex in real time, directly from your business.
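To make the idea of an inactivity target label concrete, here is a minimal sketch of how such a label could be derived from a raw activity log. It is not how Cortex builds its labels internally; the function name `inactivity_labels`, the `(user_id, activity_date)` event schema, and the sample users and dates are all assumptions made for illustration.

```python
from datetime import date, timedelta


def inactivity_labels(events, users, cutoff, horizon_days=30):
    """Return {user_id: 1 if the user had no activity in the window, else 0}.

    events: iterable of (user_id, activity_date) pairs (hypothetical schema).
    users: ids of the users subscribed as of the cutoff date.
    The window covers the horizon_days days that follow the cutoff.
    """
    window_end = cutoff + timedelta(days=horizon_days)
    # Collect every user with at least one event inside the window.
    active = {
        user_id
        for user_id, day in events
        if cutoff < day <= window_end
    }
    return {user_id: 0 if user_id in active else 1 for user_id in users}


# Made-up activity log for two illustrative users.
events = [
    ("sally", date(2018, 9, 1)),
    ("harry", date(2018, 10, 20)),
    ("harry", date(2018, 11, 2)),
]
labels = inactivity_labels(events, ["sally", "harry"], cutoff=date(2018, 10, 15))
print(labels)  # {'sally': 1, 'harry': 0}
```

Only features computed from activity on or before the cutoff should be paired with these labels, so that the model never sees the window it is asked to predict.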

Create a Retention Model in just a few clicks in Cortex. This model is predicting who will be retained in the next 30 days.

We first wanted to validate that the inactivity prediction model accurately predicted true churn. The plot below validated this assumption.

We ran retention models and looked at the actual churn rates of the scored users. The graph above compares users’ retention model scores with those users’ actual churn. It shows that Cortex’s retention model correctly predicts actual churn.

Why is predicting using inactivity better than using actual cancellations?

We found that inactivity serves as a better target label than actual churn. The date on which a user actually cancels a subscription is a highly noisy variable (for example, it may simply be the day they remember they are paying for a subscription they are not using), whereas a user’s engagement can be measured explicitly.

To demonstrate this phenomenon, the plot below examines the churn rate of a set of users as a function of the number of days of inactivity.

The plot shows that the rate of churn is fairly constant after 40 days of inactivity. In other words, once a user has been inactive for 40 days, it is merely a matter of chance when they will actually unsubscribe from the product. Thus a model trained to predict whether a user will be inactive for 40 days can provide a more accurate indicator of churn.
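An analysis of this kind can be sketched in a few lines. The example below assumes a hypothetical list of `(days_inactive, churned)` observations, groups them into 10-day buckets, and looks for the first bucket after which the churn rate stops changing. The data is made up to mirror the shape described above, and the `plateau_start` heuristic and its tolerance are illustrative choices, not the method used in the Vidora analysis.

```python
from collections import defaultdict


def churn_rate_by_inactivity(records, bucket_days=10):
    """Return {bucket_start: churn_rate}, ordered by bucket_start.

    records: iterable of (days_inactive, churned) pairs, where churned is a
    bool saying whether the user cancelled in the following period.
    """
    totals = defaultdict(int)
    churned_counts = defaultdict(int)
    for days_inactive, churned in records:
        bucket = (days_inactive // bucket_days) * bucket_days
        totals[bucket] += 1
        churned_counts[bucket] += int(churned)
    return {b: churned_counts[b] / totals[b] for b in sorted(totals)}


def plateau_start(rates, tolerance=0.02):
    """First bucket whose rate every later bucket matches within tolerance."""
    buckets = list(rates)
    for i, bucket in enumerate(buckets[:-1]):
        later = buckets[i + 1:]
        if all(abs(rates[b] - rates[bucket]) <= tolerance for b in later):
            return bucket
    return None


# Made-up data: churners out of 10 observed users in each 10-day bucket.
toy_churners = {0: 1, 10: 2, 20: 3, 30: 4, 40: 5, 50: 5, 60: 5}
records = [
    (start + 1, i < churners)
    for start, churners in toy_churners.items()
    for i in range(10)
]

rates = churn_rate_by_inactivity(records)
threshold = plateau_start(rates)
print(rates)
print(threshold)
```

On real data the buckets would be noisier, so the tolerance and bucket width would need tuning before the detected threshold could be used as the inactivity horizon for a model.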

For example, Sally has been inactive for 41 days and Harry has been inactive for 60 days. Because they have both passed the 40-day threshold of inactivity, Sally and Harry have the same probability of churning next week. Beyond that threshold, the timing of a cancellation is essentially random, so there is little signal for a machine learning model to learn from when it tries to predict the actual churn date.
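The Sally and Harry example can be worked through numerically. The sketch below assumes a hypothetical daily churn probability that rises during the first 40 days of inactivity and is flat afterwards; the 1% plateau value and the linear ramp are invented for illustration and are not taken from the analysis. It also assumes the user stays inactive throughout the week being evaluated.

```python
def daily_hazard(days_inactive, plateau_day=40, plateau_hazard=0.01):
    """Hypothetical daily churn probability: linear ramp, flat after plateau_day."""
    if days_inactive >= plateau_day:
        return plateau_hazard
    return plateau_hazard * days_inactive / plateau_day


def churn_within(days_inactive, horizon=7):
    """P(churn in the next `horizon` days | inactive for days_inactive days)."""
    survive = 1.0
    for offset in range(horizon):
        # Probability of not churning on each successive day of inactivity.
        survive *= 1.0 - daily_hazard(days_inactive + offset)
    return 1.0 - survive


sally = churn_within(41)  # past the threshold
harry = churn_within(60)  # further past the threshold
early = churn_within(10)  # still on the rising part of the curve

print(sally == harry)  # True: days of inactivity past the plateau add no information
print(early < sally)   # True: before the plateau, inactivity is still informative
```

Once the hazard is flat, the number of days of inactivity no longer distinguishes one user from another, which is why the cancellation date itself carries so little signal past the threshold.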

Predicting churn is both difficult and time consuming. With Cortex, you can create and iterate on a variety of churn models with ease. At Vidora, our experience working with dozens of subscription businesses has allowed us to hone the best techniques for churn prediction. We built those techniques into an intuitive platform that enables you to accurately predict which of your users are at risk of churning from your business.

Posted by Anastasia Turin on November 16, 2018