‘The Future of Machine Learning in Business’
– from Alex Holub’s Keynote Speech at the 2018 Applied Artificial Intelligence Conference, San Francisco
The Future of Machine Learning in Business: An Introduction
Consider the Machine Learning (ML) ecosystem, as it applies to business problems. We are moving towards a single unified system capable of taking in massive amounts of raw data. This system will be able to answer a huge variety of predictive problems that business people have. Imagine a system where there is a constant inflow of raw data (this is the system’s fuel) and humans asking questions to that system as they pertain to their business.
We are still far from this vision today. Right now, specialized data scientists solve each of those questions separately, each using a specialized set of algorithms and techniques. As a result, we end up with very disparate methodologies to solve different problems. Academics such as Pedro Domingos refer to the unification of those methods as “the Master Algorithm”.
A Brief History of Machine Learning
First of all, it is important to distinguish the terms Artificial Intelligence and Machine Learning. They are often used interchangeably, and people seem to have their own definitions, but they do not mean the same thing. This post takes an in-depth approach to defining these different terms, but at a high level:
- Artificial Intelligence (AI) is a manufactured system that mimics some aspect of human intelligence. There have traditionally been two different ways to solve AI problems: with expert systems, and with Machine Learning
- In an Expert System, you explicitly define each rule, usually in an “if-then” structure. Expert Systems take knowledge from humans and codify it within a machine in that “if-then” framework. For example, say you’re creating an autonomous car. An example of a rule would be “if there is a concrete wall, then do not drive into it”
- Machine Learning takes a different approach – it attempts to learn those rules using the data itself. Take the example where you are creating an autonomous car. Over time, a machine learning system will figure out that hitting the accelerator in front of a concrete wall is likely to cause a crash. The huge amount of press coverage about how AI is transforming industries largely refers to ML, rather than expert systems
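The contrast between the two approaches can be sketched in a few lines of Python. This is only a toy illustration: the 10-metre cutoff in the hand-written rule, the logged drives, and the function names are all assumptions made up for the example, and a real autonomous car would use far richer data and models.

```python
def expert_rule(distance_to_wall_m: float) -> str:
    """Expert system: a human writes the if-then rule directly."""
    if distance_to_wall_m < 10.0:  # cutoff chosen by a human expert (assumed value)
        return "brake"
    return "accelerate"


def learn_threshold(examples):
    """Machine learning: pick the distance cutoff that best explains logged crashes.

    `examples` is a list of (distance_to_wall_m, crashed) pairs recorded while
    accelerating. Candidate cutoffs are the midpoints between neighbouring
    distances; the one with the fewest misclassified examples wins.
    """
    distances = sorted(d for d, _ in examples)
    candidates = [(a + b) / 2 for a, b in zip(distances, distances[1:])]

    def errors(cutoff):
        # An example is misclassified when "closer than cutoff" disagrees with "crashed".
        return sum((d < cutoff) != crashed for d, crashed in examples)

    return min(candidates, key=errors)


def learned_rule(distance_to_wall_m: float, cutoff: float) -> str:
    """Apply the cutoff that was learned from data."""
    return "brake" if distance_to_wall_m < cutoff else "accelerate"


# Hypothetical driving log: (distance to wall in metres, did accelerating cause a crash?)
LOGGED_DRIVES = [
    (2.0, True), (4.0, True), (6.0, True),
    (12.0, False), (20.0, False), (35.0, False),
]

cutoff = learn_threshold(LOGGED_DRIVES)
print(cutoff, learned_rule(5.0, cutoff), learned_rule(30.0, cutoff))
```

In the first function the knowledge lives in the code. In the second it lives in the data, so new logged drives can change the rule without anyone rewriting it.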
There are many different algorithms that allow you to implement ML. Particular algorithms such as Deep Learning algorithms are trendy today. However, there are hundreds, if not thousands, of different machine learning algorithms that you can use to solve business problems. The fundamental algorithms being used in ML today are not new. Many of them, such as back propagation (which is used to train neural networks), were developed as early as the 1970s. Convolutional Neural Networks were developed by Yann LeCun and his colleagues from the late 1980s through the 1990s. ML has only recently taken off, and the key reason is the computational power these algorithms can now access through cloud platforms such as AWS and Azure.
Overcoming Technical Challenges
Most businesses have their data sitting in a massive data store, in a raw form that is not structured for machine learning. For example, that includes data warehouses and databases such as Redshift, BigQuery, and Oracle. This data can’t be fed directly into a machine learner, because the learner won’t know what to do with it. The way we solve for that is in four key steps (you can read about how these steps apply to solving customer purchase prediction problems here):
- Preprocessing Data
- Feature Cleaning
- Feature Engineering
- Model Selection
Preprocessing data involves sorting your data into a framework that makes sense for your machine learner. If you’re a commerce company, for example, you might preprocess into a user-centric framework. Feature cleaning involves doing things such as normalizing data and removing outliers from your data, while feature engineering looks at things such as how activity changes over time. 90-95% of the work happens in these first three steps, and there are no tried and true techniques for them. The work is very time consuming, complicated, and still very subjective, and it all happens before you actually train and select models.
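To make the first three steps concrete, here is a minimal sketch on a toy commerce event log. The events, the outlier rule (drop anything above 10x the median amount), the scaling choice, and the day-7 split are all assumptions picked for illustration, not recommended defaults.

```python
from collections import defaultdict
from statistics import median

# Hypothetical raw event log: (user_id, day, purchase_amount)
EVENTS = [
    ("u1", 1, 20.0), ("u1", 8, 25.0), ("u1", 9, 30.0),
    ("u2", 2, 15.0), ("u2", 3, 5000.0),  # 5000.0 looks like an outlier
    ("u3", 5, 40.0),
]


def preprocess(events):
    """Preprocessing: regroup raw events into a user-centric view."""
    by_user = defaultdict(list)
    for user_id, day, amount in events:
        by_user[user_id].append((day, amount))
    return dict(by_user)


def clean(by_user, cap_multiple=10.0):
    """Feature cleaning: drop outliers, then scale amounts into [0, 1]."""
    all_amounts = [a for rows in by_user.values() for _, a in rows]
    cap = cap_multiple * median(all_amounts)  # assumed outlier rule
    kept = {u: [(d, a) for d, a in rows if a <= cap] for u, rows in by_user.items()}
    largest = max(a for rows in kept.values() for _, a in rows)
    return {u: [(d, a / largest) for d, a in rows] for u, rows in kept.items()}


def engineer(by_user, split_day=7):
    """Feature engineering: summarize spend and how activity changes over time."""
    features = {}
    for user_id, rows in by_user.items():
        early = sum(1 for day, _ in rows if day <= split_day)
        recent = sum(1 for day, _ in rows if day > split_day)
        features[user_id] = {
            "scaled_spend": sum(a for _, a in rows),
            "activity_change": recent - early,  # positive means more recent activity
        }
    return features


features = engineer(clean(preprocess(EVENTS)))
for user_id, row in features.items():
    print(user_id, row)
```

Every constant in this sketch is a judgment call, which is the subjectivity described above: a different cap, scaling method, or time split produces different features for the same raw data.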
What you ideally want is an algorithm that can learn the best path through all of these steps and their different possibilities at once. This is where the technique of meta learning has a huge role to play in ML. It is meta learning which allows us to get towards the master algorithm. This is a single algorithm that can tackle all of these steps together, and learn the best path through all of them together.
Since every ML problem is different, and requires different techniques to preprocess data, clean and engineer features and select models, each problem has a different path to arrive at a solution. The master algorithm will be able to distinguish between different types of ML problems, and identify the unique path to solve each of them.
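One simple way to picture "finding the best path" is to treat each step as a set of options and score every combination on held-out data. The sketch below does exactly that with brute force. It is a stand-in to show the shape of the search, not meta learning itself and not the master algorithm: the option names, the toy data, and the cutoff "models" are all hypothetical, and a meta-learning system would presumably learn which paths are worth trying instead of enumerating all of them.

```python
import math
from itertools import product
from statistics import mean, median

# Each step of the process offers a few interchangeable options (all assumed).
CLEANERS = {
    "keep_all": lambda rows: rows,
    "drop_outliers": lambda rows: [(x, y) for x, y in rows if x <= 1000],
}
FEATURES = {
    "raw": lambda x: x,
    "log": lambda x: math.log1p(x),
}
# A "model" here is just a rule for choosing a cutoff from the training features.
MODELS = {
    "mean_cutoff": mean,
    "median_cutoff": median,
}


def evaluate(path, train, valid):
    """Run one path (cleaner, feature, model) and return validation accuracy."""
    cleaner, feature, model = path
    transform = FEATURES[feature]
    cleaned = CLEANERS[cleaner](train)
    cutoff = MODELS[model]([transform(x) for x, _ in cleaned])
    hits = sum((transform(x) > cutoff) == label for x, label in valid)
    return hits / len(valid)


def best_path(train, valid):
    """Score every combination of options and return the top path plus all scores."""
    paths = list(product(CLEANERS, FEATURES, MODELS))
    scores = {path: evaluate(path, train, valid) for path in paths}
    return max(scores, key=scores.get), scores


# Hypothetical data: (monthly spend, is the customer a high-value customer?)
TRAIN = [(10, False), (20, False), (30, False), (200, True), (300, True), (5000, True)]
VALID = [(15, False), (25, False), (40, False), (150, True), (250, True)]

winner, scores = best_path(TRAIN, VALID)
print(winner, scores[winner])
```

Even with two options per step there are already eight paths to score, and several can tie. Real problems have far more options per step, which is why exhaustive search stops being practical and a smarter way of choosing paths becomes attractive.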
The Concentric Circles of ML – How Will The Capabilities of ML Expand?
Currently, ML can solve a small subset of problems. Picture them as concentric circles: there are some problems it solves really well, some it solves about 80% as well as humans do, and some it cannot solve well. Because of the volume of data and problems a machine learner may have seen, some problems are already far better solved by ML than by humans. In cases where ML performs almost as well, it adds value by virtue of being much quicker and more scalable than humans. This is the case even if performance is inferior. For example, a human might today be better at spotting spam emails than a robot. However, it’s not practical to have a human screen every single email and label it as spam or legitimate.
Over time, those concentric circles will continue to grow. Systems will get smarter and smarter. As a result, they will solve an ever-expanding range of business problems.
What Does the Future of Machine Learning Look Like?
It is unlikely that there will be some new ML model or algorithm to rule the world, especially given how established some of the most common ML algorithms we use today are. Instead, it is likely that the biggest innovations in the future of machine learning in business will come from automating more steps of the ML process. This means automating the processing of massive amounts of data, feature cleaning, feature engineering, and so on. In addition, we will continue to see the development of ML algorithms through meta-learning. These developments will encompass more of the steps in the process and begin to combine them, for instance feature engineering with model selection.
It is a very exciting time to be in this space. Tremendous innovations are coming in the future of machine learning in business. And they will empower people to do more than they ever could before.
See Alex Holub’s full keynote speech at the 2018 Applied AI Conference in San Francisco here.