Join us for a 3-part interview with
Lionel Port, Head of Data Technology at News Corp Australia
In Part 1, Lionel Port, Head of Data Technology, discussed the role of Vidora’s Cortex solution in enabling machine learning to attract and drive advertising revenue at News Corp Australia, Australia’s leading online publisher.
He described the evaluation process he went through in deciding which ML solution to integrate into his team’s continuous data workflows to automate the creation of custom audience segments and give his team maximum leverage.
It was very easy to justify Vidora as our platform because the costs were a lot lower … we were able to create this premium product for less cost.
In Part 2, we dive deeper into which steps qualify for automation, including the often-overlooked steps of data cleansing, feature engineering, and data analysis/model selection. These difficult steps, when automated, can drive significant time savings for teams.
It takes a data scientist less than 30 minutes to actually create a new pipeline. And then with our fully automated solution, that now will be activated in the DMP within that 24 hour period.
He also discusses how Cortex enabled rapid iteration and experimentation with new machine learning pipelines.
[Before Cortex] it was costly to do experiments. And so we would be less likely to do that. And it meant also that if we didn’t have the right features in our platform, that was a big turnaround time in terms of getting that all set up.
Read the Interview Below:
On the Role of Automating ML Within News Corp
Can you qualify or even quantify some of the more challenging pieces of what people are calling machine learning operations or data science automation? I know you guys were actually doing some stuff in-house as well. Where are the steps in the past where Cortex has been able to give you some efficiency throughout that data science pipeline?
Yeah. So I guess there’s a lot of work in creating a machine learning pipeline, and a lot of that work is actually in the prep of your data sets. There are a few off-the-shelf products, including from Amazon and Google, but where Cortex is helping us is that even with those existing platforms, even if you use SageMaker, you still have to do your data analysis, your cleansing, and your feature selection, and you still need a data scientist or an analyst to do a lot of that grunt work upfront. So it still takes you a number of days to get your data into the right state. And then you might run a model, and then you have that stage of tuning it and making sure that it’s actually performing as well as you expect.
And so I guess the thing that Cortex takes away from us is that it’s actually doing the feature extraction as we ingest the data, essentially generating lots of features. It’s almost a brute-force attempt, taking everything in but then only picking the ones that are actually important to that machine learning model. It also tries five different methods and tells us which one actually performs the best, so that we don’t have to do that. So it’ll take a data scientist less than 30 minutes to actually create a new pipeline. And then with our fully automated solution, that will be activated in the DMP within that 24-hour period. Whereas we also do SageMaker models, and when we do it through SageMaker, it’s more like a week. So we’d only use SageMaker if it’s something that has a lot of value in creating, and it is something that would be difficult to do.
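The workflow Lionel describes here (generate many candidate features, keep only the ones that matter, try several methods, and keep the best performer) can be illustrated with a toy sketch. This is not Cortex's implementation. The synthetic data, the feature names, the correlation-based selection, and the three simple candidate methods (fewer than the five he mentions) are all assumptions made purely for illustration.

```python
import random
from itertools import combinations

def make_rows(n, seed=0):
    """Hypothetical synthetic data: two useful signals and one noise signal."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        raw = {"visits": rng.randint(0, 20), "minutes": rng.uniform(0, 60), "noise": rng.random()}
        label = 1 if 2 * raw["visits"] + raw["minutes"] > 50 else 0
        rows.append((raw, label))
    return rows

def expand_features(raw):
    """Brute-force feature generation: raw values, squares, pairwise products."""
    feats = dict(raw)
    for name, value in raw.items():
        feats[f"{name}^2"] = value * value
    for a, b in combinations(sorted(raw), 2):
        feats[f"{a}*{b}"] = raw[a] * raw[b]
    return feats

def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 if either side has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def select_features(data, k):
    """Keep the k features most correlated with the label (assumed criterion)."""
    labels = [y for _, y in data]
    names = list(data[0][0])
    scores = {name: abs_correlation([f[name] for f, _ in data], labels) for name in names}
    return sorted(names, key=scores.get, reverse=True)[:k]

def fit_majority(train, names):
    """Baseline: always predict the most common training label."""
    majority = int(sum(y for _, y in train) * 2 >= len(train))
    return lambda feats: majority

def fit_threshold(train, names):
    """Cut the top-ranked feature at the midpoint between the class means."""
    top = names[0]
    pos = [f[top] for f, y in train if y == 1]
    neg = [f[top] for f, y in train if y == 0]
    if not pos or not neg:
        return fit_majority(train, names)
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    cut = (mean_pos + mean_neg) / 2
    return lambda feats: int((feats[top] > cut) == (mean_pos > mean_neg))

def fit_centroid(train, names):
    """Nearest class centroid over the selected features (unscaled, so scale-sensitive)."""
    def centroid(label):
        rows = [f for f, y in train if y == label]
        return [sum(f[name] for f in rows) / len(rows) for name in names] if rows else None
    c0, c1 = centroid(0), centroid(1)
    if c0 is None or c1 is None:
        return fit_majority(train, names)
    def predict(feats):
        d0 = sum((feats[name] - c) ** 2 for name, c in zip(names, c0))
        d1 = sum((feats[name] - c) ** 2 for name, c in zip(names, c1))
        return int(d1 < d0)
    return predict

def accuracy(predict, data):
    return sum(predict(f) == y for f, y in data) / len(data)

def run_pipeline(n_rows=400, k=3, seed=0):
    """Generate features, select k, fit each method, and score it on a holdout set."""
    data = [(expand_features(raw), y) for raw, y in make_rows(n_rows, seed)]
    split = int(len(data) * 0.75)
    train, holdout = data[:split], data[split:]
    selected = select_features(train, k)
    methods = {"majority": fit_majority, "threshold": fit_threshold, "centroid": fit_centroid}
    scores = {name: accuracy(fit(train, selected), holdout) for name, fit in methods.items()}
    best = max(scores, key=scores.get)
    return selected, scores, best

if __name__ == "__main__":
    selected, scores, best = run_pipeline()
    print("selected features:", selected)
    print("holdout accuracy per method:", scores)
    print("best method:", best)
```

The point of the sketch is the shape of the loop, not the specific methods: once feature generation, selection, and method comparison are automated, defining a new pipeline reduces to choosing the data and the label.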
On the Business Case for ML Process Automation
Got you. Makes sense. And I know automation is obviously one of the more core aspects of what Vidora handles. Can you talk about what you guys were doing before adopting Cortex? I know we were talking about some of the challenges, less around the automation case and more around the business case, the C-level, and your entire process through that. So talk to me about how you guys went through that process.
Yeah. So prior to using Cortex, we were actually using a data science vendor, or more of a partner. They would do the data science services for us. They would still have those steps that would take us a week if we did them ourselves, but they actually did it for us and they ran the platform for us.
And so last year we made the decision to bring our machine learning in-house. As I look at it, it was probably two things: cost, and agility to create models. Previously, every time we went to create a new model, it was basically a request out to the vendor to create those models, and then an associated cost with creating those bespoke models. So it meant it was costly to do experiments, and so we would be less likely to do that. And it meant also that if we didn’t have the right features in our platform, that was a big turnaround time in terms of getting that all set up.
So I guess in terms of ROI, we’re quite lucky in the advertising business that we actually do report the revenue generated against each audience segment. So we do have the revenue side of the equation, and bespoke segments are always a premium offering. That’s a customer-requested segment, not one that’s built out of the box, and so they do have a high value. So it was very easy to justify switching to Vidora as our platform because the costs were a lot lower. The platform costs were about the same, but we didn’t have that cost of creating bespoke segments, and so we were able to create this premium product for less cost.