Here’s how to make an impact by delivering your machine learning models to the business
*Predictive models are powerful tools that can benefit a business in many ways, but not while they sit idle on data scientists’ laptops. Deployment to production, or delivery, of these models is vital to making an impact on the business.*
This article is the first in a two-part series and explains what we mean by ‘delivery’, and what a serious attempt at building a delivery system could look like. The next instalment will dive into various approaches to implementing the delivery system.
Data science and models
Over the course of the past decade, the ‘data scientist’ has made a remarkable entrance. Where data was previously used primarily to report on the past and to monitor current events, the data scientist, armed with a statistical toolbox, promises to predict the future.
Using state-of-the-art models, often packaged as Python libraries or R packages, data scientists take historical data, mix it with real-time events, and deliver a cocktail of improved customer engagement, reduced costs and higher overall revenue. While many organisations were able to realise exactly that, others were not. There are many theories as to why, but in this article we’ll focus on a single one: the challenges of bringing advanced predictive models to production.
Data scientists take data and deliver a cocktail of improved customer engagement, reduced costs and overall higher revenue.
The experimental, R&D style associated with data science makes it largely incompatible with day-to-day product operations, where reliability and performance are crucial. Indeed, it is hard to imagine a scenario in which end-users gladly accept significantly degraded UX, or even downtime, caused by a hastily rolled-out experiment. Product teams manage their SLAs around performance and reliability using well-established processes such as agile, peer review, testing, continuous integration and DevOps. Applying these processes to data science risks suffocating the creative energy needed for out-of-the-box thinking. On top of this, building and maintaining ML systems is proving to be different from, and much harder than, building conventional software systems, due to inherent characteristics such as boundary erosion, entanglement and undeclared consumers.
The delivery system promises not to stifle data scientists’ creativity, to ensure outcomes are measurable, and to avoid unmanageable operational risks for product teams.
To reconcile R&D with the product, this article proposes the delivery mechanism as a separation layer between data science and development teams. The delivery mechanism takes care of ingesting data science outputs and making them available to product teams and, ultimately, the end-user. It promises not to stifle data scientists’ creativity, it ensures outcomes are measured, and it does not create unmanageable operational risks for product teams.
Adherence to strict SLAs in terms of security, latency and throughput
For any model used in production, for instance in customer-facing applications, strict SLAs on latency and throughput are needed; Amazon recognised this as early as 2008. Peaks in traffic should not bring the service down, nor should customers have to wait for their recommendations to load. Naturally, the system should also be secure and address end-user privacy concerns.
The delivery mechanism guarantees these SLAs regardless of the model type and technology used.
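As an illustration, here is a minimal Python sketch of one common tactic for meeting a latency SLA: run inference under a strict time budget and fall back to a precomputed result when the budget is exceeded. The `model` object and the fallback list are hypothetical stand-ins, not part of any specific framework.

```python
import concurrent.futures

# A small pool for model calls; sized to match expected concurrency.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=8)

def recommend_with_budget(model, user_id, fallback, budget_seconds=0.05):
    """Return model recommendations for `user_id`, or a precomputed
    fallback list when inference exceeds the latency budget."""
    future = _executor.submit(model.predict, user_id)
    try:
        return future.result(timeout=budget_seconds)  # e.g. a 50 ms budget
    except concurrent.futures.TimeoutError:
        # Degrade gracefully rather than keep the end-user waiting.
        return fallback
```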
Monitoring
For the system to keep its promises in terms of latency and throughput, advanced monitoring is needed: not only to spot problems early and deal with them proactively, but also to help with debugging after incidents have occurred.
Robustness of the system is needed not only to ensure consistent service to the end-user, but also to cultivate and maintain trust in the data science department across the organisation.
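What such monitoring could look like at the code level, in a minimal and illustrative form: a decorator that records latency and failures per endpoint. In a real deployment the measurements would feed a metrics store rather than a log.

```python
import logging
import time
from functools import wraps

logger = logging.getLogger("delivery.monitoring")

def monitored(endpoint_name):
    """Record latency and failures for each call to the wrapped endpoint,
    so problems can be spotted early and debugged after the fact."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            except Exception:
                logger.exception("%s failed", endpoint_name)
                raise
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                # In production this would feed a metrics store
                # (Prometheus, StatsD, ...) instead of a log line.
                logger.info("%s took %.1f ms", endpoint_name, elapsed_ms)
        return wrapper
    return decorator
```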
Blending
By blending we mean combining two or more models to obtain a single, optimal result. There are two major reasons to blend:
1. To increase the performance of predictions by merging models that each optimally describe a subset of the possible prediction space, as is done in ensemble modelling. For example, recommending items based on both user preference and time of day.
2. To combine human expertise (domain knowledge) with predictive models, again to improve the performance of predictions. For example, mixing a human-curated list of ‘staff picks’ with a collaborative filter based on user preference.
The system should allow for systematic evaluation of different blends, so that incremental improvements can be made.
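A minimal sketch of what such a blend could look like, assuming model output as a dictionary of normalised scores and a curated list of staff picks (both hypothetical inputs); the blend weight itself is exactly the kind of parameter the system should evaluate systematically.

```python
def blend(model_scores, staff_picks, weight=0.7, top_n=10):
    """Combine model scores with a human-curated list into one ranking.

    model_scores: {item_id: score in [0, 1]}, e.g. from a collaborative filter.
    staff_picks:  list of item_ids, ordered by editorial preference.
    weight:       share of the final score contributed by the model.
    """
    # Turn curated rank positions into scores in (0, 1].
    curated = {item: 1.0 - rank / len(staff_picks)
               for rank, item in enumerate(staff_picks)}
    items = set(model_scores) | set(curated)
    blended = {item: weight * model_scores.get(item, 0.0)
                     + (1 - weight) * curated.get(item, 0.0)
               for item in items}
    return sorted(blended, key=blended.get, reverse=True)[:top_n]
```

Because different blends (weights, model combinations) compete with each other, the evaluation described in the next section is what turns blending from guesswork into incremental improvement.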
Systematic evaluation
If it can’t be measured, it doesn’t exist. As clichéd as that may be, it is ever so relevant in this domain. Models can be validated offline on historical data, but there is no real substitute for testing online against actual users and verifying that hypotheses hold.
In addition, systematic evaluation attaches value to data science efforts by visualising their impact on the business in terms of customer engagement (measured as click-through rate, time spent online, retention, return visits, and so on). Results should be made available to both data scientists and product teams, which means (real-time) dashboards will need to be available.
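One common building block for such online testing, shown here as an illustrative Python sketch, is deterministic hash-based bucketing: it assigns each user to a variant without storing any state, so assignments stay stable across sessions and results remain attributable. The variant names are hypothetical.

```python
import hashlib

def assign_variant(user_id, experiment, variants=("control", "blend_a")):
    """Deterministically assign a user to an experiment variant.

    Hashing user and experiment together keeps assignments stable within
    an experiment while remaining independent across experiments."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]
```

Engagement metrics (click-through rate, time spent online, and so on) are then logged per variant and surfaced on the dashboards mentioned above.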
Technology agnostic
As the domain is still very young, most predictive-model frameworks and libraries are heavily in flux. New libraries and technologies emerge almost daily; most are open-source and immature. At the same time, many organisations with the resources and business cases for leveraging data science also carry substantial legacy systems.
Allowing data scientists to use ‘the right tool for the job’ means that models can come in many shapes and formats; it makes sense to ensure the delivery mechanism can cope with all of them, now and in the future.
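One way to achieve this, sketched below under the assumption of a simple ranked-list use case, is to put every model behind a thin, technology-agnostic interface. The adapter shown for a scikit-learn-style model is purely illustrative.

```python
from abc import ABC, abstractmethod

class Predictor(ABC):
    """The contract the delivery mechanism relies on: whatever library
    or language produced a model, it ships behind this interface."""

    @abstractmethod
    def predict(self, features: dict) -> list:
        """Return a ranked list of item ids for the given features."""

class SklearnPredictor(Predictor):
    """Illustrative adapter around a scikit-learn-style classifier
    whose classes correspond to recommendable items."""

    def __init__(self, model, item_ids):
        self.model = model          # assumed to expose predict_proba()
        self.item_ids = item_ids    # one id per model class

    def predict(self, features: dict) -> list:
        scores = self.model.predict_proba([list(features.values())])[0]
        ranked = sorted(zip(self.item_ids, scores), key=lambda p: -p[1])
        return [item for item, _ in ranked]
```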
Realtime
Models that operate in production will need to be updated in real time. At the very least, the delivery mechanism needs to keep up with real-world changes as they occur: recommendations should not link to deleted articles, and newly added articles should become recommendable almost immediately after publication.
Potentially, the system supports online learning use cases as well.
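A sketch of the first, simpler requirement: reconciling (possibly stale) model output with the live content catalog at request time. The `catalog` object and its methods are assumptions for illustration, not an existing API.

```python
def fresh_recommendations(raw_items, catalog, top_n=10):
    """Reconcile batch-trained model output with the live catalog:
    drop items that have been deleted, and surface brand-new items
    the model has not seen yet.

    `catalog` is a hypothetical live view of the content store,
    assumed to expose is_published() and recently_published().
    """
    live = [item for item in raw_items if catalog.is_published(item)]
    new = [item for item in catalog.recently_published() if item not in live]
    return (new + live)[:top_n]
```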
Frontend integration
Simply making models available is not enough. Frontends need to integrate them, preferably in a generic manner, so that each new generation of models, or each new use case, does not translate into frontend development dependencies.
A REST API will do in some cases, but well-documented client-side libraries will reduce the load on frontend development teams, both during initial implementation and in future development.
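For illustration, a minimal REST endpoint such a delivery mechanism might expose, sketched here with Flask; `recommend` is a stand-in for the mechanism’s actual entry point, and the versioned path is one way to let models evolve without breaking frontends.

```python
from flask import Flask, jsonify

app = Flask(__name__)

def recommend(user_id):
    """Stand-in for the delivery mechanism's recommendation call."""
    return ["article-1", "article-2"]

@app.route("/v1/recommendations/<user_id>")
def recommendations(user_id):
    # Versioning the path decouples frontend releases from model changes.
    return jsonify({"user_id": user_id, "items": recommend(user_id)})
```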
In summary
Bringing predictive models to production is harder than it first seems, but it is absolutely vital for reaping the benefits of data science efforts. Any model that does not make it to production will, by definition, not impact the business in a meaningful way.
This article has outlined what a delivery mechanism should do; the next instalment in this two-part series will dive into various strategies for implementing it.
Do you think this was interesting, but you’re also convinced you can do better? That’s great, because then we want to talk to you! We’re hiring :) Have a look at our website at https://primed.io/careers/ or send us an email at info@primed.io