Taking Machine Learning Research to Production: Solving Real Problems
Most of the focus in the ML community is on research, which is exciting and important. Equally important, however, is bringing that research to production applications to solve real-world problems, but the issues and approaches for doing that are often poorly understood.
An ML application in production must address all of the issues of modern software development methodology, as well as issues unique to ML and data science. ML applications are often developed and trained using tools like notebooks, and they suffer from inherent limitations in testability, scalability across clusters, training/serving skew, and the modularity and reusability of components. In addition, ML application measurement often emphasizes top-level metrics, which can hide problems with model fairness and with predictive performance across user segments. Each user's experience of an ML application depends on how the model performs on that user’s input data, so if the model doesn’t perform well on that particular data segment, the user has a poor experience.
We discuss the use of ML pipeline architectures for implementing production ML applications, and in particular we review Google’s experience with TensorFlow Extended (TFX). Google uses TFX for large-scale ML applications and offers an open-source version to the community. TFX scales to very large training sets and very high request volumes, and enables strong software methodology including testability, hot versioning, and deep performance analysis. Robert Crowe is a data scientist and TFX Developer Advocate at Google and will discuss how developers can move their ML applications to TFX or similar platforms for both training and inference.
**What will the audience learn from this talk?** The audience will learn about issues and approaches for developing ML applications that are intended for commercial deployment in the real world. Creating production ML applications and the infrastructure to support them is very different from doing ML research, or from coding up an ML model to try to achieve a target level of performance. A developer needs to think much more in terms of modern software methodology, while also considering the additional aspects specific to ML.
**Does it feature code examples and/or live coding?** No live coding, but there will be code examples in slides.
**Prerequisite attendee experience level:** [250-300](https://gotocph.com/2019/pages/experience-level)