Inextricably Linked: Reproducibility and Productivity in Data Science and AI

Updated on December 12, 2018
GOTO Copenhagen 2018
Mark Coleman

VP Marketing at dotscience and Marketing Chairperson for Cloud Native Computing Foundation

Because it is more complex and has far more moving parts, Data Science & AI is where Software Development was in 1999: people are emailing and Slacking notebooks to each other for lack of appropriate tooling. There are few CI/CD pipelines, and model health monitoring is scarce. Much that could be automated is still manual, and teams are siloed. This causes problems both for productivity, because it is hard to collaborate, and for reproducibility, which affects governance and compliance.

In this talk, Mark shares his team’s research comparing the evolution of Software Development & DevOps with that of Data Science & AI. Mark then presents a proposal for an architecture and a set of open source tools to solve both the collaboration and the governance problems in Data Science & AI. With live demos!