Inextricably Linked: Reproducibility and Productivity in Data Science and AI
Because it is more complex and has far more moving parts, Data Science & AI is where Software Development was in 1999: people are emailing and Slacking notebooks to each other for lack of appropriate tooling. There are few CI/CD pipelines, and model health monitoring is scarce. Much that could be automated is still manual, and teams are siloed. This causes problems for both productivity (it's hard to collaborate) and reproducibility (which affects governance and compliance). In this talk, Mark shares his team’s research comparing the evolution of Software Development & DevOps with that of Data Science & AI. He then proposes an architecture and a set of open source tools to solve both the collaboration and the governance problems in Data Science & AI. With live demos!