Deploying a machine learning (ML) model means moving from open-ended experimentation to rigorous engineering constraints, and teams must balance flexibility against stability along the way.
Etsy's ML platform team uses Kubernetes for model scaling and orchestration, with Barista managing model deployments.
Initially, model configurations were managed as code, which gave tight control but introduced delays and review bottlenecks for routine changes.
To remove that friction, configurations were decoupled from the codebase and stored in a database, so changes could be applied instantly through a CLI.
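The decoupled-configuration idea can be sketched roughly as follows. This is a hypothetical illustration, not Etsy's actual Barista code: the `barista-cli` name, the `set`/`get` commands, and the field names are all assumptions, and an in-memory dict stands in for the database.

```python
import argparse
import json

# In-memory stand-in for the configuration database (hypothetical schema:
# model name -> dict of config fields).
CONFIG_DB = {}

def set_config(model, key, value):
    """Apply a configuration change immediately, with no code deploy."""
    CONFIG_DB.setdefault(model, {})[key] = value
    return CONFIG_DB[model]

def get_config(model):
    """Read back the currently stored configuration for a model."""
    return CONFIG_DB.get(model, {})

def main(argv=None):
    parser = argparse.ArgumentParser(prog="barista-cli")  # name is assumed
    sub = parser.add_subparsers(dest="command", required=True)

    p_set = sub.add_parser("set", help="update one config field for a model")
    p_set.add_argument("model")
    p_set.add_argument("key")
    p_set.add_argument("value")

    p_get = sub.add_parser("get", help="print a model's current config")
    p_get.add_argument("model")

    args = parser.parse_args(argv)
    if args.command == "set":
        print(json.dumps(set_config(args.model, args.key, args.value)))
    else:
        print(json.dumps(get_config(args.model)))

if __name__ == "__main__":
    main()
```

The point of the design is visible even in this toy: a change like `barista-cli set ranker-v2 replicas 3` lands in the store immediately, instead of waiting on a code review and redeploy.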
However, the CLI still required technical expertise, which prompted the development of a user-friendly web interface for model management.
The Barista web interface provides comprehensive control over deployments, integrates with various APIs, and streamlines the deployment process.
Rising deployment volume raised concerns about cost and misconfiguration, so the team adopted Kube Downscaler to automatically scale down unused deployments.
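The core idea behind this kind of downscaling can be sketched in a few lines. This is a simplified toy, not Kube Downscaler itself: the real tool watches Kubernetes resources and reads annotations such as `downscaler/uptime` to decide when to scale, whereas here the uptime window is a hard-coded weekday-plus-hours structure chosen for illustration.

```python
from datetime import datetime, time

# Simplified sketch of the downscaling idea: outside a declared uptime
# window, replicas drop to zero; inside it, the configured count applies.
UPTIME_WINDOW = {
    "days": {0, 1, 2, 3, 4},   # Monday through Friday
    "start": time(7, 30),
    "end": time(20, 30),
}

def desired_replicas(configured: int, now: datetime, window=UPTIME_WINDOW) -> int:
    """Return the replica count a downscaler would set at time `now`."""
    in_window = (
        now.weekday() in window["days"]
        and window["start"] <= now.time() < window["end"]
    )
    return configured if in_window else 0
```

An idle model that nobody queries overnight or on weekends then costs nothing in compute, which addresses exactly the cost concern raised above.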
The focus has shifted from meeting basic technical requirements to building a complete product that empowers ML users.
Current efforts aim to make the platform's services more cohesive and more automated, tuning infrastructure settings to further reduce cloud costs.
As the ML practice expands, the platform must continue to evolve to meet the growing needs of the team.
etsy.com
