A Decade of AI Platform at Pinterest
Pinterest's AI journey ran from fragmented machine learning stacks to a unified AI Platform. In the early ML efforts, individual teams built custom solutions, which led to redundancy and training-serving skew. The Linchpin DSL and the Scorpion inference service were early attempts at unification, but they ran into limitations as the underlying technologies evolved. A small ML Platform team struggled to drive adoption without organizational alignment and incentives. EzFlow aimed to improve training orchestration, but adoption was slow because product teams were focused on immediate metrics. Seed bets such as PySpark, Training Compute Platform, and Galaxy laid the foundation for later advancements. As DNNs emerged in recommendation systems, teams like Home Feed built solutions such as AutoML, which exposed brittle foundations. Adoption was driven by organizational alignment, product goals, and industry momentum. Efficiency became the limiting factor, demanding deeper collaboration between modeling and platform teams as transformer models and GPUs reshaped infrastructure.