Understanding Etsy’s Vast Inventory with LLMs
Etsy's marketplace features millions of unique, handmade items from many sellers, which makes its unstructured listing data hard to organize. Traditional product attribute extraction methods struggled with this diverse inventory and with the limited structured data available. Large Language Models (LLMs) offered a new way to turn unstructured product information into structured data. Etsy built a scalable LLM pipeline and focused on context engineering to improve attribute extraction accuracy. The pipeline draws on seller-provided data, expert examples, and Etsy's taxonomy. To evaluate LLM output, Etsy generates 'silver labels' and relies on domain experts for quality assurance. At inference time, the pipeline extracts attributes, uses LiteLLM for regional routing, and validates the results with Pydantic. Monitoring systems track pipeline health and model performance metrics. Applying LLM-generated attributes to search filters has increased buyer engagement and conversion rates. Etsy aims to expand its use of LLMs to further improve the shopping and selling experience. The ultimate goal is to meet buyer and seller needs as efficiently as possible.
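The context-engineering and validation steps described above can be illustrated with a minimal Python sketch. This is not Etsy's implementation. The taxonomy slice, the `build_prompt` and `validate_extraction` helpers, and the listing fields are all hypothetical. The pipeline itself validates with Pydantic; this sketch swaps in a plain dataclass and the standard library so that it runs without extra dependencies.

```python
import json
from dataclasses import dataclass

# Hypothetical taxonomy slice: allowed attributes and values for one category.
TAXONOMY = {
    "rings": {
        "material": {"gold", "silver", "brass"},
        "occasion": {"wedding", "anniversary", "birthday"},
    }
}


@dataclass(frozen=True)
class ExtractedAttribute:
    name: str
    value: str


def build_prompt(listing: dict, category: str, examples: list[str]) -> str:
    """Assemble prompt context from taxonomy options, expert examples, and seller text."""
    allowed = TAXONOMY[category]
    options = "\n".join(
        f"- {attr}: {', '.join(sorted(values))}"
        for attr, values in sorted(allowed.items())
    )
    shots = "\n".join(examples)
    return (
        f"Category: {category}\n"
        f"Allowed attributes and values:\n{options}\n"
        f"Examples:\n{shots}\n"
        f"Title: {listing['title']}\n"
        f"Description: {listing['description']}\n"
        "Return a JSON object mapping attribute names to values."
    )


def validate_extraction(raw: str, category: str) -> list[ExtractedAttribute]:
    """Keep only attribute/value pairs that the taxonomy allows for this category."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return []  # Malformed model output yields no attributes.
    if not isinstance(payload, dict):
        return []
    allowed = TAXONOMY.get(category, {})
    result = []
    for name, value in payload.items():
        # Drop unknown attributes and values outside the taxonomy.
        if isinstance(value, str) and value.lower() in allowed.get(name, set()):
            result.append(ExtractedAttribute(name, value.lower()))
    return result
```

In this sketch, a model response such as `{"material": "Silver", "color": "blue"}` for the `rings` category would keep only the `material` attribute, because `color` is not in the assumed taxonomy. Rejecting out-of-taxonomy values at this stage is one way to keep hallucinated attributes out of downstream search filters.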