
Etsy Engineering | Code as Craft

Codeascraft on Etsy is an artisan collaboration where computer programming meets traditional craftsmanship. The project consists of various digital artifacts crafted by famous programmers and artists alike transformed into knick-knacks for tech aficionados. These include printed renderings of binary algorithms on glass candleholders, handcrafted binary number necklaces, and digital bit portraits encapsulated in glass paperweights. Codeascraft focuses on bridging the gap between technology and craftsmanship, offering unique collectibles for code enthusiasts.

Thread Of Notes

Etsy's marketplace features diverse handmade and unique products, so effective search and recommendations require a nuanced understanding of each listing. Current product information, while rich, is often unstructured and difficult for machine learning models to fully utilize. The core challenge lies in bridging the gap between raw data and the complex details that define each product's appeal. The solution combines a reinforcement learning approach with a contrastive signal. The method fine-tunes an LLM to generate concise product summaries that emphasize distinguishing features, using buyer engagement data. The model is trained on search interaction data and is rewarded for summaries that highlight the features that led a buyer to choose one listing over another, so it learns to prioritize details based on buyer choices. This reinforcement learning drives the model to produce summaries that improve search relevance metrics. Human evaluations and quantitative offline testing demonstrated the summaries' high quality and their positive impact on downstream model performance. The approach focuses on understanding products based on buyer behavior rather than rigid definitions, which better reflects seller creativity. The resulting concise summaries highlight key characteristics that differentiate listings of similar products, and they surface important product details more effectively than simple text features such as keywords. The enhanced product understanding ultimately helps buyers discover products that appeal to their tastes, improving the shopping experience.
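To make the reward idea concrete, here is a minimal sketch, assuming a pairwise setup in which a summary earns a positive reward when a relevance scorer prefers the listing the buyer chose over the one they passed over. The token-overlap scorer, the function names, and the example data are all hypothetical stand-ins; the note does not describe Etsy's actual reward function or relevance model.

```python
def overlap_score(query: str, summary: str) -> float:
    """Toy relevance scorer: fraction of query tokens found in the summary."""
    query_tokens = set(query.lower().split())
    summary_tokens = set(summary.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & summary_tokens) / len(query_tokens)


def contrastive_reward(query: str, chosen_summary: str, skipped_summary: str) -> float:
    """Score margin between the chosen and the skipped listing.

    Positive when the summaries let the scorer prefer the listing the buyer chose.
    """
    return overlap_score(query, chosen_summary) - overlap_score(query, skipped_summary)


# Hypothetical search interaction: the buyer picked the first listing.
reward = contrastive_reward(
    query="personalized leather dog collar",
    chosen_summary="Hand-stitched leather dog collar personalized with engraved name",
    skipped_summary="Nylon dog collar in bright colors",
)
```

In a real training loop the scorer would be a learned relevance model and the reward would feed a policy-gradient update of the LLM; this sketch only shows the shape of the contrastive signal.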
Etsy improved its Ads Search ranking model to enhance buyer engagement and seller visibility. The goal was to better predict purchase intent by surfacing more relevant ad listings. This was achieved through two major enhancements: integrating the Multi-gate Mixture-of-Experts (MMoE) architecture and using add-to-cart as an auxiliary signal. The original multitask model optimized for click-through rate (CTR) and post-click conversion rate (PCCVR), but suffered from data sparsity in later stages of the purchase journey. MMoE addresses the "seesaw phenomenon" in multitask learning, where optimizing one task can degrade another. It introduces specialized "experts" and "gates" that allow tasks to learn unique patterns while still benefiting from shared representations. The MMoE architecture includes a shared bottom followed by experts, which are parallel subnetworks that learn different data patterns. Each task has a gating network that controls how it combines expert outputs, optimizing for both CTR and PCCVR. Tuning the MMoE involved experimenting with the number, size, and type of experts. Heterogeneous experts (DCN- and MLP-based) showed improved metrics. Challenges included ensuring expert utilization and specialization. Regularization techniques such as expert dropout and temperature scaling were explored to address these issues. Temperature scaling, which softens the probability distribution of expert selection, proved more effective in promoting both utilization and specialization. Beyond clicks and purchases, Etsy recognized the value of other user interactions such as add-to-cart and favorites. These actions indicate high purchase intent and are more plentiful than purchases, offering stronger signals for the model. Introducing auxiliary tasks, specifically add-to-cart, helps the model learn more generalizable representations of user engagement. This leverages more frequent signals to benefit the sparser purchase prediction, ultimately leading to a more effective ranking system.
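The gating and temperature-scaling ideas can be illustrated with a small sketch. It assumes the common formulation in which each task's gate applies a softmax over expert logits and the logits are divided by a temperature before the softmax; the expert outputs, gate logits, and temperature values below are made-up numbers, not Etsy's.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over gate logits; a temperature above 1 flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts(expert_outputs, gate_logits, temperature=1.0):
    """Combine expert output vectors using one task's gate weights."""
    weights = softmax(gate_logits, temperature)
    dim = len(expert_outputs[0])
    return [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]


# Hypothetical values: three experts, each emitting a 2-d representation.
expert_outputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
ctr_gate_logits = [4.0, 0.5, 0.5]    # this gate leans heavily on expert 0
pccvr_gate_logits = [0.5, 4.0, 0.5]  # this gate leans heavily on expert 1

sharp = softmax(ctr_gate_logits, temperature=1.0)
soft = softmax(ctr_gate_logits, temperature=4.0)
ctr_input = mix_experts(expert_outputs, ctr_gate_logits, temperature=4.0)
pccvr_input = mix_experts(expert_outputs, pccvr_gate_logits, temperature=4.0)
```

With the higher temperature, the CTR gate still prefers expert 0 but gives the other experts a larger share, which is the kind of softened selection the note credits with better utilization.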
Etsy implemented the Speculation Rules API (SRA) to significantly improve the performance of its product listing pages. This new browser API allows websites to instruct browsers on how to fetch resources for future navigations. SRA offers prefetching, which downloads the HTML, and prerendering, which fully loads and renders a page. While prerendering offers greater gains, prefetching is a less risky starting point. SRA provides a simpler API than traditional prefetching methods and an upgrade path to prerendering. Etsy specifically used SRA to prefetch listing pages when users hovered over organic listings on the desktop search page. The implementation involved adding a script tag with JSON rules that dictate when and what to prefetch. A key lesson was that SRA stores prefetched pages in both memory and the HTTP cache, unlike traditional prefetch, which only uses the HTTP cache. The browser limits the memory cache to two prefetched pages, after which older ones are evicted. Etsy found that making pages cacheable for five minutes when a prefetch request is detected helps mitigate this limitation. Challenges were encountered with elements such as video players that used shadow DOM, requiring workarounds. Setting cookies during prefetching could lead to misleading analytics, but this can be avoided using specific HTTP headers. Redirects did not negatively impact prefetching as long as proper caching headers were set. Modifying href attributes on hover could cause prefetching to fail and evict other cached pages. A significant hurdle was keeping analytics accurate, as prefetched pages are not necessarily viewed. Etsy addressed this by logging events only after page activation and avoiding asset loading and JavaScript execution during prefetches. The implementation resulted in notable performance improvements.
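As a rough illustration of the two server-side pieces described above, the sketch below builds a speculation-rules script tag and picks cache headers for prefetch requests. The `/listing/*` URL pattern, the `moderate` eagerness setting, the `private` cache directive, and both function names are assumptions made for the example; only the five-minute lifetime and the hover trigger come from the note. Chromium-based browsers mark these requests with a `Sec-Purpose: prefetch` header, which is what the second function checks.

```python
import json


def speculation_rules_script(url_pattern: str) -> str:
    """Build a <script type="speculationrules"> tag that prefetches matching links."""
    rules = {
        "prefetch": [
            {
                "where": {"href_matches": url_pattern},
                # "moderate" lets the browser act on hover-like signals on desktop
                "eagerness": "moderate",
            }
        ]
    }
    return f'<script type="speculationrules">{json.dumps(rules)}</script>'


def cache_headers_for(request_headers: dict) -> dict:
    """Make a page cacheable for five minutes when the request is a prefetch.

    Assumes header names arrive with this exact capitalization.
    """
    purpose = request_headers.get("Sec-Purpose", "")
    if purpose.startswith("prefetch"):
        return {"Cache-Control": "private, max-age=300"}
    return {}


script_tag = speculation_rules_script("/listing/*")
prefetch_headers = cache_headers_for({"Sec-Purpose": "prefetch"})
normal_headers = cache_headers_for({})
```

Keeping the short cache lifetime conditional on the prefetch header means ordinary navigations keep whatever caching policy the page already had.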
Etsy adopted Jetpack Compose as its preferred means of building Android apps due to its modern toolkit for defining native UIs. The adoption process involved a gradual expansion of features built using Compose, culminating in a full rewrite of a primary screen in the app. To facilitate the adoption, Etsy engineers created a curriculum based on Compose documentation and examples, and the Design Systems team created Compose versions of internal UI toolkit components. The team also utilized Compose Previews to visualize Composables in various configurations, reducing rework and bugs. After successfully rebuilding an entire screen and exposing it to real users, Etsy adopted Compose on a larger scale in the Shop screen, which resulted in improved performance, developer satisfaction, and user analytics. The adoption of Compose has become the standard way Etsy builds features for its app, with benefits including decoupling state from the UI, isolating business logic, and easier manipulation of spacing and margins. However, the team encountered some challenges, including issues with a third-party library and unexpected behavior with LazyRows and LazyColumns. Overall, Etsy is thrilled with the progress and outcomes of adopting Compose, which has enabled the team to be more productive in delivering new features to buyers. The adoption of Compose is a tangible example of Etsy's commitment to its craft and culture of learning. The team has now fully rewritten several key UI screens using Compose, with more to come.
In July 2023, Etsy's App Updates team set out to revamp the Updates feed into Deals, a hub for coupons and sales. The team considered developing a new tab from scratch using modern technologies such as SwiftUI and Tuist. Balancing ambition with realism, the team adopted a hybrid approach, using SwiftUI for modular development and previewability while integrating with the existing UIKit codebase for navigation and other functionality. SwiftUI's modularity allowed them to break down views into reusable components, enabling rapid development and efficient previewing with Tuist. They implemented Decodable models for clear and efficient API parsing, simplifying view construction and the handling of optionals. Preview enums were introduced to streamline the creation of mock data for complex views, enabling them to build modules even before API support was complete. However, interfacing with the existing codebase posed challenges, particularly in areas like navigation and favoriting. To address this, they created a custom @Environment struct, DealsAction, which passed navigation responsibility back to the main target while maintaining SwiftUI's callAsFunction() feature. Environment objects were used for other functionality, such as favoriting, following shops, and logging performance metrics. This hybrid approach allowed the team to leverage the benefits of modern technologies while respecting the constraints of the legacy codebase, resulting in a successful product launch before the Cyber Week deadline.
Etsy's search by image feature allows users to search for items similar to photos they upload. The feature uses a machine learning model to convert images into numerical representations called embeddings, which are then used for similarity searches. The model is based on a pre-trained convolutional neural network (CNN) that has been fine-tuned for the task of learning image embeddings. The model is trained using a multitask learning approach, where it learns to perform several classification tasks simultaneously, including item category, color, and attributes. To reduce bias towards high-quality product images, the model is also trained on a dataset of user-submitted review photos. The inference pipeline involves building an approximate nearest neighbor (ANN) index using an inverted file (IVF) algorithm to optimize search performance. Query photos are inferred in real-time using GPU inferencing technology to ensure fast response times. The search by image feature was initially developed during Etsy's CodeMosaic hackathon and has since been implemented as a production feature. The feature helps buyers discover unique and special items on Etsy by providing them with a new and intuitive way to search for similar products. The model's architecture and learning objective have been optimized to produce visually consistent results while maintaining categorical accuracy. The addition of review photos to the training dataset has significantly improved the model's ability to surface relevant results from user-submitted photos. The feature has been well-received by users and has contributed to increased buyer engagement and satisfaction on Etsy.
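The inverted file (IVF) idea behind the ANN index can be sketched in a few lines: embeddings are bucketed by their nearest centroid, and a query scans only the closest buckets instead of the whole catalog. The centroids, item names, and 2-d vectors here are toy assumptions; a production index would learn centroids with k-means over high-dimensional embeddings and use a dedicated ANN library.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(centroids, vector):
    """Index of the centroid closest to the vector."""
    return min(range(len(centroids)), key=lambda i: l2(centroids[i], vector))


def build_ivf(centroids, embeddings):
    """Group item ids into one inverted list per centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for item_id, vec in embeddings.items():
        lists[nearest(centroids, vec)].append(item_id)
    return lists


def search(query, centroids, lists, embeddings, nprobe=1, k=2):
    """Scan only the nprobe closest inverted lists, then rank those candidates."""
    order = sorted(range(len(centroids)), key=lambda i: l2(centroids[i], query))
    candidates = [item for i in order[:nprobe] for item in lists[i]]
    return sorted(candidates, key=lambda item: l2(embeddings[item], query))[:k]


# Toy data: two clusters of made-up 2-d "image embeddings".
centroids = [[0.0, 0.0], [10.0, 10.0]]
embeddings = {
    "mug": [0.5, 0.2],
    "bowl": [1.0, 1.5],
    "ring": [9.5, 10.2],
    "necklace": [10.5, 9.0],
}
index = build_ivf(centroids, embeddings)
results = search([0.4, 0.4], centroids, index, embeddings, nprobe=1, k=2)
```

Raising `nprobe` trades speed for recall, since more buckets are scanned per query.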
Etsy employs recommendation modules to present relevant items to users, each powered by a ranker that scores candidate items for relevance. Traditionally, Etsy used module-specific rankers, but this approach became unwieldy as the number of modules grew. To address this, Etsy developed canonical rankers, which are trained to power multiple modules, ensuring efficiency and consistency. The first canonical ranker focused on visit frequency, using favoriting rate as a surrogate for revisits. The frequency ranker's model structure included a shared-bottom architecture with separate layers for favoriting and purchasing predictions, combined into a final ranking score. The ranker also incorporated a module name feature and balanced training data across modules to ensure generalizability. Despite training on data from a limited subset of modules, the canonical ranker outperformed module-specific rankers on modules not used for training, demonstrating its effectiveness as a canonical solution. The frequency ranker improved favorite rates on both item page and homepage modules, with significant improvements in purchase metrics and other engagement indicators. Since its launch, Etsy has deployed the canonical ranker on multiple modules across web and app platforms. Moving forward, Etsy plans to iterate on the frequency ranker, incorporating more context and exploring novel architectures. The canonical ranker represents a shift in Etsy's recommendation strategy, providing more personalized recommendations and a consistent user experience across platforms and modules.
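One simple way to balance training data across modules is to downsample every module to the size of the smallest one, as in the sketch below. The note does not say how Etsy balanced its data, so the downsampling strategy, the module names, and the record layout are all assumptions for illustration.

```python
import random
from collections import defaultdict


def balance_by_module(examples, seed=0):
    """Downsample so every module contributes the same number of examples."""
    by_module = defaultdict(list)
    for example in examples:
        by_module[example["module"]].append(example)
    quota = min(len(rows) for rows in by_module.values())
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    balanced = []
    for module in sorted(by_module):
        balanced.extend(rng.sample(by_module[module], quota))
    return balanced


# Hypothetical training log in which one module dominates the raw data.
examples = (
    [{"module": "item_page_similar", "favorited": i % 2} for i in range(8)]
    + [{"module": "homepage_picks", "favorited": int(i % 3 == 0)} for i in range(3)]
)
balanced = balance_by_module(examples)
counts = defaultdict(int)
for row in balanced:
    counts[row["module"]] += 1
```

Because the `module` field stays on each record, it can still be fed to the ranker as the module name feature after balancing.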
In 2018, Etsy migrated its Kafka brokers to Google Cloud Platform's Kubernetes Engine. Initially operating in a single zone, the team later redesigned the architecture for zonal resilience, distributing brokers across multiple zones with even distribution of partition replicas. To achieve a zero-downtime migration, brokers were moved first by snapshotting disks and then recreating them in the correct zones. Partition relocation was manually handled using scripts and tools to minimize data movement and impact. Post-migration testing in production demonstrated the effectiveness of the multizone design, with minimal disruption during a zone outage. While inter-zone network costs increased as expected, the benefits of automated zone resilience outweigh the costs. The team is optimizing costs by leveraging Kafka's follower fetching feature and exploring additional approaches to reduce cross-zone traffic. Despite some cost increases, the benefits of zonal resilience are significant, justifying the investment. The migration involved complex steps, including disk and Pod movement, partition relocation, and configuration adjustments. The team's careful planning and execution ensured zero downtime and data integrity throughout the process. Etsy's experience highlights the importance of designing for resilience in critical services. By embracing zonal redundancy, the team mitigated the risks associated with single-zone failures and improved the stability and availability of their Kafka cluster. The multizone architecture enables Etsy to handle increased production traffic and critical user-facing features, such as search indexing, with confidence. The company's ongoing efforts to optimize costs demonstrate a commitment to balancing resilience with financial considerations. The case study provides valuable insights into the challenges and strategies involved in migrating and operating a highly available Kafka cluster in a multi-zone cloud environment.
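To show what an even, zone-aware spread of partition replicas looks like, here is a small placement sketch. It assumes a replication factor equal to the number of zones and a hypothetical six-broker cluster; it is not the script Etsy used, and real clusters would usually rely on Kafka's own rack-aware assignment or a reassignment tool.

```python
def assign_replicas(brokers_by_zone, num_partitions):
    """Place one replica of each partition in every zone, rotating brokers."""
    zones = sorted(brokers_by_zone)
    assignment = {}
    for partition in range(num_partitions):
        replicas = []
        for offset in range(len(zones)):
            # Rotate the starting zone so the first (preferred leader) replica
            # is spread across zones rather than piling up in one.
            zone = zones[(partition + offset) % len(zones)]
            brokers = brokers_by_zone[zone]
            replicas.append(brokers[(partition // len(zones)) % len(brokers)])
        assignment[partition] = replicas
    return assignment


# Hypothetical cluster: six brokers spread over three zones.
brokers_by_zone = {
    "zone-a": [0, 3],
    "zone-b": [1, 4],
    "zone-c": [2, 5],
}
zone_of = {b: z for z, bs in brokers_by_zone.items() for b in bs}
assignment = assign_replicas(brokers_by_zone, num_partitions=6)
```

With this layout, losing any single zone leaves two replicas of every partition available. Follower fetching is configured separately in Kafka: brokers advertise their zone through `broker.rack`, and consumers declare theirs through `client.rack` so they can read from a replica in the same zone.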