
VaultGemma: The world's most capable differentially private LLM

Building AI with privacy at its core is a crucial frontier as AI becomes more integrated into our lives. Differential privacy (DP) offers a mathematically robust solution: calibrated noise is added during training to prevent memorization. However, applying DP to LLMs introduces trade-offs that alter traditional scaling laws, reducing training stability and increasing compute costs. New research establishes scaling laws that accurately model these intricacies, providing a complete picture of the compute-privacy-utility trade-offs. Guided by this research, VaultGemma was introduced: a 1B-parameter model that is the largest open model trained from scratch with differential privacy.

The research quantifies the benefit of increasing model size, batch size, and number of iterations in DP training, focusing primarily on the noise-batch ratio. A key finding is that, with DP, one should train a smaller model with a larger batch size than one would use without DP. VaultGemma was built using these scaling laws together with advanced training algorithms, and it represents a significant step forward in private AI. It shows no detectable memorization of its training data, which validates the efficacy of DP training. A utility gap persists between DP-trained and non-DP-trained models, and this research aims to narrow it systematically.
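To make "calibrated noise" and the noise-batch ratio concrete, here is a minimal sketch of one differentially private gradient averaging step in the style of DP-SGD. It is illustrative only and is not VaultGemma's training code. The function names, the parameters `clip_norm` and `noise_multiplier`, and the definition of the ratio as noise multiplier divided by batch size are assumptions made for this example.

```python
import math
import random


def clip(grad, clip_norm):
    """Scale one per-example gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_mean_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each gradient, sum them, add Gaussian noise, then average.

    Clipping bounds how much any single example can move the sum, and the
    noise standard deviation is calibrated to that bound.
    """
    batch_size = len(per_example_grads)
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        for i, g in enumerate(clip(grad, clip_norm)):
            total[i] += g
    sigma = noise_multiplier * clip_norm  # noise std on the summed gradient
    return [(t + rng.gauss(0.0, sigma)) / batch_size for t in total]


def noise_batch_ratio(noise_multiplier, batch_size):
    """Assumed definition: noise multiplier relative to batch size."""
    return noise_multiplier / batch_size


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.3, 0.4], [-1.0, 2.0], [0.5, -0.5]]
    print(dp_mean_gradient(grads, clip_norm=1.0, noise_multiplier=1.0, rng=rng))
    # Same noise multiplier, larger batch: smaller noise relative to signal.
    for batch_size in (256, 1024, 4096):
        print(batch_size, noise_batch_ratio(1.0, batch_size))
```

Under these assumptions, the noise added to the averaged gradient shrinks as the batch grows while the noise multiplier stays fixed, which is one way to read the finding that DP training favors larger batches.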