Google DeepMind recently published a new algorithm for curating AI training datasets: multimodal contrastive learning with joint example selection (JEST), which uses a pre-trained model to score the learnability of batches of data. Google's experiments show that image-text models trained with JEST-curated data require 10x less computation than baseline methods.
www.infoq.com
www.infoq.com
Create attached notes ...