Ars Technica

New study accuses LM Arena of gaming its popular AI benchmark

The popular AI vibe test may not be as fair as it seems.
arstechnica.com