OpenAI is testing another new way to expose the complicated processes at work inside large language models. Researchers at the company can make an LLM produce what they call a confession, in which the model explains how it carried out a task and (most of the time) owns up to any bad behavior. Figuring out…
technologyreview.com
AI and ML News on Bluesky @ai-news.at.thenote.app
