Slashdot

DeepSeek Accelerates AI Model Timeline as Market Reacts To Low-Cost Breakthrough

Chinese AI startup DeepSeek is accelerating the release of its R2 model after the success of R1, which outperformed many US competitors at a lower cost and triggered a market selloff of more than $1 trillion. R1 was released in January, and R2 was originally planned for May, but the company now wants to release it as early as possible. R2 promises improved coding capabilities and reasoning in languages beyond English. DeepSeek's parent company, High-Flyer, invested early in computing power, including two supercomputing clusters, which gives the company a competitive advantage. The second cluster, Fire-Flyer II, consists of around 10,000 Nvidia A100 chips. DeepSeek's cost efficiency comes from architecture choices such as Mixture-of-Experts and multi-head latent attention. Its pricing is 20 to 40 times cheaper than OpenAI's equivalent models, according to Bernstein analysts. That competitive pressure has already forced OpenAI to cut prices and release a scaled-down model, and Google's Gemini has introduced discounted access tiers in response.