Deepseek’s R1 AI Model: A Temporary Win in China’s AI Race Against the US

The recent release of Deepseek’s R1 AI model from China has caused a stir among major US tech companies. But it does not mean that China has won the race for technological supremacy in AI. The US public’s reaction to R1 is reminiscent of the shock of 1957, when the Soviet Union launched the first satellite and seemed to demonstrate its technological superiority. R1 not only matches the capabilities of OpenAI’s models but is also more efficient and open source.

Initial reactions to R1 have ranged from panic, reflected in a sharp drop in Nvidia’s stock price, to fatalism, with some concluding that China has already won the AI race. Others, such as tech investor Neal Khosla, suggest that R1’s release is a psychological tactic by China to destabilize the US economy. A closer look at the technical details, however, shows that R1 alone will not end Silicon Valley’s dominance.

Technically, Deepseek’s R1 is well executed but not groundbreaking. Experts have praised the model, yet it contains no fundamental innovations. R1 is a reasoning model with a large language model at its core: it breaks a task into smaller steps, works through them sequentially, and then picks the best solution. The model learned this behavior by being trained on examples of effective chains of reasoning.
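To make the "break the task into steps and keep the best solution" idea concrete, here is a minimal sketch in Python. The model call and the scoring function are stand-ins invented for illustration; this is not Deepseek’s actual training or inference code.

```python
# Sketch: sample several candidate reasoning chains, keep the highest-scoring one.
# call_model() and score_chain() are hypothetical stand-ins for an LLM and a verifier.
import random

def call_model(prompt: str) -> list[str]:
    """Stand-in for a large language model producing a step-by-step chain of thought."""
    steps = random.randint(2, 4)
    return [f"step {i + 1}: partial reasoning about '{prompt}'" for i in range(steps)]

def score_chain(chain: list[str]) -> float:
    """Stand-in for a reward or verifier signal rating how promising a chain looks."""
    return random.random()

def answer(prompt: str, num_candidates: int = 4) -> list[str]:
    # Generate several candidate chains of thought ...
    candidates = [call_model(prompt) for _ in range(num_candidates)]
    # ... and return the one the scorer rates best.
    return max(candidates, key=score_chain)

if __name__ == "__main__":
    for step in answer("What is 17 * 24?"):
        print(step)
```

Sampling more candidates costs more computation per query, which is why reasoning models are more expensive to run than ordinary language models.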

The efficiency of R1’s training and execution likely stems from earlier work in which less efficient large models were used to generate heuristics and training data for smaller models, an approach that resembles Meta’s strategy with its Llama models. Deepseek also began training before US export restrictions on AI chips had tightened, which allowed the company to stockpile Nvidia’s powerful A100 chips.
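The following sketch illustrates the general idea of using a large model to produce training data for a smaller one. The function names and data format are assumptions made for illustration, not Deepseek’s actual pipeline.

```python
# Sketch of distillation-style data generation: a large "teacher" model writes
# worked solutions, which become supervised training data for a smaller "student".
# teacher_solve() is a hypothetical stand-in for an expensive large model.
import json

def teacher_solve(question: str) -> str:
    """Stand-in for a large model producing a detailed worked answer."""
    return f"Detailed reasoning and final answer for: {question}"

def build_training_set(questions: list[str], path: str) -> None:
    """Write (question, teacher answer) pairs as JSON lines for student fine-tuning."""
    with open(path, "w", encoding="utf-8") as f:
        for q in questions:
            record = {"prompt": q, "completion": teacher_solve(q)}
            f.write(json.dumps(record) + "\n")

if __name__ == "__main__":
    build_training_set(
        ["Prove that the sum of two even numbers is even.",
         "What is the capital of France?"],
        "distillation_data.jsonl",
    )
    # A smaller student model would then be fine-tuned on distillation_data.jsonl.
```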

The release of R1 does not mean that US export restrictions were ineffective; such measures take time to show results. It does indicate, however, that sanctions can also accelerate technological development: a shortage of powerful hardware creates an incentive to find alternative solutions.

Deepseek’s decision to release the model openly is likely political, aimed at weakening OpenAI. The Chinese government hopes that rapid AI development will boost productivity. Chinese policy also follows a pattern known as Fang-Shou, in which phases of openness alternate with phases of tightened control. For now, China is prioritizing AI development, but this openness may not last long.

Former OpenAI researcher Miles Brundage argues that China will not tolerate a chaotic AI landscape indefinitely; the current openness will eventually end. Although reasoning models need more computing power per query than regular language models, Deepseek’s existing resources are sufficient to keep R1 online for now. That success matters for China’s AI industry and challenges US companies that aim to profit from AI.

This situation might not last, however. If R1 proves useful, demand will rise both at home and abroad, and decisions will have to be made about whether to use the available computing resources to serve that demand, to train new models, or to improve existing ones. The shortage of powerful AI chips will continue to slow China’s AI progress. If the sanctions remain in place or intensify, China’s AI industry will need a major breakthrough to repeat R1’s success, and whether or when that will happen is uncertain.