Deepseek: Innovation Amidst US Chip Sanctions in China’s AI Landscape

Deepseek is making waves in the tech world, surprising even its US competitors. Hancheng Cao, an assistant professor at Emory University, suggests that its success could be a form of justice for researchers in the Global South, who often have limited resources. The success of Deepseek, a Chinese developer of large language models, is especially noteworthy given the challenges Chinese startups face under US export controls on advanced chips. These sanctions, intended to weaken China’s AI capabilities, seem instead to have pushed startups like Deepseek to innovate, with a focus on efficiency and local collaboration.

To develop its reasoning system, R1, Deepseek had to revise its training process to reduce the load on its existing GPUs. The company says it uses a version of Nvidia’s chips tailored for the Chinese market, which reportedly offers about half the performance of top-tier US products. Despite this, R1 is praised for handling complex tasks in math and programming. Like ChatGPT, it uses a “Chain of Thought” approach that lets it solve problems step by step.

Dimitris Papailiopoulos from Microsoft’s AI Frontiers lab notes the simplicity of R1’s technology. Deepseek aims for precise answers rather than detailing every logical step, which reduces computation time while maintaining effectiveness. The company also released smaller R1 variants that can run on laptops, and claims that one of them outperforms OpenAI’s o1-mini on certain benchmarks.

Deepseek remains relatively unknown. It was founded in Hangzhou in July 2023 by Liang Wenfeng, an alumnus of Zhejiang University who also founded the hedge fund High-Flyer in 2015. Like Sam Altman of OpenAI, Liang aims to build Artificial General Intelligence (AGI), a form of AI that matches or surpasses human abilities across a range of tasks. Training large language models requires a skilled team and significant computing power, a challenge exacerbated by US export controls on high-end chips.

Anticipating sanctions, Liang stockpiled Nvidia A100 chips, which are now banned from export to China. Reports suggest Deepseek has between 10,000 and 50,000 units. Although the chips are now outdated, this stockpile was crucial to Deepseek’s development. In a market dominated by giants like Alibaba and ByteDance, Deepseek stands out for not seeking investor funds. Former employee Zihan Wang says that at Deepseek he had access to ample computing resources and the freedom to experiment.

Liang acknowledges that Chinese AI development is inefficient, often requiring double the computing power to achieve similar results. Deepseek has found ways to reduce memory usage and speed up calculations without sacrificing accuracy. Liang remains heavily involved in research, fostering a culture of collaboration and hardcore research within the team.

Chinese companies are increasingly adopting open-source principles. Alibaba Cloud has released over 100 open-source AI models, supporting 29 languages. Startups like Minimax and 01.AI have also made their models open source. A whitepaper from the China Academy of Information and Communications Technology notes that 36% of large AI language models globally are from China, making it the second-largest provider after the US.

US export controls have forced Chinese companies to manage their limited computing resources more efficiently. Matt Sheehan of the Carnegie Endowment for International Peace predicts that this resource scarcity will drive consolidation in the industry. Alibaba Cloud recently partnered with 01.AI to create a joint research lab, reflecting a trend toward division of labor in the AI sector. The rapid development of AI requires Chinese companies to be agile to survive, and how the US will respond remains to be seen.