Deepseek, a Chinese AI company, has unveiled its latest model, Deepseek V3, which is open-source and can be used and adapted for a variety of applications. According to the company’s internal benchmark tests, as reported by TechCrunch, Deepseek V3 surpasses both open and closed AI models, such as Meta’s Llama 3.1 and OpenAI’s GPT-4o, in several categories.
Deepseek V3 excels in programming tasks. On Codeforces, a platform that hosts programming competitions, it outperformed its competitors. It also topped the Aider Polyglot benchmark, which evaluates whether a model can write new code that integrates successfully with existing code.
Deepseek V3 represents a new level of open-source AI. The model was trained on a massive dataset of 14.8 trillion tokens. For perspective, one million tokens are roughly equivalent to 750,000 words. With 671 billion parameters, or 685 billion as listed on the AI development platform Hugging Face, it is significantly larger than Meta’s Llama 3.1, which has 405 billion parameters. A higher parameter count often correlates with better performance but also demands more powerful hardware.
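To put these figures in perspective, the following is a rough back-of-the-envelope sketch in Python. It applies the rule of thumb quoted above (one million tokens to roughly 750,000 words) to the training set, and estimates how much memory the weights alone would occupy. The function names and the 16-bit and 8-bit storage widths are assumptions chosen for illustration, not details reported about the model, and the estimate ignores everything beyond the raw weights.

```python
# Back-of-the-envelope figures based on the numbers quoted in the article.
# Rule of thumb from the article: 1,000,000 tokens ~ 750,000 words.
WORDS_PER_TOKEN = 0.75


def tokens_to_words(tokens: float) -> float:
    """Convert a token count to an approximate word count."""
    return tokens * WORDS_PER_TOKEN


def weight_memory_gb(parameters: float, bytes_per_parameter: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Assumption: this ignores activations, caches, and runtime overhead,
    so real hardware requirements would be higher.
    """
    return parameters * bytes_per_parameter / 1e9


if __name__ == "__main__":
    training_tokens = 14.8e12  # 14.8 trillion tokens
    words = tokens_to_words(training_tokens)
    print(f"Training data: about {words / 1e12:.1f} trillion words")

    models = [("Deepseek V3", 671e9), ("Llama 3.1", 405e9)]
    # Storage widths below are illustrative assumptions, not reported settings.
    precisions = [("16-bit", 2), ("8-bit", 1)]
    for name, params in models:
        for label, nbytes in precisions:
            gb = weight_memory_gb(params, nbytes)
            print(f"{name} at {label}: about {gb:,.0f} GB for weights")
```

Under these assumptions, the weights of a 671-billion-parameter model exceed the memory of any single GPU at either precision, which is consistent with the point that such a model needs several high-end GPUs to run.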
Without optimization, Deepseek V3 requires a bank of high-end GPUs to run at an acceptable speed. The efficiency of its development is impressive: despite US trade restrictions, the Chinese AI company trained the model in two months on Nvidia H800 GPUs, spending only 5.5 million USD. By comparison, estimates suggest OpenAI invested far more in training GPT-4.
Deepseek is not new to the AI scene. With its previously released model, Deepseek-R1, the company competed with OpenAI’s o1 reasoning model. The organization is backed by High-Flyer Capital Management, a hedge fund that focuses on AI-based trading strategies and operates large data centers with thousands of Nvidia GPUs.
Despite its technical achievements, Deepseek V3 faces a clear political constraint. Chinese AI models must align with the “core values of socialism,” as mandated by China’s internet regulatory authority. Consequently, many Chinese AI systems avoid answering sensitive questions that might upset regulators.
For instance, Deepseek V3 remains silent when asked about the Tiananmen Square Massacre in Beijing. Instead of acknowledging the bloody suppression of the Chinese pro-democracy protests, the bot merely states that it is designed to provide useful and harmless answers.