DeepSeek V3: The Chinese AI That’s Outsmarting the World


DeepSeek, a Chinese AI firm, has launched DeepSeek V3, an openly available model that excels at text-based tasks such as coding, translation, and content creation. With 671 billion parameters, DeepSeek V3 outperforms competitors including Meta’s Llama 3.1 and OpenAI’s GPT-4 across both open and closed model categories. It shines particularly in coding competitions and on the Aider Polyglot test, which measures how well a model integrates new code into an existing codebase.

Trained on a dataset of 14.8 trillion tokens, DeepSeek V3 represents a significant engineering achievement, especially considering it was trained on Nvidia H800 GPUs, hardware whose sale to China was recently restricted by the U.S. Department of Commerce. Remarkably, training took only two months and cost $5.5 million, a fraction of what comparable models such as GPT-4 are estimated to have cost.

However, DeepSeek V3’s political neutrality is questionable: it avoids topics such as Tiananmen Square, reflecting Chinese regulations that require AI systems to promote “core socialist values.” Even so, DeepSeek’s progress, backed by the hedge fund High-Flyer Capital Management, underscores the rapid evolution of AI and the narrowing gap between open and closed-source models.
Read more at TechCrunch…