# DeepSeek Goes Viral
DeepSeek, an AI startup based in Hangzhou, China, has drawn significant attention across the global AI industry with the launch of its large language model DeepSeek-V3 in late December 2024. The model has 671 billion parameters yet took only about two months to train, at a cost of roughly 5.58 million USD, far below the sums invested by other major technology companies.
DeepSeek-V3 achieves top performance among open-source models and is competitive with the most advanced models in the world. The company optimized its training process to minimize costs, using about 2.78 million GPU hours on Nvidia H800 GPUs, the export-compliant chips Nvidia sells in the Chinese market. This demonstrates that Chinese AI companies have made significant progress despite US restrictions on access to the advanced semiconductors needed for AI training.
DeepSeek's success has raised concerns in the US technology industry, with shares of Nvidia and other technology companies plummeting. Experts attribute DeepSeek's ability to reach high performance at a fraction of the cost of its US counterparts to its use of open-source technology and efficient training methods.
In addition, DeepSeek has openly released the model's weights and code along with detailed technical documentation, allowing researchers and developers worldwide to study and build on the technology. This transparency contrasts with the more secretive approach of leading US AI labs and may change how large technology companies develop models in the future.