Chinese Startup DeepSeek Unveils AI Model Surpassing Meta and OpenAI
Chinese AI startup DeepSeek has introduced its own large language model, which has outperformed competitors from Meta and OpenAI in tests.
Introducing DeepSeek-V3!
Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
Enhanced capabilities
API compatibility intact
Fully open-source models & papers 1/n pic.twitter.com/p1dV9gJ2Sd
— DeepSeek (@deepseek_ai) December 26, 2024
DeepSeek V3 has 671 billion parameters. By comparison, Llama 3.1 405B has 405 billion. Parameter count is a rough proxy for a model's capacity to handle more complex applications and give more accurate responses.
The Hangzhou-based company trained the model in about two months for $5.58 million, using 2,048 GPUs, far fewer computational resources than larger tech companies typically use. DeepSeek promises the best price-to-performance ratio on the market.
API Pricing Update
Until Feb 8: same as V2!
From Feb 8 onwards:
Input: $0.27/million tokens ($0.07/million tokens with cache hits)
Output: $1.10/million tokens
Still the best value in the market!
3/n pic.twitter.com/OjZaB81Yrh
— DeepSeek (@deepseek_ai) December 26, 2024
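To make the announced rates concrete, here is a minimal Python sketch of how a bill could be estimated from them. Only the three prices come from the announcement; the function name `estimate_cost`, the `cache_hit_ratio` parameter, and the assumption that cached and uncached input tokens are billed proportionally are illustrative and not taken from DeepSeek's documentation.

```python
# Prices in USD per million tokens, as quoted in the announcement
# (effective from Feb 8 onwards).
INPUT_CACHE_MISS = 0.27
INPUT_CACHE_HIT = 0.07
OUTPUT = 1.10


def estimate_cost(input_tokens, output_tokens, cache_hit_ratio=0.0):
    """Rough cost estimate in USD.

    cache_hit_ratio is a hypothetical knob: the fraction of input tokens
    assumed to be served from cache (0.0 = none, 1.0 = all).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")

    hit_tokens = input_tokens * cache_hit_ratio
    miss_tokens = input_tokens - hit_tokens

    total = (
        miss_tokens * INPUT_CACHE_MISS
        + hit_tokens * INPUT_CACHE_HIT
        + output_tokens * OUTPUT
    )
    return total / 1_000_000


if __name__ == "__main__":
    # One million tokens in and out, with no cache hits and with all hits.
    print(round(estimate_cost(1_000_000, 1_000_000), 2))
    print(round(estimate_cost(1_000_000, 1_000_000, cache_hit_ratio=1.0), 2))
```

Under these assumptions, one million input tokens plus one million output tokens would cost about $1.37 with no cache hits and about $1.17 if every input token hit the cache.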
Future plans include adding multimodality and “other advanced features.”
Former OpenAI team member Andrej Karpathy noted that DeepSeek has demonstrated impressive research and development with limited resources.
DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M).
For reference, this level of capability is supposed to require clusters of closer to 16K GPUs, the ones being… https://t.co/EW7q2pQ94B
— Andrej Karpathy (@karpathy) December 26, 2024
“Does this mean you don’t need large GPU clusters for frontier LLMs? No, but you should ensure you’re not wasting what you have. This looks like a good demonstration that there’s still much to be done with both data and algorithms,” he added.
Previously, DeepSeek introduced DeepSeek-R1-Lite-Preview, a “super-powered” reasoning model positioned as a “competitor to OpenAI’s o1.”
Back in July, Chinese company Kuaishou opened its Kling AI video-generation model to the public.