DeepSeek Unveils Updated AI Model V3.1

Chinese AI startup DeepSeek has updated its flagship V3 model and removed references to its R1 reasoning model from its chatbot, according to SCMP.

The company announced the release of V3.1 on WeChat. The update expands the model’s context window to 128,000 tokens, allowing it to retain more information during user interactions; that volume is roughly the length of a 300-page book.
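The page comparison can be sanity-checked with back-of-the-envelope arithmetic; the words-per-token and words-per-page ratios below are common rules of thumb, not figures published by DeepSeek.

```python
# Rough sanity check of the "128K tokens ≈ a 300-page book" comparison.
# The ratios below are typical assumptions, not DeepSeek's numbers:
# ~0.75 English words per token and ~300 words per printed page.
CONTEXT_TOKENS = 128_000
WORDS_PER_TOKEN = 0.75   # assumed average for English text
WORDS_PER_PAGE = 300     # assumed typical book page

words = CONTEXT_TOKENS * WORDS_PER_TOKEN
pages = words / WORDS_PER_PAGE
print(f"{words:,.0f} words ≈ {pages:.0f} pages")  # ~96,000 words ≈ ~320 pages
```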

The model is also noted for its high token efficiency.

In the Aider Polyglot benchmark, which evaluates LLMs on complex programming tasks across multiple languages, DeepSeek V3.1 outperforms Claude 4 Opus.

V3.1 maintains a balance between generation speed and quality. It contains 685 billion parameters and is built on a hybrid architecture, delivering strong performance in dialogue, reasoning, and programming tasks.

DeepSeek has also removed references to R1 from its deep thinking feature. SCMP speculated that this could signal difficulties in developing the anticipated R2 model.

Update:

On August 21, the company released an official announcement on X.

Key features include:

  • Hybrid reasoning mode — the model decides on its own whether to engage more resources for “thinking” about a question (see the API sketch after this list);
  • Faster thinking — V3.1 provides answers more quickly than DeepSeek-R1-0528;
  • Enhanced agent skills.
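For readers who access DeepSeek programmatically, the sketch below shows how the two modes of the hybrid model are typically selected. It assumes DeepSeek’s OpenAI-compatible endpoint at https://api.deepseek.com and the model names `deepseek-chat` (non-thinking) and `deepseek-reasoner` (thinking); the exact names and routing may change, so check the official documentation.

```python
# Minimal sketch: choosing DeepSeek's non-thinking vs. thinking mode.
# Assumes the OpenAI-compatible endpoint and the model names
# "deepseek-chat" / "deepseek-reasoner"; verify against the current docs.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

def ask(question: str, think: bool = False) -> str:
    """Send a question, optionally routing it to the reasoning ("thinking") mode."""
    model = "deepseek-reasoner" if think else "deepseek-chat"
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

print(ask("Summarise DeepSeek V3.1 in one sentence."))        # fast, non-thinking answer
print(ask("Prove that sqrt(2) is irrational.", think=True))   # reasoning mode
```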

The AI startup DeepSeek gained attention in January with the release of its reasoning-focused R1 model. It demonstrated strong results at relatively low capital cost, leading experts to question the need for billion-dollar investments in the AI sector and to suggest the industry might be overvalued.

In June, the Chinese startup began hiring interns to label medical data to improve the application of artificial intelligence in hospitals.
