Chinese AI laboratory DeepSeek has updated its “reasoning” AI model R1. Its “distilled” version is capable of running on a single graphics card.
DeepSeek-R1-0528-Qwen3-8B is based on Qwen3-8B, which Alibaba introduced in May. According to DeepSeek, the distilled model outperformed Google’s Gemini 2.5 Flash on AIME 2025, a set of challenging math problems.
A “distilled” version is a smaller, faster variant of a large machine learning model, produced through knowledge distillation. Such neural networks are often less capable but far less demanding computationally.
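To make the idea concrete, here is a minimal sketch of the classic logit-matching form of knowledge distillation, in which a small "student" model is trained to imitate the softened output distribution of a large "teacher". This is an illustration only: the article does not say which distillation recipe DeepSeek used, and the function names, the temperature value, and the example logits are all assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Scaled by temperature**2, as is conventional, so the loss magnitude
    stays comparable across temperatures. Hypothetical helper, not
    DeepSeek's code.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]       # made-up scores over three tokens
    good_student = [3.8, 1.1, 0.4]  # close to the teacher
    bad_student = [0.5, 4.0, 1.0]   # prefers a different token
    print(distillation_loss(teacher, good_student))
    print(distillation_loss(teacher, bad_student))
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the signal a training loop would minimize.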
According to NodeShift, Qwen3-8B requires a graphics processor with 40-80 GB of video memory. It can be run on a single Nvidia H100 graphics card.
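For a sense of scale, the sketch below estimates how much memory is needed just to store the weights of a model with a nominal 8 billion parameters at several common numeric precisions. The function name and the precision table are illustrative assumptions and are not NodeShift's method. The estimate covers weights only; it leaves out runtime overhead such as the KV cache and activations, so practical requirements are higher.

```python
# Bytes per parameter for common numeric formats (illustrative, not exhaustive).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(num_params: float, precision: str) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

if __name__ == "__main__":
    nominal_params = 8e9  # assumption: "8B" taken as exactly 8 billion
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weight_memory_gb(nominal_params, precision):.0f} GB")
```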
To train and fine-tune DeepSeek-R1-0528-Qwen3-8B, DeepSeek used the updated version of R1 together with Qwen3-8B.
According to the company, the new version of the main R1 neural network contains minor updates. It is available on the Hugging Face platform.
A developer with the nickname xlr8harder noted that the model is less willing to engage in discussions on controversial topics, especially those related to the Chinese government.
Deepseek R1 0528 is substantially less permissive on contentious free speech topics than previous Deepseek releases.
It’s unclear if this indicates they’ve adapted their post-training goals, or if this is another example of a reasoning model.
— xlr8harder (@xlr8harder) May 29, 2025
“DeepSeek deserves criticism for this release: this model is a significant step back for free speech. This is mitigated by the fact that the neural network is open source with a permissive license, so the community can (and will) address this issue,” he wrote.
In one example, the model refused to provide arguments for human rights violations in internment camps in Xinjiang. It acknowledged that the violations took place but avoided directly criticizing the Chinese government.
“It’s interesting, though not entirely surprising, that it can cite the camps as an example of human rights violations, but denies it when asked directly,” wrote xlr8harder.
Back in April, DeepSeek released a new math-oriented AI model, Prover, to the public.
