AI translated#Anthropic #Artificial Intelligence

Claude Opus 4.6 Surpasses GPT-5.2 in Logic Tests and Introduces ‘Agent Teams’

Anthropic upgraded Claude Opus to 4.6, enhancing planning and task management.

06.02.2026 ForkLog

AI startup Anthropic has upgraded its flagship model, Claude Opus, to version 4.6. The neural network has improved its ability to plan actions, handle long-term tasks, and work more efficiently with large codebases.

The context window has been expanded to 1 million tokens, allowing for the analysis of massive documents and extended dialogues without losing the logical thread.

The updated algorithms are tailored for professional tasks: conducting financial analysis, research, and the use and creation of documents, spreadsheets, and presentations.

Opus 4.6 received the highest score in the Terminal-Bench 2.0 programming test and outperformed competitors in the complex interdisciplinary benchmark for logical reasoning, Humanity’s Last Exam.

Comparison of Opus 4.6 with competitors in various tests. Source: Anthropic.

In GDPval-AA, which assesses reasoning and decision-making quality, the model surpassed OpenAI’s GPT-5.2. LLM also achieved better results in BrowseComp, which measures the ability to find hard-to-access information online.

Opus 4.6 efficiently extracts data from large documents. Thanks to the expanded context window, the model tracks and captures subtle hidden details.

Agent Teams

A key innovation is the ability to create groups of agents for collaborative work. In this mode, multiple AI assistants work in parallel and coordinate their tasks autonomously.

This tool is suitable for assignments that are divided into independent parts and require the analysis of large amounts of text.

Closed Loop

Anthropic stated that they are “creating Claude with Claude.” Developers write code using their own AI model, and each new product undergoes testing on internal company tasks before release.

The team found that Opus 4.6 pays more attention to the most challenging parts of a task without additional instructions, quickly completes simple tasks, handles ambiguous problems better, and maintains efficiency over long distances.

“Opus 4.6 often thinks more deeply and thoroughly revises its reasoning before making a decision. This yields better results in solving complex cases but may increase costs and expenses in simpler ones,” the company noted.

Safety

An automated audit revealed that Opus 4.6 has a low propensity for undesirable behavior: deception, flattery, reinforcing user misconceptions, and aiding in improper actions.

To evaluate the model, the company conducted the most comprehensive series of assessments, applying new testing methodologies and enhancing existing ones for the first time.

Availability and New Features

Claude Opus 4.6 is now available via web interface, API, and on major cloud platforms.

New features in the developer toolkit include:

adaptive thinking — the neural network independently determines when to engage in deep reasoning mode;
effort adjustment — four levels of work intensity are provided, from low to maximum;
context compression — the tool automatically summarizes and replaces old context when the conversation approaches the token limit.

Opus 4.6 works better with office tools like Excel and PowerPoint.

Back in January, Anthropic CEO Dario Amodei predicted the imminent emergence of AGI and job reductions.

Подписывайтесь на ForkLog в социальных сетях

Telegram (основной канал) Facebook X

Found a mistake? Select it and press CTRL+ENTER

Рассылки ForkLog: держите руку на пульсе биткоин-индустрии!

US Uncovers $2.5 Billion AI Chip Smuggling Scheme to China

The right to be offline

OpenClaw Hype Triggers Phishing Attacks on Crypto Wallets

AI Band Neon Oni to Tour Japan with Live Musicians

Criticism Mounts Over Nvidia’s DLSS 5 AI Technology

Samsung Adopts Crisis Measures Amid Memory Shortage

Brain Implant Enables Paralysed Individuals to Type on Virtual Keyboard

OpenAI Unveils Fast AI Models GPT-5.4 Mini and GPT-5.4 Nano