
The Race for the AI Office Worker

OpenAI and Anthropic train office AI agents in RL simulators—and eye 'virtual co-workers'.

Leading artificial-intelligence developers such as OpenAI and Anthropic have entered a race to build autonomous AI agents for office work, reports The Information. Against the backdrop of Jensen Huang’s claim that IT departments will turn into ‘HR for neural networks’, the approach is fast becoming a defining industry trend.

ForkLog looks at how large language models (LLMs) are being adapted to carry out corporate tasks.

A new school

Traditionally, AI is trained in two stages: the model first ingests vast amounts of data (pre-training), then it is fine-tuned for a particular specialism.

For office work, that is not enough. A model must learn to operate applications like a person—taking goal-directed actions in a digital environment while grasping cause and effect.

The gap between merely ‘knowing’ and ‘doing’ has given rise to a new school for artificial intelligence. Instead of learning only from static data, models are now sent on ‘internships’ in virtual copies of real office apps.

These so-called RL environments are simulations of popular services such as Salesforce, LinkedIn or Gmail, where a model experiments, receives feedback and improves its skills.
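In code terms, an RL environment of this kind can be pictured as a sandboxed copy of the application wrapped in a standard reinforcement-learning interface: the agent observes the app's state, issues UI-level actions and is rewarded only when the task is genuinely complete. The sketch below is a minimal illustration under that assumption; the class name SimulatedCrmEnv, the action format and the reward rule are invented for the example and do not describe any lab's actual tooling.

```python
from dataclasses import dataclass, field


@dataclass
class SimulatedCrmEnv:
    """Hypothetical sketch of an RL environment wrapping a cloned CRM app.

    Illustrative only: real training environments are far richer, but the
    reset/step/reward shape is the standard reinforcement-learning contract.
    """
    # Fake CRM records the agent can safely act on inside the sandbox.
    clients: list = field(default_factory=lambda: [
        {"name": "Acme", "months_since_contact": 8, "emailed": False},
        {"name": "Globex", "months_since_contact": 2, "emailed": False},
    ])

    def reset(self) -> dict:
        """Start a new episode and return the initial observation."""
        for client in self.clients:
            client["emailed"] = False
        return {"clients": self.clients,
                "task": "email every client with no contact for 6+ months"}

    def step(self, action: dict) -> tuple[dict, float, bool]:
        """Apply one UI-level action, e.g. {'type': 'send_email', 'client': 'Acme'}."""
        if action.get("type") == "send_email":
            for client in self.clients:
                if client["name"] == action.get("client"):
                    client["emailed"] = True
        # The episode ends once every stale client has been contacted.
        done = all(c["emailed"] for c in self.clients if c["months_since_contact"] > 6)
        reward = 1.0 if done else 0.0  # sparse reward: task complete or not
        return {"clients": self.clients}, reward, done
```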

Simulators for AI agents

RL environments are akin to flight simulators for AI agents. Leading labs are already building them. Turing was among the first startups to create more than 1,000 virtual trainers—from Airbnb and Zendesk clones to Microsoft Excel sheets. The project supplies its tools to, among others, OpenAI. In June 2025 it raised $111m.

Others have followed. Scale has received $14bn from Meta, while rival Surge is in talks to raise $1bn.

An AI agent’s ‘training day’ typically proceeds as follows (a rough code sketch of this loop appears after the list):

  • first it receives a natural-language instruction—for example, analyse a Salesforce database, find clients with no contact for more than six months and email them to propose a meeting;
  • the model ‘wakes up’ inside the app’s virtual interface and begins to act by trial and error;
  • a checklist of correct actions covers each detail, and a system verifies completion. If the agent succeeds, its strategy is reinforced. If not, error analysis helps adjust steps on the next attempt.
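
Expressed as code, the ‘training day’ boils down to attempt, verify against the checklist, then reinforce or retry. The sketch below reuses the hypothetical SimulatedCrmEnv from the earlier example; the agent object and its act, reinforce and learn_from_failure methods are placeholders for whatever policy and update machinery a lab actually uses.

```python
def verify_checklist(env: SimulatedCrmEnv) -> bool:
    """Hypothetical verifier: were all stale clients emailed, and only those?"""
    return all(
        client["emailed"] == (client["months_since_contact"] > 6)
        for client in env.clients
    )


def training_day(agent, env: SimulatedCrmEnv, attempts: int = 100) -> None:
    """Run many attempts at the same task, reinforcing successful strategies."""
    instruction = "Find clients with no contact for 6+ months and email them a meeting request."
    for _ in range(attempts):
        observation = env.reset()
        trajectory = []                       # (observation, action) pairs for this attempt
        for _ in range(20):                   # cap the number of actions per attempt
            action = agent.act(instruction, observation)   # placeholder policy call
            trajectory.append((observation, action))
            observation, _, done = env.step(action)
            if done:
                break
        if verify_checklist(env):
            agent.reinforce(trajectory)           # success: strengthen this strategy
        else:
            agent.learn_from_failure(trajectory)  # failure: adjust steps on the next attempt
```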

The chief advantage is scale and safety. An AI agent can iterate on the same task for hours, turning actions into muscle memory—without spamming real clients, breaking a database or deleting critical information.

Costly teachers

To make training as effective as possible, companies hire specialists across fields—from biology to software engineering and medicine. They demonstrate how to use workplace tools correctly. The model captures not only the steps but the expert’s decision logic.
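One plausible way to capture not just the clicks but the expert's decision logic is to record each demonstration as a sequence of steps annotated with the reasoning behind them, which the model is then fine-tuned on. The structure below is an assumption made for illustration, not any vendor's actual data format.

```python
from dataclasses import dataclass


@dataclass
class DemonstrationStep:
    """One annotated step of a hypothetical expert demonstration."""
    observation: str  # what the expert saw in the tool
    action: str       # what the expert did
    rationale: str    # why: the decision logic the model should absorb


# A fragment of an imagined CRM workflow recorded by a sales expert.
demo = [
    DemonstrationStep(
        observation="Client list filtered by last contact date",
        action="Sort descending by months since last contact",
        rationale="Stale accounts are the ones worth re-engaging first.",
    ),
    DemonstrationStep(
        observation="Acme: 8 months since contact, renewal due next quarter",
        action="Draft a meeting request that references the upcoming renewal",
        rationale="Tie the outreach to a concrete decision the client faces.",
    ),
]
```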

Standards tighten as progress is made. Early on, student-level knowledge sufficed; now labs recruit professionals from organisations such as NASA and other government projects.

Demand is pushing up prices. According to Labelbox, which supplies specialists to OpenAI and other giants, about 20% of its contractors earn more than $90 per hour and nearly 10% make over $120. Rates for top experts are expected to jump to $150–250 within the next 18 months.

OpenAI plans to spend about $1bn on experts and RL environments in 2025 and $8bn by 2030. Anthropic is rumoured to be prepared to devote up to $1bn over the next year solely to creating and using virtual applications.

The bigger prize

One might assume Anthropic and OpenAI are pouring billions into simulators and experts merely to make their models a touch smarter. The real goal is larger: to break through the ceiling of today’s AI and build a new business model.

First, training LLMs on internet text to predict the next word has hit diminishing returns. RL environments offer a qualitatively different path. They enable AI not just to generate text but to act within complex, multi-step processes—the key to real autonomy.

According to Surge CEO Edwin Chen, Anthropic and OpenAI’s methods “mirror how people learn”, placing AI models in conditions as close as possible to the real world.

Most importantly, RL environments promise monetisation. For the AI giants, selling API access to a chatbot is only the first step. The next, far more valuable model is renting out ‘virtual employees’. In this hybrid reality, AI agents will chiefly process data and handle administrative work.

On the one hand, such a shift sparks optimism about unprecedented gains in productivity and the quality of routine operations. On the other, it raises understandable anxiety about job displacement.

Will they replace us all?

Startup Magazine argues that the ideal hybrid model augments, rather than replaces, people. As an example, experts highlighted customer support (a rough triage sketch follows the list):

  • from the employee’s standpoint: a digital assistant, trained on an internal knowledge base and able to adopt the company’s brand voice, takes on up to 80% of routine tasks (tracking orders, answering FAQs, generating standard reports). That frees people from monotony, lowers burnout risk and lets them focus on complex, emotionally charged cases requiring empathy and non-standard thinking;
  • from the client’s standpoint: they get instant, accurate answers to simple queries at any hour; when they do reach a human, the service is deeper and better because the specialist is not consumed by drudgery.
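
The division of labour described above amounts to a triage rule: the assistant resolves routine, low-stakes queries itself and hands anything complex or emotionally charged to a person. The sketch below is purely illustrative; the intent categories and the escalation criteria are assumptions, not a description of any particular support product.

```python
ROUTINE_INTENTS = {"order_tracking", "faq", "standard_report"}


def route_ticket(intent: str, sentiment: str) -> str:
    """Hypothetical triage rule for a hybrid human/AI support desk."""
    if intent in ROUTINE_INTENTS and sentiment != "distressed":
        return "ai_assistant"   # routine and calm: handled automatically
    return "human_agent"        # complex or emotionally charged: escalated


# Routine queries go to the assistant, edge cases reach a person.
assert route_ticket("order_tracking", "neutral") == "ai_assistant"
assert route_ticket("billing_dispute", "distressed") == "human_agent"
```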

Experts also note that success will depend on the company’s approach. Leaders need to focus on employee sentiment, invest in reskilling and keep experimenting.

The strategy of big firms betting on AI agents indirectly supports this view. The aim is not to replace humans but to give them a powerful partner that shoulders most day-to-day tasks.

Moreover, as studies show, the rapid advance of AI has not materially affected the labour market.

An economy-sized simulator

The vision of the future discussed in OpenAI’s corridors is bolder still. As The Information reports, one of the company’s senior executives privately said he expects the “entire economy” to turn into one big “RL machine”.

That would mean AI learning not from artificial simulations but from real recordings of professionals’ workflows across the world: how a doctor makes a diagnosis in a medical system, a logistics specialist optimises supply chains, or a lawyer drafts a contract.
