ForkLog

OpenAI Unveils AI Agent ‘Operator’

OpenAI has introduced its own AI agent, ‘Operator’, capable of performing tasks on the internet on behalf of users.

The new tool can browse web pages and interact with them by typing text, scrolling, and clicking buttons.

‘Operator’ can be asked to handle a range of routine, repetitive tasks, such as filling out forms, ordering groceries, or booking hotels.

“The ability to use the same interfaces and tools that people interact with daily expands the application of AI, helping to save time on everyday tasks and opening new opportunities for business interaction,” states the OpenAI announcement.

‘Operator’ is powered by a new AI model, Computer-Using Agent (CUA). It combines GPT-4o's ability to see the screen with enhanced reasoning developed through reinforcement learning. The agent processes information via screenshots and can perform the same actions as a human using a mouse and keyboard.

The model is trained to request confirmation before completing tasks such as booking a hotel or sending an email.
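The behaviour described above (observe a screenshot, choose a mouse or keyboard action, and pause for confirmation before a consequential step) can be illustrated with a minimal sketch. This is not OpenAI's implementation or API: `Action`, `run_agent`, and the `policy`, `take_screenshot`, `execute`, and `confirm` callbacks are hypothetical names introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical action record; not part of any OpenAI API."""
    kind: str                # "click", "type", "scroll", or "done"
    detail: str = ""         # e.g. the button label or the text to type
    sensitive: bool = False  # True for steps like booking or sending an email


def run_agent(
    policy: Callable[[bytes], Action],
    take_screenshot: Callable[[], bytes],
    execute: Callable[[Action], None],
    confirm: Callable[[Action], bool],
    max_steps: int = 20,
) -> List[Action]:
    """Observe-act loop: look at the screen, pick an action, perform it."""
    performed: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()   # the agent only "sees" pixels
        action = policy(screenshot)      # the model chooses the next step
        if action.kind == "done":
            break
        # Ask the user before any consequential step; stop if they decline.
        if action.sensitive and not confirm(action):
            break
        execute(action)
        performed.append(action)
    return performed
```

In a real system the policy would be the model and `execute` would drive a browser; here both are left as plain callbacks so the control flow, and in particular the confirmation gate, can be exercised with scripted stand-ins.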

The agent is currently available as a research preview, which will evolve based on user feedback. It is offered on a dedicated platform to US subscribers of ChatGPT Pro, which costs $200 per month. OpenAI plans to expand access to more users in the future.

At this stage, the agent does not function perfectly. If it encounters difficulties, it will ask the user to complete the task.

Back in October 2024, AI startup Anthropic released an updated version of the Claude 3.5 Sonnet model, which can interact with a computer like a human: moving the cursor, clicking buttons, and typing text.
