OpenAI has introduced its first AI agent, Operator, designed to handle various tasks for users. Operator can autonomously browse the web to make restaurant reservations, compile shopping lists, or create memes. In a demonstration video, Sam Altman and his colleagues showcased the skills Operator currently possesses.
Operator’s interface resembles ChatGPT’s, so it will feel familiar to existing users. It lists several linked services, such as Uber, OpenTable, and eBay, which users can select to have Operator complete a task through that service. To book a restaurant table, for instance, the user selects OpenTable and specifies the restaurant and time. The AI then opens a browser in the cloud, goes to the OpenTable website, and searches for the restaurant and an available slot at the requested time.
Operator is capable of self-correction. In a demonstration, the AI initially selected the wrong restaurant location. However, since the OpenAI team had set their location as San Francisco, Operator adjusted the restaurant choice. When the desired time wasn’t available, Operator suggested an alternative.
In another example, OpenAI staff uploaded a photo of a shopping list and let Operator shop for the items autonomously. Users can follow each step in the interface and intervene if necessary, or let the AI proceed while they attend to other tasks. When Operator encounters a login, a CAPTCHA, or a payment form while shopping, the user must take over, because the AI does not enter credentials or payment details itself.
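The handoff rule described above can be sketched in a few lines. This is a minimal illustration, not OpenAI's implementation: the page labels, the `SENSITIVE` set, and the function names are hypothetical, and a real agent would have to infer the page type from what it sees on screen.

```python
from enum import Enum
from typing import List, Tuple


class Control(Enum):
    AGENT = "agent"
    USER = "user"


# Hypothetical labels for steps the agent must not complete on its own.
SENSITIVE = {"login", "captcha", "payment"}


def who_acts(page_kind: str) -> Control:
    """Return who should drive the browser on a page of this kind."""
    return Control.USER if page_kind in SENSITIVE else Control.AGENT


def plan_handoffs(pages: List[str]) -> List[Tuple[str, str]]:
    """Pair each page in a session with the party that controls it."""
    return [(page, who_acts(page).value) for page in pages]


if __name__ == "__main__":
    for page, controller in plan_handoffs(["search", "cart", "login", "payment"]):
        print(f"{page}: {controller}")
```

In this sketch the agent keeps control on ordinary pages and yields to the user as soon as a sensitive step appears.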
Operator navigates websites by looking at screenshots and operating a virtual mouse and keyboard. It is powered by a new AI model called Computer-Using Agent (CUA), which processes the visual elements on screen and is trained on graphical user interfaces so that it can recognize menus, text fields, and buttons. Although CUA is still at an early stage of development, OpenAI reports that it has set new state-of-the-art results on the WebArena and WebVoyager benchmarks.
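The screenshot-and-input cycle described above amounts to a loop: observe the screen, choose an action, carry it out, repeat. The sketch below shows that loop with toy stand-ins. `FakeBrowser`, `toy_policy`, and `run_agent` are hypothetical names invented for illustration; the "screenshot" is a text string rather than pixels, and the policy is a hand-written rule rather than a model such as CUA.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeBrowser:
    """Toy stand-in for a cloud browser with a single search box."""

    def __init__(self) -> None:
        self.focused = False
        self.typed = ""
        self.log: List[str] = []

    def screenshot(self) -> str:
        # A real agent would receive an image; a text summary is enough here.
        return f"focused={self.focused} text={self.typed}"

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.focused = True
            self.log.append(f"click({action.x},{action.y})")
        elif action.kind == "type":
            self.typed += action.text
            self.log.append(f"type({action.text!r})")


def toy_policy(query: str, screenshot: str) -> Action:
    """Hand-written rule standing in for the model's decision."""
    if "focused=False" in screenshot:
        return Action("click", x=100, y=40)   # focus the search box
    if f"text={query}" not in screenshot:
        return Action("type", text=query)     # enter the query
    return Action("done")


def run_agent(browser: FakeBrowser,
              policy: Callable[[str, str], Action],
              query: str,
              max_steps: int = 10) -> bool:
    """Observe, decide, act; stop when the policy reports completion."""
    for _ in range(max_steps):
        action = policy(query, browser.screenshot())
        if action.kind == "done":
            return True
        browser.apply(action)
    return False


if __name__ == "__main__":
    browser = FakeBrowser()
    print(run_agent(browser, toy_policy, "sushi"), browser.log)
```

The step limit is a design choice of this sketch: it keeps the loop from running indefinitely if the policy never reports that the task is finished.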
OpenAI emphasizes that Operator is currently a preview and that problems may still occur. For that reason, Operator is initially available only to a small group of users: ChatGPT Pro subscribers in the United States, whose plan costs $200 per month. OpenAI plans to eventually offer Operator to Plus, Team, and Enterprise users and to integrate some of its features into ChatGPT.
With these advancements, OpenAI continues to develop AI tools that carry out tasks directly on users’ behalf. The company is exploring various applications of AI, aiming to enhance user experience and productivity. As AI technology progresses, users can expect more sophisticated and efficient tools to assist with everyday tasks.
OpenAI’s Operator exemplifies the potential of AI to streamline daily activities, offering a glimpse into a future where AI agents can handle more complex tasks, allowing users to focus on more critical aspects of their lives and work. As AI continues to evolve, it promises to revolutionize how we interact with technology, making it a more integral part of our routines.
While Operator is currently in a limited preview phase, its capabilities demonstrate the promise of AI in automating tasks and improving efficiency. OpenAI’s ongoing efforts in AI development highlight the transformative power of technology and its potential to reshape various aspects of our lives.