OpenAI's Deep Research: A New AI Agent for Extensive Internet Research

OpenAI has introduced a new AI agent named Deep Research, designed primarily for extensive research tasks. The agent uses the recently announced o3 model, which also comes in a mini version. OpenAI’s portfolio now includes so many models, agents, and subscriptions that keeping track of them can be overwhelming.

At the core of OpenAI’s offerings is ChatGPT, and Deep Research is an agent embedded within this chatbot. According to a blog post, it can handle “multilayered research on the internet and take on complex tasks.” OpenAI says the agent needs only tens of minutes to accomplish what might take a human many hours. That figure, however, does not include the time needed for the human oversight that any AI-driven research still requires. OpenAI, like other AI providers, emphasizes the need for a “human-in-the-loop,” meaning a person must review and approve the AI’s output. Deep Research can still hallucinate, although OpenAI claims this happens less often than with other models.

Deep Research is based on a version of the announced o3 model that has been optimized for internet browsing and data analysis. OpenAI describes Deep Research as a significant step towards developing AGI (artificial general intelligence), which the company believes will advance scientific research.

Users can access Deep Research via the regular input field in ChatGPT’s web version, but only with a paid account. OpenAI suggests that the new AI agent can be used for tasks such as comparing streaming services, which is hardly the first use case that scientific research brings to mind. Answering such questions can take between 5 and 30 minutes, which implies significant costs for OpenAI. Currently, the output is limited to text, but images and graphics are expected to be added in the future.

OpenAI notes that Deep Research is best suited to lengthy investigations where precision and citations are particularly important. In contrast, GPT-4o is the preferred model for real-time multimodal conversations. The new AI agent can answer 26.6% of the questions in the benchmark “Humanity’s Last Exam,” which focuses on scientific topics. Earlier models managed roughly 10% at most: according to OpenAI, GPT-4o scores 3.3%, while the newer o3-mini reaches 10.5% at medium and 13% at high reasoning effort.

Operator is another AI agent that OpenAI introduced recently. It is also designed to find information on the internet and to carry out additional tasks. If provided with credit card details, it can even place orders.
