Challenges and Misattributions in OpenAI’s ChatGPT Search: A Study by Tow Center for Digital Journalism

In the age of AI, not every search goes through traditional search engines like Google and Bing. Many searches now begin with AI tools such as OpenAI's ChatGPT. To serve this demand directly, OpenAI recently launched ChatGPT Search. The tool promises quick, relevant answers with links to web sources, combining the benefits of a natural-language interface with up-to-date information such as sports scores and news.

However, a study by the Tow Center for Digital Journalism highlights some significant issues with ChatGPT Search, particularly in accurately attributing content to the correct sources. This is concerning for publishers as the tool often misattributes content, leading to potential copyright issues.

ChatGPT Search was designed to surface publisher content to millions of users through search queries. Several major media organizations, including Axel Springer, The Atlantic, and Vogue, have partnered with OpenAI. Others, however, have taken legal action against OpenAI over unauthorized use of their content, or have blocked OpenAI's crawler via robots.txt, though that file is a voluntary convention and not legally enforceable.
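For publishers weighing the opt-out route, blocking is done with standard robots.txt directives. A minimal sketch, using the crawler user agents OpenAI documents (GPTBot for training; OAI-SearchBot for ChatGPT Search):

```
# Block OpenAI's training crawler site-wide
User-agent: GPTBot
Disallow: /

# Block the ChatGPT Search crawler as well
User-agent: OAI-SearchBot
Disallow: /
```

As the article notes, compliance depends entirely on the crawler choosing to honor the file.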

To assess how ChatGPT Search handles publisher content, the Tow Center tested 200 quotes drawn from 20 different publications, asking ChatGPT to identify the source of each. The results varied: some responses were correct, many were incorrect, and some were only partially correct. Alarmingly, ChatGPT rarely admitted it could not provide an accurate answer, instead often fabricating a response.
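The Tow Center's three-way grading (correct, partially correct, incorrect) can be illustrated with a small sketch. This is a hypothetical reconstruction, not the Center's actual code; the field names and sample data are illustrative:

```python
# Hypothetical scoring of a chatbot's source attribution for a quote.
# "correct": publication and article both match the ground truth;
# "partially correct": publication matches but the article does not;
# "incorrect": the publication itself is wrong.

def score_attribution(expected: dict, answer: dict) -> str:
    """Compare a model's claimed source against the known source."""
    if answer["publication"] == expected["publication"]:
        if answer["article"] == expected["article"]:
            return "correct"
        return "partially correct"
    return "incorrect"

# Example mirroring the misattribution described below: a quote actually
# from the Orlando Sentinel that the model credits to Time.
expected = {"publication": "Orlando Sentinel", "article": "Letter to the editor"}
answer = {"publication": "Time", "article": "An unrelated article"}
print(score_attribution(expected, answer))  # → incorrect
```

The key point the grading captures is that a plausible but wrong publication name is a worse failure mode than an honest "no result," since readers have no signal to distrust it.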

This poses significant problems for both publishers and users. For example, ChatGPT misattributed a letter from the Orlando Sentinel to a Time article. This is particularly concerning as Time collaborates with OpenAI. Users unaware of AI’s potential for errors might accept these misattributions as truth.

In another case, ChatGPT misattributed a New York Times article, whose publisher blocks OpenAI's crawler, to a different site that had plagiarized the NYT piece. This raises further questions about ChatGPT Search's ability to attribute sources correctly and to be transparent about its limits.

Traditional search engines typically match quotes accurately or indicate when no results can be found. In contrast, AI services often attempt to provide an answer regardless of accuracy. This can lead to confusion about the original source, which is dangerous for publishers and the digital media landscape.

The report suggests that by treating journalism as decontextualized content, ChatGPT risks distancing audiences from publishers and encouraging plagiarism over quality reporting.

OpenAI responded to the analysis, acknowledging the challenge of misattribution and the need for improvement. The company said it supports publishers by helping users discover quality content through summaries and clear links, and that it is working to improve citation accuracy and respect publisher preferences.

The Tow Center shared its methodology but did not release the underlying data, so further studies are needed to gauge the full extent of these issues. Until the tool improves, users of ChatGPT Search should independently verify source attributions.