AI-Generated Fake Research Papers Threaten Scientific Integrity

Swedish researchers are raising alarms about AI-generated “research findings” appearing in Google Scholar, databases, and even respected journals. The abundance of false information could overwhelm quality control in the sciences and threaten the integrity of the scientific record. They warn that generative AI can create misleading documents that look scientific and are optimized to rank highly in public search engines, especially Google Scholar. This undermines trust in science and poses a serious threat to society, since false “results” could be used to influence societal decisions.

A group of scientists from the University of Borås and the Swedish University of Agricultural Sciences examined a sample of supposedly scientific documents from Google Scholar. They set a low bar for selection: if a document contained phrases like “as of my last knowledge update” or “I don’t have access to real-time data,” typical of output from GPT-3.5 and GPT-4, they downloaded it. Of 227 search results, they excluded 88 because the use of AI was disclosed or legitimate. This left 139 documents at least partially generated by AI, whose creators had failed to conceal its use.
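The phrase-based screening described above can be sketched as a simple case-insensitive substring match. This is a hypothetical illustration of the general technique, not the study's actual code; the phrase list and document names are examples.

```python
# Telltale boilerplate phrases typical of GPT-3.5/GPT-4 output,
# as quoted in the study (list here is illustrative, not exhaustive).
TELLTALE_PHRASES = [
    "as of my last knowledge update",
    "i don't have access to real-time data",
]

def flag_gpt_phrases(text: str) -> list[str]:
    """Return the telltale phrases found in `text`, case-insensitively."""
    lowered = text.lower()
    return [phrase for phrase in TELLTALE_PHRASES if phrase in lowered]

# Hypothetical example documents.
docs = {
    "paper_a": "Figures are current as of my last knowledge update in 2023.",
    "paper_b": "We collected field samples during spring 2022.",
}

# Keep only documents that contain at least one telltale phrase.
flagged = {name: hits for name, text in docs.items()
           if (hits := flag_gpt_phrases(text))}
```

A matcher this crude only catches authors who left the model's boilerplate in place, which is exactly the low bar the researchers chose: their sample captures the least careful fabricators, not the full extent of the problem.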

Analysis showed that nearly one in seven of these documents appeared in a respected scientific journal. Almost two-thirds appeared in other scientific journals, about one-seventh were student papers in university databases, and a small portion were working papers. The fake studies mainly covered computer science, the environment, and health, with fish and fish farming a leading subject. Some of these dubious works also turned up elsewhere: on ResearchGate, IEEE, various websites, and social networks.

The researchers argue that no simple solution exists; they consider simultaneous efforts in technology, education, and regulation necessary. It is not enough to identify fraudulent works; it is crucial to understand how they reach audiences and why some persist. Search engines could offer filters for classes of scientific journals or for peer-reviewed sources, and the search index should be built transparently on scientific criteria. “Since Google Scholar has no real competitor, there are strong reasons to establish a freely available, general scientific search engine operated in the public interest,” the authors recommend.

They emphasize that this is not merely a technical problem caused by AI text generation. The issue should be addressed in the context of a “broken” scientific publishing system and ideological battles over the control of knowledge. They also suggest raising awareness, especially among decision-makers and influential intermediaries such as journalists.

The Swedish study was peer-reviewed and published in September in the Harvard Kennedy School’s Misinformation Review. Its aim was not to measure the problem statistically but to expose the tip of the iceberg. “Our analysis shows that questionable and potentially manipulative GPT-fabricated papers are penetrating the research infrastructure and are likely to become a widespread phenomenon,” the researchers write. “Our findings underscore the need to take the risk of false scientific papers used as maliciously manipulative evidence seriously.”