AI search engines gather information from various websites to generate answers. It’s not surprising that false or misleading information can influence search results. Obvious misinformation is relatively easy to spot. However, it becomes more challenging when false information or instructions are hidden within websites. The AI search may respond, but the person seeking an answer might not find it.
The British Guardian tested this type of poisoning with ChatGPT. OpenAI’s search function is currently available for paying ChatGPT customers. When prompted to summarize websites with hidden content, the AI search also used those hidden pieces of information. The security risk for AI models under the term “Prompt Injection” is already known. It involves prompts designed to elicit behavior from the models that the provider does not intend. Depending on the attacker’s or website operator’s intent, this can lead to different outcomes.
Fake websites with malicious intentions can, for example, cause a brand or provider to manipulate AI search engines with hidden prompts to rate products more favorably or ignore negative reviews. For the test, a website was created that looked like a product review page. ChatGPT was then asked about a camera reviewed on the site. The responses matched the fake product reviews.
Even more drastic: In tests where there were hidden prompts on the fake site instructing ChatGPT to rate a product positively, the AI chatbot followed this instruction despite negative reviews. This means the hidden prompts could virtually overwrite the actual text of a webpage.
A security researcher was consulted on the results. Jacob Larsen mentioned it is a significant risk that people might create websites solely to manipulate chatbots in the future. However, he also believes that OpenAI could partially solve the problem.