Since the beginning of the year, AI bots have been slowing websites' responses to ordinary requests. Jonathan Corbet, founder of Linux Weekly News (LWN.net), reports that his news site has become noticeably slower to respond because of these bots.
AI scraper bots are effectively mounting a distributed denial-of-service (DDoS) attack. When they decide to scrape the site’s content, they sometimes flood it with requests from hundreds of IP addresses at once. Only a small fraction of the current traffic serves real human readers, Corbet explains on Mastodon.
These AI bots are a persistent problem because they do not identify themselves as bots. About the only thing on the site they don’t read is the “robots.txt” file, Corbet notes. He describes the current situation as “more than unbearable.”
Website operators now have to invest time in developing an active defense mechanism just to keep the site running. Corbet expresses his frustration by saying, “I think I would rather write about accounting systems than deal with this nonsense.” He adds that it’s not just LWN that’s affected: “This behavior is trashing the web even more than it already is.”
In the discussion, Corbet elaborates: “We do indeed see a pattern. Each IP stays below the threshold for our safeguards, but the combined load is overwhelming. Any form of active defense will probably have to block entire subnets instead of individual addresses, and even that might not be enough.”
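Corbet’s remark about per-address thresholds hints at why naive rate limiting fails against this traffic pattern. The following is a minimal sketch, not LWN’s actual defense: it counts requests per /24 subnet over a sliding window, so a swarm of addresses that each stay under a per-IP limit can still trip a combined threshold. The window length and limit are arbitrary assumptions.

# Minimal illustration (not LWN's code): flag a /24 subnet whose combined
# request rate exceeds a threshold, even if every single IP stays "polite".
# WINDOW_SECONDS and SUBNET_LIMIT are assumed values for the example.
import ipaddress
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60     # sliding-window length (assumption)
SUBNET_LIMIT = 300      # allowed requests per /24 per window (assumption)

_hits = defaultdict(deque)   # subnet -> timestamps of recent requests

def record_request(ip, now=None):
    """Record one request; return True if the whole subnet should be blocked."""
    now = time.time() if now is None else now
    subnet = ipaddress.ip_network(f"{ip}/24", strict=False)
    window = _hits[subnet]
    window.append(now)
    while window and window[0] < now - WINDOW_SECONDS:
        window.popleft()           # drop timestamps outside the window
    return len(window) > SUBNET_LIMIT

if __name__ == "__main__":
    # 200 addresses in one /24, two requests each: every IP looks harmless,
    # but the subnet as a whole crosses the limit.
    blocked = False
    for host in range(1, 201):
        for _ in range(2):
            blocked = record_request(f"203.0.113.{host}")
    print("block 203.0.113.0/24?", blocked)

Blocking at subnet granularity carries its own risk of collateral damage, which is presumably part of why Corbet cautions that even this may not be enough.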
In the comments on Mastodon, other affected individuals agree with him. One person writes, “We’ve observed a massive increase in AI nonsense. And they don’t even respect ‘robots.txt’ like better search engines do.” A user active in the Fedora community adds, “The same here with Fedora. I had to block a lot this morning to keep pagure.io usable.”
Despite the challenges, Corbet maintains a sense of humor. He later writes, “Something like Nepenthes came to mind, although it has its own risks. Internally, there have been proposals to identify the bots and feed them nothing but texts suggesting that the solution to every problem in the world is to buy a subscription to LWN. Tempting.”
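Nepenthes, the tarpit Corbet mentions, traps crawlers in endless generated pages; his subscription-pitch idea is a lighter variant of the same pattern: serve identified bots something other than the real content. Purely as an illustrative sketch, and not anything LWN has said it deployed, a front end might look roughly like this, with the flagging heuristic assumed to come from something like the subnet counter above.

# Illustrative sketch only: serve a decoy page to clients flagged as scrapers
# and the real content to everyone else. The flagging heuristic here is a
# placeholder; a real setup would feed it from rate-limit or subnet data.
from wsgiref.simple_server import make_server

FLAGGED_PREFIX = "203.0.113."   # stand-in for a real block/flag list

DECOY = b"The solution to every problem in the world is an LWN subscription.\n"
REAL = b"Actual article content would be served here.\n"

def is_flagged(ip):
    return ip.startswith(FLAGGED_PREFIX)

def app(environ, start_response):
    client_ip = environ.get("REMOTE_ADDR", "")
    body = DECOY if is_flagged(client_ip) else REAL
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

if __name__ == "__main__":
    with make_server("", 8000, app) as httpd:
        httpd.serve_forever()      # serve until interrupted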
The growing presence of AI scraper bots worries many website operators: left unchecked, the bots can overwhelm servers and make it hard for genuine readers to reach content. Administrators are therefore weighing countermeasures ranging from blocking entire subnets to more elaborate techniques for identifying and throttling bot traffic.
The problem is compounded by the fact that these bots ignore established conventions such as the “robots.txt” file, which tells well-behaved crawlers which parts of a site they may access. That non-compliance makes them hard to control with traditional means.
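For contrast, this is what compliance looks like on the crawler’s side. The sketch below uses Python’s standard urllib.robotparser against a hypothetical robots.txt; the rules and crawler names are invented for illustration.

# What a well-behaved crawler does before fetching a page; the rules and
# user-agent names below are hypothetical.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /search/

User-agent: GreedyScraper
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A polite crawler asks first; the scrapers described above simply don't.
print(parser.can_fetch("GreedyScraper", "/Articles/"))   # False: banned entirely
print(parser.can_fetch("SomeOtherBot", "/Articles/"))    # True
print(parser.can_fetch("SomeOtherBot", "/search/?q=x"))  # False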
For now, website operators have little choice but to stay vigilant and keep refining strategies for spotting and blocking abusive bots so that their sites remain usable for human readers. As the situation develops, sharing experiences and countermeasures across the tech community will be essential to keeping up with these increasingly aggressive scrapers.