NEW YORK: Cloudflare has launched a tool that blocks bot crawlers from accessing content without permission or compensation, aiming to help websites make money from AI firms that try to access and train on their content, the software company said on Tuesday.
The tool allows website owners to choose whether artificial intelligence crawlers can access their material and set a price for access through a “pay per crawl” model, which will help them control how their work is used and compensated, Cloudflare said.
With AI crawlers increasingly collecting content without sending visitors to the original source, website owners are looking to develop additional revenue sources as search traffic referrals that once generated advertising revenue decline.
The initiative is supported by major publishers including Condé Nast and the Associated Press, as well as social media companies such as Reddit and Pinterest.
Cloudflare’s Chief Strategy Officer Stephanie Cohen said the goal of such tools was to give publishers control over their content, and ensure a sustainable ecosystem for online content creators and AI companies.
“The change in traffic patterns has been rapid, and something needed to change,” Cohen said in an interview. “This is just the beginning of a new model for the internet.”
Google, for example, has seen its ratio of crawls to visitors referred back to sites climb to 18:1 from 6:1 just six months ago, according to Cloudflare data, suggesting the search giant is maintaining its crawling but decreasing referrals.
The shift could be a result of users finding answers directly within Google’s search results, such as AI Overviews. Still, Google’s ratio is far lower than those of other AI companies, such as OpenAI’s 1,500:1.
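To make the arithmetic behind those ratios concrete, the following is a minimal Python sketch. The function names are hypothetical and the inputs are only the ratios quoted above, not raw Cloudflare counts; it converts a crawls-to-referrals ratio into the number of visitors a site could expect per 1,000 crawls.

```python
def crawl_to_referral_ratio(crawls: int, referrals: int) -> float:
    """Return crawls per referred visitor (hypothetical helper)."""
    if referrals <= 0:
        raise ValueError("referrals must be positive")
    return crawls / referrals


def referrals_per_thousand_crawls(ratio: float) -> float:
    """Convert a crawls-per-referral ratio into referrals per 1,000 crawls."""
    return 1000 / ratio


# Ratios quoted in the article; the labels are illustrative only.
quoted = {
    "Google, six months ago (6:1)": crawl_to_referral_ratio(6, 1),
    "Google, now (18:1)": crawl_to_referral_ratio(18, 1),
    "OpenAI (1,500:1)": crawl_to_referral_ratio(1500, 1),
}

for label, ratio in quoted.items():
    per_thousand = referrals_per_thousand_crawls(ratio)
    print(f"{label}: about {per_thousand:.2f} referrals per 1,000 crawls")
```

Read this way, a move from 6:1 to 18:1 means a site receives roughly a third as many referred visitors for the same amount of crawling.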
For decades, search engines have indexed content on the internet and directed users back to websites, an approach that rewards creators for producing quality content. However, AI companies’ crawlers have disrupted this model: they harvest material and aggregate it through chatbots such as ChatGPT without sending visitors to the original source, depriving creators of revenue and recognition.
Many AI companies are circumventing a common web standard used by publishers to block the scraping of their content for use in AI systems, and argue they have broken no laws in accessing content for free.
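The article does not name the standard; assuming it refers to robots.txt (the Robots Exclusion Protocol), the sketch below shows how a well-behaved crawler would check a publisher's rules using Python's standard `urllib.robotparser`. The crawler name `ExampleAIBot` and the URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "ExampleAIBot", "https://example.com/article"))
print(is_allowed(ROBOTS_TXT, "OtherAgent", "https://example.com/article"))
```

A check like this is voluntary: the file only states the publisher's wishes, and nothing in it prevents a crawler from skipping the check and fetching the page anyway.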
In response, some publishers, including the New York Times, have sued AI companies for copyright infringement, while others have struck deals to license their content.
Reddit, for example, has sued AI startup Anthropic for allegedly scraping Reddit user comments to train its AI chatbot, while inking a content licensing deal with Google.