
Creative Commons Expresses Cautious Support for AI 'Pay-to-Crawl' Systems

Creative Commons (CC), the nonprofit organization known for its open licensing framework, has announced "cautious support" for "pay-to-crawl" technology, a class of systems that aims to automate compensation for website content accessed by machine agents such as AI web crawlers.

The announcement follows CC's earlier unveiling of a framework for an open AI ecosystem, which sought to establish legal and technical guidelines for data sharing between content owners and AI developers. The concept of pay-to-crawl involves charging AI bots for scraping site content used in model training and updates.
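In practice, pay-to-crawl gating tends to live at the HTTP layer: Cloudflare's implementation, for instance, reportedly answers unpaid AI crawlers with the long-dormant 402 Payment Required status code. The sketch below illustrates that decision logic only; the user-agent strings are real crawler identifiers, but the payment-token header, price, and function names are hypothetical and not drawn from any published standard.

```python
# Illustrative pay-to-crawl gating logic. The crawler names are real AI bot
# user agents; the payment token and price are hypothetical placeholders.
from typing import Optional, Tuple

KNOWN_AI_CRAWLERS = {"GPTBot", "ClaudeBot", "CCBot"}  # example AI crawler user agents
PRICE_PER_REQUEST_USD = 0.002                          # hypothetical per-page price

def handle_request(user_agent: str, payment_token: Optional[str]) -> Tuple[int, str]:
    """Return an HTTP-style (status, body) pair for an incoming crawl request."""
    if user_agent not in KNOWN_AI_CRAWLERS:
        return 200, "content"            # ordinary visitors and crawlers pass freely
    if payment_token is None:
        return 402, "Payment Required"   # AI bot must present payment before crawling
    return 200, "content"                # paid AI bots receive the page
```

A browser request succeeds, an unpaid AI bot is asked to pay, and a bot presenting a token gets through; a real deployment would of course verify the token against a payment provider rather than accept any non-empty value.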

In a blog post, the organization writes: "Implemented responsibly, pay-to-crawl could represent a way for websites to sustain the creation and sharing of their content, and manage substitutive uses, keeping content publicly accessible where it might otherwise not be shared or would disappear behind even more restrictive paywalls." Such a system could also benefit smaller web publishers that lack the resources to negotiate individual content deals with AI providers, as larger outlets already have, such as OpenAI's deal with Condé Nast and Amazon's with The New York Times.

Historically, websites allowed web crawlers to index their content freely in exchange for search engine visibility. The rise of AI chatbots has altered that bargain: consumers increasingly receive direct answers from chatbots, reducing the need to click through to original sources, a shift that has reportedly cut into publisher traffic.

Despite its support, CC outlined several caveats, including concerns that such systems could centralize power on the web and potentially restrict access for researchers, nonprofits, cultural heritage institutions, and educators. The organization proposed principles for responsible implementation, advocating against pay-to-crawl as a default setting or blanket rule. It also suggested that systems should allow for content throttling instead of outright blocking, preserve public interest access, and be open, interoperable, and built with standardized components.
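One of those principles, throttling rather than blocking, can be pictured as a rate limiter that slows unpaid AI crawlers instead of refusing them. The token-bucket sketch below is only illustrative; the class name, rates, and paid-bypass behavior are assumptions for the example, not part of any published pay-to-crawl specification.

```python
# Illustrative "throttle, don't block" policy: unpaid crawlers are limited by a
# token bucket, while paid crawlers bypass the limit. All names and rates here
# are hypothetical.
import time

class CrawlerThrottle:
    """Token bucket granting unpaid bots `rate` requests/sec with a small burst."""
    def __init__(self, rate: float, burst: int):
        self.rate = rate                # tokens refilled per second
        self.burst = burst              # maximum bucket size
        self.tokens = float(burst)      # start with a full bucket
        self.last = time.monotonic()

    def allow(self, has_paid: bool) -> bool:
        if has_paid:
            return True                 # paid crawlers are never throttled
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True                 # within budget: serve the page
        return False                    # over budget: defer, rather than ban, the bot
```

A denied request would typically be answered with a retry-later response, keeping the content reachable while capping how fast an unpaid agent can consume it.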

Several companies are actively developing solutions in this area. Cloudflare has introduced a marketplace that lets websites charge AI bots, while Microsoft is reportedly building an AI marketplace for publishers. Other initiatives include startups such as ProRata.ai and TollBit, alongside the RSL Collective, which launched the Really Simple Licensing (RSL) standard for specifying crawler access terms. CC has also expressed support for the RSL standard, in line with its broader CC Signals project, which aims to develop tools for the AI era.
