A consortium of technologists and web publishers has introduced Really Simple Licensing (RSL), a system designed to support large-scale licensing of online content as AI training data. The initiative arrives as AI companies face a growing number of copyright infringement claims over the datasets used to train their models.
The launch follows a $1.5 billion copyright settlement involving Anthropic and comes while approximately 40 other lawsuits seeking damages for unlicensed data are still pending. These cases, including one against Midjourney over its generation of copyrighted images, underscore an industry concern: some analysts suggest that the resulting legal exposure could impede AI development.
RSL aims to address these issues by providing a standardized framework. Eckart Walther, RSL co-founder and co-creator of the RSS standard, stated the system's objective is to establish "machine-readable licensing agreements for the internet." The RSL Protocol enables publishers to define specific licensing terms for their content, including custom licenses or Creative Commons provisions, which are then integrated into their websites' "robots.txt" files in a predefined format.
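As a rough illustration of what "machine-readable" could mean in practice, the sketch below scans the text of a robots.txt file for a line that points to licensing terms. The directive name `License`, the example URL, and the helper `find_license_urls` are assumptions made for illustration only; the actual syntax is whatever the RSL specification defines.

```python
from typing import List


def find_license_urls(robots_txt: str, directive: str = "License") -> List[str]:
    """Collect the values of every `<directive>: <value>` line in a robots.txt body.

    The directive name is an assumption for this sketch, not a confirmed
    part of the RSL format.
    """
    values = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        # (Simplification: this would also cut a URL that contains '#'.)
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        key, value = line.split(":", 1)
        if key.strip().lower() == directive.lower() and value.strip():
            values.append(value.strip())
    return values


# Hypothetical robots.txt content; the URL is a placeholder.
SAMPLE = """\
User-agent: *
Allow: /
# hypothetical pointer to the publisher's licensing terms
License: https://example.com/license.xml
"""

print(find_license_urls(SAMPLE))
```

A crawler built along these lines could fetch the referenced terms before deciding whether, and under what conditions, to collect a site's content.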
On the legal front, the RSL team has established the RSL Collective, a collective licensing organization structured similarly to ASCAP for music or MPLC for films. This entity is designed to negotiate terms and collect royalties, offering a single point of contact for licensors and a mechanism for rights holders to manage agreements with numerous potential licensees simultaneously.
Initial adopters and supporters of the RSL standard and collective include Yahoo, Reddit, Medium, O’Reilly Media, Ziff Davis (owner of Mashable and CNET), Internet Brands (owner of WebMD), People Inc., The Daily Beast, Fastly, Quora, and Adweek. Reddit, for instance, reportedly maintains a separate $60 million annual content licensing agreement with Google, which suggests that the RSL system is intended to complement existing direct deals rather than replace them.
One obstacle to widespread adoption is the difficulty of attributing royalties to specific pieces of training data once they are inside an AI model, particularly under payment models that charge per inference. However, RSL co-founder Doug Leeds, former CEO of IAC Publishing, noted that some existing licensing agreements already require AI companies to report on data usage, suggesting such tracking is feasible. Leeds added, "It doesn’t have to be perfect. It just has to be good enough to get people paid."
The system's creators also point to calls for a standardized licensing protocol from prominent AI industry figures, including Sundar Pichai, as evidence of market demand. Whether RSL succeeds will depend on the willingness of major AI labs to move from collecting largely free web data to a structured licensing model.