The New York Times has updated its terms of service to prohibit the use of its content to train AI models.
The publication also requires written permission for automated tools, such as website crawlers, to access or collect its content. Although the reason for the update is unclear, it comes after Google gave itself permission to train AI services on public web data.
The updated terms may also target other companies, such as OpenAI and Microsoft. They state that, without written consent, no one is allowed to “use the Content for the development of any software program, including, but not limited to, training a machine learning or artificial intelligence (AI) system.”
The New York Times has yet to change its robots.txt file, which tells search engine and AI model crawlers which URLs can be accessed.
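To illustrate how a well-behaved crawler might consult a robots.txt file before fetching a page, here is a minimal sketch using Python's standard urllib.robotparser module. The rules, the bot names (ExampleAIBot, ExampleSearchBot), and the example.com URLs are hypothetical and chosen only for illustration; they are not the New York Times' actual robots.txt or any real crawler's user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only.
# This is NOT the New York Times' actual file.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# The hypothetical AI crawler is blocked from the whole site.
print(is_allowed(parser, "ExampleAIBot", "https://example.com/news/story.html"))

# Other crawlers fall back to the wildcard rules: only /private/ is off limits.
print(is_allowed(parser, "ExampleSearchBot", "https://example.com/news/story.html"))
print(is_allowed(parser, "ExampleSearchBot", "https://example.com/private/draft.html"))
```

Note that robots.txt is advisory: it only works if the crawler chooses to check it, which is one reason a publisher might also rely on its terms of service.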
Publishers prepare for a new age of AI
Media mogul Barry Diller is reportedly joining leading publishers, including the New York Times and Axel Springer, in a potential lawsuit against AI developers for using copyrighted content to train AI systems.
Tech giants such as Google, Microsoft, and OpenAI are in early talks with publishers to address copyright issues through options such as content subscription models. Both Microsoft’s Satya Nadella and OpenAI have previously hinted at sharing revenue with publishers if their AI systems are successful, but no concrete plans have been revealed.
OpenAI also recently disabled the web browsing feature in ChatGPT after realizing it could retrieve content from behind paywalls, stating that it wanted to do right by the companies that own the content. However, it had already used large amounts of content from those companies to train its models in the first place.
Meanwhile, OpenAI and the Associated Press (AP) are teaming up to explore the potential of generative AI in news products and services. The collaboration involves OpenAI licensing a portion of AP’s text archive and AP using OpenAI’s technology and product expertise.
AP already uses AI to automate tasks such as corporate earnings reports and audio transcriptions, but it says that it does not use generative AI in its news stories and has no plans to do so.