
AI and copyright – an author’s viewpoint

A personal viewpoint on AI copyright deals for authors: 

I've received an offer from a publisher inviting me to opt in to an artificial intelligence (AI) licensing deal, and I am considering what to do. The publisher is being approached by AI companies that want to license its content, which would potentially include my book, ironically entitled A Copyright Masquerade.

As I hope readers will know from the content of my website, I’ve written extensively about past battles over copyright, where we learned about some of the dangers that inform the complex challenges we face with AI today. The earlier battles that I wrote about were quite different from today's, however. They were about individuals being targeted by large corporations. Today we are dealing with an inter-industry battle, in which book authors, like myself, are stuck in the middle. I am trying to unpack what it means for me as an author.

Please read this article as a positive statement. I would like my work to be included in any potential AI deals. I think there are benefits for me as an author.  These include the ability for my work, an academic book,  to be discovered by new generations of researchers and students. Indeed, AI  is probably vital to ensure the discoverability of the book.

However, it does feel like wading into the unknown. Copyright for AI is quite different from print. With generative AI, and the so-called large language models, it is about the right to use the text of the book for training and possibly other purposes. It may also entail the right of communication to the public, depending on the kind of outputs.  The tricky aspect is how to assess the monetary value.  

Let’s turn the spotlight onto the AI deals. The AI companies are awash with cash, and settlements can potentially be worth more than many authors earn from their printed books. All of which makes it worthwhile for authors to get a good deal from AI licensing.

The case of Bartz v Anthropic offers a perspective on the sums at stake. This is a class action lawsuit filed in the US courts by a group of authors. A settlement of $1.5 billion was reached, covering around half a million works. It is likely to realise around $3000 per work, with a 50:50 split between author and publisher. For many authors, for example academic writers, who don’t earn much in royalties, this would be a welcome bonus.
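
The per-work figure can be checked with some back-of-envelope arithmetic. The short sketch below uses only the rounded numbers quoted above. It assumes, for simplicity, that nothing is deducted for legal fees or administration costs, so the amounts that actually reach authors may well be lower. The variable names are purely illustrative.

```python
# Back-of-envelope check of the per-work figure quoted in the article.
# Inputs are the article's rounded figures. Legal fees and administration
# costs are NOT deducted here (a simplifying assumption).

SETTLEMENT_USD = 1_500_000_000   # reported settlement total
WORKS_COVERED = 500_000          # "around half a million works"
AUTHOR_SHARE = 0.5               # 50:50 split between author and publisher

per_work = SETTLEMENT_USD / WORKS_COVERED
author_amount = per_work * AUTHOR_SHARE
publisher_amount = per_work - author_amount

print(f"Per work:  ${per_work:,.0f}")
print(f"Author:    ${author_amount:,.0f}")
print(f"Publisher: ${publisher_amount:,.0f}")
```

On these gross figures, the author's half of the per-work amount is what would count as the "welcome bonus" for a writer who earns little in royalties.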

This settlement was reached on the basis that Anthropic had downloaded pirated databases of books, so-called shadow libraries, to train its AI. Because the training data was pirated and the books were not obtained lawfully, Anthropic was deemed to be infringing copyright. However, the same reasoning does not apply where AI companies have legitimately purchased the books and used them as training material. In those instances, the US courts have ruled that this is “fair use”.

From an author’s perspective, the distinction between pirated material and lawfully sourced training data matters. In a compensation claim, courts will differentiate between them. It also illustrates why a licensing system would be advantageous, as it avoids the problem.

In the French courts, a similar legal point will be tested in a case brought by the French Publishers’ Association against Meta. The issue concerns unlawful use of copyrighted material to train Meta’s large language models. The publishers are asking for compensation and for their content to be removed from the AI training datasets. Unfortunately, their claim isn’t public, so there is no way of verifying the substance of it. The substance is so important in these cases because a court judgement will ultimately rely on highly technical detail.

Turning back to Bartz v Anthropic, I think one conclusion we can draw  is that there is money to be claimed  for authors. However, authors should not underestimate the tough legal battle to get there. They and their publishers will have to make their case at quite a technical level.

Publishers need to be on the ball. The criteria for a book to be included in the Anthropic settlement were that the book had to be on a list of books downloaded from the pirated libraries and registered with the US Copyright Office. Books that were downloaded but unregistered were simply thrown out by the law firm handling the case. A Copyright Masquerade can be found in a search of the shadow libraries that Anthropic downloaded, but is not registered. This raises a crucial point: if the criterion is registration of the copyright in the US, how many British authors will miss out?

On licensing, there is some way to go before a system can be put in place. The US Copyright Office signalled the way in its report of May 2025, which recommended licensing as the way forward, but the report was immediately dumped by the Trump Administration. Could the UK and the EU pick up the idea and run with it?

The UK House of Lords made an attempt to address AI and copyright in the Spring of this year with a proposition for “transparency” in law, but in my opinion, that was never sufficient. It puts a stake in the ground, highlighting the issue to government, but what’s needed  for authors is a workable solution.

The EU AI Act includes a transparency requirement specifically for copyright purposes.  AI companies are told to provide a “sufficiently detailed” summary of the training data obtained,  following an official EU template that can be downloaded here.  I will let you decide for yourself if this template can function in a technical environment. I will just say that from an author’s perspective, there is a lot of work still to do at a policy level to put together a viable legislative proposition.

And so we go back to where I started: authors are the engines of the publishing industry, and what’s at stake is the way that they will be remunerated in the new world where their work becomes known via AI but is also used for strange new purposes we had never before imagined. It is difficult to determine whether or not we are getting a good deal when there is so little substantive information provided. Right now, the future of copyright for AI models is being decided in the courts and in obscure corporate back-rooms. Transparency and licensing must centre on authors, and in order to take fully informed decisions about licensing options, they must be fully involved in the process.

 

---

Please see also  Copyright wars 3.0: the AI challenge 

If you liked this, you may also like to read my earlier work on copyright and the Internet. 

If you would like to contact me, please do so via the contact page.  Please remember to credit me as “Dr Monica Horten” if you cite my article. 

 

 


About Iptegrity

Iptegrity.com is the website of Dr Monica Horten, independent policy advisor: online safety, technology and human rights. Advocating to protect the rights of the majority of law-abiding citizens online. Independent expert on the Council of Europe Committee of Experts on online safety and empowerment of content creators and users. Published author and post-doctoral scholar, with a PhD from the University of Westminster and a DipM from the Chartered Institute of Marketing. Former telecoms journalist, experienced panelist and Chair, cited in the media, e.g. BBC, iNews, Times, Guardian and Politico.