This is the third article in our “AI 101” series, where the team at Lewis Silkin will unravel the legal issues involved in the development and use of AI text and image generation tools. In the previous article of the series, we considered questions of ownership and authorship when it comes to generating AI works. In this article we consider how those tools might be infringing IP rights and how the users of those tools might find themselves in hot water.
“This song sucks”. That is what Nick Cave had to say in response to a song sent to him by a fan that had been written by an AI generator asked to mimic his writing style. While the use of AI-generated works risks irritating artists, more concerning for those wanting to use AI is the risk of infringing their intellectual property (IP) rights.
Why use AI tools?
From a user’s perspective, the commercial appeal of AI-generated works is undeniable: why pay for a stock photo when one that meets your needs can be obtained for free, or for very little, in a few clicks? Or a slogan? A jingle? Or even source code?
From copyright works to out of work
As these tools develop, and as they become more integrated into other services through APIs, they will increasingly be used where organisations might previously have needed to search and pay for a human-generated work. This could have a real impact on the creative industries, a fact recently recognised by the UK government in scrapping the proposed text and data mining exception to copyright law. As the Liberal Democrat MP Sarah Olney said in Parliament, “We cannot let AI replace the human creators who have built our world-leading creative industry, nor can AI content be produced off the backs of hard-working creatives without their consent.”
What legal risks are involved?
As we explained in our previous articles, AI generators are typically trained by analysing vast databases. Some of those databases are created for the purpose (lawfully or otherwise), some are licensed, and others are simply insufficiently protected by technical means to prevent their use. The ordinary user of the AI generator will likely have little idea what data or materials were used to train the system, or whether the owners of any IP rights had consented to that training.
Asked for its opinion, ChatGPT says:
“The infringement risk comes from copyright law. There has been no court ruling in this area so it is still up for debate…”
Phew, we humans who make a living from advising on the law can hold on for now. ChatGPT is correct that copyright is an obvious risk (and it is the focus of this article), but it is far from the only one. Attention so far has centred on copyright (particularly in artistic, literary and musical works), yet other IP rights such as database rights and trade mark rights can also be infringed by these tools. It is not difficult to imagine an AI tool producing a design for a new product that infringes an existing design (especially in fashion), or even a patent.
What cases have there been so far?
The most significant claims brought to date have involved training AI on databases of images or text. For example, Getty Images claims, in proceedings launched in the UK and USA, that Stability AI has trained its AI generator on Getty’s database (i.e. the images and captions it offers to license). Interestingly, some of the outputs of Stability AI’s tool have been shown to include a distorted version of the Getty watermark, which has given rise to a trade mark claim. See the image with this article, which was taken from Getty’s complaint in the US proceedings against Stability AI.
Showing that a work has been copied
The current claims face a common issue: proving that the claimant’s work was actually used to train the AI generator in question. Procedural rules allowing for disclosure of documents in jurisdictions such as the UK and USA will assist claimants in establishing whether their works were used for training, and rights holders may also have tools at their disposal to identify whether their databases have been trawled. Without that evidence, and unless the AI generator creates a work that is remarkably similar to the work of the claimant, it will be difficult for the claimant to establish infringement.
How could the user be liable?
This is perhaps the biggest unanswered question out of all of the knotty issues involved. An alleged infringer might reasonably say that they did not copy the original work, and if the AI tool did, then they were not aware of the infringement.
However, the battleground will likely be over the extent to which the alleged infringer was involved in the infringement or should have been aware that the output was infringing. For example, if a user asked for an image or song ‘in the style of’ a particular artist, they may be liable for the output that duly complies with that request.
For example, ask an AI system for an image of Marilyn Monroe in the style of Andy Warhol, and you might get (unsurprisingly) a very good imitation of the famous Warhol Marilyn prints. Such a user cannot necessarily claim that they innocently infringed the rights in the famous work if they then use it commercially.
Who can we expect to bring claims, and against whom?
Stock photo services are obvious claimants. So too are music publishers, whose lyrics and compositions are often widely available online. However, any organisation that makes large databases available online, or whose works are included in such databases, is likely to find that those databases are being used for training AI systems.
Where the AI tools are then being used to generate competing content, it is likely that rights holders will bring claims. There is also scope for group or representative claims to be brought, e.g. by artists, songwriters etc., whose works have been used for training AI tools that are now competing with them.
As for the potential defendants, the providers of AI tools are obvious targets. Both the training and generation aspects of their tools carry significant infringement risks. However, users of the tools should also be careful. The platforms’ terms will usually give no warranty against infringement and may require any proceedings against the platform to be brought abroad and/or through arbitration.
A scenario
Imagine you are in the social media team at a car manufacturer. You want to create a social media campaign for your new electric vehicle (EV). You use an AI tool to generate a futuristic urban backdrop for some social media posts to show how advanced the EV is. You superimpose the EV over the images that are generated and post them to the brand’s Instagram account. A few days later, you receive a letter from lawyers acting for an artist. That letter refers to a painting the artist created and exhibited several years previously. There are some differences, but the painting is quite clearly similar to one of the images you posted. The artist is claiming damages, payment of her legal costs and a public statement.
Typically, when defending a copyright claim, you might say that the images are not sufficiently similar to give rise to a presumption of copying, that you did not have access to the earlier work and that in any event you did not copy it.
If the images are very similar, and you have generated the new image using an AI tool, you will struggle to say whether or not the tool had access to the earlier work (as most tools are far from transparent about what they were trained on). Worse still, the artist may even be able to show that their work was used by the AI tool. You may have unwittingly, and unwillingly, become a party to one of the first cases on copyright infringement by AI tools.
This scenario could equally have applied to an AI-generated music track.
How to minimise the risk of infringement when using AI-generated works?
Thanks to the sudden growth and release of AI tools, we are likely entering a period in which the sometimes competing interests of creatives and tech innovators will clash. As with all new technology, there will be growing pains before a more stable equilibrium is reached. Those growing pains will likely include a fair few legal disputes.
Sometimes business needs will outweigh the risks lawyers can identify, at least until more clarity is given by the courts. In the meantime, users of AI tools should consider ways to mitigate the risks. For example, carrying out a reverse image or text search of the AI generator’s work and looking for any close visual hits may be a good option, although that will not be possible in all circumstances, and reverse image searches can be hit-and-miss.
Not all AI developers are the same and some are taking a more cautious approach. Scenario, for example, provides artists and game developers with bespoke image generators that create assets consistent with the style and art direction of their game. Cleverly, Scenario requires users to upload the data of their own game’s assets, which is then used to custom-train the generator. No external data is used, and the AI-generated work is therefore unlikely to infringe any third-party copyright (although it still may irritate Nick Cave).
In the next article of this AI 101 series, we consider the regulatory framework for AI being proposed by the European Commission.