Did you know that massive data scraping is standard practice in the AI industry? The way they traditionally get away with it is to have a non-profit do the scraping for "research" purposes, which they then claim is fair use, and then resell the data to commercial entities, as happened here. It's actually been termed "data laundering," and I'm surprised this practice hasn't been shut down as an abuse of the fair use doctrine already, when it's known those research organizations are nothing but commercial fronts for shit like this. The practice extends to all kinds of data, not just the images used in image-generation AI.
Also, something the lawyer mentioned that the article did not: the way the AI works, it doesn't store copies of the images in the traditional sense, but what it does store is close enough that it's kind of like how you can't claim you aren't storing a copy just because it's in a compressed format. For legal purposes, the plaintiffs will argue that the neural network is still intended to infringe on the artists' copyrights by design.
Artists file a copyright lawsuit against Stable Diffusion and Midjourney wrote:
Three artists have filed a copyright lawsuit against the creators of Stable Diffusion, Midjourney, and DreamUp, DeviantArt’s AI image generator. Sarah Andersen, Kelly McKernan, Karla Ortiz, and their attorney claim that these programs have infringed the copyright of “millions of artists” by training their algorithm on their work without permission.
The Midjourney founder recently admitted to using “hundreds of millions of images” without their authors’ consent to train the image generator’s AI. And now, his company and the two others could face legal consequences.
Lawyer Matthew Butterick is an artist himself, and he teamed up with litigators Brian Clark and Laura Matson of Lockridge Grindal Nauen P.L.L.P. On behalf of Andersen, McKernan, and Ortiz, he filed the lawsuit against Stability AI (the company behind Stable Diffusion), DeviantArt, and Midjourney.
Karla Ortiz (@kortizart), Jan 14, 2023:
“1/ As I learned more about how deeply exploitative the AI media models’ practices are, I realized there was no legal precedent to set this right. Let’s change that. Read more about our class action lawsuit, including how to contact the firm, here: https://t.co/yvX4YZMfrG”
“Stable Diffusion contains unauthorized copies of millions—and possibly billions—of copyrighted images,” Butterick writes in a blog post. “These copies were made without the knowledge or consent of the artists.” He argues that, even assuming nominal damages of only $1 per image, the misappropriation would add up to around $5 billion.
In his explanation, Butterick incorrectly describes Stable Diffusion as a “21st-century collage tool,” and the lawsuit has been criticized over this characterization. The rebuttal site Stable Diffusion Frivolous points out that a diffusion model doesn’t store images, so it doesn’t collage them; instead, it learns data distributions and then generates new work from them.
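To make the “learns distributions” point concrete, here is a heavily simplified sketch of a single diffusion training step, written in PyTorch. Everything here is illustrative: the model, the linear noise schedule, and the function names are my own stand-ins, and real systems like Stable Diffusion work on compressed latent representations and condition on text prompts. The point is that the model is trained to predict noise, and its weights stay a fixed size no matter how many images it sees:

```python
import torch
import torch.nn.functional as F

def diffusion_training_step(model, images, optimizer, num_timesteps=1000):
    """One simplified denoising-diffusion training step (illustrative only).

    `model(noisy_images, t)` is assumed to predict the noise that was added;
    `images` is a batch of tensors shaped (N, C, H, W).
    """
    # Pick a random timestep and fresh Gaussian noise for each image.
    t = torch.randint(0, num_timesteps, (images.shape[0],))
    noise = torch.randn_like(images)

    # Toy linear schedule for how much signal survives at step t;
    # real models use carefully tuned schedules.
    alpha_bar = (1.0 - (t.float() + 1) / num_timesteps).view(-1, 1, 1, 1)
    noisy = alpha_bar.sqrt() * images + (1 - alpha_bar).sqrt() * noise

    # The training target is the noise itself, not the original image:
    # nothing in this loop writes pixels into the model, and the weights
    # are the same fixed size whether it trains on ten images or a billion.
    loss = F.mse_loss(model(noisy, t), noise)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Whether that fixed-size set of weights can still reproduce particular training images closely enough to count as a “copy” is exactly what the lawsuit will have to settle.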
Butterick notes that the resulting images (“collages”) from AI image generators “may or may not outwardly resemble the training images.” While they’re not exactly collages, he might still be onto something here. Although the chance is small, AI-generated art can resemble previously made artwork. With the right prompt and artistic style applied, you might actually generate an image that’s too similar to an existing artist’s work. It reminds me a bit of the infinite monkey theorem, but it’s not impossible. And what do we do then? Does the original artist hold the copyright, or does the person who created the image with a text-to-image generator?
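As an aside, checking whether a generated image is a near-duplicate of a specific known work is fairly mechanical. Below is a minimal sketch using the Pillow and imagehash libraries; the file paths and the distance threshold are illustrative assumptions, and note that perceptual hashing only catches close visual copies, not stylistic imitation:

```python
from PIL import Image  # pip install pillow imagehash
import imagehash

def find_near_duplicate(generated_path, reference_paths, max_distance=8):
    """Return the first reference image whose perceptual hash is within
    `max_distance` bits of the generated image's, or None if no match.
    The threshold of 8 bits is an arbitrary illustrative choice."""
    gen_hash = imagehash.phash(Image.open(generated_path))
    for ref_path in reference_paths:
        # Subtracting two ImageHash objects yields their Hamming distance.
        if gen_hash - imagehash.phash(Image.open(ref_path)) <= max_distance:
            return ref_path
    return None

# Hypothetical usage:
# match = find_near_duplicate("generated.png", ["artist_work_1.png"])
# if match: print(f"Too close to {match}?")
```

A tool like this can flag accidental near-copies, but it says nothing about the harder legal question of style and derivation.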
Scraping millions of images without artists’ consent is, in my opinion, the wrong approach, at least morally. Many artists argue against using their work to train algorithms, but there are also those who make arguments in favor of it. Here’s an interesting thought from an earlier article about AI training:
“If an art student studies all of Picasso’s 10,000 paintings and then creates a new painting that is clearly based on them, we call this the advancement of culture. The same is true if a writer uses a word that was coined by Shakespeare, or if a graffitist is clearly inspired by Shepard Fairey.”
In his article Patterns, Culture, and Theft, Seth Godin argues that “taking an idea isn’t theft; [it] is an oxymoron.” He argues that this is how culture evolves and that “ideas belong to all of us.” So we can’t say whether the judges will rely on arguments like this one or on those of the artists who feel robbed of their work and of years spent learning and perfecting their craft. But it’s definitely a topic to think about, and one that should be defined more clearly in legal terms as soon as possible.
As a user of DeviantArt, I say it’s about goddamn time. The first time the community learned that the website had trained an AI on all of our art was when they announced the AI was ready to use. They put a fig leaf on it after the users revolted en masse, but all they did was let you put a tag on your images telling the website and external sites not to include them in future data scraping. As far as I know, all of my own art was probably used to train DreamUp. Suffice it to say, tons of artists left the platform over it; it’s not like DeviantArt’s owners are well liked by the community to begin with. But that’s a long story for another time.