October 26, 2022

Training Generative Models

I would consider myself an AI art enthusiast and optimist. Despite being starkly anti-Luddite, I do recognize that the arguments about how the models are trained seem to have the most substance. (Let’s cut out all the whining about how these new tools will bring about a sea change and “won’t someone think of the poor starving artists”. This Ludditism is indistinguishable from past laments about innovations that are now boring commodities, such as the novel, the teddy bear, the bicycle, and on and on. Perhaps I am wrong and AI will be different? Won’t someone think about the poor chess grandmasters, with nothing better to do than sue for peace now that the machines consistently beat them handily? Or perhaps scrappy humans will adapt, and commercial art, fine art, and photography will flourish.)

Links About Training ML Models and Licensing (or Not) Input Images or Text Corpus

Here are some links about generative models and the ethics and legality of training data. I will add more as I come across them.

Matthew Butterick, a programmer, typographer, and lawyer, is dusting off his bar membership and squaring off with Microsoft because they might be ingesting his open source code (and that of millions of others) in violation of the software licenses.

I’m not sure whether Copilot will erode open source contributions, but the copyright arguments obviously have merit. A lot is unknown at this point.

Shutterstock announced an alliance with OpenAI (also essentially Microsoft) to sell and license their contributors’ tagged and organized photos to train DALL-E 2. What the press release doesn’t say, but the email they sent to their contributors does, is that Shutterstock will not accept any AI-generated contributions:

Working together to lead the way with AI

We’re excited to announce that we are partnering with OpenAI to bring the tools and experiences to the Shutterstock marketplace that will enable our customers to instantly generate and download images based on the keywords they enter.

As we step into this emerging space, we are going to do it in the best way we know how—with an approach that both compensates our contributor community and protects our customers.

In this spirit, we will not accept content generated by AI to be directly uploaded and sold by contributors in our marketplace because its authorship cannot be attributed to an individual person consistent with the original copyright ownership required to license rights. Please see our latest guidelines here. When the work of many contributed to the creation of a single piece of AI-generated content, we want to ensure that the many are protected and compensated, not just the individual that generated the content.

In the spirit of compensating our contributor community, we are excited to announce an additional form of earnings for our contributors. Given the collective nature of generative content, we developed a revenue share compensation model where contributors whose content was involved in training the model will receive a share of the earnings from datasets and downloads of ALL AI-generated content produced on our platform.

We see generative as an exciting new opportunity—an opportunity that we’re committed to sharing with our contributor community. For more information, please see our FAQ on the subject, which will be updated regularly.

More about Shutterstock:

Shutterstock.AI: Shutterstock is bringing AI-generated content to the masses in partnership with OpenAI and LG.

So even Shutterstock is playing both sides of this issue—feeding the beast (models) but saying no to the quagmire of feeding legally questionable images back into the system. A complex and nuanced stance?! (Or just putting out FUD to bolster their own position, where they benefit if other gratis and open projects (like Stable Diffusion) get sidelined by actual or perceived legal issues? A little from column A, a little from column B?)

Invasive Diffusion: How One Unwilling Illustrator Found Herself Turned into an AI Model.

Interesting dilemma. Working hypothesis: all illustrators will be expected to work in all styles, since a single style is always too easy to copy. Also, graphic designers have done this for like a century: adapt their style to the needs of each project, client, or product. Perhaps illustrators will be required to up their game?

Butterick and a big class-action law firm are also suing over Stable Diffusion because of how the model is trained.

Strangely, he belittles the tool as merely a “collage” tool and, though he is actually a technical person, unsurprisingly frames things as unfavorably as possible for Stable Diffusion. A collage tool, but it is very threatening! He seems to miss some obvious things in his skewed framing. For example, he says the model compresses and stores a lossy copy of 4 billion images. He compares this to lossy MP3s, which at about 10% of the file size of the uncompressed recording are just passable to the ear. (At 128 kbit/s, a stereo MP3 is about 1 MB per minute, or about 70 MB for a full 700 MB uncompressed CD.) A 1% compressed MP3 sounds like this (insert 12 kbit/s ear-grating audio sandpaper). So in a model measured in gigabytes (a gigabyte is one billion bytes), each image has been lossily compressed by approximately a factor of one million (an uncompressed 512x512x3 image is 786 KB), or more, depending on whether the input images are tiled into many smaller images or lossily downsampled first. Anyway, training Stable Diffusion on billions of images and ending up with a model that can run on my laptop can only work because the system is learning information shared across all the other images.
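The back-of-the-envelope numbers above can be checked with a quick sketch. This is only an illustration: the 4 GB model size is my assumption (the text only says the model is measured in gigabytes), the CD bitrate is the standard 44.1 kHz, 16-bit stereo figure, and the function names are made up for this example.

```python
# Rough compression arithmetic for the MP3 and image-model comparison.

CD_BITRATE = 44_100 * 16 * 2   # bits/s of uncompressed stereo CD audio
MP3_BITRATE = 128_000          # bits/s, the "just passable" MP3 in the text

IMAGE_BYTES = 512 * 512 * 3    # one uncompressed 512x512 RGB image
NUM_IMAGES = 4_000_000_000     # image count quoted in the text
MODEL_BYTES = 4 * 10**9        # ASSUMPTION: a model of roughly 4 GB


def mp3_fraction(mp3_bitrate=MP3_BITRATE, cd_bitrate=CD_BITRATE):
    """Fraction of the uncompressed audio size that the MP3 keeps."""
    return mp3_bitrate / cd_bitrate


def bytes_per_image(model_bytes=MODEL_BYTES, num_images=NUM_IMAGES):
    """Model bytes available per training image, if the model were
    nothing but a store of compressed copies."""
    return model_bytes / num_images


def image_compression_factor(image_bytes=IMAGE_BYTES,
                             model_bytes=MODEL_BYTES,
                             num_images=NUM_IMAGES):
    """How many times smaller each image would have to become."""
    return image_bytes / bytes_per_image(model_bytes, num_images)


print(f"MP3 keeps about {mp3_fraction():.1%} of the CD audio data")
print(f"Model bytes per image: {bytes_per_image():.2f}")
print(f"Implied compression factor: {image_compression_factor():,.0f}x")
```

Under these assumptions the model has about one byte per training image, which is where the "factor of roughly one million" comes from. A smaller model or a larger training set only makes the ratio more extreme.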

He is a lawyer tasked with using rhetoric to tear into SD as a tool for creativity, so he has to make what it does appear as ordinary as possible, such that it would be very illegal if done by humans. The better he does this, the more money the lawyers make. He may genuinely believe he is helping artists by trying to destroy SD as a creative tool. But we can never know, because he is now financially invested in the outcome (obligatory Upton Sinclair quote), so there is probably no way he could ever be convinced otherwise.

So, for example, as part of this die-hard rhetoric, he never mentions image scraping by Google for Google Image Search. Why not sue Google over this, for the big bucks? Has that ship sailed? Can people honestly make a consistent case about scraping public images being totally wrong and immoral under all circumstances? What about the longstanding, totally unenforceable situation where working artists regularly use image search to find many reference images, and create a collage or amalgam the old-fashioned way, without paying for any of the images, and create their own drawing or illustration, mostly but not really from scratch, and hide their image-inspiration tracks? Now that computers can do this at scale it is suddenly immoral? Weird.

I wonder why smart people take different sides when these new, innovative tools appear. I know other technologists (who have never seriously used the tools) who have reacted strongly and were surprised that not everyone thinks it is a clear case of “absolute evil leviathan-versus-little-guy art heist.” The MPAA and RIAA spent tons of money over decades labeling “music piracy” (sharing) as “theft” because it hurt the fat cat label execs’ bottom line and expensive middle-man lifestyles (they were systematically fleecing and hurting artists way worse than any music sharing ever could). What the record execs purposely never mentioned is the following: taking a chair from someone’s house leaves the house short of one chair (same for an art museum with the heist of a specific piece of art), but walking into a museum, carefully measuring the dimensions of a chair, and going home to make one’s own duplicate does not. Now there are two chairs, rather than one chair that was stolen and moved to a different place. Thus IP copying is not theft, burglary, or larceny in the usual sense. It may be illegal or wrong in some sense, but it is not a zero-sum action.

Ironically, Butterick is mad at Microsoft for training GitHub Copilot on GitHub repos, but if the SD lawsuit is successful, only financially larger players such as OpenAI (effectively a Microsoft subsidiary now, after a recent $10B infusion) will have the money to license input images to train models, and all tools that rely on models will have to be paid for (unless you know someone who has downloaded the old contraband models and can illegally share that contraband with you). Thus eventually only big businesses will control AI image art creation, which is the nightmare Butterick and other lawyers are supposedly trying to prevent with their rhetoric about stealing from the artists. He may be doing OpenAI’s, and hence Microsoft’s, dirty work for them by killing off the small or free players such as Stable Diffusion.