AI art isn't "about to shake things up". It's already here.

davis_kingsley

post by Davis_Kingsley · 2022-08-22T11:17:55.415Z · LW · GW · 19 comments

For a while, I've been seeing people commenting about how AI art is on the cusp of shaking up the art world. Quite frankly, at this point I consider such sentiments to be behind the times. AI art is not "on the cusp" of disrupting such things. It is already here. Using only capabilities that are straightforwardly and publicly available right now, AI art has radically transformed the budget for certain types of project that require art.

Let's give a basic example. I play card games of various types -- games like Magic: the Gathering or similar -- on a highly competitive level. I've even been involved in testing and development for some such games, so I'm familiar with that process as well. I am not a professional game designer, but am fairly involved with some aspects of the field.

For one of these games, a typical budget for an individual card illustration of sufficient quality is, as I understand it, in the realm of $200-1000 USD. A single "set" of cards might require 100+ such illustrations. That means that you're looking at paying $20k in art costs on the low end, and that this is a recurring cost every time you want to make a new set that isn't just reprinting old cards -- and even then, sometimes new art is used for reprints!

Except, well, that was then and this is now. Now, if I were in the business of making a card game set, my art budget wouldn't be $20k-100k. It would be... a $30/month Midjourney subscription with $20/month private visibility enabled, and quite frankly high quality Midjourney images look better than many of the images already being used for art in these games. [1]

I'm going to highlight that again. The price for the art needed to create a set of a hundred cards just went from twenty thousand dollars -- at the low end -- to fifty bucks a month. This is an extreme shift, and it is already here. This is not something that is based on a press release or future development that hasn't arrived yet. This is something that I could do today, using only techniques that are widely known and publicly available. If you assume you can get the art needed in one month of Midjourney time, that's four hundred times cheaper.
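As a quick check of that arithmetic, here is a minimal sketch. The names are made up for illustration, and it leans on the same assumption as above: that one month of subscription time is enough to illustrate a full set.

```python
# Back-of-the-envelope comparison using the figures quoted in this post.
# Assumption: a 100-card set can be illustrated within one month of Midjourney time.

CARDS_PER_SET = 100
COMMISSION_LOW_USD = 200           # low end of the per-illustration budget
COMMISSION_HIGH_USD = 1000         # high end of the per-illustration budget
MIDJOURNEY_MONTHLY_USD = 30 + 20   # subscription plus private visibility


def set_art_cost(price_per_card: int, cards: int = CARDS_PER_SET) -> int:
    """Total commission cost for one set at a given per-card price."""
    return price_per_card * cards


def savings_factor(traditional_usd: float, ai_usd: float) -> float:
    """How many times cheaper the AI option is than the traditional one."""
    return traditional_usd / ai_usd


low = set_art_cost(COMMISSION_LOW_USD)
high = set_art_cost(COMMISSION_HIGH_USD)

print(f"Traditional set cost: ${low:,} to ${high:,}")
print(f"Savings factor, low end:  {savings_factor(low, MIDJOURNEY_MONTHLY_USD):.0f}x")
print(f"Savings factor, high end: {savings_factor(high, MIDJOURNEY_MONTHLY_USD):.0f}x")
```

The four-hundred-times figure is the low-end case. If the work takes more than one month, divide the factor by the number of months.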

There are many other areas where this applies. What's the price for a book cover? Quite frankly, that's not my field -- but whatever it is, I'm going to bet that Midjourney is often going to be cheaper and better. What's the price for the internal illustrations in a role-playing game manual? Again, whatever it is I'm going to bet that AI art is already beating it.

Further, AI art is much easier to work with than professional artists. This is not intended as an insult to professional artists by any means! However, if I am working with a professional artist on an image, it may take them a significant amount of time to produce the image and get back to me on that. By contrast, if I don't like the Midjourney output I can write a variation on the prompt and get a new set of images extremely quickly. And I don't have to worry about people missing or misreading my emails, a potential language barrier, or time zone issues. [2]

Now, there are admittedly some things that AI art isn't good at (the really big one being art with integrated text). You know what? That's true. There are definitely some things that AI art does not handle well. It's not a perfect substitute yet. However, given the outrageous cost savings, I am perfectly fine with that. I am altogether willing to change the focus of my card illustrations a bit in order to avoid areas where AI art generation does poorly if it means paying four hundred times less for art for my game, and I suspect others will soon be choosing the same.

So, yeah. AI art isn't some hypothetically disruptive thing that might happen in the future. The capabilities are straightforwardly available right here and right now -- and the fact that there are some flaws and foibles still to be worked out means that it only has room to grow further. If AI art generation already offers this much of an advantage over traditional art commissions, how much more obvious a choice will it be once we've seen some more iteration on these systems?

[1] "But wait," you might say. "Don't companies with lots of revenue have to pay a higher price for Midjourney?" They do, but that price is $600 USD/year, and it comes with the private visibility option so the yearly price is actually the same. You do get somewhat less GPU time than a standard membership but can still buy extra time if needed.

[2] "Language barrier" and "time zone issues" may seem like weird problems to have, but as I understand it many game companies are getting art from artists who live in Eastern Europe or Asia. This is a good strategy in some ways but also has its downsides.

19 comments

Comments sorted by top scores.

comment by benjamincosman · 2022-08-22T13:53:36.092Z · LW(p) · GW(p)

I don't think you have any object-level disagreement with most people who say AI is "about to" shake things up. You just view "the tech is already such that this could theoretically be happening today" as sufficient justification to say things are being shaken up in the present tense, while others are using that same justification for the words "about to", and are presumably waiting for real-world effects (significant shifts in commission art pricing, artists losing their jobs, any major project shifting their art budget to use this tech, etc.) before we say the shake up is actually happening.

comment by gbear605 · 2022-08-22T12:34:59.824Z · LW(p) · GW(p)

I’m more familiar with DALL-E 2 than with Midjourney, so I’ll assume that they have the same shortcomings. If not, feel free to ignore this. It seems like there are still some crucial problems, ones that will probably be fixed soon, that prevent AI art from being used for many types of art, and that’s why I would say “on the cusp” rather than “it’s already here”. I think the biggest issue for your example with Magic cards, there’s a certain level of art style consistency between the cards in a set that is necessary. From my experience with DALL-E, that consistency isn’t possible yet. You’ll create one art piece with a prompt, but then edit the prompt slightly and it will have a rather different style. See, for example, Scott Alexander’s attempt at making stained glass: https://astralcodexten.substack.com/p/a-guide-to-asking-robots-to-design

I’m curious: if you tried making a set of Magic cards (even, say, ten cards) and then asked other people who are into Magic to decide which ten are better, how many would choose the existing set? I would bet that they would choose the existing set because of the style consistency.

Beyond that, like you said there are some places where the AIs are just not there yet. Images with text is one, like you mentioned. Another one that seems like it would be a big problem for a Magic set is human faces, which DALL-E is notoriously bad at. Worse, it’s bad at it in ways that are rather obvious to viewers.

Both of these issues seem likely to be solved soon, but they’re not here quite yet. My use of DALL-E so far would still incline me towards paying a real artist.

Replies from: gwern, Bezzi, maia, Raemon

↑ comment by gwern · 2022-08-22T17:01:09.031Z · LW(p) · GW(p)

I think the biggest issue for your example with Magic cards, there’s a certain level of art style consistency between the cards in a set that is necessary. From my experience with DALL-E, that consistency isn’t possible yet. You’ll create one art piece with a prompt, but then edit the prompt slightly and it will have a rather different style.

As I keep emphasizing, DALL-E makes deliberate tradeoffs and is deliberately inaccessible, deliberately barring basic capabilities it ought to have like letting you use GLIDE directly, and so is a loose lower bound on current image synthesis capabilities, never mind future ones. For example, Stable Diffusion already is being used with style transfer and the final checkpoint hasn't even been officially released yet (that's scheduled for later today; EDIT: out). So if you can't get adequate stylistic similarity by simply dialing in a very long detailed prompt with style keywords (noting that due to the avoidance of unCLIP, Midjourney/Stable Diffusion seem to handle long prompts more like Imagen/Parti, i.e. better), you can generate a set of content images, a style image, and style transfer over the set.

And of course, now that you have the actual model, all sorts of approaches and improvements become available that you will never, ever be allowed to do with DALL-E 2.

Images with text is one, like you mentioned.

Imagen/Parti show that this is not an intrinsic challenge but solved by scale. (Now, if only Google would let you pay for any access to them, that would be the perfect rebuttal...) Also, this would be one of the easiest things to do yourself or hire a very quick easy commission for <<$200 to insert some lettering.

Another one that seems like it would be a big problem for a Magic set is human faces, which DALL-E is notoriously bad at.

No, it does faces pretty well IMO. And Make-A-Scene shows you can avoid solving it with scale by a face-specific loss.

Replies from: TrevorWiesinger

↑ comment by trevor (TrevorWiesinger) · 2022-08-22T20:28:48.273Z · LW(p) · GW(p)

Thank you for clarifying this.

I would have been very misinformed, in a very damaging way, in the work I do every day, if you hadn't refuted some of the erroneous claims made in this post and in that comment.

On balance this post still would have been very helpful for my analyst work, but even more so thanks to you clearing this up.

↑ comment by Bezzi · 2022-08-23T21:04:17.129Z · LW(p) · GW(p)

At least for actual Magic cards, it's not just a matter of consistency in some abstract sense. Cards from the same set need to relate to each other in very precise ways and the related constraints are much more subtle than "please keep the same style".

Here you can find some examples of real art descriptions that got used for real cards (just google "site:magic.wizards.com art descriptions" for more examples). I could describe further constraints that are implicit in those already long descriptions. For example, consider the Cult Guildmage in the fourth image. When the art description lists "Guild: Rakdos", it's implicitly asking for giving the whole card a black-red tone, and possibly inserting the guild logo somewhere in the picture (the guild logo looks like this; note how the guildmage wears one).

I don't want to dispute that AI-generated artworks are very cheap and can be absolutely stunning, but I still predict that AI as available today would do a terrible job if used to replace human illustrators for Magic cards. (You could have a better time using AI artworks for a brand-new trading card game, however.)

↑ comment by maia · 2022-08-22T14:01:09.244Z · LW(p) · GW(p)

Midjourney seems to be better at stylistic consistency. E.g. see the images on the post, which are pretty stylistically consistent: https://alexanderwales.com/the-ai-art-apocalypse/

↑ comment by Raemon · 2022-08-23T02:24:40.756Z · LW(p) · GW(p)

I think the biggest issue for your example with Magic cards, there’s a certain level of art style consistency between the cards in a set that is necessary. From my experience with DALL-E, that consistency isn’t possible yet. You’ll create one art piece with a prompt, but then edit the prompt slightly and it will have a rather different style

Hmm, I haven't had much trouble getting Dall-E to output consistent styles. I think there's some upfront cost in figuring out how to get the style I want, but then it tends to work pretty reliably, or at least I develop a sense of how to tweak it to maintain the style. (Albeit, this does take extra time, and is part of why in my other comment I note that I think Davis is undercounting the cost of AI art.)

comment by Raemon · 2022-08-23T02:21:36.313Z · LW(p) · GW(p)

For one of these games, a typical budget for an individual card illustration of sufficient quality is, as I understand it, in the realm of $200-1000 USD. A single "set" of cards might require 100+ such illustrations. That means that you're looking at paying $20k in art costs on the low end, and that this is a recurring cost every time you want to make a new set that isn't just reprinting old cards -- and even then, sometimes new art is used for reprints!

Except, well, that was then and this is now. Now, if I were in the business of making a card game set, my art budget wouldn't be $20k-100k. It would be... a $30/month Midjourney subscription with $20/month private visibility enabled, and quite frankly high quality Midjourney images look better than many of the images already being used for art in these games. [1]

This doesn't undermine your overall point, but this is getting the numbers pretty wrong IMO.

You're either hiring an artist to do your work, or you're doing the art yourself (which also costs you time, which you presumably value at some rate).

Making Midjourney art isn't instant – I've sometimes actually spent hours chasing a particular look. A $200 - $1000 card art probably takes an artist 5-20 hours to make, charging somewhere between $20 - $200/hour. Midjourney is still an art tool that still requires someone with good aesthetic sense. It replaces one set of technical skills with a different set, and the overall process is much faster. I'm guessing the average Midjourney image ends up taking ~30 minutes to make, when you factor in trying a bunch of prompts that didn't quite work, as well as context-switching costs.

The cost is higher if you want your art to not only be good, but to be unique among other games or products. For example, Oliver and I have spent somewhere between 20 and 30 hours figuring out the LessWrong art direction. The first phase of that project involved looking at possible aesthetics to pursue (this started before the rise of ML art). The second phase involved figuring out how to get Dall-E et al. to output something that hit the right target (i.e. having classy watercolor images that fade into a white paper background in a particular way).

So I think it's more accurate to say that the state-of-the-art is for Dall-E/Midjourney/etc to cut art costs by a factor of 10-20. This is enough to be industry-changing, but not quite as extremely as you imply here.
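A rough sketch of that estimate, counting hours rather than dollars. The function names are hypothetical, the 100-card set size is borrowed from the post, and the 25-hour art-direction overhead is an assumption taken from the LessWrong project mentioned above, not a measured figure for a card set.

```python
# Time-based comparison using the per-image figures from this comment.
# Assumption: whoever writes the prompts values their time like the artist does,
# so a ratio of hours stands in for a ratio of costs.

ARTIST_HOURS_LOW = 5       # low end of hours per traditional card illustration
ARTIST_HOURS_HIGH = 20     # high end of hours per traditional card illustration
AI_HOURS_PER_IMAGE = 0.5   # the ~30 minute guess per Midjourney image


def ai_set_hours(cards: int, direction_hours: float = 0.0) -> float:
    """Hours to produce a set with AI, plus optional one-off art direction time."""
    return cards * AI_HOURS_PER_IMAGE + direction_hours


def speedup(cards: int, artist_hours_per_card: float,
            direction_hours: float = 0.0) -> float:
    """Ratio of traditional hours to AI hours for a whole set."""
    return (cards * artist_hours_per_card) / ai_set_hours(cards, direction_hours)


# Per-image work only.
print(speedup(100, ARTIST_HOURS_LOW), speedup(100, ARTIST_HOURS_HIGH))

# Same set with an assumed 25 hours of up-front style exploration.
print(speedup(100, ARTIST_HOURS_LOW, 25.0), speedup(100, ARTIST_HOURS_HIGH, 25.0))
```

Per-image time alone gives a range of 10x to 40x. Adding a fixed overhead for finding the style narrows that range, which is roughly in line with the 10-20x guess, though the result is sensitive to every input here.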

Replies from: Davis_Kingsley

↑ comment by Davis_Kingsley · 2022-08-23T11:47:58.944Z · LW(p) · GW(p)

I'm not sure I agree. The normal art project also requires a bunch of "art director time" -- there can be multiple rounds of back and forth between author and artist, different sketches or concepts to evaluate, and so on. If anything, I think there's more context-switching cost required for a traditional project because of the inherent major delay in creating traditional art.

In other words, if I have an AI art prompt that doesn't come out quite right, I know that very quickly and can then run another prompt to refine what I'm going for. If I have a traditional art prompt and a professional artist comes back a while later with sketches that aren't right, I can send them art direction to refine the project -- but doing so will impose more context-switching because of the delay on communications between us, the fact that these sketches/drafts will be arriving substantially after I've sent my initial piece, etc.

Replies from: Raemon

↑ comment by Raemon · 2022-08-23T18:37:12.430Z · LW(p) · GW(p)

Hmm, yeah that does seem reasonable. I do think a big chunk of the process here is more like "doing art" than "doing art direction", but not sure where I'd draw the line.

comment by Tomás B. (Bjartur Tómas) · 2022-08-22T14:26:42.649Z · LW(p) · GW(p)

Stable diffusion comes out today. Given its quality, and the fact that it can be used to generate pornography, I suspect it will be quite newsworthy. There have already been huge Twitter threads about it, started by artists terrified of their obsolescence.

Replies from: Kaj_Sotala

↑ comment by Kaj_Sotala · 2022-08-22T16:52:25.331Z · LW(p) · GW(p)

Yep. Here's one that got quite a lot of engagement, and here's a writer talking about the backlash he got after using AI art to illustrate his Atlantic article.

Replies from: gwern, Bjartur Tómas, Viliam

↑ comment by gwern · 2022-08-23T01:40:22.139Z · LW(p) · GW(p)

I'm very amused by the ending. "I'm sorry, I repent, I've learned my lesson! I will never again dabble in the seductions of that devil AI art again, and will atone with a good old-fashioned high-quality human illustration like this one... which, er, will be in the next issue."

↑ comment by Tomás B. (Bjartur Tómas) · 2022-08-22T19:48:46.677Z · LW(p) · GW(p)

And so it begins: https://twitter.com/emostaque/status/1561777122082824192?s=21&t=_73oggw4l2CehPK9fsUZaQ

I’m going to see if I can finish a comic book in this brief interregnum where artists are obsolete and writers are not.

↑ comment by Viliam · 2022-08-24T00:04:16.537Z · LW(p) · GW(p)

A frequent objection is that the AI learns from human authors, and is thereby violating their copyrights... well, maybe not according to the current laws, but then those laws should be updated to reflect the new reality. The (updated) copyright laws will hopefully stop the AI from making the artists obsolete. From now on, each artist will have to provide pictures of their work-in-progress, to make sure that their work is not generated by an AI.

Let's ignore the fact that if this becomes the norm, soon you will have a lot of "work-in-progress" samples, so the next generation of the AI will be able to produce the final art along with corresponding work-in-progress pictures. Let's imagine an inconvenient [LW · GW] world with universal surveillance, where you can easily distinguish human-made and machine-made pictures, and every copyright violation is punished.

We already have a legal solution for this, in the form of public domain and copyleft. If the AI learns from legally available and modifiable images, no one's copyright will be violated. Artists who support the AI can give up the copyright for their images. Companies that support the AI can offer to the artists deals like "I will pay you for making this specific image, but only if you give up the copyright"; some artists will accept the deal. Gradually, the AI will become better at what it does even if it only learns from the free resources; at some moment possibly as good as it is now. Then, there will be no legal argument against the AI art. (Unless you explicitly ban the AI, or the concepts of public domain and copyleft. Which would be obviously unfair.)

Actually, a stronger argument can be made: unless we invent immortality first, it is just a question of a few decades until all things that are copyrighted today will have their copyrights expired.

If the copyright law is actually modified to ban training from copyrighted images, people will start using non-copyrighted training sets. Actually, someone should already try it now.

comment by Lycaos King (lycaos-king) · 2022-08-23T03:45:17.771Z · LW(p) · GW(p)

Am I the only person who thinks AI art still looks terrible? I see all these posts talking about how amazing AI art is and sharing pictures and they just look...bad?

Replies from: Veedrac

↑ comment by Veedrac · 2022-08-23T13:32:42.074Z · LW(p) · GW(p)

Some people feel this way, but I've done this test and most people just can't tell for good prompts that play to AI's strengths. And also, people don't cherry pick results enough, some images are just excellent, even if the modal image is a good bit jank.

comment by Ben (ben-lang) · 2022-08-24T10:09:47.017Z · LW(p) · GW(p)

On the cost of hiring an artist for a book cover: a couple hundred pounds would be the cheapest end of the spectrum, and that is really very cheap given the number of work hours that appear to be invested. Probably within a year, books published directly through Amazon will be able to ask an Amazon AI to read the book (or blurb) and wizard them up an acceptable cover.

I suspect some artists might lose business, but I imagine that many might benefit from these tools. For example, imagine a tool where the artist draws some stick men in different colours and says "the green one is an orc, the red one is our hero who you drew in the last image", etc. This allows the artist to transmit certain spatial details that might be hard to put in text to the AI. Basically, all the tedious details can be done by an AI on top of a human-drawn skeleton image.

comment by Capybasilisk · 2022-08-23T10:16:46.658Z · LW(p) · GW(p)

"Story of our species. Everyone knows it's coming, but not so soon."

-Ian Malcolm, Jurassic Park by Michael Crichton.
