What's the easiest way to currently generate images with machine learning?

post by ChristianKl · 2022-02-24T15:20:24.244Z

This is a question post.


As far as I understand, the images for the latest LessWrong books were generated by a neural net. I would like to generate images for the sequences I have on LessWrong in a similar way. What's the easiest way for me to do so?

Answers

answer by Quintin Pope · 2022-02-24T19:31:36.161Z

Download an app such as WOMBO Dream or starryai, both of which provide text-to-image generation through a convenient interface. WOMBO Dream is free to use as often as you want and very fast. starryai only gives you 5 free images per day (with 5 to start) and is much slower. However, starryai has more customisation (more styles, can use a starting image, can tune the number of optimization steps) and uses an overall stronger approach.

I’d suggest using WOMBO to refine your prompt, then starryai to generate the final image. WOMBO alone might be enough if you’re more interested in the texture/“feel” of the image than in the actual shape.

comment by Celenduin (michael-grosse) · 2022-02-26T16:59:38.703Z

Note that there also exists a web version of the WOMBO Dream app: https://app.wombo.art/

answer by artifex0 · 2022-02-26T11:15:41.504Z

For text-to-image synthesis, the Disco Diffusion notebook is pretty popular right now. Like other notebooks that use CLIP, it produces results that aren't very coherent, but which are interesting in the sense that they will reliably combine all of the elements described in a prompt in surprising and semi-sensible ways, even when those elements never occurred together in the models' training sets.
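For a rough sense of what these CLIP-guided notebooks are doing under the hood, here is a minimal sketch of the core idea, assuming PyTorch and OpenAI's open-source `clip` package are installed; the prompt string is just a placeholder, and this optimizes raw pixels rather than a diffusion model:

```python
# Minimal sketch of CLIP guidance (not Disco Diffusion itself): optimize an
# image's pixels to increase its CLIP similarity with a text prompt.
# Assumes PyTorch and the "clip" package from github.com/openai/CLIP.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

prompt = "a lighthouse at dusk, watercolor"  # placeholder prompt
with torch.no_grad():
    text_features = model.encode_text(clip.tokenize([prompt]).to(device))
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

# Start from random noise at CLIP's input resolution and optimize pixels directly.
# (Real notebooks also apply CLIP's input normalization and random augmentations.)
image = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=0.05)

for step in range(300):
    optimizer.zero_grad()
    image_features = model.encode_image(image.clamp(0, 1))
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    loss = -(image_features * text_features).sum()  # negative cosine similarity
    loss.backward()
    optimizer.step()
```

The actual notebooks steer a diffusion model rather than raw pixels and add augmentations and various other tricks on top of this loop, which is where most of the visual quality comes from.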

The GLIDE notebook from OpenAI is also worth looking at. It produces results that are much more coherent but also much less interesting than the CLIP notebooks. Currently, only the smallest version of the model is publicly available, so the results are unfortunately less impressive than those in the paper.

Also of note are the Chinese and Russian efforts to replicate DALL-E, such as CogView and ruDALL-E. Like GLIDE, the results from those are coherent but not very interesting. They can produce some very believable results for certain prompts, but struggle to generalize much outside of their training sets.

DALL-E itself isn't available to the public yet, though I'm personally still holding out hope that OpenAI will offer a paid API at some point.

answer by Trevor Hill-Hand · 2022-02-25T15:52:14.302Z

My favorite one to play around with has been this Google Colab notebook: https://is.gd/artmachine - totally free if you don't mind it being slow (i.e. 10-20 minutes per image).
