13 Tweets Jan 23, 2023
Some thoughts on #StableDiffusion vs #DALLE2 vs #midjourney
They are actually all different (complementary?).
First, this whole field was catalysed by @OpenAI releasing CLIP publicly last year (!), which @RiversHaveWings, @advadnoun and others built on: openai.com
This allowed image generation to be guided by text; the surprising output led to a burst of developer and artist creativity from @ClaireSilver12 and many others.
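The CLIP-guidance idea above can be sketched in a few lines: start from noise and nudge pixels so the image embedding moves toward the prompt's text embedding. This is a toy illustration only — the matrix `W` and vector `text_emb` are stand-in assumptions for CLIP's real neural encoders:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for CLIP (assumptions, not the real model): a fixed linear
# "image encoder" W and a fixed "text embedding" for the prompt.
W = rng.normal(size=(64, 256))       # maps 256 "pixels" to a 64-dim embedding
text_emb = rng.normal(size=64)       # pretend this encodes the prompt

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

image = rng.normal(size=256)         # start from noise, as guided methods do
start = cosine(W @ image, text_emb)

for _ in range(200):
    z = W @ image                    # "encode" the current image
    nz, nt = np.linalg.norm(z), np.linalg.norm(text_emb)
    # Gradient of cosine similarity w.r.t. z, pulled back through the encoder.
    dcos_dz = text_emb / (nz * nt) - z * (z @ text_emb) / (nz ** 3 * nt)
    image += 1.0 * (W.T @ dcos_dz)   # nudge pixels toward the prompt

final = cosine(W @ image, text_emb)
```

In the real systems a diffusion or VQGAN generator replaces the raw pixel buffer and CLIP's neural encoders replace `W`, but the loop — encode, compare to the prompt, step along the gradient — has the same shape.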
Proud to say we supported & funded most of the open tools in this space.
It is now good enough and fast enough.
So #dalle2 is a model and a service.
It is focused on a certain usage subset that will broaden.
Inpainting is its best feature, but by default output is random and best used for ideation and more corporate usage, hence its clear training on licensed stock images.
I am sure lots more features are coming, and more efficient, even higher-quality versions are in the pipeline.
Expect the cost to drop 10x by the new year and we should see API access.
I do not expect the model to be open sourced, given the training data.
If you want to see how it works, dig into the research paper here: arxiv.org
And you can check out the open source variant of those research concepts, which we supported, here: github.com
@OpenAI is focused on general intelligence, not product.
And that is fine.
Now #midjourney.
David Holz is a visionary technologist focused on understanding how humans interact with computers.
MidJourney is not a VC-backed product company but a research lab exploring how people interact with, and can be improved by, this new tech; see his latest interviews.
MidJourney has a very distinctive painterly style on purpose.
It uses the same raw open source model as most other services (for now! That will change soon) but a huge amount of work has gone into consistency and coherence.
Output is tilted toward randomness, but there is some control.
The front end code is not open, nor is anything else about it, but David has open sourced millions of lines of code in his career.
And that is fine. Not everything needs to be open source and it is a great service that will evolve in perhaps surprising ways 🙃
Now #stablediffusion is a model built through a series of collaborations that will be released very 🔜
It is a file that is part of the foundational infrastructure for knowledge of all types of imagery, from art to product to whatever you can imagine.
It is a general foundation model
Now there are services that will be built around it, as it is being released open source as the first of our benchmark/foundational models.
We will shortly announce our #DreamStudio prosumer service, but our focus is on our API, to drive down the costs of accessing this and future models.
So that a billion people can communicate better.
These models will need to reflect every culture and work with creators, and we are doing a lot of work in this area with the brightest and biggest names in the space.
As a general model, outputs are wider, but what you have seen is raw output from our beta model tests, with no pre- or post-processing.
It is much better with these, and we have focused on fine-grained control.
But as an open source model, anyone can use it. The code is already available, as is the dataset.
So everyone will improve and build on it
Or take elements of it & make even better things
Which is fantastic.
More tools more choice but ultimately a new way of communicating for all
Brand new market and segment; nobody is really competing as we go from millions to billions of people using this.
Look forward to collaborating with all
