DALL-E 2 is *leaps and bounds* better at image generation than its predecessor, which was already state-of-the-art. A thread with some notable comparisons:
#dalle
Deeply respect the public reversal on this. It has pained me to see how Ilya has been portrayed in media: I find him so thoughtful and earnest in trying to do right; he also seemed absolutely distraught in the wake of Sam’s dismissal, not coup-like in the least. Again, respect.
I deeply regret my participation in the board's actions. I never intended to harm OpenAI. I love everything we've built together and I will do everything I can to reunite the company.
We have reached an agreement in principle for Sam Altman to return to OpenAI as CEO with a new initial board of Bret Taylor (Chair), Larry Summers, and Adam D'Angelo.
We are collaborating to figure out the details. Thank you so much for your patience through this.
There's rightfully been a lot of buzz about DALL-E 2's image-generation abilities so far, but I've heard less about its amazing Variations feature.
In just a few cycles, you can go from an image like this (left), to several like this (right):
We’re testing an important step forward: AI optimized for helping w/ goals via chat, & reducing some of the most common issues w/ these AIs (“hallucinating” facts, inappropriate requests, etc). Help test it out & you could win $500 of API credit in our feedback contest!
We start by erasing the region we'd like to inpaint, then give DALL-E 2 a caption to generate toward. With a little AI inspiration, we can give that old guitarist a shiny new neon pink electric guitar!
With famous paintings, DALL-E 2 can often do in a single-shot what might have taken dozens before, and at significantly higher quality. Can you guess the caption for the below? (Ok, it's listed, but don't cheat!)
We've learned a lot from deploying large-scale AI systems, and attempting to do so responsibly. Now, a bunch of colleagues and I have written up some reflections so they're in the public record and can help others more widely:
Deploying and studying the real-world use of language models helps us learn more about safety and misuse than research alone.
As we advance our safety and policy work, we're sharing some of our findings to help others do the same.
A major part of OpenAI's Charter is to not unduly concentrate power.
It would be quite undesirable for one private group to choose all of AI's values; that is not OpenAI's plan!
Read more about how we're building public input and accountability into the design of our systems.
@robertwiblin
p(doom) = 0 is such a wild claim. I wonder how many are expressing a view like "I think we'll fix it b/c we'll take it seriously, therefore it's extremely low-risk", vs "I think we're currently on track to solve it" or "there's nothing even in need of solving"
Next, let's try to change the apple into another fruit - say, a bunch of red grapes. This is a bit hard, because the leaves in the original don't quite match what a bunch of grapes looks like. Nonetheless, DALL-E 2 is able to do it. I erase the apple's region, and get these back:
And for those wondering - it isn't just adaptations of famous paintings!
@dabmely came up with one of my favorites; the level of detail is pretty stunning, even for my having seen hundreds if not thousands of these:
We make an important side-point in this paper, one that I fear is too often overlooked:
AI systems don't need to be better than you to be extremely useful, nor do they even need to be just as good.
They just need to offer *comparative advantage*, in safe, reliable ways.
Our new research white paper identifies seven practices for keeping increasingly agentic AI systems safe and accountable as they become more common and more capable.
We are providing research grants for work on a range of open questions.
With a little creativity and @_dschnurr magic, I was able to make these "larger than life" #dalle images: stitching together different 1024 x 1024 images to create mural-like artwork. Below, a Noah's Ark of robotic animals, & a sandy beach at sunset.
Inpainting with DALL·E 2 is super fun. With some ingenuity, you can create arbitrarily large artwork like the murals shown below – which I assume are the largest #dalle-produced images created so far.
@patio11
@sama
Here's the Chicago one; at first, this wasn't recognizably Chicago, so I gave it a landmark and this came out a bit better; I then generated some Variations on my favorite:
The future's so bright, we've got to wear shades
("3D rendering of a happy robot wearing sunglasses, shallow depth-of-field, tropical celebratory background")
It might be fun to create some Variations on that image, so we aren't working from an *exact* copy. Here's what DALL-E 2 produces: Not bad! But I prefer the original, so let's stick with that.
Sometimes, especially with a tricky concept, generating Variations will turn up some really beautiful ones. So I pick my favorite from the above, and ask DALL-E 2 to create some more that are similar to it - and it produces this:
I'm honored to have worked on this, especially on ensuring that DALL-E 2 leads to delight and increased AI understanding in the world, rather than harm (disinformation, bias, etc). Our amazing Policy Research team did a great job explaining our approach:
Very excited to share something a lot of folks have been working on for a while - an initial set of recommended best practices for language model deployment. Much more to be done and said here but excited for this statement to help move discussion forward!
@patio11
@sama
The goldfish one is tricky; even my adapted prompt doesn't do much better, nor do Variations on my favorite of it (the hallucinated text in the original here is very funny):
But eh, where's the fun in another stodgy portrait of a man with grapes and a red party hat obscuring his face? So trite and cliched. Let's do ... a corgi!
I'm up for taking some #dalle requests, especially focused on really beautiful or creative artwork. Drop some ideas in-thread, and I'll share out the most-voted ones!
Once upon a time, we developed the technique called "Inpainting", which let you not only generate a novel image from text, but *edit* an existing image to look more like your caption. With painstaking iteration, it was possible to create images like the below.
@ESYudkowsky
@nickcammarata
And this is one I happen to get if I say the painting is by Eliezer Yudkowsky, though the system on the whole doesn't appear to know what you look like:
I think the world would be far better off if other AGI outposts adopted these norms.
Some basically have!
@AnthropicAI is a B-Corp, which I love & find far more comforting than "maximizing shareholder value". (3/n)
We are building an early warning system for LLMs being capable of assisting in biological threat creation. Current models turn out to be, at most, mildly useful for this kind of misuse, and we will continue evolving our evaluation blueprint for the future.
@patio11
@sama
Here's an illustrative generation re: "A young girl staring down a dragon, who is visibly amused."; this worked on the first shot. Note that the human face isn't perfect, which IMO is a feature rather than a bug at this stage: see
Truly one of the great privileges of working at OpenAI is a front-row seat to what folks are building, internally and on our platform (and building the things yourself!)
I have now gotten enough of a taste of AI-powered creative tools to know that they're going to be much better than even the AI optimists think.
So cool to just think of ideas and iteratively have the computer implement and build on them.
My 4 year old designed cute fuzzy red chicken slippers with #dalle a couple weeks ago. 🤩
I had them custom made and they just arrived. Soooo cute 🥰
She named them Henry and Wendy 🐓🐓
This is the future of creativity, young entrepreneurship, fashion design and ecommerce.
For those curious, I'll show an end-to-end process of creating something fun. Art was never my strong suit, but with DALL-E 2, I can now "paint my imagination". Let's start with a classic like this:
If you're interested in previewing our research and playing with the AI to help us explore the capabilities of highly-advanced image-generation, you can join the waitlist now:
If you want to give GPT-4 a go, you can do so now via ChatGPT Plus:
And if any of this sounds interesting, note that we're hiring and would love to hear from you:
DALL-E 2 can also make use of Inpainting, when an image isn't quite well-known enough to be generated de novo. Consider these attempts at generating The Old Guitarist by Picasso; they don't look quite like the original. Thankfully, we can edit the original if we wish!
@sloanesloane
Here's the second I chose:
You can see it has some trouble getting the "three panes" exactly right (many have two) and the fine details aren't tremendous, but I think the general style is there!
One of the best kept "secrets" at OpenAI:
Clicking this button at gets you access to instant, world-class transcription. And now, you can do this at scale with our Whisper API.
One of the coolest parts? OpenAI's corporate structure. Capped profits; governed by a non-profit Board; ability to cancel all shareholder obligations if responsibility calls for it. (2/n)
@_markhudson_
Hmm someone asked for "The Creation of Adam by Michelangelo but the humans are cyborgs", but I don't see their Tweet anymore. Maybe they'll see this?
Ok, great, so we have grapes! But that hat is kind of boring, why don't we try a red party hat instead? Again, start by erasing, and give DALL-E 2 a prompt:
You can read more here about our Content Policy, which applies to DALL-E 2:
It's great to see this getting highlighted in news coverage as well, not just the technical features:
That dog in r2c3 is really calling out to me; the neckline isn't quite what I want though, and on further thought, I want a fun vibrant setting like the beach, rather than a solid-colored wall:
@AISafetyMemes
Are these quotes attributed as being from distinct individuals?
If many are in fact from the same individual, I think it would be clearer and more accurate to indicate that with "...." as is the journalistic norm :-)
@bakztfuture
Just a note - this is 50 *requests* (multiple images per request), not 50 images; please let us know if you're seeing something different, would definitely want to patch that
I spent most of this past weekend experimenting with DALL·E 2, OpenAI’s new AI system that can create realistic images from a written description.
I curated a book of 1000 robot paintings. You can view the entire thing online at
#dalle
@mattyglesias
"a prosperous and multicultural United States with one billion varied Americans, standing hand-in-hand in loving harmony, high-quality digital art"
If any of this sounds inspiring, know that we need a ton more people.
We'd love to hear from you, especially if you have experience doing right by communities via transformative technology:
Re-upping this in light of today's WebGPT announcement. Language models can be steered towards so many tasks beyond "mere" writing (not that that isn't important in itself).
Not all of these turn out with *great* quality; truthfully, I think I was a little sloppy with the region around the hand, and so sometimes we end up without a guitar at all. But, compared to the version that @ilyasut and I hacked together back in the day, I'll take it!
We're still working on a bunch of things - help submit feedback! - but it's great to have such a useful tool for asking questions & improving my understanding.
I'm remarkably privileged to have world-class colleagues - and still, ChatGPT is incredibly useful.
During my PhD I spent a lot of time studying a protein called beta-catenin.
Today OpenAI released ChatGPT and I asked it a few questions about beta-catenin. LLMs as a learning tool will be incredible. Everyone will have an on-demand socratic method information mentor.
Not only did ChatGPT's API release, now Whisper's in the API!
Whisper can transcribe audio into text with incredible accuracy. For a sense of scale, Alexa gets roughly 1 in 5 words wrong. Whisper, in contrast, gets close to professional human transcribers.
Excited to share some of what we've been working on: tooling to help developers protect against misuse of their applications, and to unlock beneficial uses in sensitive domains, like tutoring for students.
Excited about the next phase of our push to bring more folks from underrepresented groups into AI.
I have the privilege of working closely with several alums of the earlier iterations of this program and look forward to meeting the next generation!
A nice feature of chat is that you can re-direct GPT if you change your mind, want a variation on the theme, etc.
For instance, I redirected it from general 80s upbeat songs to ones with great drum beats:
First recommendation: "In the Air Tonight", not bad!
@kshekar
@EmilyKager
@lauren_n_roth
The Gmail is probably their iMessage account - people can choose whether new messages are sent by default from a phone number, email, etc, but many don’t know about this, and even for those who do it can be counterintuitive
@drawbrandondraw
@sloanesloane
These are generations for “A black and white cartoon sketch of a robot artist, in the style of Bill Watterson” - my sense is they don’t look a *ton* like his? I bet it would generate good Variations of his work though
Around this time 3 years ago, early 2020, we started testing the OpenAI GPT-3 API with the first external users. It was really hard to get anyone to try it, let alone build on it. So much has changed since then, but it still feels like we're in the early days.
@rokine40
Yeah that was one of my major things initially too - it still works and is generally more straightforward now :-) for instance, this is just "a 3D rendering of a harp in the shape of a snail"
Now, to my point - that rightmost image from the beginning isn't *that* great. But one level further down the tree, look how cool they are!
It's DALL-E images all the way down.
When I see impressive artwork in my Twitter feed, I'm now often checking the bottom-right for our "DALL-E signature" to see if it was AI-generated.
DALL-E has seriously changed the way I interact with media online & what I believe to be possible.
It's here: the ✨ #DALLE Prompt Book ✨, first edition! Free to download, it features 82 pages of artistic inspiration with over 300 examples of #DALLE2 images!
Here's what's inside... 🧵