The past 24 hours have had me navigating an existential crisis while simultaneously being gaslit by friends, family, and colleagues about what’s going on. And that’s probably fair of them—I have a tendency to overreact to things, to be a bit dramatic.
I am 100% the guy in this panel right now.
But 4o image generation is insane.
I’ve been working in the LLM space since before ChatGPT shifted everything. I’ve closely followed the progress. I test every new release. I tell my friends that every AI app they send me is slop. I am not easily impressed.
But this feels like another ChatGPT moment. This isn’t just better distribution (is hiding your state-of-the-art model in a Discord chat behind /commands really the best way to get people to use it?).
This feels foundational. It’s not just a better diffusion model—it’s actual reasoning in pixel space.
I’ve been on the fence about whether AGI (whatever that even means) is possible. Can we actually bottle intelligence into an electric rock? But it doesn’t take much napkin math to pencil out where this goes in a few years. (True believers might ask where I’ve been, but rest easy, brethren: I am yours now.)
It brings to mind this 100% real needlepoint of an Ilya Sutskever quote.
Trying to game out the second- and third-order effects of an image generation model feels strange, even dumb. Infinite Ghibli? What are you worried about? Ghibli gonna take all the jobs?
I think this tweet (shared with me by a friend this morning) just about sums it up.
But to that, I say:
Below is a series I’ve been working on to try and demonstrate this phenomenon I’m experiencing. My fiancée and I got engaged last October, and we captured an amazing photo (maybe my favorite picture ever). So I’ve been trying to recreate it in every style possible.
The consistency of this model is incredible, and the content filters are tuned low right now. I won’t be surprised if, by the time you’re reading this, most of these styles are blocked (it already seems to be happening :/).
Our original photo:
Lego:
Claymation:
Sesame Street:
Scooby-Doo:
Neon Sign:
Tim Burton:
Hey Arnold:
Victorian Botanical Print:
Wes Anderson:
Pixar:
Vintage Comic:
Peanuts:
“Yellow Submarine Family”:
Construction Paper:
Architectural Blueprint:
Medieval Manuscript:
Street Art Stencil:
Pixelated Video Game:
1960s Style Cartoon:
Stop Motion:
And finally, Ghibli:
Other models could already do this!
No, they couldn’t.