🎉 Big news, folks! 🎉 I'm thrilled to join forces with @a16zgames for their upcoming @Speedrun class, backing visionary founders at the frontier of AI, 3D & immersive computing. 🚀
They're looking for startups building the next generation of:
1️⃣ Visual 3D Creation Tools
2️⃣…
Before/after of Corridor's latest AI video is wild. They shot video on greenscreen, made virtual sets in Unreal, then reskinned it to anime by finetuning Stable Diffusion. Net result? 120 VFX shots done by a team of 3 on a dime. Bravo! This is a milestone in creative technology🧵
Midjourney v5 has pushed into photorealism, a goal which has eluded the computer graphics industry for decades (!) 🤯
Insane progression, and all that by 11 people with a shared dream.
🧵 Let's explore what these breakthroughs in Generative AI mean for 3D & VFX as we know it...
Remember that 'air head' video made with Sora? Turns out it used a ton of rotoscoping and manual VFX.
A 'head' would pop back on, and the balloon colors would keep changing from generation to generation. TL;DR researchers and developers of generative AI tools really need to…
Generative AI is cool and all, but I continue to be blown away by the strides in real-time visual effects.
Case in point — ripping through immaculate fluid simulations on an RTX 4090 using EmberGen:
3D capture is moving so fast - I scanned & animated this completely on an iPhone.
Last summer you'd need to wrangle COLMAP, Instant NGP, and FFmpeg to make NeRFs.
Now you can do it all inside Luma AI's mobile app. Capture anything and reframe infinitely in post!
Thread 🧵
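For context, here's roughly what that old "wrangling" looked like, as a minimal sketch. Script names, paths, and flags are from memory of the instant-ngp repo and may differ between versions:

```python
# Rough sketch of last summer's NeRF workflow: FFmpeg for frames, COLMAP for camera poses
# (run via instant-ngp's helper script), then instant-ngp itself for training and viewing.
import os
import subprocess

os.makedirs("frames", exist_ok=True)

# 1. Pull still frames out of the phone video.
subprocess.run(["ffmpeg", "-i", "capture.mp4", "-vf", "fps=2", "frames/%04d.png"], check=True)

# 2. Run COLMAP and convert its poses into the transforms.json format instant-ngp expects.
#    (colmap2nerf.py lives in the instant-ngp repo; flag names may vary by version.)
subprocess.run(
    ["python", "scripts/colmap2nerf.py", "--images", "frames",
     "--run_colmap", "--out", "transforms.json"],
    check=True,
)

# 3. Launch instant-ngp's testbed on the prepared scene to train and fly around the NeRF.
subprocess.run(["./instant-ngp", "transforms.json"], check=True)
```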
🧠 AI experiment comparing #ControlNet and #Gen1. Video goes in ➡ Minecraft comes out.
Results are wild, and it's only a matter of time till this tech runs at 60fps. Then it'll transform 3D and AR.
How soon until we're channel surfing realities layered on top of the world?🧵
🚀 Big news today with Google + Adobe joining forces!
We're talking about 3D content anchored to the real world at insane scale🌐 And of course, AI had a role to play.
I've got early access, and let's just say the physical & digital worlds are blurring 😎 Let's get into it!🧵
Q: "After doing AI for so long, what have you learned about humans?"
Sam Altman: "I grew up implicitly thinking that intelligence was this, like really special human thing and kind of somewhat magical. And I now think that it's sort of a fundamental property of matter..."
It's…
AI just took 3D modeling to a whole new level 🤯
Introducing Neuralangelo, a new AI model by NVIDIA that reconstructs mind-blowingly detailed 3D surfaces directly from 2D videos — like photogrammetry on steroids. 🧙🏻♂️
Keep reading to see this crazy magic for yourself 🧵
3D scanning and rendering is moving so fast - got my splats up and running and I'm mind blown getting ~100fps for this complex 3D scene ⬇️ 🤯
1. WAY faster than NeRF: For comparison, NeRFs would take around 10 seconds per frame (!) Instead I'm zipping around with FPV controls…
Wow! This looks SO much better than Roto Brush in After Effects, it's not even funny.
Track-Anything is a video object tracking & segmentation tool based on Meta's Segment Anything Model.
That robustness though! No way I'd get a similar result in AE without tons of manual…
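Track-Anything builds on SAM plus a video tracker; as a point of reference, here's a minimal sketch of just the SAM step on a single frame (checkpoint path, frame name, and click coordinates are placeholders, and this is not the Track-Anything API itself):

```python
# Segment one video frame with Meta's Segment Anything Model from a single positive click.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Assumes the ViT-H checkpoint has been downloaded from Meta's repo beforehand.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

frame = cv2.cvtColor(cv2.imread("frame_0001.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(frame)

# One click roughly on the subject you want to roto out.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[640, 360]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
best = masks[np.argmax(scores)]  # keep the highest-scoring mask
cv2.imwrite("matte_0001.png", (best * 255).astype(np.uint8))
```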
ControlNet experiment where I'm toggling through different styles of contemporary Indian décor, while keeping a consistent scene layout.
Loving how ControlNet is putting artists back in control of the AI image generation process.
🧵Thread
#ControlNet
#StableDiffusion
#EbSynth
3D, AI, NeRFs and oh my! Immersive view is now rolling out to 5 cities, with 5 more to follow. Experience the best of all views, with helpful info layered on top, so you can decide when & where to go. Proud to be part of the team-of-teams that made this maps milestone a reality!
i'm convinced the killer use case for 3d reconstruction tech is memory capture
my parents retired earlier this year and i have immortalized their home forever more
photo scanning is legit the most future proof medium we have access to today
scan all the spaces/places/things
AR multiplayer FPS in China
Who else loved paintball and airsoft as a kid? This would be so hype with none of the mess. Perfect for an urban community.
And the P90s make it extra cool 😎
Ask and you shall receive. Playing guitar in mixed reality. 🎸
So much potential to augment real world instruments. Why should pianos have all the fun? 😁
Awesome work by @sergeyglkn
Anyone making an app for Quest 3 that uses mixed reality to overlay guitar tabs on an actual guitar fretboard?
Not air guitar, but more like the @PianoVisionAR app tailored to guitars. 🎸
I spent the weekend hacking an AI room makeover using ControlNet & Stable Diffusion, restyling my parents' "drawing room." How soon until this workflow is standard for home décor & interior design?
ControlNet is a glimpse into the AI-infused future of 3D rendering & AR.
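For the curious, a minimal sketch of this kind of room-makeover workflow using Hugging Face diffusers. Model IDs, file names, and the prompt are illustrative assumptions, not the exact settings used here:

```python
# Lock the room's layout with a Canny edge map via ControlNet, then restyle it with a prompt.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Edge map of the existing room keeps walls and furniture where they are.
room = cv2.imread("drawing_room.jpg")
edges = cv2.Canny(room, 100, 200)
control = Image.fromarray(np.stack([edges] * 3, axis=-1))  # ControlNet expects 3 channels

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

makeover = pipe(
    prompt="mid-century modern living room, walnut furniture, interior design photograph",
    image=control,
    num_inference_steps=30,
).images[0]
makeover.save("drawing_room_makeover.png")
```

Swapping the prompt while reusing the same edge map is what keeps the layout consistent across styles.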
🔥 Unreal Engine 5 is wild. Such a powerful tool for cinematic storytelling! 😻
📺 Made this short video entirely in-engine with marketplace 3D assets, and a sprinkle of GPT for scripting assistance. Hope you enjoyed my voice acting too 😂
🤯 Doing this the “good old…
🤯 Wondering why creators like @SirWrender are losing their minds over @WonderDynamics?
Short answer: it’s a middle ground between 3D, VFX and editorial tools ⚔️
So what took 3 days across many tools — takes 3 minutes in just one tool!
🧵 Thread (0/8):
I'm still convinced the killer use case for 3d reconstruction tech is memory capture. No surprise Apple is headed in this direction.
Here's a massive scan of the backyard + rooftop of my parents' old house. They're retired now, but their home is immortalized forever.
Photo…
What a week for AI! Not yet scary, but a feeling is in the air. Things are heating up and people are conflicted.
Why are the brightest minds in AI asking for a 6 month pause, while others say it doesn't go far enough? 🤯
Here's why this debate deserves our attention.
🧵 Thread
It’s over. After this Sora clip, I can't even imagine doing this the ‘normal’ way with complex 3D/VFX tools.
I really thought you'd need 3D engine + generative AI to achieve consistency & quality like this.
Turns out all you need is more data and compute…
I love how the simplest AI creation tools, woven together tastefully into a cohesive narrative, will outperform posts pushing complexity.
Creators saying this isn't hard to make are missing the point. Story above all else. Case in point: Barbenheimer 😌
Multi ControlNet is a game changer for making an open source video2video pipeline. I spent some time hacking this NeRF2Depth2Image workflow using a combination of ControlNet methods + SD 1.5 + EbSynth.
🧵 Full breakdown of my workflow & detailed tips shared in the thread below ⬇
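As a rough illustration (not the exact pipeline from this thread), here's what a Multi-ControlNet keyframe pass can look like in diffusers, conditioning SD 1.5 on both a NeRF-rendered depth map and Canny edges of the same frame; the styled keyframes would then be propagated across the video with EbSynth outside of Python. File names, prompt, and weights are placeholders:

```python
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

depth = Image.open("nerf_depth_0001.png")                     # depth rendered from the NeRF
edges = cv2.Canny(cv2.imread("nerf_rgb_0001.png"), 100, 200)  # edges of the matching RGB frame
canny = Image.fromarray(np.stack([edges] * 3, axis=-1))       # ControlNet expects 3 channels

controlnets = [
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16),
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16),
]
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnets, torch_dtype=torch.float16
).to("cuda")

keyframe = pipe(
    prompt="weathered bronze statue in an overgrown garden, cinematic lighting",  # placeholder
    image=[depth, canny],
    controlnet_conditioning_scale=[1.0, 0.6],  # weight each conditioning signal separately
    num_inference_steps=30,
).images[0]
keyframe.save("keyframe_0001.png")
```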
Taking “AR” videos and running them through a NeRF pipeline is a fun thing to do. Turns out things look even more 🔥 with Gaussian Splatting.
But don’t take my word for it. Behold this beauty:
BREAKING: YouTube announced a suite of AI tools for creators! From a dedicated video editing app and AI language dubbing to AI video generation and ideation. Here's the TL;DR:
1. AI Video with Dream Screen - Visually transport yourself anywhere by typing a prompt. This new…
🚫 This is not a NeRF. Not even a photogrammetry mesh. 📐 This is Gaussian Splatting at ~130 fps.
This was a fun 3D scan to do - I'm blown away by the recovery of complex vegetation and lighting effects.
Most NeRF methods aren't nearly as sharp, and photogrammetry would've…
Forget physics. Wake me up when Sora (or any other AI model) can do this...
This is almost indistinguishable from reality 🤯
Created by the talented Petter Steen using Blender, Unreal Engine, Maya, Substance 3D Painter, and Nuke, with 22 UDIMs at 4K resolution.
During Sam Altman and Ilya Sutskever's recent trip to Israel, folks didn't beat around the bush.
First question: spicy open source vs closed source debate 🔥
Followed immediately by: "Can you tell us more about the base model before you lobotomized it?" 🤣
🚘🌌 AI-Powered Joyride: Cyberpunk San Francisco 🌉✨
🏙️ The world is changing quickly. Brace yourself as reality and fantasy intertwine, with AI turning into lenses through which we'll see the world. 🌐🌆
⚙ Brought to life by Kaiber Video2Video (featuring ControlNet, Stable…
If you thought reskinning 2D videos was fun, how about reskinning 3D captures of the world?
That's exactly what you get when you combine NeRFs with InstructPix2Pix in this new paper by @ayaanzhaque et al.
Mini-thread 🧵
Such a dope use of Wonder Studio. Virtual production and VFX continue to get democratized 🪄
Just imagine how much longer this would’ve taken with a classical approach to rotoscoping, in-painting, 3D tracking, animation, rendering and compositing 😅
Is it just me or was today the first time OpenAI was unable to overshadow a Google AI announcement?
Gemini 1.5 Pro is pretty wild. Just dropped in an audio file + hour long video interview, and now it's helping me package it up for YouTube.
Multimodality + 1M context window…
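A minimal sketch of that kind of long-context, multimodal prompt using the google-generativeai Python SDK; the file name, model string, and prompt are placeholders rather than my actual workflow:

```python
# Upload a long interview video via the File API, then ask Gemini 1.5 Pro to package it.
import time
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

video = genai.upload_file(path="interview.mp4")
while video.state.name == "PROCESSING":   # wait until the upload finishes processing
    time.sleep(10)
    video = genai.get_file(video.name)

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    [video, "Suggest a YouTube title, description, chapter markers, and three Shorts ideas."]
)
print(response.text)
```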
Generative AI and NeRFs are super cool… BUT I’m continually blown away with the work happening in procedural 3D modelling 🤯
Especially given the fact that these capabilities are in a free (!) 3D tool like Blender. No fancy Houdini license required 👇
1/ 🌌 Roblox isn't just a video game — it's the world's largest UGC 3D platform. But wait, it gets better!
🧙♂️ Roblox is now embracing Generative AI, shattering barriers to 3D creation, and creating a TikTok-like flywheel 🔁
🧵 Thread: here's how the magic is unfolding in 3D 🪄
NeRF compilation time ft. drones, phones and oh my! Photogrammetry ain't new, but it's never looked better thanks to AI. Plus, it's so satisfying to see it breathe new life into old 3D scans! Watch the full 4 min video here:
#ComputerVision
#NeRF
#3D
#AI
🤯 Woah! This new AI paper is legit like Adobe Puppet Warp on steroids 📌
Text prompts aren’t the be-all and end-all of AI creation. DragGAN is a perfect example of giving creators fine-grain control over the AI image generation process.
Here’s the TL;DR summary:
• 🖌️…
All right folks, it's official!
After six incredible years at Google, I'm saying goodbye to the mothership to pursue my dream as a full-time creatorpreneur.
I'm beyond excited to ride the wave of AI-enabled creativity and see where this journey takes me! 🌊🚀
🧵 Thread
🥽 Apple Vision Pro is everything we hoped for and then some.
But at $3,499 you're either an early adopter or an AR/VR developer.
Yet two things change after today:
1. "Metaverse" will stop being a bad word (even though Apple didn't use the word) - sparking a resurgence in…
NeRFs + ControlNet + EbSynth = 🔥
🗿 Processed this 3D capture of a statue through Kaiber's video2video workflow
🤖 Wish there was a bit more consistency across keyframes, but damn these are pretty clean results!
🔥 Can't wait till this tech is running in real time at 60…
Seen that viral Lex/Zuck interview inside the metaverse? While novel, Meta's actually been working on this tech since 2019! Here's the TL;DR on how it works:
The idea behind it is simple: capture yourself once in immaculate detail -- whether it's with a fancy "light stage"…
The quality of relighting you can do with AI is pretty amazing. So much potential for virtual sets and visual effects.
Segmented the video and inferred PBR textures locally using SwitchLight Studio (on an NVIDIA RTX 6000 Ada GPU).
More videos on this awesome technique soon.
🌳🎮 The physical and digital worlds are converging. I used AI to transform the historic Lodhi Garden in India into a Minecraft landscape 🕌🌳
🧩🍃 I created a 3D NeRF of this serene garden using GoPro video, then transformed it into the blocky Minecraft aesthetic using…
Oh yeah. Gaussian Splatting inside Unreal Engine 5 at ~80 fps. This is gonna be fun :)
It's only been a few weeks and there's already a Unity plugin and a WebGL implementation, with a Blender plugin in the works. Loving the velocity. 🖌️
4D reality capture is making major strides. You probably saw everyone raving about gaussian splatting and debating whether it's a (Ne)RF or not... lol
Well how about making them dynamic? This new AI paper creates dynamic 3D scenes with characters and objects that you can attach…
Video in-painting in Pika 1.0 is goated. Perfect for mixed reality magic like this 🪄
Moreover: It’s wild to see this completely bypass the classical 3D animation and VFX pipeline. Imagine the hours saved.
Amazing experiment by @Martin_Haerlin
Zooming into the future and back in time. ⌛
Generative AI is wildly powerful.
Learn how to wield it. ⚔️
Because it's an amplifier of creativity, not a substitute.
If you don't? Well, the threadbois just might be right. 🙃
Roblox took yet another step towards becoming the YouTube of 3D, and generative AI is playing a central role.
What started off as code assistance and texture generation has now evolved into speaking entire 3D worlds into existence.
Jet fighters chasing UFOs over a NASA base? Not quite! I conjured up this wildly realistic scene using DALL·E combined with a little VFX magic, and the internet can't get enough. Tens of millions of views and counting on TikTok.
🧵 Here's how I used AI to mesmerize millions 🎥💫
I continue to be bullish on intermediate 3D representations to make controllable AI content.
Greybox and kitbash the world you actually want, and use generative AI to take it all the way.
The future is hybrid, and @Yokohara_h illustrates this well:
4k volumetric video @ 80fps on a 4090 🔥 Of course capture rn necessitates a synced multi-camera array; but the fact that we’re moving past uncanny GTA looking “videogrammetry” to photorealistic 4D radiance fields means distribution won’t be a problem :)
The future of reality capture and spatial media is bright.
It’s the closest thing we have to teleportation ✨
Check out this Quest 3 demo by @VoxelKei:
The craft of content creation evolved dramatically from YouTubing to TikToking. But hold on tight—Generative AI is about to make that colossal shift seem like a mere footnote in history, and revolutionize the way we create and consume content...
Then: "YouTubing" was a…
🖼️ Ok so “reskinning” the Real World with 3D Capture + Generative AI continues to be a blast ✨
🌐 Reality capture techniques like photogrammetry and NeRFs allow you to capture the spaces, places and objects you care about — creating a growing library of assets you can pull on…
The Tesla Bots Uprising!
It legit blows my mind that I can make this whole video in an hour 🤯
Breakdown of my AI creation process:
1. Make Skynet inspired visuals in @midjourney
2. Separate Tesla Bot video into clips using "Scene Edit Detection" in @adobe Premiere Pro
3.…
Top Gun Maverick. For a movie with no CGI, it sure has a lot of it.
A whopping 2,400 (!!) visual effects shots in fact.
But wait, wasn't everything filmed practically? 😉
Sure was. Yet almost every jet you see on-screen is CGI.
Let's dive into this "invisible" movie magic 👇
Got some more splats up and running - check out this gorgeous 3D scene in New Delhi running at 180 frames per second. Absolutely mind blowing! But how does this magic work exactly? 🪄
It all starts from a point cloud. You’ve probably seen these if you’ve used any photogrammetry…
Been hands-on with the beta of Adobe's cutting-edge Generative AI tool, and I'm impressed! 🤯
Here's a taste of the power of #AdobeFirefly 🎇 and what sets it apart in the increasingly crowded world of #AI art.
Thread 🧵🎨
✨ NeRF at night part 2: Urban Cityscapes 🌇
⚙️ Drone video ➡️ Trained + rendered with Luma AI web app
💡 Creators can use #NeRF to reframe & retime generic stock footage — so that drone shot transition hits *exactly* the way you want 🎥
#AI
#NeuralRendering
#Photogrammetry
I got to try this demo in person today at NVIDIA HQ — mind blown 🤯
Immaculate streaming quality, full-res CAD models with real-time ray tracing.
Stay tuned for a deep dive video after GTC. I’ll also be interviewing the VP of XR behind these efforts.
The level of creativity and manipulation possible with Gaussian Splatting is insane!
Check out this Unity demo by @Ruben_Fro — it’s giving me Oppenheimer meets Thanos Snap 🫰🔥
This weekend Google AI gave 60 Minutes an exclusive preview of their text-to-image model (Imagen) and text-to-video model (Phenaki).
Demo below. What do y'all think?
🕹️ Ever dreamed of building your own 3D Super Mario World? 🍄
Been using a 3D tool that makes that not just possible, but easy and super fun - introducing Rooms XYZ 🌐
And get this, it even has a ChatGPT integration! 🤯 🧵
Google DeepMind has access to the most complete models of the digital and physical world.
Digital World:
- Search
- Docs
- Drive
- Gmail
...
Physical World:
- YouTube
- Maps
- Waymo
- Photos
...
Brace yourself for the foundational models that will be trained from this treasure…
This video is 100% computer generated 🤯
What you’re witnessing is the immaculate fusion of 3D animation, visual effects, and photogrammetry — a far cry from your typical UFO hoaxes. 🛸
Typically, you capture that characteristically shaky real world footage and add in a…
Reality capture and generative AI make a powerful combination that is yet to be fully explored.
Once you can perceive the world, genAI gives you the ability to reskin & augment it -- whether it's for VFX or interior design.
Explore the possibilities in this compilation of my…
With Gaussian Splatting you get 3D editing support! So you can select, move, and delete stuff; apply shader fx. This type of editing has been tedious to do with NeRFs and their implicit black box representations.
Case in point (1/3) by @hybridherbst:
I love it when open source & closed source AI video goes toe to toe 💪🏽
With DragNUWA you can draw anchor points to tightly control your animations 📌
Ostensibly just as much control as Runway’s new multi motion brush. And it’s free to run locally:
⚙ Corridor basically made an open source video2anime workflow to pull off this video. Key tools they used:
- Stable Diffusion model + DreamBooth fine-tuning
- Unreal Engine + asset store 3D models
- Img2Img + DeFlickering effect
- Heaps of good ol' fashioned VFX compositing
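A rough sketch of the core img2img step in that list, assuming diffusers and a DreamBooth-fine-tuned SD checkpoint; the checkpoint path, frame names, and prompt token are placeholders, since Corridor's actual models and settings weren't published in this form:

```python
# Run each extracted greenscreen/Unreal frame through img2img with a fine-tuned anime model.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "./dreambooth-anime-checkpoint",   # hypothetical DreamBooth-fine-tuned model directory
    torch_dtype=torch.float16,
).to("cuda")

frame = load_image("plate_frame_0001.png").resize((768, 512))
styled = pipe(
    prompt="sks anime style, dramatic action scene",  # 'sks' = DreamBooth identifier token
    image=frame,
    strength=0.45,        # low strength preserves the original composition
    guidance_scale=7.5,
).images[0]
styled.save("anime_frame_0001.png")
# Deflickering and compositing then happen downstream, per the workflow above.
```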
NeRFs are like copy/paste for real life — capture once & reframe infinitely in post. But what if you want to bring other aspects of reality into 3D tools? Think weather or lighting. Well, @aviel08’s reverie uses AI to do exactly that — with just a photo!
Computer vision is so dope.
You can map your space in real-time with the phone in your pocket — creating a simplified 3D model with ‘semantically meaningful’ entities representing the room, walls, windows and furniture within that space. Basically, a 3D floor plan.
This is…
Reshape reality in under ~12 seconds using AI! 🤯
Qualcomm just brought the magic of ControlNet + Stable Diffusion from the cloud to your smartphone (!) 📱🔥
The best part? It’s all happening on-device; no cloud required!
Dive into my hands-on demo, sponsored by Qualcomm:
#1:…
This new AI paper on 2D to 3D by @JingxiangSun42 et al looks promising!
“DreamCraft3D leverages a 2D image generated from the text prompt and uses it to guide the stages of geometry sculpting and texture boosting. When sculpting the geometry, the view-conditioned diffusion…
Stable Diffusion 2.0 is out, and the feature I’m most excited about is #depth2img. Inferring a depth map to maintain structural coherence will be pretty sweet for all sorts of #img2img use cases. For instance...
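A minimal depth2img sketch with diffusers, assuming the Stable Diffusion 2 depth model on the Hugging Face Hub; the input photo and prompt are placeholders:

```python
# Depth is inferred from the input image automatically, so the scene's structure is preserved
# while the prompt restyles its contents.
import torch
from diffusers import StableDiffusionDepth2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth", torch_dtype=torch.float16
).to("cuda")

init = load_image("living_room.jpg")
out = pipe(
    prompt="cozy scandinavian living room, soft morning light",
    image=init,
    strength=0.7,   # how far the result is allowed to stray from the original
).images[0]
out.save("restyled.png")
```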
Big day for open source 3D capture! 🌐
@nerfstudioteam hits 1.0 and gets a *commercial-ready* Gaussian Splatting implementation
@GoogleAI open sourced Zip-NeRF — the highest quality 3D reconstruction pipeline, setting the stage for a future SMERF release
And so it begins! @Polycam3D just shipped support for Gaussian Splatting! It's fun to see your splats progressively load in - keeps you entertained while you wait. Web only for now, with mobile support forthcoming.
The velocity is wild, isn't it? Going from research to…
Your smartphone is a sensor collection contraption. Just using video for VFX discards all that rich metadata. Post-capture workflows like this put all that metadata to work. No need to manually match exposure, track the camera or estimate lighting:
HOLY CRAP 🤯 After 10 years in stealth R&D Google just dropped their new VR headset!
It has way better passthrough than the Apple Vision Pro! Works amazingly in low light too. And unlike every other headset -- it's like you're actually there! 👀
❗️BREAKING: Adobe has created a new 50-person AI research org called CAVA (Co-Creation for Audio, Video, & Animation).
I can’t help but wonder if OpenAI’s Sora has been a wake up call for Adobe to formalize and accelerate their video and multimodal creation efforts?
While…
📍 Sydney Opera House — #NeRF does a good job of recovering the waterfront and surrounding cityscape from (only) about 150 images pulled from an HD drone.
#ComputerVision
#AIart
BREAKING: Hugging Face is partnering with NVIDIA to add an AI training service right into their website. So instead of just picking models and running inference, you will soon be able to pick a pre-trained model and fine tune it using NVIDIA's DGX Cloud. Awesome news for devs!
NVIDIA's Omniverse was big news in Jensen Huang's SIGGRAPH 2023 keynote today.
Here's everything you need to know:
Omniverse is a 3D collaboration platform that enables the creation of massive virtual worlds with photorealistic graphics and (now) AI-generated content.…
Thought NeRFs were dead? Google DeepMind just dropped SMERF — streamable, multi-room NeRFs with cm-level detail. Oh and it works realtime on mobile 🤯
It’s a sweet spot between the speed of Gaussian Splatting and the quality of Zip-NeRF.
More below ⬇️
Meta has been building an absolutely wild dataset comprised entirely of first person or “egocentric” views 🤯
Not just imagery but also time-synced audio, IMU, eye gaze, head poses & 3D point clouds of the environment.
Perfect for training an AI Jarvis?