Alex Carlier

@alexcarliera

7,894
Followers
1,455
Following
390
Media
1,746
Statuses

3D/AI • Prev (NeRFs in the browser), AI research at @MetaAI, @ETH Zurich & @Amazon

Paris
Joined March 2016
Pinned Tweet
@alexcarliera
Alex Carlier
8 months
Another experiment with Gaussian Painters ✨🎨 By optimizing 3D Gaussian Splattings over separate images at several viewpoints, it is possible to get a Steganography effect! Three paintings are hidden in those gaussian splats
@alexcarliera
Alex Carlier
8 months
I optimized 3D Gaussian Splattings over a single picture on a 2D plane. I'm calling this "Gaussian Painters" 🎨✨ Watch the gaussian splats work to paint the Girl with a Pearl Earring! Here's how I did it (code below) ⬇️⬇️
12
74
645
21
224
1K
@alexcarliera
Alex Carlier
5 months
Upscale-A-Video was just released and it's so good! 🤯 It's a temporally consistent Diffusion Model for video Super-Resolution, and has some of the best results I've ever seen, look at how sharp those lines become! More details below ⬇️⬇️
43
280
2K
@alexcarliera
Alex Carlier
5 months
Google just revealed an ABSOLUTE depth estimation model 🤯 As opposed to recent depth models (Marigold, PatchFusion) which aim for maximum details, DMD aims to estimate the ABSOLUTE depth (in meters) within the image. More details below ⬇️⬇️
27
270
2K
@alexcarliera
Alex Carlier
5 months
Meta AI strikes again, with Relightable Gaussian Codec Avatars. This is an update to the Meta Codec Avatars 2.0, building on 3D Gaussian Splatting. As a result, we get fully relightable real-time avatars, accurate at the hair strand level 🤯 More details below ⬇️⬇️
45
368
2K
@alexcarliera
Alex Carlier
8 months
Gaussian Painters imported into #b3d as ellipsoids (3D Gaussian Splatting plugin for Blender - work in progress)
41
220
2K
@alexcarliera
Alex Carlier
5 months
Wow Marigold 🌼 depth estimation works extremely well! 🤯 And the best thing is that the checkpoints and code are fully available for commercial use! Try it out yourself! ⬇️⬇️
17
179
1K
@alexcarliera
Alex Carlier
4 months
This is literally magic 🤯 FMA-Net is a new AI method for video deblurring! It uses complex motion representation learning for spatio-temporally-variant restoration with kernels that are aware of motion trajectories. More info below ⬇️⬇️
59
153
1K
@alexcarliera
Alex Carlier
5 months
PatchFusion was just released. Compared to ZoeDepth, it predicts depth maps with much finer details, just look at the comparison below! 🤯 Its main contribution is a new Global-to-Local module and Consistency-Aware Training. More examples below (with code) ⬇️⬇️
22
137
899
@alexcarliera
Alex Carlier
5 months
LooseControl was just released and it's so good! 🔥 It enables depth-map conditioned image generation, but unlike ControlNet, the 3D boxes enable less strict control with simple bounding boxes. And look at how stable it is across frames! More examples (with code) below ⬇️⬇️
22
120
798
@alexcarliera
Alex Carlier
4 months
This scene was scanned using only 3 pictures 🤯 In my opinion, this was the biggest flaw of NeRFs & 3D Gaussian splats: they are trained from scratch every time with no knowledge of the world. With ReconFusion, we now acquire it from diffusion models. More examples below ⬇️⬇️
18
117
758
@alexcarliera
Alex Carlier
10 months
Having accurate keypoints is extremely important for many tasks in AI and 3D. Here I trained a reenactment network with @reshotAI keypoints!
16
138
742
@alexcarliera
Alex Carlier
7 months
New 3D Gaussian Splatting recording! Those metallic reflections and leather were captured REALLY well! When looking closer, you can also see how the watch hands are modeled with just a couple of elongated gaussians. #GaussianSplatting
19
70
751
@alexcarliera
Alex Carlier
5 months
This scene was scanned using only 3 pictures 🤯 In my opinion, this was the biggest flaw of NeRFs & 3D Gaussian splats: they are trained from scratch every time with no knowledge of the world. With ReconFusion, we now acquire it from diffusion models. More examples below ⬇️⬇️
27
90
708
@alexcarliera
Alex Carlier
4 months
SUPIR: Scaling Up to Excellence. Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild 🔥 Links below ⬇️⬇️
23
79
676
@alexcarliera
Alex Carlier
3 years
Made a little visualisation for my latest project on free-view one-shot image generation 🤩 Just pick a photo, and generate images with full control of rotation and facial expressions. Or choose a driving video and let the magic happen✨ @ylecun Try it for free using @litso_app !
18
111
650
@alexcarliera
Alex Carlier
9 months
I have written a tutorial on how to train your own "3D Gaussian Splatting" models! #GaussianSplatting Write me here if you're facing any issues. ⬇️⬇️
23
95
652
@alexcarliera
Alex Carlier
8 months
I optimized 3D Gaussian Splattings over a single picture on a 2D plane. I'm calling this "Gaussian Painters" 🎨✨ Watch the gaussian splats work to paint the Girl with a Pearl Earring! Here's how I did it (code below) ⬇️⬇️
12
74
645
@alexcarliera
Alex Carlier
4 months
OpenVoice was just released! 🤯 Given a short audio clip, it clones the reference voice and can generate speech in multiple languages, while having control over emotion, accent, rhythm, pauses, and intonation! Code & details below ⬇️⬇️
13
125
627
@alexcarliera
Alex Carlier
4 months
A new highly accurate OCR was just released and it's open-source! Surya is accurate to the line-level, and multilingual. Well, in this example, only the newspaper name was not detected 😅 Link below ⬇️⬇️
12
77
517
@alexcarliera
Alex Carlier
5 months
EfficientSAM was just released and it's fast! 💨 With 20x fewer params, it is now 20x faster than the original SAM segmentation model, while staying in the same accuracy range. See below for the project page and an interactive @huggingface space to try it out! ⬇️⬇️
14
111
496
@alexcarliera
Alex Carlier
10 months
Mediapipe vs my in-house keypoint detector for Reshot AI!
13
50
455
@alexcarliera
Alex Carlier
8 months
New 3D Gaussian Splatting capture at the Vintage Cars association in Versailles, from a 30-second recording. Some floaters were cleaned with my b3d plugin. Original output below ⬇️⬇️ #GaussianSplatting
13
40
456
@alexcarliera
Alex Carlier
8 months
Imagine this on the @Nike website. This is a 3D capture of the Nike ZoomX Vaporfly Next%, and visualizing this feels as real as touching the real shoe. 3D Gaussian Splattings are SO good at modeling fine structures, like in this case the transparent fabric. #GaussianSplatting
14
38
442
@alexcarliera
Alex Carlier
5 months
Copy-paste any object into an image with AI! 🤯 Here's one application of using AnyDoor for virtual try-on, but it's much more general and is designed to maintain texture details yet allow versatile local variations! Links below (with code!) ⬇️⬇️
9
89
427
@alexcarliera
Alex Carlier
4 months
Get crisp 3D Gaussian splats from blurry inputs 🤯 Capturing sharp videos is often impossible because of lens defocusing, object motion or camera shake. And 3DGS learns to fit this. "Deblurring 3DGS" optimizes a small MLP to model the scene blurriness! More details below ⬇️⬇️
8
75
432
@alexcarliera
Alex Carlier
4 months
An open-source version of AnimateAnyone was just released! (Moore-AnimateAnyone) I just tried it, the quality is not there just yet, but it has great potential! Links below ⬇️⬇️
3
68
425
@alexcarliera
Alex Carlier
7 months
3.8 MB 🤯 I tried the 3D Gaussian Splatting add-on for Unity by @aras_p on my Nike shoe capture, and when using the “Very low” quality, the file size becomes 3.8 MB with minimal visual loss. That’s the lowest file size I’ve seen for a 3DGS yet! #GaussianSplatting
18
45
423
@alexcarliera
Alex Carlier
5 months
Wow, generate infinite videos from a single image 🤯 WonderJourney creates coherently connected 3D scenes along a controllable camera trajectory. Look how running the code three times results in completely different videos! More examples below ⬇️⬇️
14
60
401
@alexcarliera
Alex Carlier
5 months
While everyone is waiting for AnimateAnyone, MagicAnimate was just released and it's really impressive! 🤯 It needs a single image and a motion video, and it produces an animated video! See below for more examples ⬇️⬇️
8
75
379
@alexcarliera
Alex Carlier
5 months
A new video generation paper just dropped 🤯 DreaMoving creates human dance videos given a target identity and posture sequences. But unlike AnimateAnyone and MagicAnimate, a full body picture is not required as input (only face + optional prompt). More details below ⬇️⬇️
19
88
375
@alexcarliera
Alex Carlier
5 months
So Impressive 🤯 StreamDiffusion, built on sd-turbo, can generate up to 150 images per second. Hardware configuration below ⬇️⬇️
14
47
374
@alexcarliera
Alex Carlier
8 months
How do 3D Gaussian Splatting models handle view-dependency? Using Spherical Harmonics. Here's how it works ⬇️⬇️ #GaussianSplatting
11
54
361
@alexcarliera
Alex Carlier
8 months
Using Gaussian Painters, you can also create a psychedelic illusion using two orthogonal images!
@alexcarliera
Alex Carlier
8 months
I optimized 3D Gaussian Splattings over a single picture on a 2D plane. I'm calling this "Gaussian Painters" 🎨✨ Watch the gaussian splats work to paint the Girl with a Pearl Earring! Here's how I did it (code below) ⬇️⬇️
12
74
645
8
40
357
@alexcarliera
Alex Carlier
4 months
A new real-time Radiance Field paper beating 3DGS was just released! 🔥 Similarly to 3D Gaussian Splatting, TRIPS optimizes a point-cloud with color, position & size that gets splatted to the screen. But it does so using a single trilinear write in an image pyramid. More info ⬇️
10
59
352
@alexcarliera
Alex Carlier
5 months
Adaptive Shells was just awarded best paper at SIGGRAPH Asia! 🙌 It's a new hybrid method between a NeRF and mesh, and achieves up to 300 FPS at HD resolution! More details below ⬇️⬇️
5
44
337
@alexcarliera
Alex Carlier
5 months
DreamBooth from a SINGLE image with perfect accuracy 🤯 Unlike specialized models like AnimateAnyone, DreamTuner is a general method for subject-driven generation, controllable via text or pose. But it works so well, it can create temporally consistent animations! More below ⬇️
10
56
333
@alexcarliera
Alex Carlier
5 months
Gaussian Head Avatars look amazing! 🤯 Capture a dynamic 3D Gaussian splat of a face, then animate it in 3D using another actor. Imagine the potential for the film industry! More examples below ⬇️⬇️
8
50
332
@alexcarliera
Alex Carlier
5 months
Google just announced VideoPoet: a multimodal video generation model! It's massively multimodal and can take as input: text, image, depth & optical flow or a masked video and is one of the first models that generates video + audio! More info below ⬇️⬇️
9
68
319
@alexcarliera
Alex Carlier
5 months
Meta AI's new real-time translation model is so impressive! 🤯 It streams the translation BEFORE waiting for the end of a sentence, with <2 seconds of latency. See how fast the translation appears after the speaker starts talking 💨 More details below (with code!) ⬇️⬇️
11
61
320
@alexcarliera
Alex Carlier
5 months
Wow this is cool! 🤯 PixelLLM generates image captions with pixel coordinates. Just a few years ago, the field of Explainable AI was amazed by simple heatmaps in the image classification task (single label prediction). This brings it to a whole new level! Project links below ⬇️
8
75
314
@alexcarliera
Alex Carlier
5 months
Wow! 🤯🔥 Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion Links below ⬇️⬇️
7
47
308
@alexcarliera
Alex Carlier
9 months
3D Gaussian Splatting is INSANELY good at fur rendering. Look at the fuzzy details here! Makes sense since it literally optimizes over small ellipsoid particles, as opposed to NeRF or photogrammetry. #GaussianSplatting
9
40
299
@alexcarliera
Alex Carlier
4 months
Deblurring 3D Gaussian Splatting is seriously amazing! 🔥 Can't wait to try it out on my captures!
@alexcarliera
Alex Carlier
4 months
Get crisp 3D Gaussian splats from blurry inputs 🤯 Capturing sharp videos is often impossible because of lens defocusing, object motion or camera shake. And 3DGS learns to fit this. "Deblurring 3DGS" optimizes a small MLP to model the scene blurriness! More details below ⬇️⬇️
8
75
432
3
29
298
@alexcarliera
Alex Carlier
10 months
Mediapipe vs @reshotAI keypoints!
11
42
275
@alexcarliera
Alex Carlier
5 months
Segment Anything Model (SAM) now runs at 30 FPS on an iPhone! 🤯 EdgeSAM is the first SAM variant that can run at over 30 FPS on an iPhone 14 with good quality. Look how accurately it segments tiny vegetables! Code and @HuggingFace demo below! ⬇️⬇️
3
57
273
@alexcarliera
Alex Carlier
4 months
3D Gaussian Splattings from a single image 🤯 Compared to recent novel-view synthesis approaches (like Stable Zero123) which generate novel views as images (causing inconsistencies), this work generates 3D Gaussians directly (via a pointcloud and triplane features). More below ⬇️
7
44
256
@alexcarliera
Alex Carlier
5 months
The first Consistency Model for Video was just released! 🤯 It enables video generation with as few as 4 sampling steps: generating 16 frames (at 256x256 resolution) takes only 10 seconds! So not real-time yet (as for images), but close! More details below! ⬇️⬇️
7
39
253
@alexcarliera
Alex Carlier
5 months
Wow this is cool! SMERF is a streamable NeRF that runs in real-time on any device with the quality of Zip-NeRF! Try it yourself! ⬇️⬇️
9
18
244
@alexcarliera
Alex Carlier
4 months
The code for DreamTalk was just released! Given any audio (text or song) and a single image frame, it generates a lip-synced animated video, copying the "expression" of a style reference. Links below ⬇️⬇️
5
49
242
@alexcarliera
Alex Carlier
10 months
Mediapipe vs @reshotAI keypoints. The interpolations are so satisfying!
12
29
241
@alexcarliera
Alex Carlier
4 months
A comparison between 3D Gaussian Splatting and the new TRIPS radiance field rendering method ⬇️ Can't wait to try this out on some of my scenes! 🔥
@alexcarliera
Alex Carlier
4 months
A new real-time Radiance Field paper beating 3DGS was just released! 🔥 Similarly to 3D Gaussian Splatting, TRIPS optimizes a point-cloud with color, position & size that gets splatted to the screen. But it does so using a single trilinear write in an image pyramid. More info ⬇️
10
59
352
6
24
237
@alexcarliera
Alex Carlier
4 months
This new video upscaling & deblurring method (FMA-Net) works insanely well! ⬇️
@alexcarliera
Alex Carlier
4 months
This is literally magic 🤯 FMA-Net is a new AI method for video deblurring! It uses complex motion representation learning for spatio-temporally-variant restoration with kernels that are aware of motion trajectories. More info below ⬇️⬇️
59
153
1K
2
43
233
@alexcarliera
Alex Carlier
4 months
Temporally stable 3D body MoCap with a SINGLE camera & occlusions!🔥 Obtaining globally coherent & plausible motions through occlusions is an incredibly difficult problem, but RoHM (by Meta and ETH Zurich) seems to have just solved this! More info below ⬇️⬇️
9
37
227
@alexcarliera
Alex Carlier
5 months
ControlNet with higher fidelity, faster training & lower GPU memory 🔥 SCEdit introduces a lightweight tuning module called SC-Tuner Project links below ⬇️⬇️
4
39
226
@alexcarliera
Alex Carlier
5 months
Photopea just announced their new Background Removal tool. It's available for FREE and works imo better than "Remove bg"! Wow 🤯 More examples below ⬇️⬇️
8
24
215
@alexcarliera
Alex Carlier
7 months
Wow the loading of #GaussianSplatting in @LumaLabsAI is so smart and satisfying! 😍 Only show the point cloud till fully loaded + progressive streaming from center to background
2
22
216
@alexcarliera
Alex Carlier
4 months
Just tried @LumaLabsAI's Text-to-3D (low-poly). 3D generation is now as fast as image generation, which you can then upscale for higher resolution 3D models. Looks promising! 🔥
@LumaLabsAI
Luma AI
4 months
🔥 Introducing Genie 1.0, our first step towards building multimodal AI. Genie is a text-to-3d model capable of creating any 3d object you can dream of in under 10 seconds with materials, quad mesh retopology, variable polycount, and in all standard formats! Try it on web and in…
107
509
3K
2
37
213
@alexcarliera
Alex Carlier
8 months
This is insane! @antimatter15 has implemented a WebGL viewer for 3D Gaussian Splattings. Unlike other implementations, this uses vanilla WebGL, and runs on any device in the browser (60+ FPS on my desktop, 30 FPS on mobile but no touch controls yet). Link to try it below ⬇️⬇️
@alexcarliera
Alex Carlier
8 months
Imagine this on the @Nike website. This is a 3D capture of the Nike ZoomX Vaporfly Next%, and visualizing this feels as real as touching the real shoe. 3D Gaussian Splattings are SO good at modeling fine structures, like in this case the transparent fabric. #GaussianSplatting
14
38
442
2
36
216
@alexcarliera
Alex Carlier
8 months
I created an upside-down optical illusion using Stable Diffusion XL ✨✨ Here's how I did it ⬇️⬇️ #SDXL
11
38
210
@alexcarliera
Alex Carlier
4 months
Try furniture in your living room before buying 🔥 Amazon just announced "Diffuse to Choose", a new diffusion-based image-conditioned inpainting model. It is fast and accurately copies fine details of the reference to the target image. More examples below ⬇️⬇️
8
39
208
@alexcarliera
Alex Carlier
5 months
High quality real-time NeRFs on your phone🤯 MERF is a new streamable memory-efficient approach that achieves real-time performance while equaling the quality of Zip-NeRF (and outperforming 3DGS) Try it out yourself below ⬇️⬇️
5
28
207
@alexcarliera
Alex Carlier
5 months
We made a small promo video for our upcoming #LEGO AR app! What do you think? ❤️ Join the waitlist ⬇️⬇️
21
21
204
@alexcarliera
Alex Carlier
5 months
This is so perfect 🔥✨ SDXL Auto FaceSwap by @fffiloni lets you create new images using the face of a source image. Try it out in this @huggingface space ⬇️⬇️
5
31
200
@alexcarliera
Alex Carlier
4 months
Reminds me of LooseControl, but for video! Controlling 3D cubes in a video would be 🔥!
@alexcarliera
Alex Carlier
4 months
Fully control AI videos with simple boxes 🤯 Recent approaches enable control with human pose or depth maps, but creating these maps is challenging. TrailBlazer (built on top of ZeroScope) enables control with boxes through spatial & temporal attention map editing. More below ⬇️
1
7
79
1
20
183
@alexcarliera
Alex Carlier
5 months
#GaussianSplatting from just two images in a single forward pass 🤯 PixelSplat predicts a dense probability distribution and samples Gaussians through a differentiable operation, allowing gradients to be back-propagated to the 3DGS representation. Completely insane results! More ⬇️⬇️
3
28
185
@alexcarliera
Alex Carlier
4 months
Tested Meta AI's new Audio2Photoreal: photorealistic animated 3D Codec Avatars from audio alone, sound on 🔊 Needs better face expressions, but very promising multi-view results! Code links below ⬇️⬇️
9
33
183
@alexcarliera
Alex Carlier
8 months
New 3D Gaussian Splatting capture of a park near Versailles. The hardest part in shooting outdoors is to find the perfect timing when no people are seen 😅 #GaussianSplatting
7
18
182
@alexcarliera
Alex Carlier
4 months
Neural radiance field methods like Zip-NeRF perform very poorly when given only a few images. This is because they learn the scene from scratch with no prior information about the world. ReconFusion fixes that! 🔥⬇️⬇️
@alexcarliera
Alex Carlier
4 months
This scene was scanned using only 3 pictures 🤯 In my opinion, this was the biggest flaw of NeRFs & 3D Gaussian splats: they are trained from scratch every time with no knowledge of the world. With ReconFusion, we now acquire it from diffusion models. More examples below ⬇️⬇️
18
117
758
2
27
177
@alexcarliera
Alex Carlier
4 months
DepthAnything was just released! 🔥 TLDR: it was trained on labeled + 62M unlabeled images. The encoder is initialized with DINOv2, a segmentation model helps to detect the sky (and set depth to ∞), and the unlabeled images are strongly distorted (color, blur, CutMix). More ⬇️⬇️
2
19
158
@alexcarliera
Alex Carlier
5 months
Get in-context descriptions of any object in an image 🤯 Osprey is a new Pixel Understanding model that can be integrated with Segment Anything Model (SAM) to obtain multi-granularity semantics of any region in an image! More info below ⬇️⬇️
4
30
153
@alexcarliera
Alex Carlier
8 months
I optimized the Spherical Harmonic coefficients of a grid of 100x100 gaussian spheres to fit pictures at different viewpoints, creating a lenticular card effect! ✨🌈 Once trained, the images are stored in the SH coefficients of SHARED spheres. Here's how it works (w. code) ⬇️⬇️
3
17
141
@alexcarliera
Alex Carlier
5 months
Generate high-resolution UV textures from just a mesh 🤯 Compared to other approaches, Paint3D manages to create UV textures without embedded illumination information. It does so using a novel coarse-to-fine approach. More info below ⬇️⬇️
2
26
143
@alexcarliera
Alex Carlier
5 months
Generate a 3D object from a few pictures only! 🤯 UpFusion also works taking a single picture as input, but providing a few UNPOSED images improves the fidelity to the input object! More details below ⬇️⬇️
3
21
130
@alexcarliera
Alex Carlier
9 months
Font resolution test on a 3D Gaussian Splatting capture. High frequency areas use more splats, while uniform ones are covered by just a handful of gaussians. Still mind blown that this uses only 3D ellipsoids. #GaussianSplatting
5
14
125
@alexcarliera
Alex Carlier
5 months
Another video showing Relightable Gaussian Codec Avatars in more detail
2
13
127
@alexcarliera
Alex Carlier
5 months
Ok this is cool! 🤯 Testing the @krea_ai AI Enhancer on one of our BricksAR #LEGO buildings to turn it into a realistic Parisian café 😍 Check below for more ⬇️⬇️ @Scobleizer
10
20
123
@alexcarliera
Alex Carlier
8 months
And here you go! A 3D Gaussian Splatting running in real-time in the browser on a 3-year-old iPhone. Try it out here ⬇️⬇️
@antimatter15
Kevin Kwok
8 months
@alexcarliera I just updated the camera controls and implemented touch controls on mobile!
3
0
10
11
13
119
@alexcarliera
Alex Carlier
4 years
Excited to announce that "DeepSVG: A Hierarchical Generative Network for Vector Graphics Animation" has been accepted to #NeurIPS ! If you're interested in vector graphics & sketch generation, feel free to check it out Code: Paper:
@hardmaru
hardmaru
4 years
DeepSVG: A Hierarchical Generative Network for Vector Graphics Animation Exciting work from @alxandrecarlier et al. Transformer-based hierarchical generative models learn latent representations of vector graphics, with nice applications in SVG animation.
7
64
291
2
34
118
@alexcarliera
Alex Carlier
10 months
Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction Paper page: Generating realistic human 3D reconstructions using image or video data is essential for various communication and entertainment applications. While existing methods achieved…
3
18
117
@alexcarliera
Alex Carlier
9 months
Tested 3D Gaussian Splatting on a capture from the Comics Art Museum, Brussels. 🇧🇪 Super impressive training convergence and real-time rendering! #GaussianSplatting
5
15
117
@alexcarliera
Alex Carlier
5 months
Ok this is wild for 3D artists. What if you could just get a ready-to-use material by simply clicking on a material in an IMAGE? This is what MaterialPalette solves. Especially useful when modeling from a reference image, just pick the correct material! Project page below ⬇️⬇️
5
16
113
@alexcarliera
Alex Carlier
8 months
There's something beautiful about visualizing impressionist artworks from Claude Monet using new technology. Peaceful scenes from the past that become 3d dreams, and which would be fun to play with in VR. Created using GaussianPainters. #GaussianSplatting
4
16
109
@alexcarliera
Alex Carlier
5 months
For reference, here are the ground truth depth maps for the previous images. DMD improves the relative depth error by up to 25% over ZoeDepth! One finding is that conditioning on the FOV is essential for disambiguating depth-scale.
4
8
108
@alexcarliera
Alex Carlier
4 months
3D Gaussian Splatting vs TRIPS (new radiance field method running at 60 FPS 🔥) ⬇️⬇️
@alexcarliera
Alex Carlier
4 months
A new real-time Radiance Field paper beating 3DGS was just released! 🔥 Similarly to 3D Gaussian Splatting, TRIPS optimizes a point-cloud with color, position & size that gets splatted to the screen. But it does so using a single trilinear write in an image pyramid. More info ⬇️
10
59
352
1
12
102
@alexcarliera
Alex Carlier
5 months
I tested the ControlNet for video (MagicAnimate) and here is my opinion: it works great but has some flaws. - the identity of the motion video leaks to the resulting video (and deforms body shape) - bad hands and face (unsurprisingly!) But a great first step for consistent…
@alexcarliera
Alex Carlier
5 months
While everyone is waiting for AnimateAnyone, MagicAnimate was just released and it's really impressive! 🤯 It needs a single image and a motion video, and it produces an animated video! See below for more examples ⬇️⬇️
8
75
379
7
16
93
@alexcarliera
Alex Carlier
9 months
New 3D Gaussian Splatting capture of an amethyst. Look at those reflections 😍 #GaussianSplatting
6
6
90
@alexcarliera
Alex Carlier
8 months
Forget spirals, I just invented AI emojis 👀
13
12
88
@alexcarliera
Alex Carlier
5 months
Those fonts do not exist 🤯 @AdobeResearch strikes again with VecFusion, a new diffusion approach for Vector Image generation. Here it generates missing glyphs from just a few examples! If you follow me from my DeepSVG paper you know how excited I am about this! More below ⬇️⬇️
3
9
87
@alexcarliera
Alex Carlier
8 months
Create your own upside-down optical illusions with Stable Diffusion XL! 🎨✨ I have created a colab notebook with the modified diffusion process for you to try! Link in the comments ⬇️⬇️
10
20
84
@alexcarliera
Alex Carlier
8 months
@Wizard_Mahi07 @MKBHD Options for more storage (6TB and 12 TB)
8
2
78
@alexcarliera
Alex Carlier
10 months
These cheeks/wrinkles do not exist! The @reshotAI reenactment network only takes the keypoints as input
4
10
79
@alexcarliera
Alex Carlier
5 months
Get prepared for more spambots in the coming months 🤯 @elonmusk Tencent just announced AppAgent, an LLM-based multimodal agent framework designed to control phone apps. This looks helpful for the visually impaired, but can also make bots much easier to deploy. More info ⬇️⬇️
2
19
76
@alexcarliera
Alex Carlier
8 months
The only way current LLMs can "reason" is through their own outputs. This is why prompts such as "Let's think step by step" can help in case of complicated logical requests. What OpenAI is probably working on for GPT-5 is an "internal thoughts" stream (that doesn't get printed…
@TimKietzmann
Tim Kietzmann
8 months
Well, actually, yes.
513
7K
153K
6
8
81
@alexcarliera
Alex Carlier
4 months
Fully control AI videos with simple boxes 🤯 Recent approaches enable control with human pose or depth maps, but creating these maps is challenging. TrailBlazer (built on top of ZeroScope) enables control with boxes through spatial & temporal attention map editing. More below ⬇️
1
7
79