In 2022, I finished 2 half marathons, 2 MTB races, and my computer vision research group was firing on all cylinders. In 2023 our lives were turned upside down by severe #LongCovid and #MECFS. 1/
Some of you may have seen this already 😀 I'm still very excited about Vid2Avatar, which we will present at #CVPR2023 this week. We propose a method to reconstruct detailed 3D avatars from monocular in-the-wild videos via self-supervised scene decomposition. 🧵👇 1/
Warning: personal and emotional thread. Today is #MEAwarenessDay and we are heartbroken 💔. Our smart, athletic, ball-of-energy 10-year-old is now bedridden and has been unable to stand or walk since January due to #LongCOVID and #MECFS. #LongCovidKids 👇🧵 1/
I am painfully aware that we are only two of 60+ million patients with #LongCovid (#pwLC) and tens of millions with #MECFS (#pwME). Yet despite these numbers, there has been zero public research funding and hence no cure, no therapy, and no quality-of-life care. 4/
To make matters worse, our younger son, who also has #LongCovid, has been unable to stand or walk since January and has been too sick to attend school for over a year. #LongCovidKids 3/
Today I spend nearly 24/7 lying in bed in a quiet, darkened room. Even the smallest activity, such as eating, can cause me to crash (a rapid worsening of symptoms). 2/
Sad and infuriating reality: 3 years in, and most medical staff remain ill-informed or even unwilling to acknowledge #LongCOVID. Grateful for the few doctors and therapists like @SusScro58355800 who are trying to help. It's time for a change. 4/
Last summer, months after Covid, he repeatedly developed headaches and felt ill, then suddenly got better. Specialist after specialist dismissed his symptoms, urging him to return to school and exercise. Little did we know, this triggered #PEM, worsening his condition. 2/
In our latest TPAMI paper we introduce FastSNARF! An efficient deformer for non-rigid shapes represented as neural fields (SDF, NeRF, etc.). It's a 1:1 replacement for our previous work, SNARF, but 150x faster! 🚀⚡ #FastSNARF #NeuralFields #NeuralAvatars 🧵👇 1/6
It saddens and embarrasses me how comparatively easy it is to secure funding for my own research in computer vision, while #MECFS researchers struggle to obtain support. May is #MEAwarenessMonth - let's push for equitable funding and resources. 6/
One step closer to realistic virtual humans: with gDNA we propose a method that generates diverse 3D virtual humans appearing in varied clothing and under full pose control! #CVPR2022
Paper:
Video:
Our latest toy is finally online 🎉🚀 Super excited about the upcoming computer vision and graphics research on neural avatars, human state estimation and much more, leveraging one of the most advanced volumetric video capture studios in academia! @ait_eth, @mapo1, @MarkusGross63
Great opening of the new state-of-the-art volumetric video capture facility to create #4Davatars with exciting fields of application, e.g. in transportation, aging society or health care. @mapo1 @OHilliges @MarkusGross63 @ETH_en
Need to scan and animate yourself? @YufengZzzz will present I M Avatar at #CVPR2022 (oral) - which learns implicit head avatars from monocular videos via correspondence search and differentiable ray-marching.
Paper:
Project:
Even worse: #MECFS, known since the 50s, yet still no substantial funding or approved medications. With an estimated 17.5 million sufferers pre-pandemic, this is an urgent crisis. It's time to invest real money into basic and translational research. Lives are at stake. 5/
Please help to spread understanding and support for millions - many of them kids and young adults - suffering from these debilitating conditions. Raising awareness is a first step towards better care and research. 💙 #LongCovidKids #LongCovid #MECFS 7/7
Yufeng (@YufengZzzz) is about to present our paper on learning implicit morphable head avatars from videos at #CVPR2022 (oral: Image & Video Synthesis Session, 1:30pm). Sadly I can't be there in person but don't fret, Yufeng can still make my Avatar say cheeky things 😆
For AIs to reason about human interaction with the world, we need generative models that can imagine a plausible future. In our upcoming #CVPR2022 paper, we introduce D-Grasp, an RL-based method that generates physically plausible grasp sequences.
Implicit surfaces are great for reconstructing 3D humans. However, editing is hard because the geometry is represented by a single continuous function. In our upcoming #CVPR2023 paper, we overcome this by combining the advantages of explicit and implicit representations. 🧵👇
The application window for the ETH-CLS PhD program is now open:
This is a great opportunity to do cutting-edge research in 3D computer vision and be co-supervised by fantastic advisors at ETH and the MPI (@Michael_J_Black, @JustusThies).
Just in time for #CVPR2023 we release Hi4D, a dataset of closely interacting humans in 4D, including 4D textured geometry, multi-view RGB images, registered parametric models, instance segmentation masks in 2D & 3D, and vertex-level contact annotations. 🧵👇
Introducing "Preface: A Data-driven Volumetric Prior for Few-shot Ultra High-resolution Face Synthesis". TL;DR: Novel views of faces at ultra-high 4K resolution from very few input images. @GoogleARVR @ETH_en @ait_eth #ICCV2023. See thread below.
For digital humans to come alive they need to be expressive! In our #CVPR2023 paper, X-Avatar, we propose an implicit human avatar model capable of capturing human body poses, hand gestures, facial expressions, and appearance 🕺🏻 🧵👇 1/
We are excited to announce the HANDS'23 workshop challenge () with AssemblyHands and ARCTIC at #ICCV2023! The challenge focuses on hand pose estimation and articulated hand-object reconstruction (deadline: September 15). See 🧵 for more details.
Building on Fast-SNARF, we take another important step towards real-time neural avatars. Our latest #CVPR paper InstantAvatar proposes a method to reconstruct animatable full-body avatars from a monocular video in less than 60 seconds. ⚡️ #NeuralAvatars #DigitalHumans 🧵👇 1/
AG3D is an important step on our quest towards fully generative models of realistic 3D humans. It is learned entirely from 2D image collections and requires no 3D supervision. To be presented at #ICCV2023.
Last night @emreaksan was awarded the Fritz-Kutter award for his outstanding PhD thesis. Congratulations! Very well deserved, Emre! Proud advisor moment.
We are excited to share EMDB, a novel dataset of 3D human poses for in-the-wild monocular videos, including global trajectories. Data and toolkit code are now available. More details in the thread below.
Project Page:
I'm honored to have been awarded an ERC Consolidator Grant! Looking forward to working with my superstar students @ait_eth on next-gen computer vision for collaborative AIs.
The implication: papers are for reading, discussing, ideating - not for counting. To all the junior folks: focus on doing good work, the rest will follow.
Excited to share our upcoming #CVPR2023 paper #PointAvatar that leverages learned, deformable point clouds to create high-fidelity 3D facial avatars efficiently from video. 👇
Efficiently create accurate and realistic 3D facial avatars that can be animated and lit in new environments. Recent implicit shape models look good but are slow to learn and render. Our #PointAvatar method is high quality and more efficient. Appearing at #CVPR2023. (1/9)
Our method generalizes to diverse human shapes, garment styles, and facial features, even under challenging poses and in complicated environments, without requiring any external segmentation. 5/
Congratulations to all #CVPR2022 authors! I'm happy to announce that we also got X/Y papers accepted. Fantastic work by my very talented students in the @ait_eth lab and great collaborators. Stay tuned.
We're looking for ELLIS PhD candidates at ETH Zurich @ait_eth. Topics in human-centric 3D computer vision include 3D pose, shape and appearance estimation and underlying methods such as neural fields, 3D generative models and more.
I would have loved to go to #CVPR2023 this year. Alas, our family’s health situation does not allow for that. For all of you attending: enjoy, stay healthy and go chat with my students and postdocs (even if fewer than 50% of the authors from @ait_eth got a visa).
Last year we introduced a principled method to learn articulated neural surfaces from scans (). This year at #CVPR2022 we show how to learn personalized avatars from a single RGB-D sequence: . Great work by @dong_zijian & @ChenGuo96!
Hand pose estimation often ignores temporal information. In TempCLR, we introduce a time-contrastive learning objective that significantly improves hand pose reconstruction from in-the-wild videos and improves cross-dataset generalization. #3dv2022 #handposeestimation
In the Schweizer Ärzteblatt, Dr. Strasser calls for better care for #LongCovid and #MECFS patients, and for more and better research! 👏🙌🙌 A paradigm shift is urgently needed. @SusScro58355800
We show that solving the problem entirely in 3D - and forgoing the use of 2D segmentation methods - leads to better results overall. We model both the human and the background in the scene jointly, parameterized via two layered neural fields. 3/
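For intuition, the two-layer idea can be sketched as alpha compositing along a ray from two density fields whose densities add where they overlap. This is an illustrative numpy toy, not the paper's implementation; the density/color arguments merely stand in for the learned human and background fields:

```python
import numpy as np

def composite_two_fields(sigma_h, sigma_b, color_h, color_b, deltas):
    """Alpha-composite samples along one ray from two layered density fields.

    sigma_h, sigma_b: (N,) densities of the human and background fields.
    color_h, color_b: (N, 3) per-sample colors from each field.
    deltas: (N,) distances between adjacent samples.
    Returns the rendered pixel color and a soft human 'responsibility'
    in [0, 1] that can act as a segmentation value for this pixel.
    """
    sigma = sigma_h + sigma_b                      # densities add when fields overlap
    alpha = 1.0 - np.exp(-sigma * deltas)          # per-sample opacity
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alpha)[:-1]])  # transmittance
    weights = trans * alpha                        # standard volume-rendering weights
    # Each sample's color is the density-weighted mix of the two fields.
    mix = (sigma_h / np.maximum(sigma, 1e-9))[:, None]
    color = mix * color_h + (1.0 - mix) * color_b
    pixel = (weights[:, None] * color).sum(axis=0)
    human_mask = (weights * mix[:, 0]).sum()       # soft human segmentation
    return pixel, human_mask
```

The soft per-pixel mask here illustrates why no external 2D segmentation is needed: the separation falls out of the rendering itself.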
I’m proud that two members of the AIT lab @ait_eth, Marcel Bühler @mc_buehler and Xu Chen @XuChen71058062, have been recognized as outstanding reviewers for #CVPR2023! I always say: if you want to receive good reviews, write good reviews.
We sincerely thank all reviewers, area chairs and senior area chairs who contributed their time to #CVPR2023! Reviewers who did an outstanding job are recognized here:
I really was super excited about finally seeing everyone in person at #CVPR2022, but in the end I decided not to go this year. Why? Some of my rationale below. (1/6)
Friends and I finally made it to @CVPR without a visa! 🎉 Just kidding 😄 We made a fun "group photo" of @ait_eth with generative fill @Adobe. Thanks @OHilliges for shooting 😉 Truly astonished by #GenerativeAI! More motivation to research generative 3D humans: maybe a 3D selfie one day 😉
We formulate a global optimization over the background and canonical human model. A coarse-to-fine sampling strategy for volume rendering and novel objectives to cleanly separate the human from the static background yield detailed and robust 3D human geometry reconstructions. 4/
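Coarse-to-fine sampling for volume rendering is commonly realized as inverse-CDF importance sampling over the coarse pass's rendering weights, so fine samples cluster near surfaces. A minimal generic sketch (illustrative only, not the paper's code):

```python
import numpy as np

def importance_resample(t_coarse, weights, n_fine, rng):
    """Draw fine sample depths proportional to coarse volume-rendering weights.

    t_coarse: (N,) depths of the coarse samples along a ray.
    weights:  (N,) their rendering weights (large near surfaces).
    Returns n_fine new depths concentrated where the weights are large.
    """
    pdf = weights / np.maximum(weights.sum(), 1e-9)
    cdf = np.cumsum(pdf)
    u = rng.random(n_fine)                            # uniform samples in [0, 1)
    idx = np.searchsorted(cdf, u, side='right')       # invert the CDF
    idx = np.clip(idx, 0, len(t_coarse) - 1)
    return np.sort(t_coarse[idx])
```

The fine network (or the same field) is then evaluated at the returned depths, spending capacity only where the coarse pass found mass.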
Coming up at #CVPR - hierarchical graph neural networks and physical self-supervision lead to neural garment simulation that generalizes across garments, handles varying topologies and models the dynamics of free-flowing clothing. 👇
📢📢 Have you been waiting for a garment modeling method that
- 👕👚👖 needs just one model for all types of garments
- 🥼 handles changing topology (e.g. buttons)
- 👗 realistically models loose garments?
Happy to present our #CVPR2023 paper HOOD
Project:
❄️ ARCTIC Challenge: focused on consistent motion reconstruction. The aim is to reconstruct 3D surfaces of two hands and of an articulated object in each video frame. Crucially, the hand-object contact must be consistent with the object articulation.
@ugoerra Currently, processing takes several hours, but keep in mind this is non-optimized research code. For quasi-real-time, check out #InstantAvatar (also at #CVPR2023).
Prof. Luc Van Gool, one of the world’s top AI and Computer Vision scientists, is joining @INSAITinstitute in Sofia. His arrival is made possible by the 6M EUR financial support provided by @SiteGround @CVPR
We're also excited to share a novel 3D human dataset - CustomHumans. Our dataset contains over 600 high-quality scans of humans alongside accurately registered SMPL-X parameters. 5/
Luckily a lot of my students will be there to present their awesome work. So go see their talks and posters and discuss with them. Already looking forward to a virtual #CVPR2022 and an in-person #CVPR2023. (6/6)
FastSNARF enables efficient training and inference of digital 3D humans. Powered by the latest release of aitviewer @ait_eth, we stream live network outputs in quasi-real-time - and so can you! 4/6
@cjmaddison Thanks Chris. I was very saddened when I found out you're suffering from post-COVID. Your outspokenness on the issue helped strengthen my resolve to share our story.
Excited to be part of this initiative to build a new world-class AI institute in Europe. I'm happy to hire up to two doctoral students (PhDs) at @INSAITInstitute who will work closely with me and the AIT lab @ait_eth at ETH Zurich.
We are excited to launch our world-class AI/CS PhD program (), the first of its kind in Eastern Europe, with @DeepMind PhD fellowships. Please share :)
PINA is a tribute to Pina Bausch; it's also the name of our method that creates personalised avatars from RGB-D videos - and makes them dance 🕺💃. Today at #CVPR2022, Session: 4.2, Poster: 171b. @dong_zijian, @ChenGuo96, Jie Song, @AutoVisionGroup, me
Tomorrow @sammy_j_c will present his work on vision-based handover of objects from humans to robots at #CVPR2023. Go see the talk and poster. 🦿 More info in the thread 👇
In our upcoming #CVPR2023 highlight paper, we propose the first framework to learn vision-based human-to-robot handovers. This task is challenging because it requires an accurate simulation of humans and a robot that can react to dynamic human movements. 🧵👇
Code, paper, and dataset are now publicly accessible for research purposes!
👨💻 Code:
📄 Paper:
📊 Dataset:
🎥 Video:
🖼️ Poster:
Come chat with us at #CVPR2023
How? We replace the MLP and leverage a compact voxel grid to represent the skinning weight field, thanks to its inherent smoothness. Plus, we exploit the linearity of LBS to streamline computations, slashing time without compromising accuracy. 2/6
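The idea can be sketched in a few lines of numpy: interpolate the skinning-weight grid trilinearly at a query point, then, exploiting the linearity of LBS, blend the bone matrices once before transforming the point. All names and shapes here are illustrative, not the released implementation:

```python
import numpy as np

def trilinear_weights(grid, p):
    """Trilinearly interpolate a skinning-weight voxel grid at one point.

    grid: (D, H, W, B) precomputed per-voxel weights for B bones.
    p:    (3,) query position in voxel coordinates.
    """
    i0 = np.clip(np.floor(p).astype(int), 0, np.array(grid.shape[:3]) - 2)
    f = p - i0                                   # fractional offsets in the cell
    w = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                corner = grid[i0[0] + dx, i0[1] + dy, i0[2] + dz]
                wt = ((f[0] if dx else 1 - f[0]) *
                      (f[1] if dy else 1 - f[1]) *
                      (f[2] if dz else 1 - f[2]))
                w = w + wt * corner
    return w                                     # (B,) interpolated skinning weights

def lbs_transform(p, weights, bone_mats):
    """Linear blend skinning: because LBS is linear in the weights, blend
    the bone transforms first, then apply one 4x4 matrix to the point."""
    M = np.tensordot(weights, bone_mats, axes=1)  # (4, 4) blended transform
    return (M @ np.append(p, 1.0))[:3]
```

Blending the matrices once (instead of transforming the point by every bone) is the kind of algebraic shortcut the tweet alludes to.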
Our method can also be used to correct imperfect labels (e.g., from existing datasets) or predictions from static grasp synthesis methods, and even image-based pose estimates.
Wanting the pandemic to be over and it actually being over, sadly, is not the same thing. My social media stream was full of folks coming back with Covid from #CHI2022 (also in NOLA), some getting stuck in quarantine hotels for significant amounts of time. (2/6)
Today Sammy will present our paper on learning natural and physically plausible human-object interaction sequences at #CVPR22 (Session: Faces and Gestures, Poster: 181b). Do stop by. @sammy_j_c, @mkocab_, @emreaksan, @HwangboJemin
Project
Explore the InstantAvatar project in depth:
Project page:
Github:
Read our paper:
Joint work by Tianjian Jiang, @XuChen71058062, Jie Song, and myself @ait_eth. 5/5
With the diminished value of social interaction and a #ClimateEmergency going on, I cannot justify this additional long-haul flight (already committed to another US trip this year). (5/6)
In FastSNARF, costly MLP evaluations and LBS calculations are replaced by a single tri-linear interpolation step - lightweight and super fast (18x faster). A custom CUDA implementation provides an additional speed-up factor of 8x! 3/6
And we're not stopping here either! Because Tianjian is awesome, he improved the method since acceptance and integrated it with SAM for accurate in-the-wild segmentation. Now InstantAvatar can reconstruct 3D avatars from monocular in-the-wild videos in just minutes! 4/5
We propose a hierarchical RL-based method that decomposes the task into low-level grasping control and high-level motion synthesis. This method can generate novel hand sequences that approach, grasp, and move an object to a desired location, while retaining human-likeness.
We supervise via unlabelled in-the-wild videos with a time-contrastive learning objective. We show that this 1) improves hand reconstruction and yields smoother estimates; 2) significantly improves cross-dataset generalization; 3) brings similar hand poses closer in feature space.
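A time-contrastive objective of this general flavor can be sketched as an InfoNCE loss in which temporally nearby frames are positives and the remaining frames in the clip are negatives. This is an illustrative sketch, not the TempCLR implementation; `pos_window` and `temperature` are made-up hyperparameters:

```python
import numpy as np

def time_contrastive_loss(feats, pos_window=2, temperature=0.1):
    """InfoNCE-style time-contrastive loss over per-frame features.

    feats: (T, D) L2-normalized features of consecutive video frames.
    Frames within pos_window of the anchor count as positives.
    """
    sim = feats @ feats.T / temperature              # (T, T) similarity logits
    T = len(feats)
    losses = []
    for i in range(T):
        others = [j for j in range(T) if j != i]
        logits = sim[i, others]
        logp = logits - np.log(np.sum(np.exp(logits)))  # log-softmax over non-anchors
        pos = [k for k, j in enumerate(others) if abs(j - i) <= pos_window]
        losses.append(-np.mean(logp[pos]))
    return float(np.mean(losses))
```

Minimizing this pulls features of temporally adjacent frames together, which is what encourages smooth per-frame pose estimates.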
This is hard since it requires reasoning about the complex articulation of the human hand and the physical interactions with the object (e.g., collisions, friction, gravity). Even ground-truth labels from existing grasping datasets do not lead to stable grasps.
🚗 AssemblyHands Challenge: the AssemblyHands dataset includes third-person and egocentric images of toy assembly and disassembly, along with 3D hand pose annotations. Participants must estimate 3D hand joints from an egocentric view.
We propose a hybrid representation that incorporates the advantages of parametric meshes and neural fields. A skinned, animatable mesh is used to store local features at each vertex. A global decoder generates high-frequency details from these features. 2/
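Schematically, querying such a hybrid representation means gathering learned features from nearby mesh vertices and handing their blend to a global decoder. The sketch below is illustrative only; the nearest-neighbor blending, the feature size, and the `decoder` callable are stand-ins, not the paper's architecture:

```python
import numpy as np

def query_hybrid(p, verts, feats, decoder, k=3):
    """Query the hybrid representation at a 3D point.

    verts:   (V, 3) posed mesh vertices (the explicit part).
    feats:   (V, F) learned per-vertex feature vectors.
    decoder: any callable (F,) -> float, standing in for the global
             decoder that turns a local feature into detail (e.g. an
             SDF offset or color).
    """
    d = np.linalg.norm(verts - p, axis=1)        # distances to all vertices
    idx = np.argsort(d)[:k]                      # k nearest vertices
    w = 1.0 / np.maximum(d[idx], 1e-8)           # inverse-distance weights
    w = w / w.sum()
    feat = w @ feats[idx]                        # (F,) blended local feature
    return decoder(feat)
```

Because the features live on mesh vertices, editing the explicit mesh (e.g. reposing it) transparently moves the implicit detail with it.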
We introduce the novel task of dynamic grasp synthesis: given an initial object pose and a static grasp reference, the goal is to move the object to an arbitrary goal position in a human-like, physically plausible way.
To facilitate future research on expressive avatars, we contribute the X-Humans dataset, containing 233 sequences (20 participants), a total of 35,500 frames. It includes high-quality textured scans of expressive human motions and the corresponding SMPL[-X] registrations. 4/
Our method reconstructs individual actors in dynamic interaction with complete geometry & detailed contact info. Thus we obtain 3D/2D instance segmentation masks, body model registrations, and vertex-level contact labels. 4/