Chen Wang Profile Banner
Chen Wang Profile
Chen Wang

@chenwang_j

2,480
Followers
687
Following
44
Media
285
Statuses

PhD student @StanfordSVL @StanfordAILab . Prev @NVIDIA @MIT_CSAIL . Robotics/Manipulation

Stanford, CA
Joined January 2021
Pinned Tweet
@chenwang_j
Chen Wang
2 months
Can we use wearable devices to collect robot data without actual robots? Yes! With a pair of gloves🧤! Introducing DexCap, a portable hand motion capture system that collects 3D data (point cloud + finger motion) for training robots with dexterous hands Everything open-sourced
21
131
625
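As an illustration of the kind of record a glove-plus-camera rig like this might produce, here is a minimal sketch of one capture frame (point cloud plus per-hand finger keypoints); the field names and shapes are assumptions, not the released DexCap format.

```python
# Hypothetical sketch of a single DexCap-style capture frame: a fused RGB-D
# point cloud plus per-hand keypoints from the mocap gloves. Field names and
# shapes are illustrative only.
import numpy as np

def make_frame(points_xyz, points_rgb, left_hand_kp, right_hand_kp, timestamp):
    """Bundle one time step of glove + camera data."""
    return {
        "timestamp": float(timestamp),                               # seconds since start
        "point_cloud": {
            "xyz": np.asarray(points_xyz, dtype=np.float32),         # (N, 3) in meters
            "rgb": np.asarray(points_rgb, dtype=np.uint8),           # (N, 3)
        },
        "left_hand_keypoints": np.asarray(left_hand_kp, dtype=np.float32),   # e.g. (21, 3)
        "right_hand_keypoints": np.asarray(right_hand_kp, dtype=np.float32),
    }

# Example: a dummy frame with an empty scene and zeroed hand keypoints.
frame = make_frame(np.zeros((1024, 3)), np.zeros((1024, 3)),
                   np.zeros((21, 3)), np.zeros((21, 3)), 0.0)
```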
@chenwang_j
Chen Wang
1 year
How to teach robots to perform long-horizon tasks efficiently and robustly🦾? Introducing MimicPlay - an imitation learning algorithm that uses "cheap human play data". Our approach unlocks both real-time planning through raw perception and strong robustness to disturbances!🧵👇
20
144
741
@chenwang_j
Chen Wang
8 months
How to chain multiple dexterous skills to tackle complex long-horizon manipulation tasks? Imagine retrieving a LEGO block from a pile, rotating it in-hand, and inserting it at the desired location to build a structure. Introducing our new work - Sequential Dexterity 🧵👇
27
92
473
@chenwang_j
Chen Wang
3 years
Can robots learn hand-eye coordination simply from teleoperated human demonstrations? Our new #IROS2021 paper presents a novel action space to enable this! Website: 1/9
3
37
107
@chenwang_j
Chen Wang
2 years
1/ Can we improve the generalization capability of a vision-based task planner with representation pretraining? Check out our RAL paper on learning to plan with pre-trained object-level representation. Website:
2
13
72
@chenwang_j
Chen Wang
10 months
The combination of LLM and VLM shows great potential in grounding “Where” and “How” to act in 3D observation space. Such capability allows the robot to perform visuomotor manipulation in a zero-shot fashion! Check out VoxPoser, amazing work led by @wenlong_huang at @StanfordSVL
@wenlong_huang
Wenlong Huang
10 months
How to harness foundation models for *generalization in the wild* in robot manipulation? Introducing VoxPoser: use LLM+VLM to label affordances and constraints directly in 3D perceptual space for zero-shot robot manipulation in the real world! 🌐 🧵👇
10
141
582
0
14
69
@chenwang_j
Chen Wang
4 months
Everything is soooo fast. Did all of these happen within 3 months???
@zipengfu
Zipeng Fu
4 months
Introducing 𝐌𝐨𝐛𝐢𝐥𝐞 𝐀𝐋𝐎𝐇𝐀🏄 -- Learning! With 50 demos, our robot can autonomously complete complex mobile manipulation tasks: - cook and serve shrimp🦐 - call and take elevator🛗 - store a 3lbs pot in a two-door cabinet Open-sourced! Co-led @tonyzzhao , @chelseabfinn
188
894
4K
4
4
60
@chenwang_j
Chen Wang
2 months
Wow, 200Hz low-level control frequency! Thanks for sharing the details, learned a lot!
@coreylynch
Corey Lynch
2 months
Finally, let's talk about the learned low-level bimanual manipulation. All behaviors are driven by neural network visuomotor transformer policies, mapping pixels directly to actions. These networks take in onboard images at 10hz, and generate 24-DOF actions (wrist poses and…
19
61
407
3
3
47
@chenwang_j
Chen Wang
2 months
Motion capture gloves, unlike vision-based tracking, are not affected by occlusions during hand-object interactions, perfect for mocap in daily activities. With an RGB-D camera, DexCap reconstructs 3D scenes and aligns motion data, all powered by a mini-PC in the backpack. 2/
1
3
36
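A minimal sketch of the alignment step described above, assuming a camera-to-world pose (e.g. from the RGB-D tracking) is available; the function and variable names are illustrative, not the DexCap code.

```python
# Bring glove keypoints and the point cloud into a common world frame by
# applying a 4x4 homogeneous camera-to-world transform. Illustrative only.
import numpy as np

def to_world(points_cam: np.ndarray, T_world_cam: np.ndarray) -> np.ndarray:
    """points_cam: (N, 3) in camera frame; T_world_cam: (4, 4) homogeneous pose."""
    homo = np.concatenate([points_cam, np.ones((points_cam.shape[0], 1))], axis=1)  # (N, 4)
    return (T_world_cam @ homo.T).T[:, :3]

# Dummy usage: an identity pose leaves the points unchanged.
pts = np.random.rand(5, 3)
assert np.allclose(to_world(pts, np.eye(4)), pts)
```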
@chenwang_j
Chen Wang
2 months
We then retarget the mocap data to the robot embodiment. This includes (1) Observation retargeting by switching the camera system from human to robot. (2) Action retargeting by matching fingertip positions with IK. (3) Bridging the visual gap by including robot point clouds. 3/
1
2
36
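A hedged sketch of the fingertip-matching idea in step (2): solve for robot hand joint angles whose forward-kinematics fingertip positions are closest to the human fingertips. `robot_fingertip_fk` is a hypothetical placeholder for the real robot hand's kinematics model.

```python
# Action retargeting by fingertip matching, posed as a small optimization.
# The FK function below is a toy stand-in, not a real hand model.
import numpy as np
from scipy.optimize import minimize

def robot_fingertip_fk(q: np.ndarray) -> np.ndarray:
    """Placeholder FK: maps joint angles (16,) to fingertip positions (5, 3)."""
    return q[:15].reshape(5, 3)  # toy mapping for illustration only

def retarget(human_fingertips: np.ndarray, q_init: np.ndarray) -> np.ndarray:
    def cost(q):
        return np.sum((robot_fingertip_fk(q) - human_fingertips) ** 2)
    return minimize(cost, q_init, method="L-BFGS-B").x

q_robot = retarget(np.random.rand(5, 3), np.zeros(16))
```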
@chenwang_j
Chen Wang
2 months
Exactly! Check out all the hardware/code/data details!
@DrJimFan
Jim Fan
2 months
DexCap: a $3,600 open-source hardware stack that records human finger motions to train dexterous robot manipulation. It's like a very "lo-fi" version of Optimus, but affordable to academic researchers. This isn't teleoperation: data collection is decoupled from the robot…
23
161
748
0
3
32
@chenwang_j
Chen Wang
8 months
We hope Sequential Dexterity paves the path for future research on long-horizon dexterous manipulation. Feel free to check out our code! Website & Paper: Code: Work done w/ Yuanpei Chen, @drfeifei , and Karen Liu at @StanfordAILab .
0
6
29
@chenwang_j
Chen Wang
8 months
Designing such a wearable and portable exoskeleton system is extremely hard. And it's low-cost ($300 per arm)! Another great work from @haoshu_fang !
@haoshu_fang
Hao-Shu Fang
8 months
🤖Joint-level control + portability = robot data in the wild! We present AirExo, a low-cost hardware, and showcase how in-the-wild data enhances robot learning, even in contact-rich tasks. A promising tool for large-scale robot learning & TeleOP, now at !
6
37
206
0
2
24
@chenwang_j
Chen Wang
8 months
Thanks for Jim's amazing summary of our key insight - the bi-directional optimization for skill chaining. Check out our code () and video () to see how we make it work!
@DrJimFan
Jim Fan
8 months
This is "Sequential Dexterity", a neural network that controls a robot arm to build legos given a manual 🤖 To do this task, the robot needs to chain together multiple skills (grasping, re-orienting, pushing, etc.) and execute without compounding error. I find some very simple…
10
68
331
0
3
24
@chenwang_j
Chen Wang
8 months
The core of the system is a learning-based transition feasibility function that progressively fine-tunes the sub-policies (learned with RL) to improve chaining success. It can also be used during skill selection to re-plan from failures and bypass redundant stages.
2
3
23
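A minimal sketch of how such a feasibility function could drive skill selection and stage skipping; `feasibility` and `goal_reached` are hypothetical callables standing in for the learned components.

```python
# Pick the most feasible remaining sub-policy for the current state, skipping
# stages whose goal is already satisfied; return None to trigger re-planning.
from typing import Callable, Dict

def select_skill(state,
                 skills: Dict[str, Callable],
                 feasibility: Callable[[str, object], float],
                 goal_reached: Callable[[str, object], bool],
                 threshold: float = 0.5):
    """Return the name of the next skill to run, or None if no skill is feasible."""
    candidates = [name for name in skills if not goal_reached(name, state)]
    if not candidates:
        return None  # task complete
    best = max(candidates, key=lambda name: feasibility(name, state))
    return best if feasibility(best, state) >= threshold else None
```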
@chenwang_j
Chen Wang
4 months
Probably the best manipulation result I've ever seen. Huge congrats to the team!
@tonyzzhao
Tony Z. Zhao
4 months
Introducing 𝐌𝐨𝐛𝐢𝐥𝐞 𝐀𝐋𝐎𝐇𝐀🏄 -- Hardware! A low-cost, open-source, mobile manipulator. One of the most high-effort projects in my past 5yrs! Not possible without co-lead @zipengfu and @chelseabfinn . At the end, what's better than cooking yourself a meal with the 🤖🧑‍🍳
236
1K
5K
0
2
23
@chenwang_j
Chen Wang
2 months
We train a point cloud-based Diffusion Policy with retargeted human mocap data only. The robot controls both hands (46-dim action space) to perform tasks including collecting tennis balls🎾 and packaging objects🎁. All the policies are learned without any teleoperation data. 4/
1
1
22
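A rough sketch (not the released code) of a point-cloud-conditioned policy head with a 46-dim action space: a PointNet-style encoder whose embedding conditions an action denoiser, in the spirit of a Diffusion Policy; dimensions and module names are assumptions.

```python
import torch
import torch.nn as nn

class PointCloudEncoder(nn.Module):
    """Shared per-point MLP followed by max pooling (PointNet-style)."""
    def __init__(self, out_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, out_dim))

    def forward(self, pts):                       # pts: (B, N, 3)
        return self.mlp(pts).max(dim=1).values    # (B, out_dim)

class ActionDenoiser(nn.Module):
    """Predicts the noise on a 46-dim action, conditioned on the observation."""
    def __init__(self, obs_dim=256, action_dim=46):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + action_dim + 1, 512), nn.ReLU(),
            nn.Linear(512, action_dim))

    def forward(self, obs_emb, noisy_action, t):  # t: (B, 1) diffusion step
        return self.net(torch.cat([obs_emb, noisy_action, t], dim=-1))

enc, denoiser = PointCloudEncoder(), ActionDenoiser()
obs_emb = enc(torch.randn(2, 1024, 3))
eps_hat = denoiser(obs_emb, torch.randn(2, 46), torch.rand(2, 1))
```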
@chenwang_j
Chen Wang
2 months
However, DexCap is not yet ready for tasks that require applying force, as positional data alone is insufficient. Therefore, we use DexCap for human-in-the-loop correction during rollouts. Within 30 trials of corrections, our robot can prepare tea🍵 and use scissors✂️. 7/
1
1
22
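A minimal sketch of a human-in-the-loop correction rollout, assuming hypothetical `env`, `policy`, and `get_human_override` interfaces; corrected transitions are stored for later fine-tuning.

```python
# Run the learned policy, but let an operator override the action when
# intervening; corrected (obs, action) pairs are collected for fine-tuning.
def rollout_with_correction(env, policy, get_human_override, max_steps=500):
    corrections = []
    obs = env.reset()
    for _ in range(max_steps):
        action = policy(obs)
        override = get_human_override()      # None if the operator does not intervene
        if override is not None:
            corrections.append((obs, override))
            action = override
        obs, done = env.step(action)
        if done:
            break
    return corrections                        # used later to fine-tune the policy
```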
@chenwang_j
Chen Wang
7 months
Using a domain definition language to set multi-task evaluation goals is very scalable and saves a lot of engineering effort! Kudos to @yifengzhu_ut and the team! If you missed it, we released the MimicPlay code and tested it on LIBERO a while ago. Check out
@yifengzhu_ut
Yifeng Zhu 朱毅枫
7 months
We are thrilled to announce LIBERO, a lifelong robot learning benchmark to study knowledge transfer in decision-making and robotics at scale! 🤖 LIBERO paves the way for prototyping algorithms that allow robots to continually learn! More explanations and links are in the 🧵
2
58
245
0
1
19
@chenwang_j
Chen Wang
8 months
How to acquire generalizable robotic skills has garnered much attention recently. We invite you to join our CoRL 2023 workshop - Towards Generalist Robots. Don't miss the chance to hear from our amazing speakers and share your insights through paper submission (Before Oct. 16)!
@zhou_xian_
Zhou Xian
8 months
🤖How far are we from 𝐠𝐞𝐧𝐞𝐫𝐚𝐥𝐢𝐬𝐭 𝐫𝐨𝐛𝐨𝐭𝐬? 𝐀𝐧𝐧𝐨𝐮𝐧𝐜𝐢𝐧𝐠 the 1st Workshop on "Towards Generalist Robots" at #CoRL2023 ! Join us to discuss how to scale up robotic skill learning, with an amazing lineup of speakers! CfP: Details 👇
4
34
130
0
0
19
@chenwang_j
Chen Wang
2 years
Our workshop on "Overlooked Aspects of Imitation Learning" is happening this Monday, June 27, from 11:00am - 12:30pm EST at #RSS2022 . You are welcome to join us virtually or in person, and don't forget it is the EST time zone!
@AjayMandlekar
Ajay Mandlekar
2 years
Our #RSS2022 workshop on "Overlooked Aspects of Imitation Learning" is this Monday June 27 - join us virtual or in-person! We have a wonderful lineup of speakers ( @ancadianadragan @ankurhandos @chelseabfinn @LerrelPinto Mohi Khansari @shimon8282 ) See:
0
9
46
0
3
19
@chenwang_j
Chen Wang
3 months
Another amazing work shows the tremendous potential of "collecting robot data without a robot". These portable low-cost systems really pave the way for scaling up data collection. Super looking forward to what further advancements will emerge.
@chichengcc
Cheng Chi
3 months
Can we collect robot data without any robots? Introducing Universal Manipulation Interface (UMI) An open-source $400 system from @Stanford designed to democratize robot data collection 0 teleop -> autonomously wash dishes (precise), toss (dynamic), and fold clothes (bimanual)
42
342
2K
1
1
19
@chenwang_j
Chen Wang
2 months
We hope DexCap paves the path for future research on scaling up robot data with wearable devices. The code, data, and hardware are open-sourced at 🌐 Work done w/ @HaochenShi74 @KenWangWeizhuo @RuohanZhang76 @drfeifei Karen Liu @StanfordAILab @StanfordSVL
1
1
18
@chenwang_j
Chen Wang
1 year
We observe that human play data is fast and easy to collect, and it covers diverse behaviors and situations. On the other hand, robot data is slow and limited to collect, but it has no embodiment gap. MimicPlay is a method designed to combine the best of both worlds. (2/N)
1
2
17
@chenwang_j
Chen Wang
8 months
Despite being trained only in simulation with a few task objects, our system demonstrates generalization capability to novel object shapes and is able to zero-shot transfer to a real-world robot equipped with a dexterous hand.
1
1
18
@chenwang_j
Chen Wang
8 months
Thanks @_akhaliq for sharing! Feel free to check out more at
@_akhaliq
AK
8 months
Sequential Dexterity: Chaining Dexterous Policies for Long-Horizon Manipulation paper page: Many real-world manipulation tasks consist of a series of subtasks that are significantly different from one another. Such long-horizon, complex tasks highlight…
2
27
108
0
5
17
@chenwang_j
Chen Wang
1 year
Been waiting for this thread for a long time! Fantastic work by @tonyzzhao showing the great potential of this bimanual teleoperation system. The conditional action synthesis makes a lot of sense in handling human demos for long-horizon tasks!
@tonyzzhao
Tony Z. Zhao
1 year
Introducing ALOHA 🏖: 𝐀 𝐋ow-cost 𝐎pen-source 𝐇𝐀rdware System for Bimanual Teleoperation After 8 months iterating @stanford and 2 months working with beta users, we are finally ready to release it! Here is what ALOHA is capable of:
94
711
3K
1
2
17
@chenwang_j
Chen Wang
2 months
DexCap is fully portable and can scale up data collection in the wild. By collecting data with multiple objects in diverse environments, the learned policy can generalize to unseen objects for the same task. 5/
1
0
17
@chenwang_j
Chen Wang
1 year
MimicPlay is a hierarchical imitation learning algorithm that leverages cheap and non-labeled human play data (10 minutes) for learning the high-level planner and a small amount of robot data (20 demonstrations ~ 20 minutes) for learning a plan-guided low-level controller. (3/N)
2
1
16
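A minimal sketch of the two-level structure described in this thread: a latent planner over (current, goal) images and a plan-guided low-level controller. Module names and dimensions are assumptions, not the MimicPlay release.

```python
import torch
import torch.nn as nn

class LatentPlanner(nn.Module):
    """High-level planner: (current image embedding, goal image embedding) -> latent plan."""
    def __init__(self, img_dim=512, plan_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * img_dim, 256), nn.ReLU(), nn.Linear(256, plan_dim))

    def forward(self, cur_emb, goal_emb):
        return self.net(torch.cat([cur_emb, goal_emb], dim=-1))

class PlanGuidedController(nn.Module):
    """Low-level controller: (observation embedding, latent plan) -> action."""
    def __init__(self, obs_dim=512, plan_dim=64, act_dim=7):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + plan_dim, 256), nn.ReLU(), nn.Linear(256, act_dim))

    def forward(self, obs_emb, plan):
        return self.net(torch.cat([obs_emb, plan], dim=-1))

planner, controller = LatentPlanner(), PlanGuidedController()
action = controller(torch.randn(1, 512), planner(torch.randn(1, 512), torch.randn(1, 512)))
```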
@chenwang_j
Chen Wang
9 months
Fortunate to have early access to try out this infra. Indeed a very useful tool for IG and other simulations!
@QinYuzhe
Yuzhe Qin
9 months
Just dropped sim_web_visualizer! 🚀 Transform the way you view simulation environments right from your web browser like Chrome. Dive into more examples on our Github:
7
36
237
0
0
16
@chenwang_j
Chen Wang
2 months
Also check out the fun failure modes of our robot. 8/
1
1
15
@chenwang_j
Chen Wang
18 days
Great to see such natural power grasping and tool-use motions, and super smooth! Now, robots can even play video games with a joystick controller 🤣 Amazing results highlighting the strength of vision+tactile. Congrats @ToruO_O !
@ToruO_O
Toru
18 days
Imitation learning works™ – but you need good data 🥹 How to get high-quality visuotactile demos from a bimanual robot with multifingered hands, and learn smooth policies? Check our new work “Learning Visuotactile Skills with Two Multifingered Hands”! 🙌
7
67
273
1
1
15
@chenwang_j
Chen Wang
2 months
DexCap enables fast data collection, approximating the speed of natural human motion. Moreover, the collection process does not require costly robot hardware. 6/
1
1
14
@chenwang_j
Chen Wang
1 year
Pretty cool results with diffusion models + visuomotor imitation learning! It is always painful to learn from multi-modal demos. Seems like iterative diffusion policy is a promising direction! Congrats @chichengcc @SongShuran
@chichengcc
Cheng Chi
1 year
What if the form of visuomotor policy has been the bottleneck for robotic manipulation all along? Diffusion Policy achieves 46.9% improvement vs prior StoA on 11 tasks from 4 benchmarks + 4 real world tasks! (1/7) website : paper:
9
100
534
0
1
13
@chenwang_j
Chen Wang
28 days
Truly stunning results in manipulating deformable objects. The motion of spreading out the cloth is fascinating!
@tonyzzhao
Tony Z. Zhao
28 days
Introducing 𝐀𝐋𝐎𝐇𝐀 𝐔𝐧𝐥𝐞𝐚𝐬𝐡𝐞𝐝 🌋 - Pushing the boundaries of dexterity with low-cost robots and AI. @GoogleDeepMind Finally got to share some videos after a few months. Robots are fully autonomous filmed in one continuous shot. Enjoy!
56
330
1K
0
0
13
@chenwang_j
Chen Wang
8 months
Considering how fragile the Allegro hand is, the sample efficiency of the online learning is truly amazing! Plier cutting is a cool result highlighting the potential of dexterous hands using human tools.
@LerrelPinto
Lerrel Pinto
8 months
We just released TAVI -- a robotics framework that combines touch and vision to solve challenging dexterous tasks in under 1 hour. The key? Use human demonstrations to initialize a policy, followed by tactile-based online learning with vision-based rewards. Details in🧵(1/7)
11
72
305
0
2
12
@chenwang_j
Chen Wang
2 months
We especially thank @kenny__shaw @anag004 @pathak2206 for open-sourcing the LEAP Hand project. Having a customizable and low-cost dexterous hand benefits our project a lot!
0
0
13
@chenwang_j
Chen Wang
1 year
We found MimicPlay significantly outperforms prior methods in performance and sample efficiency. With only 20 robot demonstrations and a planner learned with 10 minutes of human play data (shared across tasks), MimicPlay can perform long-horizon tasks such as baking foods. (6/N)
1
2
12
@chenwang_j
Chen Wang
1 year
We hope MimicPlay paves the path for future research to scale up robot learning with affordable human costs. 🌐Project site 📄PDF (9/N)
1
0
12
@chenwang_j
Chen Wang
18 days
Amazing cross-platform system for multiple end-effectors. Nice work @xuxin_cheng and super fast move!
@xuxin_cheng
Xuxin Cheng
18 days
 🤖Introducing 📺𝗢𝗽𝗲𝗻-𝗧𝗲𝗹𝗲𝗩𝗶𝘀𝗶𝗼𝗻: a web-based teleoperation software!  🌐Open source, cross-platform (VisionPro & Quest) with real-time stereo vision feedback.  🕹️Easy-to-use hand, wrist, head pose streaming. Code:
12
80
350
1
3
12
@chenwang_j
Chen Wang
1 year
Visual correspondence paves the way for numerous downstream tasks. Learning from internet-scale video further reveals its full strength! Very interested in trying it out for manipulation and more! Nice work from @agrimgupta92 🚀
@agrimgupta92
Agrim Gupta
1 year
How should we leverage internet videos for learning visual correspondence? In our latest work we introduce SiamMAE: Siamese Masked Autoencoders for self-supervised representation learning from videos. web: paper: 👇🧵
16
122
484
0
0
11
@chenwang_j
Chen Wang
3 months
Huge congrats to @DrJimFan and @yukez on the exciting move! Really enjoyed working with you last year and can't wait to see what this team will achieve!🦾
@DrJimFan
Jim Fan
3 months
Career update: I am co-founding a new research group called "GEAR" at NVIDIA, with my long-time friend and collaborator Prof. @yukez . GEAR stands for Generalist Embodied Agent Research. We believe in a future where every machine that moves will be autonomous, and robots and…
241
471
4K
0
0
10
@chenwang_j
Chen Wang
3 months
Amazing robot hand hardware and tracking results!
@BostonDynamics
Boston Dynamics
3 months
Can't trip Atlas up! Our humanoid robot gets ready for real work combining strength, perception, and mobility.
225
1K
5K
0
0
11
@chenwang_j
Chen Wang
3 years
I’ll present our work on learning human-robot collaboration in simulation today @corl_conf (Wed 5:00-6:00pm GMT, 9:00-10:00am PST). Drop by the poster (Session V Booth 4) and check out our paper () with code () to learn more!
0
2
11
@chenwang_j
Chen Wang
10 months
The new robomimic v0.3 update is out! We have included several new algorithms and features. Check them out and let us know if you have any feedback!
@AjayMandlekar
Ajay Mandlekar
10 months
robomimic v0.3 released - the most major upgrade yet! New features: 🧠 New Algorithms (BC-Transformer, IQL) 🤖 Full compatibility with robosuite v1.4 and @DeepMind 's MuJoCo bindings 👁️ Pre-trained image reps 📈 wandb logging @weights_biases try it out:
1
19
79
1
0
10
@chenwang_j
Chen Wang
1 year
We further test MimicPlay in more challenging multi-task learning settings, where we found that MimicPlay has the smallest performance drop compared to prior methods. This result highlights MimicPlay's capability to handle diverse tasks within one model. (8/N)
1
1
9
@chenwang_j
Chen Wang
1 year
More importantly, after training multiple tasks within one model, MimicPlay is able to generalize to new tasks with unseen temporal compositions. (7/N)
1
1
9
@chenwang_j
Chen Wang
2 years
Super cool design for haptic feedback!
0
0
8
@chenwang_j
Chen Wang
8 months
Very neat design of the fingertip tactile sensor. Beautiful results!
@HaozhiQ
Haozhi Qi
8 months
🦾 Our robot hand can rotate objects over 6+ axes in the real-world! Introducing RotateIt (CoRL 2023), a Sim-to-Real policy that can rotate many objects over many axes, using vision and touch! Check it out: . Paper: . #CoRL2023
7
28
184
1
1
8
@chenwang_j
Chen Wang
4 months
Nice setup for bringing the robot arm out of the lab. Congrats!
@Haoyu_Xiong_
Haoyu Xiong
4 months
Introducing Open-World Mobile Manipulation 🦾🌍 – A full-stack approach for operating articulated objects in open-ended unstructured environments: Unlocking doors with lever handles/ round knobs/ spring-loaded hinges 🔓🚪 Opening cabinets, drawers, and refrigerators 🗄️ 👇…
30
103
784
1
1
8
@chenwang_j
Chen Wang
1 year
The high-level planner is first trained as a goal-conditioned policy. It takes the current and goal images from human play data and outputs a latent plan. We also use a KL-loss to minimize the visual gap between human and robot data. (4/N)
1
1
8
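A hedged sketch of the KL-style alignment mentioned above, treating the human- and robot-branch plan embeddings as Gaussians and penalizing their divergence so plans from the two domains share one space; shapes and names are illustrative.

```python
import torch
from torch.distributions import Normal, kl_divergence

def plan_alignment_kl(human_mu, human_logstd, robot_mu, robot_logstd):
    """KL between the robot-branch and human-branch plan distributions."""
    p_h = Normal(human_mu, human_logstd.exp())
    p_r = Normal(robot_mu, robot_logstd.exp())
    return kl_divergence(p_r, p_h).sum(dim=-1).mean()

# Dummy usage with zero-mean, unit-std embeddings of batch size 8, dim 64.
loss = plan_alignment_kl(torch.zeros(8, 64), torch.zeros(8, 64),
                         torch.zeros(8, 64), torch.zeros(8, 64))
```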
@chenwang_j
Chen Wang
1 year
amazing
@erwincoumans
Erwin Coumans 🇺🇦
1 year
OpenAI ChatGPT is excellent creating PyBullet scripts: "can you create a pybullet script with a ground plane, a box and a sphere on top."+"can you add 10 more boxes on top?"+"can you move the 5th box 0.3 units along the x axis?"+"can you add a quadruped robot next to the boxes?"
4
16
139
0
0
8
@chenwang_j
Chen Wang
1 year
In the second step, we freeze the weights of the trained latent planner. The latent planner takes the current and goal images from the robot data and generates a latent plan to train the low-level controller with a plan-guided imitation learning algorithm. (5/N)
1
1
7
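A minimal sketch of this second stage: freeze the planner and run plan-guided behavior cloning on robot demonstrations. It reuses the hypothetical planner/controller interfaces from the sketch above and is not the released training code.

```python
import torch
import torch.nn.functional as F

def train_controller_step(planner, controller, optimizer,
                          cur_emb, goal_emb, obs_emb, expert_action):
    """One behavior-cloning step with a frozen planner providing the latent plan."""
    for p in planner.parameters():
        p.requires_grad_(False)               # planner weights stay frozen
    with torch.no_grad():
        plan = planner(cur_emb, goal_emb)
    pred = controller(obs_emb, plan)
    loss = F.mse_loss(pred, expert_action)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```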
@chenwang_j
Chen Wang
1 year
@Stone_Tao Great question! The primary issue at hand is whether we have sufficient data to support end-to-end learning. This becomes particularly challenging for tasks with longer horizons, as the accumulation of errors requires an abundance of data to cover diverse scenarios.
1
1
7
@chenwang_j
Chen Wang
2 years
Imitation learning for real-world problems takes more than new algorithms - check out our RSS22 workshop on Overlooked Aspects of Imitation Learning - consider submitting by May 7th!
@danfei_xu
Danfei Xu@ICRA24
2 years
Applying imitation learning to real world problems takes more than new algorithms. We are organizing a workshop "Overlooked Aspects of Imitation Learning: Systems, Data, Tasks, and Beyond” at RSS22! Exciting speakers & more to come. Submit by May 7th!
1
8
65
0
0
6
@chenwang_j
Chen Wang
1 year
2
0
5
@chenwang_j
Chen Wang
3 years
0
0
4
@chenwang_j
Chen Wang
2 months
@tonyzzhao @kenny__shaw Thanks Tony!! Open-sourcing shapes the future🙌
0
0
4
@chenwang_j
Chen Wang
2 years
Super cool results. Multimodal prompts largely improve the flexibility of goal-conditioned policies!
@DrJimFan
Jim Fan
2 years
We trained a transformer called VIMA that ingests *multimodal* prompt and outputs controls for a robot arm. A single agent is able to solve visual goal, one-shot imitation from video, novel concept grounding, visual constraint, etc. Strong scaling with model capacity and data!🧵
18
147
870
1
0
4
@chenwang_j
Chen Wang
5 months
Amazing video generation results! Congrats Agrim and the team!
@agrimgupta92
Agrim Gupta
5 months
We introduce W.A.L.T, a diffusion model for photorealistic video generation. Our model is a transformer trained on image and video generation in a shared latent space. 🧵👇
55
267
1K
1
0
4
@chenwang_j
Chen Wang
8 months
@kevin_zakka Thanks Kevin! huge fan of RoboPianist!
0
0
2
@chenwang_j
Chen Wang
2 years
Super cool results! Congrats @yifengzhu_ut !
@yifengzhu_ut
Yifeng Zhu 朱毅枫
2 years
How can robot manipulators perform in-home tasks such as making coffee for us? We introduce VIOLA, an imitation learning model for end-to-end visuomotor policies that leverages object-centric priors to learn from only 50 demonstrations!
9
57
320
0
0
3
@chenwang_j
Chen Wang
2 months
@DrJimFan Thanks Jim!!
0
0
3
@chenwang_j
Chen Wang
1 year
@kaixhin We 3D printed it and the model is here -
0
0
3
@chenwang_j
Chen Wang
2 months
@simonkalouche
Simon Kalouche
2 months
Humanoids w legs and anthropomorphic hands will never beat this robot in cost, speed, maintenance or reliability in the multi-trillion dollar warehouse/manufacturing market where climbing stairs is unnecessary. Customers will choose faster, lower cost & more reliable every time.
28
30
196
2
0
3
@chenwang_j
Chen Wang
2 months
@ralfgulde @StanfordAILab It's the LEAP hand, invented by @kenny__shaw . More info
1
0
3
@chenwang_j
Chen Wang
3 years
Humans learn to collaborate with others through experiences. However, it would take countless human hours to teach robots how to collaborate through trial-and-error. We ask: is it possible to teach human-robot collaboration skills through human-human collaboration demos? 3/8
1
0
2
@chenwang_j
Chen Wang
2 months
@drfeifei Thanks Fei-Fei!!
0
0
1
@chenwang_j
Chen Wang
4 months
0
0
0
@chenwang_j
Chen Wang
8 months
@wenlong_huang Thanks Wenlong for all the nice feedback and discussion!
0
0
2
@chenwang_j
Chen Wang
5 months
0
0
1
@chenwang_j
Chen Wang
8 months
@QinYuzhe Thanks Yuzhe!
0
0
2
@chenwang_j
Chen Wang
2 months
@kenny__shaw Thanks Kenny!! LEAP hand really helps us greatly. Can’t wait to try out more new hands from you!
0
0
2
@chenwang_j
Chen Wang
2 months
@astroboticist congrats, amazing work!!
0
0
0
@chenwang_j
Chen Wang
3 years
Having a robot assistant at home that could seamlessly assist us with daily activities is a long-sought dream. To achieve this goal, the robot needs to recognize and react to their human partners’ intentions on-the-fly. 2/8
1
0
2
@chenwang_j
Chen Wang
8 months
@pathak2206 Thanks Deepak! Very interested in LEAP hand and functional grasp
0
0
2
@chenwang_j
Chen Wang
2 months
@HaochenShi74 Thanks Haochen! Really looking forward to seeing your new creations!
0
0
2
@chenwang_j
Chen Wang
3 years
In this work, we take a step forward to improve the generalization capability of traditional imitation learning algorithms by introducing a novel human-like hand-eye coordination action space. We hope it can inspire future studies on generalizable offline policy learning. 8/9
1
0
2
@chenwang_j
Chen Wang
1 year
@kevin_zakka Great question! We tried using human demos as task video prompts, which works for short-horizon subgoals such as opening an oven or turning off a lamp. However, it encountered difficulty with longer-horizon tasks due to a mismatch in motion speed between human and robot.
0
0
2
@chenwang_j
Chen Wang
2 years
@SongShuran Congrats!
0
0
1
@chenwang_j
Chen Wang
8 months
@tonyzzhao Thanks Tony!
0
0
1
@chenwang_j
Chen Wang
1 year
@hameleoned Awesome catch! We find lots of cases where the human hand and the robot gripper need to be used in very different ways to accomplish the same task (e.g., opening an oven). Such an embodiment gap between human and robot makes it hard to learn low-level control solely from human data.
2
0
1
@chenwang_j
Chen Wang
1 year
@AndreTI @olivercameron Great point! We are paying close attention to human/robot motion synthesis with diffusion. The key challenge is how to ground such generated behavior to the physical world/objects. Let's say the hand needs to reach the 3D location of a door handle in the correct pose to open it.
1
0
1
@chenwang_j
Chen Wang
2 months
@chichengcc Thanks Cheng! Collecting robot data without robots is promising🙌
0
0
1
@chenwang_j
Chen Wang
2 months
@DJiafei Thanks Jiafei! AR2D2 is a great idea!
0
0
1
@chenwang_j
Chen Wang
8 months
@_krishna_murthy Thanks Krishna!
0
0
1
@chenwang_j
Chen Wang
2 years
6/ With the pre-trained representation, a state transition model is learned to predict the skill effects. The key insight is that the representation summarizes the common features of objects from the same category (e.g. foods are cookable), which could transfer to new instances.
1
0
1
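A minimal sketch of a state-transition model over pre-trained object representations, predicting post-skill embeddings (e.g. applying "cook" to a food object); dimensions and module names are assumptions.

```python
import torch
import torch.nn as nn

class SkillTransitionModel(nn.Module):
    """Predict the object representation after a skill is applied."""
    def __init__(self, obj_dim=128, num_skills=8):
        super().__init__()
        self.skill_emb = nn.Embedding(num_skills, 32)
        self.net = nn.Sequential(nn.Linear(obj_dim + 32, 256), nn.ReLU(), nn.Linear(256, obj_dim))

    def forward(self, obj_repr, skill_id):           # obj_repr: (B, obj_dim), skill_id: (B,)
        s = self.skill_emb(skill_id)
        return self.net(torch.cat([obj_repr, s], dim=-1))

model = SkillTransitionModel()
next_repr = model(torch.randn(4, 128), torch.randint(0, 8, (4,)))
```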
@chenwang_j
Chen Wang
8 months
@raghavaupp Thanks Raghava, glad ppl like it!
1
0
1
@chenwang_j
Chen Wang
8 months
@drfeifei Thanks Fei-Fei!
0
0
1
@chenwang_j
Chen Wang
2 months
@wenlong_huang Thanks Wenlong!!
0
0
1
@chenwang_j
Chen Wang
2 months
@LerrelPinto @ieeeras Congrats Lerrel!!
0
0
1
@chenwang_j
Chen Wang
1 year
@allenzren Thanks for bringing this up! Q_H and Q_R represent the feature embedding space produced by the image encoders. Yes, we treat all the data as a single distribution, and the human play data we utilized does not have any task labeling.
0
0
1
@chenwang_j
Chen Wang
1 year
@allenzren The KL is trying to minimize the distribution gap between human and robot images. I like your idea of dividing the distribution by clustering the data with 3D trajectories and employing KL based on the clustered features. This seems reasonable and might enhance the results!
0
0
1
@chenwang_j
Chen Wang
8 months
@lucy_x_shi Thanks Lucy!
0
0
1
@chenwang_j
Chen Wang
2 years
7/ Finally, we can use a simple tree search algorithm to find the task plans for different symbolic goals. Experiments show that the task planner based on the pre-trained representation could successfully transfer to new objects and new scene layouts.
1
0
1
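A minimal sketch of the tree-search idea: breadth-first search over predicted states until a symbolic goal test is satisfied, returning the skill sequence. `transition` and `goal_test` are hypothetical stand-ins for the learned transition model and the goal check.

```python
from collections import deque

def plan_skills(start_state, skills, transition, goal_test, max_depth=6):
    """Breadth-first search over predicted states; returns a list of skills or None."""
    queue = deque([(start_state, [])])
    while queue:
        state, plan = queue.popleft()
        if goal_test(state):
            return plan
        if len(plan) >= max_depth:
            continue
        for skill in skills:
            queue.append((transition(state, skill), plan + [skill]))
    return None  # no plan found within the depth limit
```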