Cheng Chi

@chichengcc

3,118
Followers
2,096
Following
22
Media
244
Statuses

🤖PhD student @Stanford and @Columbia

Stanford, CA
Joined January 2016
Pinned Tweet
@chichengcc
Cheng Chi
3 months
Can we collect robot data without any robots? Introducing Universal Manipulation Interface (UMI), an open-source $400 system from @Stanford designed to democratize robot data collection. 0 teleop -> autonomously wash dishes (precise), toss (dynamic), and fold clothes (bimanual)
42
343
2K
@chichengcc
Cheng Chi
1 year
What if the form of visuomotor policy has been the bottleneck for robotic manipulation all along? Diffusion Policy achieves a 46.9% improvement vs prior SoTA on 11 tasks from 4 benchmarks + 4 real-world tasks! (1/7) website: paper:
9
100
533
@chichengcc
Cheng Chi
3 months
We made a step-by-step video tutorial for building the UMI gripper! Please leave comments on @YouTube if you have any questions
Tweet media one
9
25
189
@chichengcc
Cheng Chi
2 months
Weights drop ⚠️ We released our pre-trained model for the cup arrangement task trained on 1400 demos! We aim to enable anyone to deploy UMI on their robot to arrange any "espresso cup with saucer" they buy on Amazon.
3
24
170
@chichengcc
Cheng Chi
2 months
UMI x ARX @YihuaiGao just got our in-the-wild cup policy working with ARX5 @ARX_Zhang! We are still tuning the controller and latency matching for smoother tracking. Lots of potential in these low-cost lightweight arms!
3
14
102
@chichengcc
Cheng Chi
3 months
With UMI, you can go to any home, any restaurant and start data collection within 2 minutes. With a diverse in-the-wild cup manipulation dataset, we can train a diffusion policy that generalizes to the top of a water fountain – clearly unseen environments and objects. 2/9
2
7
85
@chichengcc
Cheng Chi
4 years
@mranti Because this way the audience can see both the actor's face and what's displayed on the screen. So the lack of privacy is a feature, not a bug
0
0
57
@chichengcc
Cheng Chi
3 months
It was a blast working with @tonyzzhao and @zipengfu in the Stanford Robotic Center! 8/9
3
0
63
@chichengcc
Cheng Chi
3 months
Portable and transferable data collection for dexterous hands! This is how humanoid data should be collected
@chenwang_j
Chen Wang
3 months
Can we use wearable devices to collect robot data without actual robots? Yes! With a pair of gloves🧤! Introducing DexCap, a portable hand motion capture system that collects 3D data (point cloud + finger motion) for training robots with dexterous hands Everything open-sourced
21
132
624
3
2
53
@chichengcc
Cheng Chi
3 months
UMI data is robot agnostic. Here we can deploy the same policy on both UR5e and Franka robots. In fact, you can deploy it on any robot with a parallel jaw stroke > 85mm. 3/9
2
1
47
@chichengcc
Cheng Chi
1 month
I love how with just parallel jaw grippers and a visuomotor policy, you can do really dexterous and precise tasks, often exceeding the mechanical accuracy of the robot arms themselves 🦾
@tonyzzhao
Tony Z. Zhao
1 month
Introducing 𝐀𝐋𝐎𝐇𝐀 𝐔𝐧𝐥𝐞𝐚𝐬𝐡𝐞𝐝 🌋 - Pushing the boundaries of dexterity with low-cost robots and AI. @GoogleDeepMind Finally got to share some videos after a few months. Robots are fully autonomous, filmed in one continuous shot. Enjoy!
56
333
2K
1
0
46
@chichengcc
Cheng Chi
3 months
Please also check out our epic fails compilation! We achieve a 70-90% success rate on most tasks, which still doesn’t hit the bar for commercial deployment. However, we think getting a larger in-the-wild dataset will get us a lot closer! 6/9
2
0
41
@chichengcc
Cheng Chi
5 years
Finally finished my personal project! @cgmastersnet Thank you so much for creating such an excellent tutorial!
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
0
40
@chichengcc
Cheng Chi
3 months
Enabled by our unique wrist-only camera configuration and camera-centric action representation, our robot systems are calibration-free (works even with base movement) and robust against distractors and lighting changes. 4/9
3
0
40
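One way to read "camera-centric action representation": target gripper poses are expressed relative to the current wrist-camera frame rather than a fixed world/base frame, so no camera-to-robot calibration is required. A minimal sketch of that coordinate change, using generic SE(3) math rather than the actual UMI source:

```python
import numpy as np

# Sketch of a camera-centric (relative) action: express the target gripper
# pose in the current wrist-camera frame instead of the robot-base frame.
# Generic 4x4 homogeneous-transform bookkeeping, not the UMI implementation.
def to_camera_frame(T_world_cam: np.ndarray, T_world_target: np.ndarray) -> np.ndarray:
    """Return T_cam_target = inv(T_world_cam) @ T_world_target.

    Because the action is relative to the camera itself, a constant error in
    the camera-to-world (or camera-to-base) transform cancels out, which is
    why such a representation can be calibration-free."""
    return np.linalg.inv(T_world_cam) @ T_world_target
```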
@chichengcc
Cheng Chi
24 days
Congrats on LeRobot’s release! Both the Diffusion Policy and UMI project have benefited tremendously from @huggingface libraries such as diffusers and accelerate. I hope to see more organized robotics open source efforts!
@RemiCadene
Remi Cadene
24 days
Meet LeRobot, my first library at @huggingface robotics 🤗 The next step of AI development is its application to our physical world. Thus, we are building a community-driven effort around AI for robotics, and it's open to everyone! Take a look at the code:
Tweet media one
35
207
823
1
3
39
@chichengcc
Cheng Chi
1 year
Incredible results from simple hardware + behavior cloning! I'm glad that @tonyzzhao also found combining generative models + action sequence prediction to be effective at capturing multimodal actions!
@tonyzzhao
Tony Z. Zhao
1 year
Introducing ALOHA 🏖: 𝐀 𝐋ow-cost 𝐎pen-source 𝐇𝐀rdware System for Bimanual Teleoperation After 8 months iterating @stanford and 2 months working with beta users, we are finally ready to release it! Here is what ALOHA is capable of:
94
713
3K
1
1
26
@chichengcc
Cheng Chi
3 months
This project would have been impossible without the hard work from co-authors: @Zhenjia_Xu @chuer_pan @eacousineau @Ben_Burchfiel Siyuan Feng @RussTedrake @SongShuran 7/9
1
0
21
@chichengcc
Cheng Chi
3 months
@GoPro technologies (GPMF, QR control, voice control, Media Mod, Max Lens …) have been indispensable for this project. Shout out to @David_Newman, who personally responded to my questions about timecodes, which are critical for bimanual UMI. 9/9
2
0
20
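Why timecode matters here: for bimanual UMI, the two grippers' GoPro recordings must be aligned to a shared clock before left/right trajectories can be paired frame-by-frame. A toy nearest-neighbor alignment sketch, assuming per-frame timestamps have already been extracted (e.g. from the GPMF metadata track); all names are illustrative:

```python
import numpy as np

# Toy timecode alignment for a bimanual recording: for each left-gripper
# frame, find the nearest right-gripper frame on the shared clock.
# Assumes both timestamp arrays are sorted and in the same time base.
def align_frames(left_ts: np.ndarray, right_ts: np.ndarray) -> np.ndarray:
    idx = np.searchsorted(right_ts, left_ts)          # insertion points
    idx = np.clip(idx, 1, len(right_ts) - 1)
    prev_closer = (left_ts - right_ts[idx - 1]) < (right_ts[idx] - left_ts)
    return np.where(prev_closer, idx - 1, idx)        # nearest frame index
```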
@chichengcc
Cheng Chi
2 months
This model is far from perfect. For example, it doesn't work well under direct sunlight since it constantly rained at Stanford during our data collection effort. Please share your failure cases! Hopefully, we can have a community-based effort to train an even more robust model!
2
0
20
@chichengcc
Cheng Chi
3 months
Another video for 3D printing tips and tricks
Tweet media one
0
0
19
@chichengcc
Cheng Chi
3 months
Congrats to @JayLEE_0301 for the impressive result with just a single forward pass! Visual imitation FTW!
@LerrelPinto
Lerrel Pinto
3 months
LLMs swept the world by predicting discrete tokens. But what’s the right tool to model continuous, multi-modal, and high dim behaviors? Meet Vector Quantized Behavior Transformer (VQ-BeT), beating or matching diffusion based models in speed, quality, and diversity. 🧵
4
52
289
0
2
19
@chichengcc
Cheng Chi
1 year
By learning the gradient field of the action distribution and generating action trajectories via "gradient descent" during inference, Diffusion Policy addresses 3 key challenges of robotics by inheriting advantages from Diffusion Models. (2/7)
1
2
17
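For readers outside the area, a minimal sketch of what "gradient descent" at inference time looks like for an action policy: start from Gaussian noise and iteratively denoise a whole action trajectory with a learned network. This is DDPM-style sampling with hypothetical names (`noise_pred_net`, shapes, schedule), not the released Diffusion Policy code:

```python
import torch

T = 100                                 # denoising steps
horizon, action_dim = 16, 7             # predicted action-sequence shape
betas = torch.linspace(1e-4, 0.02, T)   # standard DDPM noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def sample_actions(noise_pred_net, obs_embedding):
    """Refine pure noise into an action trajectory; each step follows the
    learned noise prediction, i.e. the gradient field of the action
    distribution."""
    a = torch.randn(1, horizon, action_dim)
    for t in reversed(range(T)):
        eps = noise_pred_net(a, t, obs_embedding)      # predicted noise
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        a = (a - coef * eps) / torch.sqrt(alphas[t])   # denoise one step
        if t > 0:                                      # add sampling noise
            a = a + torch.sqrt(betas[t]) * torch.randn_like(a)
    return a
```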
@chichengcc
Cheng Chi
1 month
Excited to see what @xuxin_cheng can do with all these data!
@xuxin_cheng
Xuxin Cheng
1 month
🤖Introducing 📺𝗢𝗽𝗲𝗻-𝗧𝗲𝗹𝗲𝗩𝗶𝘀𝗶𝗼𝗻: a web-based teleoperation software! 🌐Open source, cross-platform (VisionPro & Quest) with real-time stereo vision feedback. 🕹️Easy-to-use hand, wrist, head pose streaming. Code:
12
82
352
1
1
15
@chichengcc
Cheng Chi
7 months
Simulation envs/data are indispensable tools for rapid development iteration and reproducible benchmarks. A small but ambitious team led by @zhou_xian_ is pushing the boundary of robotic simulators by unifying multiple material representations and solvers. Please check it out!
@zhou_xian_
Zhou Xian
7 months
Can GPTs generate infinite and diverse data for robotics? Introducing RoboGen, a generative robotic agent that keeps proposing new tasks, creating corresponding environments and acquiring novel skills autonomously! code: 👇🧵 (better with audio)
10
83
314
0
1
14
@chichengcc
Cheng Chi
1 year
Couldn't wish for a higher compliment for our project! Thank you Mohit!
@mohito1905
Mohit Shridhar
1 year
Not only is this amazing research, it has all the elements of a great robotics (manipulation) paper that I wish was common practice in the field. Quick Thread:
1
30
198
0
0
14
@chichengcc
Cheng Chi
1 year
We found Diffusion Policy to perform surprisingly well on real-world tasks by simply training its vision-encoder end-to-end. The resulting policy solves under-actuated, multi-stage tasks with a 6DoF action space and is robust against perturbations and kinematic constraints. (6/7)
1
3
13
@chichengcc
Cheng Chi
21 days
🤯🤯🤯
@vrushankdes
Vrushank Desai
22 days
I spent a couple months at the beginning of this year learning about GPU programming through trying to optimize inference for @chichengcc awesome Diffusion Policy paper. I was able to improve inference time for the denoising U-Net by ~3.4x over Pytorch eager mode and ~2.65x over
30
87
610
0
1
14
@chichengcc
Cheng Chi
5 months
@zipengfu and @tonyzzhao are making strides towards actually useful home robots! Please check out their video of autonomous shrimp cooking 🍤
@zipengfu
Zipeng Fu
5 months
Introducing 𝐌𝐨𝐛𝐢𝐥𝐞 𝐀𝐋𝐎𝐇𝐀🏄 -- Learning! With 50 demos, our robot can autonomously complete complex mobile manipulation tasks: - cook and serve shrimp🦐 - call and take elevator🛗 - store a 3 lbs pot in a two-door cabinet Open-sourced! Co-led @tonyzzhao , @chelseabfinn
187
889
4K
2
1
12
@chichengcc
Cheng Chi
1 year
This project wouldn’t have been possible without our amazing collaborators from Columbia, MIT and TRI: Siyuan Feng @du_yilun @Zhenjia_Xu @eacousineau @Ben_Burchfiel @SongShuran (7/7)
1
2
13
@chichengcc
Cheng Chi
3 months
@keerthanpg @Stanford I think wrist fisheye cams are sufficient for a surprisingly wide range of tasks. I do think there are tasks that could benefit from more views. For those cases, the UMI data pipeline supports an unlimited number of non-gripper GoPros (e.g. head mounted)
1
0
13
@chichengcc
Cheng Chi
1 month
Congrats Toru! I hope to see more bimanual systems from the community. Imitation learning works™
@ToruO_O
Toru
1 month
Imitation learning works™ – but you need good data 🥹 How to get high-quality visuotactile demos from a bimanual robot with multifingered hands, and learn smooth policies? Check our new work “Learning Visuotactile Skills with Two Multifingered Hands”! 🙌
7
70
281
1
0
13
@chichengcc
Cheng Chi
1 year
② Having been used for image generation, Diffusion Models are no strangers to predicting high-dim tensors. Diffusion Policy's action-space scalability affords action sequence prediction, which is surprisingly important for maintaining temporal mode-consistency of the actions. (4/7)
Tweet media one
1
2
10
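A rough sketch of why sequence prediction helps with temporal mode-consistency: the policy commits to one mode for a whole chunk of actions and then replans, instead of re-deciding (and potentially mode-switching) at every control step. Receding-horizon execution with illustrative `policy`/`env` interfaces, not the paper's API:

```python
# Receding-horizon execution (illustrative interfaces, assumed here).
# Predicting a sequence lets the policy commit to a single action mode
# for several steps, avoiding per-step dithering between modes.
def control_loop(policy, env, horizon=16, n_execute=8):
    obs = env.reset()
    while not env.done():
        action_seq = policy.predict(obs, horizon=horizon)  # (horizon, dim)
        for a in action_seq[:n_execute]:   # execute a prefix, then replan
            obs = env.step(a)
```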
@chichengcc
Cheng Chi
1 year
① Diffusion Policy can express arbitrary normalizable distributions, which include multimodal action distributions - a well-known challenge for policy learning. (3/7)
Tweet media one
1
3
10
@chichengcc
Cheng Chi
1 year
③ Training stability is particularly important for robotics due to the high cost of real-world evaluation. By side-stepping the intractable normalization-constant approximation and learning the gradient instead, Diffusion Policy is significantly more stable and easier to train than IBC. (5/7)
Tweet media one
1
3
9
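To spell out the "intractable constant": an implicit/energy-based policy like IBC must normalize over the action space, while the gradient of the log-density does not depend on that normalizer at all. A hand-written summary of the standard argument, not an excerpt from the paper:

```latex
% Energy-based (IBC-style) policy: Z(o) is an integral over all actions
% and must be approximated (e.g. with negative samples) at every update.
p_\theta(a \mid o) = \frac{\exp(-E_\theta(o,a))}{Z(o)}, \qquad
Z(o) = \int \exp(-E_\theta(o,a'))\, \mathrm{d}a'

% Score-based view: the gradient of the log-density is independent of Z,
% so learning the gradient side-steps the normalizer entirely.
\nabla_a \log p_\theta(a \mid o) = -\nabla_a E_\theta(o,a)
```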
@chichengcc
Cheng Chi
8 months
WooHoo! Diffusion Policy in Navigation! Congrats to @ajaysridhar0 and Dhruv on their amazing line of work on real-world vision-based navigation!
@shahdhruv_
Dhruv Shah
8 months
Visual Nav Transformer 🤝 Diffusion Policy Works really well and ready for deployment on your robot today! We will also be demoing this @corl_conf 🤖 Videos, code and checkpoints: Work led by @ajaysridhar0 in collaboration with @CatGlossop @svlevine
3
21
133
0
0
9
@chichengcc
Cheng Chi
3 years
🌸
1
0
9
@chichengcc
Cheng Chi
6 months
Big transformer + multiple diffusion heads for flexible action prediction! I'm excited about the ability to train with multiple action spaces
@KarlPertsch
Karl Pertsch
6 months
3 mo. ago we released the Open X-Embodiment dataset, today we’re doing the next step: Introducing Octo 🐙, a generalist robot policy, trained on 800k robot trajectories, stronger than RT-1X, flexible observation + action spaces, fully open source! 💻: /🧵
10
92
379
0
1
9
@chichengcc
Cheng Chi
4 years
@FedeItaliano76 Pic #3 is Japan tho
1
0
8
@chichengcc
Cheng Chi
3 months
@abhishekunique7 @breadli428 @zipengfu @tonyzzhao @chelseabfinn I think the key to real-world robustness is closed-loop policies and visual feedback. Due to its simplicity, BC is perfect for studying policy representation, HW interface design, and data collection methods. I hope the lessons from BC can quickly propagate to the rest of robotics!
0
0
8
@chichengcc
Cheng Chi
2 years
Wow I wish this feature was available when working on the IRP project 🥹 Kudos to the MuJoCo team!
@saran__t
Saran Tunyasuvunakool
2 years
#MuJoCo 2.3.0 is out. We've taken our first steps towards improving fast simulations of flexible materials. Props to @TheSmallQuail for the new cable model!
1
11
110
0
0
7
@chichengcc
Cheng Chi
6 months
@notmahi Really cool work! I need to try this app ASAP 😝
0
0
5
@chichengcc
Cheng Chi
1 year
We’ll see 😉
@shaneguML
Shane Gu
1 year
[2/3] Robotics is hard, and I focus my time mostly on generative AI these days. Because in 5 years, we likely still can't match motor control of a 3-year old baby (generalization, adaptability, smoothness, dexterity)... Last remaining AI challenge would be mastering dexterity.
1
7
31
0
0
4
@chichengcc
Cheng Chi
8 months
Reproducible benchmarks drive the field forward 🦾
@yifengzhu_ut
Yifeng Zhu 朱毅枫
8 months
We are thrilled to announce LIBERO, a lifelong robot learning benchmark to study knowledge transfer in decision-making and robotics at scale! 🤖 LIBERO paves the way for prototyping algorithms that allow robots to continually learn! More explanations and links are in the 🧵
2
58
245
1
0
6
@chichengcc
Cheng Chi
2 months
@Thom_Wolf @RemiCadene Wow so fast! Have fun🔥 🔥 🔥
0
0
6
@chichengcc
Cheng Chi
2 years
@ericjang11 One issue I often encounter in practice is that the "moment matching" behavior of L2 regression (implying a Gaussian dist) fails to capture multimodal distributions. On the other hand, the categorical distribution implied by classification handles this quite well.
0
0
5
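The point is easy to reproduce with synthetic data: on a bimodal target, the L2-optimal prediction is the mean, an action in neither mode, while a categorical head keeps both modes. A minimal NumPy illustration (synthetic numbers, not from any real dataset):

```python
import numpy as np

# Bimodal demonstrations: half go left (-1), half go right (+1)
# for the same observation.
targets = np.concatenate([np.full(500, -1.0), np.full(500, +1.0)])

# L2 regression: the optimal constant prediction is the mean, 0.0,
# an action that matches *neither* demonstrated mode.
l2_optimal = targets.mean()

# Classification over discretized action bins: the optimal output is the
# full categorical distribution, which preserves both modes.
edges = np.linspace(-1.5, 1.5, 31)
hist, _ = np.histogram(targets, bins=edges, density=True)
print(l2_optimal)            # 0.0 ("moment matching" failure)
print(hist.argsort()[-2:])   # the two modal bins survive
```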
@chichengcc
Cheng Chi
3 months
@RemiCadene @Tesla Congrats! Open source FTW!
1
0
5
@chichengcc
Cheng Chi
6 months
Woohoo!
@nilsingelhag
Nils Ingelhag
6 months
Recording demonstrations for diffusion-policy 😎 Training models for tonight, eval tomorrow! Great work @chichengcc @du_yilun @eacousineau and team!
2
3
27
0
0
5
@chichengcc
Cheng Chi
1 year
🤯
@kevin_zakka
Kevin Zakka
1 year
First, if you enjoyed the video above, we have a live demo that runs MuJoCo in your browser using Javascript and Web Assembly! You can accompany the robot (drag the keys down), or be adversarial and tug at the fingers 🙃
4
14
112
1
0
4
@chichengcc
Cheng Chi
19 days
@Mankaran32 Cool mod! How are you planning to supply power?
1
0
4
@chichengcc
Cheng Chi
4 years
@tianyuf @yindavid This is bizarre. My grandparents have white hair and they can definitely renew their national IDs.
0
0
4
@chichengcc
Cheng Chi
1 year
@peteflorence @SongShuran Thanks Pete! The amazing demos from IBC were the reason I started working on BC, which led to this project!
0
0
4
@chichengcc
Cheng Chi
4 months
@ericjang11 Really appreciate your quite detailed approach write-up! I wish more companies did the same
0
0
4
@chichengcc
Cheng Chi
3 months
@notmahi Elegant code as always! Congrats @notmahi
1
0
4
@chichengcc
Cheng Chi
2 years
@kevin_zakka Their SDF-based collision handling for nuts and bolts will be available in the next Isaac Sim release! You can already try it in the latest Omniverse Create
0
1
4
@chichengcc
Cheng Chi
1 year
@danfei_xu I can't thank you and the coauthors of robomimic enough! Getting good results on robomimic was the reason we were confident enough to start working on real-world tasks. The hyperparameters tuned on robomimic also transfer to the real world very well
0
0
3
@chichengcc
Cheng Chi
8 months
@philippswu Really cool project! Can't wait to teleoperate our UR5s with GELLO!
1
0
3
@chichengcc
Cheng Chi
3 months
@zipengfu @YouTube Thank you for the advice 😉
0
0
3
@chichengcc
Cheng Chi
3 months
@AlexanderDerve @Stanford The IMU data recorded by the GoPro is key for robust SLAM. I'm not aware of other action cameras with the same feature. The Max Lens Mod (fisheye) is critical for learning as well
0
0
3
@chichengcc
Cheng Chi
2 months
@chris_j_paxton I believe this robot is teleoperated🤔 Very smooth hardware tho
2
0
3
@chichengcc
Cheng Chi
2 months
@LerrelPinto @ieeeras Congrats Lerrel!
1
0
2
@chichengcc
Cheng Chi
2 years
@kevin_zakka Isaac Gym is still half-baked, with missing advertised features. Nvidia has a bad track record for maintaining non-revenue-generating software, especially given its recent pivot to the Metaverse (Omniverse). On the other hand, MuJoCo has withstood the test of time
0
0
2
@chichengcc
Cheng Chi
2 months
@AlperCanberk1 Direct sunlight creates extremely high-contrast shadows. The highlights also saturate the image sensor. That's actually a very different distribution from what color jitter produces.
1
0
2
@chichengcc
Cheng Chi
5 years
@Nick_Stevens_Gr @katlinegrey Is that logo on the upper left....Microsoft?😂
Tweet media one
0
1
2
@chichengcc
Cheng Chi
2 years
@fatlimey I found this effect to be less extreme when I took my glasses off. Chromatic aberration?
0
0
2
@chichengcc
Cheng Chi
1 year
@AjayMandlekar @SongShuran Thank you so much for publishing robomimic! Our project might have gone a very different route without it
1
0
2
@chichengcc
Cheng Chi
3 months
@lab_ho @keerthanpg @Stanford I think more cameras and sensors are absolutely helpful! With UMI, we tried to answer a different question: what's the absolute minimum set of cameras we need? We found that you can do a lot with surprisingly few (one per arm).
1
0
2
@chichengcc
Cheng Chi
3 months
@wenlong_huang 🐔<->🥚 🐣
0
0
2
@chichengcc
Cheng Chi
3 months
0
0
2
@chichengcc
Cheng Chi
4 years
@handleym99 @jonmasters @davideneco25320 Sunway TaihuLight uses a domestically designed CPU though (#1 HPC 2016-2018)
0
0
2
@chichengcc
Cheng Chi
6 months
@LerrelPinto @notmahi @anantr_ai @HarithejaE @imisra_ @soumithchintala Fantastic work! Congrats Lerrel! Have you experimented with policy formulations other than action regression? I can imagine that being a bottleneck for task performance
1
0
2
@chichengcc
Cheng Chi
5 years
@HongKongFP I have respected HKFP for its independent reporting on recent HK issues. However, I couldn't find any image in this post of protesters actually being crushed by the van. I guess it's just a metaphor? Please don't become yet another clickbait generator!
0
0
2
@chichengcc
Cheng Chi
4 years
@syam64 @shapoco I had exactly this problem before! It can be repaired at an Apple Store by replacing the camera module. I remember it only cost $60
1
0
1
@chichengcc
Cheng Chi
3 months
@chenwang_j Thank you Chen! Looking forward to what you are brewing 👀
0
0
1
@chichengcc
Cheng Chi
1 year
@peterchencyc Congrats Peter!
0
0
1
@chichengcc
Cheng Chi
1 year
@kevin_zakka Kevin Inference vs Andy Inference, who will win? ⚔️
0
0
1
@chichengcc
Cheng Chi
3 months
@SurajNair_1 Congrats Suraj!
1
0
1
@chichengcc
Cheng Chi
5 years
@Silas_Artist @cgmastersnet The rendering used several light-path-based tricks, so it will not work out of the box in EEVEE. I will take a look once I have time.
0
0
1
@chichengcc
Cheng Chi
5 years
@PatrickMoorhead @dylan522p Numerous counterexamples can be made to your argument, including the newly opened Tesla factory in Shanghai and Samsung's factory in Tianjin. Apple also operates in China, and has far less than 50% of its stock owned by Chinese investors.
0
0
1
@chichengcc
Cheng Chi
2 months
@raghavaupp @AlperCanberk1 We use GoPro's auto exposure for both data collection and inference. During training, additional color-jitter augmentation is applied.
1
0
1
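For context, "color jittering augmentation" typically looks like the torchvision transform below; the exact ranges used for UMI training are an assumption here, not the released config:

```python
import torchvision.transforms as T

# Hypothetical augmentation pipeline; ranges are illustrative only.
# Color jitter perturbs global brightness/contrast/saturation/hue, which
# (as noted above) is a different distribution from the hard shadows and
# sensor saturation caused by direct sunlight.
train_aug = T.Compose([
    T.ColorJitter(brightness=0.3, contrast=0.4, saturation=0.5, hue=0.08),
    T.ToTensor(),
])
```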
@chichengcc
Cheng Chi
3 years
@cszechy Congratulations Colin! Go Blue!
0
0
1
@chichengcc
Cheng Chi
3 months
0
0
1
@chichengcc
Cheng Chi
3 months
@simonkalouche Thanks Simon!
1
0
1
@chichengcc
Cheng Chi
5 years
@mao9821 Best illustration I have seen so far! Thank you so much!
0
0
1
@chichengcc
Cheng Chi
3 months
@ehsanik @SongShuran Thanks Kiana! It was always nice to have you here! Plz come to Stanford more often!
0
0
1
@chichengcc
Cheng Chi
5 years
@davegraham @david_schor This is for ProRes RAW though, first on the market. Red Rocket was for REDCODE RAW.
2
0
1
@chichengcc
Cheng Chi
3 years
@drerictmiller Exactly! I was very confused by this when I first came to the states
0
0
1
@chichengcc
Cheng Chi
5 years
@tculpan No offense, but Hu's statement is consistent with my observations of my Chinese friends. There is a huge media-bias problem on both sides. HK protestors will not lose support from the West even if they kill a couple of innocent people; the same can be said for the HK police and CN people
1
0
1
@chichengcc
Cheng Chi
3 years
@muddywatersre @elonmusk @ritholtz Due to various constraints, the ability to sublease a house/car multiple levels deep is limited, yet you can short a stock multiple times, >100% of what's available, which is one of the important factors in why this big squeeze worked.
0
0
1
@chichengcc
Cheng Chi
2 months
@JimmyTYYang1 12 man-hours
0
0
1
@chichengcc
Cheng Chi
6 months
0
0
0
@chichengcc
Cheng Chi
2 months
@benjamin_bolte Congrats Ben!
0
0
0
@chichengcc
Cheng Chi
3 years
@keenanisalive Congratulations!
0
0
1