Chris Paxton

@chris_j_paxton

8,071 Followers · 1,518 Following · 335 Media · 3,998 Statuses

Mostly posting about robots. Embodied AI @hellorobotinc , formerly @AIatMeta , @NVIDIAAI , @zoox . All views my own.

Pittsburgh, PA
Joined December 2013
Pinned Tweet
@chris_j_paxton
Chris Paxton
26 days
Some personal news: I’m excited to be joining Hello Robot, to lead their Embodied AI effort! For years now, I’ve been a passionate supporter of their vision of affordable, useful robots that can help people out with their day-to-day lives. I’ve previously worked at FAIR, part of…
Tweet media one
77
11
535
@chris_j_paxton
Chris Paxton
1 month
lmao, why do people do this
Tweet media one
50
515
7K
@chris_j_paxton
Chris Paxton
2 months
@Jetskigrizzly Honestly, at most jobs I've worked, you'd be able to get that $12 reimbursed much more easily than $1 of real pennies. No idea how it works for teachers.
25
0
2K
@chris_j_paxton
Chris Paxton
1 month
@taknil Hacker News was suggesting it was a honeypot
Tweet media one
5
8
600
@chris_j_paxton
Chris Paxton
22 days
It's genuinely hard to believe a 70B model is up there with the 1.8T GPT-4? I guess training data really is everything
@Teknium1
Teknium (e/λ)
22 days
Welp folks, we have gpt-4 at home
Tweet media one
152
377
5K
19
47
575
@chris_j_paxton
Chris Paxton
1 month
So big news: today was my last day at Meta. It's bittersweet, since Meta has some of the smartest and hardest-working people in the world, and I learned a lot there. But I'm excited to work on something new that I'm passionate about, which won't be a surprise if you follow me.
47
4
470
@chris_j_paxton
Chris Paxton
2 months
So does Tesla Full Self-Driving just kind of work now? As a non-Tesla owner, I was super impressed
14
16
341
@chris_j_paxton
Chris Paxton
1 month
@sureailabs They're all different top-level domains; it feels like a gray area to me?
4
0
330
@chris_j_paxton
Chris Paxton
11 months
The future of robot butlers starts with mobile manipulation. We’re announcing the NeurIPS 2023 Open-Vocabulary Mobile Manipulation Challenge!
- Full robot stack ✅
- Parallel sim and real evaluation ✅
- No robot required ✅👀
4
74
323
@chris_j_paxton
Chris Paxton
3 months
There’s a ton of exciting progress in robotics these days, from big companies like Google and NVIDIA, to startups and academic labs. Here’s a short thread of some cool robotics things you may have missed from the last week or so:
Tweet media one
1
46
286
@chris_j_paxton
Chris Paxton
1 year
As a roboticist I think it's a bit disappointing how many of the flashiest deep learning results seem to depend on massive amounts of really good, clean data (Dreamer, ChatGPT) - as opposed to mountains of unlabeled garbage and a handful of good examples
10
8
252
@chris_j_paxton
Chris Paxton
2 years
This is a really cool idea:
1. Segment objects and generate a text description of the scene
2. Use the language prompt with DALL-E to generate a goal image
3. Segment again, match features using CLIP, then use pose estimation to plan a sequence of pick-and-place actions
(toy sketch of the data flow below)
@_akhaliq
AK
2 years
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics abs: project page:
Tweet media one
6
107
770
0
34
245
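To make the pipeline shape concrete, here is a toy Python sketch of that data flow. Every helper is a hypothetical stand-in passed by the caller; this is not the DALL-E-Bot code, just the structure of the idea:

```python
# Toy sketch of the pipeline above. Every argument after the image is a
# hypothetical stand-in for a real model; only the data flow is the point.
def dalle_bot_step(rgb_image,
                   segment_objects,      # image -> list of object masks
                   describe_scene,       # image + masks -> text caption
                   generate_goal_image,  # text -> image (DALL-E-style)
                   match_with_clip,      # match objects across two mask sets
                   estimate_pose,        # matched pair -> relative pose
                   plan_pick_place):     # object + pose -> robot action
    masks = segment_objects(rgb_image)           # 1. segment the scene
    prompt = describe_scene(rgb_image, masks)    #    and describe it in text
    goal_image = generate_goal_image(prompt)     # 2. imagine the goal image
    goal_masks = segment_objects(goal_image)     # 3. segment the goal too
    actions = []
    for scene_obj, goal_obj in match_with_clip(masks, goal_masks):
        rel_pose = estimate_pose(scene_obj, goal_obj)
        actions.append(plan_pick_place(scene_obj, rel_pose))
    return actions                               # a pick-and-place sequence
```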
@chris_j_paxton
Chris Paxton
2 years
The rise of LLMs and foundation models in robotics will be the most influential and important development in our field this year (presentation by Aleksandra Faust of Google)
Tweet media one
4
34
238
@chris_j_paxton
Chris Paxton
2 years
Interested in a fall internship in robot learning at Meta AI, particularly using language or for multi-step tasks? Send me a message!
14
30
217
@chris_j_paxton
Chris Paxton
2 months
Diffusion has a lot of potential to be the future of planning:
- better at long-horizon problems than straight feed-forward learning
- plans can be generated from imperfect sensor data, unlike classical methods
- generates diverse "plans" before execution so we can check for safety
(toy denoising-loop sketch below)
@dkanou
Dimitrios Kanoulas
2 months
Creating 2D maze paths using diffusion! All applied to legged robot navigation. #ICRA2024 is coming soon, and @JianweiLiu93 and Mania will present the work in Japan! Paper: Video:
Tweet media one
2
27
158
8
19
214
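For intuition, here is a minimal sketch of diffusion-as-planning: a whole trajectory is denoised at once with standard DDPM algebra, where `denoiser` stands in for a hypothetical trained noise-prediction model:

```python
import numpy as np

def diffusion_plan(denoiser, horizon=64, state_dim=2, steps=100, seed=0):
    """Sample one trajectory-level plan via a DDPM-style reverse process.
    `denoiser(traj, t)` is a hypothetical trained model that predicts the
    noise in `traj` at diffusion step t; the rest is standard DDPM algebra."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)

    traj = rng.standard_normal((horizon, state_dim))   # start from pure noise
    for t in reversed(range(steps)):
        eps = denoiser(traj, t)                        # predicted noise
        mean = (traj - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps) \
               / np.sqrt(alphas[t])
        noise = rng.standard_normal(traj.shape) if t > 0 else 0.0
        traj = mean + np.sqrt(betas[t]) * noise
    return traj  # sample several plans like this, then filter for safety
```

Because sampling is cheap to repeat, you can draw many diverse plans and screen them before execution, which is the safety-checking point above.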
@chris_j_paxton
Chris Paxton
1 month
Robotics foundation model is a term you will hear a lot over the next year
@CovariantAI
Covariant
1 month
Large language models are trained to predict the next token in text. For robotics, this means training models on physical interaction datasets to build generalized AI that can simulate the physical world. We've built RFM-1, the first commercial Robotics Foundation Model. RFM-1…
9
35
204
4
17
210
@chris_j_paxton
Chris Paxton
2 years
If scale really is all you need (for machine learning), how come Tesla self-driving isn't that good? They have so much data
33
4
203
@chris_j_paxton
Chris Paxton
1 year
@DrJimFan Maybe, but anyone can say their secret thing is great. OpenAI is the only one who's proven it. At some point, papers and benchmarks don't matter; putting it in the hands of ordinary people and seeing it not blow up makes a way more powerful statement
10
8
195
@chris_j_paxton
Chris Paxton
2 months
Cheap robots + learning will change the world
@alexkoch_ai
Alexander Koch
2 months
Folding clothes with $250 robot arms. I've added another motor to improve mobility and extend the reach. The CAD files and the code are public at: (video at 2x speed)
129
389
3K
4
17
196
@chris_j_paxton
Chris Paxton
24 days
I like this advice. If I were getting started as a robotics student these days, I think I would install ManiSkill2 and try to get some learning stuff working. I like that it has great simulations of "real" robot sensors, is fast, and is easy to start with.
@JohnVial
Dr. John Vial
25 days
Robotics Students: Stop Doing Basic Tutorials, Start Doing Projects
Tweet media one
7
27
260
4
29
188
@chris_j_paxton
Chris Paxton
2 years
So, some really big news: this is my last week at NVIDIA! Starting next week, I'll be with @MetaAI in Pittsburgh, working on learning representations for long-horizon robot planning from perception. I've loved working at NVIDIA and am excited for the next thing.
5
3
179
@chris_j_paxton
Chris Paxton
6 months
In personal news, major development in autonomous intelligent agents achieved (we had our first baby 11/11, going on paternity leave, won't be as responsive or on Twitter as much)
24
1
181
@chris_j_paxton
Chris Paxton
2 years
Convergent evolution in the Dyson video (left) vs. Google (center) and Hello Robot (right). I think this is clearly (roughly) what the first in-home robots will look like
Tweet media one
8
15
165
@chris_j_paxton
Chris Paxton
4 months
Lots of roboticists talk about how inspiring watching babies learn is. Total nonsense, worst manipulation learning algorithm I've ever seen. 2 months of basically constant supervision and no applications to real-world tasks at all. It's like grad school all over again
21
1
158
@chris_j_paxton
Chris Paxton
2 months
There’s a lot of cool stuff going on in robotics research these days. Here’s some cool stuff -- focusing on research papers -- that you might have missed, including work from Meta, TRI, and academia. A thread:
Tweet media one
2
28
155
@chris_j_paxton
Chris Paxton
1 month
@sureailabs Actually I think you're right, it must be ignoring robots.txt, or it wouldn't have suddenly gotten stuck like this, right?
4
1
157
@chris_j_paxton
Chris Paxton
10 months
- Large language models are becoming the standard way to program
- Most robotics programs are very simple at the high level (go here, grab that)
- So I guess replace all your state machines with GPT already
- Great tools + examples in several robotics domains
(toy state-machine-vs-LLM sketch below)
@_akhaliq
AK
10 months
ChatGPT for Robotics: Design Principles and Model Abilities paper page: The paper presents an experimental study regarding the use of OpenAI's ChatGPT for robotics applications. We outline a strategy that combines design principles for prompt engineering and…
Tweet media one
14
136
549
6
30
155
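As a toy illustration of the "replace your state machine with an LLM" point, here is a sketch; `llm` is a hypothetical text-completion callable and the skills are stand-in callables, not any real robot API:

```python
# Toy contrast: a hand-written high-level state machine vs. an LLM deciding
# the same high-level steps. `skills` maps names to stand-in callables, and
# `llm` is a hypothetical text-completion function -- not any real API.

def classic_state_machine(skills):
    # The usual hand-coded logic: go here, grab that.
    skills["goto"]("kitchen")
    if skills["detect"]("bottle"):
        skills["grasp"]("bottle")
        skills["goto"]("user")
        skills["handover"]("")

def llm_high_level(llm, skills, task="bring me a bottle"):
    prompt = (
        "You control a robot with skills: goto <place>, detect <obj>, "
        f"grasp <obj>, handover. Task: {task}. "
        "Reply with one skill call per line, e.g. 'goto kitchen'."
    )
    for line in llm(prompt).splitlines():
        name, _, arg = line.strip().partition(" ")
        if name in skills:          # only execute skills we actually have
            skills[name](arg)
```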
@chris_j_paxton
Chris Paxton
3 months
There’s nothing like seeing a robot working at home for yourself. It has to move a variety of unusual objects, like the green cactus plushie in this video, and place them intelligently in a very messy world. So on that note, we're opening up OK Robot to the community! A thread:
8
22
151
@chris_j_paxton
Chris Paxton
1 year
LLMs like GPT take input tokens - words - to make predictions. But what do those tokens look like for robotics? In SLAP, we propose a multimodal representation for learning generalizable, language-conditioned robot policies. (short thread)
3
33
150
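For a rough feel of what "tokens for robotics" can mean, here is a toy PyTorch sketch (not the actual SLAP architecture): point-cloud patches and language tokens are embedded into one sequence for a shared transformer:

```python
import torch
import torch.nn as nn

class SpatialLanguageTokens(nn.Module):
    """Toy illustration (not the actual SLAP model): point-cloud patches and
    language tokens become one multimodal sequence for a shared transformer."""
    def __init__(self, d_model=256, vocab=30522, n_heads=4):
        super().__init__()
        self.point_proj = nn.Linear(6, d_model)   # xyz + rgb per point patch
        self.word_embed = nn.Embedding(vocab, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, point_patches, token_ids):
        # point_patches: (B, P, 6) patch centroids + color; token_ids: (B, T)
        pts = self.point_proj(point_patches)      # (B, P, d) spatial tokens
        words = self.word_embed(token_ids)        # (B, T, d) language tokens
        seq = torch.cat([pts, words], dim=1)      # one multimodal sequence
        return self.encoder(seq)                  # contextualized tokens
```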
@chris_j_paxton
Chris Paxton
2 years
Use multi-modal transformers to predict plan success in task and motion planning -- a route to letting robots perform challenging multi-step tasks. Tokenize the plan, initial conditions, and goal. (continuing to share papers that seem cool but aren't mine)
Tweet media one
3
22
150
@chris_j_paxton
Chris Paxton
4 months
Large Language Models (LLMs) and Vision-Language Models (VLMs) are poised to revolutionize robotics. Join our workshop at #ICRA2024 on VLMs/LLMs for scene understanding, decision making, control, and more: Submissions due March 11, 2024!
Tweet media one
4
30
144
@chris_j_paxton
Chris Paxton
4 months
Build a "foundation model" for 6dof pose detection and tracking of arbitrary objects, from my old colleagues at NVIDIA: Interesting objects detected and tracked in clutter, and the AR demo is super cool. 3d foundation models will fuel in home robotics
5
19
140
@chris_j_paxton
Chris Paxton
7 months
There are a large number of papers coming out showing that you can use LLMs to create realistic or interesting motions. Here are some interesting ones. First, RoboTool:
Tweet media one
3
27
139
@chris_j_paxton
Chris Paxton
2 months
There’s a ton of exciting progress in robotics these days. We saw more ICRA 2024 and CVPR 2024 accepted papers, amazing space robots, exciting new progress in humanoids, and a bunch of cool research from various labs. Here are some things you may have missed:
Tweet media one
3
16
129
@chris_j_paxton
Chris Paxton
19 days
this guy is going to single-handedly bring AGI progress to a halt in a way Eliezer could only dream of
@RatOrthodox
Ronny Fernandez 🔍⏸️
19 days
factorio 2 is coming out soon. if you work in frontier model research at open ai, anthropic, or deepmind and would like a free copy, I would be very happy to buy you one! please feel free to reach out. people don't do enough for you guys
56
111
2K
3
14
130
@chris_j_paxton
Chris Paxton
2 years
Interesting idea here: using LLMs to take actions, retrieve information about the world, and do some simple chain-of-thought reasoning. It increasingly seems that large language models are the glue that can combine disparate parts of AI + robotics to build some cool applications
Tweet media one
@ShunyuYao12
Shunyu Yao
2 years
Large Language Models (LLM) are 🔥in 2 ways: 1.🧠Reason via internal thoughts (explain jokes, math reasoning..) 2.💪Act in external worlds (SayCan, ADEPT ACT-1, WebGPT..) But so far 🧠and💪 remain distinct methods/tasks... Why not 🧠+💪? In our new work ReAct, we show 1+1>>2!
10
67
390
3
16
128
@chris_j_paxton
Chris Paxton
3 months
This is by far the most impressive humanoid manipulation video I've seen so far
@BostonDynamics
Boston Dynamics
3 months
Can't trip Atlas up! Our humanoid robot gets ready for real work combining strength, perception, and mobility.
225
1K
5K
12
5
127
@chris_j_paxton
Chris Paxton
2 months
3D multimodal diffusion models let robots perform language-conditioned tasks in the real world.
- combine pretrained 2D features from CLIP with 3D perception
- achieves SOTA on RLBench
- works on the real robot
Website:
1
18
125
@chris_j_paxton
Chris Paxton
1 year
TidyBot, from a team at Princeton:
- ViLD + CLIP + LLMs for object detection and reasoning
- engineered low-level skills for the robot
Yet another paper making me wonder "is robot learning actually dead though?" It seems like classic robotics + vision/language models are all you need
Tweet media one
@_akhaliq
AK
1 year
TidyBot: Personalized Robot Assistance with Large Language Models approach enables fast adaptation and achieves 91.2% accuracy on unseen objects in our benchmark dataset. We also demonstrate our approach on a real-world mobile manipulator called TidyBot, which successfully puts…
33
308
1K
7
19
122
@chris_j_paxton
Chris Paxton
1 month
Actually it's from Hacker News
0
3
121
@chris_j_paxton
Chris Paxton
7 months
So many humanoid startups. The future feels like it's coming fast sometimes
Tweet media one
10
6
115
@chris_j_paxton
Chris Paxton
1 year
"Language as a generic interface between neural networks" - decompose into a set of structured tasks using different networks Cool idea for getting complex behavior out of LLMs
Tweet media one
2
18
117
@chris_j_paxton
Chris Paxton
1 month
Hugging Face is looking to replace ROS with something more performant, based on modern practices. ROS is a behemoth - there's a lot of code for things like drones, ground vehicles, SLAM, etc. that you might not even *want* to include in an effort like this. Pretty excited to…
@Ilir_AI
Ilir
1 month
🔥 𝗛𝗨𝗚𝗚𝗜𝗡𝗚 𝗙𝗔𝗖𝗘 🔥 - aims to replace ROS - 😱 NEW robotics framework - called Dora-rs. ⚡️A super impressive replacement of ROS (the Robot Operating System), for those who know, one of the pain-point to lower the entry barrier in robotics. 🤗Dora-rs is much much…
10
38
193
6
10
116
@chris_j_paxton
Chris Paxton
16 days
Cool low-cost robots I've seen recently; what am I missing:
6
11
114
@chris_j_paxton
Chris Paxton
3 months
If people like this kind of thing, I'll try to do it every week. Unless I have something better to do or I forget
@chris_j_paxton
Chris Paxton
3 months
There’s a ton of exciting progress in robotics these days, from big companies like Google and NVIDIA, to startups and academic labs. Here’s a short thread of some cool robotics things you may have missed from the last week or so:
Tweet media one
1
46
286
14
3
114
@chris_j_paxton
Chris Paxton
3 months
I largely disagree with this; I think it's a classic underestimate of what is actually hard in robotics vs. what learning is good at
@DrJimFan
Jim Fan
3 months
Robotics is a field with lots of historical burdens. Too much poorly engineered software, too many algorithmic patches and outdated mindsets. Foundation models will happen slowly, then all at once before you know it, burying this legacy stuff once and for all. Just like how ChatGPT…
41
80
576
3
6
112
@chris_j_paxton
Chris Paxton
9 months
Potentially one of the most exciting LLM papers I've seen in a while:
- uses ScanNet/Objaverse/HM3D datasets to create a large database of 3D-language data
- gets 3D-projected vision-language features (e.g. (CLIP, x, y, z))
- trains an LLM to align with the 3D features
(toy back-projection sketch below)
Tweet media one
3
29
111
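The (feature, x, y, z) construction is easy to sketch: back-project per-pixel vision-language features through a pinhole camera model. A minimal NumPy version (illustrative only, not the paper's code):

```python
import numpy as np

def lift_features_to_3d(feat_map, depth, fx, fy, cx, cy):
    """Toy version of the (feature, x, y, z) records described above:
    back-project per-pixel features into 3D with a pinhole camera model.
    feat_map: (H, W, D) vision-language features; depth: (H, W) in meters."""
    H, W, D = feat_map.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    z = depth
    x = (u - cx) * z / fx          # pinhole back-projection
    y = (v - cy) * z / fy
    valid = z > 0                  # drop pixels with no depth reading
    xyz = np.stack([x, y, z], axis=-1)[valid]      # (N, 3) points
    feats = feat_map[valid]                        # (N, D) features
    return np.concatenate([feats, xyz], axis=-1)   # (N, D + 3) records
```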
@chris_j_paxton
Chris Paxton
3 years
We are now looking for PhD students to intern with the Robotics Lab at NVIDIA! Help build robots that perceive the world, use language to work with humans, and perform challenging tasks.
1
19
105
@chris_j_paxton
Chris Paxton
1 month
It's always amazing to me how much we talk about end-to-end, and then we see results like this - with robots actually generalizing in a new environment - and it's basically an engineered, classical planning system powered by VLMs. Not that there's anything wrong with that!
Tweet media one
@Ilir_AI
Ilir
1 month
🤖 𝗥𝗢𝗕𝗢𝗧 𝗡𝗔𝗩𝗜𝗚𝗔𝗧𝗜𝗢𝗡 🧭 Finding things in - u n k n o w n - spaces: Vision-Language Frontier Maps (VLFM), a navigation system inspired by human decision-making to find unseen objects in new places. ✅ Identifies exploration frontiers using maps and a…
4
12
71
2
17
108
@chris_j_paxton
Chris Paxton
17 days
Absolutely wild that this kind of thing works at all
@saba_khalilnaji
Saba Khalilnaji
18 days
GPT-4V picks up an object with my robot hand. 1 hour to set up, 0 tuning. Using a laptop camera. Output is wrist angles and grip type. #Robotics
11
16
106
2
6
107
@chris_j_paxton
Chris Paxton
1 month
Why do you need human hands for industrial automation when you can do this?
@Ilir_AI
Ilir
1 month
🤖 Gripper-cushion can grip almost everything ❗ 🖐 Handles sheet metal, liquid tanks, nonwovens, and pipes: FORMHAND Automation GmbH gripper pads are revolutionizing material handling in manufacturing. ✅ Versatile: Capable of gripping diverse materials from sheet metal to…
2
9
50
16
7
105
@chris_j_paxton
Chris Paxton
1 year
I like this paper; it uses LLMs while also pouring some cold water on the LLM hype (they clearly cannot plan, and it's hard to consider something that can't plan to be on the road to AGI, in my opinion). LLMs are good at writing code, though, so generate PDDL and get a plan that way, hopefully
Tweet media one
6
12
102
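Here is a toy sketch of the "LLM writes the PDDL, a classical planner does the planning" recipe; `llm` is a hypothetical text-completion callable, and the planner invocation assumes a local Fast Downward install on your PATH:

```python
import pathlib
import subprocess
import tempfile

def plan_via_pddl(llm, domain_pddl, task_description):
    """Toy version of 'LLM writes PDDL, planner does the planning'.
    `llm` is a hypothetical text-completion callable; the planner call
    assumes a local Fast Downward install (adjust to your setup)."""
    problem_pddl = llm(
        "Write a PDDL problem file for this domain:\n" + domain_pddl +
        "\nTask: " + task_description + "\nReply with PDDL only."
    )
    tmp = pathlib.Path(tempfile.mkdtemp())
    (tmp / "domain.pddl").write_text(domain_pddl)
    (tmp / "problem.pddl").write_text(problem_pddl)
    # The classical planner provides the soundness the LLM lacks.
    subprocess.run(
        ["fast-downward.py", str(tmp / "domain.pddl"),
         str(tmp / "problem.pddl"), "--search", "astar(lmcut())"],
        check=True, cwd=tmp,
    )
    return (tmp / "sas_plan").read_text()  # Fast Downward writes the plan here
```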
@chris_j_paxton
Chris Paxton
5 months
Great stuff (from ViLA):
- common-sense reasoning to search for the object in various locations
- shows how you can connect different skills together with some memory and visual reasoning
1
15
97
@chris_j_paxton
Chris Paxton
1 year
There is a genre of papers which is "I made my robot do a cool thing no one else's can do yet," and it will forever be my favorite
2
3
98
@chris_j_paxton
Chris Paxton
11 months
Honestly, one of my favorite robotics paper videos I've seen this year - it just does so much with no cuts. The skills here are all heuristic (not trained/learned, as far as I can tell), with LLM-based high-level reasoning
@jimmyyhwu
Jimmy Wu
1 year
When organizing a home, everyone has unique preferences for where things go. How can household robots learn your preferences from just a few examples? Introducing 𝗧𝗶𝗱𝘆𝗕𝗼𝘁: Personalized Robot Assistance with Large Language Models Project page:
22
116
529
2
19
95
@chris_j_paxton
Chris Paxton
19 days
Segment Anything with Optical Flow makes for an incredibly good segmentation method. Really useful to know. (toy flow-prompted-SAM sketch below)
@dreamingtulpa
Dreaming Tulpa 🥓👑
20 days
SAM + Optical Flow = FlowSAM FlowSAM can discover and segment moving objects in a video and outperforms all previous approaches by a considerable margin in both single and multi-object benchmarks 🔥
10
233
1K
2
10
93
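As promised, a toy flow-prompted-SAM sketch. To be clear, this is not the FlowSAM method, just the intuition: use dense optical flow to find the moving region, then hand SAM a box prompt over it:

```python
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

def segment_moving_object(frame_prev, frame_next, sam_checkpoint):
    """Toy flow-prompted SAM (NOT the FlowSAM method itself): use dense
    optical flow to find the moving region, then give SAM a box prompt."""
    g0 = cv2.cvtColor(frame_prev, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(frame_next, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(g0, g1, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=-1)
    moving = mag > np.percentile(mag, 95)        # crude motion threshold
    ys, xs = np.nonzero(moving)
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()])

    sam = sam_model_registry["vit_b"](checkpoint=sam_checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(cv2.cvtColor(frame_next, cv2.COLOR_BGR2RGB))
    masks, scores, _ = predictor.predict(box=box)
    return masks[np.argmax(scores)]              # best mask for the mover
```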
@chris_j_paxton
Chris Paxton
2 months
@soft_fox_lad where's the lie
2
1
94
@chris_j_paxton
Chris Paxton
3 months
Love seeing this. We need widely available low-cost robotics for our field to have anything like a "GPT moment" or even an ImageNet moment
@alexkoch_ai
Alexander Koch
3 months
Early results from my AI training runs. I've trained my $200 robot arm on a simple picking task using imitation learning. It has learned to control the robot arm using only camera images and joint states.
94
286
2K
2
6
92
@chris_j_paxton
Chris Paxton
1 month
Honestly, I don't think Apple is the one who will pull this off. Someone else will be the first mover
@SawyerMerritt
Sawyer Merritt
1 month
NEWS: Apple is exploring home robots as its potential ‘next big thing’ after its car efforts failed.
287
93
1K
23
1
92
@chris_j_paxton
Chris Paxton
3 months
Object-centric world representations make a lot of sense for robots interacting with their world, but no one "scale" of object makes sense. So I liked seeing this work, GARField, which learns a scale-conditioned field per scene for object segmentation. Notes:
Tweet media one
@ChungMinKim
Chung Min Kim
4 months
🪴Should the leaves of a plant be considered separate or part of the whole? Answer: it really depends! Points can, and should, belong to multiple groups. With GARField, points can belong to multiple groups, with physical scale 📏 as an extra dimension.
6
24
191
1
13
89
@chris_j_paxton
Chris Paxton
1 year
I'm glad to see people are interested in HACMan. This paper convinced me this was the *right* way of scaling autonomy for dexterous manipulation: predicting actions as interactions with the world and trajectories. Lots of cool potential follow-ups. Writeup:
0
24
89
@chris_j_paxton
Chris Paxton
24 days
Love this
Tweet media one
6
8
90
@chris_j_paxton
Chris Paxton
10 months
Tomorrow at #RSS2023, in session 4: StructDiffusion shows how we can train *multi-modal transformers* that can build structures out of previously-unseen objects in the real world, using object-centric diffusion. Led by @Weiyu_Liu_ w/ @du_yilun, Tucker Hermans, + Sonia Chernova
1
15
89
@chris_j_paxton
Chris Paxton
2 months
@davelu lol so being tall is only an advantage if you're gunning for someone else's job? thanks for the context
1
0
89
@chris_j_paxton
Chris Paxton
11 months
Learning sampling dictionaries for robot motion planning from point clouds, using transformers. Cool work from @ahquresh and others: Motion planning is hard, and genuinely not solved! Always exciting to see work like this
Tweet media one
2
22
75
@chris_j_paxton
Chris Paxton
2 years
Some cool research from MSR: this is what I really think "foundation models" for robots should look like. Learn one huge model to solve real robot problems like localization, mapping, etc.
0
33
86
@chris_j_paxton
Chris Paxton
6 months
Our final CoRL paper, SLAP: Spatial Language Attention Policies, was also presented yesterday, by @Priyam8Parashar. With 3D multimodal transformers, we can learn policies for mobile robots that scale to larger scenes and more complex environments with less data!
3
17
86
@chris_j_paxton
Chris Paxton
9 months
The leaderboard for the HomeRobot challenge is now open at :
- create a docker image containing your agent
- planning or learning, use your favorite strategy
- we plan to run the top 3 on real hardware
- potentially win a robot from @hellorobotinc
Tweet media one
2
23
83
@chris_j_paxton
Chris Paxton
1 year
Setting the table using multi-modal transformers via StructDiffusion! In the end, this is a very hard task + motion planning problem, since we need to figure out where to place objects, as well as how to grasp them and move them.
2
8
85
@chris_j_paxton
Chris Paxton
1 month
Source
1
5
84
@chris_j_paxton
Chris Paxton
1 year
I still really think we need *small data* like this for robotics - your home/office/factory is *yours*, no one else's, and you're not going to get a million hours of data there. And I really like the range of robot platforms and tasks here!
@LerrelPinto
Lerrel Pinto
1 year
While we are going gaga over large models and big data, there is still incredible value left to extract in small models and data, especially in robotics. All the skills shown below were each trained with <1 min of human data and <20 min of online RL 🧵👇
8
64
384
4
9
82
@chris_j_paxton
Chris Paxton
2 years
Does robotics research have much credibility *outside* robotics research? Whenever I talk to laypeople (i.e. friends and family), they usually say something like, "oh, you guys have been promising in-home robots for years and we're nowhere close."
13
4
83
@chris_j_paxton
Chris Paxton
10 months
We (quickly) trained a multi-modal (spatial+language) transformer for low-cost home robots from @hellorobotinc , which can solve tasks like "bring me a bottle" in an unmapped home. Work w/ @Priyam8Parashar @XiaohanZhang220 @viddivj @jdvakil @ybisk More details:
@priyam8parashar
Priyam Parashar
10 months
Robot learning of language and manipulation tasks needs to be sample efficient. SLAP combines language and point-cloud embeddings as spatial-language tokens within a Transformer, to do just that – learn free-form language-conditioned robot policies. 🧵
4
21
142
1
10
82
@chris_j_paxton
Chris Paxton
2 months
I shared this already but wanted to comment a bit. I really like that this thing is:
- fast
- agile
- looks to be purpose-built for moving containers around
- mechanically simple
- probably can be made cheap and robust
All in all, it seems like a really cool and novel design
@KennethCassel
Kenneth Cassel
2 months
favorite warehouse robot from the last few years is the evoBOT
23
123
2K
2
7
81
@chris_j_paxton
Chris Paxton
2 years
Loved the #ICRA2022 robot parade. These things feel like they've come a long way.
Tweet media one
2
11
80
@chris_j_paxton
Chris Paxton
8 months
Love this. Language understanding adds robustness and interpretability to self driving cars
@Jamie_Shotton
Jamie Shotton
8 months
For example, what did the model perceive that made it slow down? #EmbodiedIntelligence #LINGO1 #selfdriving
3
5
42
1
6
80
@chris_j_paxton
Chris Paxton
1 month
@moultano I feel like some of the "enshittification" discourse is pretty interesting, but treating it like some natural property of the system is weird; I feel like there are clearly companies whose products have held up perfectly well
1
0
81
@chris_j_paxton
Chris Paxton
6 months
Let's make robots that can help out in *any* home environment! Check out our poster on *Open Vocabulary Mobile Manipulation* at #CoRL2023 @corl_conf tomorrow 12pm-12:45pm eastern. Presented by the great @yvsriram - applying for PhD programs this fall!
0
17
80
@chris_j_paxton
Chris Paxton
17 days
This chest-mounted teleop setup is very cool. It seems like an extremely dexterous platform as well, capable of performing a ton of different tasks.
@ReazonHILab
Reazon Human Interaction Lab
17 days
𝗥𝗲𝗮𝘇𝗼𝗻𝗖𝗼𝗯𝗼𝘁 is a low-cost and robust household collaborative robot. Check out our first teleoperated dual-armed prototype: chest-mounted control, arm elevation, mobile chassis, and lemons 🍋  (full video → )
11
48
209
4
6
79
@chris_j_paxton
Chris Paxton
1 month
Every time I try to use some simple install instructions for research code from GitHub, I get frustrated. Even with conda etc., it seems like AI code is much less reproducible than you would think.
12
3
79
@chris_j_paxton
Chris Paxton
1 year
0 papers at ICRA this year, RIP (it happens)
3
2
78
@chris_j_paxton
Chris Paxton
4 months
Introducing MultiPLY, a multimodal LLM that predicts action tokens, as well as a 500k-example multimodal dataset with dialogue, audio, touch, etc. They use an object-centric scene representation, powered by ConceptGraphs. Very promising results!
@gan_chuang
Chuang Gan
4 months
Can an LLM engage actively in 3D environments and dynamically gather multisensory interactive data, including visual, audio, tactile, and thermal? Introducing MultiPLY, a multisensory embodied LLM that seamlessly connects words, actions, and perceptions!
5
74
313
2
8
79
@chris_j_paxton
Chris Paxton
4 months
Low cost mobile manipulation in the real world. This is how we scale up robot learning, with real data and widely reproducible hardware!
@_akhaliq
AK
4 months
Adaptive Mobile Manipulation for Articulated Objects In the Open World paper page: Deploying robots in open-ended unstructured environments such as homes has been a long-standing research problem. However, robots are often studied only in closed-off lab…
2
70
300
1
12
79
@chris_j_paxton
Chris Paxton
2 months
Getting high performance -- including generalization and 99%+ reliability -- with a learned model is really difficult. Covariant aims to do it with:
- an 8-billion-parameter transformer trained on text, images, actions, and sensors
- all modalities tokenized into a common space (toy sketch below)
-…
Tweet media one
1
7
77
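A toy PyTorch sketch of "all modalities tokenized into a common space" (not Covariant's RFM-1): each modality gets its own projection into a shared width plus a learned modality embedding, then one transformer trunk:

```python
import torch
import torch.nn as nn

class CommonTokenSpace(nn.Module):
    """Toy illustration of tokenizing modalities into one space (not RFM-1):
    per-modality projections into a shared d_model, plus a learned modality
    embedding, feeding a single transformer trunk."""
    def __init__(self, d_model=512, img_patch=768, action_dim=7, vocab=32000):
        super().__init__()
        self.text = nn.Embedding(vocab, d_model)
        self.image = nn.Linear(img_patch, d_model)     # ViT-style patch features
        self.action = nn.Linear(action_dim, d_model)   # robot actions as tokens
        self.modality = nn.Embedding(3, d_model)       # which stream a token is from
        layer = nn.TransformerEncoderLayer(d_model, 8, batch_first=True)
        self.trunk = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, token_ids, patches, actions):
        toks = [self.text(token_ids) + self.modality.weight[0],
                self.image(patches) + self.modality.weight[1],
                self.action(actions) + self.modality.weight[2]]
        return self.trunk(torch.cat(toks, dim=1))      # one shared sequence
```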
@chris_j_paxton
Chris Paxton
2 months
This is the dream
@xiaolonw
Xiaolong Wang
2 months
I have been cleaning up my daughter's mess for more than two years now. Last weekend our robot came home to do the job for me. 🤖 Our new work on visual whole-body control learns a policy to coordinate the robot's legs and arms for mobile manipulation. See…
23
115
651
2
9
76
@chris_j_paxton
Chris Paxton
5 months
How to build a home robot today:
- multi-object rearrangement
- captures user preferences
- handles multiple rooms
The system diagram here gives a good feel for how complex a system a *reliable* in-home robot actually is
Tweet media one
0
16
77
@chris_j_paxton
Chris Paxton
11 months
Cool work from @imankitgoyal et al. at NVIDIA:
- 3D representations are great but often slow
- instead, take the point cloud, render it to multiple 2D synthetic views, and predict there
- significantly outperforms PerAct and works in the real world on a lot of tasks
(toy multi-view projection sketch below)
Tweet media one
@NVIDIARobotics
NVIDIA Robotics
11 months
Our Seattle #Robotics Lab introduces a new, faster, & more efficient method for teaching robot manipulation tasks in real-life scenarios — like opening drawers or dispensing soap — training 36x faster than the current standard. #NVIDIAResearch Learn more:
0
11
62
2
9
76
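The "render the point cloud to multiple 2D views" trick is easy to sketch. Here is a toy NumPy version using three fixed orthographic cameras (the real method uses carefully chosen virtual views; this only shows the idea):

```python
import numpy as np

def render_orthographic_views(points, colors, img_size=128, bounds=1.0):
    """Render a colored point cloud (points in [-bounds, bounds]^3) to three
    fixed orthographic views: a crude stand-in for learned virtual cameras."""
    views = []
    for drop_axis in (2, 1, 0):                  # top, front, side views
        keep = [a for a in range(3) if a != drop_axis]
        uv = points[:, keep]                     # orthographic projection
        px = ((uv + bounds) / (2 * bounds) * (img_size - 1)).astype(int)
        ok = ((px >= 0) & (px < img_size)).all(axis=1)
        img = np.zeros((img_size, img_size, 3), dtype=np.float32)
        # crude painter's algorithm: points with a larger coordinate along
        # the dropped axis get painted last, overwriting the ones behind
        order = np.argsort(points[ok, drop_axis])
        img[px[ok][order, 1], px[ok][order, 0]] = colors[ok][order]
        views.append(img)
    return views  # predict per view, then lift predictions back into 3D
```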
@chris_j_paxton
Chris Paxton
1 year
Segment Anything has learned a general concept of "objectness." This is huge; a foundation model for image segmentation is itself a huge step toward a foundation model for robotics and embodied AI
@AIatMeta
AI at Meta
1 year
Today we're releasing the Segment Anything Model (SAM) — a step toward the first foundation model for image segmentation. SAM is capable of one-click segmentation of any object from any photo or video + zero-shot transfer to other segmentation tasks ➡️
144
2K
7K
0
7
76
@chris_j_paxton
Chris Paxton
8 months
Article on how Toyota is scaling diffusion policies: 60 tasks including spreading on bread, mixing eggs, etc.
Tweet media one
1
8
76
@chris_j_paxton
Chris Paxton
6 months
Very cool work:
- start with a trained LLM
- learn alignment from frozen vision encoders
- learn to predict actions using this LLM backbone
This feels closer to the "right" way to get general intelligence on robots, since we can leverage web data to learn useful features
@alexttoshev
Alexander Toshev
7 months
Can an LLM be adapted to see and act with on-policy RL for embodied tasks? We show how to adapt a 13B LLM with PPO, and demonstrate strong generalization capabilities on a wide range of Language Rearrangement tasks. Paper and results:
3
18
143
2
3
74
@chris_j_paxton
Chris Paxton
3 years
Arranging unknown objects according to structured language, trained on sim data and applied in the world with no fine tuning. Really excited about this new work with Weiyu Liu, Tucker Hermans, and Dieter Fox.
@_akhaliq
AK
3 years
StructFormer: Learning Spatial Structure for Language-Guided Semantic Rearrangement of Novel Objects abs: project page:
0
6
47
2
8
74
@chris_j_paxton
Chris Paxton
4 months
So we have our OK-Robot paper for pick and place, and now this one for door opening. That said, the methods are very different. This one focuses on very efficient RL using a parameterized action space, which makes grasp predictions relative to "door" points.
@Haoyu_Xiong_
Haoyu Xiong
4 months
Introducing Open-World Mobile Manipulation 🦾🌍 – A full-stack approach for operating articulated objects in open-ended unstructured environments: Unlocking doors with lever handles/ round knobs/ spring-loaded hinges 🔓🚪 Opening cabinets, drawers, and refrigerators 🗄️ 👇…
30
103
784
1
9
74
@chris_j_paxton
Chris Paxton
22 days
I love all the ingenuity people are using to build and operate these cheap robots.
@BartronPolygon
Bart Trzynadlowski
23 days
This might just be crazy enough to work.
10
17
153
2
4
74
@chris_j_paxton
Chris Paxton
2 years
You can now check out the code for our ICRA 2022 paper StructFormer, for rearranging previously unseen objects based on structured language commands:
1
13
72
@chris_j_paxton
Chris Paxton
1 year
Won outstanding paper award at LangRob - CoRL 2022. Awesome work by @notmahi
@notmahi
Mahi Shafiullah @ ICRA 2024 🏡🤖
2 years
How can we train data-efficient robots that can respond to open-ended queries like “warm up my lunch” or “find a blue book”? Introducing CLIP-Field, a semantic neural field trained w/ NO human labels & only w/ web-data pretrained detectors, VLMs, and LLMs
4
67
319
2
1
70
@chris_j_paxton
Chris Paxton
10 months
A live demonstration of this older paper was maybe the most impressive thing I saw at the most recent #RSS2023 conference: the results in the paper don't do it justice. Hold up a transparent water bottle in front of a RealSense, get perfect depth...
Tweet media one
3
14
71
@chris_j_paxton
Chris Paxton
1 month
Apple home robotics from Bloomberg:
- looking at telepresence on an arm (sort of like Facebook's canceled Portal? seems like a waste)
- telepresence follower robots
- home assistants, robots for your chores (most exciting)
I think Apple is really well positioned to do this; big…
6
6
70
@chris_j_paxton
Chris Paxton
2 months
So, not to pick on Agility here, because I love their robot and everything they've shown, but watch the feet in this video, between 1 and 2 seconds, to get an idea of why getting sim-to-real for manipulation to work well is so hard
Tweet media one
@agilityrobotics
Agility Robotics
2 months
Collaborating with @NVIDIARobotics accelerates Digit's ability to learn new skills. In this case, it was pulling freshly cooked GPUs out of the oven based on a voice command & LLM integration. #SimToReal
1
7
54
7
4
69
@chris_j_paxton
Chris Paxton
2 months
Real classic Silicon Valley energy here: building a cool-looking robot in your garage.
@Scobleizer
Robert Scoble
2 months
EXCLUSIVE FIRST LOOK: Silicon Valley garage startup building a humanoid robot. $8,000 BOM on robot. @kscalelabs . 3D printed carbon fiber. 26 high-torque electric motors that were stuck in customs for a month. Pre-launch. Will be first shown at @ycombinator demo day in…
363
656
4K
2
2
70