Chris Paxton

@chris_j_paxton

8,071 Followers · 1,518 Following · 335 Media · 3,998 Statuses

Mostly posting about robots. Embodied AI @hellorobotinc , formerly @AIatMeta , @NVIDIAAI , @zoox . All views my own.

Pittsburgh, PA
Joined December 2013
Pinned Tweet
@chris_j_paxton
Chris Paxton
26 days
Some personal news: I’m excited to be joining Hello Robot, to lead their Embodied AI effort! For years now, I’ve been a passionate supporter of their vision of affordable, useful robots that can help people out with their day-to-day lives. I’ve previously worked at FAIR, part of…
Tweet media one
77
11
535
@chris_j_paxton
Chris Paxton
1 month
lmao, why do people do this
Tweet media one
50
515
7K
@chris_j_paxton
Chris Paxton
2 months
@Jetskigrizzly Honestly, at most jobs I've worked, you'd be able to get that $12 reimbursed much more easily than $1 of real pennies. No idea how it works for teachers.
25
0
2K
@chris_j_paxton
Chris Paxton
1 month
@taknil Hacker News was suggesting it was a honeypot
Tweet media one
5
8
600
@chris_j_paxton
Chris Paxton
22 days
It's genuinely hard to believe a 70B model is up there with the 1.8T GPT-4? I guess training data really is everything
@Teknium1
Teknium (e/λ)
22 days
Welp folks, we have gpt-4 at home
Tweet media one
152
377
5K
19
47
575
@chris_j_paxton
Chris Paxton
1 month
So big news: today was my last day at Meta. It's bittersweet, since Meta has some of the smartest and hardest-working people in the world, and I learned a lot there. But I'm excited to work on something new that I'm passionate about, which won't be a surprise if you follow me.
47
4
470
@chris_j_paxton
Chris Paxton
2 months
So does Tesla Full Self-Driving just kind of work now? As a non-Tesla owner, I was super impressed
14
16
341
@chris_j_paxton
Chris Paxton
1 month
@sureailabs They're all different top-level domains; it feels like a gray area to me?
4
0
330
@chris_j_paxton
Chris Paxton
11 months
The future of robot butlers starts with mobile manipulation. We’re announcing the NeurIPS 2023 Open-Vocabulary Mobile Manipulation Challenge!
- Full robot stack ✅
- Parallel sim and real evaluation ✅
- No robot required ✅👀
4
74
323
@chris_j_paxton
Chris Paxton
3 months
There’s a ton of exciting progress in robotics these days, from big companies like Google and NVIDIA, to startups and academic labs. Here’s a short thread of some cool robotics things you may have missed from the last week or so:
Tweet media one
1
46
286
@chris_j_paxton
Chris Paxton
1 year
As a roboticist I think it's a bit disappointing how many of the flashiest deep learning results seem to depend on massive amounts of really good, clean data (Dreamer, ChatGPT) - as opposed to mountains of unlabeled garbage and a handful of good examples
10
8
252
@chris_j_paxton
Chris Paxton
2 years
This is a really cool idea:
1. Segment objects and generate a text description of the scene
2. Use the language prompt with DALL-E to generate a goal image
3. Segment again, match features using CLIP, then use pose estimation to plan a sequence of pick-and-place actions
(toy sketch of the data flow below)
@_akhaliq
AK
2 years
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics abs: project page:
Tweet media one
6
107
770
0
34
245
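To make the pipeline shape concrete, here is a toy Python sketch of that data flow. Every helper is a hypothetical stand-in passed by the caller; this is not the DALL-E-Bot code, just the structure of the idea:

```python
# Toy sketch of the pipeline above. Every argument after the image is a
# hypothetical stand-in for a real model; only the data flow is the point.
def dalle_bot_step(rgb_image,
                   segment_objects,      # image -> list of object masks
                   describe_scene,       # image + masks -> text caption
                   generate_goal_image,  # text -> image (DALL-E-style)
                   match_with_clip,      # match objects across two mask sets
                   estimate_pose,        # matched pair -> relative pose
                   plan_pick_place):     # object + pose -> robot action
    masks = segment_objects(rgb_image)           # 1. segment the scene
    prompt = describe_scene(rgb_image, masks)    #    and describe it in text
    goal_image = generate_goal_image(prompt)     # 2. imagine the goal image
    goal_masks = segment_objects(goal_image)     # 3. segment the goal too
    actions = []
    for scene_obj, goal_obj in match_with_clip(masks, goal_masks):
        rel_pose = estimate_pose(scene_obj, goal_obj)
        actions.append(plan_pick_place(scene_obj, rel_pose))
    return actions                               # a pick-and-place sequence
```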
@chris_j_paxton
Chris Paxton
2 years
The rise of LLMs and foundation models in robotics will be the most influential and important development in our field this year (presentation by Aleksandra Faust of Google)
Tweet media one
4
34
238
@chris_j_paxton
Chris Paxton
2 years
Interested in a fall internship in robot learning at Meta AI, particularly using language or for multi-step tasks? Send me a message!
14
30
217
@chris_j_paxton
Chris Paxton
2 months
Diffusion has a lot of potential to be the future of planning:
- better at long-horizon problems than straight feed-forward learning
- plans can be generated from imperfect sensor data, unlike classical methods
- generates diverse "plans" before execution so we can check for safety
(toy denoising-loop sketch below)
@dkanou
Dimitrios Kanoulas
2 months
Creating 2D maze paths using diffusion! All applied to legged robot navigation. #ICRA2024 is coming soon, and @JianweiLiu93 and Mania will present the work in Japan! Paper: Video:
Tweet media one
2
27
158
8
19
214
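For intuition, here is a minimal sketch of diffusion-as-planning: a whole trajectory is denoised at once with standard DDPM algebra, where `denoiser` stands in for a hypothetical trained noise-prediction model:

```python
import numpy as np

def diffusion_plan(denoiser, horizon=64, state_dim=2, steps=100, seed=0):
    """Sample one trajectory-level plan via a DDPM-style reverse process.
    `denoiser(traj, t)` is a hypothetical trained model that predicts the
    noise in `traj` at diffusion step t; the rest is standard DDPM algebra."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)

    traj = rng.standard_normal((horizon, state_dim))   # start from pure noise
    for t in reversed(range(steps)):
        eps = denoiser(traj, t)                        # predicted noise
        mean = (traj - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps) \
               / np.sqrt(alphas[t])
        noise = rng.standard_normal(traj.shape) if t > 0 else 0.0
        traj = mean + np.sqrt(betas[t]) * noise
    return traj  # sample several plans like this, then filter for safety
```

Because sampling is cheap to repeat, you can draw many diverse plans and screen them before execution, which is the safety-checking point above.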
@chris_j_paxton
Chris Paxton
1 month
Robotics foundation model is a term you will hear a lot over the next year
@CovariantAI
Covariant
1 month
Large language models are trained to predict the next token in text. For robotics, this means training models on physical interaction datasets to build generalized AI that can simulate the physical world. We've built RFM-1, the first commercial Robotics Foundation Model. RFM-1…
9
35
204
4
17
210
@chris_j_paxton
Chris Paxton
2 years
If scale really is all you need (for machine learning), how come Tesla self-driving isn't that good? They have so much data
33
4
203
@chris_j_paxton
Chris Paxton
1 year
@DrJimFan Maybe, but anyone can say their secret thing is great. OpenAI is the only one who's proven it. At some point, papers and benchmarks don't matter; putting it in the hands of ordinary people and seeing it not blow up makes a way more powerful statement
10
8
195
@chris_j_paxton
Chris Paxton
2 months
Cheap robots + learning will change the world
@alexkoch_ai
Alexander Koch
2 months
Folding clothes with $250 robot arms. I've added another motor to improve mobility and extend the reach. The CAD files and the code are public at: (video at 2x speed)
129
389
3K
4
17
196
@chris_j_paxton
Chris Paxton
24 days
I like this advice. If I were getting started as a robotics student these days, I think I would install ManiSkill2 and try to get some learning stuff working. I like that it has great simulations of "real" robot sensors, is fast, and is easy to start with.
@JohnVial
Dr. John Vial
25 days
Robotics Students: Stop Doing Basic Tutorials, Start Doing Projects
Tweet media one
7
27
260
4
29
188
@chris_j_paxton
Chris Paxton
2 years
So, some really big news: this is my last week at NVIDIA! Starting next week, I'll be with @MetaAI in Pittsburgh, working on learning representations for long-horizon robot planning from perception. I've loved working at NVIDIA and am excited for the next thing.
5
3
179
@chris_j_paxton
Chris Paxton
6 months
In personal news, major development in autonomous intelligent agents achieved (we had our first baby 11/11, going on paternity leave, won't be as responsive or on Twitter as much)
24
1
181
@chris_j_paxton
Chris Paxton
2 years
Convergent evolution in the Dyson video (left) vs. Google (center) and Hello Robot (right). I think this is clearly (roughly) what the first in-home robots will look like
Tweet media one
8
15
165
@chris_j_paxton
Chris Paxton
4 months
Lots of roboticists talk about how inspiring watching babies learn is. Total nonsense, worst manipulation learning algorithm I've ever seen. 2 months of basically constant supervision and no applications to real-world tasks at all. It's like grad school all over again
21
1
158
@chris_j_paxton
Chris Paxton
2 months
There’s a lot of cool stuff going on in robotics research these days. Here’s some cool stuff -- focusing on research papers -- that you might have missed, including work from Meta, TRI, and academia. A thread:
Tweet media one
2
28
155
@chris_j_paxton
Chris Paxton
1 month
@sureailabs Actually I think you're right, it must be ignoring robots.txt, or it wouldn't have suddenly gotten stuck like this, right?
4
1
157
@chris_j_paxton
Chris Paxton
10 months
- Large language models are becoming the standard way to program
- Most robotics programs are very simple at the high level (go here, grab that)
- So I guess replace all your state machines with GPT already
- Great tools + examples in several robotics domains
(toy state-machine-vs-LLM sketch below)
@_akhaliq
AK
10 months
ChatGPT for Robotics: Design Principles and Model Abilities paper page: The paper presents an experimental study regarding the use of OpenAI's ChatGPT for robotics applications. We outline a strategy that combines design principles for prompt engineering and…
Tweet media one
14
136
549
6
30
155
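As a toy illustration of the "replace your state machine with an LLM" point, here is a sketch; `llm` is a hypothetical text-completion callable and the skills are stand-in callables, not any real robot API:

```python
# Toy contrast: a hand-written high-level state machine vs. an LLM deciding
# the same high-level steps. `skills` maps names to stand-in callables, and
# `llm` is a hypothetical text-completion function -- not any real API.

def classic_state_machine(skills):
    # The usual hand-coded logic: go here, grab that.
    skills["goto"]("kitchen")
    if skills["detect"]("bottle"):
        skills["grasp"]("bottle")
        skills["goto"]("user")
        skills["handover"]("")

def llm_high_level(llm, skills, task="bring me a bottle"):
    prompt = (
        "You control a robot with skills: goto <place>, detect <obj>, "
        f"grasp <obj>, handover. Task: {task}. "
        "Reply with one skill call per line, e.g. 'goto kitchen'."
    )
    for line in llm(prompt).splitlines():
        name, _, arg = line.strip().partition(" ")
        if name in skills:          # only execute skills we actually have
            skills[name](arg)
```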
@chris_j_paxton
Chris Paxton
3 months
There’s nothing like seeing a robot working at home for yourself. It has to move a variety of unusual objects, like the green cactus plushie in this video, and place them intelligently in a very messy world. So on that note, we're opening up OK Robot to the community! A thread:
8
22
151
@chris_j_paxton
Chris Paxton
1 year
LLMs like GPT take input tokens - words - to make predictions. But what do those tokens look like for robotics? In SLAP, we propose a multimodal representation for learning generalizable, language-conditioned robot policies. (short thread)
3
33
150
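For a rough feel of what "tokens for robotics" can mean, here is a toy PyTorch sketch (not the actual SLAP architecture): point-cloud patches and language tokens are embedded into one sequence for a shared transformer:

```python
import torch
import torch.nn as nn

class SpatialLanguageTokens(nn.Module):
    """Toy illustration (not the actual SLAP model): point-cloud patches and
    language tokens become one multimodal sequence for a shared transformer."""
    def __init__(self, d_model=256, vocab=30522, n_heads=4):
        super().__init__()
        self.point_proj = nn.Linear(6, d_model)   # xyz + rgb per point patch
        self.word_embed = nn.Embedding(vocab, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, point_patches, token_ids):
        # point_patches: (B, P, 6) patch centroids + color; token_ids: (B, T)
        pts = self.point_proj(point_patches)      # (B, P, d) spatial tokens
        words = self.word_embed(token_ids)        # (B, T, d) language tokens
        seq = torch.cat([pts, words], dim=1)      # one multimodal sequence
        return self.encoder(seq)                  # contextualized tokens
```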
@chris_j_paxton
Chris Paxton
2 years
Use multi-modal transformers to predict plan success in task and motion planning -- a route to letting robots perform challenging multi-step tasks. Tokenize the plan, initial conditions, and goal. (continuing to share papers that seem cool but aren't mine)
Tweet media one
3
22
150
@chris_j_paxton
Chris Paxton
4 months
Large Language Models (LLMs) and Vision-Language Models (VLMs) are poised to revolutionize robotics. Join our workshop at #ICRA2024 on VLMs/LLMs for scene understanding, decision making, control, and more: Submissions due March 11, 2024!
Tweet media one
4
30
144
@chris_j_paxton
Chris Paxton
4 months
Build a "foundation model" for 6dof pose detection and tracking of arbitrary objects, from my old colleagues at NVIDIA: Interesting objects detected and tracked in clutter, and the AR demo is super cool. 3d foundation models will fuel in home robotics
5
19
140
@chris_j_paxton
Chris Paxton
7 months
There are a large number of papers coming out showing that you can use LLMs to create realistic or interesting motions. Here are some interesting ones. First, RoboTool:
Tweet media one
3
27
139
@chris_j_paxton
Chris Paxton
2 months
There’s a ton of exciting progress in robotics these days. We saw more ICRA 2024 and CVPR 2024 accepted papers, amazing space robots, exciting new progress in humanoids, and a bunch of cool research from various labs. Here are some things you may have missed:
Tweet media one
3
16
129
@chris_j_paxton
Chris Paxton
19 days
this guy is going to single-handedly bring AGI progress to a halt in a way Eliezer could only dream of
@RatOrthodox
Ronny Fernandez 🔍⏸️
19 days
factorio 2 is coming out soon. if you work in frontier model research at open ai, anthropic, or deepmind and would like a free copy, I would be very happy to buy you one! please feel free to reach out. people don't do enough for you guys
56
111
2K
3
14
130
@chris_j_paxton
Chris Paxton
2 years
Interesting idea here: using LLMs to take actions, retrieve information about the world, and do some simple chain-of-thought reasoning. It increasingly seems that large language models are the glue that can combine disparate parts of AI + robotics to build some cool applications
Tweet media one
@ShunyuYao12
Shunyu Yao
2 years
Large Language Models (LLM) are 🔥in 2 ways: 1.🧠Reason via internal thoughts (explain jokes, math reasoning..) 2.💪Act in external worlds (SayCan, ADEPT ACT-1, WebGPT..) But so far 🧠and💪 remain distinct methods/tasks... Why not 🧠+💪? In our new work ReAct, we show 1+1>>2!
10
67
390
3
16
128
@chris_j_paxton
Chris Paxton
3 months
This is by far the most impressive humanoid manipulation video I've seen so far
@BostonDynamics
Boston Dynamics
3 months
Can't trip Atlas up! Our humanoid robot gets ready for real work combining strength, perception, and mobility.
225
1K
5K
12
5
127
@chris_j_paxton
Chris Paxton
2 months
3D multimodal diffusion models let robots perform language-conditioned tasks in the real world.
- combine pretrained 2D features from CLIP with 3D perception
- achieves SOTA on RLBench
- works on the real robot
Website:
1
18
125
@chris_j_paxton
Chris Paxton
1 year
TidyBot, from a team at Princeton:
- ViLD + CLIP + LLMs for object detection and reasoning
- engineered low-level skills for the robot
Yet another paper making me wonder "is robot learning actually dead though?" It seems like classic robotics + vision/language models are all you need
Tweet media one
@_akhaliq
AK
1 year
TidyBot: Personalized Robot Assistance with Large Language Models approach enables fast adaptation and achieves 91.2% accuracy on unseen objects in our benchmark dataset. We also demonstrate our approach on a real-world mobile manipulator called TidyBot, which successfully puts…
33
308
1K
7
19
122
@chris_j_paxton
Chris Paxton
1 month
Actually it's from Hacker News
0
3
121
@chris_j_paxton
Chris Paxton
7 months
So many humanoid startups. The future feels like it's coming fast sometimes
Tweet media one
10
6
115
@chris_j_paxton
Chris Paxton
1 year
"Language as a generic interface between neural networks" - decompose into a set of structured tasks using different networks Cool idea for getting complex behavior out of LLMs
Tweet media one
2
18
117
@chris_j_paxton
Chris Paxton
1 month
Hugging Face is looking to replace ROS with something more performant, based on modern practices. ROS is a behemoth - there's a lot of code for things like drones, ground vehicles, SLAM, etc. that you might not even *want* to include in an effort like this. Pretty excited to…
@Ilir_AI
Ilir
1 month
🔥 𝗛𝗨𝗚𝗚𝗜𝗡𝗚 𝗙𝗔𝗖𝗘 🔥 - aims to replace ROS - 😱 NEW robotics framework - called Dora-rs. ⚡️A super impressive replacement of ROS (the Robot Operating System), for those who know, one of the pain-point to lower the entry barrier in robotics. 🤗Dora-rs is much much…
10
38
193
6
10
116
@chris_j_paxton
Chris Paxton
16 days
Cool low-cost robots I've seen recently; what am I missing:
6
11
114
@chris_j_paxton
Chris Paxton
3 months
If people like this kind of thing, I'll try to do it every week. Unless I have something better to do or I forget
@chris_j_paxton
Chris Paxton
3 months
There’s a ton of exciting progress in robotics these days, from big companies like Google and NVIDIA, to startups and academic labs. Here’s a short thread of some cool robotics things you may have missed from the last week or so:
Tweet media one
1
46
286
14
3
114
@chris_j_paxton
Chris Paxton
3 months
I largely disagree with this; I think it's a classic underestimate of what is actually hard in robotics vs. what learning is good at
@DrJimFan
Jim Fan
3 months
Robotics is a field with lots of historical burdens. Too much poorly engineered software, too many algorithmic patches and outdated mindsets. Foundation models will happen slowly, then all at once before you know it, burying this legacy stuff once and for all. Just like how ChatGPT…
41
80
576
3
6
112
@chris_j_paxton
Chris Paxton
9 months
Potentially one of the most exciting LLM papers I've seen in a while:
- uses ScanNet/Objaverse/HM3D datasets to create a large database of 3D-language data
- gets 3D-projected vision-language features (e.g. (CLIP, x, y, z))
- trains an LLM to align with the 3D features
(toy back-projection sketch below)
Tweet media one
3
29
111
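The (feature, x, y, z) construction is easy to sketch: back-project per-pixel vision-language features through a pinhole camera model. A minimal NumPy version (illustrative only, not the paper's code):

```python
import numpy as np

def lift_features_to_3d(feat_map, depth, fx, fy, cx, cy):
    """Toy version of the (feature, x, y, z) records described above:
    back-project per-pixel features into 3D with a pinhole camera model.
    feat_map: (H, W, D) vision-language features; depth: (H, W) in meters."""
    H, W, D = feat_map.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    z = depth
    x = (u - cx) * z / fx          # pinhole back-projection
    y = (v - cy) * z / fy
    valid = z > 0                  # drop pixels with no depth reading
    xyz = np.stack([x, y, z], axis=-1)[valid]      # (N, 3) points
    feats = feat_map[valid]                        # (N, D) features
    return np.concatenate([feats, xyz], axis=-1)   # (N, D + 3) records
```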
@chris_j_paxton
Chris Paxton
3 years
We are now looking for PhD students to intern with the Robotics Lab at NVIDIA! Help build robots that perceive the world, use language to work with humans, and perform challenging tasks.
1
19
105
@chris_j_paxton
Chris Paxton
1 month
It's always amazing to me how much we talk about end-to-end, and then we see results like this - with robots actually generalizing in a new environment - and it's basically an engineered, classical planning system powered by VLMs. Not that there's anything wrong with that!
Tweet media one
@Ilir_AI
Ilir
1 month
🤖 𝗥𝗢𝗕𝗢𝗧 𝗡𝗔𝗩𝗜𝗚𝗔𝗧𝗜𝗢𝗡 🧭 Finding things in - u n k n o w n - spaces: Vision-Language Frontier Maps (VLFM), a navigation system inspired by human decision-making to find unseen objects in new places. ✅ Identifies exploration frontiers using maps and a…
4
12
71
2
17
108
@chris_j_paxton
Chris Paxton
17 days
Absolutely wild that this kind of thing works at all
@saba_khalilnaji
Saba Khalilnaji
18 days
GPT-4V picks up an object with my robot hand. 1 hour to set up, 0 tuning. Using a laptop camera. Output is wrist angles and grip type. #Robotics
11
16
106
2
6
107
@chris_j_paxton
Chris Paxton
1 month
Why do you need human hands for industrial automation when you can do this?
@Ilir_AI
Ilir
1 month
🤖 Gripper-cushion can grip almost everything ❗ 🖐 Handles sheet metal, liquid tanks, nonwovens, and pipes: FORMHAND Automation GmbH gripper pads are revolutionizing material handling in manufacturing. ✅ Versatile: Capable of gripping diverse materials from sheet metal to…
2
9
50
16
7
105
@chris_j_paxton
Chris Paxton
1 year
I like this paper; it uses LLMs while also pouring some cold water on the LLM hype (they clearly cannot plan, and it's hard to consider something that can't plan to be on the road to AGI, in my opinion). LLMs are good at writing code, though, so generate PDDL and get a plan that way, hopefully
Tweet media one
6
12
102
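Here is a toy sketch of the "LLM writes the PDDL, a classical planner does the planning" recipe; `llm` is a hypothetical text-completion callable, and the planner invocation assumes a local Fast Downward install on your PATH:

```python
import pathlib
import subprocess
import tempfile

def plan_via_pddl(llm, domain_pddl, task_description):
    """Toy version of 'LLM writes PDDL, planner does the planning'.
    `llm` is a hypothetical text-completion callable; the planner call
    assumes a local Fast Downward install (adjust to your setup)."""
    problem_pddl = llm(
        "Write a PDDL problem file for this domain:\n" + domain_pddl +
        "\nTask: " + task_description + "\nReply with PDDL only."
    )
    tmp = pathlib.Path(tempfile.mkdtemp())
    (tmp / "domain.pddl").write_text(domain_pddl)
    (tmp / "problem.pddl").write_text(problem_pddl)
    # The classical planner provides the soundness the LLM lacks.
    subprocess.run(
        ["fast-downward.py", str(tmp / "domain.pddl"),
         str(tmp / "problem.pddl"), "--search", "astar(lmcut())"],
        check=True, cwd=tmp,
    )
    return (tmp / "sas_plan").read_text()  # Fast Downward writes the plan here
```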
@chris_j_paxton
Chris Paxton
5 months
Great stuff (from ViLA):
- common-sense reasoning to search for the object in various locations
- shows how you can connect different skills together with some memory and visual reasoning
1
15
97
@chris_j_paxton
Chris Paxton
1 year
There is a genre of papers which is "I made my robot do a cool thing no one else's can do yet," and it will forever be my favorite
2
3
98
@chris_j_paxton
Chris Paxton
11 months
Honestly, one of my favorite robotics paper videos I've seen this year - it just does so much with no cuts. The skills here are all heuristic (not trained/learned, as far as I can tell), with LLM-based high-level reasoning
@jimmyyhwu
Jimmy Wu
1 year
When organizing a home, everyone has unique preferences for where things go. How can household robots learn your preferences from just a few examples? Introducing 𝗧𝗶𝗱𝘆𝗕𝗼𝘁: Personalized Robot Assistance with Large Language Models Project page:
22
116
529
2
19
95
@chris_j_paxton
Chris Paxton
19 days
Segment Anything with Optical Flow makes for an incredibly good segmentation method. Really useful to know. (toy flow-prompted-SAM sketch below)
@dreamingtulpa
Dreaming Tulpa 🥓👑
20 days
SAM + Optical Flow = FlowSAM FlowSAM can discover and segment moving objects in a video and outperforms all previous approaches by a considerable margin in both single and multi-object benchmarks 🔥
10
233
1K
2
10
93
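As promised, a toy flow-prompted-SAM sketch. To be clear, this is not the FlowSAM method, just the intuition: use dense optical flow to find the moving region, then hand SAM a box prompt over it:

```python
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

def segment_moving_object(frame_prev, frame_next, sam_checkpoint):
    """Toy flow-prompted SAM (NOT the FlowSAM method itself): use dense
    optical flow to find the moving region, then give SAM a box prompt."""
    g0 = cv2.cvtColor(frame_prev, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(frame_next, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(g0, g1, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=-1)
    moving = mag > np.percentile(mag, 95)        # crude motion threshold
    ys, xs = np.nonzero(moving)
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()])

    sam = sam_model_registry["vit_b"](checkpoint=sam_checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(cv2.cvtColor(frame_next, cv2.COLOR_BGR2RGB))
    masks, scores, _ = predictor.predict(box=box)
    return masks[np.argmax(scores)]              # best mask for the mover
```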
@chris_j_paxton
Chris Paxton
2 months
@soft_fox_lad where's the lie
2
1
94
@chris_j_paxton
Chris Paxton
3 months
Love seeing this. We need widely available low-cost robotics for our field to have anything like a "GPT moment" or even an ImageNet moment
@alexkoch_ai
Alexander Koch
3 months
Early results from my AI training runs. I've trained my $200 robot arm on a simple picking task using imitation learning. It has learned to control the robot arm using only camera images and joint states.
94
286
2K
2
6
92
@chris_j_paxton
Chris Paxton
1 month
Honestly, I don't think Apple is the one who will pull this off. Someone else will be the first mover
@SawyerMerritt
Sawyer Merritt
1 month
NEWS: Apple is exploring home robots as its potential ‘next big thing’ after its car efforts failed.
287
93
1K
23
1
92
@chris_j_paxton
Chris Paxton
3 months
Object-centric world representations make a lot of sense for robots interacting with their world, but no one "scale" of object makes sense. So I liked seeing this work, GARField, which learns a scale-conditioned field per scene for object segmentation. Notes:
Tweet media one
@ChungMinKim
Chung Min Kim
4 months
🪴Should the leaves of a plant be considered separate or part of the whole? Answer: it really depends! Points can, and should, belong to multiple groups. With GARField, points can belong to multiple groups, with physical scale 📏 as an extra dimension.
6
24
191
1
13
89
@chris_j_paxton
Chris Paxton
1 year
I'm glad to see people are interested in HACMan. This paper convinced me this was the *right* way of scaling autonomy for dexterous manipulation: predicting actions as interactions with the world and trajectories. Lots of cool potential follow-ups. Writeup:
0
24
89
@chris_j_paxton
Chris Paxton
24 days
Love this
Tweet media one
6
8
90
@chris_j_paxton
Chris Paxton
10 months
Tomorrow at #RSS2023, in session 4: StructDiffusion shows how we can train *multi-modal transformers* that can build structures out of previously-unseen objects in the real world, using object-centric diffusion. Led by @Weiyu_Liu_ w/ @du_yilun, Tucker Hermans, + Sonia Chernova
1
15
89
@chris_j_paxton
Chris Paxton
2 months
@davelu lol so being tall is only an advantage if you're gunning for someone else's job? thanks for the context
1
0
89
@chris_j_paxton
Chris Paxton
11 months
Learning sampling dictionaries for robot motion planning from point clouds, using transformers. Cool work from @ahquresh and others: Motion planning is hard, and genuinely not solved! Always exciting to see work like this
Tweet media one
2
22
75
@chris_j_paxton
Chris Paxton
2 years
Some cool research from MSR: this is what I really think "foundation models" for robots should look like. Learn one huge model to solve real robot problems like localization, mapping, etc.
0
33
86
@chris_j_paxton
Chris Paxton
6 months
Our final CoRL paper, SLAP: Spatial Language Attention Policies, was also presented yesterday, by @Priyam8Parashar. With 3D multimodal transformers, we can learn policies for mobile robots that scale to larger scenes and more complex environments with less data!
3
17
86
@chris_j_paxton
Chris Paxton
9 months
The leaderboard for the HomeRobot challenge is now open at :
- create a docker image containing your agent
- planning or learning, use your favorite strategy
- we plan to run the top 3 on real hardware
- potentially win a robot from @hellorobotinc
Tweet media one
2
23
83
@chris_j_paxton
Chris Paxton
1 year
Setting the table using multi-modal transformers via StructDiffusion! In the end, this is a very hard task + motion planning problem, since we need to figure out where to place objects, as well as how to grasp them and move them.
2
8
85
@chris_j_paxton
Chris Paxton
1 month
Source
1
5
84
@chris_j_paxton
Chris Paxton
1 year
I still really think we need *small data* like this for robotics - your home/office/factory is *yours*, no one else's, and you're not going to get a million hours of data there. And I really like the range of robot platforms and tasks here!
@LerrelPinto
Lerrel Pinto
1 year
While we are going gaga over large models and big data, there is still incredible value left to extract in small models and data, especially in robotics. All the skills shown below were each trained with <1 min of human data and <20 min of online RL 🧵👇
8
64
384
4
9
82
@chris_j_paxton
Chris Paxton
2 years
Does robotics research have much credibility *outside* robotics research? Whenever I talk to laypeople (i.e. friends and family), they usually say something like, "oh, you guys have been promising in-home robots for years and we're nowhere close."
13
4
83
@chris_j_paxton
Chris Paxton
10 months
We (quickly) trained a multi-modal (spatial+language) transformer for low-cost home robots from @hellorobotinc , which can solve tasks like "bring me a bottle" in an unmapped home. Work w/ @Priyam8Parashar @XiaohanZhang220 @viddivj @jdvakil @ybisk More details:
@priyam8parashar
Priyam Parashar
10 months
Robot learning of language and manipulation tasks needs to be sample efficient. SLAP combines language and point-cloud embeddings as spatial-language tokens within a Transformer, to do just that – learn free-form language-conditioned robot policies. 🧵
4
21
142
1
10
82
@chris_j_paxton
Chris Paxton
2 months
I shared this already but wanted to comment a bit. I really like that this thing is:
- fast
- agile
- looks to be purpose-built for moving containers around
- mechanically simple
- probably can be made cheap and robust
All in all, it seems like a really cool and novel design
@KennethCassel
Kenneth Cassel
2 months
favorite warehouse robot from the last few years is the evoBOT
23
123
2K
2
7
81
@chris_j_paxton
Chris Paxton
2 years
Loved the #ICRA2022 robot parade. These things feel like they've come a long way.
Tweet media one
2
11
80
@chris_j_paxton
Chris Paxton
8 months
Love this. Language understanding adds robustness and interpretability to self driving cars
@Jamie_Shotton
Jamie Shotton
8 months
For example, what did the model perceive that made it slow down? #EmbodiedIntelligence #LINGO1 #selfdriving
3
5
42
1
6
80
@chris_j_paxton
Chris Paxton
1 month
@moultano I feel like some of the "enshittification" discourse is pretty interesting, but treating it like some natural property of the system is weird; I feel like there are clearly companies whose products have held up perfectly well
1
0
81
@chris_j_paxton
Chris Paxton
6 months
Let's make robots that can help out in *any* home environment! Check out our poster on *Open Vocabulary Mobile Manipulation* at #CoRL2023 @corl_conf tomorrow 12pm-12:45pm eastern. Presented by the great @yvsriram - applying for PhD programs this fall!
0
17
80
@chris_j_paxton
Chris Paxton
17 days
This chest-mounted teleop setup is very cool. It seems like an extremely dexterous platform as well, capable of performing a ton of different tasks.
@ReazonHILab
Reazon Human Interaction Lab
17 days
𝗥𝗲𝗮𝘇𝗼𝗻𝗖𝗼𝗯𝗼𝘁 is a low-cost and robust household collaborative robot. Check out our first teleoperated dual-armed prototype: chest-mounted control, arm elevation, mobile chassis, and lemons 🍋  (full video → )
11
48
209
4
6
79
@chris_j_paxton
Chris Paxton
1 month
Every time I try to use some simple install instructions for research code from GitHub, I get frustrated. Even with conda etc., it seems like AI code is much less reproducible than you would think.
12
3
79
@chris_j_paxton
Chris Paxton
1 year
0 papers at ICRA this year, RIP (it happens)
3
2
78
@chris_j_paxton
Chris Paxton
4 months
Introducing MultiPLY, a multimodal LLM that predicts action tokens, as well as a 500k-example multimodal dataset with dialogue, audio, touch, etc. They use an object-centric scene representation, powered by ConceptGraphs. Very promising results!
@gan_chuang
Chuang Gan
4 months
Can an LLM engage actively in 3D environments and dynamically gather multisensory interactive data, including visual, audio, tactile, and thermal? Introducing MultiPLY, a multisensory embodied LLM that seamlessly connects words, actions, and perceptions!
5
74
313
2
8
79
@chris_j_paxton
Chris Paxton
4 months
Low cost mobile manipulation in the real world. This is how we scale up robot learning, with real data and widely reproducible hardware!
@_akhaliq
AK
4 months
Adaptive Mobile Manipulation for Articulated Objects In the Open World paper page: Deploying robots in open-ended unstructured environments such as homes has been a long-standing research problem. However, robots are often studied only in closed-off lab…
2
70
300
1
12
79
@chris_j_paxton
Chris Paxton
2 months
Getting high performance -- including generalization and 99%+ reliability -- with a learned model is really difficult. Covariant aims to do it with:
- an 8-billion-parameter transformer trained on text, images, actions, and sensors
- all modalities tokenized into a common space (toy sketch below)
-…
Tweet media one
1
7
77
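A toy PyTorch sketch of "all modalities tokenized into a common space" (not Covariant's RFM-1): each modality gets its own projection into a shared width plus a learned modality embedding, then one transformer trunk:

```python
import torch
import torch.nn as nn

class CommonTokenSpace(nn.Module):
    """Toy illustration of tokenizing modalities into one space (not RFM-1):
    per-modality projections into a shared d_model, plus a learned modality
    embedding, feeding a single transformer trunk."""
    def __init__(self, d_model=512, img_patch=768, action_dim=7, vocab=32000):
        super().__init__()
        self.text = nn.Embedding(vocab, d_model)
        self.image = nn.Linear(img_patch, d_model)     # ViT-style patch features
        self.action = nn.Linear(action_dim, d_model)   # robot actions as tokens
        self.modality = nn.Embedding(3, d_model)       # which stream a token is from
        layer = nn.TransformerEncoderLayer(d_model, 8, batch_first=True)
        self.trunk = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, token_ids, patches, actions):
        toks = [self.text(token_ids) + self.modality.weight[0],
                self.image(patches) + self.modality.weight[1],
                self.action(actions) + self.modality.weight[2]]
        return self.trunk(torch.cat(toks, dim=1))      # one shared sequence
```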
@chris_j_paxton
Chris Paxton
2 months
This is the dream
@xiaolonw
Xiaolong Wang
2 months
I have been cleaning up my daughter's mess for more than two years now. Last weekend our robot came home to do the job for me. 🤖 Our new work on visual whole-body control learns a policy to coordinate the robot's legs and arms for mobile manipulation. See…
23
115
651
2
9
76
@chris_j_paxton
Chris Paxton
5 months
How to build a home robot today:
- multi-object rearrangement
- captures user preferences
- handles multiple rooms
The system diagram here gives a good feel for how complex a system a *reliable* in-home robot actually is
Tweet media one
0
16
77
@chris_j_paxton
Chris Paxton
11 months
Cool work from @imankitgoyal et al. at NVIDIA:
- 3D representations are great but often slow
- instead, take the point cloud, render it to multiple 2D synthetic views, and predict there
- significantly outperforms PerAct and works in the real world on a lot of tasks
(toy multi-view projection sketch below)
Tweet media one
@NVIDIARobotics
NVIDIA Robotics
11 months
Our Seattle #Robotics Lab introduces a new, faster, & more efficient method for teaching robot manipulation tasks in real-life scenarios — like opening drawers or dispensing soap — training 36x faster than the current standard. #NVIDIAResearch Learn more:
0
11
62
2
9
76
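The "render the point cloud to multiple 2D views" trick is easy to sketch. Here is a toy NumPy version using three fixed orthographic cameras (the real method uses carefully chosen virtual views; this only shows the idea):

```python
import numpy as np

def render_orthographic_views(points, colors, img_size=128, bounds=1.0):
    """Render a colored point cloud (points in [-bounds, bounds]^3) to three
    fixed orthographic views: a crude stand-in for learned virtual cameras."""
    views = []
    for drop_axis in (2, 1, 0):                  # top, front, side views
        keep = [a for a in range(3) if a != drop_axis]
        uv = points[:, keep]                     # orthographic projection
        px = ((uv + bounds) / (2 * bounds) * (img_size - 1)).astype(int)
        ok = ((px >= 0) & (px < img_size)).all(axis=1)
        img = np.zeros((img_size, img_size, 3), dtype=np.float32)
        # crude painter's algorithm: points with a larger coordinate along
        # the dropped axis get painted last, overwriting the ones behind
        order = np.argsort(points[ok, drop_axis])
        img[px[ok][order, 1], px[ok][order, 0]] = colors[ok][order]
        views.append(img)
    return views  # predict per view, then lift predictions back into 3D
```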
@chris_j_paxton
Chris Paxton
1 year
Segment Anything has learned a general concept of "objectness." This is huge; a foundation model for image segmentation is itself a huge step toward a foundation model for robotics and embodied AI
@AIatMeta
AI at Meta
1 year
Today we're releasing the Segment Anything Model (SAM) — a step toward the first foundation model for image segmentation. SAM is capable of one-click segmentation of any object from any photo or video + zero-shot transfer to other segmentation tasks ➡️
144
2K
7K
0
7
76
@chris_j_paxton
Chris Paxton
8 months
Article on how Toyota is scaling diffusion policies: 60 tasks including spreading on bread, mixing eggs, etc.
Tweet media one
1
8
76
@chris_j_paxton
Chris Paxton
6 months
Very cool work:
- start with a trained LLM
- learn alignment from frozen vision encoders
- learn to predict actions using this LLM backbone
This feels closer to the "right" way to get general intelligence on robots, since we can leverage web data to learn useful features
@alexttoshev
Alexander Toshev
7 months
Can an LLM be adapted to see and act with on-policy RL for embodied tasks? We show how to adapt a 13B LLM with PPO, and demonstrate strong generalization capabilities on a wide range of Language Rearrangement tasks. Paper and results:
3
18
143
2
3
74
@chris_j_paxton
Chris Paxton
3 years
Arranging unknown objects according to structured language, trained on sim data and applied in the world with no fine tuning. Really excited about this new work with Weiyu Liu, Tucker Hermans, and Dieter Fox.
@_akhaliq
AK
3 years
StructFormer: Learning Spatial Structure for Language-Guided Semantic Rearrangement of Novel Objects abs: project page:
0
6
47
2
8
74
@chris_j_paxton
Chris Paxton
4 months
So we have our OK-Robot paper for pick and place, and now this one for door opening. That said, the methods are very different. This one focuses on very efficient RL using a parameterized action space, which makes grasp predictions relative to "door" points.
@Haoyu_Xiong_
Haoyu Xiong
4 months
Introducing Open-World Mobile Manipulation 🦾🌍 – A full-stack approach for operating articulated objects in open-ended unstructured environments: Unlocking doors with lever handles/ round knobs/ spring-loaded hinges 🔓🚪 Opening cabinets, drawers, and refrigerators 🗄️ 👇…
30
103
784
1
9
74
@chris_j_paxton
Chris Paxton
22 days
I love all the ingenuity people are using to build and operate these cheap robots.
@BartronPolygon
Bart Trzynadlowski
23 days
This might just be crazy enough to work.
10
17
153
2
4
74
@chris_j_paxton
Chris Paxton
2 years
You can now check out the code for our ICRA 2022 paper StructFormer, for rearranging previously unseen objects based on structured language commands:
1
13
72
@chris_j_paxton
Chris Paxton
1 year
Won outstanding paper award at LangRob - CoRL 2022. Awesome work by @notmahi
@notmahi
Mahi Shafiullah @ ICRA 2024 🏡🤖
2 years
How can we train data-efficient robots that can respond to open-ended queries like “warm up my lunch” or “find a blue book”? Introducing CLIP-Field, a semantic neural field trained w/ NO human labels & only w/ web-data pretrained detectors, VLMs, and LLMs
4
67
319
2
1
70
@chris_j_paxton
Chris Paxton
10 months
A live demonstration of this older paper was maybe the most impressive thing I saw at the most recent #RSS2023 conference: the results in the paper don't do it justice. Hold up a transparent water bottle in front of a RealSense, get perfect depth...
Tweet media one
3
14
71
@chris_j_paxton
Chris Paxton
1 month
Apple home robotics from Bloomberg:
- looking at telepresence on an arm (sort of like Facebook's canceled Portal? seems like a waste)
- telepresence follower robots
- home assistants, robots for your chores (most exciting)
I think Apple is really well positioned to do this; big…
6
6
70
@chris_j_paxton
Chris Paxton
2 months
So, not to pick on Agility here, because I love their robot and everything they've shown, but watch the feet in this video, between 1 and 2 seconds, to get an idea of why getting sim-to-real for manipulation to work well is so hard
Tweet media one
@agilityrobotics
Agility Robotics
2 months
Collaborating with @NVIDIARobotics accelerates Digit's ability to learn new skills. In this case, it was pulling freshly cooked GPUs out of the oven based on a voice command & LLM integration. #SimToReal
1
7
54
7
4
69
@chris_j_paxton
Chris Paxton
2 months
Real classic Silicon Valley energy here: building a cool-looking robot in your garage.
@Scobleizer
Robert Scoble
2 months
EXCLUSIVE FIRST LOOK: Silicon Valley garage startup building a humanoid robot. $8,000 BOM on robot. @kscalelabs . 3D printed carbon fiber. 26 high-torque electric motors that were stuck in customs for a month. Pre-launch. Will be first shown at @ycombinator demo day in…
363
656
4K
2
2
70