Jesse Mu Profile Banner
Jesse Mu Profile
Jesse Mu

@jayelmnop

4,904
Followers
583
Following
120
Media
582
Statuses

Computational linguistics @AnthropicAI

@jayelmnop
Jesse Mu
1 year
I've found the killer app of large language models.
Tweet media one
Tweet media two
Tweet media three
57
514
4K
@jayelmnop
Jesse Mu
4 years
The machine learning research process
Tweet media one
@yin_psyched
Yin Chen, M.S.
4 years
I can’t stop laughing at this.
Tweet media one
133
6K
29K
13
340
2K
@jayelmnop
Jesse Mu
1 year
Since prompting, instruction tuning, RLHF, ChatGPT etc are such new and fast-moving topics, I haven't seen many university course lectures covering this content. So we made some new slides for this year's CS224n: NLP w/ Deep Learning course at @Stanford !
Tweet media one
20
292
2K
@jayelmnop
Jesse Mu
1 year
PSA to anyone who wants to write an op-ed criticizing LLMs (yes, including Noam Chomsky): if you're going to come up with hypothetical failure cases for LLMs, at a minimum, please actually check that your case fails with a modern LLM
Tweet media one
Tweet media two
Tweet media three
31
88
870
@jayelmnop
Jesse Mu
2 years
I am announcing the Perverse Scaling Prize: a $1.14 USD prize for tasks which exhibit any of the following scaling curves
Tweet media one
@EthanJPerez
Ethan Perez
2 years
We’re announcing the Inverse Scaling Prize: a $100k grand prize + $150k in additional prizes for finding an important task where larger language models do *worse*. Link to contest details: 🧵
Tweet media one
48
313
2K
9
57
747
@jayelmnop
Jesse Mu
1 year
Excited to present 3 #NeurIPS2022 papers on a trend I've been very excited about recently: blurring the boundaries between language models and RL agents (+a bonus 4th paper on active learning!) 🧵(0/7) PS: I'm on the industry job market!
Tweet media one
9
87
685
@jayelmnop
Jesse Mu
1 year
Prompting is cool and all, but isn't it a waste of compute to encode a prompt over and over again? We learn to compress prompts up to 26x by using "gist tokens", saving memory+storage and speeding up LM inference: (w/ @XiangLisaLi2 and @noahdgoodman ) 🧵
14
119
602
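A minimal sketch of the attention-masking idea behind gisting, as I read the tweet above (all sizes and the helper name are illustrative, not from the paper): positions after the gist tokens may attend causally to the gist tokens and to each other, but not to the raw prompt, so the prompt's content must be compressed into the gist token activations.

```python
import numpy as np

def gist_mask(n_prompt: int, n_gist: int, n_rest: int) -> np.ndarray:
    """Causal attention mask with post-gist positions blocked from the prompt."""
    n = n_prompt + n_gist + n_rest
    mask = np.tril(np.ones((n, n), dtype=bool))   # standard causal (lower-triangular) mask
    # Block attention from positions after the gist tokens back to the raw prompt,
    # so the prompt is only reachable via the gist token activations.
    mask[n_prompt + n_gist:, :n_prompt] = False
    return mask

# A 4-token prompt, 1 gist token (index 4), 3 following tokens (indices 5-7)
m = gist_mask(n_prompt=4, n_gist=1, n_rest=3)
```

Here position 5 can attend to the gist token at index 4 but not to prompt indices 0-3, which is what forces the compression.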
@jayelmnop
Jesse Mu
2 years
TIL in 2009 two Berkeley undergrads flipped a coin *40,000* times (1hr/day for a semester) to see whether a coin flip was truly random (it's biased towards the side facing up pre-flip!) Gives a new meaning to the term "undergraduate research project"...
Tweet media one
8
73
512
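A quick back-of-the-envelope check on why 40,000 flips is a sensible sample size for this (the ~51% same-side figure is the Diaconis et al. prediction; treat it as an assumed effect size here):

```python
import math

n = 40_000
se = math.sqrt(0.25 / n)      # std. error of the sample proportion under a fair coin, sqrt(p(1-p)/n) at p=0.5
z = (0.51 - 0.50) / se        # z-score of a one-percentage-point same-side bias
print(f"standard error = {se:.4f}, z = {z:.1f}")
```

A one-point bias shows up as a roughly 4-sigma effect at n = 40,000, so the undergrads' semester of flipping was (just) enough to detect it.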
@jayelmnop
Jesse Mu
1 year
Life update: this week I joined the Alignment team @AnthropicAI ! I’m starting part-time for now as I finish up my PhD at Stanford. Excited to work on making large language models safer and more aligned!
27
9
485
@jayelmnop
Jesse Mu
2 months
We’re hiring for the adversarial robustness team @AnthropicAI ! As an Alignment subteam, we're making a big effort on red-teaming, test-time monitoring, and adversarial training. If you’re interested in these areas, let us know! (emails in 🧵)
Tweet media one
4
72
462
@jayelmnop
Jesse Mu
4 years
New preprint with @jacobandreas : we generate explanations of the individual neurons inside deep neural networks by identifying *compositional logical concepts* that closely approximate neuron behavior (e.g. "water that isn't blue") (1/5)
Tweet media one
Tweet media two
5
113
459
@jayelmnop
Jesse Mu
8 months
My lecture on prompting, instruction tuning, and RLHF for Stanford's CS224n course is (finally!) available online:
@jayelmnop
Jesse Mu
1 year
Since prompting, instruction tuning, RLHF, ChatGPT etc are such new and fast-moving topics, I haven't seen many university course lectures covering this content. So we made some new slides for this year's CS224n: NLP w/ Deep Learning course at @Stanford !
Tweet media one
20
292
2K
4
90
436
@jayelmnop
Jesse Mu
1 year
@AlexReibman Whew! Time to go back to my day job of solving leetcode #42 (trapping rain water) and #1330 (reverse subarray to maximize array value)
0
2
405
@jayelmnop
Jesse Mu
1 year
New LM eval just dropped—Google has no moat??
Tweet media one
5
29
360
@jayelmnop
Jesse Mu
4 months
Seeing some confusion like: "You trained a model to do Bad Thing, why are you surprised it does Bad Thing?" The point is not that we can train models to do Bad Thing. It's that if this happens, by accident or on purpose, we don't know how to stop a model from doing Bad Thing 1/5
@AnthropicAI
Anthropic
4 months
New Anthropic Paper: Sleeper Agents. We trained LLMs to act secretly malicious. We found that, despite our best efforts at alignment training, deception still slipped through.
Tweet media one
128
582
3K
11
39
335
@jayelmnop
Jesse Mu
2 years
Excited to share my work from my internship at @MetaAI : improving exploration in RL with language abstractions! Paper: 🧵 (1/8)
5
48
289
@jayelmnop
Jesse Mu
3 months
Achievement unlocked ✅, thanks for the shout-out @karpathy !
Tweet media one
@karpathy
Andrej Karpathy
3 months
New (2h13m 😅) lecture: "Let's build the GPT Tokenizer" Tokenizers are a completely separate stage of the LLM pipeline: they have their own training set, training algorithm (Byte Pair Encoding), and after training implement two functions: encode() from strings to tokens, and
Tweet media one
383
2K
14K
3
7
256
@jayelmnop
Jesse Mu
3 years
Anonymous Reviewer #2 's list of missing references
Tweet media one
3
8
242
@jayelmnop
Jesse Mu
2 years
Stable Diffusion Telephone: take an image, generate a likely prompt with CLIP interrogator (), feed the prompt back into Stable Diffusion, rinse and repeat
10
32
237
@jayelmnop
Jesse Mu
4 years
GPT-4: Language Models are Fully Autonomous Vehicles
@jeffbigham
hci.social/@jbigham
4 years
i think i'm going to wait until GPT-4 to upgrade. seems like a mid-cycle release. trillion parameters or bust.
4
12
179
6
14
210
@jayelmnop
Jesse Mu
1 year
Something I didn't fully understand until recently—
Imagine FLOPs for 2 transformer fwd passes with 1 input token:
- w/ no KV cache
- w/ a 2K length KV cache
Decoding w/ a 2K length KV cache (w/ no optimizations) is only ~10% more FLOPs than no KV cache. Feedforward is pricey!
Tweet media one
6
18
212
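The arithmetic behind the tweet above can be sketched per layer (model dimensions are my assumption, roughly 7B-scale with d_model = 4096 and a 4x FFN; 2 FLOPs per multiply-accumulate):

```python
def flops_per_layer(d_model: int, cache_len: int, ffn_mult: int = 4) -> float:
    """Approximate FLOPs to decode ONE token through one transformer layer."""
    qkv_out = 4 * 2 * d_model * d_model            # Q, K, V, and output projections
    ffn = 2 * 2 * d_model * (ffn_mult * d_model)   # FFN up- and down-projections
    attn_scores = 2 * 2 * cache_len * d_model      # QK^T plus attention-weighted V
    return qkv_out + ffn + attn_scores

d = 4096
no_cache = flops_per_layer(d, cache_len=1)
with_cache = flops_per_layer(d, cache_len=2048)
overhead = with_cache / no_cache - 1
print(f"extra FLOPs from a 2K KV cache: {overhead:.1%}")
```

With these assumed dimensions the cache term comes out to under 10% of the projection + FFN cost, consistent with the "~10% more FLOPs" claim: attending over the cache is linear in context length, while the projections and feedforward are quadratic in d_model and dominate.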
@jayelmnop
Jesse Mu
22 days
Tweet media one
2
20
204
@jayelmnop
Jesse Mu
4 years
Compositional Explanations of Neurons will be an oral presentation at #NeurIPS2020 !
@jayelmnop
Jesse Mu
4 years
New preprint with @jacobandreas : we generate explanations of the individual neurons inside deep neural networks by identifying *compositional logical concepts* that closely approximate neuron behavior (e.g. "water that isn't blue") (1/5)
Tweet media one
Tweet media two
5
113
459
3
28
189
@jayelmnop
Jesse Mu
2 years
Hey, can I borrow your marker? CS academics: Sure! If you found this helpful, please cite
@misc{fleming2022,
  title={EXPO Black Dry Erase Marker, Chisel Tip},
  author={Fleming, Sam},
  year={2022},
  eprint={2407.1242},
  primaryClass={cs.OS - Office Supplies}
}
4
12
181
@jayelmnop
Jesse Mu
1 year
Playing games with #ChatGPT : 1. Tic-Tac-Toe
Tweet media one
Tweet media two
Tweet media three
4
13
175
@jayelmnop
Jesse Mu
2 years
"deep learning is easy" this is how slack decides when to send a notification
Tweet media one
11
21
166
@jayelmnop
Jesse Mu
1 year
Gist model checkpoints are now up on @huggingface . Give it a try and see what prompts you can (or can't) compress! LLaMA-7B (weight diff only): FLAN-T5-XXL: Code:
@jayelmnop
Jesse Mu
1 year
Prompting is cool and all, but isn't it a waste of compute to encode a prompt over and over again? We learn to compress prompts up to 26x by using "gist tokens", saving memory+storage and speeding up LM inference: (w/ @XiangLisaLi2 and @noahdgoodman ) 🧵
14
119
602
0
31
158
@jayelmnop
Jesse Mu
2 years
prompt engineering is such a brittle and hacky way to use my half-trillion param black box LM I trained on reddit shitposts via
adamw (not adam)
cyclic lr=5e-5 (5e-4 too high)
rotary positional embs (sinusoidal embs no good)
batch size set to 124x the number of a100s on the clust
@gneubig
Graham Neubig
2 years
Recently some complain about prompting as an approach to NLP. "It's so brittle." "Prompt engineering is hacky." etc. But there's another way to view it: prompt engineering is another way of tuning the model's parameters, and human interpretable! See 1/2
Tweet media one
4
99
567
5
4
144
@jayelmnop
Jesse Mu
4 years
This week's @stanfordnlp seminar Thursday 10am PT: Diyi Yang ( @Diyi_Yang ) from Georgia Tech will speak on "When Social Context Meets NLP: Learning with Less Data and More Structures"! Open to the public - register at
Tweet media one
4
22
102
@jayelmnop
Jesse Mu
2 years
The year is 2053. The 10k most popular words in the English dictionary have all been claimed and implemented as Huggingface python packages. To complete basic daily activities you open a REPL:
>>> import food
>>> import reading
>>> import toothbrush
4
3
101
@jayelmnop
Jesse Mu
5 months
I'll be presenting Gist tokens, our new approach to LLM prompt compression, tomorrow (Thursday) morning at #NeurIPS2023 Great Hall & Hall B1+B2 (level 1) #604 , 10:45–12:45 CST Stop by!
Tweet media one
@jayelmnop
Jesse Mu
1 year
Prompting is cool and all, but isn't it a waste of compute to encode a prompt over and over again? We learn to compress prompts up to 26x by using "gist tokens", saving memory+storage and speeding up LM inference: (w/ @XiangLisaLi2 and @noahdgoodman ) 🧵
14
119
602
0
15
99
@jayelmnop
Jesse Mu
4 months
Even as someone relatively optimistic about AI risk, working on this project was eye-opening. For example, I was almost certain that red-teaming the model for Bad Thing would stop the model from doing Bad Thing, but it just ended up making the model do Bad Thing more 🫠 5/5
3
8
93
@jayelmnop
Jesse Mu
1 year
I’m now seeing name.gpt Twitter names instead of name.eth. It’s over, the bubble has burst
4
1
89
@jayelmnop
Jesse Mu
1 year
Problem w/ this article is the cited LM evals are misleading: they don't measure frontier capabilities but a very narrow task distr. Claims that closed LMs have no moat must evaluate OSS models on actual knowledge work, not stuff like "name a restaurant"
2
9
88
@jayelmnop
Jesse Mu
1 year
#ChatGPT is not as creative here:
Tweet media one
Tweet media two
Tweet media three
5
0
89
@jayelmnop
Jesse Mu
4 years
I'm presenting work on regularizing visual representations with language at the #NeurIPS2019 ViGIL workshop today (Friday) - West Hall 202 @ 12:10pm - joint work with Percy Liang and Noah Goodman - stop by!
Tweet media one
2
14
82
@jayelmnop
Jesse Mu
2 years
With all of the astonishing neural net announcements coming out multiple times a week these days, it's worth keeping in mind that Spam Detection—*the* introductory example used in every blogpost, Youtube tutorial, and ML/NLP undergrad course—is *still* not solved
4
4
82
@jayelmnop
Jesse Mu
6 months
I'll be at #NeurIPS2023 this year! I've been having a lot of fun at Anthropic—excited to chat about (1) what it's like to work here, and (2) research topics including alignment, red-teaming, language+RL, and more
@sleepinyourhat
Sam Bowman
6 months
If you'll be at #NeurIPS2023 and you're interested in chatting with someone at Anthropic about research or roles, there'll be a few of us around. Expression of interest form here:
Tweet media one
2
21
200
2
3
78
@jayelmnop
Jesse Mu
4 years
This Friday 9/11 at 12pm PDT I'm giving a talk at Deep Learning: Classics and Trends on Compositional Explanations of Neurons () - open to the public! More info: Mailing list + zoom link:
Tweet media one
3
26
75
@jayelmnop
Jesse Mu
1 year
The 2-layer MLP I wrote in Matlab as part of Andrew Ng's ML coursera course back in 2017 achieved parity with Bard on the XOR problem
Tweet media one
2
4
70
@jayelmnop
Jesse Mu
3 years
In our new @StanfordAILab blog post, @ShikharMurty and I discuss the problem of training deep learning models with natural language explanations for tasks in vision, NLP, and beyond:
@StanfordAILab
Stanford AI Lab
3 years
Language is a powerful mechanism for people to communicate goals, beliefs and concepts. Can we use language to train machine learning models? Read our new blog post on Learning from Language Explanations:
1
44
127
0
13
65
@jayelmnop
Jesse Mu
2 years
> is emeritus ML/stats prof
> uses sample size of n = 1 to extrapolate to 2 million female graduate students
@pmddomingos
Pedro Domingos
2 years
Corridor conversation:
Me: What discrimination have you experienced?
Female grad student: [Thinks a while] I can't think of any, but the literature says there is, so there must be.
9
4
58
4
2
66
@jayelmnop
Jesse Mu
1 year
@scalo This is Claude from @AnthropicAI
3
0
63
@jayelmnop
Jesse Mu
1 year
To be clear, there *is* lots of disruption potential from smaller, on-device, OSS LMs. Siri doesn't need PaLM-540B to figure out how to turn on your lights. But to say that these foundation models are made completely irrelevant by someone fitting LLaMA-7B on an iPhone is silly
5
2
59
@jayelmnop
Jesse Mu
2 years
@jachiam0 Isn't there still a question mark about the economic implications of DALL-E/GPT-3/PaLM/etc? It *feels* valuable but I'm not seeing any self-sustaining business models yet (besides cool prototypes, e.g. copilot) I think the fate of LMaaS startups (OpenAI, Cohere) will tell?
11
4
55
@jayelmnop
Jesse Mu
1 month
Another thorny safety challenge for LLMs. Like Sleeper Agents (), @cem__anil has found behavior that is stubbornly resistant to finetuning. Training on MSJ shifts the intercept, but not the slope, of the relationship b/t # of shots and attack efficacy.
Tweet media one
@AnthropicAI
Anthropic
1 month
New Anthropic research paper: Many-shot jailbreaking. We study a long-context jailbreaking technique that is effective on most large language models, including those developed by Anthropic and many of our peers. Read our blog post and the paper here:
Tweet media one
83
350
2K
3
5
56
@jayelmnop
Jesse Mu
2 years
Interested in human/machine communication? Excited to share 2 papers at #NeurIPS2021 :
1/ Emergent Communication of Generalizations, poster A0 Fri 12/10 8:30-10a PT
2/ Multi-party Referential Communication in Complex Strategic Games @ MiC workshop, Mon 12/13 1:15p PT
read on ⬇
Tweet media one
1
8
53
@jayelmnop
Jesse Mu
3 years
“Why did you decide to do a remote internship in NYC?”
Tweet media one
1
1
52
@jayelmnop
Jesse Mu
8 months
Tweet media one
2
0
51
@jayelmnop
Jesse Mu
2 years
web3 is incredibly effective, fatalistic branding. We should be calling machine learning stats2 and deep learning stats3
3
5
49
@jayelmnop
Jesse Mu
4 years
Our last @stanfordnlp seminar before Thanksgiving features Yonatan Bisk ( @ybisk ) from CMU: "Language Should be Embodied—But what does that mean?" Thursday 10am PT / open to the public / register at ! #robonlp #nlproc
Tweet media one
1
19
49
@jayelmnop
Jesse Mu
2 years
Seeing more tweets from people who are convinced that the DALL-E generations are simply too good/curated to be real, and that there’s a human behind the scenes. Wondering if this is the birth of an “AI isn’t real” conspiracy movement that’ll only grow stronger in the future
2
3
47
@jayelmnop
Jesse Mu
4 months
Forgetting about deceptive alignment for now, a basic and pressing cybersecurity question is: If we have a backdoored model, can we throw our whole safety pipeline (SL, RLHF, red-teaming, etc.) at it and guarantee its safety? Our work shows that in some cases, we can't 2/5
2
1
48
@jayelmnop
Jesse Mu
4 years
For this week's @stanfordnlp seminar Thursday 10am PT, we'll have Mikel Artetxe ( @artetxem ) of Facebook AI Research speaking on Unsupervised Machine Translation! Open to the public - non-Stanford affiliates register at
Tweet media one
0
11
46
@jayelmnop
Jesse Mu
4 years
For this week's @stanfordnlp seminar Thursday 10am PT, excited to have Yoon Kim (MIT-IBM Watson/incoming MIT EECS prof) speaking on Deep Unsupervised Learning of Syntactic Structure. Open to the public - non-Stanford affiliates register at
Tweet media one
4
10
44
@jayelmnop
Jesse Mu
2 years
the "import torch" to "import openai" researcher pipeline
0
0
44
@jayelmnop
Jesse Mu
2 years
@pfau The internet has also decided that lo-fi/pixelated/sloppy memes tend to be funnier
2
1
43
@jayelmnop
Jesse Mu
4 years
Excited to kick-off virtual @stanfordnlp seminars () Thursdays 10am PT - open to the public! We'll first be hearing from @_jessethomason_ on "From Human Language to Agent Action". Non-Stanford affiliates, register for Zoom link:
Tweet media one
1
6
43
@jayelmnop
Jesse Mu
3 years
Very excited for our last @stanfordnlp seminar of the year: Melanie Subbiah (now @ColumbiaCompSci , prev @OpenAI ) on GPT-3: Few-shot Learning with a Giant Language Model. Thursday 10am PT / open to the public / register at ! #gpt3 #NLProc
Tweet media one
1
12
40
@jayelmnop
Jesse Mu
1 year
1️⃣ Improving Intrinsic Exploration with Language Abstractions Using language abstractions to guide exploration in RL, e.g. by self-designing a curriculum of increasingly difficult language goals Also see @ykilcher review: (1/7)
@jayelmnop
Jesse Mu
2 years
Excited to share my work from my internship at @MetaAI : improving exploration in RL with language abstractions! Paper: 🧵 (1/8)
5
48
289
1
6
40
@jayelmnop
Jesse Mu
2 years
Really enjoyed speaking to @ykilcher about our recent work on RL exploration w/ language () - check out the interview below, as well as the excellent paper review here:
@ykilcher
Yannic Kilcher 🇸🇨
2 years
Check out this interview with Jesse Mu, author of "Improving Intrinsic Exploration with Language Abstractions"! Simple idea, big impact: Adding natural language really helps intrinsic exploration in reinforcement learning💪Watch to find out more:
Tweet media one
1
19
101
1
9
38
@jayelmnop
Jesse Mu
2 years
Classic active learning problems become more interesting when combined w/ the few-shot abilities of foundation models: in tiny datasets, the kind of data you train on matters a lot. Check out our work (w/ @AlexTamkin , Salil Deshpande, Dat Nguyen, Noah Goodman) towards this end!
@AlexTamkin
Alex Tamkin 🦣
2 years
How can we choose examples for a model that induce the intended behavior? We show how *active learning* can help pretrained models choose good examples—clarifying a user's intended behavior, breaking spurious correlations, and improving robustness! 1/
Tweet media one
4
62
259
0
7
38
@jayelmnop
Jesse Mu
4 months
Backdoored models may seem far-fetched now, but just saying "just don't train the model to be bad" is discounting the rapid progress made in the past year on poisoning the entire LLM pipeline, including human feedback [1], instruction tuning [2], and even pretraining [3] data. 3/5
1
1
38
@jayelmnop
Jesse Mu
2 months
If this sounds fun, we’d love to chat! Please email {jesse,ethan,miranda} at anthropic dot com with [ASL-3] in the subject line, a paragraph about why you might be a good fit, and any previous experience you have. We will read (and try to respond to) every message we get!
2
0
35
@jayelmnop
Jesse Mu
2 months
I too have gotten Claude 3 to vertically center a <div>
@DanielJLosey
Daniel Losey 🔀
2 months
I quite literally did the work of 50 front-end web developers working for a week in one night thanks to Claude 3.
35
31
671
3
0
34
@jayelmnop
Jesse Mu
2 years
Tweet media one
2
0
35
@jayelmnop
Jesse Mu
4 years
🗣️ Can neural networks learn pragmatics via multi-agent communication, w/o explicit pragmatic reasoning? Yes! Check out our amortized Rational Speech Acts (RSA) model at #CogSci2020 (w/ Julia White, Noah Goodman) paper: talk/qa: Sat Aug 1, 2pm ET/6pm UTC
Tweet media one
0
8
33
@jayelmnop
Jesse Mu
5 years
#Stanford #AISalon with Chris Re ( @HazyResearch ), @JeffDean . Overwhelming themes of the day: huge, multi-task models, weak supervision, transfer learning
Tweet media one
1
4
33
@jayelmnop
Jesse Mu
1 year
@sleepinyourhat New LM benchmark is Inception score on rendered SVG generations New txt2img benchmark is BERTScore after OCRing the image generated from the prompt "first page of a newly discovered Shakespearean play, 1597, 35mm film photograph, colorized"
2
0
33
@jayelmnop
Jesse Mu
2 years
@Bam4d @MetaAI I'm in the same boat as Chris, travel cancelled last minute by Meta (fmr intern). I also had to pay out of pocket for registration. @EMostaque @nathanbenaich any chance of support? Can provide verification. @kchonyc any chance @NeurIPSConf could support (eg waiving registration)?
3
2
33
@jayelmnop
Jesse Mu
3 years
🗣️ New work on emergent communication! Agents trained on Lewis reference games develop successful but uninterpretable language. Meanwhile, our language conveys ideas&categories, not just ref exps. We propose communicating over *sets* of objects, increasing compositionality (1/5)
Tweet media one
Tweet media two
1
2
32
@jayelmnop
Jesse Mu
5 months
If you happened upon a tin of Cafe Du Monde coffee at #NeurIPS2023 , allow extra time at the airport for TSA to double/triple check it
3
1
32
@jayelmnop
Jesse Mu
4 months
[1] [2] [3] [4] and, of course, deceptive alignment! 4/5
1
3
31
@jayelmnop
Jesse Mu
1 year
3. Hangman
Tweet media one
Tweet media two
Tweet media three
1
1
30
@jayelmnop
Jesse Mu
2 years
Speculating on LMs, anonymity, and the internet 1/
Tweet media one
2
2
30
@jayelmnop
Jesse Mu
1 year
Covers GPT 1-3, in-context learning, (zero-shot) chain-of-thought, instruction finetuning, RLHF (w/ an intro to RL for the uninitiated), constitutional AI, etc, and discusses pros/cons of alignment methods Hope it can be helpful, and please let me know if you spot any errors!
3
0
29
@jayelmnop
Jesse Mu
1 year
There's a lot more work to be done here: parameter-efficient gisting, compressing longer prompts, etc Paper: Code: This was a joint effort with @XiangLisaLi2 and @noahdgoodman . Also thx to the Stanford Alpaca team, esp. @lxuechen !
0
3
28
@jayelmnop
Jesse Mu
3 years
I'm presenting Compositional Explanations of Neurons (w/ @jacobandreas ) at #NeurIPS2020 on Thursday: 🗣️ oral 6:30am PST/2:30pm GMT (track 28 deep learning) 📜 poster 9-11am PST/5-7pm GMT (gather town deep learning C3 - spot A3) Stop by!
@jayelmnop
Jesse Mu
4 years
New preprint with @jacobandreas : we generate explanations of the individual neurons inside deep neural networks by identifying *compositional logical concepts* that closely approximate neuron behavior (e.g. "water that isn't blue") (1/5)
Tweet media one
Tweet media two
5
113
459
0
5
28
@jayelmnop
Jesse Mu
4 years
For this week's @stanfordnlp seminar Thursday 10am PST, Douwe Kiela ( @douwekiela ) from Facebook AI Research will present "Rethinking Benchmarking in AI" and the Dynabench platform ()! Open to the public - non-Stanford registration at
Tweet media one
0
6
27
@jayelmnop
Jesse Mu
2 months
People on the team right now include me and:
- @EthanJPerez
- @megtong_
- @ZhongRuiqi
- @MrinankSharma
We exist under the broader Alignment Science team led by Sam Bowman ( @sleepinyourhat ), which has too many awesome colleagues to count.
1
1
28
@jayelmnop
Jesse Mu
2 years
I want SpamBERT to win Best Paper Award at ACL 2023
1
2
27
@jayelmnop
Jesse Mu
1 year
2. Chess (ChatGPT is a sore loser)
Tweet media one
Tweet media two
Tweet media three
2
1
27
@jayelmnop
Jesse Mu
4 years
Tweet media one
1
0
25
@jayelmnop
Jesse Mu
1 year
3️⃣ STaR: Bootstrapping Reasoning with Reasoning (led by @ericzelikman , @Yuhu_ai_ ) Improving multistep reasoning in LMs by bootstrapping off of self-generated rationales Essentially doing RL in chain-of-thought rationale space! (3/7)
@Yuhu_ai_
Yuhuai (Tony) Wu
2 years
Language models can dramatically improve their reasoning by learning from chains of thought that they generate. With STaR, just a few worked examples can boost accuracy to that of a 30X larger model (GPT-J to GPT-3). W. @ericzelikman , Noah Goodman 1/
Tweet media one
8
93
525
2
5
23
@jayelmnop
Jesse Mu
1 year
To recap, I see LMs and RL converging from 2 directions: RL➡️?⬅️LMs
Starting from RL: imbuing agents w/ language priors [1️⃣,2️⃣]
Starting from LMs: improving reasoning not from static corpora, but RL exploration & interaction [3️⃣]
Excited for these paths to intertwine! (5/7)
1
4
23
@jayelmnop
Jesse Mu
2 years
@Miles_Brundage "OK {Mozart,The Beatles,Kendrick,...} made a good song but it was cherrypicked. Show me all the bad outputs"
1
0
22
@jayelmnop
Jesse Mu
1 year
I'm on the job market! Mostly industry (+startups). Interested in both traditional RS positions *and* applied roles deploying products to users and improving from feedback. Please DM or reach out at NeurIPS! (Also reach out in general, happy to chat about anything) (6/7)
2
2
21
@jayelmnop
Jesse Mu
2 months
Context: we’ve been pushing towards our ASL (AI Safety Level) safety commitments under our Responsible Scaling Policy—think about this as a “sprint on safety.” Red-teaming and adversarial robustness are a major part of this story.
Tweet media one
1
1
21
@jayelmnop
Jesse Mu
2 years
@NandoDF Now have it explain this
Tweet media one
2
1
20
@jayelmnop
Jesse Mu
1 year
4️⃣ (bonus!) Active Learning Helps Pretrained Models Learn the Intended Task (led by @AlexTamkin ) Revisiting classic active learning techniques in the context of modern foundation models and few-shot task ambiguity (4/7)
@AlexTamkin
Alex Tamkin 🦣
2 years
How can we choose examples for a model that induce the intended behavior? We show how *active learning* can help pretrained models choose good examples—clarifying a user's intended behavior, breaking spurious correlations, and improving robustness! 1/
Tweet media one
4
62
259
1
4
20
@jayelmnop
Jesse Mu
2 years
For your next icebreaker, try two truths and a lie, except after your initial guess, one of the other choices is revealed to be true, and you all debate whether it’s beneficial to change your initial guess
0
0
20
@jayelmnop
Jesse Mu
5 years
"Broader Context Improves Metaphor Identification" with Ekaterina Shutova, @HYannakoudakis accepted to #naacl2019 !
1
2
19
@jayelmnop
Jesse Mu
5 years
@kaifulee . @Susan_Athey explains requirements for AI to break out of existing “narrow” domains: (1) efficient causal/counterfactual reasoning from small data (vs chugging away at ImageNet) (2) interpretability, robustness, trust.
Tweet media one
1
2
20
@jayelmnop
Jesse Mu
1 year
4. Blackjack (not bad!)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
0
19
@jayelmnop
Jesse Mu
2 years
Tweet media one
1
0
19
@jayelmnop
Jesse Mu
2 months
Some examples of what our team has been up to (1/2):
1. Understanding and mitigating gradient-based attacks
2. Multimodal adversarial robustness
3. Interp-based safety interventions (working with our *world-class* interpretability team)
1
0
18
@jayelmnop
Jesse Mu
3 years
Check out our work (w/ @rose_e_wang , Julia White, Noah Goodman) on training better and more pragmatic LMs by communicating with ensembles of listeners—to appear in #EMNLP2021 Findings!
@rose_e_wang
Rose
3 years
How do we train language models (LMs) to be good pragmatic conversational partners? We investigate this in our #EMNLP2021 Findings paper: Calibrate your listeners! Robust communication-based training for pragmatic speakers. 📜: 📺:
2
9
60
0
2
18
@jayelmnop
Jesse Mu
2 months
Like any RS/RE role, writing papers is part of the job, but the biggest pitch for our team is the impact you'll have beyond that. Anthropic is (still!) small and nimble, and there are many fun opportunities to collaborate across product, T&S, and policy.
1
0
18
@jayelmnop
Jesse Mu
3 years
Eric is doing very prescient work on vulnerabilities in production NLP systems - this will be super interesting!
@stanfordnlp
Stanford NLP Group
3 years
Eric Wallace ( @Eric_Wallace_ ) from UC Berkeley presents at this week's Stanford NLP Seminar on vulnerabilities in NLP models. Thursday Mar 4 10am PT, open to the public! Sign up at
Tweet media one
0
15
70
0
7
18
@jayelmnop
Jesse Mu
1 year
2️⃣ Improving Policy Learning with Language Dynamics Distillation (led by @hllo_wrld ) Increasing RL sample efficiency by pretraining agents to model env dynamics from language-annotated demonstrations (2/7)
@hllo_wrld
Victor Zhong
2 years
Our latest reading to learn paper Language Dynamics Distillation will appear at #NeurIPS2022 ! In LDD, we pretrain the agent to read to model env dynamics. LDD improves generalization on 5 distinct language grounding envs over naive RL, VAE, inverse RL. 🧵
Tweet media one
Tweet media two
2
12
51
1
3
18