Denny Zhou @denny_zhou Twitter profile

Last Seen Profiles

@RockinJello

@b3n70n71

@chinochan_dayo

@MrCourington53

@michele27039688

@Barnes_S76

@Jessica23270443

@Grazia83772654

@dammybee85

@yuyu_uwakuta

@TMpaint

@JoseAlv28682655

@cardmedic

@LinaSteime21448

@morokumash48637

@Caroline_Lila33

@JamesGRickaards

@MomoMim74826523

@Achraf51535165

@Leonard32506509

@MigraStudium

@Jaylon37322755

@KhurramBhatti01

@maestas_ma60701

@richelias408

@FedProm

@diego19cantu

@kredete

@LUCKY5012413802

@SajidSa33306790

@Coach_Creighton

@GOAT184717844

@Georges44917817

@achmunzav

@lyd__hall

@satejp

Denny Zhou

@denny_zhou

2 years

The AI revolution has begun. No matter whether we like it or not.

81

134

939

Denny Zhou

@denny_zhou

6 months

RL is a dead end.

77

34

455

Denny Zhou

@denny_zhou

3 years

Our fast tokenizer work is accepted to EMNLP 2021 as oral, and released in several Google products. 8.2x faster than HuggingFace Tokenizers and 5.1x faster than TensorFlow Text.

5

60

402

Denny Zhou

@denny_zhou

2 months

Our team at Google DeepMind has a full-time Research Scientist position available at our Mountain View site. Minimum qualification: PhD in ML/NLP. Please email me with: your CV and Google Scholar link; a brief description of the impactful work you have done; and what you aim

12

51

290

Denny Zhou

@denny_zhou

6 months

In our ICML 2023 paper "Large language models can be easily distracted by irrelevant context" (), we simply asked LLMs to ignore the irrelevant context and then they worked well. E.g. the example showed in your paper can be solved as follows:

Jason Weston

@jaseweston

6 months

🚨 New paper! 🚨 We introduce System 2 Attention (S2A). - Soft attention in Transformers is susceptible to irrelevant/biased info - S2A uses LLM reasoning to generate what to attend to Improves factuality & objectivity, decreases sycophancy. 🧵(1/5)

1

271

1K

7

22

259

Denny Zhou

@denny_zhou

2 years

If LLMs are humans, all the ideas are trivial: chain-of-thought prompting ("explain your answer"), self-consistency ("double check your answer"), least-to-most prompting ("decompose to easy subproblems"). The shocking thing is that LLMs are not humans but these still work!

8

29

252

Denny Zhou

@denny_zhou

1 year

Machine learning is statistics. Nearly irrelevant to AI.

34

14

249

Denny Zhou

@denny_zhou

1 year

This week marks ICLR 2023. Thrilled to share a thread dedicated to our 9 accepted papers. We are grateful that all our submissions were accepted and truly appreciate the constructive discussions and valuable feedback provided by the reviewers. 🧵

11

38

247

Denny Zhou

@denny_zhou

19 days

Long context is essentially a length generalization problem. Our recent work "Transformers Can Achieve Length Generalization But Not Robustly" () showed a surprisingly simple approach to achieving length generalization: train your model several times using

Transformers Can Achieve Length Generalization But Not Robustly

Length generalization, defined as the ability to extrapolate from shorter training sequences to longer test ones, is a significant challenge for language models. This issue persists even with...

arxiv.org

6

29

240

Denny Zhou

@denny_zhou

1 year

Reduce serving cost is a huge problem in the field of LLMs. Typically done by distillation and quantization. We propose "LLMs as Tool Makers", as a complementary solution to these existing techniques.

5

47

237

Denny Zhou

@denny_zhou

2 years

Self-consistency amazingly improves chain-of-thought prompting, even 50+% improvements on many tasks. Why does it work so well? It essentially integrates/marginalizes out all latent reasoning paths to compute the full probability of each distinct answer!

7

42

229

Denny Zhou

@denny_zhou

2 years

Self-consistency improves chain of thought reasoning in large language models.

7

35

224

Denny Zhou

@denny_zhou

10 months

Maybe time to initiate a new conference dedicated to LLMs, reminiscent of how ICLR emerged for DL years ago. This could also help reduce submissions to NeurIPS and ICLR. Any thoughts?

14

9

196

Denny Zhou

@denny_zhou

6 months

This is not quite RL: the blue one looks way smarter than the rest. It seems planning with reasoning, and the others that succeeded basically imitate the blue one.

Zhengzhong Tu

@_vztu

6 months

Reinforcement learning explained

157

2K

13K

11

10

179

Denny Zhou

@denny_zhou

1 year

A key point in the chain-of-thought prompting paper: RIP downstream-task finetuning on LLMs.

7

21

175

Denny Zhou

@denny_zhou

10 months

Slides of my talk given at the fascinating ACL 2023 natural language reasoning workshop (): several points to highlight:

Natural Language Reasoning and Structured Explanations Workshop

Natural Language Reasoning and Structured Explanations Workshop ---

nl-reasoning-workshop.github.io

2

24

157

Denny Zhou

@denny_zhou

1 year

To fix hallucinations in chatbots, a high-quality search engine is first must-have. This is obvious to experts, but definitely not to everyone.

13

9

152

Denny Zhou

@denny_zhou

1 year

Discovering and understanding emergent abilities of LLMs, like CoT reasoning, is the most exciting and challenging topic in LLM research. These emergent abilities are not intentionally built by anyone who trained LLMs via predicting next tokens.

5

22

145

Denny Zhou

@denny_zhou

2 years

Biggest milestones in AI: perceptron, deep learning / transformer, pretraining + prompting.

5

15

135

Denny Zhou

@denny_zhou

1 year

LLMs/AGI is not a zero-sum game. We are just at the beginning of a great revolution.

12

144

Denny Zhou

@denny_zhou

11 months

Self-debug has worked quite well on the raw GPT3.5 model: code-davinci-002 (not instruction tuned):

Jim Fan

@DrJimFan

11 months

GPT-4 has one emergent ability that is extremely useful and stronger than any other models: self-debug. Even the most expert human programmer cannot always get a program correct at the first try. We look at execution results, reason about what's wrong, apply fixes, rinse and

53

398

2K

8

33

130

Denny Zhou

@denny_zhou

9 months

Chain of thought (CoT) and related work are "cognitive science/psychology for LLMs" rather than prompting engineering. Discovering and understanding CoT emerged from LLM next-token-prediction pretraining is like studying human's cognition abilities emerged from neurons.

9

13

132

Denny Zhou

@denny_zhou

2 months

For those who interpret/cite self-consistency (SC) as majority voting (MV) or other MV equivalents, they need to take an entry-level course in machine learning. In our original paper (), Xuezhi and I have provided the math theory underlying SC:

Self-Consistency Improves Chain of Thought Reasoning in Language Models

Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging results on complex reasoning tasks. In this paper, we propose a new decoding strategy,...

arxiv.org

5

17

132

Denny Zhou

@denny_zhou

8 months

Fantastic comments by ( @jkronand , @enjoyingthewind ) connect our LLM reasoning work to Polya. Then checked the book "How to solve it". Page 75: "decomposing and recombining". Maps to "Least to Most Prompting" Page 98: "do you know a related problem".

5

9

125

Denny Zhou

@denny_zhou

11 months

Compared to AI safety, now I am more concerned about the decline of human intelligence.

10

16

123

Denny Zhou

@denny_zhou

2 years

I am planning to submit a workshop proposal on "large language model prompting" to ICLR '23. What will be your favorite name for the workshop? Tell me and I will choose the most voted name to submit.😀

58

8

125

Denny Zhou

@denny_zhou

6 months

I believe OpenAI will continue to achieve remarkable breakthroughs as long as people like Ilya and Alec are still there. The next big innovation would make most of the current techniques (if not all) irrelevant. The imitation game is just at its very beginning.

Ilya Sutskever

@ilyasut

6 months

I deeply regret my participation in the board's actions. I never intended to harm OpenAI. I love everything we've built together and I will do everything I can to reunite the company.

7K

4K

33K

7

10

117

Denny Zhou

@denny_zhou

2 years

Prompting seems to be difficult for some machine learning researchers to understand. This is not surprising because prompting in not machine learning. Prompting is the opposite of machine learning.

6

10

112

Denny Zhou

@denny_zhou

1 year

Just tried GPT-3 003: "What is the largest number?" It says "The largest number possible would be infinity". I asked the same question to my kid (3rd grade). She said "There is no largest number. For any given number, I can always make it larger by plus one."

11

2

114

Denny Zhou

@denny_zhou

2 years

I dont think there is magic here: text-davinci-002 and other 002 models in GPT-3, and instruct GPT should have been finetuned with "let's think step by step ... ". I tried 001 models in GPT3 and none of them works with this kind of prompt while CoT still works.

Aran Komatsuzaki

@arankomatsuzaki

2 years

Large Language Models are Zero-Shot Reasoners Simply adding “Let’s think step by step” before each answer increases the accuracy on MultiArith from 17.7% to 78.7% and GSM8K from 10.4% to 40.7% with GPT-3.

59

573

3K

8

13

112

Denny Zhou

@denny_zhou

5 months

In NeurIPS 2023, there is a section “CoT/Reasoning”. When preparing our CoT paper, I kicked off a discussion on the title. Different names were proposed, like stream of thought (Jason), train of thought (Dale), chain of thought (Dale). Finally I decided to choose “chain of

8

12

108

Denny Zhou

@denny_zhou

8 months

Automate chain-of-thought prompting simply by letting LLMs recall relevant questions/knowledge that they have seen. Fully matches hand crafted CoT performance on reasoning/coding benchmarks and even performs better.

Michi Yasunaga

@michiyasunaga

8 months

Introducing Analogical Prompting, a new method to help LLMs solve reasoning problems. Idea: To solve a new problem, humans often draw from past experiences, recalling similar problems they have solved before. Can we prompt LLMs to mimic this? [1/n]

9

110

482

2

13

108

Denny Zhou

@denny_zhou

1 year

Why does few-shot prompting work without any training or finetuning? Few-shot prompting can be equivalent to fine-tuning running inside of an LLM! Led by @akyurekekin , great collaboration with amazing folks: Dale Schuurmans, @jacobandreas , @tengyuma !

What learning algorithm is in-context learning? Investigations...

Neural sequence models, especially transformers, exhibit a remarkable capacity for in-context learning. They can construct new predictors from sequences of labeled examples $(x, f(x))$ presented...

arxiv.org

3

12

104

Denny Zhou

@denny_zhou

5 months

A simple yet effective approach to fill the performance gap between zero-shot and few-shot prompting Xinyun Chen @xinyun_chen_ is going to present our recent work LLM analogical reasoning () this afternoon in the exciting #MathAI workshop of #NeurIPS2023 .

Large Language Models as Analogical Reasoners

Chain-of-thought (CoT) prompting for language models demonstrates impressive performance across reasoning tasks, but typically needs labeled exemplars of the reasoning process. In this work, we...

arxiv.org

Xinyun Chen

@xinyun_chen_

5 months

Looking forward to the workshop! Excited to share our recent work on LLM reasoning

0

3

37

3

20

101

Denny Zhou

@denny_zhou

2 years

I asked LLM its confidences on its answers. Turned out LLM was highly confident on its every answer. I was disappointed at the useless. Then I realized this is actually another empirical evidence showing the amazing coincidence between LLMs and humans, eg people like me.

11

6

95

Denny Zhou

@denny_zhou

8 months

See how Bard made it:

Vince Vatter

@VinceVatter

8 months

Sally (a girl) has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have? Here are 60 LLMs getting it wrong.

183

247

1K

10

5

97

Denny Zhou

@denny_zhou

1 year

Don't try to "improve" chain-of-thought (CoT) prompts by "clever engineering". Those "engineering" tricks will be meaningless when LLMs are further improved. Should write CoT prompts naturally. Here is my favorite one taken from my kid's homework:

1

11

97

Denny Zhou

@denny_zhou

7 months

A new conference is dedicated to language modeling for everyone. Great to see any feedback that you would have:

Sasha Rush

@srush_nlp

7 months

Introducing COLM () the Conference on Language Modeling. A new research venue dedicated to the theory, practice, and applications of language models. Submissions: March 15 (it's pronounced "collum" 🕊️)

35

437

2K

3

6

82

Denny Zhou

@denny_zhou

2 years

Using "natural language" to describe rationales, pioneered by (Ling et al 2017) and then followed by (Cobbe et 2021, Wei et al 2022), is essential for the success of chain of thought prompting. 1/

3

16

82

Denny Zhou

@denny_zhou

1 year

Many years ago, Yahoo gave up building its own search engine and outsourced it to Google, and investors cheered up. Now Microsoft gives up building its own LLM and outsources it to OpenAI, and investors cheer up.

12

8

77

Denny Zhou

@denny_zhou

6 months

Amazing! Met @demi_guo_ in Google Beijing about 5 years ago. She came with Xiaodong He, my invitee and collaborator, who did pioneering work on text-to-image. Now Pika's work is just like magic.

Demi Guo

@demi_guo_

6 months

Excited to share that I recently left Stanford AI PhD to start Pika. Words can't express how grateful I am to have all the support from our investors, advisors, friends, and community members along this journey! And there's nothing more exciting than working on this ambitious &

479

520

6K

3

4

79

Denny Zhou

@denny_zhou

1 year

search queries are zero-shot prompts

6

4

74

Denny Zhou

@denny_zhou

7 months

“If I were given one hour to save the planet, I would spend 59 minutes defining the problem and one minute resolving it.” — Albert Einstein

2

3

71

Denny Zhou

@denny_zhou

2 years

For our ICLR submission, a reviewer wrote "I see a major limitation with this approach: it looks like the prompt examples need to be designed specifically for each dataset." Sounds like they expect AGI. That would be a really high standard.

4

67

Denny Zhou

@denny_zhou

10 months

Here is a simple way to reduce the submission number in a conference: charge each submission $$$, e.g. $100, refund after accepted.

18

2

64

Denny Zhou

@denny_zhou

2 years

Using rationales to solve math word problems has been in the NLP literature for years. Here is Fig 1 taken from “Learning to Solve and Explain Algebraic Word Problems” (Ling et al 2017, ACL).

2

12

63

Denny Zhou

@denny_zhou

2 years

Prompting is like teaching/instructing kids. When I was asked how to write a chain-of-thought prompt, I showed them my kid's homework.

2

6

61

Denny Zhou

@denny_zhou

7 months

Although our work is about LLMs, I actually don't think humans do better than LLMs on self-correcting. Look at those countless mistakes in AI tweets.

Jie Huang

@jefffhj

7 months

Interesting papers that share many findings similar to our recent work (), in which we argue that "Large Language Models Cannot Self-Correct Reasoning Yet". I'm happy (as a honest researcher) and sad (as an AGI enthusiast) to see our conclusions confirmed

8

61

338

2

11

60

Denny Zhou

@denny_zhou

1 month

Welcome to the new era of AI: "Deep" was once the buzzword at AI conferences, but it's no longer the case in COLM.

Yoav Artzi

@yoavartzi

2 months

Folks, some @COLM_conf stats, because looking at these really brightens the mood :) We received a total of ⭐️1036⭐️ submissions (for the first ever COLM!!!!). What is even more exciting is the nice distribution of topics and keywords. Exciting times ahead! ❤️

4

33

256

1

10

62

Denny Zhou

@denny_zhou

2 years

A nice combination of LLMs and Google. Like least-to-most prompting, a complex questions is decomposed into a set of easier subquestions. The key difference here is that Google is used to answer those subquestions instead of LLMs.

Ofir Press

@OfirPress

2 years

We've found a new way to prompt language models that improves their ability to answer complex questions Our Self-ask prompt first has the model ask and answer simpler subquestions. This structure makes it easy to integrate Google Search into an LM. Watch our demo with GPT-3 🧵⬇️

52

307

2K

1

12

59

Denny Zhou

@denny_zhou

6 months

The “emergence” in LLMs has nothing to do with the “emergence” in complex systems, e.g. swarm intelligence. It is unfortunate to use the same term to illustrate two unrelated concepts and lead to confusion and misinterpretation.

3

11

60

Denny Zhou

@denny_zhou

2 years

@arankomatsuzaki The truth should be simple: Text-davinci-002 (175B) or other 002 model, or instruct GPT have been finetuned with "let's think step by step. .....". I tried 001 models and none of them works with the proposed method while CoT still works well.

5

6

53

Denny Zhou

@denny_zhou

11 months

@DrJimFan Self-debug:

John Nay

@johnjnay

1 year

Teaching LLMs to Self-Debug -Prompt instructs LLM to execute code then generate a feedback message based on result -W/ out any feedback on code correctness, LLM is able to identify its mistakes by explaining its code -SoTA performance on code generation

16

155

812

4

6

53

Denny Zhou

@denny_zhou

2 years

Go big, or go home. Big means big language models.

4

2

49

Denny Zhou

@denny_zhou

6 months

Will be attending NeurIPS. Excited to meet COLM co-organizers in person, and receive suggestions on COLM from everyone in NeurIPS.

Sasha Rush

@srush_nlp

6 months

Some COLM updates - () * Added amazing @aliceoh and @monojitchou as DEI Chairs * PCs are hard at work on org * OpenReview Interest has been overwhelming, (400 surveys responses!) but the team is awesome and it's going to be great.

6

14

113

1

0

50

Denny Zhou

@denny_zhou

2 months

Debates on whether LLMs can reason like humans seem silly to me. What I care is whether a new technique is more useful than the existing ones. Have anyone heard any debate on whether machine learning is human-like? I have not. So, why having such debates on machine reasoning?🤣

9

4

50

Denny Zhou

@denny_zhou

2 years

If you are interested in seeing how decomposition in the least-to-most prompting powerfully generalizes, here is the thread for you. Just use one simple example to teach GPT-3 code-davinci-002 how to decompose a problem, and test it with other problems.

1

4

50

Denny Zhou

@denny_zhou

1 year

How to let LLM write better code ? Let them debug! Led by @xinyun_chen_ , great collaborations with Maxwell and Nathanael!

Xinyun Chen

@xinyun_chen_

1 year

New preprint: Teach LLMs to self-debug! () With few-shot demonstrations, LLMs can perform rubber duck debugging: w/o error messages, it can identify bugs by explaining the predicted code. SOTA on several code generation benchmarks using code-davinci-002.

12

106

534

0

9

49

Denny Zhou

@denny_zhou

2 years

With only 0.1% examples, matched the SoTA in the literature (specialized models w/ full training) on the challenging CFQ benchmark; and achieved new great SoTA with only 1% examples. Opens a great opportunity to use knowledge graphs by natural language! Well done team!

Andrew Drozdov

@mrdrozdov

2 years

🚨 New preprint! 🚨 We refine least-to-most prompting and achieve sota on CFQ (95% accuracy), outperforming previous fully supervised methods. Joint first author work with the formidable Nathanael Schärli.

2

50

216

0

10

49

Denny Zhou

@denny_zhou

9 months

The ultimate breakthrough in AI would be the ability for an AI system to autonomously generate a superior AI system, without relying on human knowledge or guidance.

13

7

44

Denny Zhou

@denny_zhou

2 years

chain-of-thought: self-consistency: least-to-most:

2

43

Denny Zhou

@denny_zhou

4 months

A great success of neural symbolic reasoning with LLMs. The training data is synthetic!

Google DeepMind

@GoogleDeepMind

4 months

Introducing AlphaGeometry: an AI system that solves Olympiad geometry problems at a level approaching a human gold-medalist. 📐 It was trained solely on synthetic data and marks a breakthrough for AI in mathematical reasoning. 🧵

127

1K

4K

5

2

42

Denny Zhou

@denny_zhou

1 year

The magic of LLMs is that human-like reasoning emerges from the trivial next-token prediction pretraining. Knowing every detail in pretraining is not helpful for understanding LLM reasoning. Just like knowing every neuron is not helpful for understanding human intelligence.

Denny Zhou

@denny_zhou

2 years

I asked @xinyun_chen_ how to make parrots, well known for being able to learn and mimic human speech, truly intelligent. She said scaling up.

4

3

24

2

6

38

Denny Zhou

@denny_zhou

2 months

The size of a black-box LLM can be estimated via the maximum length of long addition that the model can achieve.

2

1

38

Denny Zhou

@denny_zhou

1 year

1/9 Self-Consistency Improves Chain of Thought Reasoning in Language Models - Let LLMs generate multiple responses and then take a majority vote - Crushed SoTA results by a large margin slides/video:

1

5

37

Denny Zhou

@denny_zhou

1 year

When one said chain of thought (CoT) prompting doesn't work, it usually only meant that a specific CoT prompt that they wrote didn't work. CoT is just a guideline for writing quality prompts, as OO is a guideline for writing quality code.

2

34

Denny Zhou

@denny_zhou

1 year

Key difference between LLMs and normal softwares: the engineers of a normal software can tell us exactly what their software can do. But don't expect anyone who trained an LLM can tell us what the LLM can do.

0

1

36

Denny Zhou

@denny_zhou

2 years

LLMs need prompts as hardwares need softwares. Without applying right prompts, LLMs mainly generate garbage. Regardless of scales.

2

34

Denny Zhou

@denny_zhou

2 years

Least-to-most prompting has been proposed to solve length-generalization, or more generally, compositional generalization and easy-to-hard generalization. For simple problems like parity, least-to-most prompting can nearly perfectly solve it.

Cem Anil

@cem__anil

2 years

🆕📜We study large language models’ ability to extrapolate to longer problems! 1) finetuning (with and without scratchpad) fails 2) few-shot scratchpad confers significant improvements 3) Many more findings (see the table & thread) Paper: [] 1/

4

40

242

0

6

32

Denny Zhou

@denny_zhou

1 year

Looking forward to this workshop. Excited to see NLP tasks being solved by reasoning with task instructions, instead of relying solely on training models with tons of annotated examples.

Greg Durrett

@gregd_nlp

1 year

We ( @bhavana_dalvi , @peterjansen_ai , @danilodnr2 , @_jasonwei , and Lionel Wong from @MITCoCoSci ) are excited to announce the 1st Workshop on Natural Language Reasoning and Structured Explanations, co-located with ACL 2023. 🧵

1

51

223

0

2

34

Denny Zhou

@denny_zhou

2 years

Did Aristotle use a laptop? Let us see how least-to-most prompting can be used to query LLMs to get this problem correctly solved! Here is the prompt:

3

4

32

Denny Zhou

@denny_zhou

6 months

@ZhangLunjun RL is prefect for games

6

0

33

Denny Zhou

@denny_zhou

1 year

Not surprising at all. What I am surprised is that someone still needs annotations for text tasks.

Aran Komatsuzaki

@arankomatsuzaki

1 year

ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks ChatGPT outperforms crowd-workers for several annotation tasks, including relevance, stance, topics, and frames detection, w/ 20x less cost.

21

155

735

0

3

32

Denny Zhou

@denny_zhou

1 year

@_jasonwei @Google @OpenAI All the best for your next chapter Jason! We’ll miss you!

1

0

32

Denny Zhou

@denny_zhou

1 year

Greatly enjoyed my time at UCSB. Had the opportunity to meet incredible faculty members and students. Thank you for the wonderful experience @WilliamWangNLP @ucsbNLP

William Wang

@WilliamWangNLP

1 year

We are super excited about Google Brain's @denny_zhou 's visit to UCSB Computer Science next week. He will talk about "Teach Language Models to Reason." Date: Monday, January 16th, 2022. Time: 3:30 - 4:30 pm. Location: Henley 1010. See you there!

2

9

58

1

30

Denny Zhou

@denny_zhou

7 months

In the modern days, it becomes much more critical to have LLMs know "I don't know" than humans. :)

Prof. Feynman

@ProfFeynman

7 months

Illusion of knowledge is more dangerous than ignorance: It's Okay to say "I don't know" and admit that you don't know it. It's shameful to pretend that you know everything.

94

1K

6K

0

3

30

Denny Zhou

@denny_zhou

8 months

I hesitate to describe in-context learning as an "emergent property". At least for reasoning benchmarks, few-shot standard prompting shows flat performance curves across various model scales.

2

3

28

Denny Zhou

@denny_zhou

2 years

With 8-shot chain-of-thought prompting, PaLM solves 58% of the problems in GSM8K, outperforming OpenAI’s 55% achieved by fine-tuning the GPT-3with 7500 problems and combining it with an external calculator and verifier! chain-of-thought prompting paper:

Google AI

@GoogleAI

2 years

Introducing the 540 billion parameter Pathways Language Model. Trained on two Cloud #TPU v4 pods, it achieves state-of-the-art performance on benchmarks and shows exciting capabilities like mathematical reasoning, code writing, and even explaining jokes.

76

1K

4K

1

6

28

Denny Zhou

@denny_zhou

1 year

@apkumar_ here you go:

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning. In particular,...

arxiv.org

0

5

25

Denny Zhou

@denny_zhou

1 year

Data should be the most critical part of building an LLM. Hope you will find our preprint led by @ShayneRedford helpful.

Shayne Longpre

@ShayneRedford

1 year

#NewPaperAlert When and where does pretraining (PT) data matter? We conduct the largest published PT data study, varying: 1⃣ Corpus age 2⃣ Quality/toxicity filters 3⃣ Domain composition We have several recs for model creators… 📜: 1/ 🧵

12

85

349

1

5

25

Denny Zhou

@denny_zhou

2 years

I asked @xinyun_chen_ how to make parrots, well known for being able to learn and mimic human speech, truly intelligent. She said scaling up.

4

3

24

Denny Zhou

@denny_zhou

6 months

@tonyabracadabra Eg one can quickly learn how to drive without knowing RL

15

0

25

Denny Zhou

@denny_zhou

5 months

@jbhuang0604 where is the requirement for the industrial jobs from? In my team’s hiring, I usually only look at one arxiv paper that the candidates are most proud of or their projects on github. Below are irrelevant to my hiring: citations, h-index or publication venue.

3

0

24

Denny Zhou

@denny_zhou

1 year

@JesseDodge @ShayneRedford You may look at this paper as well:

The Flan Collection: Designing Data and Methods for Effective...

We study the design decisions of publicly available instruction tuning methods, and break down the development of Flan 2022 (Chung et al., 2022). Through careful ablation studies on the Flan...

arxiv.org

1

4

24

Denny Zhou

@denny_zhou

1 year

@IntuitMachine prompting is 100x more data efficient than finetuning

7

0

23

Denny Zhou

@denny_zhou

7 months

Provide an estimate of the number of attendees expected for the Conference on Language Modeling (COLM) 2024 ()? This will greatly assist us in selecting an appropriately sized venue for the event.🙏🏿

<1000

85

1000 - 2000

100

2000+

175

1

8

24

Denny Zhou

@denny_zhou

1 year

Read all the rebuttals and review feedbacks for the ICLR submissions from my team. Have to say overall the feedback quality is high: constructive and helpful.

1

23

Denny Zhou

@denny_zhou

6 months

@polrjoy @ZhangLunjun general non-game tasks

10

1

24

Denny Zhou

@denny_zhou

5 months

If letting you name one ml technique that was considered to be critical in building AGI but now you think it is irrelevant or at least not important, what is on top of your mind?

12

1

23

Denny Zhou

@denny_zhou

1 year

9/9 Least-to-Most Prompting Enables Complex Reasoning in Large Language Models – Decompose a complex problem into subproblems – Solve problems harder than those in the exemplars slides/video:

2

23

Denny Zhou

@denny_zhou

2 months

We look forward to hearing from talented and ambitious candidates who are eager to contribute to cutting-edge research in LLM.

2

1

22

Denny Zhou

@denny_zhou

1 year

7/9 Mind's Eye: Grounded Language Model Reasoning through Simulation — use a computational physics engine as a tool to ground language model reasoning slides/video:

1

23

Denny Zhou

@denny_zhou

1 year

5/9 UL2: Unifying Language Learning Paradigms – Combines diverse pre-training paradigms together slides/video:

1

2

21

Denny Zhou

@denny_zhou

2 years

"Tiny" LMs augmented with simulators obtain similar performance to "huge" LMs (100x larger). Dont have to build a big LM to store everything.

Ruibo Liu

@RuiboLiu

2 years

Simulation is All You Need for Grounded Reasoning!🔥 Mind's Eye enables LLM to *do experiments*🔬 and then *reason* over the observations🧑‍🔬, which is how we humans explore the unknown for decades.🧑‍🦯🚶🏌 Work done @GoogleAI Brain Team this summer!

12

65

328

0

2

22

Denny Zhou

@denny_zhou

1 year

@DimitrisPapail @AngelikiGiannou @shashank_r12 @Kangwook_Lee @jasondeanlee Nice work! This paper sounds related here:

AK

@_akhaliq

1 year

Memory Augmented Large Language Models are Computationally Universal abs:

10

76

399

0

22

Denny Zhou

@denny_zhou

1 year

2/9 What learning algorithm is in-context learning? Investigations with linear models @akyurekekin @jacobandreas @tengyuma Transformer can implement SGD and Ridge regression slides/video:

1

2

22

Denny Zhou

@denny_zhou

3 months

AI is nothing without #NVDA

0

1

18

Denny Zhou

@denny_zhou

1 year

@Francis_YAO_ GPT4 has been pretrained with GSM8k (See the GPT 4 report)

3

1

20

Denny Zhou

@denny_zhou

24 days

My long-term collaborator Mengdi is leading Princeton’s new AI initiative. They seek postdoc fellows in genAI and AI for science&engineering. Check it out!

Mengdi Wang

@MengdiWang10

24 days

Princeton University #AI is recruiting Postdoc Fellows in AI for Accelerating Invention! Join us if you want to advance generative AI, RL and AI applications in engineering and science! Apply here today: @ryan_p_adams @jrexnet @EPrinceton @Princeton

0

15

72

1

0

20

Denny Zhou

@denny_zhou

11 months

@YiTayML @WenhuChen Any AI breakthrough from mathematicians or learning theorists? Have not noticed any one.

7

0

19