Denny Zhou Profile
Denny Zhou

@denny_zhou

9,084
Followers
432
Following
59
Media
589
Statuses

@GoogleDeepMind founder & lead of Reasoning Team. Build LLMs to reason. Opinions my own.

Joined August 2013
@denny_zhou
Denny Zhou
2 years
The AI revolution has begun. No matter whether we like it or not.
81
134
939
@denny_zhou
Denny Zhou
6 months
RL is a dead end.
77
34
455
@denny_zhou
Denny Zhou
3 years
Our fast tokenizer work is accepted to EMNLP 2021 as oral, and released in several Google products. 8.2x faster than HuggingFace Tokenizers and 5.1x faster than TensorFlow Text.
Tweet media one
5
60
402
@denny_zhou
Denny Zhou
2 months
Our team at Google DeepMind has a full-time Research Scientist position available at our Mountain View site. Minimum qualification: PhD in ML/NLP. Please email me with: your CV and Google Scholar link; a brief description of the impactful work you have done; and what you aim
12
51
290
@denny_zhou
Denny Zhou
6 months
In our ICML 2023 paper "Large language models can be easily distracted by irrelevant context" (), we simply asked LLMs to ignore the irrelevant context and then they worked well. E.g. the example shown in your paper can be solved as follows:
Tweet media one
@jaseweston
Jason Weston
6 months
🚨 New paper! ​​🚨 We introduce System 2 Attention (S2A). - Soft attention in Transformers is susceptible to irrelevant/biased info - S2A uses LLM reasoning to generate what to attend to Improves factuality & objectivity, decreases sycophancy. 🧵(1/5)
Tweet media one
1
271
1K
7
22
259
@denny_zhou
Denny Zhou
2 years
If LLMs are humans, all the ideas are trivial: chain-of-thought prompting ("explain your answer"), self-consistency ("double check your answer"), least-to-most prompting ("decompose to easy subproblems"). The shocking thing is that LLMs are not humans but these still work!
8
29
252
@denny_zhou
Denny Zhou
1 year
Machine learning is statistics. Nearly irrelevant to AI.
34
14
249
@denny_zhou
Denny Zhou
1 year
This week marks ICLR 2023. Thrilled to share a thread dedicated to our 9 accepted papers. We are grateful that all our submissions were accepted and truly appreciate the constructive discussions and valuable feedback provided by the reviewers. 🧵
11
38
247
@denny_zhou
Denny Zhou
19 days
Long context is essentially a length generalization problem. Our recent work "Transformers Can Achieve Length Generalization But Not Robustly" () showed a surprisingly simple approach to achieving length generalization: train your model several times using
6
29
240
@denny_zhou
Denny Zhou
1 year
Reducing serving cost is a huge problem in the field of LLMs. It is typically done by distillation and quantization. We propose "LLMs as Tool Makers" as a complementary solution to these existing techniques.
Tweet media one
5
47
237
@denny_zhou
Denny Zhou
2 years
Self-consistency amazingly improves chain-of-thought prompting, even 50+% improvements on many tasks. Why does it work so well? It essentially integrates/marginalizes out all latent reasoning paths to compute the full probability of each distinct answer!
Tweet media one
7
42
229
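The marginalization argument above lends itself to a tiny sketch. This is a toy illustration only: `SAMPLED_PATHS` and `self_consistency` are hypothetical stand-ins for sampling several chain-of-thought completions from an LLM at nonzero temperature.

```python
from collections import Counter

# Hypothetical sample pool: in practice each entry would be a
# (reasoning_path, final_answer) pair sampled from an LLM.
SAMPLED_PATHS = [
    ("3 cars + 2 cars = 5 cars", "5"),
    ("start with 3, add 2, get 5", "5"),
    ("3 * 2 = 6", "6"),
    ("2 more than 3 is 5", "5"),
]

def self_consistency(paths):
    """Marginalize out the latent reasoning paths: each sampled path
    contributes its probability mass to its final answer, so counting
    answers across samples approximates each answer's full probability."""
    counts = Counter(answer for _, answer in paths)
    answer, _ = counts.most_common(1)[0]
    return answer

print(self_consistency(SAMPLED_PATHS))
```

Under this view the vote over answers is a Monte Carlo estimate of the answer marginal, which is why it differs from naive majority voting over raw outputs.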
@denny_zhou
Denny Zhou
2 years
Self-consistency improves chain of thought reasoning in large language models.
Tweet media one
7
35
224
@denny_zhou
Denny Zhou
10 months
Maybe time to initiate a new conference dedicated to LLMs, reminiscent of how ICLR emerged for DL years ago. This could also help reduce submissions to NeurIPS and ICLR. Any thoughts?
14
9
196
@denny_zhou
Denny Zhou
6 months
This is not quite RL: the blue one looks way smarter than the rest. It seems to plan with reasoning, and the others that succeeded basically imitate the blue one.
@_vztu
Zhengzhong Tu
6 months
Reinforcement learning explained
157
2K
13K
11
10
179
@denny_zhou
Denny Zhou
1 year
A key point in the chain-of-thought prompting paper: RIP downstream-task finetuning on LLMs.
7
21
175
@denny_zhou
Denny Zhou
10 months
Slides of my talk given at the fascinating ACL 2023 natural language reasoning workshop (): several points to highlight:
2
24
157
@denny_zhou
Denny Zhou
1 year
To fix hallucinations in chatbots, a high-quality search engine is the first must-have. This is obvious to experts, but definitely not to everyone.
13
9
152
@denny_zhou
Denny Zhou
1 year
Discovering and understanding emergent abilities of LLMs, like CoT reasoning, is the most exciting and challenging topic in LLM research. These emergent abilities are not intentionally built by anyone who trained LLMs via predicting next tokens.
5
22
145
@denny_zhou
Denny Zhou
2 years
Biggest milestones in AI: perceptron, deep learning / transformer, pretraining + prompting.
5
15
135
@denny_zhou
Denny Zhou
1 year
LLMs/AGI is not a zero-sum game. We are just at the beginning of a great revolution.
12
12
144
@denny_zhou
Denny Zhou
11 months
Self-debug has worked quite well on the raw GPT3.5 model: code-davinci-002 (not instruction tuned):
Tweet media one
@DrJimFan
Jim Fan
11 months
GPT-4 has one emergent ability that is extremely useful and stronger than any other models: self-debug. Even the most expert human programmer cannot always get a program correct at the first try. We look at execution results, reason about what's wrong, apply fixes, rinse and
Tweet media one
53
398
2K
8
33
130
@denny_zhou
Denny Zhou
9 months
Chain of thought (CoT) and related work are "cognitive science/psychology for LLMs" rather than prompt engineering. Discovering and understanding how CoT emerged from LLM next-token-prediction pretraining is like studying how human cognitive abilities emerged from neurons.
9
13
132
@denny_zhou
Denny Zhou
2 months
Those who interpret/cite self-consistency (SC) as majority voting (MV) or other MV equivalents need to take an entry-level course in machine learning. In our original paper (), Xuezhi and I provided the math theory underlying SC:
5
17
132
@denny_zhou
Denny Zhou
8 months
Fantastic comments ( @jkronand , @enjoyingthewind ) connect our LLM reasoning work to Polya. I then checked the book "How to Solve It". Page 75: "decomposing and recombining" maps to "Least to Most Prompting". Page 98: "do you know a related problem".
Tweet media one
5
9
125
@denny_zhou
Denny Zhou
11 months
Compared to AI safety, now I am more concerned about the decline of human intelligence.
10
16
123
@denny_zhou
Denny Zhou
2 years
I am planning to submit a workshop proposal on "large language model prompting" to ICLR '23. What will be your favorite name for the workshop? Tell me and I will choose the most voted name to submit.😀
58
8
125
@denny_zhou
Denny Zhou
6 months
I believe OpenAI will continue to achieve remarkable breakthroughs as long as people like Ilya and Alec are still there. The next big innovation would make most of the current techniques (if not all) irrelevant. The imitation game is just at its very beginning.
@ilyasut
Ilya Sutskever
6 months
I deeply regret my participation in the board's actions. I never intended to harm OpenAI. I love everything we've built together and I will do everything I can to reunite the company.
7K
4K
33K
7
10
117
@denny_zhou
Denny Zhou
2 years
Prompting seems to be difficult for some machine learning researchers to understand. This is not surprising because prompting is not machine learning. Prompting is the opposite of machine learning.
6
10
112
@denny_zhou
Denny Zhou
1 year
Just tried GPT-3 003: "What is the largest number?" It says "The largest number possible would be infinity". I asked the same question to my kid (3rd grade). She said "There is no largest number. For any given number, I can always make it larger by plus one."
Tweet media one
11
2
114
@denny_zhou
Denny Zhou
2 years
I don't think there is magic here: text-davinci-002 and other 002 models in GPT-3, and InstructGPT, should have been finetuned with "let's think step by step ...". I tried the 001 models in GPT-3 and none of them works with this kind of prompt, while CoT still works.
@arankomatsuzaki
Aran Komatsuzaki
2 years
Large Language Models are Zero-Shot Reasoners Simply adding “Let’s think step by step” before each answer increases the accuracy on MultiArith from 17.7% to 78.7% and GSM8K from 10.4% to 40.7% with GPT-3.
Tweet media one
59
573
3K
8
13
112
@denny_zhou
Denny Zhou
5 months
In NeurIPS 2023, there is a section “CoT/Reasoning”. When preparing our CoT paper, I kicked off a discussion on the title. Different names were proposed, like stream of thought (Jason), train of thought (Dale), chain of thought (Dale). Finally I decided to choose “chain of
Tweet media one
8
12
108
@denny_zhou
Denny Zhou
8 months
Automate chain-of-thought prompting simply by letting LLMs recall relevant questions/knowledge that they have seen. Fully matches hand-crafted CoT performance on reasoning/coding benchmarks and even performs better.
Tweet media one
@michiyasunaga
Michi Yasunaga
8 months
Introducing Analogical Prompting, a new method to help LLMs solve reasoning problems. Idea: To solve a new problem, humans often draw from past experiences, recalling similar problems they have solved before. Can we prompt LLMs to mimic this? [1/n]
Tweet media one
9
110
482
2
13
108
@denny_zhou
Denny Zhou
1 year
Why does few-shot prompting work without any training or finetuning? Few-shot prompting can be equivalent to fine-tuning running inside of an LLM! Led by @akyurekekin , great collaboration with amazing folks: Dale Schuurmans, @jacobandreas , @tengyuma !
3
12
104
@denny_zhou
Denny Zhou
5 months
A simple yet effective approach to close the performance gap between zero-shot and few-shot prompting: Xinyun Chen @xinyun_chen_ is going to present our recent work on LLM analogical reasoning () this afternoon in the exciting #MathAI workshop of #NeurIPS2023 .
@xinyun_chen_
Xinyun Chen
5 months
Looking forward to the workshop! Excited to share our recent work on LLM reasoning
0
3
37
3
20
101
@denny_zhou
Denny Zhou
2 years
I asked an LLM its confidence in its answers. It turned out the LLM was highly confident in every answer. I was disappointed at the uselessness. Then I realized this is actually more empirical evidence showing the amazing coincidence between LLMs and humans, e.g. people like me.
11
6
95
@denny_zhou
Denny Zhou
8 months
See how Bard made it:
Tweet media one
@VinceVatter
Vince Vatter
8 months
Sally (a girl) has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have? Here are 60 LLMs getting it wrong.
183
247
1K
10
5
97
@denny_zhou
Denny Zhou
1 year
Don't try to "improve" chain-of-thought (CoT) prompts by "clever engineering". Those "engineering" tricks will be meaningless when LLMs are further improved. Should write CoT prompts naturally. Here is my favorite one taken from my kid's homework:
Tweet media one
1
11
97
@denny_zhou
Denny Zhou
7 months
A new conference dedicated to language modeling, for everyone. Would be great to see any feedback you have:
@srush_nlp
Sasha Rush
7 months
Introducing COLM () the Conference on Language Modeling. A new research venue dedicated to the theory, practice, and applications of language models. Submissions: March 15 (it's pronounced "collum" 🕊️)
Tweet media one
35
437
2K
3
6
82
@denny_zhou
Denny Zhou
2 years
Using "natural language" to describe rationales, pioneered by (Ling et al. 2017) and then followed by (Cobbe et al. 2021, Wei et al. 2022), is essential for the success of chain of thought prompting. 1/
3
16
82
@denny_zhou
Denny Zhou
1 year
Many years ago, Yahoo gave up building its own search engine and outsourced it to Google, and investors cheered up. Now Microsoft gives up building its own LLM and outsources it to OpenAI, and investors cheer up.
12
8
77
@denny_zhou
Denny Zhou
6 months
Amazing! Met @demi_guo_ in Google Beijing about 5 years ago. She came with Xiaodong He, my invitee and collaborator, who did pioneering work on text-to-image. Now Pika's work is just like magic.
@demi_guo_
Demi Guo
6 months
Excited to share that I recently left Stanford AI PhD to start Pika. Words can't express how grateful I am to have all the support from our investors, advisors, friends, and community members along this journey! And there's nothing more exciting than working on this ambitious &
479
520
6K
3
4
79
@denny_zhou
Denny Zhou
1 year
search queries are zero-shot prompts
6
4
74
@denny_zhou
Denny Zhou
7 months
“If I were given one hour to save the planet, I would spend 59 minutes defining the problem and one minute resolving it.” — Albert Einstein
2
3
71
@denny_zhou
Denny Zhou
2 years
For our ICLR submission, a reviewer wrote "I see a major limitation with this approach: it looks like the prompt examples need to be designed specifically for each dataset." Sounds like they expect AGI. That would be a really high standard.
4
4
67
@denny_zhou
Denny Zhou
10 months
Here is a simple way to reduce the number of submissions to a conference: charge a fee for each submission, e.g. $100, refunded if the paper is accepted.
18
2
64
@denny_zhou
Denny Zhou
2 years
Using rationales to solve math word problems has been in the NLP literature for years. Here is Fig 1 taken from “Learning to Solve and Explain Algebraic Word Problems” (Ling et al 2017, ACL).
Tweet media one
2
12
63
@denny_zhou
Denny Zhou
2 years
Prompting is like teaching/instructing kids. When I was asked how to write a chain-of-thought prompt, I showed them my kid's homework.
Tweet media one
2
6
61
@denny_zhou
Denny Zhou
7 months
Although our work is about LLMs, I actually don't think humans do better than LLMs on self-correcting. Look at those countless mistakes in AI tweets.
@jefffhj
Jie Huang
7 months
Interesting papers that share many findings similar to our recent work (), in which we argue that "Large Language Models Cannot Self-Correct Reasoning Yet". I'm happy (as an honest researcher) and sad (as an AGI enthusiast) to see our conclusions confirmed
Tweet media one
8
61
338
2
11
60
@denny_zhou
Denny Zhou
1 month
Welcome to the new era of AI: "Deep" was once the buzzword at AI conferences, but that's no longer the case at COLM.
Tweet media one
@yoavartzi
Yoav Artzi
2 months
Folks, some @COLM_conf stats, because looking at these really brightens the mood :) We received a total of ⭐️1036⭐️ submissions (for the first ever COLM!!!!). What is even more exciting is the nice distribution of topics and keywords. Exciting times ahead! ❤️
Tweet media one
Tweet media two
4
33
256
1
10
62
@denny_zhou
Denny Zhou
2 years
A nice combination of LLMs and Google. Like least-to-most prompting, a complex question is decomposed into a set of easier subquestions. The key difference here is that Google is used to answer those subquestions instead of LLMs.
@OfirPress
Ofir Press
2 years
We've found a new way to prompt language models that improves their ability to answer complex questions Our Self-ask prompt first has the model ask and answer simpler subquestions. This structure makes it easy to integrate Google Search into an LM. Watch our demo with GPT-3 🧵⬇️
52
307
2K
1
12
59
@denny_zhou
Denny Zhou
6 months
The “emergence” in LLMs has nothing to do with the “emergence” in complex systems, e.g. swarm intelligence. It is unfortunate to use the same term to illustrate two unrelated concepts and lead to confusion and misinterpretation.
3
11
60
@denny_zhou
Denny Zhou
2 years
@arankomatsuzaki The truth should be simple: text-davinci-002 (175B) or other 002 models, or InstructGPT, should have been finetuned with "let's think step by step ...". I tried the 001 models and none of them works with the proposed method, while CoT still works well.
5
6
53
@denny_zhou
Denny Zhou
11 months
@DrJimFan Self-debug:
@johnjnay
John Nay
1 year
Teaching LLMs to Self-Debug -Prompt instructs LLM to execute code then generate a feedback message based on result -W/ out any feedback on code correctness, LLM is able to identify its mistakes by explaining its code -SoTA performance on code generation
Tweet media one
16
155
812
4
6
53
@denny_zhou
Denny Zhou
2 years
Go big, or go home. Big means big language models.
4
2
49
@denny_zhou
Denny Zhou
6 months
Will be attending NeurIPS. Excited to meet COLM co-organizers in person, and receive suggestions on COLM from everyone in NeurIPS.
@srush_nlp
Sasha Rush
6 months
Some COLM updates - () * Added amazing @aliceoh and @monojitchou as DEI Chairs * PCs are hard at work on org * OpenReview Interest has been overwhelming, (400 surveys responses!) but the team is awesome and it's going to be great.
6
14
113
1
0
50
@denny_zhou
Denny Zhou
2 months
Debates on whether LLMs can reason like humans seem silly to me. What I care about is whether a new technique is more useful than the existing ones. Has anyone heard any debate on whether machine learning is human-like? I have not. So why have such debates on machine reasoning?🤣
9
4
50
@denny_zhou
Denny Zhou
2 years
If you are interested in seeing how decomposition in the least-to-most prompting powerfully generalizes, here is the thread for you. Just use one simple example to teach GPT-3 code-davinci-002 how to decompose a problem, and test it with other problems.
Tweet media one
1
4
50
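The two-stage decompose-then-solve loop described above can be sketched in code. This is a hypothetical illustration: `decompose` and `solve_step` stand in for the two LLM prompts, demonstrated here on a toy last-letter-concatenation task rather than real model calls.

```python
def decompose(problem):
    """Stage 1 (stand-in for the decomposition prompt): reduce the
    problem to a sequence of easier subproblems, easiest first."""
    words = problem.split()
    return [" ".join(words[: i + 1]) for i in range(len(words))]

def solve_step(subproblem, context):
    """Stage 2 (stand-in for the solving prompt): solve one subproblem,
    conditioning on the answers to earlier subproblems."""
    prev = context[-1][1] if context else ""
    return prev + subproblem.split()[-1][-1]

def least_to_most(problem):
    # Solve subproblems in order, appending each (subproblem, answer)
    # pair to the context fed to the next step.
    context = []
    for sub in decompose(problem):
        context.append((sub, solve_step(sub, context)))
    return context[-1][1]

print(least_to_most("think machine learning"))
```

The point of the structure, as in the thread, is that one decomposition demonstration generalizes: the loop handles longer inputs than any exemplar.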
@denny_zhou
Denny Zhou
1 year
How to let LLMs write better code? Let them debug! Led by @xinyun_chen_ , great collaboration with Maxwell and Nathanael!
@xinyun_chen_
Xinyun Chen
1 year
New preprint: Teach LLMs to self-debug! () With few-shot demonstrations, LLMs can perform rubber duck debugging: w/o error messages, it can identify bugs by explaining the predicted code. SOTA on several code generation benchmarks using code-davinci-002.
Tweet media one
12
106
534
0
9
49
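The self-debug loop in the quoted thread can be sketched roughly as follows (hypothetical names throughout; `toy_generate` is a stand-in for the LLM, which in the real method receives the execution feedback, or its own explanation of the code, in its prompt):

```python
import traceback

def run_candidate(code):
    """Execute a candidate program and return (ok, feedback)."""
    env = {}
    try:
        exec(code, env)
        return True, ""
    except Exception:
        return False, traceback.format_exc()

def self_debug(generate, max_rounds=3):
    """Generate code, run it, and feed execution feedback back to the
    generator until it succeeds or the round budget is exhausted."""
    feedback = ""
    for _ in range(max_rounds):
        code = generate(feedback)
        ok, feedback = run_candidate(code)
        if ok:
            return code
    return None

# Toy "LLM": first attempt has a bug, fixed once it sees the traceback.
def toy_generate(feedback):
    if "ZeroDivisionError" in feedback:
        return "result = 10 // 2"
    return "result = 10 // 0"

fixed = self_debug(toy_generate)
```

Note the rubber-duck variant in the quoted tweet needs no error messages at all: the model explains its own code and spots the bug from the explanation.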
@denny_zhou
Denny Zhou
2 years
With only 0.1% of the examples, matched the SoTA in the literature (specialized models w/ full training) on the challenging CFQ benchmark, and achieved a new SoTA with only 1% of the examples. Opens a great opportunity to use knowledge graphs via natural language! Well done team!
@mrdrozdov
Andrew Drozdov
2 years
🚨 New preprint! 🚨 We refine least-to-most prompting and achieve sota on CFQ (95% accuracy), outperforming previous fully supervised methods. Joint first author work with the formidable Nathanael Schärli.
2
50
216
0
10
49
@denny_zhou
Denny Zhou
9 months
The ultimate breakthrough in AI would be the ability for an AI system to autonomously generate a superior AI system, without relying on human knowledge or guidance.
13
7
44
@denny_zhou
Denny Zhou
2 years
chain-of-thought: self-consistency: least-to-most:
2
2
43
@denny_zhou
Denny Zhou
4 months
A great success of neural symbolic reasoning with LLMs. The training data is synthetic!
@GoogleDeepMind
Google DeepMind
4 months
Introducing AlphaGeometry: an AI system that solves Olympiad geometry problems at a level approaching a human gold-medalist. 📐 It was trained solely on synthetic data and marks a breakthrough for AI in mathematical reasoning. 🧵
127
1K
4K
5
2
42
@denny_zhou
Denny Zhou
1 year
The magic of LLMs is that human-like reasoning emerges from the trivial next-token prediction pretraining. Knowing every detail in pretraining is not helpful for understanding LLM reasoning. Just like knowing every neuron is not helpful for understanding human intelligence.
@denny_zhou
Denny Zhou
2 years
I asked @xinyun_chen_ how to make parrots, well known for being able to learn and mimic human speech, truly intelligent. She said scaling up.
Tweet media one
4
3
24
2
6
38
@denny_zhou
Denny Zhou
2 months
The size of a black-box LLM can be estimated via the maximum length of long addition that the model can achieve.
2
1
38
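The probing idea above can be sketched as follows. This is a toy illustration: `toy_model` stands in for a black-box LLM API, and the sketch only measures the maximum reliable addition length; how that length maps to a size estimate is not specified in the tweet.

```python
import random

def max_addition_length(model, trials=5, max_digits=40):
    """Probe the longest n-digit addition a black-box model gets right.
    `model` maps a prompt string like '123+456=' to an answer string."""
    for n in range(1, max_digits + 1):
        for _ in range(trials):
            a = random.randint(10 ** (n - 1), 10 ** n - 1)
            b = random.randint(10 ** (n - 1), 10 ** n - 1)
            if model(f"{a}+{b}=") != str(a + b):
                # First failing operand length bounds the capability.
                return n - 1
    return max_digits

# Toy stand-in model that is only reliable up to 6-digit operands.
def toy_model(prompt):
    a, b = prompt.rstrip("=").split("+")
    if max(len(a), len(b)) > 6:
        return "?"
    return str(int(a) + int(b))
```

A real probe would want many trials per length and a tolerance for occasional slips, since API models are stochastic.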
@denny_zhou
Denny Zhou
1 year
1/9 Self-Consistency Improves Chain of Thought Reasoning in Language Models - Let LLMs generate multiple responses and then take a majority vote - Crushed SoTA results by a large margin slides/video:
Tweet media one
1
5
37
@denny_zhou
Denny Zhou
1 year
When one said chain of thought (CoT) prompting doesn't work, it usually only meant that a specific CoT prompt that they wrote didn't work. CoT is just a guideline for writing quality prompts, as OO is a guideline for writing quality code.
2
2
34
@denny_zhou
Denny Zhou
1 year
Key difference between LLMs and normal software: the engineers of normal software can tell us exactly what their software can do. But don't expect anyone who trained an LLM to be able to tell us what the LLM can do.
0
1
36
@denny_zhou
Denny Zhou
2 years
LLMs need prompts as hardware needs software. Without the right prompts, LLMs mainly generate garbage, regardless of scale.
Tweet media one
2
2
34
@denny_zhou
Denny Zhou
2 years
Least-to-most prompting has been proposed to solve length-generalization, or more generally, compositional generalization and easy-to-hard generalization. For simple problems like parity, least-to-most prompting can nearly perfectly solve it.
Tweet media one
@cem__anil
Cem Anil
2 years
🆕📜We study large language models’ ability to extrapolate to longer problems! 1) finetuning (with and without scratchpad) fails 2) few-shot scratchpad confers significant improvements 3) Many more findings (see the table & thread) Paper: [] 1/
Tweet media one
4
40
242
0
6
32
@denny_zhou
Denny Zhou
1 year
Looking forward to this workshop. Excited to see NLP tasks being solved by reasoning with task instructions, instead of relying solely on training models with tons of annotated examples.
@gregd_nlp
Greg Durrett
1 year
We ( @bhavana_dalvi , @peterjansen_ai , @danilodnr2 , @_jasonwei , and Lionel Wong from @MITCoCoSci ) are excited to announce the 1st Workshop on Natural Language Reasoning and Structured Explanations, co-located with ACL 2023. 🧵
Tweet media one
1
51
223
0
2
34
@denny_zhou
Denny Zhou
2 years
Did Aristotle use a laptop? Let us see how least-to-most prompting can be used to query LLMs to get this problem correctly solved! Here is the prompt:
Tweet media one
3
4
32
@denny_zhou
Denny Zhou
6 months
@ZhangLunjun RL is perfect for games
6
0
33
@denny_zhou
Denny Zhou
1 year
Not surprising at all. What surprises me is that someone still needs annotations for text tasks.
@arankomatsuzaki
Aran Komatsuzaki
1 year
ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks ChatGPT outperforms crowd-workers for several annotation tasks, including relevance, stance, topics, and frames detection, w/ 20x less cost.
Tweet media one
21
155
735
0
3
32
@denny_zhou
Denny Zhou
1 year
@_jasonwei @Google @OpenAI All the best for your next chapter Jason! We’ll miss you!
1
0
32
@denny_zhou
Denny Zhou
1 year
Greatly enjoyed my time at UCSB. Had the opportunity to meet incredible faculty members and students. Thank you for the wonderful experience @WilliamWangNLP @ucsbNLP
@WilliamWangNLP
William Wang
1 year
We are super excited about Google Brain's @denny_zhou 's visit to UCSB Computer Science next week. He will talk about "Teach Language Models to Reason." Date: Monday, January 16th, 2022. Time: 3:30 - 4:30 pm. Location: Henley 1010. See you there!
Tweet media one
2
9
58
1
1
30
@denny_zhou
Denny Zhou
7 months
These days, it is much more critical for LLMs to know "I don't know" than for humans. :)
@ProfFeynman
Prof. Feynman
7 months
Illusion of knowledge is more dangerous than ignorance: It's Okay to say "I don't know" and admit that you don't know it. It's shameful to pretend that you know everything.
94
1K
6K
0
3
30
@denny_zhou
Denny Zhou
8 months
I hesitate to describe in-context learning as an "emergent property". At least for reasoning benchmarks, few-shot standard prompting shows flat performance curves across various model scales.
2
3
28
@denny_zhou
Denny Zhou
2 years
With 8-shot chain-of-thought prompting, PaLM solves 58% of the problems in GSM8K, outperforming OpenAI's 55% achieved by fine-tuning GPT-3 with 7,500 problems and combining it with an external calculator and verifier! chain-of-thought prompting paper:
@GoogleAI
Google AI
2 years
Introducing the 540 billion parameter Pathways Language Model. Trained on two Cloud #TPU v4 pods, it achieves state-of-the-art performance on benchmarks and shows exciting capabilities like mathematical reasoning, code writing, and even explaining jokes.
76
1K
4K
1
6
28
@denny_zhou
Denny Zhou
1 year
Data should be the most critical part of building an LLM. Hope you will find our preprint led by @ShayneRedford helpful.
@ShayneRedford
Shayne Longpre
1 year
#NewPaperAlert When and where does pretraining (PT) data matter? We conduct the largest published PT data study, varying: 1⃣ Corpus age 2⃣ Quality/toxicity filters 3⃣ Domain composition We have several recs for model creators… 📜: 1/ 🧵
Tweet media one
12
85
349
1
5
25
@denny_zhou
Denny Zhou
2 years
I asked @xinyun_chen_ how to make parrots, well known for being able to learn and mimic human speech, truly intelligent. She said scaling up.
Tweet media one
4
3
24
@denny_zhou
Denny Zhou
6 months
@tonyabracadabra Eg one can quickly learn how to drive without knowing RL
15
0
25
@denny_zhou
Denny Zhou
5 months
@jbhuang0604 Where is the requirement for industrial jobs from? In my team's hiring, I usually only look at the one arXiv paper the candidates are most proud of, or their projects on GitHub. The following are irrelevant to my hiring: citations, h-index, or publication venue.
3
0
24
@denny_zhou
Denny Zhou
1 year
@IntuitMachine prompting is 100x more data efficient than finetuning
7
0
23
@denny_zhou
Denny Zhou
7 months
Could you provide an estimate of the number of attendees expected for the Conference on Language Modeling (COLM) 2024 ()? This will greatly assist us in selecting an appropriately sized venue for the event.🙏🏿
<1000
85
1000 - 2000
100
2000+
175
1
8
24
@denny_zhou
Denny Zhou
1 year
Read all the rebuttals and review feedback for the ICLR submissions from my team. Have to say the overall feedback quality is high: constructive and helpful.
1
1
23
@denny_zhou
Denny Zhou
6 months
@polrjoy @ZhangLunjun general non-game tasks
10
1
24
@denny_zhou
Denny Zhou
5 months
If you had to name one ML technique that was considered critical for building AGI but that you now think is irrelevant, or at least not important, what is top of mind?
12
1
23
@denny_zhou
Denny Zhou
1 year
9/9 Least-to-Most Prompting Enables Complex Reasoning in Large Language Models – Decompose a complex problem into subproblems – Solve problems harder than those in the exemplars slides/video:
Tweet media one
2
2
23
@denny_zhou
Denny Zhou
2 months
We look forward to hearing from talented and ambitious candidates who are eager to contribute to cutting-edge research in LLM.
2
1
22
@denny_zhou
Denny Zhou
1 year
7/9 Mind's Eye: Grounded Language Model Reasoning through Simulation — use a computational physics engine as a tool to ground language model reasoning slides/video:
Tweet media one
1
1
23
@denny_zhou
Denny Zhou
1 year
5/9 UL2: Unifying Language Learning Paradigms – Combines diverse pre-training paradigms together slides/video:
Tweet media one
1
2
21
@denny_zhou
Denny Zhou
2 years
"Tiny" LMs augmented with simulators obtain similar performance to "huge" LMs (100x larger). Don't have to build a big LM to store everything.
@RuiboLiu
Ruibo Liu
2 years
Simulation is All You Need for Grounded Reasoning!🔥 Mind's Eye enables LLM to *do experiments*🔬 and then *reason* over the observations🧑‍🔬, which is how we humans explore the unknown for decades.🧑‍🦯🚶🏌 Work done @GoogleAI Brain Team this summer!
12
65
328
0
2
22
@denny_zhou
Denny Zhou
1 year
@_akhaliq
AK
1 year
Memory Augmented Large Language Models are Computationally Universal abs:
Tweet media one
10
76
399
0
0
22
@denny_zhou
Denny Zhou
1 year
2/9 What learning algorithm is in-context learning? Investigations with linear models @akyurekekin @jacobandreas @tengyuma Transformer can implement SGD and Ridge regression slides/video:
Tweet media one
1
2
22
@denny_zhou
Denny Zhou
3 months
AI is nothing without #NVDA
Tweet media one
0
1
18
@denny_zhou
Denny Zhou
1 year
@Francis_YAO_ GPT4 has been pretrained with GSM8k (See the GPT 4 report)
3
1
20
@denny_zhou
Denny Zhou
24 days
My long-term collaborator Mengdi is leading Princeton’s new AI initiative. They seek postdoc fellows in genAI and AI for science&engineering. Check it out!
@MengdiWang10
Mengdi Wang
24 days
Princeton University #AI is recruiting Postdoc Fellows in AI for Accelerating Invention! Join us if you want to advance generative AI, RL and AI applications in engineering and science! Apply here today: @ryan_p_adams @jrexnet @EPrinceton @Princeton
Tweet media one
0
15
72
1
0
20
@denny_zhou
Denny Zhou
11 months
@YiTayML @WenhuChen Any AI breakthrough from mathematicians or learning theorists? I have not noticed any.
7
0
19