Ruibo Liu

@RuiboLiu

1,938 Followers · 1,293 Following · 18 Media · 455 Statuses

Research Scientist @GoogleDeepMind . AI Research with Humans in Mind.

Joined October 2015
@RuiboLiu
Ruibo Liu
2 years
Simulation is All You Need for Grounded Reasoning!🔥 Mind's Eye enables LLMs to *do experiments*🔬 and then *reason* over the observations🧑‍🔬, which is how we humans have explored the unknown for decades.🧑‍🦯🚶🏌 Work done at the @GoogleAI Brain Team this summer!
@arankomatsuzaki
Aran Komatsuzaki
2 years
Mind's Eye: Grounded Language Model Reasoning through Simulation Improves reasoning ability using MuJoCo simulations by a large margin (+27.9/46.0% zero/few-shot absolute acc. on average). LMs + Mind's Eye performs on par with 100x larger models.
Tweet media one
11
81
392
12
65
328
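For readers skimming the thread, here is a minimal, hypothetical sketch of the "do experiments, then reason" loop the tweet describes. The helpers call_lm and run_simulation are placeholder callables (an LM client and a MuJoCo-style run), not the paper's released code.

from typing import Callable

def grounded_answer(question: str,
                    call_lm: Callable[[str], str],
                    run_simulation: Callable[[str], str]) -> str:
    # 1) The LM translates the question into an executable experiment spec.
    sim_spec = call_lm("Write a physics experiment that tests: " + question)
    # 2) The physics engine (MuJoCo in the paper) runs it and reports the outcome.
    observation = run_simulation(sim_spec)
    # 3) The observation is injected as grounding context before the LM reasons.
    prompt = ("Simulation result: " + observation + "\n"
              "Question: " + question + "\n"
              "Answer, reasoning over the simulation result:")
    return call_lm(prompt)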
@RuiboLiu
Ruibo Liu
1 year
🎲Life is a game. Play by your rules! 🎮 Stable Alignment enables LM to learn social norms from simulated everyday interactions in a social game! 👫 Check this out 👇:
5
50
271
@RuiboLiu
Ruibo Liu
1 year
First impression! I find Google's #Bard seems to be surprisingly safe (and high quality)! 👏 The game starts seriously! #ChatGPT
Tweet media one
8
19
199
@RuiboLiu
Ruibo Liu
9 months
My takeaways after watching John Schulman's talk at #ICML2023 : 1. Over-optimization in reward modeling is a real problem. OpenAI's remedy is using larger RMs. 2. Best-of-N is actually more efficient than RL in spending PPL for alignment. 3. Multi-agent setup might be useful.
Tweet media one
6
24
158
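A tiny illustration of the Best-of-N point in the takeaways above (item 2): sample N candidates from the frozen base model and keep the one a reward model prefers, with no policy update. generate and reward_model are assumed callables, not any specific API.

from typing import Callable, List

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              reward_model: Callable[[str, str], float],
              n: int = 16) -> str:
    # Draw N samples from the frozen base policy.
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    # Alignment comes purely from selection; no RL update touches the weights,
    # which also limits how far outputs can drift from the base model.
    return max(candidates, key=lambda c: reward_model(prompt, c))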
@RuiboLiu
Ruibo Liu
28 days
Thanks Aran for sharing our work! This is a survey paper I’ve been thinking about for a long time, as we have seen an increasing need for synthetic data. As we will probably run out of fresh tokens soon, the audience of this paper should be everyone who cares about AI progress.
@arankomatsuzaki
Aran Komatsuzaki
28 days
Google presents Best Practices and Lessons Learned on Synthetic Data for Language Models Provides an overview of synthetic data research, discussing its applications, challenges, and future directions
Tweet media one
6
136
701
1
19
112
@RuiboLiu
Ruibo Liu
1 year
My new work on AI alignment! Life is a game. Play by your rules. Stay tuned for a detailed introduction! 😄
@arankomatsuzaki
Aran Komatsuzaki
1 year
Training Socially Aligned Language Models in Simulated Human Society - Presents a novel training paradigm that permits LMs to learn from simulated social interactions - Superior performance in alignment benchmarks and human evaluations
Tweet media one
8
50
242
5
17
101
@RuiboLiu
Ruibo Liu
13 days
I believe any PhD who cares about quality more than quantity works in single-threaded mode.
@WenhuChen
Wenhu Chen
13 days
Out of curiosity, do AI PhDs normally work (lead) on several projects simultaneously? I have never managed to work on more than one project during my PhD and I tried to convince my students not to do so. The paradigm might have already changed, so I am asking here.
22
10
75
4
4
84
@RuiboLiu
Ruibo Liu
2 months
Can LLMs serve as front-end engineers? 🪄 We present Design2Code, a new MM + code reasoning benchmark to evaluate this ability in current LLMs. Given several screenshots of the target website, the LLM agent needs to generate the corresponding code accurately!⚡️⚡️⚡️ More details…
@_akhaliq
AK
2 months
Design2Code How Far Are We From Automating Front-End Engineering? Generative AI has made rapid advancements in recent years, achieving unprecedented capabilities in multimodal understanding and code generation. This can enable a new paradigm of front-end development, in
Tweet media one
28
296
1K
1
16
78
@RuiboLiu
Ruibo Liu
1 year
Sometimes I feel sad about current academic NLP research, since fewer and fewer ideas are ambitious enough to open a new door for the future, while more and more are essentially "trying hard to replicate some closed-source industry models". 😔
3
5
63
@RuiboLiu
Ruibo Liu
4 months
Now Stable Alignment is accepted to #ICLR2024 ! We developed a SandBox to obtain fine-grained alignment data at scale, and used simple contrastive learning to train models! We have released data/code/models. Please try it if you are interested!
@RuiboLiu
Ruibo Liu
1 year
🎲Life is a game. Play by your rules! 🎮 Stable Alignment enables LM to learn social norms from simulated everyday interactions in a social game! 👫 Check this out 👇:
5
50
271
1
8
60
@RuiboLiu
Ruibo Liu
6 days
"Agent Researchers" are HCI researchers in the new era.
3
7
53
@RuiboLiu
Ruibo Liu
5 months
Congrats to the team! 👏 Happy to contribute!
@GoogleDeepMind
Google DeepMind
5 months
We’re excited to announce 𝗚𝗲𝗺𝗶𝗻𝗶: @Google ’s largest and most capable AI model. Built to be natively multimodal, it can understand and operate across text, code, audio, image and video - and achieves state-of-the-art performance across many tasks. 🧵
173
2K
6K
4
2
51
@RuiboLiu
Ruibo Liu
28 days
Thanks AK for picking our work as one of the daily papers! 🚀🚀🚀
@_akhaliq
AK
28 days
Best Practices and Lessons Learned on Synthetic Data for Language Models The success of AI models relies on the availability of large, diverse, and high-quality datasets, which can be challenging to obtain due to data scarcity, privacy concerns, and high costs.
Tweet media one
4
79
312
0
0
41
@RuiboLiu
Ruibo Liu
2 years
Instead of learning from human feedback with RL, this new paper shows that LMs simply fine-tuned on human prompts can also follow instructions better! How to scale up these human-written instructions might be the next good research question.
@quocleix
Quoc Le
2 years
New open-source language model from Google AI: Flan-T5 🍮 Flan-T5 is instruction-finetuned on 1,800+ language tasks, leading to dramatically improved prompting and multi-step reasoning abilities. Public models: Paper:
Tweet media one
40
495
2K
0
7
38
@RuiboLiu
Ruibo Liu
5 months
At #NeurIPS2023 for two days, on the 12th and 13th. I will mostly be at the GDM booth (so crowded!), but DM me if you want to chat about LMM development/alignment/career/etc. f2f!
4
0
33
@RuiboLiu
Ruibo Liu
1 year
My new LM Alignment work at #NeurIPS2022 ! Instead of learning from demonstrations case by case, we teach the model to edit value-unaligned text by inserting⬇️, deleting⬅️, and replacing🔄! Hall J #925 , 11/29/22 4pm. I will be there and happy to chat about moonshot research!
1
3
30
@RuiboLiu
Ruibo Liu
1 month
New factuality research! We use LMs as annotators & search engines for grounding to create a realistic benchmark for evaluating long-form factuality. Simulating your daily queries to LMs about knowledge & truth. 🔍📊 #NLProc #FactChecking Check this out! 👇
@JerryWeiAI
Jerry Wei
1 month
New @GoogleDeepMind + @Stanford paper! 📜 How can we benchmark long-form factuality in language models? We show that LLMs can generate a large dataset and are better annotators than humans, and we use this to rank Gemini, GPT, Claude, and PaLM-2 models.
Tweet media one
9
77
370
0
4
28
@RuiboLiu
Ruibo Liu
4 months
And MERT (Music BERT) is also accepted to #ICLR2024 ! Congrats to the team!
@yizhilll
Yizhi Li
11 months
1/ Excited to announce the release of our new paper "MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training"! We propose a self-supervised music understanding model, attaining overall SOTA performance on 14 MIR tasks.
Tweet media one
6
77
339
0
3
28
@RuiboLiu
Ruibo Liu
7 months
Happy Chinese Moon Festival everyone! Try the moon cake 🥮 ! My favorite flavor is osmanthus lotus seed paste 🪷 🤤🤤🤤
1
0
27
@RuiboLiu
Ruibo Liu
8 months
In contrast to the large number of benchmarks on natural language understanding, there is an absence of large-scale, easy-to-use, and general-purpose benchmarks for evaluating music representation understanding. Check out MARBLE, the new universal benchmark on music audio understanding,…
@abc43992899
Ruibin Yuan
11 months
1/ 🎉Excited to announce the release of our new benchmark "MARBLE: Music Audio Representation Benchmark for universaL Evaluation"!  We provide a fair, reproducible, and extendable eval suite for general music understanding. 📄 🌐
Tweet media one
1
29
74
2
2
24
@RuiboLiu
Ruibo Liu
3 months
Happy Lunar New Year! 🧧🧨 It’s the year of Dragon! 🐉 Best wishes to all my friends! 👋
@USAmbChina
Ambassador Nicholas Burns
3 months
As the Year of the Dragon approaches, my colleagues and I at U.S. Mission China extend our best wishes to the Chinese people & all those observing Lunar New Year around the world with this video tribute. The Jade Rabbit bids farewell to the old year; the Golden Dragon welcomes the new spring.
75
43
456
2
0
20
@RuiboLiu
Ruibo Liu
1 month
Tired of language models that overfit benchmarks but fall short in real-world use? 📉 Looking for a model that truly understands and cares about your global audience? 🌍 Check out the latest improved version of Gemma! 👇
@JeffDean
Jeff Dean (@🏡)
1 month
New versions of Gemma models with a bunch of improvements to instruction following, factuality, reasoning and more are now out. ⬇️
16
90
587
0
1
20
@RuiboLiu
Ruibo Liu
3 months
Please join the waitlist to try it out! Long context is a crucial ability not only for textual input (e.g., summarizing a lengthy article), but also for visual or audio data understanding or retrieval (e.g., a long sequence of frames = video, a long sequence of notes = song, etc.).
@GoogleDeepMind
Google DeepMind
3 months
Introducing Gemini 1.5: our next-generation model with dramatically enhanced performance. It also achieves a breakthrough in long-context understanding. The first release is 1.5 Pro, capable of processing up to 1 million tokens of information. 🧵
47
426
2K
1
0
21
@RuiboLiu
Ruibo Liu
2 years
God, I've read nearly five papers from Jason in less than one month ... while I've been going back and forth on one idea for the past month ... 😥 Young OG of NLP @_jasonwei ! 😊
@arankomatsuzaki
Aran Komatsuzaki
2 years
Inverse scaling can become U-shaped Trained on 5x compute than those evaluated in the Inverse Scaling Prize, and three out of the four tasks exhibited what we call "U-shaped scaling".
Tweet media one
8
17
168
2
2
20
@RuiboLiu
Ruibo Liu
10 months
As a long-time AI alignment researcher, I'm happy to see this new collective effort from @GoogleDeepMind , @AnthropicAI and @OpenAI on ensuring safe and responsible AI development. Fear always springs from ignorance. It's time to think aloud about AI safety.
@GoogleDeepMind
Google DeepMind
10 months
We’re excited to support the launch of the Frontier Model Forum - a new effort to ensure safe and responsible development of frontier AI systems featuring @Google , @AnthropicAI , @Microsoft and @OpenAI . Find out more:
20
51
279
0
1
19
@RuiboLiu
Ruibo Liu
6 months
Product-Driven Applied Research vs. Humanity-Driven Moonshot Research
@satyanadella
Satya Nadella
6 months
We remain committed to our partnership with OpenAI and have confidence in our product roadmap, our ability to continue to innovate with everything we announced at Microsoft Ignite, and in continuing to support our customers and partners. We look forward to getting to know Emmett…
5K
15K
93K
0
1
19
@RuiboLiu
Ruibo Liu
11 months
@rasbt Stable Alignment shares a similar spirit.
Tweet media one
2
1
19
@RuiboLiu
Ruibo Liu
2 years
Several interesting findings: 1. Mind's Eye can "unlock" the scaling law of reasoning when the curve becomes flat for certain reasoning tasks. 2. Mind's Eye can also benefit smaller LMs (e.g., 1.3B GPT-3 Ada). 3. The correctness of the in-context simulation is crucial.
1
0
19
@RuiboLiu
Ruibo Liu
9 months
Real Google-quality research, although it is only a blog post for now. Congrats to the team!
@adamrpearce
Adam Pearce
9 months
Do Machine Learning Models Memorize or Generalize? An interactive introduction to grokking and mechanistic interpretability w/ @ghandeharioun , @nadamused_ , @Nithum , @wattenberg and @iislucas
20
255
1K
0
1
17
@RuiboLiu
Ruibo Liu
2 years
We believe the idea presented by Mind's Eye can be easily extended to other domains, where the "simulator" can be replaced by anything that can provide ground truth reliably. Here we demonstrate its effectiveness with an advanced physics engine, MuJoCo, for physics reasoning.
1
0
17
@RuiboLiu
Ruibo Liu
2 years
#NAACL2022 is over. I met a lot of great student researchers and faculty members here. I learned a lot and can’t wait to meet you again! 🥰 I hope to see more work on human value alignment in the future. 🤭
Tweet media one
0
0
16
@RuiboLiu
Ruibo Liu
2 years
Time to optimize for the Nobel Prize. 😋
Tweet media one
1
0
16
@RuiboLiu
Ruibo Liu
27 days
This is true haha. So for anyone whose arXiv submission is "on hold" for unclear reasons: please double-check whether you have keywords such as "time travel" in your text. This is another lesson we have learned. 😆😆😆
@JerryWeiAI
Jerry Wei
27 days
Fun fact: our paper was put on hold by arxiv for a while because arxiv detected that we used the phrase "time travel," which is a topic that arxiv frequently gets bad submissions for. When we Ctrl-F'd "time travel" in our paper, we had actually just cited a paper called "Time…
19
23
281
0
0
15
@RuiboLiu
Ruibo Liu
2 years
Just finished my careful #NeurIPS2022 reviews at #NAACL2022 ! Hope the community becomes better!
0
0
15
@RuiboLiu
Ruibo Liu
1 year
We have open-sourced everything! Models (base, SFT, aligned) can be downloaded at Huggingface (). Code and data can be found at . We also want to thank Meta AI and Stanford Alpaca team, for the great open-source effort! 🤗
1
0
15
@RuiboLiu
Ruibo Liu
2 years
@denny_zhou How about “NLP - Natural Language Prompting”.
0
1
14
@RuiboLiu
Ruibo Liu
1 year
LLMs have solved generating grammatically correct text. I believe new techniques for interaction (on both the model and human side) should be the next chapter of NLP research. They will make these models even more useful. Check out this new position paper/survey which might give you a good…
@ZenMoore1
Zekun Wang (Seeking 25Fall PhD) 🔥
1 year
🔥Introducing our 110-page paper outlining the new paradigm in NLP: Interactive NLP (iNLP). It delves deep into critical challenges like alignment, hallucination, reasoning, tool-use, embodiment, simulated society, etc. This isn't ' #NLP is solved', it's 'NLP just got real'.🚀 1/N
Tweet media one
Tweet media two
Tweet media three
3
36
108
0
0
14
@RuiboLiu
Ruibo Liu
2 years
Thanks to my great collaborators @_jasonwei @shaneguML @denny_zhou @TeYenWu2 for the valuable feedback and my supervisor at Dartmouth @CrashTheMod3 for the strong support! And special thanks to my mentor at Google @iamandrewdai ! I enjoy every minute when working with you all!
0
0
14
@RuiboLiu
Ruibo Liu
1 year
PhD advisors get hurt lol 😭 I think top PhD students should have both 1 and 2, and after they become professors, I hope they set 2 as the goal for their students instead of 1. 👏
@_jasonwei
Jason Wei
1 year
Best AI skillset in 2018: PhD + long publication record in a specific area Best AI skillset in 2023: strong engineering abilities + adapting quickly to new directions without sunk cost fallacy Correct me if this is over-generalized, but this is what it seems like to me lately
63
176
2K
0
0
14
@RuiboLiu
Ruibo Liu
1 year
If ChatGPT can simulate the situation its generation is trying to describe, and somehow be aware of it, then it might be the ultimate safe chatbot! 👏
3
2
13
@RuiboLiu
Ruibo Liu
17 days
Let's test the model in the real wild!
@lmsysorg
lmsys.org
17 days
More exciting news today -- Gemini 1.5 Pro result is out! Gemini 1.5 Pro API-0409-preview now achieves #2 on the leaderboard, surpassing #3 GPT4-0125-preview to almost top-1! Gemini shows even stronger performance on longer prompts, in which it ranks joint #1 with the latest…
Tweet media one
Tweet media two
36
195
946
1
0
13
@RuiboLiu
Ruibo Liu
6 months
Only in tranquility can one go far.
0
1
13
@RuiboLiu
Ruibo Liu
1 year
Thanks to my collaborators for the valuable feedback ( @RuixinYang6 @JiaChenyan @GeZhang86038849 ) and support from Google DeepMind ( @denny_zhou and my manager @iamandrewdai 👏). And special thanks to Professor Diyi @Diyi_Yang and Soroush @CrashTheMod3 for collaborative advising!
2
0
13
@RuiboLiu
Ruibo Liu
2 years
That’s my vision! Search engines are good at modeling the present 🌆, but simulation models the future 🚀! I believe LLMs can do more than “summarizing the web better”.
@richardblythman
Richard Blythman
2 years
Current trend: Teaching AI models to browse the Web, use software tools and APIs e.g. @AdeptAILabs Future trend?: Teaching AI models to run simulations (e.g. physical/social/economic)🤯
1
0
6
0
1
13
@RuiboLiu
Ruibo Liu
9 months
Dang. I share this post NOT because Wenhu is a prominent researcher but because what he said is so correct.
1
0
11
@RuiboLiu
Ruibo Liu
2 years
RL + human feedback is the secret behind a lot of successful work, e.g., InstructGPT. Sparrow is another good example. My PhD work is nearly all about that. 😊 Hoping for more and more follow-ups in this direction.
@GoogleDeepMind
Google DeepMind
2 years
Large language models can exhibit falsehoods, discriminatory language, and other unsafe behaviour. Introducing Sparrow: a dialogue agent that can search the internet and is trained to be more helpful, correct, and harmless using RL from human feedback: 1/
Tweet media one
18
201
866
0
2
12
@RuiboLiu
Ruibo Liu
1 year
Stable Alignment can serve as an alternative to RLHF as it 1) shifts the burden of providing accurate value judgment from a proxy reward model onto the collective intelligence of social agents; 2) mitigates the reward gaming problem because we directly train on the game data. 😆
Tweet media one
1
1
12
@RuiboLiu
Ruibo Liu
1 month
One-liner prompt attack💥
@GX_NLP
GX Xu
1 month
Even a powerful LLM like Claude 3 Opus breaks with the simplest attacks and starts hallucinating “non-existing” context about “steps”. The kind of mistake that a human 5-year-old wouldn’t make. 😉
Tweet media one
2
1
8
0
0
10
@RuiboLiu
Ruibo Liu
2 years
My virtual friends on Twitter, I am here at @naacl in Seattle! 🛸 My first in-person conference after years of wfh! 🤭🤭🤭 Feel free to reach out! Let’s talk about crazy ideas in research 🔮 and enjoy the world-famous coffee ☕️ here! #NAACL2022
Tweet media one
3
0
10
@RuiboLiu
Ruibo Liu
1 month
I’m already dreaming about how user-friendly the future Gemini API will be! Welcome Logan!
@OfficialLoganK
Logan Kilpatrick
1 month
Excited to share I’ve joined @Google to lead product for AI Studio and support the Gemini API. Lots of hard work ahead, but we are going to make Google the best home for developers building with AI. I’m not going to settle for anything less.
574
189
5K
0
0
10
@RuiboLiu
Ruibo Liu
1 year
We run social simulations on Sandbox with different LMs, and we find the overall optimality of alignment and engagement seems to reach a plateau after 10B, which indicates that a compact aligned LM is possible. Alignment training makes LMs behave properly with fewer interactions.
Tweet media one
1
1
10
@RuiboLiu
Ruibo Liu
11 months
Congrats to the team! Music LM research, step by step 🚶🏾‍♂️🚶🏼🚶🏻‍♀️!
@abc43992899
Ruibin Yuan
11 months
1/ 🎉Excited to announce the release of our new benchmark "MARBLE: Music Audio Representation Benchmark for universaL Evaluation"!  We provide a fair, reproducible, and extendable eval suite for general music understanding. 📄 🌐
Tweet media one
1
29
74
0
0
9
@RuiboLiu
Ruibo Liu
1 year
We also propose a new alignment algorithm (10 lines!), which aims to achieve a balance between learning from aligned responses and unlearning from misaligned ones. We leverage the difference in ratings to modulate the penalty in every mini-batch, and normalize with SFT loss.
Tweet media one
1
0
9
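For context, here is a rough PyTorch-style sketch of my reading of the tweet above (an illustration, not the released Stable Alignment code): the rating gap sets the margin by which the best-rated response should out-score worse ones, and an SFT term on the best response anchors the objective.

import torch.nn.functional as F

def alignment_loss(logp_best, logp_others, rating_best, rating_others, margin_scale=1.0):
    # logp_best: (batch,) mean log-prob of the highest-rated response
    # logp_others: (batch, k) mean log-probs of lower-rated responses
    # rating_best / rating_others: the corresponding ratings from the sandbox
    sft_loss = -logp_best.mean()  # anchor: plain SFT on the best answer
    # Larger rating gaps demand larger log-prob margins.
    margins = margin_scale * (rating_best.unsqueeze(1) - rating_others)
    # Penalize whenever a worse-rated response is not beaten by at least the margin.
    penalty = F.relu(margins - (logp_best.unsqueeze(1) - logp_others)).mean()
    return sft_loss + penalty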
@RuiboLiu
Ruibo Liu
1 year
[Ruibo's Hypothesis 🐷] Code pre-training mainly brings 1) code understanding, and 2) stronger in-context learning (due to longer and more structured context dependencies). CoT relies on "in-context solutions" shown to LMs, so any LLM with strong enough in-context learning can do CoT.
@Francis_YAO_
Yao Fu
1 year
[Chain of thought originated from code] is a **hypothesis, **not a **conclusion. Yi has a solid counterpoint here. CoT scores might also have a decomposition like: +10 with C4, +15 with code, +15 with scale, +10 with tuning on CoT data .etc
4
1
52
1
2
9
@RuiboLiu
Ruibo Liu
1 year
Scaling law of a good researcher.
@_jasonwei
Jason Wei
1 year
My 2023 goals: - Spend 1,000 hours writing code (context: 377 hrs in 2022, 886 hrs in 2021) - Publish 0-2 first-author papers, but not more than 2 - Write 50 thoughtful tweets - Do 150 workouts
19
21
607
0
0
9
@RuiboLiu
Ruibo Liu
9 months
Check out this workshop with great speakers, organized by my friends @ShayneRedford @Francis_YAO_ @yizhongwyz etc.!
@ShayneRedford
Shayne Longpre
9 months
Excited to co-host the Instruction Tuning & Instruction Following (ITIF) Workshop at #NeurIPS2023 . 👷🛠️🤖 An incredible line up of topics and speakers. ➡️ Submission opening soon: Stay tuned!
0
5
43
0
0
9
@RuiboLiu
Ruibo Liu
9 months
Today’s gem: AI researchers are film directors. “When you only chase impact (box office), you gradually lose reputation among researchers.” “When you only chase creativity (independent artistic expression), you’ll have a hard time getting funding (film budget).”
@ruoshi_liu
Ruoshi Liu
9 months
Some naive thoughts about careers of AI researchers as a young uninformed PhD student  💭 An AI researcher has basically the same job as a film director. The three things you care about the most are: creativity, impact, and reputation. (1/n)
5
33
264
1
0
9
@RuiboLiu
Ruibo Liu
15 days
@YiTayML The hot take version of this is: Google does the real architecture research, while other companies take it for granted. All these companies are basically "data companies".
0
0
9
@RuiboLiu
Ruibo Liu
1 year
The simulated society is called Sandbox, where N social agents interact with each other following a protocol named Back-Scatter. Autonomous social agents need to self-improve continuously by considering peer reviews and revising their responses to societal issues. 😄
Tweet media one
1
0
8
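A hypothetical sketch of one such round, as I read the tweet above (illustrative names, not the released Sandbox code): a central agent drafts a response to a societal issue, peer agents critique it, and the agent revises; the draft/revision pairs are what later feed alignment training.

from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Agent:
    call_lm: Callable[[str], str]  # any text-in/text-out LM client

    def draft(self, issue: str) -> str:
        return self.call_lm("Respond to this societal issue: " + issue)

    def review(self, issue: str, answer: str) -> str:
        return self.call_lm("Rate and critique this response to '" + issue + "': " + answer)

def back_scatter_round(issue: str, center: Agent, peers: List[Agent]) -> Tuple[str, str]:
    draft = center.draft(issue)
    feedback = [peer.review(issue, draft) for peer in peers]
    revision = center.call_lm(
        "Issue: " + issue + "\nDraft: " + draft +
        "\nPeer feedback:\n" + "\n".join(feedback) +
        "\nRevise the draft to address the feedback:")
    return draft, revision  # (pre-revision, post-revision) pairs become alignment data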
@RuiboLiu
Ruibo Liu
4 months
Maybe neither, as I know either can be gamed. I will carefully read their solo first-author papers. These will tell a lot about the student’s research independence, taste, and hardcore capabilities.
@ZenMoore1
Zekun Wang (Seeking 25Fall PhD) 🔥
4 months
As a professor recruiting PhD students, would you prefer a candidate with:
5
0
3
0
1
8
@RuiboLiu
Ruibo Liu
1 year
🔥 #ChatGPT still struggles with simple physics questions ... 😔 However, with proper context (event reminders, sensory output, etc.), it could get back on the right track. 😋
Tweet media one
1
2
8
@RuiboLiu
Ruibo Liu
1 year
Great suggestion! This should be lesson one for newcomers in this field. And I think future PhD programs should have more connections/collaborations with AI labs to narrow the gap. It will benefit the whole community.
@_jasonwei
Jason Wei
1 year
I’m hearing chatter of PhD students not knowing what to work on. My take: as LLMs are deployed IRL, the importance of studying how to use them will increase. Some good directions IMO (no training): 1. prompting 2. evals 3. LM interfaces 4. safety 5. understanding LMs 6. emergence
52
285
2K
1
1
8
@RuiboLiu
Ruibo Liu
3 months
Huge congrats! Very well deserved! 🎉🎉🎉
@Diyi_Yang
Diyi Yang
3 months
Very honored to have been selected as a #SloanFellow ! Huge thanks to my incredible students and my mentors ♥️
78
21
573
1
0
8
@RuiboLiu
Ruibo Liu
7 months
So cooooool!
@srush_nlp
Sasha Rush (ICLR)
7 months
Introducing COLM () the Conference on Language Modeling. A new research venue dedicated to the theory, practice, and applications of language models. Submissions: March 15 (it's pronounced "collum" 🕊️)
Tweet media one
35
437
2K
0
1
7
@RuiboLiu
Ruibo Liu
6 months
Business question: which one will eat the world?
100% ChatGPT, $20/month
15
90% ChatGPT, $1/month
18
40% ChatGPT, homemade
10
0
1
7
@RuiboLiu
Ruibo Liu
1 year
Instruction tuning on benchmark tasks can only win benchmarks. But tuning on everyday instructions wins the world. If your goal is a user-friendly product, that seems to be the natural choice.
@JingfengY
Jingfeng Yang
1 year
Is task-specific instruction tuning really needed? It seems that OpenAI and Alpaca directly tuned the base model with real-world user instructions (or similar diverse instructions). That’s probably the key decision made by OpenAI three years ago, which made it lead the game.
0
0
4
1
2
7
@RuiboLiu
Ruibo Liu
1 year
Please comment below if I missed citing your work! Share Stable Alignment with your friend who wants to start alignment training from a not-bad baseline! 😝
1
0
7
@RuiboLiu
Ruibo Liu
1 year
Best report on GPT series models. Must read!
@Francis_YAO_
Yao Fu
1 year
How did the initial #GPT3 evolve to today's #ChatGPT ? Where do the amazing abilities of #GPT3 .5 come from? What is enabled by #RLHF ? In this article with ⁦ @allen_ai ⁩ , we trace the emergent abilities of #LLM to their sources from first principles
31
335
1K
0
1
7
@RuiboLiu
Ruibo Liu
1 year
AGI -> No trust in public space -> Everyone cooks their in-house AI -> Low-quality public data -> Private / Domain-specific AI will win. Nobody can afford a train, but everyone can buy their own cars, because of freedom or privacy. Thoughts after reading Hinton's concerns.
0
0
7
@RuiboLiu
Ruibo Liu
10 months
I feel like the predictions and proposals revealed in OAI's blog posts are far more worth reading than many incremental papers. And the word "superalignment" for superintelligence sounds much more concrete. The idea itself is definitely next-level play.
@OpenAI
OpenAI
10 months
We need new technical breakthroughs to steer and control AI systems much smarter than us. Our new Superalignment team aims to solve this problem within 4 years, and we’re dedicating 20% of the compute we've secured to date towards this problem. Join us!
436
752
4K
1
0
7
@RuiboLiu
Ruibo Liu
7 months
Great work!
@keirp1
Keiran Paster
7 months
Introducing OpenWebMath, a massive dataset containing every math document found on the internet - with equations in LaTeX format! 🤗 Download on @HuggingFace : 📝 Read the paper: w/ @dsantosmarco , @zhangir_azerbay , @jimmybajimmyba !
Tweet media one
26
268
1K
0
0
7
@RuiboLiu
Ruibo Liu
9 months
Super-alignment will be implemented in a SandBox.
@DrJimFan
Jim Fan
9 months
Hmmm, @OpenAI just acquired a company called "Global Illumination" that makes open-source Minecraft clone. What's next, multi-agent civilization sim running on GPT-5? Maybe Minecraft is indeed all you need for AGI? I'm intrigued.🤔 Announcement: Company:…
103
407
2K
0
1
7
@RuiboLiu
Ruibo Liu
22 days
Gem.
@AlbertQJiang
Albert Jiang
22 days
I love open-sourced models! Please add your favourites to the Mistral Convex Hull.
Tweet media one
3
12
99
0
0
7
@RuiboLiu
Ruibo Liu
5 months
Congrats, Chenyan! 🎉
@JiaChenyan
Chenyan Jia
5 months
Can we design AI systems to consider democratic values as their objective functions? Our new #CSCW24 paper w/ @michelle123lam , Minh Chau Mai, @jeffhancock , @msbernst introduces a method for translating social science constructs into social media AIs (1/12)
Tweet media one
3
21
99
0
0
6
@RuiboLiu
Ruibo Liu
3 months
@johnschulman2 My old (and award-winning) paper found that GPT-2 was liberal-leaning and could be calibrated by RL during inference. We used an ideology classifier trained on MIT's Media Cloud data (which classified media articles as liberal- or conservative-leaning) to…
1
1
4
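Since the tweet is truncated and the paper link is omitted, here is only a generic illustration of using such an ideology classifier as a steering signal, shown as simple candidate reranking rather than the paper's RL-during-inference procedure; classify_liberal_prob and generate are assumed callables.

from typing import Callable, List

def politically_balanced_sample(prompt: str,
                                generate: Callable[[str], str],
                                classify_liberal_prob: Callable[[str], float],
                                n: int = 8) -> str:
    # Sample candidates, then prefer the one whose predicted leaning is closest
    # to neutral (0.5). A crude stand-in for classifier-guided calibration.
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return min(candidates, key=lambda c: abs(classify_liberal_prob(c) - 0.5))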
@RuiboLiu
Ruibo Liu
9 months
Great work!
@arankomatsuzaki
Aran Komatsuzaki
9 months
Studying Large Language Model Generalization with Influence Functions
Tweet media one
3
45
180
0
1
6
@RuiboLiu
Ruibo Liu
8 months
Playing Dota seems to help AI research.
@ilyasut
Ilya Sutskever
8 months
Little known fact: Many of OpenAI’s key results, including the Dota 2 bot and the pre-training of GPT-4, are thanks to the brilliant Jakub Pachocki @merettm
29
272
1K
1
0
6
@RuiboLiu
Ruibo Liu
9 months
Your DATA should have progressive improvement by nature (our multi-agent + feedback setting does the job), so that the model gets more informative supervision in every mini-batch. It also helps the stability of learning.
1
0
6
@RuiboLiu
Ruibo Liu
2 years
@rtk254 @GoogleAI This is definitely related! Will cite in the next version!
0
0
6
@RuiboLiu
Ruibo Liu
2 months
Anthropic might explicitly target its enterprise users, so I think such evals are very valuable. My general feeling is that the frontier LLMs are probably in the same range of general capabilities, but what really matters is the optimization for their most important users.
@Francis_YAO_
Yao Fu
2 months
The most significant domains that are being improved are finance and medicine, where presumably AI can boost much performance and efficiency
Tweet media one
1
7
60
0
0
6
@RuiboLiu
Ruibo Liu
2 years
TLDR: more human data = better LMs.
@zhansheng
Jason Phang
2 years
Some folks are confused about FLAN/Flan, so here's an easy guide
Tweet media one
1
2
27
0
0
6
@RuiboLiu
Ruibo Liu
9 months
It's not surprising to me. I think the key point of RLHF is not the RL; rather, any method that can efficiently learn from progressive improvement in human answers would be legit. You can learn from ranking (RRHF) or comparison (DPO); avoiding over-optimization is the core.
1
0
5
@RuiboLiu
Ruibo Liu
2 years
Many external tools could be the simulators!
@sivil_taram
Qian Liu 🔭
2 years
Really interesting, congrats to the authors!🎉 Excited to see so many works studying simulation in natural language processing tasks. Engines (e.g., MuJoCo) are actually *crystallized knowledge* of human reasoning, and such knowledge can be inherited by language models! 🤩
1
0
4
0
0
5
@RuiboLiu
Ruibo Liu
1 year
That is a profound judgment when you deeply understand the limits of neural nets. I think the best lesson I learned from GPT's success is their impressive data effort. 😆
@hardy_qr
Fangyu Liu
1 year
@RuiboLiu @JingfengY @ilyasut said in one of the interviews that neural nets are bad at OOD generalization so we need to make the world its distribution. That’s quite ambitious but extremely visionary looking back from today.
1
1
7
0
0
5
@RuiboLiu
Ruibo Liu
9 months
I have worked on RL-based alignment for many years, and the unstable learning / over-optimization issues gave me headaches all the time. The lesson I learned from Stable Alignment is that good alignment can only be achieved from both the DATA and the ALGO side.
1
0
5
@RuiboLiu
Ruibo Liu
2 years
"The reason why human can learn to use so many tools is because our brains are larger." I remember someone told me this when I was a kid.
@_jasonwei
Jason Wei
2 years
New survey paper! We discuss “emergent abilities” of large language models. Emergent abilities are only present in sufficiently large models, and thus they would not have been predicted simply by extrapolating the scaling curve from smaller models. 🧵⬇️
Tweet media one
15
129
595
1
0
5
@RuiboLiu
Ruibo Liu
1 year
@jefffhj Yes, and finally we all become “API researchers”. 😔
1
0
5
@RuiboLiu
Ruibo Liu
1 year
I heard that ChatGPT is about 13B.
@DimitrisPapail
Dimitris Papailiopoulos
1 year
can someone explain the economics of chatgpt api calls being 10x cheaper than text-DV3? It's better/more accurate for many arithmetic stuff too. Makes no sense, unless chatgpt is a smaller model with internal api access to tools?
18
1
39
2
0
5
@RuiboLiu
Ruibo Liu
1 year
LLM services will become something like wifi. Cheap and ubiquitous. LLM developers will be the new era’s network workers. “Hi, can you help me set up the LLM connection at xxx address? Do we have a discount?” Finally, everyone could be the “researcher” of their own wifi.
@luyu_gao
Luyu Gao
1 year
ChatGPT (gpt-3.5-turbo) works best for conversational tasks, InstructGPT (text-davinci-003) for zero-shot, and Codex (Code-DaVinci-002) for few-shot in-context learning. As an NLP researcher, I am still valuable today, at least as a three-way classifier.
13
92
1K
1
0
5
@RuiboLiu
Ruibo Liu
1 year
Any model-2 always seems to be good.
@GoogleDeepMind
Google DeepMind
1 year
PaLM-2 is a next generation large language model with improved coding, multilingual and reasoning capabilities. It will power over 25 new @Google products and features, bringing the latest in advanced AI to benefit people. Here’s how it’s being deployed already. ⬇️ #GoogleIO
22
239
1K
0
0
5
@RuiboLiu
Ruibo Liu
6 months
"Any training after pre-training is called Alignment."
@drjwrae
Jack Rae
6 months
My most contrarian take is that what is commonly termed alignment (rlhf in particular) is one of the most effective capability boosting techniques. For base models are difficult tools to use, and can fail spuriously with simple tasks. Post-training reveals a lot.
12
5
80
0
0
5
@RuiboLiu
Ruibo Liu
9 months
Your ALGO should be reliable and scalable, and normalizing it with an SFT term is always a good idea (e.g., the last term in PPO-x, and Stable Alignment uses the SFT loss as the anchor).
1
0
5
@RuiboLiu
Ruibo Liu
2 years
LOL > Lol > lol, right?
2
0
5
@RuiboLiu
Ruibo Liu
3 months
The field runs so fast.
@ricburton
Ric Burton
3 months
Yann LeCun, a few days ago at the World Governments summit, on AI video: “We don’t know how to do this”
138
149
2K
0
0
5
@RuiboLiu
Ruibo Liu
1 year
The game begins.👏
@sundarpichai
Sundar Pichai
1 year
1/ In 2021, we shared next-gen language + conversation capabilities powered by our Language Model for Dialogue Applications (LaMDA). Coming soon: Bard, a new experimental conversational #GoogleAI service powered by LaMDA.
742
3K
15K
0
1
4
@RuiboLiu
Ruibo Liu
2 years
@narphorium @GoogleAI Thanks! And we have discussed attempts at world modeling in the related work! Thanks for reading our paper!
0
0
4
@RuiboLiu
Ruibo Liu
2 years
If the number of tasks the model is optimized on >> the number of tasks a normal human can be good at, you will have the false feeling that the model is “superhuman”: it can easily fail in contexts it is not familiar with. Memorization ≠ Intelligence.
@Thom_Wolf
Thomas Wolf
2 years
there is a scary possibility that we may solve all the benchmarks we come up with for AI... without understanding anything fundamentally deep about what intelligence is about. A bummer for those like me who see AI as a fantastic way to unlock deeper insights on human intelligence
35
44
452
0
0
4
@RuiboLiu
Ruibo Liu
28 days
Do they have potential pitfalls? 🤔 We laid out some concerns raised by existing work and discussed the future directions we have in mind. They are:
Tweet media one
1
0
4
@RuiboLiu
Ruibo Liu
1 year
So UL2 can do CoT just because it was pre-trained with a smart denoising objective that makes LMs learn better from contexts. And I guess it might have a size requirement? If UL2 can unlock CoT on smaller LMs, that's amazing! (Need to check with @YiTayML )
0
0
4