Ruibo Liu @RuiboLiu Twitter profile

Ruibo Liu

@RuiboLiu

2 years

Simulation is All You Need for Grounded Reasoning!🔥 Mind's Eye enables LLM to *do experiments*🔬 and then *reason* over the observations🧑‍🔬, which is how we humans explore the unknown for decades.🧑‍🦯🚶🏌 Work done @GoogleAI Brain Team this summer!

Aran Komatsuzaki

@arankomatsuzaki

2 years

Mind's Eye: Grounded Language Model Reasoning through Simulation Improves reasoning ability using MuJoCo simulations by a large margin (+27.9/46.0% zero/few-shot absolute acc. on average). LMs + Mind's Eye performs on par with 100x larger models.

11

81

392

12

65

328

Ruibo Liu

@RuiboLiu

1 year

🎲Life is a game. Play by your rules! 🎮 Stable Alignment enables LM to learn social norms from simulated everyday interactions in a social game! 👫 Check this out 👇:

5

50

271

Ruibo Liu

@RuiboLiu

1 year

First impression! I find Google's #Bard seems to be surprisingly safe (and high quality)! 👏 The game starts seriously! #ChatGPT

8

19

199

Ruibo Liu

@RuiboLiu

9 months

My takeaways after watching John Schulman's talk at #ICML2023 : 1. Over-optimization in reward modeling is a real problem. OpenAI's remedy is using larger RMs. 2. Best-of-N is actually more efficient than RL in spending PPL for alignment. 3. Multi-agent setup might be useful.

6

24

158
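
Point 2 above refers to Best-of-N sampling. As a minimal sketch of the idea (not code from the talk): draw N candidate responses and keep the one a reward model scores highest. The names `best_of_n`, `sample`, and `reward` are hypothetical stand-ins for a policy model and a reward model.

```python
from typing import Callable, List


def best_of_n(prompt: str,
              sample: Callable[[str], str],
              reward: Callable[[str, str], float],
              n: int = 4) -> str:
    """Draw n candidates for the prompt and return the highest-reward one.

    `sample` and `reward` are hypothetical callables: in practice they would
    wrap a language model and a learned reward model.
    """
    candidates: List[str] = [sample(prompt) for _ in range(n)]
    # max() keeps the first candidate in case of a tie.
    return max(candidates, key=lambda c: reward(prompt, c))
```

No policy weights are updated here; the extra cost is paid at inference time as N forward samples per prompt.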

Ruibo Liu

@RuiboLiu

28 days

Thanks Aran for sharing our work! This is a survey paper I’ve been thinking about for a long time, as we have seen an increasing need for synthetic data. As we will probably run out of fresh tokens soon, the audience of this paper should be everyone who cares about AI progress.

Aran Komatsuzaki

@arankomatsuzaki

28 days

Google presents Best Practices and Lessons Learned on Synthetic Data for Language Models Provides an overview of synthetic data research, discussing its applications, challenges, and future directions

6

136

701

1

19

112

Ruibo Liu

@RuiboLiu

1 year

My new work on AI alignment! Life is a game. Play by your rules. Stay tuned for a detailed introduction! 😄

Aran Komatsuzaki

@arankomatsuzaki

1 year

Training Socially Aligned Language Models in Simulated Human Society - Presents a novel training paradigm that permits LMs to learn from simulated social interactions - Superior performance in alignment benchmarks and human evaluations

8

50

242

5

17

101

Ruibo Liu

@RuiboLiu

13 days

I believe any PhD who cares about quality more than quantity works in single thread mode.

Wenhu Chen

@WenhuChen

13 days

Out of curiosity, do AI PhDs normally work (lead) on several projects simultaneously? I have never managed to work on more than one project during my PhD and I tried to convince my students not to do so. The paradigm might have already changed, so I am asking here.

22

10

75

4

84

Ruibo Liu

@RuiboLiu

2 months

Can LLM serve as a front-end engineer? 🪄 We present Design2Code, a new MM + code reasoning benchmark to evaluate such ability of current LLMs. Given several screenshots of the target website, the LLM agent needs to generate the corresponding code accurately!⚡️⚡️⚡️ More details…

AK

@_akhaliq

2 months

Design2Code How Far Are We From Automating Front-End Engineering? Generative AI has made rapid advancements in recent years, achieving unprecedented capabilities in multimodal understanding and code generation. This can enable a new paradigm of front-end development, in

28

296

1K

1

16

78

Ruibo Liu

@RuiboLiu

1 year

Sometimes I feel sad about current academic NLP research, since fewer and fewer ideas are ambitious in opening a new door for the future, but more and more are essentially "trying hard to replicate some closed-source industry models". 😔

3

5

63

Ruibo Liu

@RuiboLiu

4 months

Now Stable Alignment is accepted to #ICLR2024 ! We developed a SandBox to obtain fine-grained alignment data at scale, and used simple contrastive learning to train models! We have released data/code/models. Please try it if you are interested!

Ruibo Liu

@RuiboLiu

1 year

🎲Life is a game. Play by your rules! 🎮 Stable Alignment enables LM to learn social norms from simulated everyday interactions in a social game! 👫 Check this out 👇:

5

50

271

1

8

60

Ruibo Liu

@RuiboLiu

6 days

"Agent Researchers" are HCI researchers in the new era.

3

7

53

Ruibo Liu

@RuiboLiu

5 months

Congrats to the team! 👏 Happy to contribute!

Google DeepMind

@GoogleDeepMind

5 months

We’re excited to announce 𝗚𝗲𝗺𝗶𝗻𝗶: @Google ’s largest and most capable AI model. Built to be natively multimodal, it can understand and operate across text, code, audio, image and video - and achieves state-of-the-art performance across many tasks. 🧵

173

2K

6K

4

2

51

Ruibo Liu

@RuiboLiu

28 days

Thanks AK for picking our work as one of the daily papers! 🚀🚀🚀

AK

@_akhaliq

28 days

Best Practices and Lessons Learned on Synthetic Data for Language Models The success of AI models relies on the availability of large, diverse, and high-quality datasets, which can be challenging to obtain due to data scarcity, privacy concerns, and high costs.

4

79

312

0

41

Ruibo Liu

@RuiboLiu

2 years

Instead of learning from human feedback with RL, this new paper shows LMs simply fine-tuned on human prompts can also better follow instructions! How to scale up these human written instructions might be the next good research question.

Quoc Le

@quocleix

2 years

New open-source language model from Google AI: Flan-T5 🍮 Flan-T5 is instruction-finetuned on 1,800+ language tasks, leading to dramatically improved prompting and multi-step reasoning abilities. Public models: Paper:

40

495

2K

0

7

38

Ruibo Liu

@RuiboLiu

5 months

At #NeurIPS2023 for two days on 12th and 13th. I will be mostly at GDM booth (so crowded!) but DM me if you want to chat about LMM development/alignment/career/etc f2f!

4

0

33

Ruibo Liu

@RuiboLiu

1 year

My new LM Alignment work #NeurIPS2022 ! Instead of learning demonstrations case by case, we teach the model to edit the value-unaligned text by inserting⬇️, deleting⬅️, and replacing🔄! Hall J #925 , 11/29/22 4pm. I will be there and happy to chat about moonshot research!

1

3

30

Ruibo Liu

@RuiboLiu

1 month

New factuality research! We use LMs as annotators & search engines for grounding to create a realistic benchmark for evaluating long-form factuality. Simulating your daily queries to LMs about knowledge & truth. 🔍📊 #NLProc #FactChecking Check this out! 👇

Jerry Wei

@JerryWeiAI

1 month

New @GoogleDeepMind + @Stanford paper! 📜 How can we benchmark long-form factuality in language models? We show that LLMs can generate a large dataset and are better annotators than humans, and we use this to rank Gemini, GPT, Claude, and PaLM-2 models.

9

77

370

0

4

28

Ruibo Liu

@RuiboLiu

4 months

And MERT (Music BERT) is also accepted to #ICLR2024 ! Congrats the team!

Yizhi Li

@yizhilll

11 months

1/ Excited to announce the release of our new paper "MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training"! We propose a self-supervised music understanding model, attaining overall SOTA performance on 14 MIR tasks.

6

77

339

0

3

28

Ruibo Liu

@RuiboLiu

7 months

Happy Chinese Moon Festival everyone! Try the moon cake 🥮 ! My favorite flavor is osmanthus lotus seed paste 🪷 🤤🤤🤤

1

0

27

Ruibo Liu

@RuiboLiu

8 months

Unlike a large number of benchmarks on natural language understanding, there is an absence of large-scale, easy-to-use, and general-purpose benchmarks for evaluating music representation understanding. Check out MARBLE, the new universal benchmark on music audio understanding,…

Ruibin Yuan

@abc43992899

11 months

1/ 🎉Excited to announce the release of our new benchmark "MARBLE: Music Audio Representation Benchmark for universaL Evaluation"! We provide a fair, reproducible, and extendable eval suite for general music understanding. 📄 🌐

1

29

74

2

24

Ruibo Liu

@RuiboLiu

3 months

Happy Lunar New Year! 🧧🧨 It’s the year of Dragon! 🐉 Best wishes to all my friends! 👋

Ambassador Nicholas Burns

@USAmbChina

3 months

As the Year of the Dragon approaches, my colleagues and I at U.S. Mission China extend our best wishes to the Chinese people & all those observing Lunar New Year around the world with this video tribute. The Jade Rabbit bids farewell to the old year; the Golden Dragon welcomes the new spring.

75

43

456

2

0

20

Ruibo Liu

@RuiboLiu

1 month

Tired of language models that overfit benchmarks but fall short in real-world use? 📉 Looking for a model that truly understands and cares about your global audience? 🌍 Check out the latest improved version of Gemma! 👇

Jeff Dean (@🏡)

@JeffDean

1 month

New versions of Gemma models with a bunch of improvements to instruction following, factuality, reasoning and more are now out. ⬇️

16

90

587

0

1

20

Ruibo Liu

@RuiboLiu

3 months

Please join the waitlist for trying it out! Long context is the crucial ability not only for textual input (e.g., summarizing a lengthy article), but also for visual or audio data understanding or retrieval (e.g., long seq of frames = video, long seq of notes = song, etc.).

Google DeepMind

@GoogleDeepMind

3 months

Introducing Gemini 1.5: our next-generation model with dramatically enhanced performance. It also achieves a breakthrough in long-context understanding. The first release is 1.5 Pro, capable of processing up to 1 million tokens of information. 🧵

47

426

2K

1

0

21

Ruibo Liu

@RuiboLiu

2 years

God I read nearly five papers coming from Jason in less than one month ... I'm still thinking about one idea back and forth in the past month ... 😥 Young OG of NLP @_jasonwei ! 😊

Aran Komatsuzaki

@arankomatsuzaki

2 years

Inverse scaling can become U-shaped Trained on 5x compute than those evaluated in the Inverse Scaling Prize, and three out of the four tasks exhibited what we call "U-shaped scaling".

8

17

168

2

20

Ruibo Liu

@RuiboLiu

10 months

As a longtime AI alignment researcher, I'm happy to see this new collective effort from @GoogleDeepMind , @AnthropicAI and @OpenAI on ensuring safe and responsible AI development. Fear always springs from ignorance. It's time to think aloud on AI safety.

Google DeepMind

@GoogleDeepMind

10 months

We’re excited to support the launch of the Frontier Model Forum - a new effort to ensure safe and responsible development of frontier AI systems featuring @Google , @AnthropicAI , @Microsoft and @OpenAI . Find out more:

20

51

279

0

1

19

Ruibo Liu

@RuiboLiu

6 months

Product Driven Applied Research v.s. Humanity Driven Moonshot Research

Satya Nadella

@satyanadella

6 months

We remain committed to our partnership with OpenAI and have confidence in our product roadmap, our ability to continue to innovate with everything we announced at Microsoft Ignite, and in continuing to support our customers and partners. We look forward to getting to know Emmett…

5K

15K

93K

0

1

19

Ruibo Liu

@RuiboLiu

11 months

@rasbt Stable Alignment shares a similar spirit.

2

1

19

Ruibo Liu

@RuiboLiu

2 years

Several interesting findings: 1. Mind's Eye can "unlock" the scaling law of reasoning when the curve becomes flat for certain reasoning tasks. 2. Mind's Eye can also benefit smaller LMs (e.g., 1.3B GPT-3 Ada). 3. The correctness of the in-context simulation is crucial.

1

0

19

Ruibo Liu

@RuiboLiu

9 months

Real Google quality research, although it is only a blog for now. Congrats the team!

Adam Pearce

@adamrpearce

9 months

Do Machine Learning Models Memorize or Generalize? An interactive introduction to grokking and mechanistic interpretability w/ @ghandeharioun , @nadamused_ , @Nithum , @wattenberg and @iislucas

20

255

1K

0

1

17

Ruibo Liu

@RuiboLiu

2 years

We believe the idea presented by Mind's Eye can be easily extended to other domains, where the "simulator" can be replaced by anything that can provide ground truth reliably. Here we demonstrate its effectiveness with an advanced physics engine, MuJoCo, for physics reasoning.

1

0

17

Ruibo Liu

@RuiboLiu

2 years

#NAACL2022 is over. I met a lot of great student researchers and faculties here. I learned a lot and can’t wait to meet you again! 🥰 I hope to see more work on human value alignment in the future. 🤭

0

16

Ruibo Liu

@RuiboLiu

2 years

Time to optimize for the Nobel Prize. 😋

1

0

16

Ruibo Liu

@RuiboLiu

27 days

This is true haha. So for anyone who has an ArXiv submission "on-hold" for unclear reasons: Please double check whether you have keywords such as "time travel" in your text. This is another lesson we have learned. 😆😆😆

Jerry Wei

@JerryWeiAI

27 days

Fun fact: our paper was put on hold by arxiv for a while because arxiv detected that we used the phrase "time travel," which is a topic that arxiv frequently gets bad submissions for. When we Ctrl-F'd "time travel" in our paper, we had actually just cited a paper called "Time…

19

23

281

0

15

Ruibo Liu

@RuiboLiu

2 years

Just finished my careful #NeurIPS2022 reviews at #NAACL2022 ! Hope the community becomes better!

0

15

Ruibo Liu

@RuiboLiu

1 year

We have open-sourced everything! Models (base, SFT, aligned) can be downloaded at Huggingface (). Code and data can be found at . We also want to thank Meta AI and Stanford Alpaca team, for the great open-source effort! 🤗

GitHub - agi-templar/Stable-Alignment: Multi-agent Social Simulation + Efficient, Effective, and...

Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Language Models in Simulated Human Society". - ...

github.com

1

0

15

Ruibo Liu

@RuiboLiu

2 years

@denny_zhou How about “NLP - Natural Language Prompting”.

0

1

14

Ruibo Liu

@RuiboLiu

1 year

LLMs solve how to generate grammatically correct text. I believe new techniques on interaction (both model and human side) should be the next chapter of NLP research. They will make these models even more useful. Check out this new position paper/survey which might give you a good…

Zekun Wang (Seeking 25Fall PhD) 🔥

@ZenMoore1

1 year

🔥Introducing our 110-page paper outlining the new paradigm in NLP: Interactive NLP (iNLP). It delves deep into critical challenges like alignment, hallucination, reasoning, tool-use, embodiment, simulated society, etc. This isn't ' #NLP is solved', it's 'NLP just got real'.🚀 1/N

3

36

108

0

14

Ruibo Liu

@RuiboLiu

2 years

Thanks to my great collaborators @_jasonwei @shaneguML @denny_zhou @TeYenWu2 for the valuable feedback and my supervisor at Dartmouth @CrashTheMod3 for the strong support! And special thanks to my mentor at Google @iamandrewdai ! I enjoy every minute when working with you all!

0

14

Ruibo Liu

@RuiboLiu

1 year

PhD advisors get hurt lol 😭 I think top PhD students should have both 1 and 2, and after they become professors, I hope they set the 2 as the goal for their students, instead of the 1. 👏

Jason Wei

@_jasonwei

1 year

Best AI skillset in 2018: PhD + long publication record in a specific area Best AI skillset in 2023: strong engineering abilities + adapting quickly to new directions without sunk cost fallacy Correct me if this is over-generalized, but this is what it seems like to me lately

63

176

2K

0

14

Ruibo Liu

@RuiboLiu

1 year

If ChatGPT can simulate the situation its generation is trying to describe, and somehow be aware of it, then it might be the ultimate safe chatbot! 👏

3

2

13

Ruibo Liu

@RuiboLiu

17 days

Let's test the model in the real wild!

lmsys.org

@lmsysorg

17 days

More exciting news today -- Gemini 1.5 Pro result is out! Gemini 1.5 Pro API-0409-preview now achieves #2 on the leaderboard, surpassing #3 GPT4-0125-preview to almost top-1! Gemini shows even stronger performance on longer prompts, in which it ranks joint #1 with the latest…

36

195

946

1

0

13

Ruibo Liu

@RuiboLiu

6 months

Only with tranquility can one go far.

0

1

13

Ruibo Liu

@RuiboLiu

1 year

Thanks to my collaborators for the valuable feedback ( @RuixinYang6 @JiaChenyan @GeZhang86038849 ) and support from Google DeepMind ( @denny_zhou and my manager @iamandrewdai 👏). And special thanks to Professor Diyi @Diyi_Yang and Soroush @CrashTheMod3 for collaborative advising!

2

0

13

Ruibo Liu

@RuiboLiu

2 years

That’s my vision! Search engine is good at modeling the present 🌆, but simulation is modeling the future 🚀! I believe LLMs can do more than “summarizing the web better”.

Richard Blythman

@richardblythman

2 years

Current trend: Teaching AI models to browse the Web, use software tools and APIs e.g. @AdeptAILabs Future trend?: Teaching AI models to run simulations (e.g. physical/social/economic)🤯

1

0

6

0

1

13

Ruibo Liu

@RuiboLiu

9 months

Dang. I share this post NOT because Wenhu is a prominent researcher but because what he said is so correct.

1

0

11

Ruibo Liu

@RuiboLiu

2 years

RL + human feedback is the secret for many successful work. e.g., InstructGPT. Sparrow is another good example. My PhD work is nearly all about that. 😊 Hope more and more followups in this direction.

Google DeepMind

@GoogleDeepMind

2 years

Large language models can exhibit falsehoods, discriminatory language, and other unsafe behaviour. Introducing Sparrow: a dialogue agent that can search the internet and is trained to be more helpful, correct, and harmless using RL from human feedback: 1/

18

201

866

0

2

12

Ruibo Liu

@RuiboLiu

1 year

Stable Alignment can serve as an alternative to RLHF as it 1) shifts the burden of providing accurate value judgment from a proxy reward model onto the collective intelligence of social agents; 2) mitigates the reward gaming problem because we directly train on the game data. 😆

1

12

Ruibo Liu

@RuiboLiu

1 month

One-liner prompt attack💥

GX Xu

@GX_NLP

1 month

Even powerful LLM like Claude3 Opus breaks with the simplest attacks to start hallucinating about “non-existing” context about “steps”. The kind of mistake that a human 5 year old wouldnt make. 😉

2

1

8

0

10

Ruibo Liu

@RuiboLiu

2 years

My virtual friends at twitter, I am here @naacl in Seattle! 🛸 My first in-person conference after years of wfh! 🤭🤭🤭 Feel free to reach out! Let’s talk about crazy ideas on research 🔮 and enjoy the world-famous coffee ☕️ here! #NAACL2022

3

0

10

Ruibo Liu

@RuiboLiu

1 month

I’m already dreaming about how user-friendly the future Gemini API will be! Welcome Logan!

Logan Kilpatrick

@OfficialLoganK

1 month

Excited to share I’ve joined @Google to lead product for AI Studio and support the Gemini API. Lots of hard work ahead, but we are going to make Google the best home for developers building with AI. I’m not going to settle for anything less.

574

189

5K

0

10

Ruibo Liu

@RuiboLiu

1 year

We run social simulations on Sandbox with different LMs, and we find the overall optimality of alignment and engagement seems to reach a plateau after 10B, which indicates that a compact aligned LM is possible. Alignment training makes LMs behave properly with fewer interactions.

1

10

Ruibo Liu

@RuiboLiu

28 days

It's my pleasure working with @JerryWeiAI @hardy_qr @ChengleiSi @StevenyzZhang @JinmengR @HuaixiuZheng @dypeng , and supervised by @Diyi_Yang @denny_zhou @iamandrewdai !

0

1

10

Ruibo Liu

@RuiboLiu

11 months

Congrats the team! Music LM research step by step 🚶🏾‍♂️🚶🏼🚶🏻‍♀️!

Ruibin Yuan

@abc43992899

11 months

1/ 🎉Excited to announce the release of our new benchmark "MARBLE: Music Audio Representation Benchmark for universaL Evaluation"! We provide a fair, reproducible, and extendable eval suite for general music understanding. 📄 🌐

1

29

74

0

9

Ruibo Liu

@RuiboLiu

1 year

We also propose a new alignment algorithm (10 lines!), which aims to achieve a balance between learning from aligned responses and unlearning from misaligned ones. We leverage the difference in ratings to modulate the penalty in every mini-batch, and normalize with SFT loss.

1

0

9
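
The tweet above describes the loss only in words. A rough sketch of how such a rating-modulated objective could look, under assumptions that are mine rather than the paper's: the SFT term is the negative log-likelihood of the best-rated response, and each lower-rated response adds a hinge penalty whose margin grows with the rating gap. The function name, `margin_scale`, and `weight` are hypothetical, and this is not the released Stable Alignment code.

```python
from typing import Sequence


def rating_modulated_loss(nll_best: float,
                          rating_best: float,
                          nll_others: Sequence[float],
                          rating_others: Sequence[float],
                          margin_scale: float = 0.1,
                          weight: float = 1.0) -> float:
    """Illustrative loss: SFT anchor plus a rating-scaled hinge penalty.

    nll_* are per-response negative log-likelihoods under the model;
    rating_* are scalar quality ratings (higher is better).
    """
    sft = nll_best  # anchor: keep imitating the best-rated response
    penalties = []
    for nll, rating in zip(nll_others, rating_others):
        # A larger rating gap demands a larger likelihood gap.
        margin = margin_scale * (rating_best - rating)
        penalties.append(max(0.0, margin - (nll - nll_best)))
    penalty = sum(penalties) / len(penalties) if penalties else 0.0
    return sft + weight * penalty
```

When every lower-rated response is already sufficiently less likely than the best one, the penalty is zero and only the SFT term remains.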

Ruibo Liu

@RuiboLiu

1 year

[Ruibo's Hypothesis 🐷] Code pre-training mainly brings 1) code understanding, and 2) stronger in-context learning (as longer and structural context dependencies). CoT relies on "in-context solutions" shown to LMs, so any LLM with strong enough in-context learning can do CoT.

Yao Fu

@Francis_YAO_

1 year

[Chain of thought originated from code] is a **hypothesis, **not a **conclusion. Yi has a solid counterpoint here. CoT scores might also have a decomposition like: +10 with C4, +15 with code, +15 with scale, +10 with tuning on CoT data .etc

4

1

52

1

2

9

Ruibo Liu

@RuiboLiu

1 year

Scaling law of a good researcher.

Jason Wei

@_jasonwei

1 year

My 2023 goals: - Spend 1,000 hours writing code (context: 377 hrs in 2022, 886 hrs in 2021) - Publish 0-2 first-author papers, but not more than 2 - Write 50 thoughtful tweets - Do 150 workouts

19

21

607

0

9

Ruibo Liu

@RuiboLiu

9 months

Check out this workshop with great speakers, organized by my friends @ShayneRedford @Francis_YAO_ @yizhongwyz etc!

Shayne Longpre

@ShayneRedford

9 months

Excited to co-host the Instruction Tuning & Instruction Following (ITIF) Workshop at #NeurIPS2023 . 👷🛠️🤖 An incredible line up of topics and speakers. ➡️ Submission opening soon: Stay tuned!

0

5

43

0

9

Ruibo Liu

@RuiboLiu

9 months

Today’s gem: AI researchers are film directors. “When you only chase for impact (box office), you gradually lose reputation among researchers.” “When you only chase for creativity (independent artistic expression), you’ll have a hard time getting funding (film budget).”

Ruoshi Liu

@ruoshi_liu

9 months

Some naive thoughts about careers of AI researchers as a young uninformed PhD student 💭 An AI researcher has basically the same job as a film director. The three things you care about the most are: creativity, impact, and reputation. (1/n)

5

33

264

1

0

9

Ruibo Liu

@RuiboLiu

15 days

@YiTayML The hot take version of this is: Google does the real architecture research, while other companies take it for granted. All these companies are basically "data companies".

0

9

Ruibo Liu

@RuiboLiu

1 year

The simulated society is called Sandbox, where N social agents interact with each other following a protocol named Back-Scatter. Autonomous social agents need to improve themselves continuously, by considering peer reviews and revising their responses to societal issues. 😄

1

0

8

Ruibo Liu

@RuiboLiu

4 months

Maybe neither, as I know either can be gamed. I will carefully read his/her solo first author papers. It will tell a lot about the student’s research independence, taste, and hardcore capabilities.

Zekun Wang (Seeking 25Fall PhD) 🔥

@ZenMoore1

4 months

As a professor recruiting PhD students, would you prefer a candidate with:

5

0

3

0

1

8

Ruibo Liu

@RuiboLiu

1 year

🔥 #ChatGPT still struggles with simple physics questions ... 😔 However, with a proper context (event reminding, sensory output, etc.), it could go back on the right track. 😋

1

2

8

Ruibo Liu

@RuiboLiu

1 year

Great suggestion! This should be lesson 1 for newcomers in this field. And I think future PhD programs should have more connections/collaborations with AI labs to narrow the gap. It will benefit the whole community.

Jason Wei

@_jasonwei

1 year

I’m hearing chatter of PhD students not knowing what to work on. My take: as LLMs are deployed IRL, the importance of studying how to use them will increase. Some good directions IMO (no training): 1. prompting 2. evals 3. LM interfaces 4. safety 5. understanding LMs 6. emergence

52

285

2K

1

8

Ruibo Liu

@RuiboLiu

3 months

Huge congrats! Very well deserved! 🎉🎉🎉

Diyi Yang

@Diyi_Yang

3 months

Very honored to have been selected as a #SloanFellow ! Huge thanks to my incredible students and my mentors ♥️

78

21

573

1

0

8

Ruibo Liu

@RuiboLiu

7 months

So cooooool!

Sasha Rush (ICLR)

@srush_nlp

7 months

Introducing COLM () the Conference on Language Modeling. A new research venue dedicated to the theory, practice, and applications of language models. Submissions: March 15 (it's pronounced "collum" 🕊️)

35

437

2K

0

1

7

Ruibo Liu

@RuiboLiu

6 months

Business question: which one will eat the world?

100% ChatGPT, $20/month

15

90% ChatGPT, $1/month

18

40% ChatGPT, homemade

10

0

1

7

Ruibo Liu

@RuiboLiu

1 year

Instruction tuning on benchmark tasks can only win benchmarks. But tuning on everyday instructions wins the world. If your goal is a user friendly product, that seems to be a natural choice.

Jingfeng Yang

@JingfengY

1 year

Is task-specific instruction tuning really needed? It seems that OpenAI and Alpaca directly tuned the base model with real-world user instructions (or similar diverse instructions). That’s probably the key decision made by OpenAI three years ago, which made it lead the game.

0

4

1

2

7

Ruibo Liu

@RuiboLiu

1 year

Please comment below if I missed citing your work! Share Stable Alignment with your friend who wants to start alignment training from a not-bad baseline! 😝

1

0

7

Ruibo Liu

@RuiboLiu

1 year

Best report on GPT series models. Must read!

Yao Fu

@Francis_YAO_

1 year

How did the initial #GPT3 evolve to today's #ChatGPT ? Where do the amazing abilities of #GPT3 .5 come from? What is enabled by #RLHF ? In this article with ⁦ @allen_ai ⁩ , we trace the emergent abilities of #LLM to their sources from first principles

31

335

1K

0

1

7

Ruibo Liu

@RuiboLiu

1 year

AGI -> No trust in public space -> Everyone cooks their in-house AI -> Low-quality public data -> Private / Domain-specific AI will win. Nobody can afford a train, but everyone can buy their own cars, because of freedom or privacy. Thoughts after reading Hinton's concerns.

0

7

Ruibo Liu

@RuiboLiu

10 months

I feel like the predictions and proposals revealed in OAI's blogs are way more worth reading than many incremental papers. And the word superalignment on superintelligence sounds much more concrete. The idea itself is definitely next level play.

OpenAI

@OpenAI

10 months

We need new technical breakthroughs to steer and control AI systems much smarter than us. Our new Superalignment team aims to solve this problem within 4 years, and we’re dedicating 20% of the compute we've secured to date towards this problem. Join us!

436

752

4K

1

0

7

Ruibo Liu

@RuiboLiu

7 months

Great work!

Keiran Paster

@keirp1

7 months

Introducing OpenWebMath, a massive dataset containing every math document found on the internet - with equations in LaTeX format! 🤗 Download on @HuggingFace : 📝 Read the paper: w/ @dsantosmarco , @zhangir_azerbay , @jimmybajimmyba !

26

268

1K

0

7

Ruibo Liu

@RuiboLiu

9 months

Super-alignment will be implemented in a SandBox.

Jim Fan

@DrJimFan

9 months

Hmmm, @OpenAI just acquired a company called "Global Illumination" that makes open-source Minecraft clone. What's next, multi-agent civilization sim running on GPT-5? Maybe Minecraft is indeed all you need for AGI? I'm intrigued.🤔 Announcement: Company:…

103

407

2K

0

1

7

Ruibo Liu

@RuiboLiu

22 days

Gem.

Albert Jiang

@AlbertQJiang

22 days

I love open-sourced models! Please add your favourites to the Mistral Convex Hull.

3

12

99

0

7

Ruibo Liu

@RuiboLiu

5 months

Congrats, Chenyan! 🎉

Chenyan Jia

@JiaChenyan

5 months

Can we design AI systems to consider democratic values as their objective functions? Our new #CSCW24 paper w/ @michelle123lam , Minh Chau Mai, @jeffhancock , @msbernst introduces a method for translating social science constructs into social media AIs (1/12)

3

21

99

0

6

Ruibo Liu

@RuiboLiu

3 months

@johnschulman2 My old (and award winning) paper , found GPT-2 was liberal leaning, and can be calibrated by RL during inference. We used ideology classifier trained with MIT's media cloud data (which classified media articles into liberal and conservative leaning) to…

Mitigating Political Bias in Language Models Through Reinforced Calibration

Current large-scale language models can be politically biased as a result of the data they are trained on, potentially causing serious problems when they are deployed in real-world settings. In...

arxiv.org

1

4

Ruibo Liu

@RuiboLiu

9 months

Great work!

Aran Komatsuzaki

@arankomatsuzaki

9 months

Studying Large Language Model Generalization with Influence Functions

3

45

180

0

1

6

Ruibo Liu

@RuiboLiu

8 months

Playing Dota seems to help AI research.

Ilya Sutskever

@ilyasut

8 months

Little known fact: Many of OpenAI’s key results, including the Dota 2 bot and the pre-training of GPT-4, are thanks to the brilliant Jakub Pachocki @merettm

29

272

1K

1

0

6

Ruibo Liu

@RuiboLiu

9 months

Your DATA should have progressive improvement by nature (our multi-agent + feedback setting does the job), so that the model can have more informative supervision in every mini-batch. It also helps the stability of learning.

1

0

6

Ruibo Liu

@RuiboLiu

2 years

@rtk254 @GoogleAI This is definitely related! Will cite in the next version!

0

6

Ruibo Liu

@RuiboLiu

2 months

Anthropic might target explicitly on its enterprise users so I think such evals are very valuable. My general feeling is that the front tier LLMs are probably in the same range of general capabilities but what really matters is the optimization for their most important users.

Yao Fu

@Francis_YAO_

2 months

The most significant domains that are being improved are finance and medicine, where presumably AI can boost much performance and efficiency

1

7

60

0

6

Ruibo Liu

@RuiboLiu

2 years

TLDR: more human data = better LMs.

Jason Phang

@zhansheng

2 years

Some folks are confused about FLAN/Flan, so here's an easy guide

1

2

27

0

6

Ruibo Liu

@RuiboLiu

9 months

It's not surprising to me. I think the key point of RLHF is not the RL; rather, any method that can efficiently learn from progressive improvement in the human's answers would be legit. You can learn from ranking (RRHF) or comparison (DPO), but avoiding over-optimization is the core.

1

0

5

Ruibo Liu

@RuiboLiu

2 years

Many external tools could be the simulators!

Qian Liu 🔭

@sivil_taram

2 years

Really interesting, congrats to the authors!🎉 Excited to see so many works studying simulation in natural language processing tasks. Engines (e.g., MuJoCo) are actually *crystallized knowledge* of human reasoning, and such knowledge can be inherited by language models! 🤩

1

0

4

0

5

Ruibo Liu

@RuiboLiu

1 year

That is a profound judgment when you deeply understand the limits of neural nets. I think the best lesson I learned from GPT's success is their impressive data effort. 😆

Fangyu Liu

@hardy_qr

1 year

@RuiboLiu @JingfengY @ilyasut said in one of the interviews that neural nets are bad at OOD generalization so we need to make the world its distribution. That’s quite ambitious but extremely visionary looking back from today.

1

7

0

5

Ruibo Liu

@RuiboLiu

9 months

I have worked on RL-based alignment for many years, and the unstable learning / over-optimization issues gave me a headache all the time. The lesson I learned from Stable Alignment is, good alignment can only be achieved from both the DATA and ALGO sides.

1

0

5

Ruibo Liu

@RuiboLiu

2 years

"The reason why humans can learn to use so many tools is because our brains are larger." I remember someone told me this when I was a kid.

Jason Wei

@_jasonwei

2 years

New survey paper! We discuss “emergent abilities” of large language models. Emergent abilities are only present in sufficiently large models, and thus they would not have been predicted simply by extrapolating the scaling curve from smaller models. 🧵⬇️

15

129

595

1

0

5

Ruibo Liu

@RuiboLiu

1 year

@jefffhj Yes, and finally we all become “API researchers”. 😔

1

0

5

Ruibo Liu

@RuiboLiu

1 year

I heard that ChatGPT is about 13B.

Dimitris Papailiopoulos

@DimitrisPapail

1 year

can someone explain the economics of chatgpt api calls being 10x cheaper than text-DV3? It's better/more accurate for many arithmetic stuff too. Makes no sense, unless chatgpt is a smaller model with internal api access to tools?

18

1

39

2

0

5

Ruibo Liu

@RuiboLiu

1 year

LLM service will become something like wifi. Cheap and ubiquitous. The LLM developers will be the new era network workers. “Hi can you help me set up the LLM connection at xxx address? Do we have a discount?” Finally everyone could be the “researcher” of their own wifi.

Luyu Gao

@luyu_gao

1 year

ChatGPT (gpt-3.5-turbo) works best for conversational tasks, InstructGPT (text-davinci-003) for zero-shot, and Codex (Code-DaVinci-002) for few-shot in-context learning. As an NLP researcher, I am still valuable today, at least as a three-way classifier.

13

92

1K

1

0

5

Ruibo Liu

@RuiboLiu

1 year

Any model-2 always seems to be good.

Google DeepMind

@GoogleDeepMind

1 year

PaLM-2 is a next generation large language model with improved coding, multilingual and reasoning capabilities. It will power over 25 new @Google products and features, bringing the latest in advanced AI to benefit people. Here’s how it’s being deployed already. ⬇️ #GoogleIO

22

239

1K

0

5

Ruibo Liu

@RuiboLiu

6 months

"Any training after pre-training is called Alignment."

Jack Rae

@drjwrae

6 months

My most contrarian take is that what is commonly termed alignment (rlhf in particular) is one of the most effective capability boosting techniques. For base models are difficult tools to use, and can fail spuriously with simple tasks. Post-training reveals a lot.

12

5

80

0

5

Ruibo Liu

@RuiboLiu

9 months

Your ALGO should be reliable and scalable, and normalizing it with an SFT term is always a good idea (e.g., the last term in PPO-x; Stable Alignment likewise uses the SFT loss as the anchor).

1

0

5

Ruibo Liu

@RuiboLiu

2 years

LOL > Lol > lol, right?

2

0

5

Ruibo Liu

@RuiboLiu

3 months

The field runs so fast.

Ric Burton

@ricburton

3 months

Yann LeCun, a few days ago at the World Governments summit, on AI video: “We don’t know how to do this”

138

149

2K

0

5

Ruibo Liu

@RuiboLiu

1 year

The game begins.👏

Sundar Pichai

@sundarpichai

1 year

1/ In 2021, we shared next-gen language + conversation capabilities powered by our Language Model for Dialogue Applications (LaMDA). Coming soon: Bard, a new experimental conversational #GoogleAI service powered by LaMDA.

742

3K

15K

0

1

4

Ruibo Liu

@RuiboLiu

2 years

@narphorium @GoogleAI Thanks! We discuss attempts at modeling the world in the related work section. Thanks for reading our paper!

0

4

Ruibo Liu

@RuiboLiu

2 years

If the number of tasks the model is optimized on >> the number of tasks a normal human can be good at, you will have a false feeling that the model is “superhuman”, yet it can easily fail in contexts it is not familiar with. Memorization ≠ Intelligence.

Thomas Wolf

@Thom_Wolf

2 years

there is a scary possibility that we may solve all the benchmarks we come up with for AI... without understanding anything fundamentally deep about what intelligence is about. a bummer for those like me who see AI as a fantastic way to unlock deeper insights on human intelligence

35

44

452

0

4

Ruibo Liu

@RuiboLiu

28 days

Do they have potential pitfalls? 🤔 We laid out some concerns raised by existing work and talked about the future directions we have in mind. They are:

1

0

4

Ruibo Liu

@RuiboLiu

1 year

So UL2 can do CoT just because it was pre-trained with a smart denoising objective that makes LMs learn better from context. And I guess it might have a size requirement? If UL2 can unlock CoT on smaller LMs, that's amazing! (need to check with @YiTayML)

0

4