PhDone!!!! 👨🎓
08/2019-04/2024 What a journey 🥳🚞
I especially feel lucky to share this once-in-a-lifetime moment with people I love ❤️ . And to see my passion-driven research efforts being acknowledged by researchers I deeply admire 🌞!! Special thanks to my awesome committee
Can LLMs translate reasoning into decision-making insights?
Bad news: NO! Without any help, LLMs' "thinking" doesn't really translate into "doing".
Good news: A little bit of structure goes FaR!
We present Foresee and Reflect (FaR), a 0-shot reasoning mechanism that boosts GPT-4 from 50% to 71% on our new T4D benchmark.
There is an old Chinese saying: 千人千面 (Thousands of people and thousands of faces). Just as each person is unique, each problem is unique. How do we prepare LLMs to solve complex unseen problems through reasoning?
Our (
@USCViterbi
,
@GoogleDeepMind
) new paper: Self-Discover: Large Language Models Self-Compose Reasoning Structures
📍Introducing an AI Dungeon Master’s Guide🧙♂️, or how to make a
#DnD
DM dialogue agent trained with intents and theory of mind-inspired💭reinforcement learning.
Predicting how your players will react to you ahead of time makes for a better DM!
📃
Three (co-)first-author
#EMNLP2021
papers after multiple rounds of rejections and iterations. Feeling extremely glad and relieved that not giving up actually pays off in the end. Kudos to those whose work got rejected and who are fighting to try again, the day will come!! 🎈
Now that the official letter has arrived, I'm thrilled to say that I'll be interning at
@allen_ai
@ai2_mosaic
team this summer in Seattle!! It has always been my dream to work on common sense research at Mosaic😊. Also equally excited to meet new and old friends in the Seattle area✈️
I’m at
#NeurIPS23
and on the job market🎷🧳!! Come and talk about anything LLM reasoning, evaluating communicating agents, human-AI collaboration for new discoveries, coffee and jazz in NOLA☕️
‼️New Paper‼️ Yeah, knowledge-grounded response generation models are 🆒, but have you tried using a single model to externalize the implicit common sense *and then* produce responses? We propose a “Think🤔-Before-Speak🗣️” self-talk model that generates better responses!🧵 [1/7]
Now in
#ACL2023
!! Look forward to
@peizNLP
's presentation! See y'all in Toronto and let's chat
#DnD
dialogue, theory of mind, and all things interactive NLP!!
Camera ready soon!
🚨 Can response generation models read between the lines? Our 🆕
#EMNLP2021
paper probes whether RG models can identify commonsense reasons: we annotate CS explanations in dialogues and evaluate RG models' CS reasoning capabilities.
🎯Want your dialogue model to generate impressive responses like
#ChatGPT
🤯but don’t have the compute like
#OpenAI
? Try our new data Reflect💡accepted to
#EMNLP2022
that helps models generate 30% more quality responses! Our secret is annotating common ground between speakers 🧵
Excited to introduce 𝙍𝙄𝘾𝘼, a logically-grounded challenge to probe LMs' ability to make robust commonsense inferences despite textual perturbations. 🧵
Preprint:
Project Page:
Is the quality of *ACL reviews inversely proportional to the quality of LLMs? 🤷♂️
1 review I received does not have any reasons to accept/reject and 1 review's reason to reject basically only spells out "ChatGPT" 🤡
The Dungeon Meowsters are live in Toronto for
#ACL2023
to talk all things:
#DnD
, theory of mind, multi agent grounded dialogue, reinforcement learning, table top games, and more!!
Catch
@peizNLP
at 4 pm today at Session 8!!
I'm also at ACL now! Already met a bunch of familiar faces on my way from the airport to the venue!
Excited to chat about theory-of-mind, communicating agents, NLP+games, life and anything!!
#ACL2023
Pic taken on flight🌅 🍁
💬Excited to finally release my last summer's intern project that was accepted at
@sigdial
2021! We align dialogs with ConceptNet triples and with newly crowdsourced ones prompted from SocialIQA. Models trained on these dialogs produce better responses!
Link:
Excited to introduce 𝙍𝙄𝘾𝘼, a logically-grounded challenge to probe LMs' ability to make robust commonsense inferences despite textual perturbations. 🧵
Preprint:
Project Page:
Super excited that this simple idea to externalize knowledge-grounding in RG seems to be working😃! Project from my (second-time!) intern
@AmazonScience
Alexa AI. Huge thanks to my co-authors
@AmazonScience
@nlp_USC
@USC_ISI
!!
Paper preview: 🧵 [7/7]
USC NLP (
@nlp_usc
) is like 5x its size at the last in-person conference (Hong Kong, 2019) 👀
Can’t say enough about how amazing this
#NAACL2022
conference is for our students. So much needed 😭🙌
I live 15 min away and was planning on attending the same event tomorrow. Can't put into words how sad and terrified I am to hear about this tragedy happening on Lunar New Year's Eve. Hope we all have the strength to get through this and fight for some changes in 2023.
🚩Exciting foundational work towards building an ever-improving interactive agent:
- Identifying user intents/tasks and
- Inferring user satisfaction
Both, extracted from unstructured chat logs, are necessary first steps toward self-improvement.
Esp. found this fig of how ppl are actually
Learning from interaction is different from learning from annotations. Today we are excited to share how we are starting to learn from people's interactions to understand and improve Copilot (web) for our consumer customers:
#Microsoft
#Copilot
#Bing
Happy to announce that our paper on incorporating commonsense KG in LMs for social reasoning () has received The Best Paper Award at EMNLP-Deep Learning Inside Out (DeeLIO) Workshop! ✌️
@emnlp2020
#emnlp2020
#deelioEMNLP
Self-Discover is a fascinating new algorithm from researchers at Google DeepMind and USC that searches for "atomic reasoning modules"
One of the quickest ways to improve your LLM programs is to add Chain-of-Thought, "Let's think step by step ...", which is one such atomic reasoning module
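To make the idea concrete, here is a minimal sketch of the two prompting primitives in play: the generic zero-shot CoT trigger and a task-specific reasoning structure in the Self-Discover spirit. `call_llm` is a hypothetical placeholder for whatever completion API you use.

```python
# Minimal sketch of the two prompting primitives, assuming a hypothetical
# `call_llm` helper for whatever completion API you use.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API of choice")

def chain_of_thought(question: str) -> str:
    # The classic zero-shot CoT trigger: ask the model to reason step by step.
    return call_llm(f"{question}\n\nLet's think step by step.")

def with_reasoning_structure(question: str, structure: str) -> str:
    # Self-Discover-style: prepend a task-specific reasoning structure
    # (discovered once per task) instead of the generic CoT trigger.
    return call_llm(f"Follow this reasoning structure:\n{structure}\n\nTask: {question}")
```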
Attending
@IC2S2
in Amsterdam! Will present our work on automatically detecting politically-polarized words from online discussions (w/ Yupeng Gu,
@YizhouSun
, and
@masonporter
) on July 20!!
Exciting work led by
@aman_madaan
on mixing and matching LMs to get performance boost with reasonable costs!
Meta-verifiers help consolidate verification results and make better decisions on model routing😎
Language model APIs now come in all shapes and sizes (
@OpenAI
,
@AnthropicAI
,
@togethercompute
), with prices varying by up to 50x (Ada < Llama7b < Chatgpt < GPT-4). It makes sense to mix and match them, using smaller models for simpler queries and saving the $$ for the more complex ones
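A minimal sketch of the mix-and-match idea in the spirit of AutoMix: answer with a cheap model first and escalate to a larger one only when a (meta-)verifier is not confident. `small_lm`, `large_lm`, and `verifier_confidence` are hypothetical stand-ins, not the paper's actual components.

```python
# Rough sketch of cost-aware routing in the spirit of AutoMix; the three
# helpers below are hypothetical stand-ins, not the paper's components.

def small_lm(query: str) -> str:
    return "cheap draft answer to: " + query    # stand-in for a small, cheap model

def large_lm(query: str) -> str:
    return "expensive answer to: " + query      # stand-in for a large, costly model

def verifier_confidence(query: str, answer: str) -> float:
    return 0.5                                  # stand-in; a real (meta-)verifier scores the answer

def route(query: str, threshold: float = 0.8) -> str:
    answer = small_lm(query)                    # 1) answer cheaply first
    if verifier_confidence(query, answer) >= threshold:
        return answer                           # 2) keep it if the verifier is confident
    return large_lm(query)                      # 3) otherwise escalate to the big model
```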
At
#EMNLP2021
and interested in how to create logically-equivalent🧑🔬 but linguistically-varied probing sets and how big LMs 🤖 perform? Come to our 7D Oral session today 12:45-14:15 PST/16:45-18:15 AST where I present 𝙍𝙄𝘾𝘼!
@nlp_usc
Excited to introduce 𝙍𝙄𝘾𝘼, a logically-grounded challenge to probe LMs' ability to make robust commonsense inferences despite textual perturbations. 🧵
Preprint:
Project Page:
🥳Check out our theory of mind workshop at
#ICML2023
Lots of new discussions on whether LLM displays some extent of theory of mind.
Want to share your thoughts/hear more of how diff fields view ToM? Come to our workshop in July in Honolulu 🌺
1. 🔔**𝘾𝙖𝙡𝙡 𝙛𝙤𝙧 𝙋𝙖𝙥𝙚𝙧𝙨 𝙛𝙤𝙧 𝙏𝙝𝙚𝙤𝙧𝙮-𝙤𝙛-𝙈𝙞𝙣𝙙 𝙒𝙤𝙧𝙠𝙨𝙝𝙤𝙥**🔔
The First Workshop on Theory of Mind in Communicating Agents (ToM 2023) will be hosted at
@icmlconf
in July'23 in Honolulu 🌺
CfP:
🧵
#ICML2023
#ToM2023
#ML
#NLProc
Remember back in 2020 (wow, a long time ago) I got nervous receiving long and detailed reviews, but learned a lot about how to do better research. Now when it's release day, I just feel numb seeing generic and short reviews with no motivation to engage in deep scientific discussion
- wonder why we chose
#DnD
for goal-driven grounded dialogs?
- how do we use RL to model a theory-of-mind-inspired lookahead module?
- how good is
#GPT4
as a dungeon master?
- where did I buy my dungeon meowster shirt?!
Come by our
#ACL2023
poster today at *4:15pm*🐲😼
Now in
#ACL2023
!! Look forward to
@peizNLP
's presentation! See y'all in Toronto and let's chat
#DnD
dialogue, theory of mind, and all things interactive NLP!!
Camera ready soon!
Really enjoyed the keynote from
@AlisonGopnik
Esp. on how
#cogsci
and
#devpsych
can provide insights for
#LLM
research and more rigorous exp designs for cog abilities such as
#ToM
Very excited about designing and exploring proper evaluation along these lines, stay tuned 👀
... and finally also notifications for student research workshop have been sent out. Congratulations to the smart students who will present their work at
#acl2019nlp
!
🙋Catch us at the Ethics session for our paper on biases in commonsense knowledge bases tmr 8:30-10:30 AM PST/12:30-2:30 PM AST. Can't wait to chat with old and new friends!!!!
@nlp_usc
We will be presenting this work tomorrow with
@peizNLP
at the virtual poster session II: Ethics and NLP (8:30 PT; 12:30 AST). Come say hi if you are around 😊
#EMNLP2021
Maybe the days when I'm constantly worried about renewing my visa and afraid of not being able to continue my studies once I leave the country have finally passed.
#GodBlessAmerica
Same for international students, especially PhDs. Many simply cannot risk their early academic career to go back home and see loved ones. My grandparents’ health conditions are worsening and not being able to go back breaks my heart every time I video chat with them.
It sucks not being able to visit your family
Just ask immigrant scientists who haven't seen their family in years because of visa issues & costs, and the very real possibility they would not be allowed back in the country
It's time to fix the US visa system
Sunday morning read: Self-Discover: Large Language Models Self-Compose Reasoning Structures by
@peizNLP
et al. ☕️
When building LLM applications, developers typically break down complex tasks into sub-tasks. What if we use an LLM to "self-discover" how to break down the tasks?
If you want a respite from OpenAI drama, how about joining academia?
I'm starting Conceptualization Lab, recruiting PhDs & Postdocs!
We need new abstractions to understand LLMs. Conceptualization is the act of building abstractions to see something new.
Self-Discover consists of 2 stages:
1) Discovery stage where we use meta-prompts to guide LLMs to generate a structure given a few task examples (without labels!)
2) Solving stage where we simply append the structure to each task instance. (Full prompts in the paper; a rough sketch of both stages is below.)
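As referenced above, a minimal sketch of the two stages, assuming a generic `call_llm` helper and a toy seed set of reasoning modules (the paper's meta-prompts and module set differ):

```python
# Minimal sketch of the two Self-Discover stages; `call_llm` and the tiny
# module list below are illustrative placeholders, not the paper's prompts.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API of choice")

REASONING_MODULES = [
    "Break the problem down into sub-problems",
    "Use critical thinking to verify each step",
    "Reason about the problem step by step",
]

def discover_structure(task_examples: list[str]) -> str:
    # Stage 1 (once per task, no labels): meta-prompt the LLM to select, adapt,
    # and compose reasoning modules into a structure for this kind of task.
    meta_prompt = (
        "Task examples:\n" + "\n".join(task_examples) + "\n\n"
        "Select and adapt the relevant reasoning modules below, then compose them "
        "into a step-by-step reasoning structure for solving this kind of task:\n"
        + "\n".join(REASONING_MODULES)
    )
    return call_llm(meta_prompt)

def solve(instance: str, structure: str) -> str:
    # Stage 2 (per instance): simply append the discovered structure to the instance.
    return call_llm(f"Follow this reasoning structure:\n{structure}\n\nTask: {instance}")
```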
@deliprao
@allen_ai
Thanks for sharing! The paper for this dataset () was actually released in late 2019. Maybe more old (🤔) test datasets/ideas should be revisited to evaluate LLMs!
How FaR Are Large Language Models From Agents with Theory-of-Mind?
paper page:
"Thinking is for Doing." Humans can infer other people's mental states from observations--an ability called Theory-of-Mind (ToM)--and subsequently act pragmatically on those
We prepared some fun examples of self-discovered structures on different tasks. We compared Self-Discover with CoT/Plan-Solve and human-written structures, check em out here!
New Preprint Alert! 📢
Classical decision theory has helped humans make rational decisions under uncertainty for decades. Can it do the same for Large Language Models?
We present DeLLMa (“dilemma”), a Decision-making LLM assistant.
🔗
1/🧵
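For readers unfamiliar with the classical machinery DeLLMa draws on, here is the textbook expected-utility calculation it builds on (a toy example with made-up numbers, not the paper's actual pipeline):

```python
# Textbook expected-utility maximization, the classical decision-theory idea
# DeLLMa builds on (toy example with made-up numbers, not the paper's pipeline).

def expected_utility(action, states, probs, utility):
    # probs[s]: believed probability of state s; utility(a, s): value of outcome (a, s).
    return sum(probs[s] * utility(action, s) for s in states)

def best_action(actions, states, probs, utility):
    # Pick the action with the highest probability-weighted utility.
    return max(actions, key=lambda a: expected_utility(a, states, probs, utility))

states = ["rain", "sun"]
probs = {"rain": 0.3, "sun": 0.7}
payoff = {("umbrella", "rain"): 5, ("umbrella", "sun"): 2,
          ("no_umbrella", "rain"): -5, ("no_umbrella", "sun"): 4}
utility = lambda a, s: payoff[(a, s)]

print(best_action(["umbrella", "no_umbrella"], states, probs, utility))  # -> umbrella (EU 2.9 vs 1.3)
```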
The PEARLS Lab at
@UCSD_CSE
is now open for business! I'm recruiting Fall 24 PhD students in all things interactive and grounded AI, RL, and NLP!! Join us in the land of 🏖️ beach (🧋pearl tea included). Apply by Dec 20. Please help spread the word!
More:
Soon™, I'll be an Asst Prof
@UCSanDiego
@UCSD_CSE
focusing on interactive & grounded AI, RL, NLP
I will also be a research scientist
@MosaicML
helping lead efforts to make tech like RLHF more accessible
Looking for PhD students & research eng/scientists to join me in ☀️SoCal🏖️
Further analysis shows the discovered structures display some extent of "universality" and are transferable across multiple large/small LMs. They retain more performance than prompt optimization when transferred to different LMs.
This effectively alleviates model-switching costs
Google presents Self-Discover
The proposed method improves LLM performance on reasoning tasks
It outperforms Self-Consistency by more than 20%, while requiring 10-40x less inference compute
Looking forward to discussing our recent work on using inference-time compute for effective reasoning at
#NeurIPS2023
!
🗓️ Self-Refine: Iterative Refinement with Self-Feedback, Wed 13 Dec 5 p.m., Great Hall & Hall B1+B2 (level 1) Poster
#324
🗓️ AutoMix:
We test on 25 challenging reasoning tasks including BigBench-Hard (23 sub-tasks), Thinking for Doing (T4D), and MATH, and the improvements are pretty consistent!
We also find that Self-Discover helps most on complex reasoning tasks requiring world knowledge.
📍Introducing an AI Dungeon Master’s Guide🧙♂️, or how to make a
#DnD
DM dialogue agent trained with intents and theory of mind-inspired💭reinforcement learning.
Predicting how your players will react to you ahead of time makes for a better DM!
📃
Why is acting hard?
We analyze 3 oracle settings with provided inferences and find that whenever LLMs are given hints on what to reason about, they choose actions much better!
Bottleneck → LLMs struggle to identify implicit inferences by themselves.
@rikvannoord
Would cross-lingual analogies (to test bilingual word embeddings) make more sense? Like A:B :: C:D where A, B are in one language and C, D in another. In this way D naturally cannot be either A or B.
@Leox1v95
Super cool, thanks for sharing!! We also have a new paper on a similar idea of externalizing implicit commonsense knowledge in response generation; would love to connect and potentially discuss this interesting direction sometime😃
Even from my brief interaction with Mohamed, I was already so impressed by his insights and learned a ton. Working with him in Nairobi would be a great delight!!
We research, build, and ship some of the most exciting and challenging products in the world, with passion, from Nairobi. Come join us in building M365 Copilot.
We ❤️ engineers and scientists that ❤️ breaking and building things!
@MasakhaneNLP
@DeepIndaba
Since our meta-prompts operate at the task level, Self-Discover is also very efficient compared to inference-intensive methods like Self-Consistency. Here we find Self-Discover can even outperform methods requiring 40x more calls per instance!
We propose a new eval paradigm: Thinking for Doing (T4D) to probe whether LLMs can act based on inferences abt others’ mental states (theory-of-mind).
We convert an inference-probing benchmark ToMi to action-probing T4D and LLMs drop significantly!
We design Foresee and Reflect (FaR) to guide models to first predict potential future events and then reflect on what actions to perform.
FaR-structured reasoning outperforms chain-of-thought and other baselines, boosting GPT-4 from 50% to 71%.
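A rough two-step prompt scaffold in the spirit of Foresee-and-Reflect; the paper's exact prompt wording differs, and `call_llm` is a hypothetical placeholder:

```python
# Rough two-step prompt scaffold in the spirit of Foresee-and-Reflect (FaR);
# the paper's exact prompts differ, and `call_llm` is a hypothetical placeholder.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API of choice")

def foresee_and_reflect(observations: str, candidate_actions: list[str]) -> str:
    # Step 1 (Foresee): predict likely future events and what each character may need.
    foresight = call_llm(
        f"Observations:\n{observations}\n\n"
        "Predict what is likely to happen next and what each character might need, "
        "given their likely beliefs."
    )
    # Step 2 (Reflect): pick the action that best addresses the foreseen needs.
    return call_llm(
        f"Observations:\n{observations}\n\nForesight:\n{foresight}\n\n"
        "Candidate actions:\n" + "\n".join(candidate_actions) + "\n\n"
        "Reflect on the foresight above and choose the single best action."
    )
```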
It's easy to reject papers. You can manufacture issues and sink them based on amorphous/vague notions such as novelty, impact, clarity, etc.
Every paper has minor problems that you can amplify. It takes a lot more courage to argue to accept something.
Google Deepmind presents Self-Discover
Large Language Models Self-Compose Reasoning Structures
paper page:
SELF-DISCOVER substantially improves GPT-4 and PaLM 2's performance on challenging reasoning benchmarks such as BigBench-Hard, grounded agent reasoning, and MATH
The secret to aligning LMs to human preferences is reinforcement learning. But Why&How is it used? Announcing
💻RL4LMs: library to train any
@huggingface
LM w/ RL
👾GRUE: benchmark of 6 NLP tasks+rewards
📈NLPO: new RL alg 4 LMs
🌐
Have you ever wished for a large-scale public dialog dataset with quality? We'd like to tell you that your wish has finally come true🎄 To quench your thirst, we give you SODA🥤, the first MILLION-scale HIGH-quality dataset with RICH social interaction✨ 🧵
Check out our poster at
#EMNLP2022
!! I will not be there in person but the amazing
@HJCH0
will present on *December 10th at 9AM* local time at the *Atrium* !
RICA is the first project in my PhD and it has been a long (sometimes exhausting) and fun journey🏃. I'm extremely grateful for all the help along the way, especially my amazing co-authors
@ark_kade
, Seyeon Lee,
@billyuchenlin
, Danial Ho,
@jay_mlr
, and
@xiangrenNLP
!!
@HarrySurden
@Swarooprm7
@_akhaliq
Thanks! It's a great question. Here are more detailed examples of how each stage in Self-Discover works, based on the selected modules for a movie recommendation task. We plan to include a full example in the appendix next!
We compare self-talk RG models against knowledge-grounded RG models that take in *ground-truth* knowledge (selected by word overlap with the reference response) and still find that our models produce better responses! Noisy knowledge input falls short, as expected. 🧵 [6/7]
🪄We propose a theory-of-mind-inspired methodology for training a model to generate guidance for students with RL, where a DM with intent:
1) learns to predict how the players will react;
2) uses this prediction as reward/feedback on how effective these utterances are at guiding (rough sketch below).
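As noted above, a conceptual sketch of that reward signal (not the paper's actual implementation): the DM proposes a guiding utterance, a learned player model predicts the players' reaction, and agreement with the intended action is rewarded.

```python
# Conceptual sketch of the reward idea (not the paper's actual code): a learned
# player model predicts how players would react to the DM's utterance, and
# agreement with the DM's intended action becomes the RL reward.

def guidance_reward(dm_utterance, intended_action, player_model):
    predicted_reaction = player_model.predict(dm_utterance)   # anticipated player action
    return 1.0 if predicted_reaction == intended_action else 0.0

# During training, this scalar reward would be fed back (e.g., via policy gradient)
# to update the DM's utterance-generation policy.
```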
Ablations show both F and R matter and noisy foresight hurts (how can models recover?).
Crucially, we find FaR generalizes to diverse contexts including story structures and task domains.
Meet Gandalf🧙♂️: new task on generating guidance in goal-driven & grounded communication. We aim to model a teacher (DM) who attempts to guide a set of students (players) toward performing certain actions grounded in a shared world.
Compared to existing dialog datasets, Reflect💡1) contains explicit human annotations of 5 inference types for each response and 2) provides 15 different plausible responses for each dialogue context (9k in total). We argue this more naturally mimics human communication.
To better analyze this worrisome phenomenon, we define *representational harms* in a set of statements targeting various social groups and use two approximations, sentiment and regard (from
@ewsheng
) to quantify the harms: non-neutral views such as prejudice and favoritism.
Human eval results comparing against traditional RG show that our model produces more informative and less generic responses. Using soft-matching to align knowledge and using information-seeking question-answer pairs as the knowledge format helps produce even better ones. 🧵 [5/7]
We formulate abstract commonsense using first-order logic, use unseen strings to avoid factual recalling, and apply perturbations to form a logically-equivalent statement set. Then we probe LMs in two settings that directly examine the model’s predictions (non-parametric probes).
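A minimal sketch of what such a non-parametric probe could look like: each axiom is instantiated with novel entity names, perturbed into logically equivalent variants, and the LM only gets credit if it prefers the correct completion on every variant. The example statements and the `score` helper are illustrative placeholders, not RICA's actual data or code.

```python
# Illustrative sketch of the non-parametric probing setup: an axiom is
# instantiated with novel entity names, perturbed into logically equivalent
# variants, and the LM is only credited if it prefers the correct completion
# on every variant. Statements and `score` are placeholders, not RICA's data.

def score(statement: str, option: str) -> float:
    raise NotImplementedError("return the LM's score for filling [MASK] with `option`")

VARIANTS = [
    "A prindag is heavier than a flumo, so a prindag is [MASK] likely to sink.",
    "A flumo is lighter than a prindag, so a prindag is [MASK] likely to sink.",
]

def robustly_correct(variants, correct="more", wrong="less"):
    # Credit only if the correct option wins on ALL logically-equivalent perturbations.
    return all(score(v, correct) > score(v, wrong) for v in variants)
```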
Models produce dull and generic dialogs because existing data, esp. crowdsourced, contains simple and safe responses, as workers want to annotate quickly. We propose a *two-step* process that asks ppl to infer about CG and then write responses based on each inference dimension.
We propose evaluation protocols targeting three dimensions: knowledge quality, knowledge-response connection, and response quality. We find that for 75% of the time, given unseen dialogues, our model produces relevant commonsense knowledge and responses are grounded. 🧵 [4/7]
#DnD
is perfect for this setting since DM intends to guide players to achieve a set of story goals.
Introducing G-Dragon🐉: large-scale D&D data with labeled guidance from online gameplay. We cast dialog generation as a POMDP and train inverse dynamics models to label guidance.
We decompose the RG process to externalize the knowledge grounding step by training RG models to self-talk (inspired by
@VeredShwartz
et al.) so that they explicitly generate the relevant commonsense knowledge and reference it when responding (rough sketch below). 🧵 [2/7]
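As referenced above, a minimal sketch of the think-before-speak decomposition with a single model; `generate` is a hypothetical placeholder for one generation call:

```python
# Minimal sketch of the think-before-speak decomposition with a single model;
# `generate` is a hypothetical placeholder for one generation call.

def generate(prompt: str) -> str:
    raise NotImplementedError("plug in your seq2seq / LM generation call")

def think_before_speak(dialogue_history: str) -> tuple[str, str]:
    # Step 1 (Think): externalize the implicit commonsense behind the dialogue.
    knowledge = generate(f"Dialogue:\n{dialogue_history}\n\nRelevant commonsense knowledge:")
    # Step 2 (Speak): generate a response explicitly grounded in that knowledge.
    response = generate(f"Dialogue:\n{dialogue_history}\n\nKnowledge:\n{knowledge}\n\nResponse:")
    return knowledge, response
```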
I'm so excited to see what the community can build off of Reflect💡and eager for more common ground-aware conversational models!! Huge thanks to my collaborators
@USC_ISI
📜Paper:
💡Data:
🧭Website: