Alexis Ross Profile Banner
Alexis Ross Profile
Alexis Ross

@alexisjross

2,633
Followers
898
Following
26
Media
368
Statuses

phd-ing @MIT_CSAIL, working on machine teaching | formerly nlp @allen_ai, comp sci & philosophy @harvard '20

Seattle, WA
Joined June 2018
Pinned Tweet
@alexisjross
Alexis Ross
9 days
Good teachers *adapt* to student beliefs & misconceptions: Can LLM teachers? In new work w/ @jacobandreas, we introduce 1) the AdapT (Adaptive Teaching) evaluation framework & 2) AToM (Adaptive Teaching tOwards Misconceptions), a new probabilistic teaching method. (1/n)
Tweet media one
3
39
180
@alexisjross
Alexis Ross
2 years
Life update: I'll be starting my PhD at @MITEECS & @MIT_CSAIL in the fall! Super excited to work with @jacobandreas, Yoon Kim, and the rest of the rich language ecosystem at MIT
46
18
521
@alexisjross
Alexis Ross
11 months
GPT-4 is already becoming widely used as a writing assistant, but can it edit a scientific paper in response to reviews? In work led by Mike D'Arcy, we study this question & release ARIES, a dataset of paper edits aligned to specific reviewer comments:
Tweet media one
5
78
343
@alexisjross
Alexis Ross
3 years
Excited to share our preprint, "Explaining NLP Models via Minimal Contrastive Editing (MiCE)" This is joint work with @anmarasovic and @mattthemathman Link to paper: Thread below. 1/6
5
45
227
@alexisjross
Alexis Ross
2 years
Sunset cruise to start off the PhD at @MIT_CSAIL! Grateful to @MITEECS's GW6 for organizing!
Tweet media one
Tweet media two
3
8
211
@alexisjross
Alexis Ross
4 years
Excited to have won a Hoopes prize for my Computer Science/Philosophy thesis in explainable ML! Working on this interdisciplinary project was challenging but deeply rewarding. Forever grateful to my advisors Hima Lakkaraju & Bernhard Nickel for their invaluable guidance!
@hima_lakkaraju
๐™ท๐š’๐š–๐šŠ ๐™ป๐šŠ๐š”๐š”๐šŠ๐š›๐šŠ๐š“๐šž
4 years
Some good news: Alexis Ross who is one of my first undergrad thesis advisees at Harvard won the Hoopes Prize for her thesis. Yayy!
4
2
76
7
11
120
@alexisjross
Alexis Ross
3 years
Our upcoming #nlphighlights episodes will be a series on PhD applications. If there are any questions or topics you would like to see discussed, feel free to reply or send me a DM! We will look at these responses as we prepare our episodes.
11
14
119
@alexisjross
Alexis Ross
3 years
I'm happy to share that our paper "Explaining NLP Models via Minimal Contrastive Editing (MiCE)" was accepted into Findings of ACL 2021! Updated paper: Code & models: Work with @anmarasovic @mattthemathman
@alexisjross
Alexis Ross
3 years
Excited to share our preprint, "Explaining NLP Models via Minimal Contrastive Editing (MiCE)" This is joint work with @anmarasovic and @mattthemathman Link to paper: Thread below. 1/6
5
45
227
3
18
103
@alexisjross
Alexis Ross
3 years
#nlphighlights 123: Robin Jia tells us about robustness in NLP: what it means for a system to be robust, how to evaluate it, why it matters, and how to build robust NLP systems. Thanks @robinomial and @pdasigi for a great discussion!
3
18
100
@alexisjross
Alexis Ross
9 months
Resource alert for people applying to CS PhD programs this cycle -- contains >60 example statements of purpose! It's made possible by the many generous submissions from new applicants, and new ones are always welcome!
@cs_sop_org
cs-sop.org
9 months
Are you thinking about your statement of purpose for grad school applications? If you are looking for good SoPs for reference, check out our ! Currently >60 SoPs have been generously contributed by students from >25 institutions, across >15 CS fields.
1
14
61
1
25
91
@alexisjross
Alexis Ross
1 year
we heard about @stanfordnlp's Alpaca and thought we should join in on the fun @gabe_grand @belindazli @zhaofeng_wu
Tweet media one
Tweet media two
2
5
88
@alexisjross
Alexis Ross
3 years
Happy to share that our paper, "Learning Models for Actionable Recourse," will appear in NeurIPS 2021! Very grateful to my collaborators/mentors @hima_lakkaraju and @obastani . Camera ready version coming soon!
@hima_lakkaraju
๐™ท๐š’๐š–๐šŠ ๐™ป๐šŠ๐š”๐š”๐šŠ๐š›๐šŠ๐š“๐šž
3 years
Learning Models for Actionable Recourse with @alexisjross and @obastani [5/n]
2
3
16
5
7
85
@alexisjross
Alexis Ross
2 years
Does training models with free-text rationales facilitate learning *for the right reasons*? We ask this question in our #EMNLP2022 paper, "Does Self-Rationalization Improve Robustness to Spurious Correlations?" W/ @anmarasovic @mattthemathman 1/n
2
19
80
@alexisjross
Alexis Ross
2 years
Happy to share that Tailor will appear at #ACL2022 as an oral presentation! For details, w/ new & improved results, check out our... - in-person talk (5/23, session 3) & poster (5/24, session 5) - updated paper: - code:
@tongshuangwu
Sherry Tongshuang Wu
3 years
New preprint alert! *Tailor: Generating and Perturbing Text with Semantic Controls* Title says it all: we perturb sentences in semantically controlled ways like how a tailor changes clothes. w/ @alexisjross, @haopeng01, @mattthemathman, @nlpmattg 1/n
Tweet media one
2
43
183
2
10
76
@alexisjross
Alexis Ross
1 year
Feeling grateful to have attended a wonderful #EMNLP2022! Highlights include the many interesting poster sessions and a memorable desert sunset. Big thank you to everyone who stopped by our poster yesterday (and @i_beltagy for the photo)!
Tweet media one
Tweet media two
1
3
72
@alexisjross
Alexis Ross
2 years
I'll be a mentor for MIT EECS's Graduate Application Assistance Program this application cycle -- Please consider signing up if you're applying to PhD programs this fall!
@_k_a_c_h_
Kartik Chandra (also on Mastodon and Bsky)
2 years
Applying to grad school in EE/CS this fall? ...need help? Ask the MIT EECS Graduate Application Assistance Program! GAAP pairs applicants who need help with current PhD students for 1:1 mentoring. We match mentors weekly so it's never too late to sign up!
Tweet media one
1
40
123
3
7
67
@alexisjross
Alexis Ross
3 years
#nlphighlights 134: The 2nd episode in our PhD app series is on PhDs in Europe vs the US. @barbara_plank & Gonçalo Correia share faculty & student perspectives on things to consider when choosing. We also discuss the ELLIS program. Cohosted w/ @zhaofeng_wu
3
15
64
@alexisjross
Alexis Ross
3 years
Very grateful to have attended #EMNLP2021 in Punta Cana! It was wonderful meeting so many virtually familiar and new faces in real life and discussing all things NLP (especially on the beach!)
Tweet media one
Tweet media two
Tweet media three
1
0
64
@alexisjross
Alexis Ross
3 months
One of my favorite things about grad school has been getting to play chamber music again--Had a blast playing Tchaikovsky Piano Trio with @vikramsundar and @erencshin
1
1
51
@alexisjross
Alexis Ross
11 months
Our new preprint, led by @zhaofeng_wu, shows that traditional benchmark evals may over-estimate the generalizability of LLMs' task abilities. We find LLM performance consistently drops on counterfactual variants of tasks (ex: code exec w/ 1-based indexing)! Details below
@zhaofeng_wu
Zhaofeng Wu
11 months
Language models show impressive performance on a wide variety of tasks, but are they overfitting to evaluation instances and specific task instantiations seen in their pretraining? How much of this performance represents general task/reasoning abilities? 1/4
Tweet media one
9
108
466
0
6
50
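The counterfactual-evaluation idea in the tweet above can be made concrete with a toy sketch (illustrative only; the question text and helper below are my own, not code or data from the paper): the same surface question about list indexing has different gold answers under the default 0-based convention and a 1-based counterfactual variant, so a model that has only memorized the default convention fails the variant.

```python
# Illustrative sketch (not from the paper): the same indexing question has
# different gold answers under 0-based vs. a counterfactual 1-based convention.

def index_lookup(xs, i, one_based=False):
    """Return the element of xs at position i under the chosen convention."""
    return xs[i - 1] if one_based else xs[i]

question = "What is lst[2] for lst = ['a', 'b', 'c', 'd']?"
lst = ["a", "b", "c", "d"]

default_answer = index_lookup(lst, 2)               # 'c' under 0-based indexing
counterfactual_answer = index_lookup(lst, 2, True)  # 'b' under 1-based indexing

print(question)
print("default (0-based):       ", default_answer)
print("counterfactual (1-based):", counterfactual_answer)
# A model that truly understands indexing should get both right when the
# convention is stated; one that overfits to the default will miss the variant.
```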
@alexisjross
Alexis Ross
10 months
Had a wonderful time at #DISI2023 over the past few weeks learning about diverse intelligences and exploring Scotland! Grateful to be leaving with many new friends @DivIntelligence
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
1
47
@alexisjross
Alexis Ross
2 years
The deadline for Predoctoral Young Investigator (PYI) applications for @ai2_allennlp is 2/15 -- Two days left to apply! I *highly* recommend the program for anyone interested in pursuing a PhD in natural language processing.
1
6
47
@alexisjross
Alexis Ross
2 years
#NeurIPS2021 Paper: "Learning Models for Actionable Recourse" w/ @hima_lakkaraju & @obastani We'll be presenting this work at Poster Session 1. Happening tomorrow, Tues 12/7, 8:30-10 AM (PST). Come say hi! Paper: More info:
Tweet media one
0
5
44
@alexisjross
Alexis Ross
1 year
In Abu Dhabi for #emnlp2022! Presenting a poster for our work on self-rationalization & robustness on Sunday at 11 AM: I'd love to chat about pragmatics, pedagogy, the relationship b/w explanations & learning, or anything in between -- please reach out!
@alexisjross
Alexis Ross
2 years
Does training models with free-text rationales facilitate learning *for the right reasons*? We ask this question in our #EMNLP2022 paper, "Does Self-Rationalization Improve Robustness to Spurious Correlations?" W/ @anmarasovic @mattthemathman 1/n
2
19
80
1
2
40
@alexisjross
Alexis Ross
4 years
An interesting study on a real-world use of GPT-2: Generated "deepfake" comments were submitted to a federal public comment site for an Idaho Medicaid waiver and found to be indistinguishable from human comments. Paper: (1/2)
1
18
39
@alexisjross
Alexis Ross
5 years
Had a great time at #emnlp2019 presenting work with Ellie Pavlick, "How well do NLI models capture verb veridicality?" () Thank you Hong Kong and EMNLP for a great first conference!
Tweet media one
Tweet media two
Tweet media three
4
0
33
@alexisjross
Alexis Ross
3 years
Super excited to share our work on Tailor: a *semantically-controlled, application-agnostic system for generation and perturbation* and the result of a really fun collaboration! Details in thread below
@tongshuangwu
Sherry Tongshuang Wu
3 years
New preprint alert! *Tailor: Generating and Perturbing Text with Semantic Controls* Title says it all: we perturb sentences in semantically controlled ways like how a tailor changes clothes. w/ @alexisjross, @haopeng01, @mattthemathman, @nlpmattg 1/n
Tweet media one
2
43
183
0
4
29
@alexisjross
Alexis Ross
3 years
Ana has been an incredible mentor to me (and so many others), and I have no doubt she is going to make an equally incredible professor! Any institution would be very lucky to have her!
@anmarasovic
Ana Marasović
3 years
Maybe this is also a good time to announce that I'm on the faculty job market! Reach out if I'm a good fit!
1
35
105
1
1
28
@alexisjross
Alexis Ross
1 year
Excited to share CREST, our new #ACL2023 work led by the awesome @MarcosTreviso! This was a super fun collaboration w/ Marcos, @nunonmg, & @andre_t_martins. CREST combines counterfactuals & rationales to improve model robustness / interpretability--details in the thread below
@MarcosTreviso
Marcos Treviso
1 year
1/7 Thrilled to announce that our paper "CREST: A Joint Framework for Rationalization and Counterfactual Text Generation" has been accepted at #ACL2023 oral! This work is a result of a fantastic collaboration with @alexisjross, @nunonmg, and @andre_t_martins. Let's dive in!
1
7
39
0
2
27
@alexisjross
Alexis Ross
2 years
I am deeply grateful to my mentors, friends, & family, who helped me navigate all parts of the application process. Special thank you to everyone at @allen_ai for their support over the past 2 years. I also feel so lucky to have met many wonderful people through this process!
1
0
26
@alexisjross
Alexis Ross
2 years
I spent two years as a predoctoral young investigator with @ai2_allennlp and could not have more positive things to say!! Please do apply if you want to work in an energizing and supportive environment with brilliant *and* kind people!
@ai2_allennlp
AllenNLP
2 years
Prepare for a PhD program by doing a 1-3 year-long stint as a Predoctoral Young Investigator! Apply by tomorrow, 10/15:
0
2
19
0
1
23
@alexisjross
Alexis Ross
2 years
Go work with Ana!!
@anmarasovic
Ana Marasović
2 years
I'm recruiting students! My interests include measuring usefulness of explanations for human-AI collaboration, addressing human factors that confound such measurements, & modeling interactive explainability (multimodality, few/zero-shot learning, dialogs, personalization, etc)
3
18
63
0
2
21
@alexisjross
Alexis Ross
4 years
#nlphighlights 121: Alona Fyshe tells us about the connection between NLP representations and brain activity in this episode hosted with Matt Gardner. Thank you @alonamarie and @mattg for a really interesting discussion on language and the brain!
2
6
18
@alexisjross
Alexis Ross
1 year
One of my favorite posters was this really cool work by @yuntiandeng @volokuleshov @srush_nlp (presented by @jxmnop ) on evaluating long-form generated text in the latent space
Tweet media one
0
2
18
@alexisjross
Alexis Ross
1 year
Check out my labmate's cool work!
@akyurekekin
Ekin Akyürek
1 year
I am on the front page of MIT today! I am grateful to MIT News for covering my research! You can read the full paper. I take the opportunity to support the people who suffered from the *unprecedented* earthquake in Turkiye. Trustworthy orgs to donate:
11
52
412
0
0
11
@alexisjross
Alexis Ross
3 years
Had a great time talking with @pdasigi and @thePetrMarek about the winning submission, Alquist 4.0, and how it can conduct coherent and engaging conversations! (Teaser: Alquist is designed to store and follow up on personal details you mention, like that you have a brother)
@pdasigi
Pradeep Dasigi
3 years
#nlphighlights 132: @alexisjross and I chatted with Petr Marek @thePetrMarek about the Alexa Prize Socialbot Challenge, and this year's winning submission from Petr and team from the Czech Technical University. Thanks for the informative discussion, Petr!
0
2
6
0
1
10
@alexisjross
Alexis Ross
3 years
@emilypahn I also struggle with this! What's helped me is reading with the goal of writing high-level notes about the paper's main contributions, my takeaways/questions, and connections to what I'm working on. Writing answers to these questions as I read helps to focus my attention and know when to move on
0
0
10
@alexisjross
Alexis Ross
3 years
Big thank you to @complingy for the idea for this series and to everyone who sent me questions they wanted to see discussed. More topics will be covered in upcoming episodes.
0
0
9
@alexisjross
Alexis Ross
9 days
Lastly, a big thank you to my advisor @jacobandreas for being so supportive and making my first PhD project such a rewarding & fun experience! (n/n)
0
0
8
@alexisjross
Alexis Ross
2 years
Overall, the variability of our results suggests that, despite the appeal of self-rationalization models for increasing model trustworthiness, self-rationalization training can have the unintended effect of *increasing* reliance on spurious features and biases. 5/5
0
1
7
@alexisjross
Alexis Ross
3 years
@TheRealEGS I really like @bnickel34's advice on reverse outlining
1
0
5
@alexisjross
Alexis Ross
6 months
@wangzjeff @shannonzshen has built a cool slack chatbot :)
1
0
4
@alexisjross
Alexis Ross
3 years
@anmarasovic This looks so good and now I am craving fresh Mediterranean seafood
1
0
5
@alexisjross
Alexis Ross
3 years
@tiancheng_hu @996roma @complingy @nsubramani23 Thanks @tiancheng_hu and apologies for the delayed response! We weren't discussing any specifically, but there are a few other predoc/residency programs in industry that I know of--Here's a list I found (though I haven't personally looked through each one)
2
0
5
@alexisjross
Alexis Ross
11 months
In order for LMs to be effective writing assistants, they need to be able to model the relationship b/w paper feedback & revisions. Our findings highlight limitations in LMs' abilities to do so (e.g. focusing on surface-level meaning rather than the underlying intent of feedback)
1
0
5
@alexisjross
Alexis Ross
3 years
Human explanations are *contrastive* -- They explain why an event happened *instead of* another event (the contrast case). Making model explanations contrastive could thus make them more user-friendly/useful. However, this property has largely been ignored in interpretable NLP. 2/6
1
2
5
@alexisjross
Alexis Ross
1 year
@jxmnop currently listening
Tweet media one
1
0
5
@alexisjross
Alexis Ross
2 years
As a current PYI on AllenNLP, I've gained invaluable hands-on research experience and preparation for a PhD -- and all in an incredibly collaborative, friendly, & supportive environment. Please feel free to reach out over DM or email with any questions!
0
0
4
@alexisjross
Alexis Ross
1 year
@jxmnop Now eagerly awaiting this feature to be rolled out to my account
1
0
4
@alexisjross
Alexis Ross
11 months
We hope ARIES will aid researchers in studying the paper revision process and developing new methods for assisting authors and reviewers in the writing process!
1
0
4
@alexisjross
Alexis Ross
5 months
@shannonzshen and like RAG, a cheatsheet is not enough to completely rule out hallucinations at test time
0
0
4
@alexisjross
Alexis Ross
2 years
@kayo_yin @SCSatCMU You're a star!!!
0
0
4
@alexisjross
Alexis Ross
3 years
Episodes will be uploaded here:
0
0
4
@alexisjross
Alexis Ross
3 years
MiCE has 2 stages: In Stage 1, we train an Editor model to make edits targeting given contrast labels. In Stage 2, we use the Editor to make edits using both binary search and beam search to find edits resulting in the highest contrast prediction probabilities from the model. 4/6
Tweet media one
1
0
4
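A rough sketch of the Stage 2 search described in the tweet above, using stand-in models rather than the released MiCE code (the `editor` and `predictor` functions below are toy placeholders I made up): binary search narrows how much of the input to mask and rewrite, while a small beam keeps the candidate edits with the highest predictor probability for the contrast label.

```python
# Simplified MiCE-style Stage 2 search with stand-in models (toy sketch only).
import random

def editor(text, mask_frac, contrast_label, n_candidates=4):
    """Stand-in for the trained Editor: propose rewrites of `text`."""
    words = text.split()
    n_mask = max(1, int(len(words) * mask_frac))
    candidates = []
    for _ in range(n_candidates):
        w = words[:]
        for i in random.sample(range(len(w)), n_mask):
            w[i] = contrast_label.upper()          # toy "edit"
        candidates.append(" ".join(w))
    return candidates

def predictor(text, label):
    """Stand-in for the model being explained: P(label | text)."""
    return text.lower().split().count(label) / max(len(text.split()), 1)

def mice_stage2(text, contrast_label, beam_size=2, steps=5):
    lo, hi = 0.0, 1.0
    beam = [(predictor(text, contrast_label), text)]
    for _ in range(steps):                         # binary search on mask fraction
        mid = (lo + hi) / 2
        cands = [c for _, t in beam for c in editor(t, mid, contrast_label)]
        scored = sorted(((predictor(c, contrast_label), c) for c in cands), reverse=True)
        beam = scored[:beam_size]                  # keep the highest-scoring edits
        # If the best edit already favors the contrast label, try masking less;
        # otherwise mask more of the input.
        hi, lo = (mid, lo) if beam[0][0] > 0.5 else (hi, mid)
    return beam[0]

print(mice_stage2("the movie was dull and slow", contrast_label="positive"))
```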
@alexisjross
Alexis Ross
3 years
@LakeBrenden @glmurphy39 Looks very interesting, looking forward to reading! (5) reminds me of work by @belindazli @Maxwell_Nye @jacobandreas showing that word reps encode changing entity states based on inputs. Wonder if this would also hold for facts like "Dolphins are mammals"
0
0
4
@alexisjross
Alexis Ross
9 days
In AdapT, a teacher aims to teach a target concept to a student who has unknown misconceptions. AdapT includes 3 domains: 1) fraction arithmetic 2) English verb conjugation 3) function learning. Students are both simulated & human. (3/n)
Tweet media one
1
0
5
@alexisjross
Alexis Ross
9 days
Using AdapT, we evaluate both GPT-4 and probabilistic teaching methods. We introduce AToM, which performs online inference of student priors, then selects informative teaching examples based on these inferences. (4/n)
1
0
4
@alexisjross
Alexis Ross
3 years
We present Minimal Contrastive Editing, or MiCE, a two-stage approach to generating contrastive explanations of model predictions. A MiCE explanation is a modification of an input that causes the model being explained to change its prediction to a given contrast prediction. 3/6
Tweet media one
1
0
3
@alexisjross
Alexis Ross
3 years
Finally, we show how MiCE edits can be used for two use cases in NLP system development -- discovering dataset artifacts (ex: IMDB edit below) and debugging incorrect model predictions (ex: RACE edit below). Feel free to reach out with any questions or comments! 6/6
Tweet media one
1
0
3
@alexisjross
Alexis Ross
4 years
More from the paper: Federal comment sites are not currently equipped to detect such automated submissions. The ease with which deepfake text can be created and used highlights the need for technological reforms (2/2)
0
0
3
@alexisjross
Alexis Ross
11 months
ARIES includes a synthetic training dataset of 3.9k examples and a manually-annotated test set of 196 examples, using computer science papers, reviews, and author responses from OpenReview.
1
0
2
@alexisjross
Alexis Ross
3 years
1
0
3
@alexisjross
Alexis Ross
2 years
@anmarasovic And soon after-work skiing?
1
0
3
@alexisjross
Alexis Ross
4 years
@hima_lakkaraju Thank you so much for all of your indispensable advising and support!!
0
0
3
@alexisjross
Alexis Ross
2 years
We realize many interviews in this cycle have already happened, but we hope this episode is still useful for people currently navigating visit days/PhD decisions (and for future applicants)!
0
0
3
@alexisjross
Alexis Ross
9 days
We also find that AToM makes more *accurate inferences* about student beliefs (both simulated & human) than GPT-4. It also selects *key teaching examples* (i.e. examples that target student misconceptions) earlier in teaching. (6/n)
Tweet media one
1
0
3
@alexisjross
Alexis Ross
11 months
In particular, GPT-4 likes to rigidly follow instructions & paraphrase the comment it is responding to, and it includes fewer technical details than real edits.
Tweet media one
1
1
3
@alexisjross
Alexis Ross
3 years
@lambdaviking @NYUDataScience This is the best PhD announcement yet
0
0
3
@alexisjross
Alexis Ross
2 years
We train 6 model types on NLI & commonsense QA with/without free-text rationales and measure robustness to spurious correlations through 1) challenge datasets 2) test sets where reliance on spurious correlations would lead to incorrect answers. 2/n
Tweet media one
1
0
2
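A minimal sketch of the kind of robustness measurement described in the tweet above, with toy data and a toy heuristic standing in for a trained model (nothing below comes from the paper): a predictor that leans on a spurious cue, here the presence of "not" signalling contradiction, looks fine on a standard test set but collapses on a challenge set built so the cue points the wrong way.

```python
# Toy illustration of measuring robustness to a spurious correlation.
# The "model" is a heuristic that predicts "contradiction" whenever the
# hypothesis contains "not" -- a cue that works on the standard test set
# below but is wrong by construction on the challenge set.

def heuristic_model(premise: str, hypothesis: str) -> str:
    return "contradiction" if "not" in hypothesis.split() else "entailment"

standard_set = [
    ("A dog is running.", "A dog is not running.", "contradiction"),
    ("A dog is running.", "An animal is running.", "entailment"),
]
challenge_set = [  # "not" present, but the correct label is entailment
    ("A dog is running.", "The dog is not standing still.", "entailment"),
    ("A dog is running.", "It is not true that the dog is sitting.", "entailment"),
]

def accuracy(dataset):
    correct = sum(heuristic_model(p, h) == y for p, h, y in dataset)
    return correct / len(dataset)

print("standard accuracy: ", accuracy(standard_set))   # looks fine (1.0)
print("challenge accuracy:", accuracy(challenge_set))  # collapses (0.0)
```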
@alexisjross
Alexis Ross
3 years
@joeddav So awesome! We should have brought papers to read on our HK hike
1
0
2
@alexisjross
Alexis Ross
3 days
@jwanglvy Great q! You're right; AToM is a probabilistic method with two main components: a set of Bayesian student models & a set of possible examples to choose from. It tracks student predictions and chooses both a student model & an optimal teaching example at each step. Hope that clarifies!
1
0
1
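A toy sketch of the loop described in the reply above, under my own simplifying assumptions rather than the authors' implementation: keep a posterior over a small set of candidate student models, update it online from the student's answers, and pick the next teaching example on which the remaining hypotheses disagree.

```python
# Toy AToM-style teaching loop (my own simplified assumptions, not the
# authors' code): a posterior over candidate student models is updated online
# from observed answers, and the next example is chosen to discriminate
# between the hypothesized rules.
from fractions import Fraction

def correct_rule(a, b, c, d):
    # correct fraction addition: a/b + c/d
    return Fraction(a, b) + Fraction(c, d)

def add_both_misconception(a, b, c, d):
    # hypothesized misconception: add numerators and denominators
    return Fraction(a + c, b + d)

student_models = {"correct": correct_rule, "add_both": add_both_misconception}
posterior = {name: 0.5 for name in student_models}   # uniform prior over hypotheses
pool = [(1, 2, 2, 4), (1, 3, 1, 3), (2, 3, 1, 6)]    # candidate teaching problems

def update(posterior, problem, observed_answer, eps=0.05):
    # Bayesian update: hypotheses that predict the observed answer gain mass.
    unnorm = {}
    for name, rule in student_models.items():
        likelihood = 1 - eps if rule(*problem) == observed_answer else eps
        unnorm[name] = posterior[name] * likelihood
    z = sum(unnorm.values())
    return {name: p / z for name, p in unnorm.items()}

def pick_example(pool):
    # choose the problem on which the candidate rules disagree
    # (most informative about which rule the student is using)
    return max(pool, key=lambda pr: len({rule(*pr) for rule in student_models.values()}))

# One step: the student answers 1/2 + 2/4 with 3/6, as in the tweet thread.
posterior = update(posterior, (1, 2, 2, 4), Fraction(3, 6))
print(posterior)             # probability mass shifts onto the misconception
print(pick_example(pool))    # next problem that separates the two rules
```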
@alexisjross
Alexis Ross
3 days
Cool new work by @seungwookh and @IdanShenfeld on aligning/personalizing models at decoding time!
@seungwookh
Seungwook Han
3 days
Stronger, simpler, and better! Introducing Value Augmented Sampling (VAS) - our new algorithm for LLM alignment and personalization that outperforms existing methods!
Tweet media one
5
31
123
0
0
4
@alexisjross
Alexis Ross
4 years
@delliott The assignments for this Harvard NLP class were to reproduce various papers and might be a good place to look:
0
0
2
@alexisjross
Alexis Ross
2 years
While results are model/task-specific, we observe some general trends: - Data: Improvements tend to be in lower-resource settings & self-rationalization can hurt in higher-resource settings - Model size: Within model families, larger models benefit more from rationales 3/n
Tweet media one
1
0
2
@alexisjross
Alexis Ross
2 years
We also find that *rationale content* affects results. Training with positive rationales in the ECQA dataset improves robustness, while using free-flow/negative rationales harms robustness. 4/n
Tweet media one
1
0
2
@alexisjross
Alexis Ross
2 years
0
0
2
@alexisjross
Alexis Ross
9 days
Consider the 3rd-grader who answers 1/2 + 2/4 = 3/6. A good teacher might immediately develop a hypothesis about the student's misconception -- over-generalizing the rule for multiplication (adding both numerators & denominators) -- which should influence their course of instruction. (2/n)
1
0
3
@alexisjross
Alexis Ross
1 year
0
0
2
@alexisjross
Alexis Ross
2 years
@aaronsteven You might also be interested in our work on Tailor (), which guides generation with control codes that derive from PropBank representations
1
0
2
@alexisjross
Alexis Ross
9 days
On the whole, our results point to complementary advantages of LLM-based teachers like GPT-4 and more structured models like AToM. An interesting direction for future work would be to look into combining these to get the best of both worlds--see our paper for more! (8/n)
1
0
2
@alexisjross
Alexis Ross
2 years
@tallinzen @RTomMcCoy @jacobandreas Congrats Tom!!
1
0
2
@alexisjross
Alexis Ross
11 months
When tasked with writing edits given a reviewer comment (e.g. as a writing assistant), GPT-4 often produces content that looks reasonable on a surface level, but has systematic differences from human-written edits.
1
0
2
@alexisjross
Alexis Ross
3 years
Experiments on classification/multiple-choice Q&A show that MiCE edits are not only contrastive, but also *minimal* and *fluent*, consistent with human contrastive edits. 5/6
Tweet media one
1
0
2
@alexisjross
Alexis Ross
1 year
@yanaiela working on it
0
0
2
@alexisjross
Alexis Ross
11 months
This indirectness is compounded by the fact that authors sometimes disagree with a request or get creative with how to address it. We find that GPT does worse at aligning indirect comments and non-compliant edits.
Tweet media one
2
0
2
@alexisjross
Alexis Ross
11 months
Identifying which comments correspond to which edits is hard, even for GPT-4. Many review comments are nuanced & indirect; instead of saying "You should use a more realistic dataset, like ImageNet", they might say "I am not convinced that the dataset used is realistic".
Tweet media one
1
0
2
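To make the alignment evaluation concrete, here is an illustrative sketch under an assumed data format (the comment and edit IDs are hypothetical, and this is not the ARIES evaluation code): given gold comment-to-edit alignment pairs and a model's predicted pairs, set precision, recall, and F1 quantify how well the model matches comments to the edits that address them.

```python
# Illustrative sketch (assumed format, not the ARIES evaluation code):
# score predicted comment-to-edit alignments against gold pairs with
# set precision / recall / F1.

gold_pairs = {("c1", "e3"), ("c2", "e1"), ("c3", "e2")}       # hypothetical IDs
predicted_pairs = {("c1", "e3"), ("c2", "e4"), ("c3", "e2")}  # one wrong match

def alignment_f1(gold: set, predicted: set) -> tuple[float, float, float]:
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

print(alignment_f1(gold_pairs, predicted_pairs))  # ~(0.667, 0.667, 0.667)
```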
@alexisjross
Alexis Ross
6 months
@neuranna Congratulations Anya!! We will miss you at MIT!
0
0
2
@alexisjross
Alexis Ross
1 year
@joeddav @KahlertSoC @UUtah @UtahNLP Huge congrats Joe!! So excited for you :)
1
0
2
@alexisjross
Alexis Ross
9 days
On the other hand, GPT-4 seems to have encoded information that is harder to represent in structured methods like AToM -- for example, that it may be easier for human students to learn from examples closer to the origin when trying to learn the weights of a line. (7/n)
Tweet media one
1
0
2
@alexisjross
Alexis Ross
9 days
We find that simulated students learn more efficiently with AToM than with GPT-4. In experiments with human students, both AToM & GPT-4 outperform random example selection. (5/n)
1
0
3
@alexisjross
Alexis Ross
2 years
@nlpnoah Aw thank you so much Noah! I am very grateful for your support!!
0
0
1