Tal Linzen Profile
Tal Linzen

@tallinzen

15,884
Followers
896
Following
658
Media
14,968
Statuses

Professor @nyuling and @NYUDataScience, research scientist @GoogleAI

NYC
Joined April 2009
@tallinzen
Tal Linzen
6 years
One of the major perks of academia is that you get to travel to exciting locations and spend time in different locations of corporate hotel chains working on your slides late at night.
42
390
4K
@tallinzen
Tal Linzen
4 years
I'm thrilled to announce that I'll be moving to NYU in September 2020 for a joint position in @NYUDataScience and @nyuling!
54
20
887
@tallinzen
Tal Linzen
3 years
As always, Lila Gleitman tells it like it is
Tweet media one
9
136
854
@tallinzen
Tal Linzen
3 years
I'll be recruiting a PhD student through the NYU Data Science program this year! Two areas I'm particularly interested in working on with a student in the next few years are:
15
276
839
@tallinzen
Tal Linzen
11 months
So a nice thing that just happened is I got tenure!
118
4
782
@tallinzen
Tal Linzen
5 years
If you were wondering what it's like to be a computational linguist: I got an email yesterday with the title "URGENT: NEED VERBS"
11
80
666
@tallinzen
Tal Linzen
2 years
Attention is all you need, except residual connections, layer norm, position embeddings, extra feedforward layers, multiple heads, and masking future words
16
48
667
@tallinzen
Tal Linzen
8 years
This reaction by Geoff Pullum to the astonishing progress in speech recognition is probably shared by most linguists (me included).
Tweet media one
9
276
396
@tallinzen
Tal Linzen
2 years
Please stop using BERT, or any other model that gets both the left and right context of a word, to simulate results from human left-to-right reading experiments! It's a non-starter even if BERT is a famous model and also easy to download! Thanks!
16
21
329
@tallinzen
Tal Linzen
4 years
I'll be recruiting a PhD student through the NYU Data Science program (to start in Fall 2021). Would love to chat about it "at" #acl2020nlp. Email me if interested in a computational psycholinguistics ∩ deep learning PhD (even if you're not attending the conference)!
13
102
315
@tallinzen
Tal Linzen
1 year
slides from my short #phildeeplearning talk yesterday, "What, if Anything, Can Large Language Models Teach Us About Human Language Acquisition?":
8
45
310
@tallinzen
Tal Linzen
2 years
At 10k followers I will reveal which type of neural network you're permitted to use to model human reading
9
557
173
@tallinzen
Tal Linzen
3 years
Time for the yearly reminder that, if you're fortunate enough to have more than one PhD offer, it is not frivolous to decide where to go based on the school's location!
7
19
307
@tallinzen
Tal Linzen
3 months
Very pleased to see this article in print! In a study with 2000 subjects, we track how people read syntactically complex sentences, and find that word predictability estimated from language models does a poor job of explaining the human data.
9
59
292
@tallinzen
Tal Linzen
6 years
Standard ML cycle:
1. Group 1 releases a dataset with massive biases, doesn't test baselines.
2. Groups 2 to 199 publish models with increasingly great results on dataset.
3. Group 200 shows that those models were probably just fitting the biases.
Seems good for everyone's CV.
@zacharylipton
Zachary Lipton
6 years
New paper with @dkaushik96 is up: ***How Much Reading Does Reading Comprehension Require? A Critical Investigation of Popular Benchmarks*** We establish sensible RC baselines, finding question- and passage-only models perform surprisingly well.
3
30
157
4
58
273
@tallinzen
Tal Linzen
3 years
Plea to please stop spreading the trope that you need to work 60/80/7000 hours a week to succeed as a tenure-track professor at a research university. I don't think there's good evidence for this, and I'm always skeptical that anyone actually works that much in a sustained way.
21
12
265
@tallinzen
Tal Linzen
2 years
sad that the charming misspelling phase only lasted about two weeks :/
Tweet media one
8
17
266
@tallinzen
Tal Linzen
1 year
This semester I've been teaching a seminar titled "Computational Linguistics and Cognitive Science". We've been reading a mix of cognitively-inspired evaluations of language models and papers using LMs as cognitive models. It's been fun
9
52
256
@tallinzen
Tal Linzen
1 year
Saying that syntax isn't real because language models work just fine without syntactic supervision is a little bit like saying that objects aren't real because vision transformers aren't designed with built-in object representations.
14
28
252
@tallinzen
Tal Linzen
2 years
Might get canceled for this controversial take but international conferences full of jetlagged individuals should have plentiful coffee available throughout the entire time
14
8
256
@tallinzen
Tal Linzen
4 years
My critical contribution to COVID research
Tweet media one
5
8
247
@tallinzen
Tal Linzen
5 years
So there's a Facebook model similar to BERT (). The paper has better experiments, e.g. this one varying the amount of data. I calculated that at this rate we'll need a corpus of 2.14e+29 tokens to get to human performance on MNLI. Get scraping!
Tweet media one
6
66
241
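The tweet's back-of-the-envelope extrapolation can be sketched as follows. All numbers here are invented placeholders (the real figures come from the paper's data-ablation experiment): fit accuracy against log10(training tokens), then solve for the token count at a hypothetical human-level accuracy.

```python
import math

# Hypothetical (accuracy, training tokens) points standing in for the
# paper's data-scaling results; assume a log-linear trend:
#   acc = a + b * log10(tokens)
points = [(0.70, 1e8), (0.75, 1e9), (0.80, 1e10)]

# Least-squares fit of accuracy on log10(tokens).
xs = [math.log10(n) for _, n in points]
ys = [acc for acc, _ in points]
n = len(points)
b = (n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)) / \
    (n * sum(x * x for x in xs) - sum(xs) ** 2)
a = (sum(ys) - b * sum(xs)) / n

human_acc = 0.92  # hypothetical human MNLI accuracy
tokens_needed = 10 ** ((human_acc - a) / b)
print(f"{tokens_needed:.2e} tokens")  # astronomically many under this fit
```

Because the target accuracy sits far above the fitted points and the trend is linear only in log(tokens), the required corpus size explodes exponentially, which is the joke's point.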
@tallinzen
Tal Linzen
4 years
Evergreen tweet: if you're an advanced graduate student, please have a website with your publications and CV. Some job opportunities may find you without you having to apply for them. Thanks!
7
36
228
@tallinzen
Tal Linzen
5 years
One of my favorite things about Twitter is how people started signaling that a tweet is the first one in a thread after it became unfashionable to just say "(thread)". Let me explain.
3
47
218
@tallinzen
Tal Linzen
3 years
The review that Marco Baroni and I wrote on syntax in neural networks and why linguists should care is now "officially published"
3
45
224
@tallinzen
Tal Linzen
2 years
Just to make sure this point doesn't get lost: the success of instruction-finetuning highlights the *limitation* of the self-supervised language modeling objective.
@quocleix
Quoc Le
2 years
New open-source language model from Google AI: Flan-T5 🍮 Flan-T5 is instruction-finetuned on 1,800+ language tasks, leading to dramatically improved prompting and multi-step reasoning abilities. Public models: Paper:
Tweet media one
40
494
2K
2
25
220
@tallinzen
Tal Linzen
2 years
Some work news: today is my first day as part-time research scientist at Google. It's been a great experience spending time at Google as visiting faculty and I'm glad to be able to continue working with the team going forward!
6
2
212
@tallinzen
Tal Linzen
2 years
There is a very narrow sweet spot on the spectrum between buying uncritically into AI hype and just reflexively saying "this is not real intelligence" to everything and I am fortunate that my views lie exactly in that sweet spot.
8
16
205
@tallinzen
Tal Linzen
2 years
Wild that "you don't need word order to understand English" is a common enough position in NLP that people need to publish papers showing that, contrary to common wisdom, "the dog ate the cat" does not mean the same thing as "the cat ate the dog"
11
10
205
@tallinzen
Tal Linzen
2 years
Years of parties in NYC:
me: I'm a linguist
them: that's so cool, I've learned a language once! my kid is learning a language!
Party last night in Tel Aviv:
me: I'm a linguist
them: oh that doesn't sound very interesting does it
5
5
202
@tallinzen
Tal Linzen
1 year
Very grateful to @NSF, and the American taxpayers, for a CAREER award supporting our work on evaluating and improving structural generalization by neural networks:
9
6
201
@tallinzen
Tal Linzen
4 years
. @raquel_dmg and I are organizing CoNLL 2020! Unlike previous years, where the scope of the conference was quite broad, this year we explicitly invite submissions focused on "theoretically, cognitively and scientifically motivated approaches to computational linguistics".
@conll_conf
CoNLL 2023
4 years
Follow this account for updates about CoNLL 2020. Please read the call carefully as it has changed substantially compared to previous editions of the conference!
1
15
37
4
50
197
@tallinzen
Tal Linzen
3 months
Very nice idea for a language model benchmark: can the model learn to translate a language (Kalamang, spoken by a small number of people in New Guinea) into English just from a grammar written by a field linguist?
6
34
195
@tallinzen
Tal Linzen
1 year
Overheard on the line: "you guys are in line for a philosophy lecture? oh my god"
Tweet media one
4
12
186
@tallinzen
Tal Linzen
5 years
If you ever need to have an LSTM diagram in a LaTeX document, may I suggest you steal (with credit!) this Tikz diagram from Luzi Sennhauser's paper on LSTMs' ability to learn Dyck languages
Tweet media one
4
30
184
@tallinzen
Tal Linzen
3 years
Would Zoom let you share your screen if you didn't say, before doing it, "let me just share my screen"? Has anyone tried?
6
8
184
@tallinzen
Tal Linzen
4 years
For international conferences conducted in English where talks are pre-recorded and captioned, is there a good reason not to allow presentations in the authors' native language (assuming they'd be responsible for producing English subtitles)?
8
16
181
@tallinzen
Tal Linzen
4 years
Fascinating set of findings on syntactic generalization in neural language models:
* amount of training data barely affects generalization
* transformers are much better than LSTMs
* perplexity is uncorrelated with syntactic generalization
+ much more
5
47
180
@tallinzen
Tal Linzen
1 year
As larger language models get better at predicting the next word their word predictions seem to align less closely with human predictions (at least as reflected in reading times). This new paper from Byung-Doh Oh & William Schuler looks into why:
6
28
176
@tallinzen
Tal Linzen
2 years
Does anyone ever refer to computers as "machines" besides cognitive scientists applying for grants on "minds, brains and machines"?
41
1
172
@tallinzen
Tal Linzen
4 years
Professorial
Tweet media one
4
4
170
@tallinzen
Tal Linzen
5 months
Actually, I can't think of a plausible functional definition of "storing" that wouldn't apply to a situation where the LLM reproduces large chunks of text.
@Afinetheorem
Kevin A. Bryan
5 months
NYT/OpenAI lawsuit completely misunderstands how LLMs work, and judges getting this wrong will do huge damage to AI. Basic point: LLMs DON'T "STORE" UNDERLYING TRAINING TEXT. It is impossible- the parameter size of GPT-3.5 or 4 is not enough to losslessly encode the training set.
199
286
2K
13
12
169
@tallinzen
Tal Linzen
2 years
Congratulations, Dr. @RTomMcCoy !
Tweet media one
8
6
167
@tallinzen
Tal Linzen
6 months
It's the time of year where people are deciding which Ph.D. programs to apply to. I thought I'd mention some recent examples of work from my group, generally at the LLM / cognitive science intersection, to help you decide if it's a good fit:
1
34
167
@tallinzen
Tal Linzen
4 years
Sometimes Firefox inexplicably switches my default search engine to Bing
Tweet media one
5
5
165
@tallinzen
Tal Linzen
5 years
Massive network trained on a word prediction objective successfully does random thing X: "the language modeling objective is magical and can teach you everything" Same network fails to do random thing Y: "this is an unfair test, you never trained it to do Y"
5
19
162
@tallinzen
Tal Linzen
4 years
Smart, Karen from HP customer support just passed the Turing test
Tweet media one
5
1
161
@tallinzen
Tal Linzen
4 years
As we near the ACL camera-ready deadline, here's a checklist that will help you make sure the paper looks nice and the repo is maintained even after you've graduated and left to pursue a professional surfing career in the Philippines. Did I miss anything?
3
38
157
@tallinzen
Tal Linzen
2 years
Everyone keeps talking about academic perks like grant reports and student evaluations but having a PhD student write a great dissertation and land the job of their dreams feels pretty good too! Would recommend.
3
3
157
@tallinzen
Tal Linzen
6 years
nice finding - attention-only language models are significantly worse than recurrent models on long-distance syntactic dependencies despite having comparable perplexity:
Tweet media one
2
63
151
@tallinzen
Tal Linzen
6 years
LaTeX will be able to handle URLs with underscores by 2076
@s8mb
Sam Bowman
6 years
AI will outperform humans at... translating languages by 2024; Writing high-school essays by 2026; driving a truck by 2027; working in retail by 2031; writing a bestselling book by 2049; working as a surgeon by 2053. (from a survey of 352 AI experts)
41
89
187
3
10
150
@tallinzen
Tal Linzen
5 years
On this note, I should mention that @yoavgo and I just received funding from the United States-Israel Binational Science Foundation to continue our psycholinguistics/neural networks/syntax collaboration!
13
3
151
@tallinzen
Tal Linzen
2 years
A lot of grad school and job applications are due today. A little known fact that might help calm your nerves a little: advisors and search committees rarely read the reference letters on the day they're due, and your application won't be rejected if a letter is late.
6
24
151
@tallinzen
Tal Linzen
2 years
It was great to host @chrmanning as part of the NLP speaker series at @NYUDataScience earlier this week! Very interesting work on detecting emergent syntactic structure in transformers.
Tweet media one
6
17
147
@tallinzen
Tal Linzen
7 years
we'd like to thank Reviewer 2 for performing a simulation showing we were wrong + uploading the relevant 216 lines of code as supp materials
5
20
146
@tallinzen
Tal Linzen
5 years
My favorite piece of advice to give to PhD-minded senior undergraduates is "you don't need to do your PhD straight after undergrad; you don't need to move to a place you don't want to live in; your class is not a relevant comparison group". Love the visible relief on their faces
2
12
146
@tallinzen
Tal Linzen
7 years
theoretical linguists don't live tweet, they fax 14-page handouts to each other
@SllaLab
SLLA Lab
7 years
@tallinzen Is anyone live tweeting NELS? I haven’t seen anything other than your msg
2
0
3
6
22
142
@tallinzen
Tal Linzen
2 years
Read this paper! The intuition is that speakers who are trying to be informative are unlikely to say something that's entirely entailed by what they just said. So we can use the probability distribution over sequences of sentences to determine which sentences entail which.
@lambdaviking
Will Merrill
2 years
[1/6] Excited to share a year-long project re: theory of language understanding in LMs w/ @a_stadt , @tallinzen TLDR: Judging entailments (NLI) can be reduced to LMing over "Gricean data"* ∴ Learning distribution (perfectly) => learning semantics
2
37
270
5
14
144
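The informativity intuition in the tweet above can be sketched with a deliberately toy example. Everything here is invented for illustration (the probability table stands in for a real language model's sequence probabilities, and the threshold rule is a simplification, not the paper's actual construction): if a cooperative speaker rarely follows a sentence with one it already entails, then an unusually low continuation probability is evidence of entailment.

```python
# Invented conditional probabilities P(hypothesis follows premise),
# standing in for a real language model over sentence sequences.
TOY_LM = {
    # entailed continuation -> uninformative -> rarely uttered
    ("A dog barked loudly.", "A dog barked."): 0.001,
    # non-entailed continuation -> informative -> plausible
    ("A dog barked loudly.", "A cat ran by."): 0.02,
}

def continuation_prob(premise: str, hypothesis: str) -> float:
    """P(hypothesis follows premise) under the toy LM."""
    return TOY_LM[(premise, hypothesis)]

def judged_entailed(premise: str, hypothesis: str, threshold: float = 0.005) -> bool:
    # A suspiciously low continuation probability is read as entailment.
    return continuation_prob(premise, hypothesis) < threshold

assert judged_entailed("A dog barked loudly.", "A dog barked.")
assert not judged_entailed("A dog barked loudly.", "A cat ran by.")
```

The point of the sketch is only the direction of the inference: distributional facts about what cooperative speakers say next can, in principle, be converted into semantic judgments.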
@tallinzen
Tal Linzen
4 years
- Can you hear me?
- I can hear you but can't see you
- Let me restart Zoom
- How about I turn off my headphones?
5 minutes later:
- Any progress on the project?
- No, still busy with end-of-semester stuff
- Nothing on my end either
- Talk again next week?
- Ok
3
3
144
@tallinzen
Tal Linzen
1 year
@LakeBrenden everything is terrible
4
1
140
@tallinzen
Tal Linzen
5 years
Open Sesame: Getting Inside BERT’s Linguistic Knowledge (realistically the only appropriate name for a BERTology paper; from my collaborator Bob Frank and his students)
4
19
135
@tallinzen
Tal Linzen
5 years
Sometimes I worry that my students are getting too good and there's nothing I can teach them anymore and then I notice they used an en-dash instead of an em-dash in a manuscript and it makes me feel better.
4
3
138
@tallinzen
Tal Linzen
6 years
Never use the passive voice, except when you're citing a paper by someone with a Dutch surname and you can't figure out how to get Bibtex to capitalize "van"
12
17
135
@tallinzen
Tal Linzen
7 years
An email I got from a linguist who played with an almost-SOTA deep learning textual entailment system (TE tab in )
Tweet media one
10
58
133
@tallinzen
Tal Linzen
1 month
Will you be at Princeton next Wednesday? Do you like Understanding? I have just the event for you!
Tweet media one
10
6
132
@tallinzen
Tal Linzen
5 years
Based on a random sample of tweets from the last couple of hours I estimate that the NAACL acceptance rate this year is between 92% and 95%.
6
7
130
@tallinzen
Tal Linzen
4 years
Trying to explain to my students why people who work on text generation in NLP keep reporting progress on metrics that are famously uncorrelated with human judgments instead of dropping everything and developing better metrics and I'm not sure what to say.
16
11
129
@tallinzen
Tal Linzen
4 years
Satisfying faculty experience #438: in a meeting with a senior grad student and a more junior student, being able to sit back, relax, and occasionally nod enthusiastically as Senior Grad Student makes better comments than I possibly could.
0
3
130
@tallinzen
Tal Linzen
7 years
Paper shows LSTM LMs beat newer architectures, makes everyone feel better about checking their arXiv feed less often
3
24
130
@tallinzen
Tal Linzen
11 months
I love Twitter
Tweet media one
5
0
127
@tallinzen
Tal Linzen
2 years
Academics: I have a 5/5 teaching load. I work 178 hours a week. when am I supposed to exercise. work life balance
Also academics: going to spend the day figuring out this clunky twitter alternative even though twitter is exactly as usable as it was last week. hashtag resist
3
4
130
@tallinzen
Tal Linzen
2 years
Today's 🧵! A lot of recent work in psycholinguistics and cognitive neuroscience appears to assume strong convergence between human predictions during sentence comprehension and the predictions of neural language models.
3
16
129
@tallinzen
Tal Linzen
4 years
I wrote a position piece for the ACL (as part of the "taking stock" thematic session) that argues for a greater focus on generalization in evaluating natural language understanding models:
2
22
129
@tallinzen
Tal Linzen
2 years
Apply to be a Faculty Fellow at the NYU Center for Data Science! Those are 2-year positions with a very light teaching load and a lot of independence. We'll review applications in two rounds, first starting November 1, and then again in late December.
3
63
128
@tallinzen
Tal Linzen
5 years
I'm often asked if my name is short for something. Gmail's Smart Compose is so smart it doesn't even have to ask!
Tweet media one
2
6
126
@tallinzen
Tal Linzen
5 years
Excited to share this news from my lab
Tweet media one
6
9
122
@tallinzen
Tal Linzen
5 years
I received my US permanent resident card (green card) yesterday, at the end of a process that started more than two years ago. Very grateful to the colleagues who have written letters of support for me!
4
1
122
@tallinzen
Tal Linzen
5 years
I went back and forth for a bit on how to write the date on a form for the Israeli consulate in the US then decided it could wait until 9/9
3
5
122
@tallinzen
Tal Linzen
3 months
Undoubtedly one of the more creative LLM reasoning evals I've seen
Tweet media one
2
14
123
@tallinzen
Tal Linzen
1 year
Today I'm grateful for the era where AI companies still published conference papers because I can assign my students an 8-page reading instead of a 800-page arXiv brain dump.
5
5
120
@tallinzen
Tal Linzen
6 years
You realize that there are some downsides to open bar receptions at conferences when the next morning you get an email titled "Thanks for agreeing to be on my thesis committee!"
1
2
119
@tallinzen
Tal Linzen
4 years
I'm toast
Tweet media one
2
1
121
@tallinzen
Tal Linzen
2 years
Actually, maybe 1% of linguists hate each other but they are responsible for 99% of acrimonious tweets and response letters. The rest of us read and appreciate a wide range of approaches to studying language.
@FelixHill84
Felix Hill
2 years
News from linguistics: everyone hates each other. Also, don't whatever you do use modern methods to understand something as important as language.
3
0
14
5
6
119
@tallinzen
Tal Linzen
6 years
Just submitted my second grant proposal this week. I think I deserve some likes.
4
0
119
@tallinzen
Tal Linzen
5 years
Thanks for the support, @GoogleAI !
@GoogleAI
Google AI
5 years
Congratulations to the recipients of the 2018 Google Research Awards! This round we received over 900 proposals from universities across the globe, covering a diverse set of research areas such as HCI, machine perception, distributed computing and more.
8
102
474
4
4
117
@tallinzen
Tal Linzen
3 months
for my talk next week at the University of Maryland, I have good news and bad news
Tweet media one
3
4
116
@tallinzen
Tal Linzen
7 months
Have a PhD in an area related to ML/data science, or will soon? You should consider the @NYUDataScience Faculty Fellows program (deadline Nov 28) - it's a great deal - full research independence, low teaching load, good salary and (probably) cheap housing!
1
38
119
@tallinzen
Tal Linzen
7 years
Neural reading comprehension systems can be fooled by adding an irrelevant sentence to the text: Good quote:
Tweet media one
3
53
118
@tallinzen
Tal Linzen
8 years
Turns out a 2-layer LSTM with 8192 units per layer, trained for 3 weeks on 32 GPUs, makes a great language model
2
64
116
@tallinzen
Tal Linzen
6 years
this is really therapeutic
Tweet media one
3
12
115
@tallinzen
Tal Linzen
6 years
The program for BlackboxNLP, the EMNLP workshop on analyzing and interpreting neural networks for NLP, is now online. The workshop will have 3 invited speakers, 8 contributed talks and 48 posters:
5
35
116
@tallinzen
Tal Linzen
2 years
solidarity with everyone who's spending their sunday in their high school friend group chat dealing with AI sentience nonsense
2
11
116
@tallinzen
Tal Linzen
2 years
Sequence-to-sequence models are incapable of compositional generalization to inputs that require novel mixing and matching of syntactic structures (Weißenhorn, Yao, Donatelli, @alkoller ): (h/t @najoungkim )
4
19
114
@tallinzen
Tal Linzen
5 years
- US B1/2 visa
- US F1 visa
- Managed to rent a place in NYC without a credit score
- French work visa
- French health insurance card (maybe hardest)
- Managed to rent a place in Paris without a CDI (don't ask)
- New US B1/2 visa
- H1B visa
- Another H1B visa
- Got US green card
4
1
111
@tallinzen
Tal Linzen
2 years
Honored and humbled to receive the ACL shitposting award! Many thanks to all members of the ACL but especially my PhD students who helped me workshop my posts on the lab slack!
6
1
112
@tallinzen
Tal Linzen
4 years
Is there a formal definition of what it means for a language model to "know" something? E.g. which of the following scenarios counts as knowing that Paris is the capital of France?
12
15
112
@tallinzen
Tal Linzen
1 year
I'll reveal my opinions about the Chomsky op-ed if and when I get tenure
3
0
111