Dieuwke Hupkes Profile
Dieuwke Hupkes

@_dieuwke_

1,980 Followers
239 Following
47 Media
356 Statuses

Joined September 2017
@_dieuwke_
Dieuwke Hupkes
6 months
📣 New LLM Benchmark alert! We propose *WorldSense*, a cognitively inspired, _unbiased_, synthetically generated benchmark to test grounded reasoning and tacit world models in LLMs. How do LLMs fare? (spoiler: not great!) [1/n]
Tweet media one
9
37
240
@_dieuwke_
Dieuwke Hupkes
4 months
Are you a PhD student interested in memorisation, generalisation and the role of data in the era of LLMs? Come do an internship with me at @AIatMeta ! (Send me a ping if you apply)
15
41
237
@_dieuwke_
Dieuwke Hupkes
3 years
Compositionality in NNs is usually tested using artificial data. But in natural language, it is not that simple! How can we test for compositionality in the wild? In our paper, we evaluate NMT models trained on real, unfiltered natural language. (1/11)
Tweet media one
3
48
235
@_dieuwke_
Dieuwke Hupkes
5 years
Curious what people may mean when they say a neural network is (not) compositional? And how that relates to linguistics and philosophy literature on compositionality? Check our new paper on compositionality in neural networks: !
Tweet media one
6
37
191
@_dieuwke_
Dieuwke Hupkes
10 months
Many people discover mistakes in commonly used NLP eval datasets, but there is no protocol to report and correct those mistakes -- until now! We are introducing the *That is good data* repo, a place to record eval data issues. Head over to to have a look!
4
33
180
@_dieuwke_
Dieuwke Hupkes
2 years
I'm excited about this!
@stanfordnlp
Stanford NLP Group
2 years
For tomorrow's Stanford NLP seminar, we're excited to host @_dieuwke_ Hupkes who will discuss how to evaluate generalisation in neural models for NLP, using compositional generalisation as a case study. Registration: , Abstract:
Tweet media one
4
19
105
3
15
172
@_dieuwke_
Dieuwke Hupkes
3 years
Very happy with this #CoNLL2021 paper, but maaaybe even more thrilled that this is my first paper with an ALL-women author list (5 of them!)💃 😍. It was really great working together with this team of 🦸‍♀️🦸‍♀️ from @facebookai @UvA_Amsterdam and @EdinburghNLP @AmsterdamNLP
@vernadankers
Verna Dankers
3 years
Predicting the plural of German nouns boils down to choosing from 6 suffixes: -(e)n, -e, -∅, -er, -s. In our new #CoNLL2021 paper, we investigate how a recurrent encoder-decoder model chooses the suffix. But first, why is this an interesting task? (1/12)
Tweet media one
1
16
99
2
5
76
@_dieuwke_
Dieuwke Hupkes
1 year
Are you interested in language, generalisation, evaluation and what on earth ChatGPT might be doing? I'm looking for a PhD student! #NLProc You can apply here: Ping me to let me know you've applied!
0
18
69
@_dieuwke_
Dieuwke Hupkes
1 year
Join us at #blackboxNLP for @lena_voita 's keynote! @MetaAI
Tweet media one
1
11
67
@_dieuwke_
Dieuwke Hupkes
4 years
Tomorrow will be the first virtual edition of BlackboxNLP! We have an exciting programme with three keynotes, six oral presentations and three packed poster sessions. Check for more information. @emnlp2020 @gchrupala @boknilev @yuvalpi @afraalishahi
2
13
66
@_dieuwke_
Dieuwke Hupkes
2 years
#NLProc What do you think are the most convincing examples of generalisation failures in recent NLP models -- i.e. cases where they perform poorly on data that is (slightly) different from their training data? References to support appreciated!
9
15
61
@_dieuwke_
Dieuwke Hupkes
3 years
In the making of this work, we had many (sometimes heated!) discussions about linguistics, what rules actually are (!?), what NNs can even tell us about linguistics and human processing... I'm so thrilled that this resulted in this award, thanks for this honour @conll_conf ! (1/4)
@conll_conf
CoNLL 2023
3 years
And the winner of the Best Paper award is: Generalising to German Plural Noun Classes, from the Perspective of a Recurrent Neural Network By Verna Dankers, Anna Langedijk, Kate McCurdy, Adina Williams and Dieuwke Hupkes
0
14
89
1
7
48
@_dieuwke_
Dieuwke Hupkes
5 years
Very proud of this honourable mention at #CoNLL2019 ! 😊 Thanks! @AmsterdamNLP @conll2019
@conll2019
CoNLL 2019
5 years
We also had 2 honorable mentions for each best paper category: Honorable mentions for best paper: 1) by Koshorek, Stanovsky, Zhou, Srikumar, Berant 2) by Jumelet, Zuidema, Hupkes
2
10
32
0
3
41
@_dieuwke_
Dieuwke Hupkes
6 months
This was such a blast! 🤩😍 Thanks @adinamwilliams @tatsu_hashimoto and @annargrs for your valuable contributions, and all my amazing co-organisers, @vernadankers @c_christodoulop on site and @ryandcotterell @khuyagbaatar_b & @Amirhossein offline
@GenBench
GenBench
6 months
That's a wrap for #genbench2023 ! But good news: We'll be back at EMNLP next year for #genbench2024 😍 Keep an eye on our websites for slides and recordings of our keynote speakers.
Tweet media one
0
4
32
0
4
38
@_dieuwke_
Dieuwke Hupkes
3 years
Can't wait for #blackboxNLP to start tomorrow! 🤩 We have an exciting programme with 6 orals, 55 posters & 3 keynotes, repeated twice for timezone-friendliness. Join me at 14:00 Beijing / 7am Amsterdam on Zoom for the virtual opening, or catch a later session!
1
8
38
@_dieuwke_
Dieuwke Hupkes
1 year
Do you have opinions about generalisation in NLP or exciting ideas on how to evaluate it (in LLMs or other models)? Submit to the first GenBench workshop on (benchmarking) generalisation in NLP @emnlpmeeting ! More info here: #emnlp2023 #NLProc
@GenBench
GenBench
1 year
The 1st GenBench workshop () is calling for work on generalisation in NLP! Submit your paper to the regular track, or submit your data + paper to our 💥collaborative benchmarking task (CBT)💥 before September 1. Will we see you at #EMNLP2023 ? 1/7
1
11
31
1
3
30
@_dieuwke_
Dieuwke Hupkes
2 years
Thanks @wellecks for the invitation and your great questions! It's an honour to be on your podcast :)
@thesisreview
The Thesis Review Podcast
2 years
Episode 36 of The Thesis Review: Dieuwke Hupkes ( @_dieuwke_ ), "Hierarchy and Interpretability in Neural Models of Language Processing" We discuss her PhD work on compositionality in language (and how to define it), RNNs as explanatory models, & more
Tweet media one
1
5
54
0
2
30
@_dieuwke_
Dieuwke Hupkes
5 years
I'm presenting at Computational Cognition 2019 today, held in the botanical gardens in Osnabrueck! Thanks to the #comco19 organisers for putting together what looks like a great program! @comco2019
0
0
26
@_dieuwke_
Dieuwke Hupkes
7 months
Couldn't be more proud that this work was published in Nature Machine Intelligence! 😍 Big thanks to all my collaborators:)
@AIatMeta
AI at Meta
7 months
Published today in Nature Machine Intelligence — GenBench is an effort led by AI researchers at Meta that aims to make state-of-the-art generalization testing the new status quo for NLP work. Read more about the work in @NatMachIntell ➡️
Tweet media one
10
104
354
2
3
25
@_dieuwke_
Dieuwke Hupkes
2 years
Thanks to everyone who tirelessly (or at least so it seemed 😏) worked with me to make this happen!
@GenBench
GenBench
2 years
Ever wanted to know more about generalisation in NLP but overwhelmed with the number of papers on ArXiv? Fear not! We read 400+ papers, 600+ experiments, and designed a taxonomy 📝 to categorise the research for you! (1/n) 🧵
Tweet media one
5
138
523
1
0
21
@_dieuwke_
Dieuwke Hupkes
3 years
Looking forward to being on this panel! Thanks for the invite!
@MRQA_workshop
MRQA Workshop
3 years
Alongside our speakers, we'll also welcome @boydgraber and @LukeZettlemoyer for a Multilingual QA panel, and @_dieuwke_ @DanRothNLP and Michael Collins for a panel on Interpretability
Tweet media one
1
2
13
1
0
20
@_dieuwke_
Dieuwke Hupkes
3 years
For instance, when I used a SOTA MT system to translate a COVID certificate from French to English, I got promoted to GODkwe Hupkes (in French, Dieu=God). One could say the translation was… too compositional! (3/11)
Tweet media one
2
1
17
@_dieuwke_
Dieuwke Hupkes
1 year
So excited about this! 😍😍
@GenBench
GenBench
1 year
🎉 Super happy to announce that our proposed workshop “GenBench: The first workshop on (benchmarking) generalisation in NLP” will take place at #EMNLP2023 🤩 🧵 (1/5)
Tweet media one
1
10
66
0
2
16
@_dieuwke_
Dieuwke Hupkes
4 years
The official launch of 30 new #ELLIS units is about to start! I'm very excited to see the presentations of all the different units. Watch it here with me:
@ELLISforEurope
ELLIS
4 years
The countdown is on: join us today as we officially launch 30 #ELLIS units in 14 countries across Europe! Details and links to the live stream here:
2
30
88
0
2
14
@_dieuwke_
Dieuwke Hupkes
3 months
GenBench keynotes and panel discussions are live on YouTube! 😍
@GenBench
GenBench
3 months
The GenBench workshop at #EMNLP2023 had 3 fantastic keynotes and a great panel discussion. Did you miss them, or would you just like to listen to them again? Visit our YouTube channel 📺 ( ) to watch @annargrs talk about "A sanity check...
1
3
7
0
2
14
@_dieuwke_
Dieuwke Hupkes
6 months
Have a look at the paper for more details! Work done at @AIatMeta Core WorldSense team: @megdrv @marksibrahim @pascal20100 @benchek_youssef @_dieuwke_ & Emmanuel Dupoux
Tweet media one
1
1
14
@_dieuwke_
Dieuwke Hupkes
6 months
Overall, our conclusion on whether LLMs learned and successfully use tacit world models remains negative. To allow further exploration, we release our data, including our finetuning dataset and various within- and out-of-distribution test sets. [8/n]
1
1
11
@_dieuwke_
Dieuwke Hupkes
1 year
🤩 Very happy about this new LLM eval work with @xenia_ohmer and @eliabruni in which we use multi-sense (the Fregean one) consistency to evaluate LLMs' understanding in the face of data contamination #NLP @MetaAI
@xenia_ohmer
Xenia Ohmer
1 year
📝When does an #LLM really understand a task? We propose *multi-sense consistency* as a novel paradigm for evaluating understanding in LLMs. 1/9🧵work with @eliabruni & @dieuwkehupkes @metaAI #NLProc
Tweet media one
2
4
16
1
1
13
@_dieuwke_
Dieuwke Hupkes
6 months
@mmitchell_ai Excellent point, and of course unbiased doesn't exist. What we did do is make a deliberate effort to ensure the responses aren't biased, and that the answers are decorrelated from the questions. Having people with an actual cogsci background on board pays off! 🔥
2
0
13
@_dieuwke_
Dieuwke Hupkes
3 years
#blackboxNLP call for papers! Deadline: August 5. We also accept submissions via @ReviewAcl . Hope to see many of your submissions!
0
0
13
@_dieuwke_
Dieuwke Hupkes
6 months
I'm so excited! 😍😍
@GenBench
GenBench
6 months
The GenBench workshop is about to start in Central Ballroom 3; see you there! After the opening remarks, at 9.15AM, Anna Rogers kicks off with the first keynote.
Tweet media one
0
0
3
0
0
12
@_dieuwke_
Dieuwke Hupkes
5 years
@wzuidema kicking off our Lorentz workshop on compositionality with an interesting talk on what it is, who has it, and how it evolved.
Tweet media one
1
2
12
@_dieuwke_
Dieuwke Hupkes
3 years
Very proud of this publication in Cognition @ELSneuroscience and happy with the opportunity to collaborate with such a diverse team! @facebookai @AmsterdamNLP @unimib @NeuroSpin_91 @UPFBarcelona @icreacommunity Special thanks to @lakretz for driving this research.
@lakretz
Yair Lakretz
3 years
[1/10] Happy to announce our new manuscript just published at Cognition, in which we compare recursive processing in humans and artificial neural networks. With @_dieuwke_ , @MarelliMar , Alessandra Vergallito, Marco Baroni and @StanDehaene
Tweet media one
3
22
70
0
3
12
@_dieuwke_
Dieuwke Hupkes
11 months
I have many questions about how generalisation should be evaluated in modern NLP models (and even about what its importance is). Think you got any answers? Or more informed questions? 😁 Submit to the GenBench workshop and/or collaborative benchmarking task!
@GenBench
GenBench
11 months
The Collaborative Benchmarking Task is now accepting submissions🚀 ! The CBT is hosted by the 1st GenBench workshop (), to be held at #EMNLP2023 on December 6! A recap of our CfP: 1/3
1
7
12
0
3
12
@_dieuwke_
Dieuwke Hupkes
1 year
Looking forward to this! 😍
@cambridgenlp
CambridgeNLP
1 year
This Friday, we have another great NLIP seminar scheduled: Dieuwke Hupkes ( @_dieuwke_ ) will be giving a talk titled "GenBench -- State-of-the-art generalisation research in NLP". The talk will take place online on Zoom at 12pm GMT. Info at
1
7
28
0
0
12
@_dieuwke_
Dieuwke Hupkes
2 years
Very biased of course, but absolutely yes! 🤩I would love a workshop on evaluating generalisation in NLP! 💪🦾
@GenBench
GenBench
2 years
📣Hey #NLProc ! We are planning to organise a *ACL workshop on generalisation (benchmarking) in #NLP . Goals: provide a stage to discuss SOTA evaluation & create+agree on a set of challenging generalisation tests for the near future. Interested? 💪🙌
0
20
30
0
1
11
@_dieuwke_
Dieuwke Hupkes
1 year
I was very happy to be a part of this year's #blackboxNLP ! If you missed it, check out our YouTube Channel!
@jasmijnbastings
Jasmijn Bastings
1 year
#BlackboxNLP now has a YouTube channel! This year's keynotes from @lena_voita , @catherineols , and @davidbau are already there, and more stuff is coming up! ↘️ #NLProc #XAI #interpretability (posted this already yesterday on Mastodon and LinkedIn)
0
11
68
0
0
11
@_dieuwke_
Dieuwke Hupkes
3 years
Don't miss @vernadankers ' presentation at @conll_conf later today! 15:10 Punta Cana / 20:10 Amsterdam / 19:10 Edinburgh Little teaser in the corresponding image: she's a graphics master! @AmsterdamNLP @AIatMeta @EdinburghNLP
Tweet media one
@vernadankers
Verna Dankers
3 years
Predicting the plural of German nouns boils down to choosing from 6 suffixes: -(e)n, -e, -∅, -er, -s. In our new #CoNLL2021 paper, we investigate how a recurrent encoder-decoder model chooses the suffix. But first, why is this an interesting task? (1/12)
Tweet media one
1
16
99
1
4
11
@_dieuwke_
Dieuwke Hupkes
2 years
Excited to be a part of this!
@raphaelmilliere
Raphaël Millière
2 years
Excited to announce this two-day online workshop on compositionality and AI co-organized with @GaryMarcus with a stellar line-up of speakers! Program (TBC) & registration: 1/
Tweet media one
7
65
237
0
0
11
@_dieuwke_
Dieuwke Hupkes
3 years
One clear take-away: “are neural networks compositional” might not be the right question! Instead, we should focus on their level of compositionality. And what even is the right level? I'm not sure we know! (Szabó, 2012; García-Ramírez, 2019; Jacobson, 2002; Nefdt, 2020) (10/11)
1
0
10
@_dieuwke_
Dieuwke Hupkes
2 years
I'm pretty excited about this research direction! The experiments are still very early on the roadmap and there are tons of different options for next steps -- would love to hear what you think!
@maartjeterhoeve
Maartje ter Hoeve
2 years
Happy to share a preprint of the work I did on Interactive Language Modeling during my internship at FAIR w/ @n0mad_0 , @_dieuwke_ and Emmanuel Dupoux! 😃 Link: This is pioneering work, so we are looking forward to your feedback! (1/3)
Tweet media one
1
8
78
0
1
10
@_dieuwke_
Dieuwke Hupkes
3 years
First session tomorrow: @willemzuidema on Language models, brains & interpretability. Curious about the video his first slide promises! @AmsterdamNLP @UvA_Science @UvA_Amsterdam
Tweet media one
1
0
9
@_dieuwke_
Dieuwke Hupkes
6 months
WorldSense has three types of problems – inference problems, consistency problems and completeness problems – each with their own “trivial” control. Importantly, the dataset is unbiased, and can’t be solved with heuristics or shallow statistical patterns! [2/n]
Tweet media one
2
1
9
@_dieuwke_
Dieuwke Hupkes
5 months
If you missed her poster, try to find @vernadankers later at the conf! I am absolutely sure she will have something interesting to tell you 😁
@vernadankers
Verna Dankers
5 months
At 11AM in PS5, I'll be presenting "Memorisation cartography: Mapping out the memorisation-generalisation continuum in neural machine translation" with dream team @iatitov & @_dieuwke_ . Memorisation in neural networks is concerning but also needed... (1/5)
Tweet media one
1
7
47
0
0
9
@_dieuwke_
Dieuwke Hupkes
3 years
Great round-up! (Small disclaimer: I might be biased because the first paper discussed on the EACL list is @LucWeberr 's paper about studying language models as multi-task learners! 😍)
@seb_ruder
Sebastian Ruder
3 years
EACL, ICLR, NAACL papers round-up, Research reality checks, ML on code New newsletter with papers from recent conferences, reflections on research progress (Transformer + optimizer improvements), and an overview of ML on code.
1
51
218
0
2
9
@_dieuwke_
Dieuwke Hupkes
6 months
We tested GPT3.5, GPT4 and Llama-70B-chat. Only GPT4 got reasonable accuracy on two of the trivial controls, but… [3/n]
Tweet media one
1
1
7
@_dieuwke_
Dieuwke Hupkes
5 years
Interested in biases and information flow in neural language models? The very bright @JumeletJ generalised Contextual Decomposition for LMs -- check this Twitter thread for a taste of the results!
@JumeletJ
Jaap Jumelet
5 years
How does information flow in neural language models? Which inputs are causally related to which outputs? What biases does such a network have? Check my new CoNLL paper on 'Generalized Contextual Decomposition' for LSTMs w/   @_dieuwke_ and @wzuidema !
2
18
75
0
0
8
@_dieuwke_
Dieuwke Hupkes
3 years
I really enjoyed my lecture for @SG_UU yesterday! My thanks to the organisers for the invitation and the excellent organisation, to the audience for the interaction and interesting questions, and to Laura Mol for chairing!
@SG_UU
Studium Generale UU
3 years
Watch along now!
0
0
0
0
0
8
@_dieuwke_
Dieuwke Hupkes
2 years
I like this thread about strong and weak compositionality!
@AndrewLampinen
Andrew Lampinen
2 years
To summarize, it is important to not conflate strong and weak compositionality. I share with the authors their interest in the fact that the world has structure and people generalize over the structure. But that doesn’t mean either are strongly compositional.
1
0
13
0
1
7
@_dieuwke_
Dieuwke Hupkes
7 months
@aclmeeting
ACL 2024
7 months
#EMNLP2024 @emnlpmeeting (3) NLP4Science: The 1st Workshop on Natural Language Processing for Science (4) 7th Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2024) (5) GenBench: The 2nd workshop on generalisation (benchmarking) in NLP #NLProc
1
1
3
0
0
7
@_dieuwke_
Dieuwke Hupkes
3 years
Intriguing to see that LSTMs and Transformers, really quite different architectures in many respects, have such striking similarities when it comes to syntactic processing! (see also this recent tweet by @_jasonwei : )
@lakretz
Yair Lakretz
3 years
[1/7] Happy to announce our new manuscript, in which we show how brittle syntactic processing is in transformer-based language models, compared to humans. With @DesbordesTheo , @_dieuwke_ and @StanDehaene
Tweet media one
3
14
74
0
0
7
@_dieuwke_
Dieuwke Hupkes
2 years
Just learned about this awesome account because @wzuidema is running it this week! 🤩 If you understand Dutch at all, follow it to get his always interesting takes on developments in AI, interpretability and probably more!
@NL_Wetenschap
@NL_Wetenschap
2 years
Here is a thread, in English, that I wrote in early 2021 about the rise of the Transformer model from natural language processing into other subfields of AI.
2
0
2
0
0
7
@_dieuwke_
Dieuwke Hupkes
6 months
@annargrs engaging with the crowd @GenBench #EMNLP2023 Curious about your opinion on this question!
Tweet media one
0
1
6
@_dieuwke_
Dieuwke Hupkes
2 years
If you are going to ACL, @vernadankers should absolutely be high on your list of people to have a chat with!
@vernadankers
Verna Dankers
2 years
ACL week has started and I'm happy as a clam!🤩If idioms fascinate you too: stop by poster session 1 tomorrow at 11AM (interpretability session), where I will present work with @iatitov and Chris Lucas on idioms in NMT! Some highlights... 1/9
Tweet media one
1
18
177
0
0
7
@_dieuwke_
Dieuwke Hupkes
5 years
They do overgeneralise! Convolution-based models and Transformers first overgeneralise and then memorise the exception; LSTMs have difficulty accommodating both rules and exceptions. (For the impact of the exception frequency, check the paper!)
Tweet media one
1
1
7
@_dieuwke_
Dieuwke Hupkes
5 months
Thanks for this honour @conll_conf ! 😍
@LucWeberr
Lucas Weber (on the 👨🏼‍💻 market)
5 months
Yey! We ( @_dieuwke_ , @eliabruni and me) received an honourable mention for our paper at #CoNLL2023 😃
Tweet media one
Tweet media two
1
3
25
0
0
7
@_dieuwke_
Dieuwke Hupkes
1 year
Missed this great talk but still curious (you should be!)? @lena_voita 's info is on the slides; she also has a number of interesting blog posts about her work! #blackboxNLP
Tweet media one
1
1
7
@_dieuwke_
Dieuwke Hupkes
6 months
Using more complex prompting strategies such as chain-of-thought (CoT) or in-context learning (ICL) hardly improves performance. [6/n]
Tweet media one
1
0
6
@_dieuwke_
Dieuwke Hupkes
6 months
👎 None of the models had good performance on the regular experiments, and the average WorldSense accuracies do not even reach 80% [4/n]
Tweet media one
1
0
6
@_dieuwke_
Dieuwke Hupkes
3 years
This work follows a research direction very important to me: trying to learn more about human language processing through interpretability and meticulous analysis of NNs learning tasks that involve (allegedly?) important aspects of human processing, such as rule learning. (2/4)
1
0
6
@_dieuwke_
Dieuwke Hupkes
3 years
I'm already looking forward!
0
0
6
@_dieuwke_
Dieuwke Hupkes
6 months
Also, models have *massive* response biases, preferring particular answers regardless of which question was even asked 😮 [5/n]
Tweet media one
1
0
6
@_dieuwke_
Dieuwke Hupkes
3 years
Last but not least: the team for this paper was amazing! @adinamwilliams , @vernadankers , Anna and Kate, it was great to work with you and learn about your perspectives on these questions we all find interesting, coming from quite different backgrounds. Thank you! (4/4)
0
0
5
@_dieuwke_
Dieuwke Hupkes
3 years
Whoops, I stand corrected! Sorry @raquel_dmg for forgetting about this one! 😱
0
0
5
@_dieuwke_
Dieuwke Hupkes
3 years
I’m very excited about this direction of work, and grateful for the opportunity to work on this at @facebookAI , together with the amazing @vernadankers and @eliabruni . (and thanks to my beautiful cats for posing for the systematicity example) (11/11)
1
0
5
@_dieuwke_
Dieuwke Hupkes
3 years
SO proud of my friend, colleague and ex #MoL student Ryan Nefdt for being nominated for the @jcileaders top 10 outstanding young persons of South Africa, calling him "one of the most nationally awarded young academics" #illc @UvA_Amsterdam @AmsterdamNLP
1
0
5
@_dieuwke_
Dieuwke Hupkes
2 years
So cool to see the contributions of this diverse @Ellis_Amsterdam group recognised in this cool work! 🤩
@JumeletJ
Jaap Jumelet
2 years
Honoured to be author #147 on this paper 😆 Last year we participated in the BIG-bench challenge of 14 students from the UvA, ranging from bachelor- to PhD level, supervised by @glnmario , @_dieuwke_ , and me. We ended up submitting 3 exciting tasks that are now part of BIG bench!
1
5
23
1
0
5
@_dieuwke_
Dieuwke Hupkes
3 years
We started! Thrilled to listen to this very interesting presentation from Willem Zuidema. Thanks @wzuidema for joining us at 7am your time!
Tweet media one
2
0
5
@_dieuwke_
Dieuwke Hupkes
3 years
Great news, I'm very excited to see what comes out of this!
@wzuidema
Jelle Zuidema (@[email protected])
3 years
Some exciting news: we've been awarded a large grant on Interpreting deep learning models for language, speech & music! We = an amazing group of academics, companies and not-for-profits here in the NL + me. The grant = a National Science Agenda grant from @NWONieuws .
Tweet media one
13
15
178
0
0
5
@_dieuwke_
Dieuwke Hupkes
6 months
Finetuning on WorldSense data does help (and shows WorldSense data is not prone to memorisation!), even for some types of OOD generalisation, but even finetuned models do not exceed 80% accuracy. [7/n]
Tweet media one
1
0
5
@_dieuwke_
Dieuwke Hupkes
1 year
@GenBench eval cards are live! 😍 I am very excited about this. Let's put some systematicity in the way we talk about generalisation research in NLP!
Tweet media one
@GenBench
GenBench
1 year
Writing a paper on generalisation for #ACL2023 ? Consider using our easy to parse evaluation cards to impress your reviewers! It just takes < 1 min to generate online! 🧵 (1/4)
1
13
30
1
1
5
@_dieuwke_
Dieuwke Hupkes
5 months
Thanks for this honour @conll_conf ! 😍
@KaiserWhoLearns
Kaiser Sun
6 months
Honorable mention at #CoNLL2023 , thank you so much😆! @adinamwilliams @_dieuwke_
Tweet media one
6
5
65
0
0
4
@_dieuwke_
Dieuwke Hupkes
6 years
@tallinzen @glnmario @wzuidema Thank you! That was a great surprise at the end of a great workshop!
2
0
4
@_dieuwke_
Dieuwke Hupkes
3 years
In our analysis we bring together a number of different techniques I have used and/or proposed in the past, when I was still doing my PhD with @wzuidema at @UvA_Amsterdam , such as diagnostic classifiers and diagnostic interventions. (3/4)
1
0
4
@_dieuwke_
Dieuwke Hupkes
6 months
Can't wait for the GenBench workshop next week! 🤩
@GenBench
GenBench
6 months
📢Coming up on Dec 6: the GenBench workshop! Our workshop ends with a panel discussion about generalisation evaluation in NLP. Submit your burning questions ⁉️ for our experts over the next few days!! Reply to this post or submit & upvote via , event 20000!
0
1
3
0
0
4
@_dieuwke_
Dieuwke Hupkes
3 years
For instance, merely changing the spelling of airplane to aeroplane has unexpected consequences 15+ words away from this change! (8/11)
Tweet media one
2
0
4
@_dieuwke_
Dieuwke Hupkes
3 years
@_jasonwei Nice! It lines up well with our 2018 findings that when SV agreement goes wrong in LSTMs this seems to be caused by the subject encoding, and can in fact largely be fixed by doing an "intervention" there! Funny to see that this is the same for BERT!
Tweet media one
1
0
4
@_dieuwke_
Dieuwke Hupkes
3 years
@roger_p_levy @koustuvsinha Thanks for sharing that paper! I wasn't aware of it.
1
0
3
@_dieuwke_
Dieuwke Hupkes
3 years
@MacJobanputra I think indeed that instead of translating "Dieuwke" as a whole, it translated the subword "Dieu" + "wke". Interesting connection also with the faithful vs robust question posed by Prasanna Parthasarathi, @koustuvsinha , Joelle Pineau and @adinamwilliams
0
0
3
@_dieuwke_
Dieuwke Hupkes
3 years
@gneubig Took us a bit longer to write proper READMEs 👀. Here we go: . Let us know if you have questions of course!
0
0
3
@_dieuwke_
Dieuwke Hupkes
2 years
Interesting read, would love to see more work like this!
@percyliang
Percy Liang
2 years
5/ New benchmarks ideally would be clear about which goal(s) they intend to cover. These thoughts are based on work with @nelsonfliu Tony Lee @robinomial (see section 6.2 of on direct versus indirect improvements).
0
3
37
0
0
3
@_dieuwke_
Dieuwke Hupkes
2 years
@adinamwilliams @JenniferCWhite @MetaAI It was great working with you! Egocentrically almost glad the project is not completely wrapped up yet so we still get to stay in touch 😊
0
0
3
@_dieuwke_
Dieuwke Hupkes
1 year
Badr Abdullah on acoustic word embeddings #blackboxnlp
Tweet media one
1
0
3
@_dieuwke_
Dieuwke Hupkes
3 years
To translate adequately, neural MT models do not just need to be compositional; they often also need to take context into account to produce correct translations. This does not always go right! (2/11)
1
2
3
@_dieuwke_
Dieuwke Hupkes
4 years
@JumeletJ will present his work on diagNNose at BlackboxNLP in 5 minutes -- come watch his presentation in the workshop's Zoom session!
@wzuidema
Jelle Zuidema (@[email protected])
4 years
Jaap is releasing his diagNNose library today, and presenting it at BlackboxNLP! It has some features that you don't find in other recent interpretability toolkits & includes support for diagnostic classifiers (a.k.a. 'probes') and (Shapley-based) contextual decomposition. Yay!
0
1
15
0
0
3
@_dieuwke_
Dieuwke Hupkes
3 years
This is also something we see in our substitutivity test: changing a word from British to American English has an effect on the translation of completely unrelated clauses further on in the sentence! (7/11)
Tweet media one
1
0
3
@_dieuwke_
Dieuwke Hupkes
2 years
And thanks @MetaAI and @riedelcastro and for giving me the opportunity to run a project with such a diverse group of people all over the world! 🙏
0
0
3
@_dieuwke_
Dieuwke Hupkes
3 years
Laura Aina on how LMs process temporary syntactic ambiguities (or: garden path sentences)
Tweet media one
1
0
2
@_dieuwke_
Dieuwke Hupkes
4 years
0
0
3
@_dieuwke_
Dieuwke Hupkes
4 years
@boknilev @mjpost @tallinzen Softconf is not the most user-friendly tool to do this (you can check the GitHub repository with the long list of instructions). Perhaps after doing it a few times it becomes more obvious, but it easily took me a few hours that I would indeed have preferred to spend more usefully.
2
0
2
@_dieuwke_
Dieuwke Hupkes
3 years
First live oral presentation by Hendrik Schuff on their paper "Does External Knowledge Help Explainable Natural Language Inference? Automatic Evaluation vs. Human Ratings"
Tweet media one
1
0
2
@_dieuwke_
Dieuwke Hupkes
5 years
But what does it actually mean for a neural network to be compositional? What definition of compositionality is used? And how does this relate to the principle of compositionality and the vast amount of literature about this topic?
1
0
2
@_dieuwke_
Dieuwke Hupkes
2 years
@machelreid @BastingsJasmijn @myrthereuver I like it too, but I'm still wondering if there is a way we could do that with just one phone or laptop. It seems to me that different breakout rooms on one laptop wouldn't really solve this. I really appreciate these ideas though
0
0
2
@_dieuwke_
Dieuwke Hupkes
3 years
@gneubig 1. Oh yes you are right! Apologies. Give me a day or two to move our follow up experiments and I'll resend you the link! 2. Great, thanks!
1
0
2
@_dieuwke_
Dieuwke Hupkes
1 year
0
0
2
@_dieuwke_
Dieuwke Hupkes
2 years
@GaryMarcus Coincidence indeed! Will your talk also be physically at Stanford somewhere or only online?
1
0
2
@_dieuwke_
Dieuwke Hupkes
3 years
The ILLC is a great place to work!
@wzuidema
Jelle Zuidema (@[email protected])
3 years
Amsterdam's Institute for Logic, Language and Computation is looking for a tenure-track assistant professor of *Responsible* Artificial Intelligence, focusing on language, music, reasoning, or other ILLC themes. Application deadline this Sunday!
0
7
11
0
1
2
@_dieuwke_
Dieuwke Hupkes
5 years
Tweet media one
1
0
2
@_dieuwke_
Dieuwke Hupkes
5 years
@boknilev @ACL2019_Italy @AriannaBisazza @LIACS @JhuCogsci One of my questions for the #BlackboxNLP panel, specifically addressing (computational) linguists
Tweet media one
0
1
2