This is a super cool resource: Papers With Code now includes 950+ ML tasks, 500+ evaluation tables (including SOTA results), and 8,500+ papers with code. Probably the largest collection of NLP tasks I've seen, including 140+ tasks and 100 datasets.
My PhD thesis Neural Transfer Learning for Natural Language Processing is now online. It includes a general review of transfer learning in NLP as well as new material that I hope will be useful to some.
Why You Should Do NLP Beyond English
7000+ languages are spoken around the world but NLP research has mostly focused on English. In this post, I give an overview of why you should work on languages other than English.
10 Exciting Ideas of 2018 in NLP: A collection of 10 ideas that I found exciting and impactful this year—and that we'll likely see more of in the future.
Do you often find it cumbersome to track down the best datasets or the state-of-the-art for a particular task in NLP? I've created a resource (a GitHub repo) to make this easier.
I'm excited to share some personal news: I've successfully defended my dissertation "Neural Transfer Learning for Natural Language Processing". I'm grateful for my time at @_aylien and @insight_centre and for everyone I got to meet on this journey, both online and offline.
"What are the 3 biggest open problems in NLP?"
We asked experts a few simple but big questions for the NLP session at the @DeepIndaba. We're now happy to share the full responses from Yoshua Bengio, @redpony, @RichardSocher, and many others
I've decided to leave Google DeepMind to pursue a new adventure.
I feel incredibly lucky to have had the chance to work with and learn from so many amazing colleagues and mentors over the last 4 1/2 years.
I'm grateful & excited for what's next!
ML and NLP Research Highlights of 2020
It's been inspiring to look back on all the exciting advances that happened despite such a tumultuous year. Here's a selection of my highlights.
10 Things You Need to Know About BERT and the Transformer Architecture That Are Reshaping the AI Landscape
This super comprehensive post by @cathalhoran covers most things that are important in current NLP, including BERT, transfer learning, and avocado chairs 🥑
NLP Year in Review — 2019
An extensive list of interesting publications, creative and societal applications, tools and datasets, articles, and resources of 2019 by @omarsar0.
10 Tips for Research and a PhD
I've been asked in the past to provide advice on doing research. Here are 10 tips that worked well for me and will hopefully also be useful to others.
Recent Advances in Language Model Fine-tuning
New blog post that takes a closer look at fine-tuning, the most common way large pre-trained language models are used in practice.
I'm excited to announce that I've joined @cohere to help make LLMs more multilingual!
It's crazy how the capabilities of NLP models have evolved over the past few years. I'm thrilled to work with a team full of smart, dedicated, and kind individuals to push the boundaries of LLMs.
New blog post: The State of Transfer Learning in NLP
A review of key insights and takeaways from our NAACL 2019 tutorial with updates based on recent work.
This is a nice diagram by Zhengyan Zhang and @BakserWang that shows how many recent pretrained language models are connected. The GitHub repo contains a full list of relevant papers:
The Transformer encoder visualized
A nice visualization and tutorial of the Transformer encoder layers by @UlfMertens. It incorporates the batch dimension, resulting in 3D tensors.
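To make those shapes concrete, here's a minimal PyTorch sketch (my own, not from the tutorial) of the 3D tensors flowing through an encoder layer; the dimensions are illustrative, and it assumes a recent PyTorch with batch_first support:

```python
import torch
import torch.nn as nn

# Shapes are (batch, sequence length, model dimension) throughout.
batch_size, seq_len, d_model = 32, 128, 512

encoder_layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=8, batch_first=True
)
x = torch.randn(batch_size, seq_len, d_model)  # embedded input tokens
out = encoder_layer(x)
print(out.shape)  # torch.Size([32, 128, 512]) -- the layer preserves the shape
```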
This is a *really* extensive repo containing ~380 BERT-related papers sorted into downstream tasks, modifications, probes, multilingual models, and more. Nice job, @stomohide!
Some professional news: last week was my last at DeepMind. DeepMind is an amazing place to do impactful, long-term research, and I'm grateful to have had the chance to work alongside so many kind, smart, and inspiring people.
Multi-Task Learning with Deep Neural Networks: A Survey
I learned a lot reading this comprehensive overview by @CrichaelMawshaw. It categorizes recent work into architecture design, optimization methods, and task relationship learning.
Papers with Code now has badges to put on your GitHub repo that indicate that your model is state-of-the-art. 🏅 This seems like a great way to incentivize open-sourcing code! I hope we'll see a lot more badges to highlight useful implementations. 🥇🥈🥉
🎉 New feature: State-of-the-art GitHub badges. Submit evaluation results from your paper to obtain a badge for the official GitHub repository. A new way to highlight your paper's performance!
Our RemBERT model (ICLR 2021) is finally open-source and available in 🤗 Transformers.
RemBERT is a large multilingual Transformer that outperforms XLM-R (and mT5 with similar # of params) in zero-shot transfer.
Docs:
Paper:
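A minimal usage sketch with 🤗 Transformers (assuming the checkpoint is published under the hub ID google/rembert):

```python
from transformers import AutoTokenizer, AutoModel

# Load RemBERT from the Hub (checkpoint ID assumed here).
tokenizer = AutoTokenizer.from_pretrained("google/rembert")
model = AutoModel.from_pretrained("google/rembert")

inputs = tokenizer("RemBERT is multilingual.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, hidden_size)
```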
If you want to learn about privacy-preserving machine learning, then there is no better resource than this step-by-step notebook tutorial by @iamtrask.
From the basics of private deep learning to building secure ML classifiers using PyTorch & PySyft.
It's been a while…
Here's a new edition of NLP News containing an ML and NLP starter toolkit, a Low-resource NLP toolkit, and discussions of "Can an LM ever understand natural language?" and the next generation of NLP benchmarks.
(via @revue)
New blog post: A Review of the Recent History of Natural Language Processing. The 8 biggest milestones in the last ~15 years of #NLProc. From our NLP session at @DeepIndaba.
The multilingual BERT model is out now (earlier than anticipated). It covers 102 languages and features an extensive README motivating certain preprocessing and modelling choices.
Transfer learning is increasingly going multilingual with language-specific BERT models:
- 🇩🇪 German BERT
- 🇫🇷 CamemBERT , FlauBERT
- 🇮🇹 AlBERTo
- 🇳🇱 RobBERT
If you're doing anything with NLP, this is a great place to start! A PyTorch library of state-of-the-art pretrained Transformer language models featuring BERT, GPT-2, XLNet, and more.
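As a quick illustration, here's a minimal feature-extraction sketch with the library (API as of the pytorch-transformers era; the library was later renamed transformers):

```python
import torch
from pytorch_transformers import BertModel, BertTokenizer

# Load the pretrained model and its tokenizer.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

# Encode a sentence and extract contextual representations.
input_ids = torch.tensor([tokenizer.encode("Transfer learning is here to stay.")])
with torch.no_grad():
    last_hidden_state = model(input_ids)[0]  # (1, seq_len, 768)
```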
The New Era of NLP (SciPy 2019 Keynote): This is a great presentation by @math_rachel that focuses on transfer learning and discusses one of the most important problems of our time: disinformation and information glut.
Code and pretrained weights for BERT are out now.
Includes scripts to reproduce results. BERT-Base can be fine-tuned on a standard GPU; for BERT-Large, a Cloud TPU is required (the maximum batch size that fits in 12-16 GB of memory is too small).
New NLP News: BERT, Transfer learning for dialogue, Deep Learning SOTA 2019, Gaussian Processes, VI, NLP lesson curricula, lessons, AlphaStar, How to manage research teams, and lots more (via @revue)
It's great to see the growing landscape of NLP transfer learning libraries:
- pytorch-transformers by @huggingface:
- spacy-pytorch-transformers by @explosion_ai:
- FARM by @deepset_ai
This is a super intuitive (and well illustrated) guide to state-of-the-art Transfer Learning methods in NLP. From the author of the superb Illustrated Transformer post.
Pretrained language models are not only applicable to natural language but also to other domains where sequences have an underlying structure, such as genomics. We can get better performance with more meaningful token representations (e.g. using k-mers instead of nucleotides).
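For illustration, here's a minimal sketch of k-mer tokenization (a hypothetical helper, not tied to any specific genomics paper; k=3 for readability):

```python
def kmer_tokenize(sequence, k=3, stride=1):
    """Split a DNA sequence into overlapping k-mers, yielding more
    meaningful tokens than single nucleotides."""
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, stride)]

print(kmer_tokenize("ATGCGTAC"))
# ['ATG', 'TGC', 'GCG', 'CGT', 'GTA', 'TAC']
```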
A Survey on Recent Approaches for Natural Language Processing in Low-Resource Scenarios
This survey is a great starting point for learning about low-resource NLP, common methods, and open challenges.
Work by @jannikstroetgen, @MicHedderich, and @dklakow.
In our new survey “Modular Deep Learning”, we provide a unified taxonomy of the building blocks of modular neural nets and connect disparate threads of research.
📄
📢
🌐
w/ @PfeiffJo, @licwu, and @PontiEdoardo
New blog post: Unsupervised cross-lingual representation learning
An overview of learning cross-lingual representations without supervision, from the word level to deep multilingual models. Based on our ACL 2019 tutorial.
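As one concrete word-level example, here's a sketch of the orthogonal Procrustes step used to align two embedding spaces once translation pairs have been induced (toy random data; the snippet is mine, not from the tutorial):

```python
import numpy as np

# Toy data: rows are word vectors for induced translation pairs
# (in unsupervised methods the pairs are bootstrapped, not given).
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 300))  # source-language embeddings
Y = rng.standard_normal((100, 300))  # target-language embeddings

# Orthogonal Procrustes: find the rotation W minimizing ||X @ W.T - Y||.
U, _, Vt = np.linalg.svd(Y.T @ X)
W = U @ Vt
X_aligned = X @ W.T  # source vectors mapped into the target space
```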
Next week, I'll start as a research scientist at @DeepMindAI in London, where I'll be working on models for general linguistic intelligence. I'm thrilled about what lies ahead and looking forward to continuing to be part of this amazing community.
Here are the slides of my talk on Transfer learning with language models at the Belgium NLP meetup last week. I tried to distill our current understanding of what LMs capture.
Command R+ (⌘ R+) is our most capable model (with open weights!) yet! I’m particularly excited about its multilingual capabilities. It should do pretty well in 10 languages (English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, and Chinese).
You can…
Today, we’re introducing Command R+: a state-of-the-art RAG-optimized LLM designed to tackle enterprise-grade workloads and speak the languages of global business.
Our R-series model family is now available on Microsoft Azure, and coming soon to additional cloud providers.
From today, I’ll be at Google Research where I’ll be working on NLP for under-represented languages, with a particular focus on languages in Sub-Saharan Africa. I’m looking forward to helping make NLP more accessible together with colleagues at Google.
1/ Our paper Episodic Memory in Lifelong Language Learning with Cyprien de Masson d'Autume, @ikekong, and @DaniYogatama was accepted to @NeurIPSConf. We go beyond MTL and tackle lifelong learning where models need to acquire new information continually:
My new blog post takes a look at the state of multilingual AI.
🌍 How multilingual are current models in NLP, vision, and speech?
🏛 What are the recent contributions in this area?
⛰ What challenges remain and how can we address them?
Thoughts on the 2024 AI Job Market
Some thoughts on AI research jobs in 2024, how the nature of research has changed in the era of LLMs, and why I joined @cohere.
Natural Questions: A new QA dataset consisting of 300,000+ naturally occurring questions (posed to Google search) with human-provided long & short answers based on Wikipedia. Looks like an exciting new benchmark!
Paper:
Competition:
Are you interested in data-to-text generation (generating text based on structured data, e.g. tables or graphs)?
@rvaaau has added a nice overview of standard datasets and recent models to NLP Progress. 👏
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding:
SOTA on 11 tasks. Main additions:
- Bidirectional LM pretraining w/ masking
- Next-sentence prediction aux task
- Bigger model, more data
It seems LM pretraining is here to stay.
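For intuition, here's a simplified sketch of the masked-LM corruption (BERT's actual scheme also replaces 10% of selected tokens with random words and keeps 10% unchanged):

```python
import random

def mask_tokens(tokens, mask_prob=0.15, mask_token="[MASK]"):
    """Randomly mask tokens for bidirectional LM pretraining (simplified)."""
    inputs, labels = [], []
    for token in tokens:
        if random.random() < mask_prob:
            inputs.append(mask_token)
            labels.append(token)   # predict the original token
        else:
            inputs.append(token)
            labels.append(None)    # no loss on unmasked positions
    return inputs, labels

tokens = "language model pretraining is here to stay".split()
print(mask_tokens(tokens))
```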
New paper with @mattthemathman & @nlpnoah on adapting pretrained representations: We compare feature extraction & fine-tuning with ELMo and BERT and try to give several guidelines for adapting pretrained representations in practice.
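In PyTorch terms (a sketch of the two regimes, not the paper's code), the difference boils down to whether the pretrained encoder's weights receive gradients:

```python
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")

# Feature extraction: freeze the pretrained encoder and train only a
# task-specific head on top of its fixed representations.
for param in model.parameters():
    param.requires_grad = False

# Fine-tuning: unfreeze everything and update all weights,
# typically with a small learning rate.
for param in model.parameters():
    param.requires_grad = True
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
```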
This is a super useful paper that we need more of: Better ImageNet models are not necessarily better feature extractors (ResNet is best); but for fine-tuning, ImageNet performance is strongly correlated with downstream performance.
Transfer learning with language models is getting hot! 🔥New state-of-the-art results today by two different research groups: Trinh and Le (Google) on the Winograd challenge and Radford et al. (OpenAI) on a diverse range of tasks.
A Primer on Pretrained Multilingual Language Models
This survey is a great starting point to learn about anything related to state-of-the-art multilingual models in NLP.
New NLP News: NLP Progress, Retrospectives and look ahead, New NLP courses, Independent research initiatives, Interviews, Lots of resources (via @revue)
Microsoft reports that they've achieved human parity on Chinese-to-English translation (27.40 BLEU; 1 BLEU better than the best result of WMT 2017). The model is a Transformer (NIPS 2017) + Dual Learning (NIPS 2016) + Deliberation Nets (NIPS 2017).
A new bigger, better language model by @OpenAI:
- Scaled-up version of their Transformer (10x params)
- Trained on 10x more curated data (40 GB of Reddit outbound links w/ >2 karma)
- SOTA on many LM-like tasks
- Discusses potential for malicious use
Curriculum for Reinforcement Learning
"Learning is probably the best superpower we humans have."
@lilianweng explores four types of curricula that have been used to help RL models learn to solve complicated tasks.
NLP News: ICLR 2021 Outstanding Papers, Char Wars, Speech-first NLP, Virtual conference ideas
Featuring a round-up of @iclr_conf best papers, ideas for fun things to do at virtual conferences, Star Wars references 🛸, and more...
Super interesting tutorial on visualization for ML at #NeurIPS2018 w/ case study on multilingual embedding visualization (at 1:40:29). First evidence I've seen that a multilingual NMT system brings languages together rather than separating them.
The new study by @colinraffel et al. provides a great overview of best practices in the current transfer learning landscape in NLP. Check out page 33 of the paper or below for the main takeaways.
I really like the new Methods section in @paperswithcode to find applications and similar methods.
For language models in NLP, you can see at a glance the most common LMs and explore the papers that employ them.
ACL 2022 Highlights ☘️
My highlights of #acl2022nlp, including language diversity and multimodality, prompting, the next big ideas, and my favorite papers.
Besides the obvious things (ELMo, BERT, etc.), is there anything that we should definitely discuss at the NAACL "Transfer Learning in NLP" tutorial? Anything that is under-appreciated in transfer learning?
Currently working on the upcoming NAACL "Transfer Learning in NLP" tutorial with @seb_ruder, @mattthemathman, and @swabhz. Pretty excited!
And I've discovered you can write a Transformer model like GPT-2 in less than 40 lines of code now!
40 lines of code & 40 GB of data...
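In that spirit, here's a compact sketch (my own, not the referenced 40-line implementation) of the core ingredient of a GPT-2-style decoder, causal self-attention:

```python
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    """One masked (causal) self-attention layer."""

    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x):                      # x: (batch, seq, d_model)
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Split heads: (batch, heads, seq, d_head)
        q, k, v = (t.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
                   for t in (q, k, v))
        att = (q @ k.transpose(-2, -1)) / self.d_head ** 0.5
        # Causal mask: each position may only attend to itself and the past.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        att = att.masked_fill(mask, float("-inf")).softmax(dim=-1)
        out = (att @ v).transpose(1, 2).reshape(B, T, C)
        return self.proj(out)

x = torch.randn(2, 10, 64)
print(CausalSelfAttention()(x).shape)  # torch.Size([2, 10, 64])
```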
Tutorial on Unsupervised Deep Learning at #NeurIPS2018.
NLP part starts at 1:16:00. Still sizable gap between unsupervised vs. supervised pretraining in CV. Lots of progress in NLP, but not entirely satisfactory. A general principle is still missing.
If you're interested in interpretability and better understanding #NLProc models 🔎, read this excellent TACL '19 survey by @boknilev. Clearly covers important research areas.
Paper:
Appendix (categorizing all methods):
Challenges and Opportunities in NLP Benchmarking
Recent NLP models have outpaced the benchmarks used to test them. I provide an overview of challenges and opportunities in this blog post.
I'm really excited about our new paper with @PfeiffJo, @licwu & IGurevych.
We propose MAD-X, a new adapter-based framework to adapt multilingual models to low-resource languages and languages that were not covered in their training data.
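The core building block is the bottleneck adapter. Here's a generic sketch of such a layer (not the paper's exact code), inserted into an otherwise frozen pretrained Transformer:

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """A bottleneck adapter: a small down/up projection with a residual
    connection. MAD-X trains separate per-language and per-task adapters
    of this kind; sizes here are illustrative."""

    def __init__(self, hidden_size=768, bottleneck_size=48):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)

    def forward(self, hidden_states):
        # Only the adapter's few parameters are trained per language/task.
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))

h = torch.randn(2, 10, 768)
print(Adapter()(h).shape)  # torch.Size([2, 10, 768])
```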
New NLP News: ML on code, Understanding RNNs, Deep Latent Variable Models, Writing Code for NLP Research, Quo vadis, NLP?, Democratizing AI, ML Cheatsheets, Spinning Up in Deep RL, Papers with Code, Unsupervised MT, Multilingual BERT (via @revue)
It's amazing how fast #NLProc is moving these days.
We have now reached super-human performance on SWAG, a commonsense task that will only be introduced at @emnlp2018 in November!
We need even more challenging tasks!
BERT:
SWAG:
NLP News—Reviewing, Taking stock, Theme papers, Poisoning and stealing models, Multimodal generation
This newsletter took a bit longer. Going forward, I'll try to cover some themes more in-depth. (via @revue)
Are you interested in summarization?
@tbsflk compiled the results on the most common datasets (CNN/DailyMail, Gigaword, DUC04 Task 1) from 2015-2018. 👏🏻
🚀 Excited to present a tutorial on "Modular and Parameter-Efficient Fine-Tuning for NLP Models" at #EMNLP2022 with @PfeiffJo & @licwu.
We'll give an overview of common methods, benefits and usage scenarios, and how to adapt pre-trained LMs to real-world low-resource settings.
My AAAI 2019 Highlights—including dialogue, reproducibility, question answering, the Oxford style debate, invited talks, and a diverse set of research papers