Victor Sanh

@SanhEstPasMoi

8,582
Followers
2,369
Following
188
Media
2,320
Statuses

Dog sitter by day, Scientist at @huggingface 🤗 by night

New York City
Joined May 2012
Pinned Tweet
@SanhEstPasMoi
Victor Sanh
26 days
New multimodal model in town: Idefics2! 💪 Strong 8B-parameter model: often on par with open 30B counterparts. 🔓Open license: Apache 2.0. 🚀 Strong improvement over Idefics1: +12 points on VQAv2, +30 points on TextVQA while having 10x fewer parameters. 📚 Better data:…
5
73
273
@SanhEstPasMoi
Victor Sanh
1 year
We are reproducing Flamingo, a vision and language model developed by DeepMind (). We spent a good amount of time fighting training divergences (aka "instabilities"). Surprisingly, even at the ~2-3B scale. Some learnings from overcoming these 🧵:
Tweet media one
26
251
1K
@SanhEstPasMoi
Victor Sanh
5 years
A few weeks ago, a friend of mine asked me "Which papers can I read to catch up with the latest trends in modern NLP?". 🏃‍♂️👨‍🎓 I compiled a list of papers and resources for him 📚 and thought it would be great to share it!
11
418
1K
@SanhEstPasMoi
Victor Sanh
5 years
There is a trend for huge Transformers. We went the other way: decreasing the size! 🤗 Introducing DistilBERT: a smaller, faster, cheaper, lighter BERT trained w/ distillation! 95% of BERT's GLUE perf w/ 66M parameters. 📃: 💻:
Tweet media one
23
450
1K
@SanhEstPasMoi
Victor Sanh
5 years
One NLP model to rule them all 😉 We've open sourced code & demo of our latest Hierarchical Multi-Task Learning model. SOTA on several NLP tasks! Try (and modify !) it for yourself 🎮 Demo: Code: Medium:
Tweet media one
7
337
972
@SanhEstPasMoi
Victor Sanh
4 years
🔥🔥Series A!!🔥🔥 Extremely excited to share the news with you and so in awe of what we have built with the community over the past few months!! 🧡 We really are JUST GETTING STARTED!!🚀 Also, we are hiring!! @huggingface
@TechCrunch
TechCrunch
4 years
Hugging Face raises $15 million to build the definitive natural language processing library by @romaindillet
Tweet media one
15
83
276
26
64
653
@SanhEstPasMoi
Victor Sanh
4 years
Excited to share our latest work on extreme pruning in the context of transfer learning 🧀 95% of the original perf with only ~5% of remaining weights in the encoder💪 Paper: With amazing collaborators @Thom_Wolf & @srush_nlp at @huggingface [1/7]
Tweet media one
6
152
643
@SanhEstPasMoi
Victor Sanh
9 months
Introducing IDEFICS, the first open state-of-the-art visual language model at the 80B scale! The model accepts arbitrary sequences of images and texts and produces text. A bit like a multimodal ChatGPT! Blogpost: Playground:
Tweet media one
23
179
572
@SanhEstPasMoi
Victor Sanh
3 years
How it started vs how it's going!🖼🏆 @huggingface
Tweet media one
Tweet media two
5
30
519
@SanhEstPasMoi
Victor Sanh
5 years
Here's how we beat the state-of-the-art in NLP with HMTL 💪 Happy to finally share our latest paper on multi-task learning: !! And we are also releasing the code!! The training code relies on the AllenNLP library @ai2_allennlp .
6
137
500
@SanhEstPasMoi
Victor Sanh
2 years
Hugging Face 🤗 in Paris!
Tweet media one
3
15
454
@SanhEstPasMoi
Victor Sanh
5 years
It has been in our TODO stack for an eternity now… So excited that we are taking the time to write a paper for our 🤗Transformers library! Stay tuned, you’ll finally have a citable paper very soon! 📃
Tweet media one
2
51
435
@SanhEstPasMoi
Victor Sanh
3 years
One day, I'll understand git
29
20
429
@SanhEstPasMoi
Victor Sanh
5 years
Excited to see our DistilBERT paper accepted at the NeurIPS 2019 EMC^2 workshop! 40% smaller, 60% faster than BERT => 97% of the performance on GLUE w. a triple loss signal 💥We also distilled GPT2 into an 82M-param model 📖 Code&weights:
Tweet media one
Tweet media two
7
86
345
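The "triple loss signal" mentioned above can be sketched numerically. The following is a minimal NumPy illustration of combining a soft-target distillation loss, a hard-label cross-entropy, and a cosine loss aligning hidden states; the function name, weighting coefficients, and temperature are illustrative, not the paper's exact implementation:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def triple_loss(student_logits, teacher_logits, labels,
                student_hidden, teacher_hidden,
                T=2.0, alpha=0.5, beta=0.25, gamma=0.25):
    """Weighted sum of three training signals (weights here are illustrative):
    1) cross-entropy against the teacher's temperature-softened distribution,
    2) cross-entropy against the gold labels,
    3) a cosine loss pulling student hidden states toward the teacher's."""
    # 1) distillation loss on soft targets (scaled by T^2, a common convention)
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    l_soft = -(p_teacher * log_p_student).sum(axis=-1).mean() * T * T
    # 2) supervised loss on hard labels
    log_p = np.log(softmax(student_logits) + 1e-12)
    l_hard = -log_p[np.arange(len(labels)), labels].mean()
    # 3) cosine embedding loss between hidden states (0 when they align)
    cos = (student_hidden * teacher_hidden).sum(-1) / (
        np.linalg.norm(student_hidden, axis=-1)
        * np.linalg.norm(teacher_hidden, axis=-1) + 1e-12)
    l_cos = (1.0 - cos).mean()
    return alpha * l_soft + beta * l_hard + gamma * l_cos
```

The cosine term vanishes when student and teacher hidden states point in the same direction, so it only penalizes geometric disagreement, not scale.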
@SanhEstPasMoi
Victor Sanh
4 years
How do you say "faster!" in 104 languages? Ask 🤗Transformers! Please welcome **Distil-mBERT**: 104 languages, 92% of mBERT’s performance on XNLI, 25% smaller, and twice as fast. 🔥 Come talk to me about it at #NeurIPS2019! 🇨🇦🤗
Tweet media one
5
74
311
@SanhEstPasMoi
Victor Sanh
3 years
It's 2021 and I am still struggling with CUDA drivers...
10
15
305
@SanhEstPasMoi
Victor Sanh
5 years
✨ Distil-All-The-Things series✨ You ask for it, we do it: today we release **DistilRoBERTa**: 95% of RoBERTa's performance on GLUE, twice as fast and 35% smaller! Available in 🤗Transformers:
Tweet media one
1
74
298
@SanhEstPasMoi
Victor Sanh
5 years
"The model uses 8.3 billion parameters and is 24 times larger than BERT and 5 times larger than OpenAI’s GPT-2" Why am I not surprised? For such a computational effort, I hope the weights will be released publicly...
10
85
284
@SanhEstPasMoi
Victor Sanh
2 years
Beta testing some @huggingface merch 🤗 What do you think? Who wants a hoodie?!? 🙋‍♂️
Tweet media one
53
3
277
@SanhEstPasMoi
Victor Sanh
5 years
➕ Two new additions to @PyTorch Hub: Transformer-XL and GPT-2 from @huggingface 's pytorch_pretrained_bert! 🤗  All of our pytorch_pretrained_bert pre-trained models are now also available through Hub!! 🎮
Tweet media one
5
65
269
@SanhEstPasMoi
Victor Sanh
3 years
🚨New pre-print on avoiding dataset biases We show a method to train a model to ignore dataset biases without explicitly identifying/modeling them by learning from the errors of a “dumb” model. Link: W/ 🤩 collaborators @Thom_Wolf , @boknilev & @srush_nlp
Tweet media one
4
51
271
@SanhEstPasMoi
Victor Sanh
2 months
When Greg Brockman demo-ed GPT4 by hand-sketching a joke website on a piece of paper and asking the system to convert that into an HTML webpage, it blew my mind. Can you build your own Screenshot-to-HTML system with much fewer resources? With this new resource, most likely…
Tweet media one
8
56
265
@SanhEstPasMoi
Victor Sanh
3 years
We’ve seen crazy interest in T0++ (pronounced "T Zero Plus Plus"), and almost 10,000 queries to the model since we announced it 3 days ago. Probably the most hilariously decisive prediction from the model (courtesy of @_philschmid): 1/N
Tweet media one
7
43
246
@SanhEstPasMoi
Victor Sanh
22 days
Can't wait to see multimodal Llama 3! We released a resource that might come in handy: The Cauldron🍯 The Cauldron is a massive, manually curated collection of 50 vision-language datasets for instruction fine-tuning. 3.6M images, 30.3M query/answer pairs. It covers a large…
Tweet media one
6
40
241
@SanhEstPasMoi
Victor Sanh
6 years
Excited to announce that our paper “A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks” has been accepted at #AAAI2019 !! Our model achieves SotA on several tasks including NER on OntoNotes. (1/6)
Tweet media one
Tweet media two
7
48
209
@SanhEstPasMoi
Victor Sanh
9 months
Last week, we released IDEFICS, a state-of-the-art visual language model of 80B parameters. IDEFICS is one of the strongest open-access alternatives to GPT-4. Training it was a rollercoaster. We underappreciate how challenging it is to train such multimodal systems. 🧵
Tweet media one
8
46
194
@SanhEstPasMoi
Victor Sanh
1 year
Latest addition to the NYC Hugging Face office 🤗💡
Tweet media one
3
5
175
@SanhEstPasMoi
Victor Sanh
5 years
🚪 ✊ Knockknock is now available on pip (and conda)! A small lib to be notified on Slack, Telegram or email when your training is complete or if it crashes in the middle. Integration requires only two additional lines of code.
Tweet media one
3
29
174
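The "two additional lines" integration described above (one import, one decorator) can be illustrated with a self-contained sketch of the idea; this is not the real knockknock API (which ships ready-made Slack/Telegram/email senders), just a minimal stand-in where `send` is any callable taking a message string:

```python
import functools
import traceback

def notify(send):
    """Decorator in the spirit of knockknock: call `send(message)` when the
    wrapped training function finishes, or when it crashes mid-run."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                result = fn(*args, **kwargs)
                send(f"{fn.__name__} finished successfully")
                return result
            except Exception:
                # report the full traceback, then re-raise so the caller
                # still sees the original failure
                send(f"{fn.__name__} crashed:\n{traceback.format_exc()}")
                raise
        return wrapper
    return decorator
```

Usage then really is two added lines around an existing function: the import and the `@notify(...)` decoration above `def train(): ...`.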
@SanhEstPasMoi
Victor Sanh
2 months
An increasing number of engineers and researchers are developing foundational models. Navigating the tools, resources, codebases, and best practices guides is daunting for new contributors. Introducing the Foundation Model Development Cheatsheet, a succinct guide with 250+…
Tweet media one
2
34
169
@SanhEstPasMoi
Victor Sanh
4 years
At @huggingface , we care about speed. 75x faster tokenizers. 🤯
@moi_anthony
Anthony MOI
4 years
I heard some of you found the tokenizers too slow. I think you are going to love what we are cooking for you @huggingface
Tweet media one
12
42
286
0
20
166
@SanhEstPasMoi
Victor Sanh
4 years
🚨We've updated our paper accepted at #NeurIPS2020 📑 Extreme sparsity in the context of transfer learning. 95% of the original perf with only ~5% of remaining weights in the encoder Paper: Code & weights: Thread: 👇 @huggingface
@SanhEstPasMoi
Victor Sanh
4 years
Excited to share our latest work on extreme pruning in the context of transfer learning 🧀 95% of the original perf with only ~5% of remaining weights in the encoder💪 Paper: With amazing collaborators @Thom_Wolf & @srush_nlp at @huggingface [1/7]
Tweet media one
6
152
643
3
33
158
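The extreme-sparsity result above (keeping ~5% of encoder weights) rests on scoring weights and keeping only the top fraction. The paper's movement pruning learns importance scores during fine-tuning; the sketch below uses the simpler magnitude criterion purely to illustrate the top-k masking mechanics, with illustrative names:

```python
import numpy as np

def topk_mask(weights, density=0.05):
    """Binary mask keeping the top `density` fraction of entries by score.
    Score here is magnitude; movement pruning instead learns per-weight
    scores from how weights move during fine-tuning."""
    scores = np.abs(weights)
    k = max(1, int(density * weights.size))
    # np.partition puts the k-th largest score at position -k
    threshold = np.partition(scores.ravel(), -k)[-k]
    return (scores >= threshold).astype(weights.dtype)

# toy "encoder" weight matrix, pruned to ~5% density
w = np.random.default_rng(0).normal(size=(256, 256))
mask = topk_mask(w, density=0.05)
pruned = w * mask
```

At inference, the surviving 5% can be stored in a sparse format, which is where the memory-size compressions mentioned in the thread come from.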
@SanhEstPasMoi
Victor Sanh
1 year
For what it's worth, we are not planning to trim out the actually exciting and interesting nuggets from the technical reports :) Open-science for the win!
@SanhEstPasMoi
Victor Sanh
1 year
We are reproducing Flamingo, a vision and language model developed by DeepMind (). We spent a good amount of time fighting training divergences (aka "instabilities"). Surprisingly, even at the ~2-3B scale. Some learnings from overcoming these 🧵:
Tweet media one
26
251
1K
2
17
157
@SanhEstPasMoi
Victor Sanh
4 years
On my way to #NeurIPS2019 to talk about distillation/compression of large Language Models on Friday! Also, WE ARE HIRING, so say hi 👋! NB: I might have some goodies 🤗🤗
Tweet media one
4
3
149
@SanhEstPasMoi
Victor Sanh
4 years
This week at the 🤗 Reading Group, we dived into the von Mises-Fisher distribution with the paper "Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs" from @shocheen & Yulia Tsvetkov. Colab:
Tweet media one
Tweet media two
5
21
128
@SanhEstPasMoi
Victor Sanh
5 months
On my way to #Neurips2023 with a couple of folks from Hugging Face 🤗 Hit me up if you want to chat about multimodal models! Also hiring, so let’s chat if you are passionate about open-source/open-science!
3
8
120
@SanhEstPasMoi
Victor Sanh
4 months
How do we get visual language models to generate simple UI code using synthetic data only? Step 1: Prompt a language model to generate various ideas for websites. We used @MistralAI 7B instruct for that step. Having diversity of topics in these website descriptions is critical:…
Tweet media one
Tweet media two
Tweet media three
1
24
109
@SanhEstPasMoi
Victor Sanh
10 months
The pace of iteration on top of Llama v2 is unmatched. Yesterday, it was demos, tutorials, and integration in open-source libraries. Today, it is accelerated inference. Tomorrow, it will be entire creative side projects. Next month, it will be a wave of startups.
7
18
108
@SanhEstPasMoi
Victor Sanh
3 years
🔥🔥Series B!!🔥🔥 Extremely excited to share the news! To all the community (open-source contributors, researchers, machine learning engineers, data scientists, forum users, etc.), thank you 40 million times! I am constantly in awe of all the goodwill in the community! 🎊🚀📈
@TechCrunch
TechCrunch
3 years
Hugging Face raises $40 million for its natural language processing library by @romaindillet
0
22
99
4
7
96
@SanhEstPasMoi
Victor Sanh
1 year
A lot more learnings in this internal note: Hope you find that useful! 👀
3
13
92
@SanhEstPasMoi
Victor Sanh
1 year
Learning 2: What really made a difference is layer-normalizing the outputs of Q and K in the attention block: `LN(Q(x))` and `LN(K(x))`. This change was inspired by
@_basilM
Basil Mustafa
1 year
2️⃣2️⃣🅱️: We trained a 22B parameter ViT model, and scale continues to prove its merit! I want to zero in on an aspect of this which is useful however at all scales: a method for improving training stability in transformers.
4
26
184
3
8
92
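The `LN(Q(x))` / `LN(K(x))` stabilization described in the tweet above can be sketched in a few lines. This is a minimal single-head NumPy illustration (learned LN gain/bias and multi-head plumbing omitted for brevity); normalizing each Q and K row bounds the attention logits, which is what tames the divergences:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Per-row layer norm, without the learned scale/shift parameters."""
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def qk_norm_attention(x, Wq, Wk, Wv):
    """Single-head attention with LN applied to the Q and K projections.
    After LN, every q/k row has norm ~sqrt(d), so each logit is bounded
    by ~sqrt(d) regardless of how large Wq/Wk grow during training."""
    q = layer_norm(x @ Wq)
    k = layer_norm(x @ Wk)
    v = x @ Wv
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores = scores - scores.max(-1, keepdims=True)  # numerically stable softmax
    p = np.exp(scores)
    p = p / p.sum(-1, keepdims=True)
    return p @ v
```

Without the two `layer_norm` calls, runaway Q/K magnitudes push the softmax into near-one-hot saturation (attention entropy collapse), a failure mode reported at large scale.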
@SanhEstPasMoi
Victor Sanh
2 years
Huge milestone for @BigscienceW! The training of a 176B-parameter auto-regressive multilingual LM has started, thanks to the support of @Genci_fr, IDRIS, and @s_requena, who make this possible! Follow @BigScienceLLM for updates!
Tweet media one
2
13
89
@SanhEstPasMoi
Victor Sanh
5 years
. @Thom_Wolf and I are heading to #ACL in Florence! 🇮🇹 Ping us to chat! We have a couple of @huggingface goodies 🤗🤗🤗
Tweet media one
2
5
87
@SanhEstPasMoi
Victor Sanh
2 years
Series C! 🚀🔥🎉 Super proud of the team and the impact we are having. Pumped to continue making machine learning more transparent, open, and collaborative and continue building with the community! 🤗 And we're hiring for every position you can think of :)
@huggingface
Hugging Face
2 years
🤗🚀
Tweet media one
98
239
2K
5
3
86
@SanhEstPasMoi
Victor Sanh
6 months
Excited to see more open-access visual models being adapted to high-resolution images! The GPT4V documentation () suggests it was trained on inputs up to 2048 x 2048. 512 x 512 is even considered low resolution. Most of the latest…
Tweet media one
Tweet media two
Tweet media three
2
20
85
@SanhEstPasMoi
Victor Sanh
4 years
ETS (GRE, TOEFL, etc.) is such a money-making scam. Since when does sending out 5 scores/figures to someone cost $20+??????? We entered the 21st century almost 20 years ago...!!!!
5
5
84
@SanhEstPasMoi
Victor Sanh
4 years
Overheard in a startup: “They upgraded their snack counter. I think they just raised a series A.”
Tweet media one
2
5
82
@SanhEstPasMoi
Victor Sanh
2 years
In the medium to long term, a well-documented model card with access to the model weights is 10x more impactful than a polished blog post of cherry-picked generated samples, no matter how cool and 🤯 these samples/images are.
0
11
79
@SanhEstPasMoi
Victor Sanh
1 year
Excited to see multiple groups (including ours at Hugging Face) reproducing and releasing in open access Flamingo/GPT4 style models! We are not there yet downstream performance-wise, but I am confident there will be a model in open access of the quality of GPT4 by EOY!
@anas_awadalla
Anas Awadalla 🍉
1 year
🦩 Introducing OpenFlamingo! A framework for training and evaluating Large Multimodal Models (LMMs) capable of processing images and text. More details below (including a multimodal LLaMA model!)⬇️ Blog: Demo:
27
472
2K
0
12
78
@SanhEstPasMoi
Victor Sanh
5 months
Let's keep AI open! 💪 Join the builders/researchers/open-source contributors shaping the future of open-source AI at the coolest #NeurIPS2023 community event! I hear that there are a few fun surprises in store 😎🐊🦙 Hosted by @huggingface x @nomic_ai x @coatuemgmt
Tweet media one
1
13
77
@SanhEstPasMoi
Victor Sanh
1 year
Tell me you work in ML without telling me you work in ML, autocomplete version
Tweet media one
3
0
79
@SanhEstPasMoi
Victor Sanh
7 months
It's a pretty wild and extensive report of success and failure cases of GPT4-V. A few consistent failure cases: - Counting, especially with higher numbers (people, objects, animals, etc.) - Making stuff up (describing objects that are not in the image) -…
Tweet media one
Tweet media two
2
18
77
@SanhEstPasMoi
Victor Sanh
5 months
AI is nothing without open-source #keepAIopen
1
10
79
@SanhEstPasMoi
Victor Sanh
2 years
Hugging Face party tomorrow (Friday) night! 🤗🤗 Dumbo, NYC Deets: party@huggingface.co
2
13
76
@SanhEstPasMoi
Victor Sanh
5 months
Who are the best multimodal data engineers? Want to join the coolest open-source/open-science company? Also, I am at #Neurips2023 , find me to get these brand new glittery stickers! Blond hair (although not too tall), hard to miss 😃
Tweet media one
5
4
74
@SanhEstPasMoi
Victor Sanh
1 year
We'll open-source our work soon 💪, with hopefully no forms involved! 🤗 Stay tuned!
7
2
74
@SanhEstPasMoi
Victor Sanh
4 years
When the training you launched a while ago (and that you forgot about) crashes, you're happy that it didn't crash silently... 🥶 Knockknock: Get notified when your training ends with only two additional lines of code. *10* different platforms supported!
Tweet media one
1
12
68
@SanhEstPasMoi
Victor Sanh
3 years
The accuracy of a shallow probe on top of a frozen BERT doesn't always reflect whether a property is captured. Our past summer intern Steven Cao w/ @srush_nlp at @huggingface will present a new perspective on probing using sparsity. Later today, Session 3C (2:40pm ET)! #NAACL2021
Tweet media one
1
15
67
@SanhEstPasMoi
Victor Sanh
2 years
Humbled to see @YejinChoinka ’s shout-out of our ethical charter in her #acl2022 keynote! For our new multimodal project at 🤗 @huggingface , we wrote down our ethical values from the start, instead of having these discussions at the end. 👉
Tweet media one
1
15
68
@SanhEstPasMoi
Victor Sanh
5 years
🚪👋✊ Introducing knockknock, a small lib to send you Slack or email notifications when your training ends or crashes for unexpected reasons. Integration only requires 2 additional lines of code!
2
14
67
@SanhEstPasMoi
Victor Sanh
8 months
Asking AI systems how to use a bike is a thing. Try out IDEFICS now! Fully open-access multimodal model and already rolled out!
Tweet media one
5
13
61
@SanhEstPasMoi
Victor Sanh
9 months
🐶 IDEFICS is a reproduction of Flamingo, a multimodal model developed by @GoogleDeepMind , which has not been released publicly. IDEFICS is built solely on publicly available data and models and is available in open-access! Here's IDEFICS' take on Barbieheimer and Barack Potter
Tweet media one
Tweet media two
2
10
61
@SanhEstPasMoi
Victor Sanh
5 years
ML pun: train a GAN to generate healthy food images and name it "V-GAN"
2
9
60
@SanhEstPasMoi
Victor Sanh
5 years
[Amazing community 🏘️] Knockknock has two additional supports: Microsoft Teams and text message! These are the contributions of @mediumnok and @abhi1thakur . Hat tip! 🚪✊Get notified when your training ends/crashes with only two additional lines of code:
Tweet media one
5
9
60
@SanhEstPasMoi
Victor Sanh
4 years
[1K+ 🌟- Amazing community] This milestone wouldn't be possible without the contributions of the community: @mediumnok , @abhi1thakur , and @valtterinpoika . Thank you!! 🤗🤗🤗 🚪✊
Tweet media one
3
10
57
@SanhEstPasMoi
Victor Sanh
4 years
Differences I’m having a hard time pronouncing as a non-native speaker, a sample: - Bowl - Ball - This is - Decease - Beer - Bear - Poor - Pour - Beach - Bitch - Bold - Bald - Sheet - Shit - Peace - Piss - Piece Just imagine all the embarrassing situations I can put myself in…
12
5
56
@SanhEstPasMoi
Victor Sanh
3 years
This is just the beginning of something *very* big. I'm constantly in awe of all the goodwill in our community. So happy we can be a platform to catalyze all these positive energies!
Tweet media one
@YJernite
Yacine Jernite
3 years
Seeing how many people have introduced themselves and started sharing resources in the new "Languages at @huggingface " section of the forum is bringing me so much joy 🌍🌏🌎🤗 Looking forward to helping build more community beyond English-language NLP!
1
16
85
1
9
55
@SanhEstPasMoi
Victor Sanh
5 years
I'll give a talk during this Meetup on how we push the latest NLP models into production at @huggingface . If you are in NYC, join us on March 7th!
@Cometml
Comet
5 years
RSVP for our next Meetup to hear speakers from @XaxisTweets , @huggingface , and @runwayml discuss productionizing machine learning models! See the full details for March 7th here:
1
4
12
2
9
54
@SanhEstPasMoi
Victor Sanh
2 months
Can you beat the AI at Raven puzzles? How many puzzles can you answer correctly? 🎮 The most powerful vision+language AI systems like Gemini or GPT4V struggle with this problem when used out-of-the-box (). But when properly…
Tweet media one
Tweet media two
@YizheZhangNLP
Yizhe Zhang @ ICLR 2024
2 months
Thanks @_akhaliq for sharing our work. We evaluated SOTA VLMs on challenging Raven's Progressive Matrices. We found that VLMs still struggle to reach human-level performance, with perception being the main bottleneck.
1
22
82
3
16
55
@SanhEstPasMoi
Victor Sanh
4 years
I wanna visit the Paris office 🤗😱🤩
@ClementDelangue
clem 🤗
4 years
We moved in to a new office in Paris! Ft @MorganFunto
Tweet media one
Tweet media two
Tweet media three
Tweet media four
19
7
348
0
3
54
@SanhEstPasMoi
Victor Sanh
4 years
At the Paris office. Who’s there? #NLProc @LysandreJik @MorganFunto
Tweet media one
3
4
54
@SanhEstPasMoi
Victor Sanh
4 years
"Hugging Face (yes, that really is the name)" 🤗🤗🤗 A Survey of Deep Learning for Scientific Discovery, from @maithra_raghu & @ericschmidt
Tweet media one
0
10
51
@SanhEstPasMoi
Victor Sanh
19 days
Another useful resource for multimodal Llama 3: OBELICS. OBELICS is an open, massive, and high-quality collection of interleaved image-text web documents, containing 141M English documents, 115B text tokens, and 353M images, extracted from Common Crawl dumps between February 2020…
@SanhEstPasMoi
Victor Sanh
22 days
Can't wait to see multimodal Llama 3! We released a resource that might come in handy: The Cauldron🍯 The Cauldron is a massive, manually curated collection of 50 vision-language datasets for instruction fine-tuning. 3.6M images, 30.3M query/answer pairs. It covers a large…
Tweet media one
6
40
241
1
12
51
@SanhEstPasMoi
Victor Sanh
6 years
Everyone in the NLP research community should start watching/starring/bookmarking this awesome repo !
@seb_ruder
Sebastian Ruder
6 years
Do you often find it cumbersome to track down the best datasets or the state-of-the-art for a particular task in NLP? I've created a resource (a GitHub repo) to make this easier.
19
431
1K
0
8
51
@SanhEstPasMoi
Victor Sanh
9 months
Working on releasing a new open-access model while leveling up my meme game 🤡
Tweet media one
7
6
48
@SanhEstPasMoi
Victor Sanh
1 year
@hardmaru @StabilityAI "Grant us the gift of low latency, And protect us from the curse of communication bottlenecks."
Tweet media one
2
7
50
@SanhEstPasMoi
Victor Sanh
10 months
This made my day. AI for architects! 🗼🗽🏙 I helped an architect friend to use an AI rendering engine based on a YT tutorial from @design74043 . From a rough sketch to a realistic mock-up in a few seconds. All of that based solely on open-access models!
Tweet media one
4
7
48
@SanhEstPasMoi
Victor Sanh
3 years
That’s sweet 🤗🤗🤗
@huggingface
Hugging Face
3 years
We are honored to be awarded the Best Demo Paper for "Transformers: State-of-the-Art Natural Language Processing" at #emnlp2020 😍 Thank you to our wonderful team members and the fantastic community of contributors who make the library possible 🤗🤗🤗
Tweet media one
30
139
995
4
0
48
@SanhEstPasMoi
Victor Sanh
5 years
If you're at #NeurIPS2018 or in Montreal, stop by! I'll be presenting our latest Multi-Task Learning architecture: HMTL (w/ @Thom_Wolf and @seb_ruder ).
@Rasa_HQ
Rasa
5 years
If you are in Montreal, join us tomorrow for a special edition of the Montreal chatbots meetup! @alanmnichol will be speaking about the new embedding policy (presented at #NeurIPS ) and how to use it in Rasa Core
0
4
17
1
9
46
@SanhEstPasMoi
Victor Sanh
4 years
This is a colossal effort 💪 First release. ~100 datasets. Same standard API. This is part of a broader vision of facilitating every single aspect of the NLP pipeline for research & engineering. Amazing job @Thom_Wolf @dramemariama20 @qlhoest @PatrickPlaten @julienplu @julien_c 🏆
@Thom_Wolf
Thomas Wolf
4 years
Surviving every AI wave, two kernels have consistently been the beating hearts of Natural Language Processing: Datasets and Metrics Today we release "nlp", a library to easily share & load data/metrics already providing access to 99+ datasets! Try it👉
Tweet media one
17
408
2K
1
1
46
@SanhEstPasMoi
Victor Sanh
2 months
Left: a synthetically generated HTML + Tailwind CSS website Right: its rendered screenshot This is a sample from WebSight v0.2, a collection of 2M of these pairs. It allows bootstrapping the training of multimodal LLMs on the task of converting UI screenshots into code. All of these…
Tweet media one
Tweet media two
1
7
47
@SanhEstPasMoi
Victor Sanh
9 months
🕵️‍♂️ Exploring such a large dataset requires tooling. With @nomic_ai , we built an interactive visualization that allows navigating OBELICS' content.
Tweet media one
2
12
44
@SanhEstPasMoi
Victor Sanh
1 year
Context: Flamingo is a model that combines a pretrained vision encoder with a pretrained language model by gluing some newly initialized parameters in between the two. The training resembles pure LM training because the objective is essentially next-token prediction.
2
0
45
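The "gluing newly initialized parameters between a frozen vision encoder and a frozen LM" described above can be sketched as a gated cross-attention layer. This is a conceptual single-head NumPy toy, not the actual Flamingo/IDEFICS code: variable names and shapes are illustrative, and the key detail is the tanh gate initialized at zero, so training starts from the intact language model:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # toy hidden size

def gated_cross_attn(h, vis, Wq, Wk, Wv, gate):
    """Newly initialized block letting frozen-LM hidden states `h` attend
    to frozen vision features `vis`. With gate = 0, tanh(gate) = 0 and the
    layer is the identity: the combined model starts as the pure LM."""
    q, k, v = h @ Wq, vis @ Wk, vis @ Wv
    scores = q @ k.T / np.sqrt(d)
    p = np.exp(scores - scores.max(-1, keepdims=True))
    p = p / p.sum(-1, keepdims=True)
    return h + np.tanh(gate) * (p @ v)  # gated residual injection

h = rng.normal(size=(5, d))    # 5 text tokens from the frozen LM
vis = rng.normal(size=(3, d))  # 3 visual tokens from the frozen encoder
Wq, Wk, Wv = (0.02 * rng.normal(size=(d, d)) for _ in range(3))
out = gated_cross_attn(h, vis, Wq, Wk, Wv, gate=0.0)  # identity at init
```

Only the new `Wq/Wk/Wv` and `gate` parameters receive gradients; the next-token prediction objective then gradually opens the gate and blends visual information in.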
@SanhEstPasMoi
Victor Sanh
4 years
It was so much fun @bhutanisanyam1 ! Thanks again! We talk about the vision and the culture behind our open-source efforts at @huggingface with one recent example: DistilBERT! Tune in! 🤗🤗 Also I’ll be at @NeurIPSConf next week to talk about DistilBERT more deeply! Say hi 👋 !
@bhutanisanyam1
Sanyam Bhutani
4 years
Really excited to finally share my interview with @SanhEstPasMoi where we talk all about DistilBERT, NLP Research at @huggingface and Research in Machine Learning. Audio: Video:
0
7
57
1
7
43
@SanhEstPasMoi
Victor Sanh
4 years
Paper: Code & Weights will be released very soon! Stay tuned! In the meantime, here’s a sneak peek at the memory size compressions: [7/7]
Tweet media one
5
0
41
@SanhEstPasMoi
Victor Sanh
4 years
1K+ machine translation models available through the usual user interface! Tremendous resource based on @marian_nmt!
@huggingface
Hugging Face
4 years
Let’s democratize NLP for all languages! 🌎🌎🌎 Today, with v2.9.1, we are releasing 1,008 machine translation models, covering 140 different languages, trained by @jorgtiedemann with @marian , ported by @sam_shleifer . Find your language here: [1/4]
Tweet media one
20
352
1K
0
7
41
@SanhEstPasMoi
Victor Sanh
4 years
Thanks for having us and the interesting discussion! We were honored to be guests on arguably the best ML/NLP podcast out there!
@nlpmattg
Matt Gardner
4 years
#nlphighlights 104: @SanhEstPasMoi and @Thom_Wolf talk to us about model distillation, when you try to approximate a large model's decision boundary with a smaller model. After talking about the general area, we dive into DistilBERT.
2
24
141
0
6
42
@SanhEstPasMoi
Victor Sanh
9 months
The adrenaline level just before a release
3
1
41
@SanhEstPasMoi
Victor Sanh
6 months
Very long image captions are extremely hard (and costly) to obtain. Looks like a great resource!
@_akhaliq
AK
6 months
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions paper page: In the realm of large multi-modal models (LMMs), efficient modality alignment is crucial yet often constrained by the scarcity of high-quality image-text data. To address…
Tweet media one
4
55
258
0
16
35
@SanhEstPasMoi
Victor Sanh
3 years
We are kicking off a weird one-year-long collaborative endeavor in NLP 👀👀
@Thom_Wolf
Thomas Wolf
3 years
The LHC involves 10.000 researchers from 100 countries In many scientific fields, worldwide research collaborations create tools useful for the entire research community: LHC, ITER, ISS.. Maybe it's time to build similar large, diverse, open research collab in AI/NLP as well 👇
9
63
328
0
2
40
@SanhEstPasMoi
Victor Sanh
4 years
#ICLR2020 You should absolutely join the ICLR virtual town. It is reallyyyy fun. Gonna leave my camera on.
1
3
40
@SanhEstPasMoi
Victor Sanh
9 months
Typical Wednesday in New York. Wait for Friday yoga.
Tweet media one
0
2
36
@SanhEstPasMoi
Victor Sanh
4 years
I'm (virtually) attending #acl2020nlp this week🏠 Already had a blast at the tutorials yesterday🤯 Hit me up (DM or RocketChat) if you wanna chat! Btw, we have created a #huggingface -ama channel if you want to interact with the broader @huggingface team!
1
2
37
@SanhEstPasMoi
Victor Sanh
4 years
Is training 1.1 billion parameters still considered large scale in August 2020? #absurdquestion
Tweet media one
4
3
35
@SanhEstPasMoi
Victor Sanh
4 years
I can’t recall the last time I accepted an incoming LinkedIn connection request from someone I actually know.
2
0
33
@SanhEstPasMoi
Victor Sanh
3 years
I think it's worth trading 5 mins of your time for the 100s of hours of coding that the Transformers library has saved you in the past 2+ years ⌚️🙃
@huggingface
Hugging Face
3 years
🤗Transformers will pass 40 000 Github 🌟 this week🤯 What a crazy journey! Our libraries are all about the community and we need your input to define the direction of the next 40k stars Take 5 minutes to choose the future of the library here 👇
0
44
195
0
2
33
@SanhEstPasMoi
Victor Sanh
5 years
@SanhEstPasMoi
Victor Sanh
5 years
There is a trend for huge Transformers. We went the other way: decreasing the size! 🤗 Introducing DistilBERT: a smaller, faster, cheaper, lighter BERT trained w/ distillation! 95% of BERT's GLUE perf w/ 66M parameters. 📃: 💻:
Tweet media one
23
450
1K
1
3
33
@SanhEstPasMoi
Victor Sanh
5 years
1/ Find a good name for a model. Something that sparks joy in you. Something you can easily say multiple times in a talk. 2/ Build the research direction that fits this name. 🤯🤣
3
6
34
@SanhEstPasMoi
Victor Sanh
4 months
It's 2024! Let's get multimodality fired up! 🔥
1
4
34
@SanhEstPasMoi
Victor Sanh
2 years
It is very special to see this coming together beautifully with the model release in open-access🌸 It will be even more beautiful to see what the community builds with BLOOM! 🤗
@BigscienceW
BigScience Research Workshop
2 years
BLOOM is here. The largest open-access multilingual language model ever. Read more about it or get it at
Tweet media one
29
815
3K
2
0
33