Victor Sanh

@SanhEstPasMoi

8,582
Followers
2,369
Following
188
Media
2,320
Statuses

Dog sitter by day, Scientist at @huggingface 🤗 by night

New York City
Joined May 2012
Pinned Tweet
@SanhEstPasMoi
Victor Sanh
26 days
New multimodal model in town: Idefics2! 💪 Strong 8B-parameter model: often on par with open 30B counterparts. 🔓Open license: Apache 2.0. 🚀 Strong improvement over Idefics1: +12 points on VQAv2, +30 points on TextVQA while having 10x fewer parameters. 📚 Better data:…
5
73
273
@SanhEstPasMoi
Victor Sanh
1 year
We are reproducing Flamingo, a vision and language model developed by DeepMind (). We spent a good amount of time fighting training divergences (aka "instabilities"). Surprisingly, even at the ~2-3B scale. Some learnings from overcoming these 🧵:
Tweet media one
26
251
1K
@SanhEstPasMoi
Victor Sanh
5 years
A few weeks ago, a friend of mine asked me "Which papers can I read to catch up with the latest trends in modern NLP?". 🏃‍♂️👨‍🎓 I compiled a list of papers and resources for him 📚 and thought it would be great to share it!
11
418
1K
@SanhEstPasMoi
Victor Sanh
5 years
There is a trend for huge Transformers. We went the other way: decreasing the size! 🤗 Introducing DistilBERT: a smaller, faster, cheaper, lighter BERT trained w/ distillation! 95% of BERT's GLUE perf w/ 66M parameters. 📃: 💻:
Tweet media one
23
450
1K
@SanhEstPasMoi
Victor Sanh
5 years
One NLP model to rule them all 😉 We've open sourced code & demo of our latest Hierarchical Multi-Task Learning model. SOTA on several NLP tasks! Try (and modify !) it for yourself 🎮 Demo: Code: Medium:
Tweet media one
7
337
972
@SanhEstPasMoi
Victor Sanh
4 years
🔥🔥Series A!!🔥🔥 Extremely excited to share the news with you and so in awe of what we have built with the community over the past few months!! 🧡 We really are JUST GETTING STARTED!!🚀 Also, we are hiring!! @huggingface
@TechCrunch
TechCrunch
4 years
Hugging Face raises $15 million to build the definitive natural language processing library by @romaindillet
Tweet media one
15
83
276
26
64
653
@SanhEstPasMoi
Victor Sanh
4 years
Excited to share our latest work on extreme pruning in the context of transfer learning 🧀 95% of the original perf with only ~5% of remaining weights in the encoder💪 Paper: With amazing collaborators @Thom_Wolf & @srush_nlp at @huggingface [1/7]
Tweet media one
6
152
643
@SanhEstPasMoi
Victor Sanh
9 months
Introducing IDEFICS, the first open state-of-the-art visual language model at the 80B scale! The model accepts arbitrary sequences of images and texts and produces text. A bit like a multimodal ChatGPT! Blogpost: Playground:
Tweet media one
23
179
572
@SanhEstPasMoi
Victor Sanh
3 years
How it started vs how it's going!🖼🏆 @huggingface
Tweet media one
Tweet media two
5
30
519
@SanhEstPasMoi
Victor Sanh
5 years
Here's how we beat the state-of-the-art in NLP with HMTL 💪 Happy to finally share our latest paper on multi-task learning: !! And we are also releasing the code!! The training code relies on the AllenNLP library @ai2_allennlp .
6
137
500
@SanhEstPasMoi
Victor Sanh
2 years
Hugging Face 🤗 in Paris!
Tweet media one
3
15
454
@SanhEstPasMoi
Victor Sanh
5 years
It has been in our TODO stack for an eternity now… So excited that we are taking the time to write a paper for our 🤗Transformers library! Stay tuned, you’ll finally have a citable paper very soon! 📃
Tweet media one
2
51
435
@SanhEstPasMoi
Victor Sanh
3 years
One day, I'll understand git
29
20
429
@SanhEstPasMoi
Victor Sanh
5 years
Excited to see our DistilBERT paper accepted at the NeurIPS 2019 EMC^2 workshop! 40% smaller, 60% faster than BERT => 97% of the performance on GLUE w. a triple loss signal 💥We also distilled GPT2 into an 82M-param model 📖 Code&weights:
Tweet media one
Tweet media two
7
86
345
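The "triple loss signal" mentioned above can be sketched numerically. The following is a minimal NumPy illustration of combining a soft-target distillation loss, a hard-label cross-entropy, and a cosine loss aligning hidden states; the function name, weighting coefficients, and temperature are illustrative, not the paper's exact implementation:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def triple_loss(student_logits, teacher_logits, labels,
                student_hidden, teacher_hidden,
                T=2.0, alpha=0.5, beta=0.25, gamma=0.25):
    """Weighted sum of three training signals (weights here are illustrative):
    1) cross-entropy against the teacher's temperature-softened distribution,
    2) cross-entropy against the gold labels,
    3) a cosine loss pulling student hidden states toward the teacher's."""
    # 1) distillation loss on soft targets (scaled by T^2, a common convention)
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    l_soft = -(p_teacher * log_p_student).sum(axis=-1).mean() * T * T
    # 2) supervised loss on hard labels
    log_p = np.log(softmax(student_logits) + 1e-12)
    l_hard = -log_p[np.arange(len(labels)), labels].mean()
    # 3) cosine embedding loss between hidden states (0 when they align)
    cos = (student_hidden * teacher_hidden).sum(-1) / (
        np.linalg.norm(student_hidden, axis=-1)
        * np.linalg.norm(teacher_hidden, axis=-1) + 1e-12)
    l_cos = (1.0 - cos).mean()
    return alpha * l_soft + beta * l_hard + gamma * l_cos
```

The cosine term vanishes when student and teacher hidden states point in the same direction, so it only penalizes geometric disagreement, not scale.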
@SanhEstPasMoi
Victor Sanh
4 years
How do you say "faster!" in 104 languages? Ask 🤗Transformers! Please welcome **Distil-mBERT**: 104 languages, 92% of mBERT’s performance on XNLI, 25% smaller, and twice as fast. 🔥 Come talk to me about it at #NeurIPS2019! 🇨🇦🤗
Tweet media one
5
74
311
@SanhEstPasMoi
Victor Sanh
3 years
It's 2021 and I am still struggling with CUDA drivers...
10
15
305
@SanhEstPasMoi
Victor Sanh
5 years
✨ Distil-All-The-Things series✨ You ask for it, we do it: today we release **DistilRoBERTa**: 95% of RoBERTa's performance on GLUE, twice as fast and 35% smaller! Available in 🤗Transformers:
Tweet media one
1
74
298
@SanhEstPasMoi
Victor Sanh
5 years
"The model uses 8.3 billion parameters and is 24 times larger than BERT and 5 times larger than OpenAI’s GPT-2" Why am I not surprised? For such a computational effort, I hope the weights will be released publicly...
10
85
284
@SanhEstPasMoi
Victor Sanh
2 years
Beta testing some @huggingface merch 🤗 What do you think? Who wants a hoodie?!? 🙋‍♂️
Tweet media one
53
3
277
@SanhEstPasMoi
Victor Sanh
5 years
➕ Two new additions to @PyTorch Hub: Transformer-XL and GPT-2 from @huggingface 's pytorch_pretrained_bert! 🤗  All of our pytorch_pretrained_bert pre-trained models are now also available through Hub!! 🎮
Tweet media one
5
65
269
@SanhEstPasMoi
Victor Sanh
3 years
🚨New pre-print on avoiding dataset biases We show a method to train a model to ignore dataset biases without explicitly identifying/modeling them by learning from the errors of a “dumb” model. Link: W/ 🤩 collaborators @Thom_Wolf , @boknilev & @srush_nlp
Tweet media one
4
51
271
@SanhEstPasMoi
Victor Sanh
2 months
When Greg Brockman demo-ed GPT4 by hand-sketching a joke website on a piece of paper and asking the system to convert that into an HTML webpage, it blew my mind. Can you build your own Screenshot-to-HTML system with much fewer resources? With this new resource, most likely…
Tweet media one
8
56
265
@SanhEstPasMoi
Victor Sanh
3 years
We’ve seen crazy interest in T0++ (pronounced "T Zero Plus Plus"), and almost 10,000 queries to the model since we announced it 3 days ago. Probably the most hilariously decisive prediction from the model (courtesy of @_philschmid): 1/N
Tweet media one
7
43
246
@SanhEstPasMoi
Victor Sanh
22 days
Can't wait to see multimodal Llama 3! We released a resource that might come in handy: The Cauldron🍯 The Cauldron is a massive, manually curated collection of 50 vision-language datasets for instruction fine-tuning. 3.6M images, 30.3M query/answer pairs. It covers a large…
Tweet media one
6
40
241
@SanhEstPasMoi
Victor Sanh
6 years
Excited to announce that our paper “A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks” has been accepted at #AAAI2019 !! Our model achieves SotA on several tasks including NER on OntoNotes. (1/6)
Tweet media one
Tweet media two
7
48
209
@SanhEstPasMoi
Victor Sanh
9 months
Last week, we released IDEFICS, a state-of-the-art visual language model of 80B parameters. IDEFICS is one of the strongest open-access alternatives to GPT-4. Training it was a rollercoaster. We underappreciate how challenging it is to train such multimodal systems. 🧵
Tweet media one
8
46
194
@SanhEstPasMoi
Victor Sanh
1 year
Latest addition to the NYC Hugging Face office 🤗💡
Tweet media one
3
5
175
@SanhEstPasMoi
Victor Sanh
5 years
🚪 ✊ Knockknock is now available on pip (and conda)! A small lib to be notified on Slack, Telegram or email when your training is complete or if it crashes in the middle. Integration requires only two additional lines of code.
Tweet media one
3
29
174
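The "two additional lines" integration described above (one import, one decorator) can be illustrated with a self-contained sketch of the idea; this is not the real knockknock API (which ships ready-made Slack/Telegram/email senders), just a minimal stand-in where `send` is any callable taking a message string:

```python
import functools
import traceback

def notify(send):
    """Decorator in the spirit of knockknock: call `send(message)` when the
    wrapped training function finishes, or when it crashes mid-run."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                result = fn(*args, **kwargs)
                send(f"{fn.__name__} finished successfully")
                return result
            except Exception:
                # report the full traceback, then re-raise so the caller
                # still sees the original failure
                send(f"{fn.__name__} crashed:\n{traceback.format_exc()}")
                raise
        return wrapper
    return decorator
```

Usage then really is two added lines around an existing function: the import and the `@notify(...)` decoration above `def train(): ...`.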
@SanhEstPasMoi
Victor Sanh
2 months
An increasing number of engineers and researchers are developing foundational models. Navigating the tools, resources, codebases, and best practices guides is daunting for new contributors. Introducing the Foundation Model Development Cheatsheet, a succinct guide with 250+…
Tweet media one
2
34
169
@SanhEstPasMoi
Victor Sanh
4 years
At @huggingface , we care about speed. 75x faster tokenizers. 🤯
@moi_anthony
Anthony MOI
4 years
I heard some of you found the tokenizers too slow. I think you are going to love what we are cooking for you @huggingface
Tweet media one
12
42
286
0
20
166
@SanhEstPasMoi
Victor Sanh
4 years
🚨We've updated our paper accepted at #NeurIPS2020 📑 Extreme sparsity in the context of transfer learning. 95% of the original perf with only ~5% of remaining weights in the encoder Paper: Code & weights: Thread: 👇 @huggingface
@SanhEstPasMoi
Victor Sanh
4 years
Excited to share our latest work on extreme pruning in the context of transfer learning 🧀 95% of the original perf with only ~5% of remaining weights in the encoder💪 Paper: With amazing collaborators @Thom_Wolf & @srush_nlp at @huggingface [1/7]
Tweet media one
6
152
643
3
33
158
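The extreme-sparsity result above (keeping ~5% of encoder weights) rests on scoring weights and keeping only the top fraction. The paper's movement pruning learns importance scores during fine-tuning; the sketch below uses the simpler magnitude criterion purely to illustrate the top-k masking mechanics, with illustrative names:

```python
import numpy as np

def topk_mask(weights, density=0.05):
    """Binary mask keeping the top `density` fraction of entries by score.
    Score here is magnitude; movement pruning instead learns per-weight
    scores from how weights move during fine-tuning."""
    scores = np.abs(weights)
    k = max(1, int(density * weights.size))
    # np.partition puts the k-th largest score at position -k
    threshold = np.partition(scores.ravel(), -k)[-k]
    return (scores >= threshold).astype(weights.dtype)

# toy "encoder" weight matrix, pruned to ~5% density
w = np.random.default_rng(0).normal(size=(256, 256))
mask = topk_mask(w, density=0.05)
pruned = w * mask
```

At inference, the surviving 5% can be stored in a sparse format, which is where the memory-size compressions mentioned in the thread come from.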
@SanhEstPasMoi
Victor Sanh
1 year
For what it's worth, we are not planning to trim out the actually exciting and interesting nuggets from the technical reports :) Open-science for the win!
@SanhEstPasMoi
Victor Sanh
1 year
We are reproducing Flamingo, a vision and language model developed by DeepMind (). We spent a good amount of time fighting training divergences (aka "instabilities"). Surprisingly, even at the ~2-3B scale. Some learnings from overcoming these 🧵:
Tweet media one
26
251
1K
2
17
157
@SanhEstPasMoi
Victor Sanh
4 years
On my way to #NeurIPS2019 to talk about distillation/compression of large Language Models on Friday! Also, WE ARE HIRING, so say hi 👋! NB: I might have some goodies 🤗🤗
Tweet media one
4
3
149
@SanhEstPasMoi
Victor Sanh
4 years
This week at the 🤗 Reading Group, we dived into the von Mises-Fisher distribution with the paper "Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs" from @shocheen & Yulia Tsvetkov. Colab:
Tweet media one
Tweet media two
5
21
128
@SanhEstPasMoi
Victor Sanh
5 months
On my way to #Neurips2023 with a couple of folks from Hugging Face 🤗 Hit me up if you want to chat about multimodal models! Also hiring, so let’s chat if you are passionate about open-source/open-science!
3
8
120
@SanhEstPasMoi
Victor Sanh
4 months
How do we get visual language models to generate simple UI code using synthetic data only? Step 1: Prompt a language model to generate various ideas for websites. We used @MistralAI 7B instruct for that step. Having diversity of topics in these website descriptions is critical:…
Tweet media one
Tweet media two
Tweet media three
1
24
109
@SanhEstPasMoi
Victor Sanh
10 months
The pace of iteration on top of Llama v2 is unmatched. Yesterday, it was demos, tutorials, and integration in open-source libraries. Today, it is accelerated inference. Tomorrow, it will be entire creative side projects. Next month, it will be a wave of startups.
7
18
108
@SanhEstPasMoi
Victor Sanh
3 years
🔥🔥Series B!!🔥🔥 Extremely excited to share the news! To all the community (open-source contributors, researchers, machine learning engineers, data scientists, forum users, etc.), thank you 40 million times! I am constantly in awe of all the goodwill in the community! 🎊🚀📈
@TechCrunch
TechCrunch
3 years
Hugging Face raises $40 million for its natural language processing library by @romaindillet
0
22
99
4
7
96
@SanhEstPasMoi
Victor Sanh
1 year
A lot more learnings in this internal note: Hope you find that useful! 👀
3
13
92
@SanhEstPasMoi
Victor Sanh
1 year
Learning 2: What really made a difference is layer-normalizing the outputs of Q and K in the attention block: `LN(Q(x))` and `LN(K(x))`. This change was inspired by
@_basilM
Basil Mustafa
1 year
2️⃣2️⃣🅱️: We trained a 22B parameter ViT model, and scale continues to prove its merit! I want to zero in on an aspect of this which is useful however at all scales: a method for improving training stability in transformers.
4
26
184
3
8
92
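The `LN(Q(x))` / `LN(K(x))` stabilization described in the tweet above can be sketched in a few lines. This is a minimal single-head NumPy illustration (learned LN gain/bias and multi-head plumbing omitted for brevity); normalizing each Q and K row bounds the attention logits, which is what tames the divergences:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Per-row layer norm, without the learned scale/shift parameters."""
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def qk_norm_attention(x, Wq, Wk, Wv):
    """Single-head attention with LN applied to the Q and K projections.
    After LN, every q/k row has norm ~sqrt(d), so each logit is bounded
    by ~sqrt(d) regardless of how large Wq/Wk grow during training."""
    q = layer_norm(x @ Wq)
    k = layer_norm(x @ Wk)
    v = x @ Wv
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores = scores - scores.max(-1, keepdims=True)  # numerically stable softmax
    p = np.exp(scores)
    p = p / p.sum(-1, keepdims=True)
    return p @ v
```

Without the two `layer_norm` calls, runaway Q/K magnitudes push the softmax into near-one-hot saturation (attention entropy collapse), a failure mode reported at large scale.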
@SanhEstPasMoi
Victor Sanh
2 years
Huge milestone for @BigscienceW! The training of a 176B-parameter auto-regressive multilingual LM has started, thanks to the support of @Genci_fr, IDRIS, and @s_requena, who make this possible! Follow @BigScienceLLM for updates!
Tweet media one
2
13
89
@SanhEstPasMoi
Victor Sanh
5 years
. @Thom_Wolf and I are heading to #ACL in Florence! 🇮🇹 Ping us to chat! We have a couple of @huggingface goodies 🤗🤗🤗
Tweet media one
2
5
87
@SanhEstPasMoi
Victor Sanh
2 years
Series C! 🚀🔥🎉 Super proud of the team and the impact we are having. Pumped to continue making machine learning more transparent, open, and collaborative and continue building with the community! 🤗 And we're hiring for every position you can think of :)
@huggingface
Hugging Face
2 years
🤗🚀
Tweet media one
98
239
2K
5
3
86
@SanhEstPasMoi
Victor Sanh
6 months
Excited to see more open-access visual models being adapted to high-resolution images! The GPT4V documentation () suggests it was trained on inputs up to 2048 x 2048. 512 x 512 is even considered low resolution. Most of the latest…
Tweet media one
Tweet media two
Tweet media three
2
20
85
@SanhEstPasMoi
Victor Sanh
4 years
ETS (GRE, TOEFL, etc.) is such a money-making scam. Since when does sending out 5 scores/figures to someone cost $20+??????? We entered the 21st century almost 20 years ago...!!!!
5
5
84
@SanhEstPasMoi
Victor Sanh
4 years
Overheard in a startup: “They upgraded their snack counter. I think they just raised a series A.”
Tweet media one
2
5
82
@SanhEstPasMoi
Victor Sanh
2 years
In the medium to long term, a well-documented model card with access to the model weights is 10x more impactful than a polished blog post of cherry-picked generated samples, no matter how cool and 🤯 these samples/images are.
0
11
79
@SanhEstPasMoi
Victor Sanh
1 year
Excited to see multiple groups (including ours at Hugging Face) reproducing and releasing in open access Flamingo/GPT4 style models! We are not there yet downstream performance-wise, but I am confident there will be a model in open access of the quality of GPT4 by EOY!
@anas_awadalla
Anas Awadalla 🍉
1 year
🦩 Introducing OpenFlamingo! A framework for training and evaluating Large Multimodal Models (LMMs) capable of processing images and text. More details below (including a multimodal LLaMA model!)⬇️ Blog: Demo:
27
472
2K
0
12
78
@SanhEstPasMoi
Victor Sanh
5 months
Let's keep AI open! 💪 Join the builders/researchers/open-source contributors shaping the future of open-source AI at the coolest #NeurIPS2023 community event! I hear that there are a few fun surprises in store 😎🐊🦙 Hosted by @huggingface x @nomic_ai x @coatuemgmt
Tweet media one
1
13
77
@SanhEstPasMoi
Victor Sanh
1 year
Tell me you work in ML without telling me you work in ML, autocomplete version
Tweet media one
3
0
79
@SanhEstPasMoi
Victor Sanh
7 months
It's a pretty wild and extensive report of success and failure cases of GPT4-V. A few consistent failure cases: - Counting, especially with higher numbers (people, objects, animals, etc.) - Making stuff up (describing objects that are not in the image) -…
Tweet media one
Tweet media two
2
18
77
@SanhEstPasMoi
Victor Sanh
5 months
AI is nothing without open-source #keepAIopen
1
10
79
@SanhEstPasMoi
Victor Sanh
2 years
Hugging Face party tomorrow (Friday) night! 🤗🤗 Dumbo, NYC Deets: party@huggingface.co
2
13
76
@SanhEstPasMoi
Victor Sanh
5 months
Who are the best multimodal data engineers? Want to join the coolest open-source/open-science company? Also, I am at #Neurips2023 , find me to get these brand new glittery stickers! Blond hair (although not too tall), hard to miss 😃
Tweet media one
5
4
74
@SanhEstPasMoi
Victor Sanh
1 year
We'll open-source our work soon 💪, with hopefully no forms involved! 🤗 Stay tuned!
7
2
74
@SanhEstPasMoi
Victor Sanh
4 years
When the training you launched a while ago (and that you forgot about) crashes, you're happy that it didn't crash silently... 🥶 Knockknock: Get notified when your training ends with only two additional lines of code. *10* different platforms supported!
Tweet media one
1
12
68
@SanhEstPasMoi
Victor Sanh
3 years
The accuracy of a shallow probe on top of a frozen BERT doesn't always reflect whether a property is captured. Our past summer intern Steven Cao w/ @srush_nlp at @huggingface will present a new perspective on probing using sparsity. Later today, Session 3C (2:40pm ET)! #NAACL2021
Tweet media one
1
15
67
@SanhEstPasMoi
Victor Sanh
2 years
Humbled to see @YejinChoinka ’s shout-out of our ethical charter in her #acl2022 keynote! For our new multimodal project at 🤗 @huggingface , we wrote down our ethical values from the start, instead of having these discussions at the end. 👉
Tweet media one
1
15
68
@SanhEstPasMoi
Victor Sanh
5 years
🚪👋✊ Introducing knockknock, a small lib to send you Slack or email notifications when your training ends or crashes for unexpected reasons. Integration only requires 2 additional lines of code!
2
14
67
@SanhEstPasMoi
Victor Sanh
8 months
Asking AI systems how to use a bike is a thing. Try out IDEFICS now! Fully open-access multimodal model and already rolled out!
Tweet media one
5
13
61
@SanhEstPasMoi
Victor Sanh
9 months
🐶 IDEFICS is a reproduction of Flamingo, a multimodal model developed by @GoogleDeepMind , which has not been released publicly. IDEFICS is built solely on publicly available data and models and is available in open-access! Here's IDEFICS' take on Barbieheimer and Barack Potter
Tweet media one
Tweet media two
2
10
61
@SanhEstPasMoi
Victor Sanh
5 years
ML pun: train a GAN to generate healthy food images and name it "V-GAN"
2
9
60
@SanhEstPasMoi
Victor Sanh
5 years
[Amazing community 🏘️] Knockknock has two additional supports: Microsoft Teams and text message! These are the contributions of @mediumnok and @abhi1thakur . Hat tip! 🚪✊Get notified when your training ends/crashes with only two additional lines of code:
Tweet media one
5
9
60
@SanhEstPasMoi
Victor Sanh
4 years
[1K+ 🌟- Amazing community] This milestone wouldn't be possible without the contributions of the community: @mediumnok , @abhi1thakur , and @valtterinpoika . Thank you!! 🤗🤗🤗 🚪✊
Tweet media one
3
10
57
@SanhEstPasMoi
Victor Sanh
4 years
Differences I’m having a hard time pronouncing as a non-native speaker, a sample: - Bowl - Ball - This is - Decease - Beer - Bear - Poor - Pour - Beach - Bitch - Bold - Bald - Sheet - Shit - Peace - Piss - Piece Just imagine all the embarrassing situations I can put myself in…
12
5
56
@SanhEstPasMoi
Victor Sanh
3 years
This is just the beginning of something *very* big. I'm constantly in awe of all the goodwill in our community. So happy we can be a platform to catalyze all these positive energies!
Tweet media one
@YJernite
Yacine Jernite
3 years
Seeing how many people have introduced themselves and started sharing resources in the new "Languages at @huggingface " section of the forum is bringing me so much joy 🌍🌏🌎🤗 Looking forward to helping build more community beyond English-language NLP!
1
16
85
1
9
55
@SanhEstPasMoi
Victor Sanh
5 years
I'll give a talk during this Meetup on how we push the latest NLP models into production at @huggingface . If you are in NYC, join us on March 7th!
@Cometml
Comet
5 years
RSVP for our next Meetup to hear speakers from @XaxisTweets , @huggingface , and @runwayml discuss productionizing machine learning models! See the full details for March 7th here:
1
4
12
2
9
54
@SanhEstPasMoi
Victor Sanh
2 months
Can you beat the AI at Raven puzzles? How many puzzles can you answer correctly? 🎮 The most powerful vision+language AI systems like Gemini or GPT4V struggle with this problem when used out-of-the-box (). But when properly…
Tweet media one
Tweet media two
@YizheZhangNLP
Yizhe Zhang @ ICLR 2024
2 months
Thanks @_akhaliq for sharing our work. We evaluated SOTA VLMs on challenging Raven's Progressive Matrices. We found that VLMs still struggle to reach human-level performance, with perception being the main bottleneck.
1
22
82
3
16
55
@SanhEstPasMoi
Victor Sanh
4 years
I wanna visit the Paris office 🤗😱🤩
@ClementDelangue
clem 🤗
4 years
We moved in to a new office in Paris! Ft @MorganFunto
Tweet media one
Tweet media two
Tweet media three
Tweet media four
19
7
348
0
3
54
@SanhEstPasMoi
Victor Sanh
4 years
At the Paris office. Who’s there? #NLProc @LysandreJik @MorganFunto
Tweet media one
3
4
54
@SanhEstPasMoi
Victor Sanh
4 years
"Hugging Face (yes, that really is the name)" 🤗🤗🤗 A Survey of Deep Learning for Scientific Discovery, from @maithra_raghu & @ericschmidt
Tweet media one
0
10
51
@SanhEstPasMoi
Victor Sanh
19 days
Another useful resource for multimodal Llama 3: OBELICS. OBELICS is an open, massive, and high-quality collection of interleaved image-text web documents, containing 141M English documents, 115B text tokens, and 353M images, extracted from Common Crawl dumps between February 2020…
@SanhEstPasMoi
Victor Sanh
22 days
Can't wait to see multimodal Llama 3! We released a resource that might come in handy: The Cauldron🍯 The Cauldron is a massive, manually curated collection of 50 vision-language datasets for instruction fine-tuning. 3.6M images, 30.3M query/answer pairs. It covers a large…
Tweet media one
6
40
241
1
12
51
@SanhEstPasMoi
Victor Sanh
6 years
Everyone in the NLP research community should start watching/starring/bookmarking this awesome repo !
@seb_ruder
Sebastian Ruder
6 years
Do you often find it cumbersome to track down the best datasets or the state-of-the-art for a particular task in NLP? I've created a resource (a GitHub repo) to make this easier.
19
431
1K
0
8
51
@SanhEstPasMoi
Victor Sanh
9 months
Working on releasing a new open-access model while leveling up my meme game 🤡
Tweet media one
7
6
48
@SanhEstPasMoi
Victor Sanh
1 year
@hardmaru @StabilityAI "Grant us the gift of low latency, And protect us from the curse of communication bottlenecks."
Tweet media one
2
7
50
@SanhEstPasMoi
Victor Sanh
10 months
This made my day. AI for architects! 🗼🗽🏙 I helped an architect friend to use an AI rendering engine based on a YT tutorial from @design74043 . From a rough sketch to a realistic mock-up in a few seconds. All of that based solely on open-access models!
Tweet media one
4
7
48
@SanhEstPasMoi
Victor Sanh
3 years
That’s sweet 🤗🤗🤗
@huggingface
Hugging Face
3 years
We are honored to be awarded the Best Demo Paper for "Transformers: State-of-the-Art Natural Language Processing" at #emnlp2020 😍 Thank you to our wonderful team members and the fantastic community of contributors who make the library possible 🤗🤗🤗
Tweet media one
30
139
995
4
0
48
@SanhEstPasMoi
Victor Sanh
5 years
If you're at #NeurIPS2018 or in Montreal, stop by! I'll be presenting our latest Multi-Task Learning architecture: HMTL (w/ @Thom_Wolf and @seb_ruder ).
@Rasa_HQ
Rasa
5 years
If you are in Montreal, join us tomorrow for a special edition of the Montreal chatbots meetup! @alanmnichol will be speaking about the new embedding policy (presented at #NeurIPS ) and how to use it in Rasa Core
0
4
17
1
9
46
@SanhEstPasMoi
Victor Sanh
4 years
This is a colossal effort 💪 First release. ~100 datasets. Same standard API. This is part of a broader vision of facilitating every single aspect of the NLP pipeline for research & engineering. Amazing job @Thom_Wolf @dramemariama20 @qlhoest @PatrickPlaten @julienplu @julien_c 🏆
@Thom_Wolf
Thomas Wolf
4 years
Surviving every AI wave, two kernels have consistently been the beating hearts of Natural Language Processing: Datasets and Metrics Today we release "nlp", a library to easily share & load data/metrics already providing access to 99+ datasets! Try it👉
Tweet media one
17
408
2K
1
1
46
@SanhEstPasMoi
Victor Sanh
2 months
Left: a synthetically generated HTML + Tailwind CSS website Right: its rendered screenshot This is a sample from WebSight v0.2, a collection of 2M of these pairs. It allows bootstrapping the training of multimodal LLMs on the task of converting UI screenshots into code. All of these…
Tweet media one
Tweet media two
1
7
47
@SanhEstPasMoi
Victor Sanh
9 months
🕵️‍♂️ Exploring such a large dataset requires tooling. With @nomic_ai , we built an interactive visualization that allows navigating OBELICS' content.
Tweet media one
2
12
44
@SanhEstPasMoi
Victor Sanh
1 year
Context: Flamingo is a model that combines a pretrained vision encoder with a pretrained language model by gluing some newly initialized parameters in between the two. The training resembles pure LM training because the objective is essentially next-token prediction.
2
0
45
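The "gluing newly initialized parameters between a frozen vision encoder and a frozen LM" described above can be sketched as a gated cross-attention layer. This is a conceptual single-head NumPy toy, not the actual Flamingo/IDEFICS code: variable names and shapes are illustrative, and the key detail is the tanh gate initialized at zero, so training starts from the intact language model:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # toy hidden size

def gated_cross_attn(h, vis, Wq, Wk, Wv, gate):
    """Newly initialized block letting frozen-LM hidden states `h` attend
    to frozen vision features `vis`. With gate = 0, tanh(gate) = 0 and the
    layer is the identity: the combined model starts as the pure LM."""
    q, k, v = h @ Wq, vis @ Wk, vis @ Wv
    scores = q @ k.T / np.sqrt(d)
    p = np.exp(scores - scores.max(-1, keepdims=True))
    p = p / p.sum(-1, keepdims=True)
    return h + np.tanh(gate) * (p @ v)  # gated residual injection

h = rng.normal(size=(5, d))    # 5 text tokens from the frozen LM
vis = rng.normal(size=(3, d))  # 3 visual tokens from the frozen encoder
Wq, Wk, Wv = (0.02 * rng.normal(size=(d, d)) for _ in range(3))
out = gated_cross_attn(h, vis, Wq, Wk, Wv, gate=0.0)  # identity at init
```

Only the new `Wq/Wk/Wv` and `gate` parameters receive gradients; the next-token prediction objective then gradually opens the gate and blends visual information in.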
@SanhEstPasMoi
Victor Sanh
4 years
It was so much fun @bhutanisanyam1 ! Thanks again! We talk about the vision and the culture behind our open-source efforts at @huggingface with one recent example: DistilBERT! Tune in! 🤗🤗 Also I’ll be at @NeurIPSConf next week to talk about DistilBERT more deeply! Say hi 👋 !
@bhutanisanyam1
Sanyam Bhutani
4 years
Really excited to finally share my interview with @SanhEstPasMoi where we talk all about DistilBERT, NLP Research at @huggingface and Research in Machine Learning. Audio: Video:
0
7
57
1
7
43
@SanhEstPasMoi
Victor Sanh
4 years
Paper: Code & Weights will be released very soon! Stay tuned! In the meantime, here’s a sneak peek at the memory size compressions: [7/7]
Tweet media one
5
0
41
@SanhEstPasMoi
Victor Sanh
4 years
1K+ machine translation models available through the usual user interface! Tremendous resource based on @marian_nmt!
@huggingface
Hugging Face
4 years
Let’s democratize NLP for all languages! 🌎🌎🌎 Today, with v2.9.1, we are releasing 1,008 machine translation models, covering 140 different languages, trained by @jorgtiedemann with @marian , ported by @sam_shleifer . Find your language here: [1/4]
Tweet media one
20
352
1K
0
7
41
@SanhEstPasMoi
Victor Sanh
4 years
Thanks for having us and the interesting discussion! We were honored to be guests on arguably the best ML/NLP podcast out there!
@nlpmattg
Matt Gardner
4 years
#nlphighlights 104: @SanhEstPasMoi and @Thom_Wolf talk to us about model distillation, when you try to approximate a large model's decision boundary with a smaller model. After talking about the general area, we dive into DistilBERT.
2
24
141
0
6
42
@SanhEstPasMoi
Victor Sanh
9 months
The adrenaline level just before a release
3
1
41
@SanhEstPasMoi
Victor Sanh
6 months
Very long image captions are extremely hard (and costly) to obtain. Looks like a great resource!
@_akhaliq
AK
6 months
ShareGPT4V: Improving Large Multi-Modal Models with Better Captions paper page: In the realm of large multi-modal models (LMMs), efficient modality alignment is crucial yet often constrained by the scarcity of high-quality image-text data. To address…
Tweet media one
4
55
258
0
16
35
@SanhEstPasMoi
Victor Sanh
3 years
We are kicking off a weird one-year-long collaborative endeavor in NLP 👀👀
@Thom_Wolf
Thomas Wolf
3 years
The LHC involves 10.000 researchers from 100 countries In many scientific fields, worldwide research collaborations create tools useful for the entire research community: LHC, ITER, ISS.. Maybe it's time to build similar large, diverse, open research collab in AI/NLP as well 👇
9
63
328
0
2
40
@SanhEstPasMoi
Victor Sanh
4 years
#ICLR2020 You should absolutely join the ICLR virtual town. It is reallyyyy fun. Gonna leave my camera on.
1
3
40
@SanhEstPasMoi
Victor Sanh
9 months
Typical Wednesday in New York. Wait for Friday yoga.
Tweet media one
0
2
36
@SanhEstPasMoi
Victor Sanh
4 years
I'm (virtually) attending #acl2020nlp this week🏠 Already had a blast at the tutorials yesterday🤯 Hit me up (DM or RocketChat) if you wanna chat! Btw, we have created a #huggingface -ama channel if you want to interact with the broader @huggingface team!
1
2
37
@SanhEstPasMoi
Victor Sanh
4 years
Is training 1.1 billion parameters still considered large scale in August 2020? #absurdquestion
Tweet media one
4
3
35
@SanhEstPasMoi
Victor Sanh
4 years
I can’t recall the last time I accepted an incoming LinkedIn connection request from someone I actually know.
2
0
33
@SanhEstPasMoi
Victor Sanh
3 years
I think it's worth trading 5 mins of your time for the 100s of hours of coding that the Transformers library has saved you in the past 2+ years ⌚️🙃
@huggingface
Hugging Face
3 years
🤗Transformers will pass 40 000 Github 🌟 this week🤯 What a crazy journey! Our libraries are all about the community and we need your input to define the direction of the next 40k stars Take 5 minutes to choose the future of the library here 👇
0
44
195
0
2
33
@SanhEstPasMoi
Victor Sanh
5 years
@SanhEstPasMoi
Victor Sanh
5 years
There is a trend for huge Transformers. We went the other way: decreasing the size! 🤗 Introducing DistilBERT: a smaller, faster, cheaper, lighter BERT trained w/ distillation! 95% of BERT's GLUE perf w/ 66M parameters. 📃: 💻:
Tweet media one
23
450
1K
1
3
33
@SanhEstPasMoi
Victor Sanh
5 years
1/ Find a good name for a model. Something that sparks joy in you. Something you can easily say multiple times in a talk. 2/ Build the research direction that fits this name. 🤯🤣
3
6
34
@SanhEstPasMoi
Victor Sanh
4 months
It's 2024! Let's get multimodality fired up! 🔥
1
4
34
@SanhEstPasMoi
Victor Sanh
2 years
It is very special to see this coming together beautifully with the model release in open-access🌸 It will be even more beautiful to see what the community builds with BLOOM! 🤗
@BigscienceW
BigScience Research Workshop
2 years
BLOOM is here. The largest open-access multilingual language model ever. Read more about it or get it at
Tweet media one
29
815
3K
2
0
33