Omar Sanseviero

@osanseviero

31,744
Followers
2,203
Following
1,162
Media
9,193
Statuses

Chief Llama Officer @huggingface 🦙 Founder @AI_Learners . Xoogler (SWE @Google Assistant, 20% PM TF Graphics). 100% Hacker Llama🇵🇪🇲🇽

Zurich
Joined October 2010
@osanseviero
Omar Sanseviero
2 years
My Machine Learning projects these days
88
1K
6K
@osanseviero
Omar Sanseviero
2 years
1 year doing Open Source for a living
Tweet media one
65
125
3K
@osanseviero
Omar Sanseviero
1 year
Very excited to share some personal news! @johnowhitaker @pcuenq @multimodalart and I are writing a book with @oreilly about generative ML🤗 We'll cover many topics from theory and practical aspects, discuss creative applications, and more! What topics would you like to see?
Tweet media one
98
313
2K
@osanseviero
Omar Sanseviero
1 year
Graphs are **everywhere**, from social media and knowledge systems to molecules and meshes! 🧑‍🏫 Want to learn about Machine Learning for Graphs? Check out this thread! 🧵
Tweet media one
24
354
2K
@osanseviero
Omar Sanseviero
2 years
🧵Stable Diffusion weights are officially public, and we got some surprises! 🤗 🤗 Public weights 🧨Support with the diffusers library 🔥Load and use the model with a few lines of code 📖Blog post explaining how it works
Tweet media one
15
356
2K
@osanseviero
Omar Sanseviero
4 months
The ML ecosystem in France is on fire🔥 It has amazing talent and resources. Here are 10 facts you might not know: 1. There are great research labs - from @MistralAI and @kyutai_labs to large ones from @AIatMeta and @GoogleDeepMind . The Llama 2 and CodeLlama authors are based in…
Tweet media one
45
293
2K
@osanseviero
Omar Sanseviero
4 months
Have you used transformers but not fully grasped how they work internally?👀 Welcome the Random Transformer, a step-by-step walkthrough doing the math of the transformer model. Kick off your year understanding what's going on under the hood.
Tweet media one
Tweet media two
15
277
1K
@osanseviero
Omar Sanseviero
1 year
💫StarCoder: May the Source be with You! 🔥15B LLM with 8k context 🥳Trained on permissively-licensed code 💻Acts as tech assistant 🤯80+ programming languages 🚀Open source and data 💫Online demos 🧑‍💻VSCode plugin 🪅1 trillion tokens Follow this amazing experience! 🧵
Tweet media one
26
252
1K
@osanseviero
Omar Sanseviero
4 years
I just completed one year working at Google Switzerland. I'm generally introverted and don't share much, but today I'd like to share a bit. Starting a thread.💻
36
103
1K
@osanseviero
Omar Sanseviero
2 years
Classical software engineering issue 🙃 You can either read the documentation or... just try until it works. 📘 Just trying is the way 🤖
Tweet media one
15
118
1K
@osanseviero
Omar Sanseviero
2 years
Yesterday @GoogleAI released Flan T5, a model that can solve 1800 different tasks 🤯 Since then - Open-source models: - Support in @huggingface transformers - An interactive demo I created to play with the model
22
149
1K
@osanseviero
Omar Sanseviero
3 years
I'm super excited to share that today I'm joining @huggingface 🤗 as a Machine Learning engineer in the Open Source team. Looking forward to contributing to the community and democratizing ML 🚀🚀🚀
Tweet media one
36
46
972
@osanseviero
Omar Sanseviero
1 year
I'm happy to share a new personal milestone! This week I reached 500 Untitled @GoogleColab notebooks 🔥📓 My personal favorites are 42, 124 and 359!
Tweet media one
39
33
993
@osanseviero
Omar Sanseviero
7 months
Just a beautiful medieval town
Tweet media one
16
57
967
@osanseviero
Omar Sanseviero
5 months
LLM Visualization: Amazing animated interactive tutorial to learn about GPT internals
Tweet media one
5
183
917
@osanseviero
Omar Sanseviero
2 years
🧵Thread on images generated with free Open Source tools. No coding is needed to try them out! "a monkey head that is only made out of avocado, 3D" by dalle mini
Tweet media one
27
146
832
@osanseviero
Omar Sanseviero
1 year
Ultralytics YOLOv8 is out and open-source! Open demo here 👇
Tweet media one
13
155
817
@osanseviero
Omar Sanseviero
2 years
Six open-source ML demos from the last 6 days 🔥 1. Stable Diffusion Infinity 🎨 Outpaint Stable Diffusion on an infinite canvas Demo GitHub by @lkwq007
7
163
821
@osanseviero
Omar Sanseviero
2 months
WebGPU will change ML 🤯 With the recent release of ONNX Runtime with WebGPU, in-browser ML is about to change. We can now fully leverage GPUs to run ML models (think of Phi, SD, etc.) entirely in the browser. Benchmark on my computer: 40x faster ⚡️
Tweet media one
13
142
827
@osanseviero
Omar Sanseviero
1 year
GPT 4, Claude, Alpaca, ChatGLM, PALM API... Can we pause for today? Too much to look at! 🙏
21
87
792
@osanseviero
Omar Sanseviero
2 years
Personal News! 🥳 For the last 3 years I did an online ML master's at Georgia Tech in my free time, and last week I completed the program.💥 🎉🎓✨🍾
49
13
786
@osanseviero
Omar Sanseviero
1 month
Introducing: Zephyr 141B-A35B 🥁 🔥Mixtral-8x22B fine-tune 🤯 Using ORPO: new alignment algorithm (no SFT, open ) 🚀 With 7k instances of (open) data Very strong IFEval, BBH, AGIEval... Enjoy! 🤗
15
144
732
@osanseviero
Omar Sanseviero
3 years
Combine the power of a simple PyTorch model, CLIP, and @Gradio and you get: Draw To Search! 🎨🧑‍🎨 Try it out here! Find images from movies in a different way 🔥 #model_of_the_day
13
159
719
@osanseviero
Omar Sanseviero
8 months
With people doing 🤯 spirals and squares with Stable Diffusion, I decided to do the llama version 🦙 Thoughts?
Tweet media one
Tweet media two
Tweet media three
Tweet media four
38
54
697
@osanseviero
Omar Sanseviero
1 year
OpenAI released Point-E, a text-to-3D (point clouds) demo 🔥 You can check out an open-source demo for it at 🤯Enjoy! The demo uses the lower-quality but much faster version of the model.
Tweet media one
18
165
686
@osanseviero
Omar Sanseviero
5 months
Prediction: We're about to face the longest AI Winter in a long, long time❄️ 😱 Starting this weekend, many people will be taking time off to celebrate the new year. We might not have SOTA releases for two or three days📉
32
56
662
@osanseviero
Omar Sanseviero
4 months
You keep reading about sentence embeddings, but you might still not know exactly what they are. You are not alone! 🤗 I wrote a step-by-step walkthrough with code, math, applications, and memes. Kick off your year understanding what embeddings are!
Tweet media one
Tweet media two
Tweet media three
16
100
651
@osanseviero
Omar Sanseviero
1 year
Image Editing with Instructions🔥 Input an image, give an instruction ("remove the boy with blue backbag"), and get your image edited. Amazing what you can do with open-source tools 🤯
Tweet media one
Tweet media two
16
93
624
@osanseviero
Omar Sanseviero
5 months
11 Machine Learning books for your 2024 🧠Foundations 1. Deep Learning: Foundations and Concepts (Bishop & Bishop, 2023) 2. Mathematics for Machine Learning (Deisenroth, Faisal, Ong, 2020) 3. Probabilistic Machine Learning (Murphy, 2012-2023) 4. Linear Algebra and Learning from…
16
121
599
@osanseviero
Omar Sanseviero
1 year
Are you overwhelmed by everything happening in the ML ecosystem? We're doing a small crowdsourced initiative with a high-level distillation+timeline of cool big things happening in the ML landscape. Feel free to contribute! 🤗
Tweet media one
17
112
582
@osanseviero
Omar Sanseviero
2 years
[1/n] So you train a Machine Learning model. Is it good? 🤷Who knows! Let's explore some ways to do error analysis! 🧠 👇
7
92
554
@osanseviero
Omar Sanseviero
2 months
NVIDIA, Hugging Face, and ServiceNow release TheStack v2, a massive code dataset🌸 - 37 TB of de-duplicated code - 913B tokens - 619 programming languages Blog post Data
4
115
561
@osanseviero
Omar Sanseviero
9 months
Welcome Candle, minimalistic ML framework in Rust 🕯️ 🦙Whisper, Llama, Falcon, Bert, StarCoder 🖥️Run models in the browser with WASM ✨Flash Attention 💾Dataset loaders All in Rust.
Tweet media one
11
128
553
@osanseviero
Omar Sanseviero
2 years
What can you find at @huggingface ? - 26000 models for NLP, Audio, Computer Vision and Tabular Data - 2500 datasets for different domains - Over 1500 ML demos All shared by the community, open to everyone, open source. 🤗 The time of closed ML should be over.
2
88
552
@osanseviero
Omar Sanseviero
4 months
🚨Breaking news🚨 Claude 5 inference code leaked on Reddit (r/LocalLlama)
Tweet media one
18
31
550
@osanseviero
Omar Sanseviero
1 year
You can now transcribe audio with Whisper 70 times faster than the original implementation! 🔥 Transcribe 2-hour movies in 2 minutes!🤯For free, with Open Source tools! Kaggle notebook (free TPUs) GitHub repo:
Tweet media one
5
128
530
@osanseviero
Omar Sanseviero
1 month
Apropos of nothing, here is a mini-tutorial on three types of Mixture of Experts (MoE): Pre-trained MoE, upcycled MoEs, and FrankenMoEs. MoE refresher MoEs replace the feed-forward layers with sparse MoE layers. These layers contain a certain number of experts (e.g. 8), each one…
Tweet media one
Tweet media two
Tweet media three
7
122
519
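A minimal sketch of the sparse MoE layer idea from the refresher above, in plain Python. It assumes a linear router and top-2 routing over 8 toy experts, with a softmax over the selected experts' scores as one common gating choice. The names sparse_moe_layer, router_weights, and the toy experts are hypothetical illustrations, not code from the thread or from any library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_moe_layer(token, router_weights, experts, top_k=2):
    """Route one token vector to its top_k experts and mix their outputs.

    token: list of floats (the hidden state of a single token)
    router_weights: one row of floats per expert (hypothetical linear router)
    experts: list of callables standing in for feed-forward blocks
    """
    # The router produces one score (logit) per expert.
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    # Sparse part: only the top_k highest-scoring experts are evaluated.
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Gate values are renormalized over the chosen experts only.
    gates = softmax([logits[i] for i in chosen])
    output = [0.0] * len(token)
    for gate, index in zip(gates, chosen):
        expert_output = experts[index](token)
        output = [o + gate * e for o, e in zip(output, expert_output)]
    return output, chosen


# Toy setup: 8 "experts" that just scale the input by 1..8 (illustration only).
toy_experts = [lambda t, scale=i + 1: [scale * x for x in t] for i in range(8)]
toy_router = [[0.1 * i, 0.0] for i in range(8)]

mixed, used = sparse_moe_layer([1.0, 2.0], toy_router, toy_experts)
print("experts used:", used)
print("output:", mixed)
```

Only 2 of the 8 experts run for this token, which is why the activated parameter count of an MoE is much smaller than its total parameter count.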
@osanseviero
Omar Sanseviero
2 years
I have a Colab problem 😅
Tweet media one
29
20
499
@osanseviero
Omar Sanseviero
5 months
How it feels these days
Tweet media one
9
39
496
@osanseviero
Omar Sanseviero
6 months
The top 15 most-liked organizations on @huggingface 1. @StabilityAI 20k likes 2. @AIatMeta 20k 3. @runwayml 11k 4. CompVis 10k 5. @thukeg 7k 6. @BigscienceW 7k 7. @TIIuae 7k 8. @Microsoft 6.5k 9. @GoogleAI 6k 10. @OpenAI 4k 11. @BigCodeProject 4k 12. @MosaicML 4k 13. @UKPLab 3k…
10
107
467
@osanseviero
Omar Sanseviero
4 months
Spoiler alert: this might be one of the most exciting weeks for code LLMs since Code Llama
28
33
481
@osanseviero
Omar Sanseviero
4 months
PhotoMaker by Tencent is out Customize photos within seconds, no LoRA training Demos - Realistic - Stylization
Tweet media one
15
78
479
@osanseviero
Omar Sanseviero
1 year
🚨Free @huggingface courses🚨 Check out our new page with high-quality material 🤗 - NLP and RL courses - Notebooks, interactive demos, videos - Lots of exciting things are coming soon! Which topics would you like to see here?
20
108
474
@osanseviero
Omar Sanseviero
3 years
Physics-based Deep Learning Book 🤯🤩 - Differentiable physics and simulations - Reinforcement Learning for inverse problems - and more! Check it out!
Tweet media one
2
91
466
@osanseviero
Omar Sanseviero
1 month
Open ML is going brrr. In just 5 days 🧱 Databricks releases DBRX 🦾 Mistral releases 7B v2 🚀Qwen1.5 MoE-A2.7B 🐍Jamba, a MoE SSM LLM 🤏Wild 1-bit and 2-bit quantization with HQQ+ 3 big pre-trained MoEs, a new Mistral base, and crazy updates for on-device. Let's goo 🚀
8
75
474
@osanseviero
Omar Sanseviero
8 months
Falcon 180B is out🤯 - 180B params - Trained on 3.5 trillion tokens and 7 million GPU hours - Quality on par with PaLM 2; outperforms Llama 2 and GPT-3.5 across 13 benchmarks - 4bit and 8bit precision with similar quality Demo: Blog:
12
105
450
@osanseviero
Omar Sanseviero
3 months
LLaVA 1.6 is out! 🥳 - Outperforms Gemini PRO on some benchmarks - Higher resolution than LLaVA 1.5 (up to 4x more pixels!) - Better OCR capability and instruction-following - More conversational Models: Blog:
Tweet media one
7
85
451
@osanseviero
Omar Sanseviero
5 months
Mixture of Experts Explained, a deep dive into MoEs, their challenges, scaling them up, and their state in the Open Source world.
Tweet media one
5
105
420
@osanseviero
Omar Sanseviero
2 years
Some extremely exciting news!🤗 We are raising $100 million looking forward to building the future of open Machine Learning, from Computer Vision to Reinforcement Learning. The future of ML is collaborative.🤗 Jobs: Announcement:
13
50
444
@osanseviero
Omar Sanseviero
2 years
Looking forward to the next year: 🎻 Fast-to-generate, synthetic music ☁️ Diffusion models applied for 3D objects (meshes, point clouds) 📽️ High-quality artificial video generation (with audio) All open source and accessible to the community 🤗
7
36
435
@osanseviero
Omar Sanseviero
3 years
🧠 Would you like to learn about Graph Neural Networks? Stanford is organizing a workshop with leaders of academia and industry. 🤖 The schedule looks amazing and will be streamed for free! You can register at
Tweet media one
4
110
430
@osanseviero
Omar Sanseviero
1 month
These are the faces of people that open source 🤗 Amazing to be with @jefrankle from Databricks, @sarahookr from Cohere, @sophiamyang from Mistral, @dvilasuero from Argilla, and @_lewtun from Hugging Face
Tweet media one
48
27
387
@osanseviero
Omar Sanseviero
3 months
OpenBMB, the creators of UltraFeedback, silently released a series of very strong edge models! - 2.4B base model close to Mistral 7B - 2.4 DPO outperforming Llama 70B on MT Bench - A 3B bilingual VLLM (+12B version RLHF VLLM) Check the models at 🚀
2
82
436
@osanseviero
Omar Sanseviero
1 year
Each week the @huggingface Spaces of the week look more 🔥 - AudioLM: Text to Audio - Text to Motion - BioGPT - BLIP2: Image to text - Instruct Pix2Pix - CoCa: Image captioning - Lip Movement Recognition - GLIGen for text-to-image Check them out in
Tweet media one
Tweet media two
4
80
422
@osanseviero
Omar Sanseviero
4 months
Tweet media one
9
58
427
@osanseviero
Omar Sanseviero
6 months
From r/LocalLlama New Claude 2.1 refuses to kill a python process 🙃
Tweet media one
26
48
417
@osanseviero
Omar Sanseviero
6 months
ChatGPT vs HuggingChat
Tweet media one
Tweet media two
29
46
418
@osanseviero
Omar Sanseviero
4 months
Ready for January! 🔥
Tweet media one
6
21
418
@osanseviero
Omar Sanseviero
2 months
Introducing: Zephyr Gemma! The community has struggled to do a good preference-tune of Gemma, so we built an open-source recipe and trained a model to help people get started. Model: Demo: Handbook:
Tweet media one
6
61
303
@osanseviero
Omar Sanseviero
1 year
Open Source ML is going brrr 🚀 Here are 5 amazing free OS demos 1⃣Stable Diffusion + ControlNet for hand control Demo: Look at those fingers! 😍 ⬅️+prompt = Input ➡️ = Output
Tweet media one
Tweet media two
8
79
408
@osanseviero
Omar Sanseviero
24 days
Dear Open Source Community, I'm about to have a trip of 19 hours with no internet access (going from Europe to South America). Please hold off on releasing new SOTA models for a bit. I'll be back soon. Appreciated, Hacker Llama🤗
29
20
403
@osanseviero
Omar Sanseviero
4 months
MoEs paper list: a chronological, annotated curation of MoE-related papers. From Outrageously Large Neural Networks all the way to Mixtral, check it out!
11
68
405
@osanseviero
Omar Sanseviero
2 years
Did you know that @huggingface transformers has a new document-question-answering pipeline that lets you get insights on your documents and invoices with 3 lines of code? 🤯 Try it out in this colab ⤵️
Tweet media one
7
69
396
@osanseviero
Omar Sanseviero
1 year
200 likes, and we do a huge Open Source ML @huggingface Meetup in London
28
21
392
@osanseviero
Omar Sanseviero
9 months
Hugging Face in @Nasdaq tower in Times Square😍
Tweet media one
4
31
385
@osanseviero
Omar Sanseviero
2 months
Introducing...🥁🥁Mamba models are now compatible with 🤗transformers! Mamba models are super fast (scale well!) and have solid quality⚡️ - Generation utilities - PEFT fine-tuning - TRL support Check the repos here!
Tweet media one
9
67
395
@osanseviero
Omar Sanseviero
1 month
Happy Friday! New Gemma instruct model is out 🔥have a fun weekend! 🤗
6
62
393
@osanseviero
Omar Sanseviero
1 year
Did you know the @huggingface tokenizers lib is written in Rust? You can get huge speedups (e.g. @chainyo_ai recently tokenized his 40GB dataset in 5 minutes rather than 4 hours) 🔥
Tweet media one
9
48
388
@osanseviero
Omar Sanseviero
8 months
Am I doing this right? Try it at
Tweet media one
Tweet media two
Tweet media three
Tweet media four
12
33
392
@osanseviero
Omar Sanseviero
30 days
Welcome Zephyr 141B to Hugging Chat🔥 🎉A Mixtral-8x22B fine-tune ⚡️Super fast generation with TGI 🤗Fully open source (from the data to the UI)
Tweet media one
12
85
387
@osanseviero
Omar Sanseviero
2 months
Is working at Hugging Face worth it? - Open source - Lots of flexibility (in schedule/work topics) - No meetings+async - Team members from all around the world - Collaborative environment (internally and externally) - Competitive compensation - Keep at forefront of ML - Growth
20
24
391
@osanseviero
Omar Sanseviero
3 months
If you finetune Microsoft's Phi...is it called phinetuning? It seems the answer is yes! Check out this community tutorial about end-to-end phinetuning, from the dataset to the benchmarking!
5
60
383
@osanseviero
Omar Sanseviero
6 months
So a researcher at <BigTech> kinda DoS-ed one of our free APIs by abusing them to do benchmarking of Mistral, Llama, and a few other models 🙃 You would have expected them to have the hardware resources or at least pay for the API...Weren't they the GPU-rich?
15
18
376
@osanseviero
Omar Sanseviero
6 months
Want to learn about Q-Learning?🧠 Check out @ThomasSimonini's free open-source Deep Reinforcement Learning course, where you learn in a practical way about: - Deep Q-Learning - Policy Gradient - Actor-Critic Methods - PPO Check it out!
Tweet media one
8
77
376
@osanseviero
Omar Sanseviero
3 years
🧵ML interview questions of the week! Back to the basics. 1⃣ What is cross validation? Why is it needed? 2⃣🤯What is the curse of dimensionality? 3⃣♨️What is one-hot encoding?
7
51
375
@osanseviero
Omar Sanseviero
1 year
If anyone was affected by the BigTech layoffs and is ready to look for a new job: at @huggingface we're still hiring for all kinds of roles (from developer advocacy to ML engineering). We're looking for growth and generalist mindsets and people excited to work in OS ML. Ping me🤗
8
51
373
@osanseviero
Omar Sanseviero
1 month
"Can you figure out what the experts in a Mixture of Experts model are each specialized in?" Yes, this is touched on in the Mixtral paper (2024) and discussed quite extensively in the ST-MoE paper (2022), section 7. Also summarized in People's intuition…
@dwarkesh_sp
Dwarkesh Patel
1 month
. @_sholtodouglas poses a challenge. In the spirit of @natfriedman (whose Vesuvius Challenge was solved by a listener of my podcast - @LukeFarritor ). Can you figure out what the experts in a Mixture of Experts model are each specialized in? "A wonderful research project to do:…
19
43
492
2
74
375
@osanseviero
Omar Sanseviero
2 months
LLMs to write selenium for automated web actions🤯 Docs: - -
4
84
369
@osanseviero
Omar Sanseviero
6 months
So excited about the launch of Kyutai. What is it, and why does it matter: What is it? A strongly funded open science and open source research lab just announced in France. It will focus on high-quality open research, training, mentoring, and contributing to the AI ecosystem.…
Tweet media one
12
77
362
@osanseviero
Omar Sanseviero
1 month
CodeGemma is here 🔥Official model from Google with impressive code results for its size Three models - 2B for code generation and infilling - 7B for code infilling and natural language - 7B for instruct following Enjoy! 🤗
3
73
364
@osanseviero
Omar Sanseviero
2 years
At @huggingface , we're still hiring and have a strong remote-friendly, decentralized culture 🤗 Check some jobs at , but you can also apply in Wild Card if there's no good fit for your profile
Tweet media one
13
44
356
@osanseviero
Omar Sanseviero
2 years
The @huggingface RL Course just launched its first unit! 🔥 This is not the typical RL course. You get to train an agent, share it with the community, and compare your results on a leaderboard. Too bad I'm at the bottom 😅
4
49
357
@osanseviero
Omar Sanseviero
8 months
What happened in the open-source AI world in August? August is traditionally a slow month...but not for AI it seems! 👇Here is a recap! Code goes wild💻🦙 - Just 6 months after LLaMA, @MetaAI releases Code Llama, a family of LLMs for code . You can now…
Tweet media one
5
99
340
@osanseviero
Omar Sanseviero
2 years
So nice to finally meet part of the @huggingface team 🤩🤩🤩🤩🤩🤩🤩
Tweet media one
9
10
344
@osanseviero
Omar Sanseviero
3 years
Would you like to learn about Geometric Deep Learning? 🧠 This 12-hour lecture course part of @AIMS_Next in is an excellent opportunity to learn about it. Awesome job from @PetarV_93 , @mmbronstein , @TacoCohen and @joanbruna 🚀
Tweet media one
2
66
341
@osanseviero
Omar Sanseviero
4 months
E5 mistral-7b: New technique and SOTA model for text embeddings by Microsoft Paper: Model: Leaderboard: - Only trained on synthetic data (for 93 langs) - Decoder-only LLM 🤯(Mistral 7B fine-tune) - Tops…
Tweet media one
7
67
336
@osanseviero
Omar Sanseviero
2 years
🧵Many people are asking how they can contribute to Open Source. Here is a thread of different ways to do so! ⤵️
8
63
323
@osanseviero
Omar Sanseviero
1 month
Octopus-V2-2B A Gemma-based model trained for Android API - extremely fast, better than Llama+RAG, great results Paper: Model:
Tweet media one
6
63
323
@osanseviero
Omar Sanseviero
2 years
How fast has the @huggingface Hub grown since exactly a year ago? Models: 20k->150k (x7) Datasets: 5k->31k (x6) Spaces: 300->14k (x46 😅 but we were just launching) Bets for next year?
10
33
320
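A quick check of the growth multipliers quoted above, as a small Python sketch. The before/after counts are taken from the tweet; the helper name growth_factor is just an illustration.

```python
def growth_factor(before: int, after: int) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


# Counts as quoted in the tweet (one year apart).
hub_growth = {
    "Models": (20_000, 150_000),
    "Datasets": (5_000, 31_000),
    "Spaces": (300, 14_000),
}

for name, (before, after) in hub_growth.items():
    print(f"{name}: x{growth_factor(before, after):.1f}")
```

The exact ratios are about 7.5, 6.2, and 46.7, so the x7, x6, and x46 in the tweet are these values rounded down.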
@osanseviero
Omar Sanseviero
9 months
Welcome Code Llama to @huggingface ! 🔥 Blog post: Models: Code playground: Chat playground:
Tweet media one
9
70
318
@osanseviero
Omar Sanseviero
2 years
This weekend we achieved a new milestone at @huggingface . We reached 30000 public models! 🥳🚀 Let that number sink in. Thousands of individuals and organizations have made their ML models public and available to the whole ecosystem. 🎉🎉🎉
6
34
309
@osanseviero
Omar Sanseviero
3 years
Find a sentiment analysis model in @huggingface , create a @gradio app using Codex and test it out all in 30 seconds. Challenge accepted. ⚡️⚡️
12
54
311
@osanseviero
Omar Sanseviero
7 months
One step closer to 1-bit quantization🤯Hidden gem from last week with released code: Extreme low-bit quantization via partially binarized LLMs while maintaining performance (which previous binarization methods failed to do)
Tweet media one
7
60
306
@osanseviero
Omar Sanseviero
1 month
Proposal: with new MoEs, let's discuss less about the total number of experts, and instead focus on the two main things that we care about: - # of total params - # of default activated params Mixtral-8x7B -> Mixtral-47B-A12B Mixtral-8x22B -> Mixtral-141B-A35B
12
42
312
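The naming proposal above can be sketched as a tiny Python helper. The function names moe_name and active_fraction are hypothetical; the parameter counts are the ones given in the tweet (in billions).

```python
def moe_name(family: str, total_params_b: float, active_params_b: float) -> str:
    """Build a name of the form <family>-<total>B-A<active>B."""
    return f"{family}-{total_params_b:g}B-A{active_params_b:g}B"


def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of the parameters that are activated by default."""
    return active_params_b / total_params_b


# Examples from the tweet: total and default-activated params in billions.
print(moe_name("Mixtral", 47, 12))
print(moe_name("Mixtral", 141, 35))
print(f"{active_fraction(47, 12):.0%} of Mixtral-8x7B params active by default")
```

A name built this way shows both the total size of the model and its default activated size, which the expert count alone does not.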
@osanseviero
Omar Sanseviero
5 months
5 open-access video generation models 📹 1. Stable Video Diffusion (image-to-video) 2. LaVie (text-to-video) 3. SEINE (image-to-video) 4. Hotshot-XL (gifs) 5. ModelScope ttv The space is on 🔥Here is a blog post from May introducing the space
7
73
305
@osanseviero
Omar Sanseviero
5 months
Windows AI Studio Preview Official tool to fine-tune and test models locally, with coming support for Phi 2, RAGs and Windows Optimized Models
7
58
304
@osanseviero
Omar Sanseviero
3 months
Massive release: Qwen 1.5 is out! - Models from 0.5B to 72B - Chat models released - Very strong metrics (best base model, strong chat one!) - Support long contexts 30 new models are out! Enjoy 🚀
10
64
303
@osanseviero
Omar Sanseviero
1 month
Another week in Open ML 🥳 - Cohere Command R+ - Google Gemma Instruct 1.1 - Qwen 1.5 32B model family is out - JetMoE - Sailor: LMs for South-East Asia - Mixture of Depths replication - Two different bitnet 1.5 open-source replications Open ML going brrr 🚀
9
42
294