Cerebras

@CerebrasSystems

10,858 Followers
239 Following
588 Media
1,349 Statuses

Exaflops of AI compute that programs like a single accelerator. Try our models:

Sunnyvale, CA
Joined July 2016
Pinned Tweet
@CerebrasSystems
Cerebras
2 months
📣 ANNOUNCEMENT DAY AT CEREBRAS 📣 Today, we are thrilled to share some of the biggest announcements in our company’s history. 📢 Cerebras announces CS-3, the world’s fastest AI Chip with a whopping 4 trillion transistors 📢 Cerebras selects Qualcomm to deliver unprecedented…
16
59
307
@CerebrasSystems
Cerebras
1 year
🎉 Exciting news! Today we are releasing Cerebras-GPT, a family of 7 GPT models from 111M to 13B parameters trained using the Chinchilla formula. These are the highest accuracy models for a compute budget and are available today open-source! (1/5) Press:
32
341
1K
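The Chinchilla formula referenced above boils down to roughly 20 training tokens per model parameter. A rough back-of-envelope sketch of what that implies for a few of the model sizes in the family (the exact Cerebras-GPT token budgets may differ from these rounded figures):

```python
# Rough arithmetic only: the Chinchilla compute-optimal rule of thumb is
# about 20 training tokens per parameter. Actual Cerebras-GPT token
# budgets may differ from these rounded figures.
CHINCHILLA_TOKENS_PER_PARAM = 20

for params in (111e6, 1.3e9, 13e9):
    tokens = params * CHINCHILLA_TOKENS_PER_PARAM
    print(f"{params / 1e9:.3g}B params -> ~{tokens / 1e9:.0f}B training tokens")
```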
@CerebrasSystems
Cerebras
2 months
📣ANNOUNCING THE FASTEST AI CHIP ON EARTH📣 Cerebras proudly announces CS-3: the fastest AI accelerator in the world. The CS-3 can train up to 24 trillion parameter models on a single device. The world has never seen AI at this scale. CS-3 specs: ⚙ 46,225 mm2 silicon | 4…
Tweet media one
54
170
972
@CerebrasSystems
Cerebras
11 months
📣 New dataset drop! Introducing SlimPajama-627B: the largest extensively deduplicated, multi-corpora, open-source dataset for training large language models. 🧵
Tweet media one
14
191
684
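A minimal sketch for pulling the dataset from the Hugging Face Hub, assuming the dataset id "cerebras/SlimPajama-627B" and a "text" field per record:

```python
# Minimal sketch: stream a few SlimPajama records from the Hugging Face Hub.
# Assumes the dataset id "cerebras/SlimPajama-627B" and a "text" field per
# record; streaming avoids downloading the full corpus up front.
from datasets import load_dataset

ds = load_dataset("cerebras/SlimPajama-627B", split="train", streaming=True)
for i, example in enumerate(ds):
    print(example["text"][:200])
    if i >= 2:
        break
```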
@CerebrasSystems
Cerebras
10 months
Introducing BTLM-3B-8K: an open, state-of-the-art 3B parameter model with 7B level performance. When quantized, it fits in as little as 3GB of memory 🤯. It runs on iPhone, Google Pixel, even Raspberry Pi. BTLM goes live on Bittensor later this week! 🧵👇
Tweet media one
19
170
599
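Weights-only arithmetic that makes the "as little as 3GB" figure plausible, ignoring activations and KV cache:

```python
# Back-of-envelope memory for the weights of a ~3B parameter model at
# different precisions. Illustrative only: ignores KV cache, activations,
# and runtime overhead.
params = 3e9  # approximate BTLM-3B-8K parameter count

for name, bytes_per_param in (("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)):
    gib = params * bytes_per_param / 2**30
    print(f"{name}: ~{gib:.1f} GiB of weights")
```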
@CerebrasSystems
Cerebras
10 months
📣 Today we are announcing Condor Galaxy-1: a 4 exaflop AI supercomputer built in partnership with @G42ai . Powered by 64 Cerebras CS-2 systems, 54M cores, and 82TB of memory – it's the largest AI supercomputer we've ever built. But that's not all: CG-1 is just the start..
Tweet media one
18
84
369
@CerebrasSystems
Cerebras
5 months
Introducing gigaGPT: our implementation of @karpathy ’s nanoGPT that trains GPT-3 sized models in just 565 lines of code. 🤯 #NeurIPS2023
Tweet media one
8
34
373
@CerebrasSystems
Cerebras
1 year
Cerebras-GPT models have been downloaded over 130k times since our announcement and our 111M parameter model just crossed 85k downloads! We've already seen users enjoying this model in their local deployments! Models - Discord -
Tweet media one
6
56
308
@CerebrasSystems
Cerebras
1 year
Yesterday, we made available Cerebras-GPT, a family of seven GPT models ranging from 111M to 13B parameters, under the permissive Apache 2.0 license. This includes the models, training recipe, weights, and checkpoints - all of which can be accessed from Hugging Face and GitHub.
5
30
286
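A minimal sketch for trying one of the checkpoints with the transformers library, assuming the Hugging Face model id "cerebras/Cerebras-GPT-111M":

```python
# Minimal sketch: generate text with the smallest Cerebras-GPT checkpoint.
# Assumes the Hugging Face model id "cerebras/Cerebras-GPT-111M".
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cerebras/Cerebras-GPT-111M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Wafer-scale computing is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```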
@CerebrasSystems
Cerebras
1 year
We are excited to see a Cerebras-GPT model trending on Hugging Face! Check out our page on Hugging Face to see the entire family of models:
Tweet media one
5
40
263
@CerebrasSystems
Cerebras
5 years
Today is the perfect day to share our first tweet as we have announced the largest chip ever built to accelerate AI compute! Read all about the Cerebras Wafer Scale Engine (WSE) via @WIRED and @tsimonite
Tweet media one
21
111
235
@CerebrasSystems
Cerebras
2 months
By popular demand: Nvidia B200 plotted on our Moore's Law chart. It's pretty big for a GPU. But kinda small when you're used to wafer scale. 😅
Tweet media one
12
35
230
@CerebrasSystems
Cerebras
9 months
Cerebras BTLM-3B-8K model crosses 1M downloads🤯 It's the #1 ranked 3B language model on @huggingface ! A big thanks to all the devs out there building on top of open source models 🙌
Tweet media one
5
53
200
@CerebrasSystems
Cerebras
1 year
The AI industry is becoming increasingly closed. We believe in fostering open access to the most advanced models. Cerebras-GPT is being released under the Apache 2.0 license, allowing royalty-free use for research or commercial applications. (2/5)
Tweet media one
6
39
188
@CerebrasSystems
Cerebras
1 year
The Cerebras-GPT 111M parameter model has been downloaded over 175k times since our announcement! This is exciting! Check out our models here - Join our Discord here -
Tweet media one
3
41
181
@CerebrasSystems
Cerebras
10 months
Need best 7B-70B models? Use LLaMA 2. Need best 3B model? Use BTLM. That's it. That's the tweet.
6
40
177
@CerebrasSystems
Cerebras
10 months
We are excited to see BTLM-3B-8k trending on Hugging Face with over 250k downloads! BTLM-3B-8k outperforms models trained on hundreds of billions more tokens and achieves comparable performance to open 7B parameter models. Check out the model:
Tweet media one
4
40
165
@CerebrasSystems
Cerebras
9 months
BTLM-3B-8k is surging! This state of the art model is trending on Hugging Face with over 500k downloads in just over 2 weeks! Try out the model here:
Tweet media one
3
39
148
@CerebrasSystems
Cerebras
1 month
The Cerebras CS-3 redefines scalability in AI supercomputing. A 2048 CS-3 cluster can deliver an astounding 256 exaflops of AI compute. This makes it possible to train Llama2-70B in less than one day—a task that would take at least one month on gigantic GPU clusters. The entire…
Tweet media one
6
24
150
@CerebrasSystems
Cerebras
2 months
Cerebras + @Qualcomm to deliver unprecedented performance in AI inference. 📈 Up to a 10x performance improvement for large-scale generative AI inference when using Cerebras CS-3 for training and Qualcomm® Cloud AI 100 Ultra for inference. 📉 Radically lower inference costs…
Tweet media one
3
23
149
@CerebrasSystems
Cerebras
11 months
Why we built SlimPajama – it's all about training efficiency. Without de-duplication, a model would have to go through 1.2T tokens before seeing ~600B unique tokens. SlimPajama sees 600B tokens in half the time. That saves you 50% on compute costs!
Tweet media one
2
28
141
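A toy sketch of the idea behind the savings: drop exact duplicates by content hash so the model never re-trains on the same document. SlimPajama's actual pipeline is more involved (near-duplicate detection across multiple corpora), so treat this only as an illustration.

```python
# Toy illustration of deduplication by content hash. SlimPajama's real
# preprocessing is more sophisticated (near-duplicate detection across
# multiple corpora); this only shows why duplicates waste training tokens.
import hashlib

def dedup(docs):
    seen, unique = set(), []
    for doc in docs:
        h = hashlib.sha256(doc.strip().lower().encode()).hexdigest()
        if h not in seen:
            seen.add(h)
            unique.append(doc)
    return unique

corpus = ["the cat sat", "The cat sat", "a different document", "the cat sat"]
print(f"{len(dedup(corpus))} unique docs out of {len(corpus)}")
```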
@CerebrasSystems
Cerebras
10 months
Join us on Friday, July 28th, at 10:00 am PT as we host an AMA on our Discord server to talk about BTLM-3B-8K. This AMA will be hosted by Cerebras and Opentensor. Join our Discord here: Send yourself a calendar invite:
Tweet media one
6
34
125
@CerebrasSystems
Cerebras
1 year
Cerebras-GPT models are available now on Hugging Face. You can also test drive Cerebras CS-2 systems via our Model Studio on the cloud. (5/5)
3
16
119
@CerebrasSystems
Cerebras
1 month
8 Exaflops, 64 systems, 108TB of memory – say hello to the Condor Galaxy 3 (CG-3) – the first AI #supercomputer powered by the Cerebras CS-3, built with our strategic partner G42. CG-3 provides the highest density of AI compute for our next generation of #AI builders. Online Q2…
Tweet media one
5
31
116
@CerebrasSystems
Cerebras
8 months
We just dropped the BTLM-3B-8K paper on arXiv! It distills our recipe for training SOTA LLMs: - Extensively deduplicated dataset (SlimPajama) - Hyperparameter search using muP - Variable sequence length training + ALiBi - Aggressive LR decay
Tweet media one
2
27
116
@CerebrasSystems
Cerebras
4 months
Cerebras is proud to announce a multi-year collaboration with @MayoClinic as its first generative AI collaborator in the development of large language models ( #LLMs ) for medical applications. To create the first truly patient-centric healthcare AI, Mayo Clinic selected Cerebras…
3
24
114
@CerebrasSystems
Cerebras
1 year
Meet Andromeda, our 13.5 million core #AI supercomputer, now available for commercial & academic research. Andromeda can “perform 1 exaflop worth of AI computing - or at least one quintillion (10 to the power of 18) operations/second” @leejane71 @reuters
9
28
105
@CerebrasSystems
Cerebras
5 months
Open-source weights != open source. CrystalCoder is the first model released under the #LLM360 framework. LLM360 goes beyond #opensource and includes training code, data processing scripts, model checkpoints, and analytics. See the assets on Hugging Face:
1
17
105
@CerebrasSystems
Cerebras
28 days
TSMC thinks it will take the rest of the industry 6 more years to achieve ¼ of the transistors that we already have today. Would you like to enjoy 4 trillion transistors today on CS-3 or 1-trillion transistors on a GPU in 2030? Read TSMC’s full report ()…
Tweet media one
5
20
98
@CerebrasSystems
Cerebras
7 months
(1/2) We are excited to share that our paper "BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model" has been accepted at the Efficient Natural Language and Speech Processing (ENLSP-III) workshop for NeurIPS2023! Learn more about the workshop here:
Tweet media one
4
18
92
@CerebrasSystems
Cerebras
1 year
One notable output of Cerebras-GPT is a new scaling law that predicts model performance for a given compute budget. This is the first scaling law derived using a public dataset. (3/5)
Tweet media one
3
16
90
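Scaling fits of this kind typically take a power-law shape in compute. A sketch of the general form with placeholder coefficients (not the values fitted in the Cerebras-GPT work):

```python
# Generic power-law form used for compute scaling fits:
#   L(C) = a * C**(-b) + L_inf
# The coefficients below are placeholders for illustration, not the values
# fitted in the Cerebras-GPT work.
def predicted_loss(compute_flops, a=50.0, b=0.07, l_inf=1.7):
    return a * compute_flops ** (-b) + l_inf

for c in (1e19, 1e20, 1e21):
    print(f"{c:.0e} FLOPs -> predicted loss {predicted_loss(c):.2f}")
```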
@CerebrasSystems
Cerebras
5 months
Introducing BTLM-Chat at #NeurIPS2023 ! BTLM-Chat is our 3B parameter #LLM that provides performance improvements over our BTLM model on @Eleuther harness evaluation and minimizes negative impact on users or society. Try BTLM-Chat here:
3
28
88
@CerebrasSystems
Cerebras
1 year
Our research shows that smaller foundation models that are fine-tuned on domain-specific datasets can outperform larger foundation models. We show that a GPT-NeoX 1.4B model that is fine-tuned for 2,000 training steps can perform just as well as the out-of-the-box GPT-J 6B model.
Tweet media one
3
14
88
@CerebrasSystems
Cerebras
24 days
In Eric Savitz' latest article for Barron's, Cerebras is described as the most intriguing startup that is building AI supercomputers that rival NVIDIA. Highlights from the article: 🔥 Cerebras Wafer Scale Engine 3 (WSE-3) is 72 square inches and is the largest commercial chip…
Tweet media one
5
14
89
@CerebrasSystems
Cerebras
11 months
We recently announced the availability of SlimPajama - an open-source, cleaned, and deduplicated version of RedPajama-1T. It is half the size, trains twice as fast, and when upsampled performs as well as or better than RedPajama. See below for the dataset and preprocessing library.
Tweet media one
3
15
88
@CerebrasSystems
Cerebras
10 months
WIRED reports "The world now has a new fastest AI training supercomputer. Condor Galaxy, a network of 9 interconnected AI supercomputers, was brought to life by the US-based AI company Cerebras Systems together with G42, UAE-based technology holding group."
4
22
83
@CerebrasSystems
Cerebras
7 months
📣 Paper drop: Position Interpolation Improves ALiBi Extrapolation We found a simple method to 2x the context length of models that use ALiBi. This lets models like BTLM-3B-8K and MPT-7B-8K run high quality inference at up to 16K with no additional fine tuning. 👇
2
16
77
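A sketch of the mechanism as described: ALiBi penalizes attention scores with a per-head linear function of token distance, and rescaling those distances by train_length / target_length keeps them inside the range seen during training. This is an illustration of the idea, not the paper's exact recipe.

```python
# Sketch of ALiBi biases with interpolated positions. ALiBi adds a per-head
# penalty of -slope * |i - j| to attention scores; rescaling distances by
# (train_len / seq_len) keeps them inside the trained range when running
# longer sequences. Illustration only, not the paper's exact formulation.
import torch

def alibi_bias(seq_len, n_heads, train_len=None):
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)])
    pos = torch.arange(seq_len)
    dist = (pos[None, :] - pos[:, None]).abs().float()  # |i - j|
    if train_len is not None and seq_len > train_len:
        dist = dist * (train_len / seq_len)              # position interpolation
    return -slopes[:, None, None] * dist                 # shape: (heads, seq, seq)

print(alibi_bias(seq_len=16, n_heads=4, train_len=8).shape)  # torch.Size([4, 16, 16])
```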
@CerebrasSystems
Cerebras
1 year
The seven Cerebras-GPT models were trained on CS-2 systems using our simple, data-parallel Weight Streaming architecture, which allowed us to train these models in just a few weeks. (4/5)
Tweet media one
2
14
74
@CerebrasSystems
Cerebras
1 month
The Cerebras CS-3 is the world's most scalable AI supercomputer. From 125 petaflops to 256 exaflops, 1 terabyte of memory to 1 petabyte, it provides the fastest and simplest way to achieve high model utilization on supercomputing hardware.
Tweet media one
3
15
72
@CerebrasSystems
Cerebras
2 years
We are aware of a startup crypto mining company using a misappropriated, altered image of the Cerebras CS-2 product for their own business purposes. (1/2)
10
17
70
@CerebrasSystems
Cerebras
2 months
The Cerebras CS-3 is built to scale – a single CS-3 rack can be fitted with 1.2 petabytes of memory – 1000x more than GPU servers and large enough to train 24 trillion parameter models. Watch our CTO's keynote to learn more- #AI #ML #Supercomputers
Tweet media one
4
13
71
@CerebrasSystems
Cerebras
10 months
To enable long sequence applications for BTLM-3B-8K, we used ALiBi position embeddings. To improve extrapolation and reduce wall-clock training times, we used variable sequence length training (VSL). To learn more about VSL, check out our blog -
Tweet media one
1
20
72
@CerebrasSystems
Cerebras
9 months
“I don’t see any startups other than Cerebras as being able to compete with Nvidia for data-center training.” Karl Freund mentions this in Therese Poletti's latest article, where Cerebras is described as a 'disruptive technology for AI'. Read here:
2
16
61
@CerebrasSystems
Cerebras
4 years
Good morning world! We are proud to announce the Cerebras CS-1 system, a purpose-built high performance AI compute solution which houses the Wafer Scale Engine, the world's largest chip - #waferscale #buildbigchips #DeepLearning #AI #MachineLearning
5
34
64
@CerebrasSystems
Cerebras
9 months
(1/3) Cerebras, G42's Inception, and MBZUAI are pleased to announce Jais, the world’s best-performing Arabic LLM. Jais is a 13B parameter model that was trained on a new 395 billion token Arabic-English-Code dataset. Jais brings the power of Generative AI to 400m Arabic speakers
Tweet media one
4
14
64
@CerebrasSystems
Cerebras
1 year
Some #wafer-scale eye candy. When performance is everything, the best #memory hierarchy is no memory hierarchy at all! The Cerebras Wafer-Scale Engine has 40GB of #sram evenly distributed across 850,000 cores, for a staggering total of 2.6 trillion #transistors. Thanks, #tsmc!
Tweet media one
1
17
63
@CerebrasSystems
Cerebras
11 months
Our SlimPajama dataset crossed 2k downloads and is trending on Hugging Face! This dataset is 50% smaller than the original RedPajama 1.2 trillion token dataset. Interested in training a high quality LLM in less time? Check out the SlimPajama dataset:
Tweet media one
1
19
58
@CerebrasSystems
Cerebras
2 months
In the latest @Forbes article, @karlfreund writes "A faster chip, a faster cluster, and much faster time to deploy AI has helped Cerebras earn the support of organizations like the Mayo Clinic and Glaxo-Smith Klein." Read Karl’s article to learn why CS-3 is the AI Accelerator…
Tweet media one
3
6
56
@CerebrasSystems
Cerebras
2 months
Building large-scale training clusters from scratch and achieving high MFU (Model FLOPs Utilization) and reliability is very hard. Train 24 trillion parameter models across 2048 nodes with the simplicity of a single device on Cerebras CS-3 clusters. Watch our technical keynotes:…
Tweet media one
1
12
50
@CerebrasSystems
Cerebras
11 months
Cerebras is now the leading AI hardware startup as measured by research paper citations! Cerebras-GPT is just the start – we have so much cool research that we can't wait to share with you all. Stay tuned! h/t: @nathanbenaich
Tweet media one
4
16
55
@CerebrasSystems
Cerebras
3 years
We’re serving up a record-breaking chip. It contains 2.6 trillion transistors and covers more than 46,225 square millimeters of silicon, about the size of a dinner plate.
Tweet media one
4
14
54
@CerebrasSystems
Cerebras
1 month
In the fast-paced world of AI hardware, the Cerebras CS-3 and Nvidia DGX B200 are two of the most exciting new offerings to hit the market in 2024. Both systems are designed to tackle large-scale AI training, but they take decidedly different approaches. For ML researchers and…
Tweet media one
3
11
54
@CerebrasSystems
Cerebras
1 year
Cerebras-GPT is just the start. What models should we train next?
15
6
52
@CerebrasSystems
Cerebras
3 months
🎉 Cerebras overtakes peers as #1 AI Semiconductor Startup 🎉 In the freshly-updated 2023 State of AI Report Compute Index from Air Street Capital and Zeta Alpha, Cerebras is highlighted for leading all AI semiconductor startups in research publication counts, open source…
Tweet media one
1
14
52
@CerebrasSystems
Cerebras
9 months
(1/2) Cerebras CEO  @andrewdfeldman sits down with Anastasiia from @AnastasiInTech to discuss Condor Galaxy, the AI Supercomputer built in partnership with G42. Click here to see their conversation:
1
15
53
@CerebrasSystems
Cerebras
2 months
Our CEO @andrewdfeldman opened Cerebras AI Day with his keynote that covered: ⚙️ The historic growth of AI and the massive compute it requires ⚙️ The pain and complexity of distributed computing ⚙️ Why Cerebras went bigger to build the largest chips ever made for AI ⚙️ How our…
Tweet media one
1
11
51
@CerebrasSystems
Cerebras
7 months
(1/2) Cerebras, along with other prominent companies in the AI industry, has announced the establishment of the AI Platform Alliance, a consortium aimed at ensuring that AI platforms will be more open, efficient, and sustainable. Learn more here:
1
7
49
@CerebrasSystems
Cerebras
10 months
Reminder that at 10:00 am PT today, Cerebras and Opentensor will host an AMA on our Discord server to talk about BTLM-3B-8K. Come to ask questions, engage in a discussion, or simply enjoy the conversations! Join our Discord here:
Tweet media one
2
18
49
@CerebrasSystems
Cerebras
5 months
Cerebras led the way in open source AI development in 2023 – releasing complete recipes for model weights, training code, and data. Our ML team is constantly inventing new techniques to train LLMs for different languages and modalities – contact us to build your own custom model!
Tweet media one
0
6
49
@CerebrasSystems
Cerebras
3 years
We're proud to announce we've raised $250M in #SeriesF funding to accelerate our global business expansion and relentless pursuit of #AI innovation! #deeplearning
Tweet media one
3
17
47
@CerebrasSystems
Cerebras
6 months
Cerebras Software Release 2.0 is here - it's our biggest software release of 2023! 📈 50% faster training for LLMs 🚀 PyTorch 2.0: new LTC/Torch-MLIR stack for expanded programmability 🖼️ Diffusion Transformers: image generation for the first time on CS-2
Tweet media one
1
11
47
@CerebrasSystems
Cerebras
1 month
The secret of Cerebras’ architecture isn’t just our giant wafer – it’s our highly scalable wafer-scale cluster design. This means that whether you program 1 or 2048 nodes, the entire cluster appears as a single chip. No Megatron, no DeepSpeed, no sharding – it’s the speed of a…
Tweet media one
2
4
48
@CerebrasSystems
Cerebras
11 months
RedPajama-1T is the largest open dataset today but contains a large percentage of duplicates, making a full training run costly and inefficient. Like the Falcon team, we found data quality is just as important as quantity – which led to SlimPajama.
1
3
47
@CerebrasSystems
Cerebras
10 months
🙋‍♂️We are doing a battery of tests to evaluate BTLM-3B-8K against other long context models. Which models do you want to see comparisons for?
5
12
47
@CerebrasSystems
Cerebras
3 months
🌟 Today, in collaboration with Barcelona Supercomputing Center, we announced the open-source availability of FLOR-6.3B, an English, Spanish, and Catalan multilingual language model that is state-of-the-art in Catalan. 🚀 FLOR-6.3B was trained in just 2.5 days on the Condor…
Tweet media one
3
10
42
@CerebrasSystems
Cerebras
10 months
One of our favorite charts for BTLM-3B-8K: training the model on 1 to 16 CS-2 systems is as easy as turning a dial🎚️!
@CerebrasSystems
Cerebras
10 months
@opentensor BTLM was trained on the Condor Galaxy 1 supercomputer thanks to the support of G42 Cloud and IIAI. As is standard on Cerebras CS-2s, the training is purely data parallel: no Megatron/DeepSpeed needed. We easily spun the number of systems up and down with no interruptions or hardware failures.
Tweet media one
1
3
24
2
9
46
@CerebrasSystems
Cerebras
2 months
(1/n) Introducing MediSwift, the first suite of biomedical language models that employ sparse pre-training techniques to significantly reduce computational costs, while outperforming existing models up to 7B parameters on benchmark tasks such as PubMedQA. Paper:…
Tweet media one
2
9
44
@CerebrasSystems
Cerebras
1 month
🌟 Cerebras is thrilled to be selected on the 2024 Forbes AI 50! 🌟 Here are a few reasons why we made the cut: 🎉 Cerebras is the only AI chip startup on this year’s list. Learn more about our latest generation of hardware, the CS-3: 🎉 We enable top…
Tweet media one
12
7
39
@CerebrasSystems
Cerebras
10 months
G42 Cloud is the largest public cloud provider of the UAE. To expand its AI offering, we are planning not one but *nine* AI supercomputers. When complete in 2024, the full Condor Galaxy system will have 9 instances, 576 CS-2s, for a total of 36 exaFLOPs of AI compute. 🤯🤯
Tweet media one
1
2
43
@CerebrasSystems
Cerebras
10 months
@Yampeleg People are ripping out their 3B and even 7B models and dropping in BTLM-3B. It's that good!
7
20
44
@CerebrasSystems
Cerebras
6 months
(1/2) Argonne National Laboratory researchers demonstrate the Cerebras CS-2 is 130x faster than the Nvidia A100 on a nuclear physics simulation workload. Read the paper here:
Tweet media one
6
8
44
@CerebrasSystems
Cerebras
1 year
“[We’re] seeing graduate students trying to fit [LLMs] on laptop CPUs, and [we’re] seeing all sorts of enormous creativity in an effort to do what they can with the resources that are available to them.” Our CEO @andrewdfeldman describes the usage of Cerebras-GPT with @sallywf
1
0
43
@CerebrasSystems
Cerebras
1 month
Cerebras is proud to build and operate the world’s largest AI #supercomputers here in the USA! Condor Galaxy, built with our strategic partner G42, is among the largest #AI supercomputing installations in the world. The first two clusters are based in Santa Clara and…
Tweet media one
1
8
43
@CerebrasSystems
Cerebras
2 years
We wrote our @PyTorch implementation to take full advantage of our WSE’s 850,000 #AI -optimized cores, explains our Senior Director @EmadBarsoumPi . Read his blog to learn more: #pytorch #deeplearning
4
14
35
@CerebrasSystems
Cerebras
11 months
SlimPajama cleans and deduplicates RedPajama-1T, reducing the total token count and file size by 50%. It's half the size and trains twice as fast! It's the highest quality dataset when training to 600B tokens, and when upsampled it performs as well as or better than RedPajama.
Tweet media one
1
5
42
@CerebrasSystems
Cerebras
2 years
This image improperly passes off our CS-2 as their own product, as you can see by the Cerebras logo still visible at the bottom of the image. We are not working, or otherwise affiliated, with this company. We are taking steps to address the situation. (2/2)
4
5
41
@CerebrasSystems
Cerebras
6 months
11 days 06 hours 39 minutes 17 seconds until the Cerebras Team lands in NOLA for #NeurIPS2023 . We have a jam-packed week of papers, posters, workshops, and socials planned for the #ML community! Follow & join the fun: #WeAreCerebras
Tweet media one
0
8
41
@CerebrasSystems
Cerebras
3 years
Today we introduced the world’s first brain-scale #AI solution at @hotchipsorg 2021, enabling a single CS-2 to support 120 trillion parameter models. We continue to push the boundaries of what’s possible in AI & unlock extreme-scale model potential!
4
15
40
@CerebrasSystems
Cerebras
2 months
📣 Announcing Condor Galaxy 3 📣 Cerebras and @G42ai announced the build of Condor Galaxy 3 (CG-3), the third cluster of their constellation of AI supercomputers, the Condor Galaxy. CG-3 will be built using 64 Cerebras CS-3 systems and is designed and delivered in the United…
Tweet media one
0
4
39
@CerebrasSystems
Cerebras
9 months
Condor Galaxy 1 (CG-1) was used to train BTLM, the top-performing open-source 3B parameter LLM. Now, CG-1 has been used to train Jais, the best open-source Arabic model. CG-1 is moving the open-source community forward. Contact us to train on CG-1
Tweet media one
0
10
39
@CerebrasSystems
Cerebras
27 days
Congratulations to our strategic partner, @G42ai , on their partnership and $1.5B investment from @Microsoft . This marks a significant milestone in G42's journey as a global tech leader and advances their world-changing vision of accelerating AI development and global expansion…
Tweet media one
1
3
42
@CerebrasSystems
Cerebras
3 months
(1/n) Excited to announce the "Cerebras PyTorch Sparsity Library," which democratizes access to #sparse training for all #ML researchers and developers. Read more here:
Tweet media one
1
6
41
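The tweet does not show the library's interface, so here is only a plain-PyTorch sketch of what magnitude-based weight sparsity during training looks like; it is not the Cerebras PyTorch Sparsity Library API.

```python
# Plain-PyTorch sketch of static magnitude-based weight sparsity. This is
# NOT the Cerebras PyTorch Sparsity Library API; it only illustrates the
# general idea of training with a fixed sparsity mask.
import torch
import torch.nn as nn

def make_sparse(linear: nn.Linear, sparsity: float = 0.75):
    with torch.no_grad():
        flat = linear.weight.abs().flatten()
        threshold = torch.kthvalue(flat, int(sparsity * flat.numel())).values
        mask = (linear.weight.abs() > threshold).float()
        linear.weight.mul_(mask)                            # zero out pruned weights
    linear.weight.register_hook(lambda grad: grad * mask)   # keep pruned weights at zero
    return mask

layer = nn.Linear(512, 512)
mask = make_sparse(layer, sparsity=0.75)
print(f"weight density after pruning: {mask.mean().item():.2f}")  # ~0.25
```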
@CerebrasSystems
Cerebras
2 months
Our CEO @andrewdfeldman kicked off Cerebras AI Day to a standing room only audience. Andrew’s keynote covered: 1. The AI Capabilities Chasm 2. The GPU Challenge 3. Large Models train best on Large Chips. #AI #AIcompute
Tweet media one
Tweet media two
Tweet media three
1
8
40
@CerebrasSystems
Cerebras
2 years
With Cerebras, you can now easily train and reconfigure GPT-3 and GPT-J language models with up to 20 billion parameters on a single CS-2 system: #gptj #gpt3 #AI #deeplearning
Tweet media one
2
11
38
@CerebrasSystems
Cerebras
6 months
Cerebras ML researcher @vithursant19 will be presenting our paper, Sparse Iso-FLOP Transformations for Maximizing Training Efficiency, at the @NeurIPSConf Workshop on Advancing Neural Network Training (WANT). To our knowledge, this is the first work to demonstrate the use of…
Tweet media one
0
4
35
@CerebrasSystems
Cerebras
2 months
Anton Shilov of @tomshardware writes that "The CS-3 can be configured in clusters of up to 2048 systems. This scalability allows it to fine-tune 70 billion parameter models in just one day with a four-system setup, and to train a Llama 70B model from scratch in the same timeframe…
Tweet media one
0
10
36
@CerebrasSystems
Cerebras
9 months
BTLM was built in a collaboration with @opentensor and trained on the CG-1 supercomputer by @G42ai and Cerebras. It packs 7B performance in a 3B model. Learn more here:
0
9
38
@CerebrasSystems
Cerebras
10 months
Let's compare CG-1 with Nvidia's Israel-1. Israel-1 has 2048 H100 GPUs. But each GPU shows up as a unique device. It's your job to break apart your model and farm them out to each GPU. On Cerebras, 1 to 64 CS-2s show as one accelerator. Just implement mini-batches and train!
Tweet media one
3
3
36
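A conceptual sketch of what "just implement mini-batches and train" means from the user's side: an ordinary training loop with no tensor or pipeline sharding. Plain PyTorch for illustration, not the actual Cerebras software stack.

```python
# Conceptual sketch of a purely data-parallel workflow: the user writes an
# ordinary mini-batch loop and the cluster handles scale-out. Plain PyTorch
# for illustration; not the actual Cerebras software stack.
import torch

def train_one_epoch(model, dataloader, optimizer, loss_fn):
    model.train()
    for batch, targets in dataloader:           # mini-batches only:
        optimizer.zero_grad()                   # no tensor/pipeline sharding,
        loss = loss_fn(model(batch), targets)   # no Megatron/DeepSpeed config
        loss.backward()
        optimizer.step()
```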
@CerebrasSystems
Cerebras
8 months
(1/2) This roofline plot shows the theoretical performance of systems in FLOPs as a function of FLOPs per memory access. Cerebras has a flat roofline: we achieve peak compute-bound performance at lower operational intensity, thanks to our 40GB of ultra-fast SRAM - accessible in a single cycle.
Tweet media one
2
9
37
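The standard roofline relation behind such plots is attainable throughput = min(peak compute, operational intensity × memory bandwidth). A sketch with placeholder hardware numbers (not Cerebras or GPU specifications):

```python
# Standard roofline relation: attainable FLOP/s is the minimum of peak
# compute and (operational intensity x memory bandwidth). The hardware
# numbers below are placeholders, not Cerebras or GPU specifications.
def roofline(intensity_flops_per_byte, peak_flops, mem_bw_bytes_per_s):
    return min(peak_flops, intensity_flops_per_byte * mem_bw_bytes_per_s)

peak = 1e15       # placeholder: 1 PFLOP/s peak compute
bandwidth = 2e12  # placeholder: 2 TB/s memory bandwidth

for oi in (1, 10, 100, 1000):
    print(f"intensity {oi:>4} FLOP/byte -> {roofline(oi, peak, bandwidth):.1e} FLOP/s")
```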
@CerebrasSystems
Cerebras
5 years
Sean Lie speaks about the Cerebras #waferscale engine at #HotChips31 @hotchipsorg #deeplearning #ai
Tweet media one
1
15
34
@CerebrasSystems
Cerebras
10 months
Today's popular models can run on a powerful PC but don't fit in popular mobile devices. In May @Opentensor challenged us to build a SoTA model that runs on any device and supports long context. Thus was born BTLM - a 3B model with 7B performance and 8K context length!
Tweet media one
1
2
35
@CerebrasSystems
Cerebras
5 months
The Open Source AI Party of the year is lit 🔥 with community love here in NOLA! Thanks @irinarish @togethercompute @laion_ai for your partnership in bringing the #OpenSource AI community together tonight! 🧡
0
3
36
@CerebrasSystems
Cerebras
3 years
“We have doubled in every critical aspect, delivering higher performance in every dimension,” Sean Lie, Chief Hardware Architect and Co-Founder
Tweet media one
1
9
35
@CerebrasSystems
Cerebras
14 days
We’re excited to spotlight the recently announced collaboration between @G42ai , @core42_ai , and @Qualcomm , some of our most important strategic partners. Core42, a G42 company and the UAE-based national-scale enabler for cloud and generative AI, announced a significant leap…
Tweet media one
2
10
37
@CerebrasSystems
Cerebras
10 months
@opentensor BTLM is available with an Apache 2.0 license on HuggingFace today. Use it as a drop-in replacement for any 3B model. Its size and inference speed is perfect for experimentation. Fine tune and quantize it to your heart's content!
1
2
36
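A minimal sketch for trying it as a drop-in causal LM, assuming the Hugging Face model id "cerebras/btlm-3b-8k-base" and that the checkpoint ships custom modeling code (hence trust_remote_code=True):

```python
# Minimal sketch: load BTLM as a drop-in causal LM. Assumes the Hugging Face
# model id "cerebras/btlm-3b-8k-base" and that the repo ships custom modeling
# code (hence trust_remote_code=True).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cerebras/btlm-3b-8k-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Small models that punch above their weight", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=30)[0]))
```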
@CerebrasSystems
Cerebras
5 months
(1/2) Cerebras ML researcher @dmsobol will present BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model at the Efficient Natural Language and Speech Processing (ENLSP-III) workshop at #NeurIPS . Contact us to meet at NeurIPS:
Tweet media one
1
10
35
@CerebrasSystems
Cerebras
10 months
@opentensor BTLM is so strong that it performs like a 7B model – take away the orange and you wouldn't even know there's a 3B model in the mix.
Tweet media one
1
5
35
@CerebrasSystems
Cerebras
3 months
🌟 Announcing Cerebras AI Day! 🌟 On Tuesday, March 19, in San Jose, CA, we are proud to host the 1st ever Cerebras AI Day! AI has never been more important. Cerebras is bringing together top experts across hardware, software, and ML to take a close look at the challenges and…
Tweet media one
3
11
35