Cerebras

@CerebrasSystems

10,858 Followers
239 Following
588 Media
1,349 Statuses

Exaflops of AI compute that programs like a single accelerator. Try our models:

Sunnyvale, CA
Joined July 2016
Pinned Tweet
@CerebrasSystems
Cerebras
2 months
📣 ANNOUNCEMENT DAY AT CEREBRAS 📣 Today, we are thrilled to share some of the biggest announcements in our company’s history. 📢 Cerebras announces CS-3, the world’s fastest AI Chip with a whopping 4 trillion transistors 📢 Cerebras selects Qualcomm to deliver unprecedented…
16
59
307
@CerebrasSystems
Cerebras
1 year
🎉 Exciting news! Today we are releasing Cerebras-GPT, a family of 7 GPT models from 111M to 13B parameters trained using the Chinchilla formula. These are the highest accuracy models for a compute budget and are available today open-source! (1/5) Press:
32
341
1K
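The Chinchilla formula referenced above boils down to roughly 20 training tokens per model parameter. A rough back-of-envelope sketch of what that implies for a few of the model sizes in the family (the exact Cerebras-GPT token budgets may differ from these rounded figures):

```python
# Rough arithmetic only: the Chinchilla compute-optimal rule of thumb is
# about 20 training tokens per parameter. Actual Cerebras-GPT token
# budgets may differ from these rounded figures.
CHINCHILLA_TOKENS_PER_PARAM = 20

for params in (111e6, 1.3e9, 13e9):
    tokens = params * CHINCHILLA_TOKENS_PER_PARAM
    print(f"{params / 1e9:.3g}B params -> ~{tokens / 1e9:.0f}B training tokens")
```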
@CerebrasSystems
Cerebras
2 months
📣ANNOUNCING THE FASTEST AI CHIP ON EARTH📣 Cerebras proudly announces CS-3: the fastest AI accelerator in the world. The CS-3 can train up to 24 trillion parameter models on a single device. The world has never seen AI at this scale. CS-3 specs: ⚙ 46,225 mm2 silicon | 4…
Tweet media one
54
170
972
@CerebrasSystems
Cerebras
11 months
📣 New dataset drop! Introducing SlimPajama-627B: the largest extensively deduplicated, multi-corpora, open-source dataset for training large language models. 🧵
Tweet media one
14
191
684
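A minimal sketch for pulling the dataset from the Hugging Face Hub, assuming the dataset id "cerebras/SlimPajama-627B" and a "text" field per record:

```python
# Minimal sketch: stream a few SlimPajama records from the Hugging Face Hub.
# Assumes the dataset id "cerebras/SlimPajama-627B" and a "text" field per
# record; streaming avoids downloading the full corpus up front.
from datasets import load_dataset

ds = load_dataset("cerebras/SlimPajama-627B", split="train", streaming=True)
for i, example in enumerate(ds):
    print(example["text"][:200])
    if i >= 2:
        break
```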
@CerebrasSystems
Cerebras
10 months
Introducing BTLM-3B-8K: an open, state-of-the-art 3B parameter model with 7B level performance. When quantized, it fits in as little as 3GB of memory 🤯. It runs on iPhone, Google Pixel, even Raspberry Pi. BTLM goes live on Bittensor later this week! 🧵👇
Tweet media one
19
170
599
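Weights-only arithmetic that makes the "as little as 3GB" figure plausible, ignoring activations and KV cache:

```python
# Back-of-envelope memory for the weights of a ~3B parameter model at
# different precisions. Illustrative only: ignores KV cache, activations,
# and runtime overhead.
params = 3e9  # approximate BTLM-3B-8K parameter count

for name, bytes_per_param in (("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)):
    gib = params * bytes_per_param / 2**30
    print(f"{name}: ~{gib:.1f} GiB of weights")
```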
@CerebrasSystems
Cerebras
10 months
📣 Today we are announcing Condor Galaxy-1: a 4 exaflop AI supercomputer built in partnership with @G42ai . Powered by 64 Cerebras CS-2 systems, 54M cores, and 82TB of memory – it's the largest AI supercomputer we've ever built. But that's not all: CG-1 is just the start..
Tweet media one
18
84
369
@CerebrasSystems
Cerebras
5 months
Introducing gigaGPT: our implementation of @karpathy ’s nanoGPT that trains GPT-3 sized models in just 565 lines of code. 🤯 #NeurIPS2023
Tweet media one
8
34
373
@CerebrasSystems
Cerebras
1 year
Cerebras-GPT models have been downloaded over 130k times since our announcement and our 111M parameter model just crossed 85k downloads! We've already seen users enjoying this model in their local deployments! Models - Discord -
Tweet media one
6
56
308
@CerebrasSystems
Cerebras
1 year
Yesterday, we made available Cerebras-GPT, a family of seven GPT models ranging from 111M to 13B parameters, under the permissive Apache 2.0 license. This includes the models, training recipe, weights, and checkpoints - all of which can be accessed from Hugging Face and GitHub.
5
30
286
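A minimal sketch for trying one of the checkpoints with the transformers library, assuming the Hugging Face model id "cerebras/Cerebras-GPT-111M":

```python
# Minimal sketch: generate text with the smallest Cerebras-GPT checkpoint.
# Assumes the Hugging Face model id "cerebras/Cerebras-GPT-111M".
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cerebras/Cerebras-GPT-111M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Wafer-scale computing is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```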
@CerebrasSystems
Cerebras
1 year
We are excited to see a Cerebras-GPT model trending on Hugging Face! Check out our page on Hugging Face to see the entire family of models:
Tweet media one
5
40
263
@CerebrasSystems
Cerebras
5 years
Today is the perfect day to share our first tweet as we have announced the largest chip ever built to accelerate AI compute! Read all about the Cerebras Wafer Scale Engine (WSE) via @WIRED and @tsimonite
Tweet media one
21
111
235
@CerebrasSystems
Cerebras
2 months
By popular demand: Nvidia B200 plotted on our Moore's Law chart. It's pretty big for a GPU. But kinda small when you're used to wafer scale. 😅
Tweet media one
12
35
230
@CerebrasSystems
Cerebras
9 months
Cerebras BTLM-3B-8K model crosses 1M downloads🤯 It's the #1 ranked 3B language model on @huggingface ! A big thanks to all the devs out there building on top of open source models 🙌
Tweet media one
5
53
200
@CerebrasSystems
Cerebras
1 year
The AI industry is becoming increasingly closed. We believe in fostering open access to the most advanced models. Cerebras-GPT is being released under the Apache 2.0 license, allowing royalty-free use for research or commercial applications. (2/5)
Tweet media one
6
39
188
@CerebrasSystems
Cerebras
1 year
The Cerebras-GPT 111M parameter model has been downloaded over 175k times since our announcement! This is exciting! Check out our models here - Join our Discord here -
Tweet media one
3
41
181
@CerebrasSystems
Cerebras
10 months
Need best 7B-70B models? Use LLaMA 2. Need best 3B model? Use BTLM. That's it. That's the tweet.
6
40
177
@CerebrasSystems
Cerebras
10 months
We are excited to see BTLM-3B-8k trending on Hugging Face with over 250k downloads! BTLM-3B-8k outperforms models trained on hundreds of billions more tokens and achieves comparable performance to open 7B parameter models. Check out the model:
Tweet media one
4
40
165
@CerebrasSystems
Cerebras
9 months
BTLM-3B-8k is surging! This state of the art model is trending on Hugging Face with over 500k downloads in just over 2 weeks! Try out the model here:
Tweet media one
3
39
148
@CerebrasSystems
Cerebras
1 month
The Cerebras CS-3 redefines scalability in AI supercomputing. A 2048 CS-3 cluster can deliver an astounding 256 exaflops of AI compute. This makes it possible to train Llama2-70B in less than one day—a task that would take at least one month on gigantic GPU clusters. The entire…
Tweet media one
6
24
150
@CerebrasSystems
Cerebras
2 months
Cerebras + @Qualcomm to deliver unprecedented performance in AI inference. 📈 Up to a 10x performance improvement for large-scale generative AI inference when using Cerebras CS-3 for training and Qualcomm® Cloud AI 100 Ultra for inference. 📉 Radically lower inference costs…
Tweet media one
3
23
149
@CerebrasSystems
Cerebras
11 months
Why we built SlimPajama – it's all about training efficiency. Without de-duplication, a model would have to go through 1.2T tokens before seeing ~600B unique tokens. SlimPajama sees 600B tokens in half the time. That saves you 50% on compute costs!
Tweet media one
2
28
141
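A toy sketch of the idea behind the savings: drop exact duplicates by content hash so the model never re-trains on the same document. SlimPajama's actual pipeline is more involved (near-duplicate detection across multiple corpora), so treat this only as an illustration.

```python
# Toy illustration of deduplication by content hash. SlimPajama's real
# preprocessing is more sophisticated (near-duplicate detection across
# multiple corpora); this only shows why duplicates waste training tokens.
import hashlib

def dedup(docs):
    seen, unique = set(), []
    for doc in docs:
        h = hashlib.sha256(doc.strip().lower().encode()).hexdigest()
        if h not in seen:
            seen.add(h)
            unique.append(doc)
    return unique

corpus = ["the cat sat", "The cat sat", "a different document", "the cat sat"]
print(f"{len(dedup(corpus))} unique docs out of {len(corpus)}")
```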
@CerebrasSystems
Cerebras
10 months
Join us on Friday, July 28th, at 10:00 am PT as we host an AMA on our Discord server to talk about BTLM-3B-8K. This AMA will be hosted by Cerebras and Opentensor. Join our Discord here: Send yourself a calendar invite:
Tweet media one
6
34
125
@CerebrasSystems
Cerebras
1 year
Cerebras-GPT models are available now on Hugging Face. You can also test drive Cerebras CS-2 systems via our Model Studio on the cloud. (5/5)
3
16
119
@CerebrasSystems
Cerebras
1 month
8 Exaflops, 64 systems, 108TB of memory – say hello to the Condor Galaxy 3 (CG-3) – the first AI #supercomputer powered by the Cerebras CS-3, built with our strategic partner G42. CG-3 provides the highest density of AI compute for our next generation of #AI builders. Online Q2…
Tweet media one
5
31
116
@CerebrasSystems
Cerebras
8 months
We just dropped the BTLM-3B-8K paper on arXiv! It distills our recipe for training SOTA LLMs: - Extensively deduplicated dataset (SlimPajama) - Hyperparameter search using muP - Variable sequence length training + ALiBi - Aggressive LR decay
Tweet media one
2
27
116
@CerebrasSystems
Cerebras
4 months
Cerebras is proud to announce a multi-year collaboration with @MayoClinic as its first generative AI collaborator in the development of large language models ( #LLMs ) for medical applications. To create the first truly patient-centric healthcare AI, Mayo Clinic selected Cerebras…
3
24
114
@CerebrasSystems
Cerebras
1 year
Meet Andromeda, our 13.5 million core #AI supercomputer, now available for commercial & academic research. Andromeda can “perform 1 exaflop worth of AI computing - or at least one quintillion (10 to the power of 18) operations/second” @leejane71 @reuters
9
28
105
@CerebrasSystems
Cerebras
5 months
Open-source weights != open source. CrystalCoder is the first model released under the #LLM360 framework. LLM360 goes beyond #opensource and includes training code, data processing scripts, model checkpoints, and analytics. See the assets on Hugging Face:
1
17
105
@CerebrasSystems
Cerebras
28 days
TSMC thinks it will take the rest of the industry 6 more years to achieve ¼ of the transistors that we already have today. Would you like to enjoy 4 trillion transistors today on CS-3 or 1-trillion transistors on a GPU in 2030? Read TSMC’s full report ()…
Tweet media one
5
20
98
@CerebrasSystems
Cerebras
7 months
(1/2) We are excited to share that our paper "BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model" has been accepted at the Efficient Natural Language and Speech Processing (ENLSP-III) workshop for NeurIPS2023! Learn more about the workshop here:
Tweet media one
4
18
92
@CerebrasSystems
Cerebras
1 year
One notable output of Cerebras-GPT is a new scaling law that predicts model performance for a given compute budget. This is the first scaling law derived using a public dataset. (3/5)
Tweet media one
3
16
90
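Scaling fits of this kind typically take a power-law shape in compute. A sketch of the general form with placeholder coefficients (not the values fitted in the Cerebras-GPT work):

```python
# Generic power-law form used for compute scaling fits:
#   L(C) = a * C**(-b) + L_inf
# The coefficients below are placeholders for illustration, not the values
# fitted in the Cerebras-GPT work.
def predicted_loss(compute_flops, a=50.0, b=0.07, l_inf=1.7):
    return a * compute_flops ** (-b) + l_inf

for c in (1e19, 1e20, 1e21):
    print(f"{c:.0e} FLOPs -> predicted loss {predicted_loss(c):.2f}")
```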
@CerebrasSystems
Cerebras
5 months
Introducing BTLM-Chat at #NeurIPS2023 ! BTLM-Chat is our 3B parameter #LLM that provides performance improvements over our BTLM model on @Eleuther harness evaluation and minimizes negative impact on users or society. Try BTLM-Chat here:
3
28
88
@CerebrasSystems
Cerebras
1 year
Our research shows that smaller foundation models that are fine-tuned on domain-specific datasets can outperform larger foundation models. We show that a GPT-NeoX 1.4B model that is fine-tuned for 2,000 training steps can perform just as well as the out-of-the-box GPT-J 6B model.
Tweet media one
3
14
88
@CerebrasSystems
Cerebras
24 days
In Eric Savitz' latest article for Barron's, Cerebras is described as the most intriguing startup that is building AI supercomputers that rival NVIDIA. Highlights from the article: 🔥 Cerebras Wafer Scale Engine 3 (WSE-3) is 72 square inches and is the largest commercial chip…
Tweet media one
5
14
89
@CerebrasSystems
Cerebras
11 months
We recently announced the availability of SlimPajama - an open-source, cleaned, and deduplicated version of RedPajama-1T. It is half the size, trains twice as fast, and when upsampled performs as well as or better than RedPajama. See below for the dataset and preprocessing library.
Tweet media one
3
15
88
@CerebrasSystems
Cerebras
10 months
WIRED reports "The world now has a new fastest AI training supercomputer. Condor Galaxy, a network of 9 interconnected AI supercomputers, was brought to life by the US-based AI company Cerebras Systems together with G42, UAE-based technology holding group."
4
22
83
@CerebrasSystems
Cerebras
7 months
📣 Paper drop: Position Interpolation Improves ALiBi Extrapolation We found a simple method to 2x the context length of models that use ALiBi. This lets models like BTLM-3B-8K and MPT-7B-8K run high quality inference at up to 16K with no additional fine tuning. 👇
2
16
77
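A sketch of the mechanism as described: ALiBi penalizes attention scores with a per-head linear function of token distance, and rescaling those distances by train_length / target_length keeps them inside the range seen during training. This is an illustration of the idea, not the paper's exact recipe.

```python
# Sketch of ALiBi biases with interpolated positions. ALiBi adds a per-head
# penalty of -slope * |i - j| to attention scores; rescaling distances by
# (train_len / seq_len) keeps them inside the trained range when running
# longer sequences. Illustration only, not the paper's exact formulation.
import torch

def alibi_bias(seq_len, n_heads, train_len=None):
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)])
    pos = torch.arange(seq_len)
    dist = (pos[None, :] - pos[:, None]).abs().float()  # |i - j|
    if train_len is not None and seq_len > train_len:
        dist = dist * (train_len / seq_len)              # position interpolation
    return -slopes[:, None, None] * dist                 # shape: (heads, seq, seq)

print(alibi_bias(seq_len=16, n_heads=4, train_len=8).shape)  # torch.Size([4, 16, 16])
```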
@CerebrasSystems
Cerebras
1 year
The seven Cerebras-GPT models were trained on CS-2 systems using our simple, data-parallel Weight Streaming architecture, which allowed us to train these models in just a few weeks. (4/5)
Tweet media one
2
14
74
@CerebrasSystems
Cerebras
1 month
The Cerebras CS-3 is the world's most scalable AI supercomputer. From 125 petaflops to 256 exaflops, 1 terabyte of memory to 1 petabyte, it provides the fastest and simplest way to achieve high model utilization on supercomputing hardware.
Tweet media one
3
15
72
@CerebrasSystems
Cerebras
2 years
We are aware of a startup crypto mining company using a misappropriated, altered image of the Cerebras CS-2 product for their own business purposes. (1/2)
10
17
70
@CerebrasSystems
Cerebras
2 months
The Cerebras CS-3 is built to scale – a single CS-3 rack can be fitted with 1.2 petabytes of memory – 1000x more than GPU servers and large enough to train 24 trillion parameter models. Watch our CTO's keynote to learn more- #AI #ML #Supercomputers
Tweet media one
4
13
71
@CerebrasSystems
Cerebras
10 months
To enable long sequence applications for BTLM-3B-8K, we used ALiBi position embeddings. To improve extrapolation and reduce wall-clock training times, we used variable sequence length training (VSL). To learn more about VSL, check out our blog -
Tweet media one
1
20
72
@CerebrasSystems
Cerebras
9 months
“I don’t see any startups other than Cerebras as being able to compete with Nvidia for data-center training.” Karl Freund mentions this in Therese Poletti's latest article, where Cerebras is described as a 'disruptive technology for AI'. Read here:
2
16
61
@CerebrasSystems
Cerebras
4 years
Good morning world! We are proud to announce the Cerebras CS-1 system, a purpose-built high performance AI compute solution which houses the Wafer Scale Engine, the world's largest chip - #waferscale #buildbigchips #DeepLearning #AI #MachineLearning
5
34
64
@CerebrasSystems
Cerebras
9 months
(1/3) Cerebras, G42's Inception, and MBZUAI are pleased to announce Jais, the world’s best-performing Arabic LLM. Jais is a 13B parameter model that was trained on a new 395 billion token Arabic-English-Code dataset. Jais brings the power of Generative AI to 400m Arabic speakers
Tweet media one
4
14
64
@CerebrasSystems
Cerebras
1 year
Some #wafer-scale eye candy. When performance is everything, the best #memory hierarchy is no memory hierarchy at all! The Cerebras Wafer-Scale Engine has 40GB of #sram evenly distributed across 850,000 cores, for a staggering total of 2.6 trillion #transistors. Thanks, #tsmc!
Tweet media one
1
17
63
@CerebrasSystems
Cerebras
11 months
Our SlimPajama dataset crossed 2k downloads and is trending on Hugging Face! This dataset is 50% smaller than the original RedPajama 1.2 trillion token dataset. Interested in training a high quality LLM in less time? Check out the SlimPajama dataset:
Tweet media one
1
19
58
@CerebrasSystems
Cerebras
2 months
In the latest @Forbes article, @karlfreund writes "A faster chip, a faster cluster, and much faster time to deploy AI has helped Cerebras earn the support of organizations like the Mayo Clinic and Glaxo-Smith Klein." Read Karl’s article to learn why CS-3 is the AI Accelerator…
Tweet media one
3
6
56
@CerebrasSystems
Cerebras
2 months
Building large-scale training clusters from scratch and achieving high MFU (Model FLOPs Utilization) and reliability is very hard. Train 24 trillion parameter models across 2048 nodes with the simplicity of a single device on Cerebras CS-3 clusters. Watch our technical keynotes:…
Tweet media one
1
12
50
@CerebrasSystems
Cerebras
11 months
Cerebras is now the leading AI hardware startup as measured by research paper citations! Cerebras-GPT is just the start – we have so much cool research that we can't wait to share with you all. Stay tuned! h/t: @nathanbenaich
Tweet media one
4
16
55
@CerebrasSystems
Cerebras
3 years
We’re serving up a record-breaking chip. It contains 2.6 trillion transistors and covers more than 46,225 square millimeters of silicon, about the size of a dinner plate.
Tweet media one
4
14
54
@CerebrasSystems
Cerebras
1 month
In the fast-paced world of AI hardware, the Cerebras CS-3 and Nvidia DGX B200 are two of the most exciting new offerings to hit the market in 2024. Both systems are designed to tackle large-scale AI training, but they take decidedly different approaches. For ML researchers and…
Tweet media one
3
11
54
@CerebrasSystems
Cerebras
1 year
Cerebras-GPT is just the start. What models should we train next?
15
6
52
@CerebrasSystems
Cerebras
3 months
🎉 Cerebras overtakes peers as #1 AI Semiconductor Startup 🎉 In the freshly-updated 2023 State of AI Report Compute Index from Air Street Capital and Zeta Alpha, Cerebras is highlighted for leading all AI semiconductor startups in research publication counts, open source…
Tweet media one
1
14
52
@CerebrasSystems
Cerebras
9 months
(1/2) Cerebras CEO  @andrewdfeldman sits down with Anastasiia from @AnastasiInTech to discuss Condor Galaxy, the AI Supercomputer built in partnership with G42. Click here to see their conversation:
1
15
53
@CerebrasSystems
Cerebras
2 months
Our CEO @andrewdfeldman opened Cerebras AI Day with his keynote that covered: ⚙️ The historic growth of AI and the massive compute it requires ⚙️ The pain and complexity of distributed computing ⚙️ Why Cerebras went bigger to build the largest chips ever made for AI ⚙️ How our…
Tweet media one
1
11
51
@CerebrasSystems
Cerebras
7 months
(1/2) Cerebras, along with other prominent companies in the AI industry, has announced the establishment of the AI Platform Alliance, a consortium aimed at ensuring that AI platforms will be more open, efficient, and sustainable. Learn more here:
1
7
49
@CerebrasSystems
Cerebras
10 months
Reminder that at 10:00 am PT today, Cerebras and Opentensor will host an AMA on our Discord server to talk about BTLM-3B-8K. Come to ask questions, engage in a discussion, or simply enjoy the conversations! Join our Discord here:
Tweet media one
2
18
49
@CerebrasSystems
Cerebras
5 months
Cerebras led the way in open source AI development in 2023 – releasing complete recipes for model weights, training code, and data. Our ML team is constantly inventing new techniques to train LLMs for different languages and modalities – contact us to build your own custom model!
Tweet media one
0
6
49
@CerebrasSystems
Cerebras
3 years
We're proud to announce we've raised $250M in #SeriesF funding to accelerate our global business expansion and relentless pursuit of #AI innovation! #deeplearning
Tweet media one
3
17
47
@CerebrasSystems
Cerebras
6 months
Cerebras Software Release 2.0 is here - it's our biggest software release of 2023! 📈 50% faster training for LLMs 🚀 PyTorch 2.0: new LTC/Torch-MLIR stack for expanded programmability 🖼️ Diffusion Transformers: image generation for the first time on CS-2
Tweet media one
1
11
47
@CerebrasSystems
Cerebras
1 month
The secret of Cerebras’ architecture isn’t just our giant wafer – it’s our highly scalable wafer-scale cluster design. This means that whether you program 1 or 2048 nodes, the entire cluster appears as a single chip. No Megatron, no DeepSpeed, no sharding – it’s the speed of a…
Tweet media one
2
4
48
@CerebrasSystems
Cerebras
11 months
RedPajama-1T is the largest open dataset today but contains a large percentage of duplicates, making a full training run costly and inefficient. Like the Falcon team, we found data quality is just as important as quantity – which led to SlimPajama.
1
3
47
@CerebrasSystems
Cerebras
10 months
🙋‍♂️We are doing a battery of tests to evaluate BTLM-3B-8K against other long context models. Which models do you want to see comparisons for?
5
12
47
@CerebrasSystems
Cerebras
3 months
🌟 Today, in collaboration with Barcelona Supercomputing Center, we announced the open-source availability of FLOR-6.3B, an English, Spanish, and Catalan multilingual language model that is state-of-the-art in Catalan. 🚀 FLOR-6.3B was trained in just 2.5 days on the Condor…
Tweet media one
3
10
42
@CerebrasSystems
Cerebras
10 months
One of our favorite charts for BTLM-3B-8K: training the model on 1 to 16 CS-2 systems is as easy as turning a dial🎚️!
@CerebrasSystems
Cerebras
10 months
@opentensor BTLM was trained on the Condor Galaxy 1 supercomputer thanks to the support of G42 Cloud and IIAI. As is standard on Cerebras CS-2s, the training is purely data parallel: no Megatron/DeepSpeed needed. We easily spun the number of systems up and down with no interruptions or hardware failures.
Tweet media one
1
3
24
2
9
46
@CerebrasSystems
Cerebras
2 months
(1/n) Introducing MediSwift, the first suite of biomedical language models that employ sparse pre-training techniques to significantly reduce computational costs, while outperforming existing models up to 7B parameters on benchmark tasks such as PubMedQA. Paper:…
Tweet media one
2
9
44
@CerebrasSystems
Cerebras
1 month
🌟 Cerebras is thrilled to be selected on the 2024 Forbes AI 50! 🌟 Here are a few reasons why we made the cut: 🎉 Cerebras is the only AI chip startup on this year’s list. Learn more about our latest generation of hardware, the CS-3: 🎉 We enable top…
Tweet media one
12
7
39
@CerebrasSystems
Cerebras
10 months
G42 Cloud is the largest public cloud provider of the UAE. To expand its AI offering, we are planning not one but *nine* AI supercomputers. When complete in 2024, the full Condor Galaxy system will have 9 instances, 576 CS-2s, for a total of 36 exaFLOPs of AI compute. 🤯🤯
Tweet media one
1
2
43
@CerebrasSystems
Cerebras
10 months
@Yampeleg People are ripping out their 3B and even 7B models and dropping in BTLM-3B. It's that good!
7
20
44
@CerebrasSystems
Cerebras
6 months
(1/2) Argonne National Laboratory researchers demonstrate the Cerebras CS-2 is 130x faster than the Nvidia A100 on a nuclear physics simulation workload. Read the paper here:
Tweet media one
6
8
44
@CerebrasSystems
Cerebras
1 year
“[We’re] seeing graduate students trying to fit [LLMs] on laptop CPUs, and [we’re] seeing all sorts of enormous creativity in an effort to do what they can with the resources that are available to them.” Our CEO @andrewdfeldman describes the usage of Cerebras-GPT with @sallywf
1
0
43
@CerebrasSystems
Cerebras
1 month
Cerebras is proud to build and operate the world’s largest AI #supercomputers here in the USA! Condor Galaxy, built with our strategic partner G42, is among the largest #AI supercomputing installations in the world. The first two clusters are based in Santa Clara and…
Tweet media one
1
8
43
@CerebrasSystems
Cerebras
2 years
We wrote our @PyTorch implementation to take full advantage of our WSE’s 850,000 #AI -optimized cores, explains our Senior Director @EmadBarsoumPi . Read his blog to learn more: #pytorch #deeplearning
4
14
35
@CerebrasSystems
Cerebras
11 months
SlimPajama cleans and deduplicates RedPajama-1T, reducing the total token count and file size by 50%. It's half the size and trains twice as fast! It's the highest quality dataset when training to 600B tokens, and when upsampled it performs as well as or better than RedPajama.
Tweet media one
1
5
42
@CerebrasSystems
Cerebras
2 years
This image improperly passes off our CS-2 as their own product, as you can see by the Cerebras logo still visible at the bottom of the image. We are not working, or otherwise affiliated, with this company. We are taking steps to address the situation. (2/2)
4
5
41
@CerebrasSystems
Cerebras
6 months
11 days 06 hours 39 minutes 17 seconds until the Cerebras Team lands in NOLA for #NeurIPS2023 . We have a jam-packed week of papers, posters, workshops, and socials planned for the #ML community! Follow & join the fun: #WeAreCerebras
Tweet media one
0
8
41
@CerebrasSystems
Cerebras
3 years
Today we introduced the world’s first brain-scale #AI solution at @hotchipsorg 2021, enabling a single CS-2 to support 120 trillion parameter models. We continue to push the boundaries of what’s possible in AI & unlock extreme-scale model potential!
4
15
40
@CerebrasSystems
Cerebras
2 months
📣 Announcing Condor Galaxy 3 📣 Cerebras and @G42ai announced the build of Condor Galaxy 3 (CG-3), the third cluster of their constellation of AI supercomputers, the Condor Galaxy. CG-3 will be built using 64 Cerebras CS-3 systems and is designed and delivered in the United…
Tweet media one
0
4
39
@CerebrasSystems
Cerebras
9 months
Condor Galaxy 1 (CG-1) was used to train BTLM, the top-performing open-source 3B parameter LLM. Now, CG-1 has been used to train Jais, the best open-source Arabic model. CG-1 is moving the open-source community forward. Contact us to train on CG-1
Tweet media one
0
10
39
@CerebrasSystems
Cerebras
27 days
Congratulations to our strategic partner, @G42ai , on their partnership and $1.5B investment from @Microsoft . This marks a significant milestone in G42's journey as a global tech leader and advances their world-changing vision of accelerating AI development and global expansion…
Tweet media one
1
3
42
@CerebrasSystems
Cerebras
3 months
(1/n) Excited to announce the "Cerebras PyTorch Sparsity Library," which democratizes access to #sparse training for all #ML researchers and developers. Read more here:
Tweet media one
1
6
41
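The tweet does not show the library's interface, so here is only a plain-PyTorch sketch of what magnitude-based weight sparsity during training looks like; it is not the Cerebras PyTorch Sparsity Library API.

```python
# Plain-PyTorch sketch of static magnitude-based weight sparsity. This is
# NOT the Cerebras PyTorch Sparsity Library API; it only illustrates the
# general idea of training with a fixed sparsity mask.
import torch
import torch.nn as nn

def make_sparse(linear: nn.Linear, sparsity: float = 0.75):
    with torch.no_grad():
        flat = linear.weight.abs().flatten()
        threshold = torch.kthvalue(flat, int(sparsity * flat.numel())).values
        mask = (linear.weight.abs() > threshold).float()
        linear.weight.mul_(mask)                            # zero out pruned weights
    linear.weight.register_hook(lambda grad: grad * mask)   # keep pruned weights at zero
    return mask

layer = nn.Linear(512, 512)
mask = make_sparse(layer, sparsity=0.75)
print(f"weight density after pruning: {mask.mean().item():.2f}")  # ~0.25
```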
@CerebrasSystems
Cerebras
2 months
Our CEO @andrewdfeldman kicked off Cerebras AI Day to a standing room only audience. Andrew’s keynote covered: 1. The AI Capabilities Chasm 2. The GPU Challenge 3. Large Models train best on Large Chips. #AI #AIcompute
Tweet media one
Tweet media two
Tweet media three
1
8
40
@CerebrasSystems
Cerebras
2 years
With Cerebras, you can now easily train and reconfigure GPT-3 and GPT-J language models with up to 20 billion parameters on a single CS-2 system: #gptj #gpt3 #AI #deeplearning
Tweet media one
2
11
38
@CerebrasSystems
Cerebras
6 months
Cerebras ML researcher @vithursant19 will be presenting our paper, Sparse Iso-FLOP Transformations for Maximizing Training Efficiency, at the @NeurIPSConf Workshop on Advancing Neural Network Training (WANT). To our knowledge, this is the first work to demonstrate the use of…
Tweet media one
0
4
35
@CerebrasSystems
Cerebras
2 months
Anton Shilov of @tomshardware writes that "The CS-3 can be configured in clusters of up to 2048 systems. This scalability allows it to fine-tune 70 billion parameter models in just one day with a four-system setup, and to train a Llama 70B model from scratch in the same timeframe…
Tweet media one
0
10
36
@CerebrasSystems
Cerebras
9 months
BTLM was built in a collaboration with @opentensor and trained on the CG-1 supercomputer by @G42ai and Cerebras. It packs 7B performance in a 3B model. Learn more here:
0
9
38
@CerebrasSystems
Cerebras
10 months
Let's compare CG-1 with Nvidia's Israel-1. Israel-1 has 2048 H100 GPUs. But each GPU shows up as a unique device. It's your job to break apart your model and farm them out to each GPU. On Cerebras, 1 to 64 CS-2s show as one accelerator. Just implement mini-batches and train!
Tweet media one
3
3
36
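A conceptual sketch of what "just implement mini-batches and train" means from the user's side: an ordinary training loop with no tensor or pipeline sharding. Plain PyTorch for illustration, not the actual Cerebras software stack.

```python
# Conceptual sketch of a purely data-parallel workflow: the user writes an
# ordinary mini-batch loop and the cluster handles scale-out. Plain PyTorch
# for illustration; not the actual Cerebras software stack.
import torch

def train_one_epoch(model, dataloader, optimizer, loss_fn):
    model.train()
    for batch, targets in dataloader:           # mini-batches only:
        optimizer.zero_grad()                   # no tensor/pipeline sharding,
        loss = loss_fn(model(batch), targets)   # no Megatron/DeepSpeed config
        loss.backward()
        optimizer.step()
```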
@CerebrasSystems
Cerebras
8 months
(1/2) This roofline plot shows the theoretical performance of systems in FLOPs as a function of FLOPs per memory access. Cerebras has a flat roofline: we achieve peak compute-bound performance at lower operational intensity, thanks to our 40GB of ultra-fast SRAM - accessible in a single cycle.
Tweet media one
2
9
37
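The standard roofline relation behind such plots is attainable throughput = min(peak compute, operational intensity × memory bandwidth). A sketch with placeholder hardware numbers (not Cerebras or GPU specifications):

```python
# Standard roofline relation: attainable FLOP/s is the minimum of peak
# compute and (operational intensity x memory bandwidth). The hardware
# numbers below are placeholders, not Cerebras or GPU specifications.
def roofline(intensity_flops_per_byte, peak_flops, mem_bw_bytes_per_s):
    return min(peak_flops, intensity_flops_per_byte * mem_bw_bytes_per_s)

peak = 1e15       # placeholder: 1 PFLOP/s peak compute
bandwidth = 2e12  # placeholder: 2 TB/s memory bandwidth

for oi in (1, 10, 100, 1000):
    print(f"intensity {oi:>4} FLOP/byte -> {roofline(oi, peak, bandwidth):.1e} FLOP/s")
```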
@CerebrasSystems
Cerebras
5 years
Sean Lie speaks about the Cerebras #waferscale engine at #HotChips31 @hotchipsorg #deeplearning #ai
Tweet media one
1
15
34
@CerebrasSystems
Cerebras
10 months
Today's popular models can run on a powerful PC but don't fit in popular mobile devices. In May @Opentensor challenged us to build a SoTA model that runs on any device and supports long context. Thus was born BTLM - a 3B model with 7B performance and 8K context length!
Tweet media one
1
2
35
@CerebrasSystems
Cerebras
5 months
The Open Source AI Party of the year is lit 🔥 with community love here in NOLA! Thanks @irinarish @togethercompute @laion_ai for your partnership in bringing the #OpenSource AI community together tonight! 🧡
0
3
36
@CerebrasSystems
Cerebras
3 years
“We have doubled in every critical aspect, delivering higher performance in every dimension,” Sean Lie, Chief Hardware Architect and Co-Founder
Tweet media one
1
9
35
@CerebrasSystems
Cerebras
14 days
We’re excited to spotlight the recently announced collaboration between @G42ai , @core42_ai , and @Qualcomm , some of our most important strategic partners. Core42, a G42 company and the UAE-based national-scale enabler for cloud and generative AI, announced a significant leap…
Tweet media one
2
10
37
@CerebrasSystems
Cerebras
10 months
@opentensor BTLM is available with an Apache 2.0 license on HuggingFace today. Use it as a drop-in replacement for any 3B model. Its size and inference speed is perfect for experimentation. Fine tune and quantize it to your heart's content!
1
2
36
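A minimal sketch for trying it as a drop-in causal LM, assuming the Hugging Face model id "cerebras/btlm-3b-8k-base" and that the checkpoint ships custom modeling code (hence trust_remote_code=True):

```python
# Minimal sketch: load BTLM as a drop-in causal LM. Assumes the Hugging Face
# model id "cerebras/btlm-3b-8k-base" and that the repo ships custom modeling
# code (hence trust_remote_code=True).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cerebras/btlm-3b-8k-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Small models that punch above their weight", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=30)[0]))
```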
@CerebrasSystems
Cerebras
5 months
(1/2) Cerebras ML researcher @dmsobol will present BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model at the Efficient Natural Language and Speech Processing (ENLSP-III) workshop at #NeurIPS . Contact us to meet at NeurIPS:
Tweet media one
1
10
35
@CerebrasSystems
Cerebras
10 months
@opentensor BTLM is so strong that it performs like a 7B model – take away the orange and you wouldn't even know there's a 3B model in the mix.
Tweet media one
1
5
35
@CerebrasSystems
Cerebras
3 months
🌟 Announcing Cerebras AI Day! 🌟 On Tuesday, March 19, in San Jose, CA, we are proud to host the 1st ever Cerebras AI Day! AI has never been more important. Cerebras is bringing together top experts across hardware, software, and ML to take a close look at the challenges and…
Tweet media one
3
11
35