An AI model built by the community, for everyone in the world
Part of the Linux Foundation, Apache 2 licensed
An RNN scaled to 14B params with GPT-level performance
#RWKV
is One Dev's Journey to Dethrone Transformers
The largest RNN ever (up to 14B). Parallelizable. Fast inference & training. Quantizable. Low VRAM usage.
3+ years of hard work
Created by
@BlinkDL_AI
Computation sponsored by
@StabilityAI
@AiEleuther
Introducing Eagle-7B
Based on the RWKV-v5 architecture, bringing into the open-source space the strongest:
- multi-lingual model
(beating even Mistral)
- attention-free transformer today
(10-100x+ lower inference cost; see the toy sketch below)
With English performance comparable to the best 7B models trained on ~1T tokens
All while being
- Cleanly licensed Apache 2, under
@linuxfoundation
(do anything with it!)
- The world's greenest 7B model
(by per-token energy consumption)
You can find out more from our full writeup:
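For context on the "10-100x+ lower inference cost" point above: RWKV replaces softmax attention with a recurrence over a fixed-size state, so per-token compute and memory stay constant instead of growing with context length the way a KV cache does. Here is a deliberately simplified Python sketch of that idea (a generic linear-attention-style recurrence, not the exact RWKV-v5 time-mixing formula; token shift, learned per-channel decays, and normalization are omitted, and all sizes are made up for illustration):

```python
# Toy sketch: why an attention-free recurrence keeps per-token inference
# cost constant, while softmax attention's KV cache grows with sequence length.
# Illustrative only; NOT the exact RWKV-v5 formula.
import numpy as np

d = 64        # head dimension (made-up size for the demo)
decay = 0.9   # scalar decay; RWKV uses learned per-channel decays

def recurrent_step(state, r_t, k_t, v_t):
    """One token of a linear-attention-style recurrence.

    state:         (d, d) running sum of decayed key/value outer products
    r_t, k_t, v_t: (d,) receptance / key / value vectors for this token
    Per-token cost is O(d^2) no matter how long the sequence gets.
    """
    state = decay * state + np.outer(k_t, v_t)  # update the fixed-size state
    out_t = r_t @ state                         # read out with the receptance
    return state, out_t

rng = np.random.default_rng(0)
state = np.zeros((d, d))
for _ in range(1000):                           # 1000 tokens, constant memory
    r, k, v = rng.standard_normal((3, d))
    state, out = recurrent_step(state, r, k, v)
print(out.shape)                                # (64,)
```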
Eagle & Finch
The RWKV v5 and v6 architecture paper is here
Both improve over RWKV-4, scaled up to 7.5B- and 3.1B-parameter multilingual models respectively
Open-source code, weights, and dataset
Apache 2 licensed, under Linux Foundation
The conclusive EagleX is here
Based on the RWKV-v5 architecture, bringing into the open-source 7B space the best SOTA:
- Multi-lingual model
- English perplexity model
- Attention-free transformer today
(10-100x+ lower inference cost)
With comparable English performance to Mistral
If you want to give it a quick try, you can head to our official Hugging Face demo of our latest model here:
We would strongly encourage you to try in non-English languages!
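If you would rather run it locally than use the demo, here is a minimal sketch using the community `rwkv` pip package. The checkpoint path is a placeholder (download the weights from our Hugging Face repo first), and the strategy string should be adjusted to your hardware:

```python
# Minimal local-inference sketch with the `rwkv` pip package (pip install rwkv).
# The checkpoint path is a placeholder; point it at a downloaded RWKV-v5
# (Eagle / EagleX) .pth file.
import os
os.environ["RWKV_JIT_ON"] = "1"    # enable the JIT kernels
os.environ["RWKV_CUDA_ON"] = "0"   # set to "1" to compile the CUDA kernel

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

model = RWKV(model="/path/to/eaglex-7b.pth", strategy="cuda fp16")
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")  # world (multi-lingual) tokenizer

args = PIPELINE_ARGS(temperature=1.0, top_p=0.7, top_k=100,
                     alpha_frequency=0.25, alpha_presence=0.25)

prompt = "User: Write a short greeting in French.\n\nAssistant:"
print(pipeline.generate(prompt, token_count=100, args=args))
```

On a CPU-only machine a strategy like "cpu fp32" also works, though it will be slow for a 7B model.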
In terms of actual multi-lingual eval numbers, we see a substantial overall jump (by 4%!) from our previous RWKV-v4-based architecture, even with the same training dataset.
A huge win for 50% of the world's population
(going beyond the ~17% of the world that speaks English)
This is significant, because it shows clear evidence that RWKV / linear transformers...
Have strong potential to replace existing attention-based architectures, with substantially lower inference cost and no feature compromise
So all we need to do next is get GPUs & scale
RWKV v5 3B model (preview) is out
The final fine-tune, to increase its context length to 8k, is on its way, which will hopefully also give that final score bump
For now it looks on track to match the top 3B models in English, and surpass them all in multi-lingual benchmarks
Regardless, we plan to further train this model with another 1T tokens, to bring it into direct comparison with the LLaMA 2 7B model, and hopefully surpass it
Because it seems like we are scaling like a transformer by token count, as seen by our similar scores against Pythia at the 300B-token mark
Meanwhile, English-based evals show a similar leap, bringing us in line with the token scaling laws of transformers
Where we trade blows with other models trained on a similar (or higher) token count
Before losing out to much longer-trained models like Mistral
All while being
- Cleanly licensed Apache 2, under
@linuxfoundation
(do anything with it!)
- The world's greenest 7B model
(by per-token energy consumption)
- Trained on 2.25T tokens
You can find out more from our full writeup here:
Stay tuned for more details on our upcoming models this week
- Eagle: 7B params, 2.25T tokens
- Finch: 1.6B params, 2.5T tokens
(Some of you probably already know where to find it, if you search through our repos / discord)
This also marks the final Eagle model in our v5 line.
Future Finch models will be based on the v6 architecture, which has shown roughly 10% (give or take) improvement in performance over v5
While being upcycling-compatible with v5 (see the sketch below)
So here comes the Finch
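"Upcycling compatible" means a v6 model can be initialized from a v5 checkpoint rather than from scratch. Below is a hedged sketch of that idea in generic PyTorch terms, not our actual conversion script (`V6Model` and the paths are placeholders): copy every tensor whose name and shape still match, and leave the new v6-only parameters at their fresh initialization.

```python
# Illustrative "upcycling" sketch in plain PyTorch: initialize a newer
# architecture from an older checkpoint by copying the parameters that
# still match. Generic pattern, not the exact RWKV v5 -> v6 conversion.
import torch

def upcycle(new_model: torch.nn.Module, old_ckpt_path: str) -> torch.nn.Module:
    old_state = torch.load(old_ckpt_path, map_location="cpu")
    new_state = new_model.state_dict()

    copied = dropped = 0
    for name, tensor in old_state.items():
        # Copy only tensors whose name and shape are unchanged in the new arch.
        if name in new_state and new_state[name].shape == tensor.shape:
            new_state[name] = tensor
            copied += 1
        else:
            dropped += 1  # renamed/re-shaped; the new counterpart keeps fresh init

    new_model.load_state_dict(new_state)
    print(f"copied {copied} tensors from the old checkpoint, dropped {dropped}")
    return new_model

# usage (placeholders): model = upcycle(V6Model(config), "rwkv-v5-finch.pth")
```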
The RWKV community wiki can be found at:
Our discord can be found at:
Give the model a try, drop by our Discord, and give us feedback on how we can improve the model for the community.
Does this cover our latest model?
No, this covers our previously released Eagle and Finch lines of models, trained on up to 1.1T tokens
A reminder that, as a fully open-source project, we release in the following sequence: code, weights, then the paper
Not the other way around
Why is this progress significant?
Because it shows clear evidence that RWKV / linear transformers...
Have the potential to replace existing attention-based architectures, with substantially lower inference cost and no feature compromise
Paper at:
Wrapping up:
#RWKV
was originally created by
@BlinkDL_AI
as a project at
@AiEleuther
, and is now hosted by
@LFAIDataFdn
Compute for this training was sponsored by
@recursal_AI
You can find the latest EagleX model on their cloud platform here:
As with the previous 7B model, we push the open-source SOTA landscape further, with leading English perplexity performance.
While maintaining SOTA multi-lingual performance across 23 languages
This is in line with our OSS group's overall goal:
To ensure the best AI models are made accessible to everyone worldwide, regardless of language or economic status
(approximate map of languages supported worldwide)
All while surpassing LLaMA 2 7B across a mixture of 21 popular English evals.
While closing the gap with Mistral 7B.
Proving that, with continued training, the model architecture scales similarly to (or better than) transformers by token count.
Special shout-outs to
@BlinkDL_AI
: the creator of RWKV
@AiEleuther
: awesome folks who helped us in the paper-authoring process
@LFAIDataFdn
: for hosting the OSS project
@StabilityAI
: for sponsoring a large part of the GPU compute used for these documented models
If you want to give it a quick try, you can head to our official Hugging Face demo of our latest model here:
We would strongly encourage you to try in non-English languages!
@QuentinAnthon15
@BingchenZhao
In addition, shout out to the various contributors to the dataset, the model architecture, and the training & inference code
Paper authorship reflects paper-writing contribution, which is separate from model creation / code / dataset contribution
A tiny
#RWKV
with 2.9M (!) params can solve 18239.715*9.728263 or 4.2379*564.778-1209.01 etc. with CoT, while being 100%
#RNN
(L6-D192). The trick: generate lots of data with reversed numbers (denoted by "f" here) to train the model. Try it now:
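To make the reversed-number trick concrete, here is a hedged sketch of how such synthetic training data could be generated. The exact prompt format the 2.9M-param model was trained on may differ; the point is that writing numbers with their digits reversed (marked with "f") lets a left-to-right model emit the least significant digits first, which is the order in which long multiplication can actually be carried out:

```python
# Illustrative generator for reversed-number arithmetic data (not necessarily
# the exact format used for the 2.9M-param demo model). "f" marks a number
# written with its digits reversed, so the model can produce the least
# significant digits first, matching how long multiplication proceeds.
import random

def reverse_number(x: str) -> str:
    """Reverse the digit order of a decimal string: '123.45' -> '54.321'."""
    return x[::-1]

def make_example(rng: random.Random) -> str:
    a = round(rng.uniform(1, 20000), 3)
    b = round(rng.uniform(1, 1000), 6)
    question = f"{a}*{b}"
    answer = f"{a * b:.6f}".rstrip("0").rstrip(".")
    # Chain-of-thought target: restate both operands reversed ("f ..."),
    # then give the reversed answer before the final normal-order answer.
    cot = (f"f {reverse_number(str(a))} * f {reverse_number(str(b))} = "
           f"f {reverse_number(answer)} = {answer}")
    return f"Q: {question}\nA: {cot}\n"

rng = random.Random(42)
with open("reversed_mult_data.txt", "w") as out:
    for _ in range(100_000):  # lots of synthetic examples
        out.write(make_example(rng))
```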