Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.
Thanks @elonmusk for your generous hyperbole! Admittedly, however, I didn't invent sliced bread, just #GenerativeAI and things like that. And of course my team is standing on the shoulders of giants. Original tweet by @elonmusk: …
The GOAT of tennis @DjokerNole said: “35 is the new 25.” I say: “60 is the new 35.” AI research has kept me strong and healthy. AI could work wonders for you, too!
LeCun's "5 best ideas 2012-22” are mostly from my lab, and older: 1 Self-supervised 1991 RNN stack; 2 ResNet = open-gated 2015 Highway Net; 3&4 Key/Value-based fast weights 1991; 5 Transformers with linearized self-attention 1991. (Also GAN 1990.) Details:
Quarter-century anniversary: 25 years ago we received a message from N(eur)IPS 1995 informing us that our submission on LSTM got rejected. (Don't worry about rejections. They mean little.) #NeurIPS2020
In 2020, we will celebrate that many of the basic ideas behind the Deep Learning Revolution were published three decades ago, within less than 12 months, in our "Annus Mirabilis" 1990-1991:
Q*? 2015: reinforcement learning prompt engineer in Sec. 5.3 of “Learning to Think...”. A controller neural network C learns to send prompt sequences into a world model M (e.g., a foundation model) trained on, say, videos of actors. C also learns to…
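To make the controller/world-model split concrete, here is a minimal toy sketch (my own illustration, not the 2015 system): a controller C learns, via a plain REINFORCE policy gradient, which discrete prompt to send to a frozen world model M; `world_model`, the prompt set, and the reward are hypothetical placeholders.

```python
# Toy sketch: controller C (a categorical policy) learns which prompt to send to a
# frozen world model M so that M's answer earns high reward. Not the 2015 system.
import torch

torch.manual_seed(0)
NUM_PROMPTS = 8          # C picks one of 8 discrete prompt sequences (placeholder)
TARGET = 3               # the prompt that happens to make M produce a useful answer

def world_model(prompt_id: int) -> float:
    """Frozen M: maps a prompt to a scalar 'usefulness' of its answer (toy stand-in)."""
    return 1.0 if prompt_id == TARGET else 0.0

logits = torch.zeros(NUM_PROMPTS, requires_grad=True)   # controller C's policy parameters
opt = torch.optim.Adam([logits], lr=0.1)

for step in range(500):
    dist = torch.distributions.Categorical(logits=logits)
    prompt = dist.sample()                       # C queries M with a prompt
    reward = world_model(prompt.item())          # how useful was M's answer?
    loss = -dist.log_prob(prompt) * reward       # REINFORCE: reinforce rewarded prompts
    opt.zero_grad(); loss.backward(); opt.step()

print("learned prompt distribution:", torch.softmax(logits, -1).detach().numpy().round(2))
```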
Meta used my 1991 ideas to train LLaMA 2, but made it insinuate that I “have been involved in harmful activities” and have not made “positive contributions to society, such as pioneers in their field.” @Meta & LLaMA promoter @ylecun should correct this ASAP. See…
Regarding recent work on more biologically plausible "forward-only" backprop-like methods: in 2021, our VSML net already meta-learned backprop-like learning algorithms running solely in forward-mode - no hardwired derivative calculation!
So @ylecun: "I've been advocating for deep learning architecture capable of planning since 2016" vs me: "I've been publishing deep learning architectures capable of planning since 1990." I guess in 2016 @ylecun also picked up the torch. (References attached)…
Train a weight matrix to encode the backpropagation learning algorithm itself. Run it on the neural net itself. Meta-learn to improve it! Generalizes to datasets outside of the meta-training distribution. v4 2022 with @LouisKirschAI
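A heavily simplified toy sketch of the general idea (assumptions: this is not the actual VSML/2022 architecture, just a generic learned-update-rule demo): a tiny meta-network proposes additive weight updates from local signals, and its parameters are meta-trained by backpropagating through a short unrolled inner loop on random linear-regression tasks.

```python
# Toy learned learning rule, meta-trained by unrolled optimization (not VSML itself).
import torch, torch.nn as nn

torch.manual_seed(0)
DIM, INNER_STEPS, BATCH = 5, 10, 32

# Learned update rule, shared across all weights: (input feature, error) -> delta w
rule = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 1))
meta_opt = torch.optim.Adam(rule.parameters(), lr=1e-2)

def sample_task():
    w_true = torch.randn(DIM)
    X = torch.randn(BATCH, DIM)
    return X, X @ w_true

for meta_step in range(300):
    X, y = sample_task()
    w = torch.zeros(DIM)                               # fresh inner weights per task
    for _ in range(INNER_STEPS):                       # inner loop run by the learned rule
        err = X @ w - y                                # per-example errors, shape (BATCH,)
        inp = torch.stack([X, err[:, None].expand(-1, DIM)], dim=-1)  # (BATCH, DIM, 2)
        w = w + rule(inp).squeeze(-1).mean(0)          # additive, rule-proposed update
    meta_loss = ((X @ w - y) ** 2).mean()              # meta-objective: loss after inner loop
    meta_opt.zero_grad(); meta_loss.backward(); meta_opt.step()

print("meta-loss after meta-training:", meta_loss.item())
```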
Silly AI regulation hype
One cannot regulate AI research, just like one cannot regulate math.
One can regulate applications of AI in finance, cars, healthcare. Such fields already have continually adapting regulatory frameworks in place.
Don’t stifle the open-source movement!…
25th anniversary of the LSTM at #NeurIPS2021. reVIeWeR 2 - who rejected it from NeurIPS 1995 - was thankfully MIA. The subsequent journal publication in Neural Computation has become the most cited neural network paper of the 20th century:
30 years ago: Transformers with linearized self-attention in NECO 1992, equivalent to fast weight programmers (apart from normalization), separating storage and control. Key/value was called FROM/TO. The attention terminology was introduced at ICANN 1993
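A minimal numpy sketch of the stated equivalence (my own illustration, not the 1991/1992 code): unnormalised linear self-attention produces exactly the same outputs as a fast weight matrix programmed by additive outer-product updates and then queried.

```python
# Unnormalised linear attention == fast weight programmer with additive updates.
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4                                  # sequence length, head dimension
K, V, Q = rng.normal(size=(3, T, d))         # keys ("FROM"), values ("TO"), queries

# (1) Unnormalised linear attention: y_t = sum_{i<=t} v_i (k_i . q_t)
y_attn = np.stack([sum(V[i] * (K[i] @ Q[t]) for i in range(t + 1)) for t in range(T)])

# (2) Fast weight programmer: program W by additive outer products, then query with q_t.
W = np.zeros((d, d))
y_fwp = []
for t in range(T):
    W += np.outer(V[t], K[t])                # program the fast weights
    y_fwp.append(W @ Q[t])                   # use them
y_fwp = np.stack(y_fwp)

print(np.allclose(y_attn, y_fwp))            # True: the two views coincide
```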
LeCun's (@ylecun) 2022 paper on Autonomous Machine Intelligence rehashes but doesn't cite essential work of 1990-2015. We've already published his “main original contributions”: learning subgoals, predictable abstract representations, multiple time scales…
Machine learning is the science of credit assignment. My new survey (also under arXiv:2212.11279) credits the pioneers of deep learning and modern AI (supplementing my award-winning 2015 deep learning survey): P.S. Happy Holidays!
Yesterday @nnaisense released EvoTorch (), a state-of-the-art evolutionary algorithm library built on @PyTorch, with GPU acceleration and easy training on huge compute clusters using @raydistributed. (1/2)
Best paper award for "Mindstorms in Natural Language-Based Societies of Mind" at #NeurIPS2023 WS Ro-FoMo. Up to 129 foundation models collectively solve practical problems by interviewing each other in monarchical or democratic societies.
Unlike diffusion models, Bayesian Flow Networks operate on the parameters of data distributions, rather than on noisy versions of the data itself. I think this paper by Alex Graves et al. will be influential.
📣 BFNs: A new class of generative models that
- brings together the strengths of Bayesian inference and deep learning
- trains on continuous, discretized or discrete data with simple end-to-end loss
- places no restrictions on the network architecture
AI boom v AI doom: since the 1970s, I have told AI doomers that in the end all will be good. E.g., 2012 TEDx talk: “Don’t think of us versus them: us, the humans, v these future super robots. Think of yourself, and humanity in general, as a small stepping…
As 2022 ends: 1/2 century ago, Shun-Ichi Amari published a learning recurrent neural network (1972) much later called the Hopfield network (based on the original, century-old, non-learning Lenz-Ising recurrent network architecture, 1920-25)
Kunihiko Fukushima was awarded the 2021 Bower Award for his enormous contributions to deep learning, particularly his highly influential convolutional neural network architecture. My laudation of Kunihiko at the 2021 award ceremony is on YouTube:
The most cited neural nets all build on our work: LSTM. ResNet (open-gated Highway Net). AlexNet & VGG (like our DanNet). GAN (an instance of our Artificial Curiosity). Linear Transformers (like our Fast Weight Programmers).
Now on YouTube: “Modern Artificial Intelligence 1980s-2021 and Beyond.” My talk at AIJ 2020 (Moscow), also presented at NVIDIA GTC 2021 (US), ML Summit 2021 (Beijing), Big Data and AI (Toronto), IFIC (China), AI Boost (Lithuania), ICONIP 2021 (Jakarta)
In 2010, we used Jensen Huang's @nvidia GPUs to show that deep feedforward nets can be trained by plain backprop without any unsupervised pretraining. In 2011, our DanNet was the first superhuman CNN. Today, compute is 100+ times cheaper, and NVIDIA 100+ times more valuable. …
Stop crediting the wrong people for inventions made by others. At least in science, the facts will always win in the end. As long as the facts have not yet won, it is not yet the end. No fancy award can ever change that. #selfcorrectingscience #plagiarism
26 March 1991: Neural nets learn to program neural nets with fast weights - like today’s Transformer variants. Deep learning through additive weight changes. 2021: New work with Imanol & Kazuki. Also: fast weights for metalearning (1992-) and RL (2005-)
In 2016, at an AI conference in NYC, I explained artificial consciousness, world models, predictive coding, and science as data compression in less than 10 minutes. I happened to be in town, walked in without being announced, and ended up on their panel. It was great fun.…
1/3: “On the binding problem in artificial neural networks” with Klaus Greff and @vansteenkiste_s. An important paper from my lab that is of great relevance to the ongoing debate on symbolic reasoning and compositional generalization in neural networks:
375th birthday of Leibniz, founder of computer science (just published in FAZ, 17/5/2021): 1st machine with a memory (1673); 1st to perform all arithmetic operations. Principles of binary computers (1679). Algebra of Thought (1686). Calculemus!
Our #GPTSwarm models Large Language Model Agents and swarms thereof as computational graphs reflecting the hierarchical nature of intelligence. Graph optimization automatically improves nodes and edges.
I was invited to write a piece about Alan M. Turing. While he made significant contributions to computer science, their importance and impact are often greatly exaggerated - at the expense of the field's pioneers. It's not Turing's fault, though.
2010 foundations of the recent $NVDA stock market frenzy: our simple but deep neural net on @nvidia GPUs broke the MNIST record. Things are changing fast. Just 7 months ago, I tweeted: compute is 100x cheaper, $NVDA 100x more valuable. Today, replace "100" by "250."…
Instead of trying to defend his paper on OpenReview (where he posted it), @ylecun made misleading statements about me in popular science venues. I am debunking his recent allegations in the new Addendum III of my critique
2023: 20th anniversary of the Gödel Machine, a mathematically optimal, self-referential, meta-learning, universal problem solver making provably optimal self-improvements by rewriting its own computer code
GANs are special cases of Artificial Curiosity (1990) and also closely related to Predictability Minimization (1991). Now published in Neural Networks 127:58-66, 2020. #selfcorrectingscience #plagiarism
Open Access:
Preprint:
Re: more biologically plausible "forward-only” deep learning. 1/3 of a century ago, my "neural economy” was local in space and time (backprop isn't). Competing neurons pay "weight substance” to neurons that activate them (Neural Bucket Brigade, 1989)
30 years ago in a journal: "distilling" a recurrent neural network (RNN) into another RNN. I called it “collapsing” in Neural Computation 4(2):234-242 (1992), Sec. 4. Greatly facilitated deep learning with 20+ virtual layers. The concept has become popular
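A hedged toy sketch of the distillation idea in today's terms (PyTorch, arbitrary placeholder sizes and data; not the 1992 setup): a smaller student RNN is trained to match the outputs of a frozen teacher RNN instead of the original targets.

```python
# Toy RNN-to-RNN distillation ("collapsing"): student learns to mimic the teacher.
import torch, torch.nn as nn

torch.manual_seed(0)
seq = torch.randn(100, 20, 8)                      # (batch, time, features) toy sequences

class TinyRNN(nn.Module):
    def __init__(self, hidden):
        super().__init__()
        self.rnn = nn.RNN(8, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 3)
    def forward(self, x):
        h, _ = self.rnn(x)
        return self.out(h)                         # per-step outputs

teacher = TinyRNN(hidden=64)                       # pretend this was trained on the real task
student = TinyRNN(hidden=16)                       # smaller net to distill into
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

with torch.no_grad():
    targets = teacher(seq)                         # teacher's outputs become the targets

for epoch in range(200):
    loss = nn.functional.mse_loss(student(seq), targets)
    opt.zero_grad(); loss.backward(); opt.step()

print("distillation loss:", loss.item())
```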
With Kazuki Irie and @robert_csordas at #ICML2022: any linear layer trained by gradient descent is a key-value/attention memory storing its entire training experience. This dual form helps us visualize how neural nets use training patterns at test time
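A quick numpy check of that dual form (my own toy illustration, not the ICML 2022 code): after plain SGD on a squared error, the trained linear layer's output on a test query equals its initial output plus unnormalised attention over the stored training inputs (keys = training inputs, values = -lr * error signals).

```python
# Primal form (weight updates) vs dual form (attention over training experience).
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, lr, steps = 5, 3, 0.05, 50

W0 = rng.normal(size=(d_out, d_in))
W = W0.copy()
keys, values = [], []

for t in range(steps):                        # plain SGD on a squared error loss
    x = rng.normal(size=d_in)
    y = rng.normal(size=d_out)
    err = W @ x - y                           # gradient of 0.5*||Wx - y||^2 w.r.t. Wx
    W -= lr * np.outer(err, x)                # primal form: weight update
    keys.append(x)
    values.append(-lr * err)                  # dual form: store (key, value) instead

x_test = rng.normal(size=d_in)
primal = W @ x_test
dual = W0 @ x_test + sum(v * (k @ x_test) for k, v in zip(keys, values))
print(np.allclose(primal, dual))              # True: trained layer = attention memory
```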
KAUST (17 full papers at #NeurIPS2021) and its environment are now offering huge resources to advance both fundamental and applied AI research. We are hiring outstanding professors, postdocs, and PhD students:
KAUST, the university with the highest impact per faculty, has 24 papers at #NeurIPS2022. Visit Booth #415 of the @AI_KAUST Initiative! We are hiring at all levels.
1/3 century anniversary of my thesis on #metalearning (1987). For its cover I drew a robot that bootstraps itself. 1992-: gradient descent-based neural metalearning. 1994-: meta-RL with self-modifying policies. 2003-: optimal Gödel Machine. 2020: new stuff!
We address the two important things in science: (A) Finding answers to given questions, and (B) Coming up with good questions. Learning one abstract bit at a time through self-invented (thought) experiments encoded as neural networks
2021: Directing the AI Initiative at #KAUST, the university with the highest impact per faculty. Keeping current affiliations. Hiring at all levels. Great research conditions. Photographed a dolphin on a snorkeling trip off the coast of KAUST
In 2001, I discovered how to make very stable rings from only rectangular LEGO bricks. Natural tilting angles between LEGO pieces define ring diameters. The resulting low-complexity artworks reflect the formal theory of beauty/creativity/curiosity:
90th anniversary of Kurt Gödel's 1931 paper, which laid the foundations of theoretical computer science, identifying fundamental limitations of algorithmic theorem proving, computing, AI, logic, and math itself (just published in FAZ @faznet, 16/6/2021)
ACM lauds the awardees for work that did not cite the origins of the methods they used. I correct ACM's distortions of deep learning history and mention 8 of our direct priority disputes with Bengio & Hinton. #selfcorrectingscience
10-year anniversary: Deep Reinforcement Learning with Policy Gradients for LSTM. Applications: @DeepMind's StarCraft player; @OpenAI's dexterous robot hand & Dota player. @BillGates called this a huge milestone in advancing AI. #deeplearning
10-year anniversary of our deep multilayer perceptrons trained by plain gradient descent on GPU, outperforming all previous methods on a famous benchmark. This deep learning revolution quickly spread from Europe to North America and Asia. #deeplearning
2021: 10-year anniversary of the deep CNN revolution through DanNet (2011), named after my outstanding postdoc Dan Ciresan. Won 4 computer vision contests in a row before other CNNs joined the party. 1st superhuman result in 2011. Now everybody is using this
At ICANN 1993, I extended my 1991 unnormalised linear Transformer, introduced attention terminology for it, & published the "self-referential weight matrix." 3 decades later, they made me Chair of ICANN 2024 in Lugano. Call for papers (deadline March 25): …