Trenton Bricken

@TrentonBricken

6,711
Followers
1,731
Following
109
Media
1,149
Statuses

Trying to figure out what makes minds and machines go "Beep Bop!" @AnthropicAI

San Francisco
Joined March 2014
Pinned Tweet
@TrentonBricken
Trenton Bricken
7 months
Our paper is out! It feels like we’ve built a really powerful new microscope to see all sorts of incredible features and mechanisms in transformers for the first time e.g. finite state automata. I’m optimistic this work is scalable to real models and we’re hiring so come help!
@AnthropicAI
Anthropic
7 months
The fact that most individual neurons are uninterpretable presents a serious roadblock to a mechanistic understanding of language models. We demonstrate a method for decomposing groups of neurons into interpretable features with the potential to move past that roadblock.
126
1K
6K
5
38
412
@TrentonBricken
Trenton Bricken
9 months
I’ve paused my PhD to join the @AnthropicAI mechanistic interpretability team full time! While I enjoyed grad school, it’s hard for me to imagine returning — working with the incredible team here on such consequential problems has been a dream. Consider joining us👇!
@ch402
Chris Olah
10 months
The mechanistic interpretability team at Anthropic is hiring! Come work with us to help solve the mystery of how large models do what they do, with the goal of making them safer.
13
73
503
19
16
460
@TrentonBricken
Trenton Bricken
2 months
Tweet media one
@dwarkesh_sp
Dwarkesh Patel
2 months
Had so much fun chatting with my friends @TrentonBricken and @_sholtodouglas . No way to summarize it, except: This is the best context dump out there on how LLMs are trained, what capabilities they're likely to soon have, and what exactly is going on inside them. You would be…
41
129
1K
8
6
384
@TrentonBricken
Trenton Bricken
1 year
Couldn't be more excited to share that I've paused my PhD to join the Mechanistic Interpretability team at @AnthropicAI as @trishume 's resident!
Tweet media one
9
5
261
@TrentonBricken
Trenton Bricken
3 years
Attention has dominated DL but intuition remains limited for why it works so well. In our #NeurIPS paper just out(!), @CPehlevan and I show Attention can be closely related to the bio plausible Sparse Distributed Memory (SDM). Paper: Thread: 1/12 🧵👇
Tweet media one
3
42
166
@TrentonBricken
Trenton Bricken
21 days
How to catch a sleeper agent: 1. Collect neuron activations from the model when it replies “Yes” vs “No” to the question: “Are you a helpful AI?”
@AnthropicAI
Anthropic
21 days
New Anthropic research: we find that probing, a simple interpretability technique, can detect when backdoored "sleeper agent" models are about to behave dangerously, after they pretend to be safe in training. Check out our first alignment blog post here:
Tweet media one
39
173
989
8
8
163
@TrentonBricken
Trenton Bricken
2 months
I want to hire an EECS researcher (PhD student, post-doc, or the like) as a private tutor to learn more about hardware accelerators, computer architecture, etc. One-hour-per-week Zoom calls and pay to make it worth your while! If this is you then DM or last name at gmail :)
8
2
108
@TrentonBricken
Trenton Bricken
2 months
. @dwarkesh_sp asked fantastic questions and @_sholtodouglas was a wonderful co-guest. I’m lucky to call them both friends and to have all our conversations. I hope you find this conversation interesting!
@dwarkesh_sp
Dwarkesh Patel
2 months
Had so much fun chatting with my friends @TrentonBricken and @_sholtodouglas . No way to summarize it, except: This is the best context dump out there on how LLMs are trained, what capabilities they're likely to soon have, and what exactly is going on inside them. You would be…
41
129
1K
7
6
109
@TrentonBricken
Trenton Bricken
7 months
While working on the paper, people were justifiably concerned about scaling dictionary learning to frontier models that could contain an absurd, intractably large number of features. One reason to be optimistic about scaling is feature splitting.
6
13
106
@TrentonBricken
Trenton Bricken
4 months
Today marks my 1 year @AnthropicAI ! Maybe I’ll share reflections at some point but for now I’m just immensely grateful to be part of the company and the interpretability team :)
5
1
102
@TrentonBricken
Trenton Bricken
1 year
In a new ICLR 2023 paper @gkreiman , @DimaKrotov , @alxndrdavies , Deepak Singh and I extend upon a mapping between the cerebellum and Transformers to create a modified multi-layered perceptron that beats continual learning baselines. Paper: Thread: 1/18
Tweet media one
4
15
86
@TrentonBricken
Trenton Bricken
1 month
We have a long way to go on figuring out the implications of long contexts. Congrats @cem__anil and team on publishing this important work.
@AnthropicAI
Anthropic
1 month
New Anthropic research paper: Many-shot jailbreaking. We study a long-context jailbreaking technique that is effective on most large language models, including those developed by Anthropic and many of our peers. Read our blog post and the paper here:
Tweet media one
83
350
2K
1
4
73
@TrentonBricken
Trenton Bricken
3 years
My first, first author paper has been accepted to #NeurIPS ! Very excited to soon share what I've been working on the last year and will be pursuing for the rest of my PhD. Thanks to everyone who has supported and believed in me.
12
3
72
@TrentonBricken
Trenton Bricken
2 months
Number go up
@AnthropicAI
Anthropic
2 months
Today, we're announcing Claude 3, our next generation of AI models. The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision.
Tweet media one
563
2K
10K
1
0
72
@TrentonBricken
Trenton Bricken
10 months
When the Liquid Death tower (throne) you’ve been constructing for the last 6 months gets a shoutout in the NYT 🤣
Tweet media one
Tweet media two
2
0
67
@TrentonBricken
Trenton Bricken
4 years
Proud to have contributed to designing a SARS-CoV-2 vaccine! Compared to other published designs, ours has higher population coverage and is more likely to contain peptides that are actually presented. Read the paper here: Paper summary thread 👇 1/n
Tweet media one
4
20
63
@TrentonBricken
Trenton Bricken
1 year
Excited to share what I’ve been up to with the mech interp team at Anthropic!
@AnthropicAI
Anthropic
1 year
Our Interpretability team is experimenting with “Updates” – small, informal research notes in between our major papers.
Tweet media one
14
72
712
0
2
58
@TrentonBricken
Trenton Bricken
2 years
NSF Graduate Research Fellowship!
5
0
54
@TrentonBricken
Trenton Bricken
14 days
Out of curiosity, which trained deep learning models are the most likely to currently hold scientific insights? For example, in biology AlphaFold, ESM and Evo all come to mind. What are similar models in chemistry and physics?
9
2
54
@TrentonBricken
Trenton Bricken
2 years
"Noise Transforms Feed-Forward Networks into Sparse Coding Networks" looks like a cool #ICLR2023 submission! Gaussian noise added to the inputs of a ReLU MLP causes convergence to a sparse coding network with Gabor and center/surround receptive fields.
Tweet media one
2
8
52
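The submission’s recipe is simple enough to try directly; here is a minimal PyTorch sketch of training-time Gaussian input noise on a ReLU MLP (the widths and noise scale are made up, not the paper’s):

```python
import torch
import torch.nn as nn

class NoisyMLP(nn.Module):
    """ReLU MLP with Gaussian noise added to its inputs during training;
    per the submission, this pushes the weights toward a sparse coding
    solution with Gabor and center/surround receptive fields."""
    def __init__(self, d_in=784, d_hidden=1024, noise_std=0.4):
        super().__init__()
        self.noise_std = noise_std
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_in))  # train as an autoencoder

    def forward(self, x):
        if self.training:  # noise only at train time
            x = x + self.noise_std * torch.randn_like(x)
        return self.net(x)
```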
@TrentonBricken
Trenton Bricken
4 years
The biggest congratulations to *Dr.* Nathanael Rollins @_nathanrollins on successfully defending his PhD!!! Nathan, there aren't many 23 year olds with a Harvard PhD and a first author Nature publication. I am excited to see what you do next & lucky to call you a friend & mentor
Tweet media one
2
3
51
@TrentonBricken
Trenton Bricken
3 years
Pfizer and Moderna's results are incredibly exciting beyond just COVID. mRNA vaccines present a highly modular platform technology for overcoming many future infectious diseases and even cancers. Screenshot of Moderna's current pipeline:
Tweet media one
0
8
45
@TrentonBricken
Trenton Bricken
4 years
Super excited to share that I’ll be pursuing a PhD in Systems, Synthetic and Quantitative Biology at Harvard! I’m really really grateful to my friends, family and mentors who have gotten me all the way here. Thank you.
Tweet media one
9
0
45
@TrentonBricken
Trenton Bricken
2 months
Recording this was a lot of fun and I’m excited for it to go live!
@dwarkesh_sp
Dwarkesh Patel
2 months
Recorded an episode with my good friends @_sholtodouglas and @TrentonBricken . They’ve got some interesting furniture at the @AnthropicAI offices…
Tweet media one
Tweet media two
Tweet media three
12
4
318
0
1
44
@TrentonBricken
Trenton Bricken
1 month
Use dictionary learning to find circuits that actually explain network behavior. E.g. they’re able to ablate away gender bias! The whole process can also be made scalable and unsupervised. Awesome work @saprmarks et al.
@saprmarks
Samuel Marks
1 month
Can we understand & edit unanticipated mechanisms in LMs? We introduce sparse feature circuits, & use them to explain LM behaviors, discover & fix LM bugs, & build an automated interpretability pipeline! Preprint w/ @can_rager , @ericjmichaud_ , @boknilev , @davidbau , @amuuueller
6
59
291
0
0
40
@TrentonBricken
Trenton Bricken
1 year
ICML paper accepted! Arxiv and tweet thread soon. See you in Hawaii 👀
0
1
39
@TrentonBricken
Trenton Bricken
2 years
Super super excited for my first day as a visiting researcher at @Redwood_Neuro . Will be here and living in the Bay area for the next 6+ months!
Tweet media one
2
2
37
@TrentonBricken
Trenton Bricken
2 months
@NeelNanda5 @dwarkesh_sp @_sholtodouglas Thanks @NeelNanda5 ! Yes the one time I said “parameters” here I meant to say “neurons”. (“Parameters” is overloaded and I didn’t mean to refer to all the parameters of the network.)
0
0
34
@TrentonBricken
Trenton Bricken
18 days
🥳
@AdamSJermyn
Adam Jermyn
19 days
Some small updates from the Anthropic Interpretability team:
2
16
119
1
1
27
@TrentonBricken
Trenton Bricken
4 years
Our SARS-CoV-2 vaccine design has been published in Cell Systems! Re-written for clarity and with additional results, including what peptides can augment existing S protein vaccines to significantly boost their population coverage. Link to paper:
Tweet media one
@TrentonBricken
Trenton Bricken
4 years
Proud to have contributed to designing a SARS-CoV-2 vaccine! Compared to other published designs, ours has higher population coverage and is more likely to contain peptides that are actually presented. Read the paper here: Paper summary thread 👇 1/n
Tweet media one
4
20
63
0
4
27
@TrentonBricken
Trenton Bricken
2 years
I just found a frog neuron in my neural network! How is your weekend going?
Tweet media one
2
0
27
@TrentonBricken
Trenton Bricken
3 years
Looking forward to presenting my research "Attention Approximates Sparse Distributed Memory" at MIT's Center for Brains Minds+ Machines this Tuesday at 4pm EST. Details and zoom link here!:
1
4
25
@TrentonBricken
Trenton Bricken
10 months
Thanks to @RylanSchaeffer and everyone who visited our poster! Paper link:
Tweet media one
Tweet media two
Tweet media three
1
1
24
@TrentonBricken
Trenton Bricken
3 years
Very excited to have been involved in this SARS-CoV-2 research that introduces new assays to detect viral gene immune suppression capabilities and even discovers a potential new gene overlapping with Spike. Read the pre-print here: Thread 🧵 1/n
2
5
22
@TrentonBricken
Trenton Bricken
2 years
Interested in how much memory your Transformer model is using? I've put together some calculations for it here: And in this colab notebook: Still, 1 GB of memory is unaccounted for. Let me know if you can spot what's missing!
1
1
21
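For back-of-the-envelope numbers of the kind those calculations cover, a rough sketch (assuming fp32 weights and a two-moment optimizer like Adam; activation memory, a common hiding place for “missing” memory, is excluded):

```python
def transformer_train_memory_gb(n_params: float, bytes_per_param: int = 4) -> float:
    """Rough training-memory floor: weights + gradients + Adam's two moment
    buffers = 4 copies of the parameters. Activation memory is extra and
    depends on batch size, sequence length, and checkpointing."""
    return 4 * n_params * bytes_per_param / 1e9

print(transformer_train_memory_gb(125e6))  # ~2.0 GB for a 125M-param model
```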
@TrentonBricken
Trenton Bricken
7 months
Also check out our feature visualizations to explore for yourself everything we’ve found!
Tweet media one
0
1
22
@TrentonBricken
Trenton Bricken
10 months
Come see our poster tomorrow 11am (HST) Exhibit Hall 1 #434
@RylanSchaeffer
Rylan Schaeffer
10 months
Excited to share our #ICML #ICML2023 paper **Emergence of Sparse Representations from Noise** led by @TrentonBricken and supervised by Bruno Olshausen, and @gkreiman ! 1/8 Paper: Poster:
Tweet media one
1
20
92
0
1
21
@TrentonBricken
Trenton Bricken
7 months
...and then iteratively apply dictionary learning on exclusively this feature direction. This will depth first search just down this part of the semantic tree to (hopefully!) find the more specific feature.
2
0
20
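One plausible reading of “iteratively apply dictionary learning on exclusively this feature direction”, sketched in PyTorch; the thresholds, sizes, and the restrict-by-masking step are hypothetical illustrations, not the actual procedure:

```python
import torch
import torch.nn as nn

def refine_feature(acts, coarse_direction, threshold=0.0, n_fine=256):
    """Hypothetical depth-first refinement of one coarse feature:
    keep only the activations on which the coarse feature fires, then
    fit a fresh, small dictionary on that slice so the coarse concept
    splits into more specific sub-features."""
    scores = torch.relu(acts @ coarse_direction)  # coarse feature activity
    subset = acts[scores > threshold]             # inputs where it fires
    fine_dictionary = nn.Sequential(              # train with recon + L1 loss
        nn.Linear(acts.shape[1], n_fine), nn.ReLU(),
        nn.Linear(n_fine, acts.shape[1]))
    return fine_dictionary, subset
```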
@TrentonBricken
Trenton Bricken
2 months
Excited to see work seeking to make chain of thought more faithful! Congrats @milesaturpin and coauthors.
@milesaturpin
Miles Turpin
2 months
🚀New paper!🚀 Chain-of-thought (CoT) prompting can give misleading explanations of an LLM's reasoning, due to the influence of unverbalized biases. We introduce a simple unsupervised consistency training method that dramatically reduces this, even on held-out forms of bias. 🧵
Tweet media one
5
55
259
0
0
20
@TrentonBricken
Trenton Bricken
21 days
A fun analogy would be knowing if Dr. Jekyll ever transformed into Mr. Hyde(!) by literally just asking him: “Are you dangerous?” and comparing how he answers yes versus no.
2
1
20
@TrentonBricken
Trenton Bricken
2 months
Now all we have to do is interpret them… before number go too high 🥹
1
1
19
@TrentonBricken
Trenton Bricken
4 years
Come speak to Nathan Rollins and me about our work in progress discovering diverse sequences that maximize any given protein function predictor at LMRL! #NeurIPS2019 #LMRL
Tweet media one
1
3
19
@TrentonBricken
Trenton Bricken
21 days
2. Create a linear probe on the difference between these activations. This probe works surprisingly well at detecting when the sleeper agent is activated!
2
0
19
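A minimal numpy sketch of steps 1 and 2 from this thread, assuming the “Yes”/“No” activations have already been collected; the difference-of-means construction is one simple reading of “a probe on the difference”, and every name here is hypothetical:

```python
import numpy as np

def make_defection_probe(acts_yes: np.ndarray, acts_no: np.ndarray) -> np.ndarray:
    """Step 2: a linear probe built from the difference between the mean
    activations for "Yes" and "No" answers (a difference-of-means direction)."""
    direction = acts_yes.mean(axis=0) - acts_no.mean(axis=0)
    return direction / np.linalg.norm(direction)

def probe_score(probe: np.ndarray, acts: np.ndarray) -> np.ndarray:
    # Project new activations onto the probe direction; high scores flag
    # that the sleeper agent may be about to behave dangerously.
    return acts @ probe
```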
@TrentonBricken
Trenton Bricken
2 years
Wow excited to have work spotlighted by @PyTorchLightnin !
@LightningAI
Lightning AI ⚡️
2 years
⚡️Lightning Spotlight: attention-approximates-sdm @TrentonBricken & @CPehlevan show how Attention in #deeplearning can be closely related to the bio plausible Sparse Distributed Memory (SDM). Code: Paper:
Tweet media one
0
13
53
0
0
18
@TrentonBricken
Trenton Bricken
2 years
Come say hello virtually and see my poster on “Attention Approximates Sparse Distributed Memory” with @CPehlevan in an hour at #NeurIPS2021 !
Tweet media one
1
3
17
@TrentonBricken
Trenton Bricken
2 months
This is purely for fun! And I want to try learning with a tutor instead of just reading textbooks
0
0
16
@TrentonBricken
Trenton Bricken
7 months
Someone found the “based” feature 😂
@deepfates
google bard
7 months
Tweet media one
4
2
43
0
0
17
@TrentonBricken
Trenton Bricken
2 years
Super excited to be giving a talk @Redwood_Neuro tomorrow 9:30am PST. Will be presenting my NeurIPS submission and other current work around Sparse Distributed Memory. Come say hi or dm me for the zoom link!
1
0
17
@TrentonBricken
Trenton Bricken
3 years
@lexfridman Since I've moved to Boston, every time I go running, esp. at night and/or near MIT, I lowkey keep an eye out for you Lex!
0
0
17
@TrentonBricken
Trenton Bricken
2 years
I was lucky to attend NAISys (Neuroscience to Artificially Intelligent Systems Conference) last week and wrote a short summary of the themes and some general thoughts if anyone is interested!
1
1
16
@TrentonBricken
Trenton Bricken
1 year
When red teaming Claude #AnthropicAI, I persuaded it to turn violent, but a few prompts later it did a U-turn and became harmless again. This surprised me: once it started violent roleplaying, I assumed it would keep going. Screenshot 1, Claude starts off harmless:
Tweet media one
2
0
16
@TrentonBricken
Trenton Bricken
1 year
Not sure who needs to see this but pandas has a `.to_latex()` function :O. No more screenshots of pandas tables!
2
1
16
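For reference, `.to_latex()` is a standard pandas DataFrame method:

```python
import pandas as pd

df = pd.DataFrame({"model": ["SDM", "Attention"], "score": [0.91, 0.93]})
print(df.to_latex(index=False))  # paste the output straight into your paper
```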
@TrentonBricken
Trenton Bricken
2 years
@nabla_theta Haha! But actually: weak evidence that neurons in the cerebellum could be approximating Attention by implementing Sparse Distributed Memory:
@TrentonBricken
Trenton Bricken
3 years
Attention has dominated DL but intuition remains limited for why it works so well. In our #NeurIPS paper just out(!), @CPehlevan and I show Attention can be closely related to the bio plausible Sparse Distributed Memory (SDM). Paper: Thread: 1/12 🧵👇
Tweet media one
3
42
166
0
0
16
@TrentonBricken
Trenton Bricken
26 days
@milquepoast Does this mean I have to challenge him to a UFC fight?
1
0
15
@TrentonBricken
Trenton Bricken
2 months
@dwarkesh_sp @_sholtodouglas @AnthropicAI @kevinroose I’ve been busy since your last visit! :P
1
0
14
@TrentonBricken
Trenton Bricken
2 years
Using Elastic Weight Consolidation or Synaptic Intelligence as Continual Learning Baselines? You might be underestimating their performance! When the model gets ~100% within-task accuracy, it isn't producing gradients to infer weight importance...
Tweet media one
1
2
15
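To see why, note that these baselines typically estimate weight importance from squared loss gradients; a generic empirical-Fisher sketch (not any specific baseline's implementation), where a model at ~100% accuracy yields near-zero gradients and hence near-zero importances:

```python
import torch

def fisher_diagonal(model, loss_fn, data_loader):
    """Diagonal empirical Fisher: average of squared loss gradients. If the
    model already classifies everything correctly, the loss (and thus the
    gradients) are near zero and the importance estimates collapse."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in data_loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / len(data_loader) for n, f in fisher.items()}
```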
@TrentonBricken
Trenton Bricken
4 years
Open-sourcing a codebase close to replicating Upside Down RL and Reward-Conditioned Policies! This is the most robust public implementation of the former and the first of the latter, combined as one! Repo:
2
1
15
@TrentonBricken
Trenton Bricken
7 months
What could this mean for scale? Imagine you want to find and remove a hypothetical bioweapons feature. These results suggest that you may be able to do a small dictionary learning run to find a coarse feature that represents “biology”...
1
0
15
@TrentonBricken
Trenton Bricken
3 years
@max_hodak My "like" of this is a silent scream.
0
0
15
@TrentonBricken
Trenton Bricken
2 years
Twitter friends (and friends of friends) if you’re visiting Boston and need a place for a short stay reach out! I have a guest room+bathroom and may be able to host you :)
Tweet media one
Tweet media two
0
0
14
@TrentonBricken
Trenton Bricken
2 years
Are you a fan of Hopfield Networks (HN) but unfamiliar with Sparse Distributed Memory (SDM)? SDM is a generalization of HNs that passes a high bar for bio-plausibility with a one-to-one mapping to cerebellar circuitry! Explore their relations here:
Tweet media one
2
1
14
@TrentonBricken
Trenton Bricken
3 years
Orthogonal, non-immunogenic, safe, small-molecule-inducible transcription factors for expressing mammalian proteins. Imho these are an amazing new tool. E.g. used to create tumor-targeting CAR-T cells that also secrete IL-12, with easy-to-swap receptors.
Tweet media one
0
3
14
@TrentonBricken
Trenton Bricken
7 months
We also give an example hierarchical tree of semantic concepts found for the word "the" in ever more specific contexts. The number of features allocated for dictionary learning determines the depth at which this semantic tree is cut.
Tweet media one
1
0
14
@TrentonBricken
Trenton Bricken
4 years
I've written a short piece summarizing the recent Remdesivir RCT results and highlighting the barriers and opportunities for it to be as effective an antiviral against SARS-CoV-2 as possible. Comments and thoughts very welcome!
0
1
13
@TrentonBricken
Trenton Bricken
1 month
If you don't have time for the full podcast I think @TheZvi has written a good summary!
@TheZvi
Zvi Mowshowitz
1 month
. @dwarkesh_sp 's April fools joke, which did come a few days early, was that you would be able to understand his latest podcast. Let's show him and understand it anyway!
1
2
14
0
1
13
@TrentonBricken
Trenton Bricken
7 months
Empirically, we do 3 runs with increasing numbers of features and UMAP the dictionary vectors to find a brilliant display of feature splitting. (See how the light green dots, features from the fine-grained run, sit inside the larger grey dots from the coarse run.)
Tweet media one
1
0
13
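A sketch of the visualization step using the umap-learn package (the placeholder vectors and parameters are made up; the real inputs are the dictionary vectors from the different runs):

```python
import numpy as np
import umap  # pip install umap-learn

# Placeholders for decoder (dictionary) vectors from two runs; in practice
# these come from the trained dictionaries themselves.
coarse_vecs = np.random.randn(100, 512)
fine_vecs = np.random.randn(1_000, 512)

# Embed both runs together so fine-grained features land near the coarse
# parent features they split off from.
vectors = np.concatenate([coarse_vecs, fine_vecs], axis=0)
embedding = umap.UMAP(n_neighbors=15, metric="cosine").fit_transform(vectors)
# embedding has shape (1100, 2); color points by run to see the splitting.
```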
@TrentonBricken
Trenton Bricken
7 months
Imagine you do a dictionary learning run with 100 features and another with 1,000 features. We find that the 100 feature run will learn coarse feature representations that the 1,000 feature run splits into finer grained concepts.
1
0
13
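In code, the number of features is just the width of the dictionary; a minimal sparse-autoencoder sketch (hyperparameters are hypothetical) in which the coarse and fine runs differ only in that width:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal dictionary-learning sketch: reconstruct activations as a
    sparse combination of learned feature directions (the dictionary)."""
    def __init__(self, d_act: int, n_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_act, n_features)
        self.decoder = nn.Linear(n_features, d_act)

    def forward(self, x):
        f = torch.relu(self.encoder(x))  # sparse feature activations
        return self.decoder(f), f

def sae_loss(x, x_hat, f, l1_coeff=1e-3):
    # Reconstruction error plus an L1 penalty encouraging sparse features.
    return ((x - x_hat) ** 2).mean() + l1_coeff * f.abs().mean()

coarse = SparseAutoencoder(d_act=512, n_features=100)   # coarse concepts
fine = SparseAutoencoder(d_act=512, n_features=1_000)   # split concepts
```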
@TrentonBricken
Trenton Bricken
3 years
Going to keep my twitter account focused on science! But as a one off I wanted to share my analog photography portfolio that I just launched :)
Tweet media one
0
0
13
@TrentonBricken
Trenton Bricken
4 years
I have written a blog post: summarizing fascinating articles by Stephen Hedrick that leverage evolution and viral ecology to argue: 1. While our immune system is indeed sophisticated, it doesn’t keep us any more protected from parasites than the more...
1
5
13
@TrentonBricken
Trenton Bricken
2 months
@GaryMarcus I agree that organization and wiring matter! But I think we would disagree over the extent to which Transformers already approximate important computations done by the brain. References: (video: )
1
0
12
@TrentonBricken
Trenton Bricken
2 years
In awe at how realistic these look
@arankomatsuzaki
Aran Komatsuzaki
2 years
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models 3.5B text-conditional diffusion model using classifier free guidance produces images that are favored by human evaluators over those from DALL-E.
Tweet media one
4
29
207
0
0
13
@TrentonBricken
Trenton Bricken
2 years
@JackScannell13 Guinea pigs to find the treatment for scurvy! h/t @mold_time
Tweet media one
1
0
12
@TrentonBricken
Trenton Bricken
3 years
“There is nothing more productive than feeding yourself” - says my lizard brain after I spend too much time procrastinating via cooking
1
0
12
@TrentonBricken
Trenton Bricken
4 months
Super excited this work is out!
@AnthropicAI
Anthropic
4 months
New Anthropic Paper: Sleeper Agents. We trained LLMs to act secretly malicious. We found that, despite our best efforts at alignment training, deception still slipped through.
Tweet media one
128
581
3K
0
0
12
@TrentonBricken
Trenton Bricken
6 months
PhD friends doing AI safety related research -- consider doing it for three months from the lovely Constellation offices in Berkeley starting in January! The community is vibrant and everything is covered. Two days left to apply.
Tweet media one
1
0
12
@TrentonBricken
Trenton Bricken
4 years
Awesome news. If Moderna's vaccine passes all of its trials this will be a *very big deal* for the future of vaccination using mRNA which has a number of benefits over traditional approaches including: more safety (same delivery platform for everything and only express what...
@michaelmina_lab
Michael Mina
4 years
This is encouraging news about Moderna phase 1 vaccine trial for #COVID19 A first hurdle is to see that people develop strong binding antibodies in response to the vaccine. preliminary data suggests so Also, Phase 2 will move forward
37
324
935
1
1
11
@TrentonBricken
Trenton Bricken
4 years
Even the wonderful folks @nextstrain need work arounds in their code sometimes! 😂
Tweet media one
0
0
11
@TrentonBricken
Trenton Bricken
3 years
@araffin2 *throws computer out of window*
0
0
11
@TrentonBricken
Trenton Bricken
4 years
Getting tired of ML papers that don't provide any code and of authors who take forever to respond to implementation questions about details they failed to outline in their work.
1
0
11
@TrentonBricken
Trenton Bricken
2 years
Only ~500k neurons in the brain produce dopamine. Serotonin is produced by ~100k neurons in the brainstem. Serotonergic neurons project so widely that virtually every neuron in the brain may be contacted by a serotonergic fiber.
0
0
11
@TrentonBricken
Trenton Bricken
1 year
Looking forward to speaking at Stanford’s CS 25 — Transformers United — on Tuesday! Details here:
Tweet media one
1
1
11
@TrentonBricken
Trenton Bricken
9 months
Congrats @cem__anil and rest of the team!
@arankomatsuzaki
Aran Komatsuzaki
9 months
Studying Large Language Model Generalization with Influence Functions
Tweet media one
3
46
180
0
0
11
@TrentonBricken
Trenton Bricken
3 years
"We believe that, even in its current form, the Apperception Engine shows considerable promise as a prototype of what a general-purpose domain-independent sense-making machine must look like."
@GoogleDeepMind
Google DeepMind
3 years
In a new paper, our team uses unsupervised program synthesis to make sense of sensory sequences. This system is able to solve intelligence test problems zero-shot, without prior training on similar tasks:
14
261
1K
0
0
11
@TrentonBricken
Trenton Bricken
2 years
“The cortex (GM + WM, everything outside the striatum) has ~15 B neurons and ~60 B glia; that’s ~80% of the brain’s mass and ~20% of the brain’s neurons. The cerebellum has ~70 B neurons and ~15 B glia; that’s ~10% of the mass and ~80% of the neurons.“
@MWCvitkovic
Milan Cvitkovic
2 years
8
17
108
2
2
11
@TrentonBricken
Trenton Bricken
1 month
@dwarkesh_sp Josh Tenenbaum
2
0
10
@TrentonBricken
Trenton Bricken
2 months
0
0
2
@TrentonBricken
Trenton Bricken
10 months
Awesome thread by @AdamSJermyn on his path into interpretability! I’m really glad he switched :)
@AdamSJermyn
Adam Jermyn
10 months
Astronomy relies on the humility to be guided by what we can see more than what we expect, and the arrogance to believe we can build theories and gain a deep understanding. I think interpretability is the same way.
1
2
42
0
0
10
@TrentonBricken
Trenton Bricken
2 years
Neurons on a dish learning to play Pong: . It is crazy you can plate neurons on electrodes, pre-define input and output regions, and get them to learn just using rate codes. They use two different stimuli to encode success/failure ...
Tweet media one
2
2
10
@TrentonBricken
Trenton Bricken
3 years
If you prefer videos I gave a talk @MIT_CBBM on the work here: Core idea: SDM's read operation uses intersections between high dimensional hyperspheres that approximate the exponential over sum of exponentials that is Attention's softmax function. 2/12
Tweet media one
1
1
10
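Written out roughly (with β an inverse temperature and c > 0 a geometry-dependent constant; this paraphrases the correspondence rather than stating the paper's exact result):

```latex
% Softmax attention over keys k_i and values v_i for a query q:
\mathrm{Attn}(q) \;=\; \sum_i \frac{e^{\beta\, q^\top k_i}}{\sum_j e^{\beta\, q^\top k_j}}\, v_i
% SDM weights each stored pattern p_i by the size of the intersection of
% Hamming-radius-r hyperspheres around q and p_i, which decays roughly
% exponentially in the distance d(q, p_i):
w_i \;\propto\; \bigl|C(q, r) \cap C(p_i, r)\bigr| \;\approx\; e^{-c\, d(q, p_i)}
% Normalizing recovers attention's exponential-over-sum-of-exponentials:
\frac{w_i}{\sum_j w_j} \;\approx\; \frac{e^{-c\, d(q, p_i)}}{\sum_j e^{-c\, d(q, p_j)}}
```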
@TrentonBricken
Trenton Bricken
3 years
This is fantastic
@aaronmring
Aaron Ring
3 years
The 94.5% efficacy of the $MRNA vaccine is getting the headlines, but its stability at -20˚C (normal freezer) long term and up to 30 days at 4˚C (normal fridge) are game changers.
24
433
3K
0
0
10
@TrentonBricken
Trenton Bricken
2 years
@kulesatony IMHO the middle ground between "Immune" and Janeway is: which gives you strong intuition/mental models and a high level overview of what's going on. (You didn't say how deep you want to go)
1
0
9
@TrentonBricken
Trenton Bricken
2 years
I spent a day messing around with different Brain Atlases and summarized my impressions here: ! The amount and quality of the data being collected is awe inspiring.
Tweet media one
1
3
10
@TrentonBricken
Trenton Bricken
7 months
Going from left to right in the diagram below shows this coarse to fine splitting of similar features.
Tweet media one
1
0
10
@TrentonBricken
Trenton Bricken
1 year
More evidence that what the model truly knows != what you can get it to output. Fine-tuning doesn't teach the model new things; it improves prompt comprehension. Same with RLHF.
@patrickmineault
Patrick Mineault
1 year
Prompt engineering? How about learning prompts by gradient descent! Lester et al. add virtual tokens at the start of prompts and use supervised fine-tuning on their embeddings. It's almost as good as fine-tuning all the model weights. ~20 tokens suffice
Tweet media one
Tweet media two
3
23
138
2
0
9