Associate professor at Carnegie Mellon, VP and Chief Scientist at Bosch Center for AI. Researching (deep) machine learning, robustness, implicit layers.
Excited to announce a collaboration between CMU and the Bosch Center for AI (BCAI). I'm joining BCAI as chief scientist of AI research, and they are funding my CMU group (where I'm remaining full time).
Press release:
My comments:
Generative models and P vs. NP: A clickbaity thread 🧵
An important point that seems missing (as far as I've seen) in the debate over LLM "correctness" / "usefulness": generative models are most useful precisely in situations where creation is hard, but verification is easy. 1/N
I realize this is seemingly an unpopular opinion, but I can't get onboard with these Twitter criticisms of some of the recent #ICML2022 best paper awardees. I've been thinking about this all day. A thread... 🧵 1/N
Join @DavidDuvenaud, @SingularMattrix, and me at 1:30pm PT for our NeurIPS tutorial on Deep Implicit Layers. Neural ODEs, Deep Equilibrium Models, and differentiable optimization!
Extensive notes on companion webpage:
NeurIPS link:
Announcement: This Fall @tqchenml and I are releasing a free online version of our Deep Learning Systems course! The short description: you build a deep learning framework from the ground up.
Sign up at course website:
Video teaser:
Got the "do I use an RNN or CNN for sequence modeling" blues? Use a TrellisNet, an architecture that connects these two worlds, and works better than either! New paper with @shaojieb and Vladlen Koltun.
Paper:
Code:
Why exactly are we still having virtual conferences/workshops on weekends? There is no physical presence that makes a set of contiguous dates more convenient ... why not split workshops/conferences over separate weeks? We'd never expect seminars to be held on weekends, so why these events?
@CadeMetz at the New York Times just published a piece on a new paper we are releasing today, on adversarial attacks against LLMs. You can read the piece here:
And find more info and the paper at: [1/n]
But we need to be honest about who bears 99% of the brunt of this very public and immediate criticism: the PhD student first authors. I can't *fathom* how stressful this must be for them. *I* have been stressed just thinking about it, and I have zero connection to the papers. 4/N
I feel like a lot of people leverage LLMs suboptimally, especially for long-form interactions that span a whole project. So I wrote a VSCode extension that supports what I think is a better use paradigm. 🧵 1/N
Extension:
Code:
One (implicit) layer is all you need! Just posted "Deep Equilibrium Models", a paper with @shaojieb and Vladlen Koltun, also a #NeurIPS2019 spotlight.
Paper:
Code:
Talk video:
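For intuition, a minimal sketch of the DEQ idea (my illustration, not the paper's code): the "layer" is the fixed point z* = f(z*, x), found here by naive iteration (the paper uses root-finding and differentiates through the equilibrium implicitly).

import torch

class DEQLayer(torch.nn.Module):
    def __init__(self, d):
        super().__init__()
        self.W = torch.nn.Linear(d, d, bias=False)  # hidden-to-hidden map
        self.U = torch.nn.Linear(d, d)              # input injection

    def forward(self, x, iters=50):
        z = torch.zeros_like(x)
        for _ in range(iters):  # naive fixed-point iteration
            z = torch.tanh(self.W(z) + self.U(x))
        return z                # approximates z* = f(z*, x)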
Thrilled to receive the #aistats2022 Test of Time Award with Tommi Jaakkola. I'll be speaking tomorrow at 1pm ET about our work, with a healthy dose of skepticism about the ultimate impact of energy disaggregation research.
For all the talk about how industry and "big" academic institutions dominate AI (which I benefit from, to be clear), @pess_r, @robrombach, and Björn Ommer at @UniHeidelberg and then @LMU_Muenchen have had as large an impact over the past year as anyone: VQGAN, then the Stable Diffusion work.
That's a wrap on #ICML2024 submissions! 🎉 Initial count is 9,653 submissions, including 220 position paper submissions ... (the count will certainly be adjusted in the coming days, but for reference, ICML 2023 had 6,538 submissions 😮)
Virtual @iclr_conf replicates the normal conference experience amazingly well! In that I manage to attend hardly any of the actual talks and instead spend all my time in my hotel/home office working on calls, emails, grant applications, reports, ...
Now that NeurIPS is upon us shortly ... it's time to start planning for ICML 😀! Thrilled to serve with @kat_heller, @adrian_weller, and @nuriaoliver as PCs, and @rsalakhu as general chair.
Call for papers is here:
Intro blog post:
Yearly reminder: To get natbib \citep and \citet commands working properly with NeurIPS citation style (numbers rather than author lists), use the following commands.
\usepackage[nonatbib]{neurips_2021}
\usepackage[numbers]{natbib}
...
\bibliographystyle{abbrvnat}
The short summary is that we show how to create adversarial suffixes that you can append to LLM prompts, and which cause the LLMs to respond in a manner that circumvents their safety guards. [2/n]
It finally happened. That recurring dream that I have been enrolled in a course all semester and never attended lecture became a dream that I have been _teaching_ a course all semester and similarly forgot to ever show up. (🎉?)
Oh look, it's I̶C̶L̶R̶ ̶N̶o̶t̶i̶f̶i̶c̶a̶t̶i̶o̶n̶ ̶D̶a̶y̶ ConvMixer De-anonymization Day. 🤷♂️ With @ashertrockman.
On a completely unrelated topic, did everyone know that ICLR 2022 was my first post-tenure deadline... 1/
Is attention really behind the success of Vision Transformers? We think maybe... Patches Are All You Need? 🤷
Check out ConvMixer, the first model that achieves 82%+ ImageNet top-1 accuracy while also fitting into a tweet!
With @zicokolter. 1/4
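(For posterity, a readable paraphrase of the tweet-sized PyTorch model, written from memory, so treat it as a sketch rather than the exact released code.)

import torch.nn as nn

def ConvMixer(dim, depth, kernel_size=9, patch_size=7, n_classes=1000):
    act_bn = lambda m: nn.Sequential(m, nn.GELU(), nn.BatchNorm2d(dim))
    class Residual(nn.Module):
        def __init__(self, fn):
            super().__init__()
            self.fn = fn
        def forward(self, x):
            return self.fn(x) + x
    return nn.Sequential(
        act_bn(nn.Conv2d(3, dim, patch_size, stride=patch_size)),  # patch embedding
        *[nn.Sequential(
            Residual(act_bn(nn.Conv2d(dim, dim, kernel_size,
                                      groups=dim, padding="same"))),  # depthwise conv
            act_bn(nn.Conv2d(dim, dim, 1)))                           # pointwise conv
          for _ in range(depth)],
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(dim, n_classes))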
Thank goodness SlidesLive is here to curate all our NeurIPS presentations. It's clearly impossible to build some kind of "video hosting platform" where we could just post videos ourselves, without 8 weeks of lead time.
The @NeurIPSConf chairs have done a wonderful job managing many unforeseen circumstances, and have my deepest thanks. But in the name of inclusivity, and _because_ they have been so responsive, I want to ask: could we consider killing the desk-reject experiment this year? 1/n
Our research group is starting a (technical) blog! First post, by @RICEric22, covers provable adversarial defenses. Each post is downloadable as a Jupyter notebook, so you can recreate all the examples. More info about the blog here.
I just posted our Deep Learning Systems Lecture 6 on Fully Connected Networks, Optimization, and Initialization:
However, the real topic of interest here is that I used @OpenAI's whisper to caption it entirely. A thread 🧵 on my experience. 1/N
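The basic setup is tiny (file name hypothetical; the open-source whisper package transcribes with timestamps out of the box):

import whisper

model = whisper.load_model("medium.en")
result = model.transcribe("lecture06.mp4")
for seg in result["segments"]:   # start/end times for building caption files
    print(f"{seg['start']:7.2f} {seg['end']:7.2f} {seg['text']}")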
The "field" will be just fine, even in the unthinkable event that papers with some really interesting ideas also have a few honest mistakes, and win a few awards. 10/N, N=10.
In what should be a shining moment of a new researcher's career, they feel publicly called out by (sometimes senior) researchers, then piled upon / all lumped together without nuance, because Twitter loves to criticize. 5/N
New work with @_powei, @priyald17, and Bryan Wilder on (SDP-based, differentiable, approximate) SAT solving as a layer within deep networks. For example, it learns to play 9x9 Sudoku just from examples, without any structure provided (past work like OptNet never scaled beyond 4x4).
Congrats to @_vaishnavh for winning the Outstanding New Directions Paper Award at #NeurIPS2019! Come see his talk this Tuesday, 9/10, at 10:05am in West Exhibition Hall C + B3. Paper available here:
Thrilled that our paper w/ @zicokolter on generalization in deep learning has been selected for the Outstanding New Directions Paper Award at #NeurIPS2019. Extremely grateful to the selection committee, reviewers & many others who provided useful feedback to improve our paper.
Reminder: @tqchenml and I are offering our Deep Learning Systems course publicly this fall. Build your own deep learning framework from scratch!
Starts *September 13*, enroll at . A preview of the first two lectures:
I'm giving a talk today at the ICML New Frontiers in Adversarial Learning Workshop (). The talk is from 3:00-3:30 Hawaii time in Ballroom A.
I'll talk about how we break aligned LLMs.
Website:
Paper:
See 👇
We have been receiving a few questions about the @icmlconf #ICML2024 Impact Statement in submissions, so we made a short post about it here:
Hopefully this is helpful to people trying to understand whether to include anything in this section.
On the other hand, having Bing tell me quarterly revenue from some financial report feels like a horrible use case: precisely because I'd need to look through a document (returned by ordinary search) to verify it anyway. Verification is essentially as hard as creation. 6/N
I've made some substantial updates to my chatllm-vscode extension (long-form LLM chats as VSCode notebooks):
1. GPT-4Vision + Dall-E support
2. @ollama support to use local LLMs (including LLaVA for vision)
3. Azure API support (including via SSO)
Link:
Let's celebrate the papers and authors right now, and look for the aspects that warrant excitement (and again, in some cases I think there is zero merit to the criticism anyway). 9/N
Hi to everyone at #ICML2022! I'll be at the @Bosch_AI booth 1019 today at 2pm ET (and around there before and after), giving an informal short talk on evaluating and adapting to distribution shift. Drop by to chat!
Our Deep Learning Systems lectures 4+5, given by @tqchenml on Automatic Differentiation and its implementation, are now both posted:
Lecture 4:
Lecture 5:
Watch to understand all about the recent memes!
Apropos of recent discussions on whether "Lp robustness is useful", @RICEric22 and I have a new paper on learning adversarial perturbation models from data. This learns to map an L2 latent space (over which we perform adversarial training) to "realistic" perturbations of inputs.
1/ New paper on learning perturbation sets for robust machine learning! We study how to characterize real world perturbations in a well-defined set.
Paper:
Blog post:
Code:
Joint work with @zicokolter.
Excited to release our recent paper and library: easily optimize custom performance metrics (including constrained versions) using deep networks.
Paper:
Code:
Want to easily optimize deep nets for performance metrics like F1 score and beyond?
Check out our latest paper!
AP-Perf: Incorporating Generic Performance Metrics in Differentiable Learning
with @zicokolter
Link:
Matt and I had a blast demonstrating our recent work on LLM attacks () to @donie. Sorry for both mispronouncing your name in the clip, _and_ making the chatbot insult you 😂.
Work with co-authors @andyzou_jiaming and @_zifan_wang.
A trivial point in current circumstances, but took me way too long to figure out, so hopefully of use to others:
To get \citep and \citet commands working with NeurIPS style:
\usepackage[nonatbib]{neurips_2020}
\usepackage[numbers]{natbib}
...
\bibliographystyle{abbrvnat}
@mustafasuleymn @inflectionAI @CadeMetz @kevinroose To be clear, "not vulnerable to these suffixes" != "not vulnerable to this mode of attack". I'm pretty confident that with white-box access we (paper authors) could find similar breaks, though with query access only, let's see ... we just requested API access 😀
We've just released our ICML paper on combining differentiable fluid simulations and graph neural networks to predict fluid flows. Joint work with Thomas Economon and @zicokolter.
Paper:
Poster (requires ICML registration):
To any mid-senior ML researchers who want to start coding again (see also: ), consider volunteering to serve as a program chair for ICML! Clearly it's a trend...
because i'm so busy and totally swamped, i decided to have an uncalled-for exercise on implementing gradient-based planning in a fully-observed 2d grid
please don't ask me why i did it.
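(For reference, the whole exercise fits in a few lines; here is my own toy version under assumed differentiable dynamics, not the code from the tweet.)

import torch

# Plan a trajectory to a goal by gradient descent on the action sequence,
# differentiating through known (toy) dynamics.
T = 20
actions = torch.zeros(T, 2, requires_grad=True)
start, goal = torch.tensor([0.0, 0.0]), torch.tensor([5.0, 5.0])
opt = torch.optim.Adam([actions], lr=0.1)
for _ in range(500):
    pos, cost = start, 0.0
    for a in actions:
        pos = pos + torch.tanh(a)            # bounded, differentiable step
        cost = cost + 0.01 * (a ** 2).sum()  # action penalty
    loss = cost + ((pos - goal) ** 2).sum()  # distance to goal
    opt.zero_grad()
    loss.backward()
    opt.step()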
Really excited about this new work with @deepcohen on the surprising properties of (full batch) gradient descent training of deep networks. See thread (and you know ... the paper ... if you must) for details.
Our new ICLR paper demonstrates that when neural networks are trained using *full-batch* gradient descent: (1) the training dynamics obey a surprisingly simple "story", and (2) this story contradicts a lot of conventional wisdom in optimization.
How do you balance repeat training on high quality data versus adding more low quality data to the mix? And how much do you train on each type?
@pratyushmaini and @goyalsachin007 provide scaling laws for such settings. Really excited about the work!
New paper with @priyald17, @roderickmelrose, and Mahyar Fazlyab. Really excited about these directions for integrating robust control into deep RL methods.
Provable robustness is vital for safety-critical systems like planes/power grids, but deep RL has struggled to provide needed guarantees
Our new #ICLR2021 paper tackles this by fusing robust control & deep RL
w/ @roderickmelrose, M Fazlyab, @zicokolter 🧵
What does this say about the future of safety guards for LLMs? Well, if the history of adversarial ML is any prediction, this could be a hard problem to solve: we have been trying to fix adversarial attacks in computer vision for almost 10 years, with little real success. [6/n]
_Of course_ best paper awards are going to be a bit arbitrary. But these papers, led again often _by junior PhD students_, garnered enough excitement from the right combination of reviewers, ACs, award committee members to want to highlight them to the community. 8/N
I'll be talking about this paper at the #ICML2022 UpML workshop today, from *2:00-2:20* in Ballroom II East. Let me give a quick rundown of the paper/talk.
Alternative talk title: "How I learned to stop worrying and love/generalize TENT"
1/N
How do you find the "right" loss for test-time adaptation to distribution shifts? Turns out we can use convex conjugates!
New paper 📢 with @Eric_jie_thu, Aditi Raghunathan, and @zicokolter.
Test-Time Adaptation via Conjugate Pseudo-labels
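As background, a minimal sketch of the TENT-style entropy baseline that the paper generalizes (my toy code, not the paper's; in TENT the optimizer typically updates only the batch-norm affine parameters, and the paper derives the "right" such loss from the convex conjugate of the training loss):

import torch

def tent_step(model, x, optimizer):
    # Entropy of the model's own predictions serves as the adaptation loss.
    probs = model(x).softmax(dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return entropy.item()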
I will accept that the original critiques are probably in good faith, but I think they are heavily misguided. Claims that this is about the larger process fall flat ... the reviewers, ACs, SACs, award committees, whoever, they are not the ones with their names on the paper. 6/N
All nuance and discussion, all the important aspects of the paper, any mistakes in the critiques, this is all lost. And even if authors engage, Twitter defaults to trusting "the independent experts" more. It is an impossible situation for them. 7/N
Really excited to release our latest work on large-scale DEQs. We show that implicit layer DEQ models can be scaled up to large-scale vision tasks, by solving for equilibria simultaneously at multiple scales.
Have a look at MDEQ, which solves for synchronized equilibria of *multiple* feature streams and achieves SOTA-level results on megapixel vision tasks! 🤩 New work w/ @zicokolter & Vladlen Koltun.
Code:
Talk:
@shortstein I would strongly disagree with the notion that such statements are invitations to “discuss politics.” Lots of ML work touches on societal and ethical issues, at all stages of the research. If you don’t believe this applies to your work, we provide a response to indicate this.
In more detail: the most compelling use cases for a lot of these models are situations where "creation" of some content is hard / time-consuming, but where I can easily verify that what a model created is what I want. 2/N
Had a blast giving our #NeurIPS tutorial with @aleks_madry on Adversarial Robustness (one last plug for the website ). But, secretly sad to not get more comments on our amazing logo. :-) (someone probably has made this joke a long time ago...)
New paper with @roderickmelrose and @_vaishnavh on provably safe exploration in the PAC-MDP setting. Incidentally, I have never been happier about the first sentence in a paper's introduction.
I'm really excited to release our work on a provably safe and optimal reinforcement learning method: Analogous Safe-state Exploration (with @_vaishnavh and @zicokolter). Paper: Code:
But there are _tons_ of situations where it's hard for me to produce a quick answer, but once given an answer I can easily tell if it's correct. As the silly thread title suggests, this is, after all, the essence of P vs NP. 8/N
First off, my statements are independent of actual possible issues with the papers. I think one of the criticisms likely has some merit but the authors would claim it is missing the main point of the work. The other seems baseless. That's all I'll say on that. 2/N
New work with @priyald17 and @david_rolnick on efficient methods for training networks that integrate hard constraints and differentiable optimization! To appear at #ICLR2021.
Deep learning methods often struggle to satisfy hard constraints, limiting their practical use in domains such as power systems.
We tackle this challenge in our new #ICLR2021 paper on DL for approximate optimization:
w/ @david_rolnick & @zicokolter 1/
I know that we as a field are concerned about rigor of papers, reviewing quality, and the way in which conference acceptance (let alone paper awards) seems largely a random lottery. 3/N
Starting next week! We have posted a preview of the first two lectures:
Lecture 1 - Introduction and Logistics:
Lecture 2 - ML Refresher / Softmax Regression:
Enroll at
Please don't take this to mean I think LLMs are solving NP-complete problems. That's obviously not the case. But there is a similar quality in what kind of use cases we _should_ be looking to LLMs for: problems where we can "quickly" tell if the answer is right. 9/N
🎞️ Lecture 3 🎞️ (in two parts) posted for Deep Learning Systems, with @tqchen. Learn how to "manually" compute backprop for neural networks ... and then never do it again!
(Part 1)
(Part 2)
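As a taste, a toy version (my sketch, not the course code) of manual backprop for a two-layer ReLU network with squared loss:

import numpy as np

def mlp_grads(X, y, W1, W2):
    Z1 = X @ W1                         # forward pass
    A1 = np.maximum(Z1, 0)              # ReLU
    yhat = A1 @ W2
    G2 = 2 * (yhat - y) / X.shape[0]    # d(mean sq. loss)/d(yhat)
    dW2 = A1.T @ G2
    G1 = (G2 @ W2.T) * (Z1 > 0)         # backprop through ReLU
    dW1 = X.T @ G1
    return dW1, dW2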
People arguing over whether these models are useful or not seem to have different implicit settings in mind about the value of these variables. 13/N, N=13
Finally, the fact that our attacks transfer to closed models also suggests to us that being closed-source is not necessarily a substantial impediment to malicious actors attacking your system. [8/n]
Of course, many people have already demonstrated that "jailbreaks" of common chatbots are possible, but they are time-consuming and rely on human ingenuity. Adversarial attacks provide an entirely _automated_ way of creating essentially a limitless supply of jailbreaks. [5/n]
Another new DEQ-related paper this week! @ezra_winston develops a framework for equilibrium models with unique fixed points and convergent solvers, based upon monotone operator theory.
Paper:
Code:
Talk:
I really hope you were still able to enjoy, and take great pride in, your accomplishments. Congratulations again on the award, and I look forward to seeing more of this work in the future!
It has been extremely stressful not to be able to defend our work on Twitter, just to avoid an unproductive discourse. We wanted to enjoy ICML but were instead stressed out all week glued to Twitter as our integrity was called into question. 11/N
The attack is pretty simple, and involves combining three elements that have appeared in other forms in the literature: 1) making the model answer affirmatively to questions, 2) combining gradient-based and greedy search, and 3) attacking multiple prompts simultaneously. [3/n]
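For the curious, a rough sketch (my illustration, not our released code) of the gradient-guided candidate selection behind element (2); the HuggingFace-style model and the slice bookkeeping here are assumptions:

import torch

def gcg_candidates(model, embed_weights, input_ids, suffix_slice, target_slice, k=256):
    # One-hot encode the tokens so we can differentiate w.r.t. the suffix.
    one_hot = torch.nn.functional.one_hot(
        input_ids, num_classes=embed_weights.shape[0]).float()
    one_hot.requires_grad_(True)
    embeds = one_hot @ embed_weights
    logits = model(inputs_embeds=embeds.unsqueeze(0)).logits[0]
    # Element (1): the loss pushes the model toward an affirmative target string.
    loss = torch.nn.functional.cross_entropy(
        logits[target_slice.start - 1:target_slice.stop - 1],
        input_ids[target_slice])
    loss.backward()
    # Element (2): per suffix position, the k token swaps whose gradient most
    # decreases the loss; these candidates are then evaluated greedily.
    return (-one_hot.grad[suffix_slice]).topk(k, dim=1).indices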
Current backdoor poisoning attacks aren't just vulnerable to attackers with a "secret trigger". They can easily be broken (creating a new trigger that is just as effective) given access to the classifier. New paper with @Eric_jie_thu and @agsidd10.
Is the backdoor secret? Check out our new work on "breaking" poisoned classifiers, where we use neat ideas in adversarial robustness to analyze backdoored classifiers. Joint work with @agsidd10 & @zicokolter.
Paper:
Code:
Research scientist and research engineer positions available at the Bosch Center for AI in Pittsburgh:
Research Scientist:
Research Engineer:
I'm at #icml2018 and #ijcai2018, if you also happen to be here and want to know more.
I'm an AC for NeurIPS, and with ~18 papers, I'd need to desk reject around 3. I'd happily take the three additional weeks we save and offer to be a normal reviewer for 9 papers, if it means we can avoid desk rejects, providing feedback for those with the least experience. n/n
At least, we hope that making these attacks evident increases awareness of the brittleness and risks of LLMs, especially in circumstances where they are deployed "in the wild" without a human in the loop. We have an expanded ethics and disclosure statement in the paper. [7/n]
Excited to have received a best paper honorable mention for our paper "SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver" (with @_powei, Bryan Wilder, and @zicokolter) at #ICML2019!
If you'd like to hear more, I'll be talking about this at the New Frontiers in Adversarial Machine Learning workshop @AdvMLICML23, this Friday July 28th, from 3:00-3:30pm Hawaii time (in Ballroom A). [9/n]
We began the work by attacking open source LLMs, which works very well (including on recent models, like Llama-2). But as a test we copied the attacks into public closed-source chatbots and found they (sometimes) worked there as well.🤯 [4/n]
Update: Transcribing the latest lecture, @OpenAI whisper correctly capitalized "Z i" versus "z i" after I verbally said "capital Z" nearby 🤯
After adding a flag to fix the alignment bug, these captions are objectively _better_ than the paid ones...
When my main annoyance is that it can't divine, somehow, from audio alone, that I obviously mean (capital) Z i versus (lowercase) z i, things are pretty good. 6/N
New work with @_vaishnavh, showing that proving generalization in deep learning may be even trickier than thought. Many existing bounds can _increase_ with sample size, and we show a simple DL-like setting where any uniform convergence proof fails.
At the heart of most deep learning generalization bounds (VC, Rademacher, PAC-Bayes) is uniform convergence (u.c.). We argue why u.c. may be unable to provide a complete explanation of generalization, even if we take into account the implicit bias of SGD.
Code generation is an interesting middle ground: I probably don't immediately know that a block of code will work, but if e.g., I ask ChatGPT to generate some random matplotlib options, I can quickly check in a notebook to see if it's right. 10/N
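(A hypothetical example of what I mean; the options below are tedious to recall but trivial to eyeball in a notebook.)

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(range(10), [i ** 2 for i in range(10)], marker="o", linestyle="--")
ax.set(xlabel="step", ylabel="loss", title="quick sanity check")
ax.spines[["top", "right"]].set_visible(False)  # matplotlib >= 3.4
plt.show()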
Tomorrow, 12/11, I will be giving a talk at 3:15pm at the @Bosch_AI booth at #NeurIPS. I will be covering some of the ongoing research collaborations between CMU and BCAI Pittsburgh. Come and find out about our work!
Ill-Advised Rant: Despite working in AI for a decade, I find the introduction chapter in the AI textbook to be nearly alien language.
(I hope this is not seen as punching down as this is one of the most popular CS texts in the world) /1
I can't easily create a cool graphic for some complicated text description; but I can really quickly verify that a graphic is suitable for what I'm looking for. 4/N
It's time-consuming to write out a few paragraphs of text elaborating on a couple of bullet points, but I can quickly verify that the text does indeed capture the content I'm looking to describe. 5/N
New work with @andyzou_jiaming on analyzing internal representations of LLMs, to find phenomena that correlate quite well with notions like truthfulness. All sorts of cool applications.
LLMs can hallucinate and lie. They can be jailbroken by weird suffixes. They memorize training data and exhibit biases.
🧠 We shed light on all of these phenomena with a new approach to AI transparency. 🧵
Website:
Paper:
Really excited about this work with @_christinabaek, @yidingjiang, and Aditi on using an empirical phenomenon we found (agreement on the line) to better estimate OOD performance of a collection of classifiers.
Estimating out-of-distribution (OOD) performance is hard because labeled data is expensive. Can we predict OOD performance w/ only _unlabeled data_? In our work (), we show this can be done using models’ agreement.
w/ @yidingjiang, Aditi R., @zicokolter
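A minimal sketch of the key measurement (hypothetical helper of my own): agreement between two classifiers requires no labels, so it can be computed directly on unlabeled OOD data.

import torch

def agreement(model_a, model_b, loader):
    agree, total = 0, 0
    with torch.no_grad():
        for x, _ in loader:  # labels are ignored: agreement is label-free
            agree += (model_a(x).argmax(dim=1) == model_b(x).argmax(dim=1)).sum().item()
            total += x.shape[0]
    return agree / total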
New paper on using randomized smoothing to build a provable defense against L2 adversarial attacks. The method outperforms previous bounds _and_ scales to ImageNet (well beyond the scale of existing provable methods). Work with @deepcohen and Elan Rosenfeld.
1/ I'm excited to share our work on randomized smoothing, a PROVABLE adversarial defense in L2 norm which works on ImageNet! We achieve a *provable* top-1 accuracy of 49% in the face of adversarial perturbations with L2 norm less than 0.5 (=127/255).
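For a flavor of the method, a minimal sketch of the smoothed classifier's prediction (my illustration; the paper adds the statistical test that turns the vote into a certified L2 radius):

import torch

def smoothed_predict(f, x, sigma=0.5, n=1000):
    # Classify many Gaussian-noised copies of x and take the majority vote.
    noisy = x.unsqueeze(0) + sigma * torch.randn(n, *x.shape)
    counts = torch.bincount(f(noisy).argmax(dim=1))
    return counts.argmax()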
We have a new post up on our blog, written by @_vaishnavh: . It highlights our work on the failures of uniform convergence in explaining generalization in deep learning. As with all our posts, you can download the post, with code, as a Jupyter notebook.
Excited to share our new blog post **w/ code** (downloadable as a Jupyter notebook) highlighting why current approaches to deriving generalization bounds in deep learning may be severely limited.
(🔥🔥 Viral paper reveal + tenure announcement in a single tweet 🔥🔥 ... if this doesn't finally get me to 1K likes I'm clearly not cut out for social media). 2/