Emtiyaz Khan Profile
Emtiyaz Khan

@EmtiyazKhan

10,738
Followers
234
Following
382
Media
5,577
Statuses

Team leader at @RIKEN_AIP_EN. Opinions my own. Follow me at

Tokyo-to, Japan
Joined November 2012
Pinned Tweet
@EmtiyazKhan
Emtiyaz Khan
3 years
Our new paper on "The Bayesian Learning Rule" is now on arXiv, where we provide a common learning-principle behind a variety of learning algorithms (optimization, deep learning, and graphical models). Guess what, the principle is Bayesian. A very long🧵
Tweet media one
29
389
2K
@EmtiyazKhan
Emtiyaz Khan
3 years
Tomorrow, I will give a tutorial on "ML from a Bayesian Perspective" at ACML 2021. I also wrote a summary for the slides, explaining why a Bayesian perspective is essential for machine learners Slides:
Tweet media one
9
231
1K
@EmtiyazKhan
Emtiyaz Khan
3 years
I am really frustrated with ML papers omitting previous works. It happens so frequently, that it has started to take a toll on my mental health. Today a student pointed me to a paper that is published in ICML and is missing all references to the last 5 yrs of our work. 1/7
31
63
684
@EmtiyazKhan
Emtiyaz Khan
2 years
I am pleased to be promoted to a "permanent PI" position at RIKEN (equivalent to a "tenured" full professor). Thanks to all my collaborators and mentors for their help, and all the funding agencies that supported our work. My position starts next month! 1/
79
7
620
@EmtiyazKhan
Emtiyaz Khan
2 years
Our postdoc just submitted a paper that took 2 yrs of hard work. It solves a problem I have wanted to solve for the last 14 yrs or so, since my PhD! So happy. Now it's time to prepare for a 3,5,5,6 review score and why this paper is not good enough for this prestigious conference. 😅
11
8
537
@EmtiyazKhan
Emtiyaz Khan
4 years
I have noticed that I lose followers whenever I tweet about diverse issues, so I thought to clearly state👇🏾 I care immensely about - diversity, inclusion, justice, equality - science (AI/ML) for a better society - fundamental research in AI/ML & I am here to talk about all of👆🏾
17
26
507
@EmtiyazKhan
Emtiyaz Khan
3 years
I am pleased to announce that our proposal on the Bayes-duality project has received (approx) $2.76 million through JST-ANR's CREST, to develop AI systems that learn adaptively, robustly, and continuously, like humans. Follow @BayesDuality 1/6
Tweet media one
34
47
489
@EmtiyazKhan
Emtiyaz Khan
3 years
In about 10hrs, I will face an interview for my largest grant proposal. Going to push for Bayes + lifelong learning. Wish me luck!
22
5
471
@EmtiyazKhan
Emtiyaz Khan
3 years
Generalization with one shot learning.
Tweet media one
Tweet media two
Tweet media three
5
31
412
@EmtiyazKhan
Emtiyaz Khan
5 years
Happy to announce that I will be giving a NeurIPS tutorial on “Deep learning with Bayesian principles”. I feel very fortunate to have this opportunity and thank the organizers for their efforts.
@DaniCMBelg
Danielle Belgrave
5 years
#NeurIPS2019 tutorials are now out! @aliceoh and myself wrote a blog on how we went about selecting this year's tutorials
1
46
181
7
31
308
@EmtiyazKhan
Emtiyaz Khan
3 years
If you are feeling down, maybe due to rejections, maybe burnout, or maybe just gloomy weather... and need someone to share it with, I am happy to chat. I might not be available right away but I will try my best to make time for you. My email is here
9
19
300
@EmtiyazKhan
Emtiyaz Khan
4 years
If you sort your dataset according to prediction error *and predictive variance*, you will surely find interesting things. We visualized high-variance examples, called memorable examples, and they sure are very interesting to look at.
Tweet media one
@karpathy
Andrej Karpathy
4 years
When you sort your dataset descending by loss you are guaranteed to find something unexpected, strange and helpful.
30
227
2K
3
50
297
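The two sort orders above (by error vs. by predictive variance) can be sketched with a toy ensemble. Everything here is illustrative and not from the paper: ensemble disagreement merely stands in for predictive variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression set: 100 examples with heteroscedastic noise, and a small
# "ensemble" of 5 predictors whose spread stands in for predictive variance.
y_true = rng.normal(size=100)
noise_scale = rng.uniform(0.1, 2.0, size=100)       # per-example difficulty
preds = y_true + rng.normal(scale=noise_scale, size=(5, 100))

error = (preds.mean(axis=0) - y_true) ** 2          # prediction error
variance = preds.var(axis=0)                        # predictive variance

# Two different orderings of the dataset: by error alone (the quoted tip)
# and by predictive variance (the "memorable examples" of the tweet).
by_error = np.argsort(error)[::-1]
by_variance = np.argsort(variance)[::-1]
print(by_error[:5], by_variance[:5])
```

The two rankings generally disagree: a confidently wrong example tops the error list, while an example the ensemble cannot agree on tops the variance list.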
@EmtiyazKhan
Emtiyaz Khan
2 years
I am okay with theory. I am okay with empirical efforts. But I am not okay with magic. I would like to understand the magic, with the help of theory and empirical efforts. A lot of deep learning right now is like magic (to me).
7
15
271
@EmtiyazKhan
Emtiyaz Khan
3 years
Happy to introduce , a portal to easily connect mentors and mentees. Designed and created by @hen_str , @mpd37 , Adam White, Olga Isupova, (and me). We are testing it out at #NeurIPS2020 . Sign up. Help us spread the word! A 🧵below on how to use. 1/8
Tweet media one
4
76
270
@EmtiyazKhan
Emtiyaz Khan
5 years
New work on scaling up VI to ImageNet. With Kazuki Osawa and @rioyokota from TokyoTech, Siddharth Swaroop and Rich Turner @CambridgeMLG , and interns @sponde25 and Runa E. We show that VI can match Adam's performance in about the same number of epochs. 1/7
Tweet media one
Tweet media two
3
75
256
@EmtiyazKhan
Emtiyaz Khan
2 years
I made a poster summarizing our group’s research done in the last 2 years (2020-2021). A higher-resolution version is at Thanks to many of our past and current members, and also to the funding agencies.
Tweet media one
3
36
216
@EmtiyazKhan
Emtiyaz Khan
2 years
New work with @tmoellenhoff showing that SAM's max-loss is the *best* convex upper bound to Bayes' expected-loss. This is then used to derive an Adam-style extension of SAM to estimate uncertainty for free. The work took 2 yrs, and it's my favorite! 1/13
Tweet media one
3
33
216
@EmtiyazKhan
Emtiyaz Khan
2 years
I am trying to find good candidates for ACs and reviewers for NeurIPS 2022. Trying to reach deserving folks who are somehow never invited, e.g., those who have been NeurIPS reviewers for years but never an AC, or those reviewing for other ML conf. but never for NeurIPS, etc.etc.
15
30
210
@EmtiyazKhan
Emtiyaz Khan
2 years
I have a feeling this is soon going to be a shitty prior.
Tweet media one
12
10
209
@EmtiyazKhan
Emtiyaz Khan
3 years
This is some wonderful advice from @jbhuang0604 . I add 10 more tips below. Some are perhaps redundant, but they are worth repeating. They are also highly biased and may not work for everybody or in every situation, but I do hope they help some. 1/11
@jbhuang0604
Jia-Bin Huang
3 years
How to make steady progress in my research? I worked so damn hard but "IT JUST DOESN'T WORK!"😤 How can I unblock myself quickly and make good progress toward the goals? Below I compiled a list of tips that I found useful. 👇
17
558
2K
2
48
205
@EmtiyazKhan
Emtiyaz Khan
2 years
Exciting news! ACML 2022 will be held in Hyderabad, India (mid Dec, most likely hybrid). Few ML-related events happen in India, so we worked for the last 3 yrs to bring ACML to India, and just had the kickoff meeting! Looking for suggestions and, most importantly, sponsors.
Tweet media one
5
30
202
@EmtiyazKhan
Emtiyaz Khan
4 years
Our FROMP paper is finally accepted for an oral presentation (around 1% of all submissions) at #NeurIPS2020 (). Congratulations to @siddharthswar , Pingbo Pan, Alex Immer, Runa Eschenhagen, and Rich Turner!
@siddharthswar
Siddharth Swaroop
4 years
Recent work on functional regularisation for continual learning with @EmtiyazKhan and others. Since network outputs depend on weights in a complex way, function-regularisation may be better than previous weight-regularisation. 1/7
Tweet media one
7
55
246
4
28
198
@EmtiyazKhan
Emtiyaz Khan
3 years
On this day, 5 yrs ago, I left Switzerland and arrived in Tokyo. (This is me in front of my apartment building in Lausanne, ready to go to the airport)
Tweet media one
5
0
197
@EmtiyazKhan
Emtiyaz Khan
3 years
So a NeurIPS AC under me (as SAC) disappeared for the whole post-rebuttal period, didn't start any discussions, didn't respond to any emails. Upon the chairs' request, I spent my whole weekend writing the meta-review (instead of being with my family) and when I went to enter... 1/2
6
9
197
@EmtiyazKhan
Emtiyaz Khan
2 years
A bitter lesson I have learned about academia is that when you negotiate with the institutions, you *must* not take anything they offer for granted. "Get everything in writing", a wise academic told me a while ago, and I wish I had followed that practice.
9
12
195
@EmtiyazKhan
Emtiyaz Khan
3 years
Academia can be brutal in acknowledging contributions. What I absolutely dislike is that we shame those who seek attribution, much more than those who steal. Schmidhuber’s behaviour might seem strange but (some of) it is a by product of the toxicity of our community.
@pcastr
Pablo Samuel Castro
3 years
No one: Literally no one: Schmidhuber: all your ideas are belong to me
8
3
268
7
7
185
@EmtiyazKhan
Emtiyaz Khan
5 years
We are hiring in our approx-Bayes team at RIKEN-AIP. Post-docs (2 positions), RA (4 positions), interns (15 positions). Job posting: Team page: Email: jobs-abi-riken-aip@googlegroups.com Help me spread the word. Retweets appreciated
8
91
183
@EmtiyazKhan
Emtiyaz Khan
3 years
A delayed announcement: I have joined @OISTedu as an *external* professor. I will spend some time every year in Okinawa, and also teach a course there. I will also be able to accept PhD students (and rotation students), affiliated with OIST but will work at RIKEN in Tokyo. 1/3
18
24
182
@EmtiyazKhan
Emtiyaz Khan
5 years
New work with my 3 amazing interns @RIKEN_AIP_EN . We show that certain Gaussian posterior approximations for neural nets are equivalent to GP posteriors. This enables us to turn neural nets into GPs *during* training. Full text here: Details below 1/9
Tweet media one
1
41
181
@EmtiyazKhan
Emtiyaz Khan
6 years
I will talk about our new work on "Bayesian deep learning using weight-perturbation in Adam" at #icml2018 in "Deep Learning (Bayesian) 2" session at 4:50pm in room A4. Paper here Slides here Code here 1/6
Tweet media one
2
42
177
@EmtiyazKhan
Emtiyaz Khan
2 years
A new (pre-PhD) position (suitable for bachelor's/master's students). If interested in lifelong deep learning, Bayes, or optimization methods, please consider applying. A good position if you want to do a PhD later (with me) at OIST. RTs appreciated.
5
76
174
@EmtiyazKhan
Emtiyaz Khan
2 years
After my masters I worked for 2 yrs to save money, and then, after paying for grad applications + flight tickets + first month's rent in Vancouver, I immediately went into debt. My first tuition fee went to a credit card, leaving no money to buy proper clothes/shoes for my first winter.
8
7
173
@EmtiyazKhan
Emtiyaz Khan
3 years
In any case, if you read the whole thread, I request you to please do a proper literature search when writing papers. We are in such a hurry to write papers, and to put them out, that we end up hurting others in our own community. Please do not do this. 7/end
3
9
171
@EmtiyazKhan
Emtiyaz Khan
4 years
A concise and beautiful summary of my tutorial! Thanks @RobertTLange
@RobertTLange
Robert Lange
4 years
Great #NeurIPS2019 tutorial kick-off by @EmtiyazKhan ! Showing the unifying Bayesian Principle bridging Human & Deep Learning. Variational Online Gauss-Newton (VOGN; Osawa et al., 19‘) = A Bayesian Love Story ❤️
Tweet media one
Tweet media two
7
88
340
1
20
166
@EmtiyazKhan
Emtiyaz Khan
2 years
C: in college, I was lost and felt like dropping out. There is a 3rd type who didn’t get any help when they really needed it, but still survived. I am the 3rd type.
@anthonyocampo
anthony christian ocampo 🇵🇭🏳️‍🌈
2 years
the 2 types of professors: A: my parents are professors. B: in college, when i was lost and felt like dropping out, there was this one professor who, despite my lack of confidence, must've caught a glimpse of something i couldn't yet see, and went above and beyond to mentor me.
75
447
7K
3
1
168
@EmtiyazKhan
Emtiyaz Khan
3 years
Here is an "old is gold" tweet (copyright @gabrielpeyre ?). Csato and Opper 2002 show a "parameterization" (q and R) for GP inference with *general* likelihoods. This is perhaps the first reference making this important point. Why is this interesting? A thread below 1/11
Tweet media one
2
23
165
@EmtiyazKhan
Emtiyaz Khan
4 years
Excited to announce that we have a new postdoc/RA position with a focus on "uncertainty in deep learning and applications to life-long learning". Help me spread the word. RT appreciated. (RA position is for candidates without a PhD)
0
92
159
@EmtiyazKhan
Emtiyaz Khan
4 years
Explains the difference between variational inference and expectation propagation. Enjoy.
@ari_seff
Ari Seff
4 years
KL divergence asymmetry
19
532
3K
2
29
162
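For readers wondering what the quoted animation illustrates: minimizing KL(q||p) (the VI objective) is mode-seeking, while minimizing KL(p||q) (the EP-flavored objective) is mass-covering. A minimal numerical sketch on a two-mode target; all numbers and the grid-search fit are illustrative.

```python
import numpy as np

# Target p: equal mixture of N(-3, 1) and N(3, 1); approximation q: one Gaussian.
x = np.linspace(-12, 12, 4001)
dx = x[1] - x[0]

def gauss(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

log_p = np.log(0.5 * gauss(x, -3, 1) + 0.5 * gauss(x, 3, 1))

def reverse_kl(m, s):
    # KL(q || p), the VI objective: mode-seeking
    q = gauss(x, m, s)
    return np.sum(q * (np.log(q + 1e-300) - log_p)) * dx

# Grid-search the reverse-KL fit
_, m_vi, s_vi = min((reverse_kl(m, s), m, s)
                    for m in np.linspace(-5, 5, 41)
                    for s in np.linspace(0.5, 3, 26))

# Forward KL(p || q) is minimized by moment matching, which has a closed
# form here: mixture mean 0, mixture variance 1 + 3^2.
m_ep, s_ep = 0.0, np.sqrt(10)

print(f"reverse KL (VI): m={m_vi:.2f}, s={s_vi:.2f}")   # hugs one mode
print(f"forward KL (EP): m={m_ep:.2f}, s={s_ep:.2f}")   # covers both modes
```

The reverse-KL fit locks onto one mode with a narrow standard deviation, while the forward-KL fit straddles both modes with a much wider one.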
@EmtiyazKhan
Emtiyaz Khan
3 years
Since the NeurIPS review submission deadline is just around the corner, I thought to write a thread. Someone asked me for tips for "fast reviewing", and the thread below contains some of those. 1/12
3
23
160
@EmtiyazKhan
Emtiyaz Khan
10 months
At #ICML2023 we will have a workshop on Duality Principles. "Duality is a principle, it gives two different views of the same object" Duality is not a niche topic, it's for everybody. Hope to see you there! @tmoellenhoff @ZeldaMariet @mblondel_ml
Tweet media one
1
32
157
@EmtiyazKhan
Emtiyaz Khan
4 years
I will give a tutorial on "DL with Bayes" at SMILES 2020 at CET 9am (3pm JST). I will also discuss "robustness of Bayes vs optimization" and new work on "DNN2GP and life-long learning" Slides: Youtube:
@SMILESkoltech
SMILES - Summer School of Machine Learning at SK
4 years
. @EmtiyazKhan is a team leader at the @RIKEN_JP / @riken_en @RIKEN_AIP_EN in Tokyo where he leads the Approximate Bayesian Inference (ABI) Team and a visiting professor at the EE department in @TUAT_all Apply now:
Tweet media one
0
10
23
1
40
155
@EmtiyazKhan
Emtiyaz Khan
3 years
My sister in India tested +ve for covid. She is doing ok but we are all worried for my mother, who lives with them and has a preexisting lung condition. I too have been sick (non covid related) for the past 4 months. Just surviving. This pandemic is not over yet for us.
17
0
152
@EmtiyazKhan
Emtiyaz Khan
2 years
My usual running track is very inviting now.
Tweet media one
Tweet media two
3
1
152
@EmtiyazKhan
Emtiyaz Khan
4 years
“those who achieve at lower ranked unis are the real stars because they've succeeded despite fewer resources.” Do you hear that? 🙃
2
21
151
@EmtiyazKhan
Emtiyaz Khan
3 years
As an SAC for NeurIPS this year, I hope that bad reviews will be flagged by the authors and ACs. I personally have decided to put in *as much time as required* to go through such comments by authors. I have >130 papers under me, but this must be done.
2
5
144
@EmtiyazKhan
Emtiyaz Khan
4 years
Excited for the tutorial tomorrow (Dec 9) at 9am at #NeurIPS2019 If you are at the conference and would like to chat, please send me an email (also, if you are interested in a post-doc position in our group in Tokyo).
6
22
142
@EmtiyazKhan
Emtiyaz Khan
4 years
List of accepted workshops at @icmlconf 2020. An (unofficial) html version is at my webpage This will eventually be available in the ICML webpage but this is taking time because virtualization of ICML is a higher priority atm. 1/2
Tweet media one
1
43
141
@EmtiyazKhan
Emtiyaz Khan
3 years
I did my first (very short) hike today, in a very long time (2yrs). I feel fortunate to be finally healthy enough to go out in the nature again. This is Mount Fuji from Mount Takao right after the sunset.
Tweet media one
2
2
138
@EmtiyazKhan
Emtiyaz Khan
4 years
Please apply to Summer School of Machine Learning at Skoltech (SMILES) that hopes to bring together the ML community in CIS, Central Asia, and overseas. I will also be giving a talk on DL with Bayes there. Application open until 2nd of August 2020!
0
44
133
@EmtiyazKhan
Emtiyaz Khan
2 years
Today was my first day as a tenured PI and marks exactly 6 yrs since I started my position in Tokyo. I was happy because it didn’t feel any different, and I still feel as excited as I was 6 yrs ago.
@EmtiyazKhan
Emtiyaz Khan
3 years
On this day, 5 yrs ago, I left Switzerland and arrived in Tokyo. (This is me in front of my apartment building in Lausanne, ready to go to the airport)
Tweet media one
5
0
197
5
0
136
@EmtiyazKhan
Emtiyaz Khan
3 years
I had a very good experience as a mentor for @iclr_conf 's reviewer mentorship. Out of 10 reviewers assigned to me, I accepted 8. 3/10 wrote very good reviews, and the rest required just a little adjustment. We need more such initiatives (thanks to @mpd37 and colleagues)!
5
4
134
@EmtiyazKhan
Emtiyaz Khan
3 years
Nice slides on uncertainty in DL! See slide 62 onwards for a list of references. But be careful! This is a much more difficult topic than it appears at first (uncertainty is a vague term for everything the model doesn’t know, and the same is true for “out of distribution”).
@balajiln
Balaji Lakshminarayanan
3 years
Really enjoyed giving a talk on "Introduction to Uncertainty in Deep Learning" at @CIFAR_News #DLRL summer school today🙂Lots of great questions! Link to slides: All models are wrong, but ̶s̶o̶m̶e̶ ̶ *models that know when they are wrong*, are useful😋
Tweet media one
8
106
573
0
27
133
@EmtiyazKhan
Emtiyaz Khan
2 years
Not sure who needs to hear this, but out of around 100 papers in my lot as SAC for #NeurIPS2022 only 8 have a score >= 6, and around 45 have >= 5. My own submission has a rating between 5 and 6.
7
8
130
@EmtiyazKhan
Emtiyaz Khan
3 years
K-priors is now accepted at NeurIPS! We have 1 more accepted paper and also a rejected one. Happy overall but have decided to submit less to conferences, from now on. And that decision, already makes me feel relaxed, not to have these back to back deadlines. Highly recommended!
@EmtiyazKhan
Emtiyaz Khan
3 years
In about 12 hrs, my talk on "K-priors: a general principle of adaptation" will be streamed at the workshop on "theory of continual learning". Video: Slides: Paper: , joint work with @siddharthswar A🧵👇🏿1/18
Tweet media one
3
24
117
6
7
129
@EmtiyazKhan
Emtiyaz Khan
2 years
A prior is always a (kind of) posterior.
11
6
124
@EmtiyazKhan
Emtiyaz Khan
4 years
LM and the (related) Gauss-Newton are very useful concepts. We have used them previously to derive variants of Adam and also to convert neural nets to Gaussian processes.
@gabrielpeyre
Gabriel Peyré
4 years
The Levenberg-Marquardt is a standard method for non-linear least-squares, combining the best of gradient descent and Newton methods by replacing the Hessian by the Jacobian term only.
5
144
695
0
18
122
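To make the quoted description concrete, here is a minimal Levenberg–Marquardt iteration on a small nonlinear least-squares problem. The residuals, starting point, and fixed damping value are all illustrative; practical LM implementations adapt the damping per step.

```python
import numpy as np

# Nonlinear least squares 0.5 * ||r(x)||^2 with Himmelblau-style residuals
def residual(x):
    return np.array([x[0] ** 2 + x[1] - 11, x[0] + x[1] ** 2 - 7])

def jacobian(x):
    return np.array([[2 * x[0], 1.0], [1.0, 2 * x[1]]])

x, lam = np.array([2.5, 1.5]), 1e-2   # fixed damping; real LM adapts it
for _ in range(50):
    r, J = residual(x), jacobian(x)
    # Replace the Hessian by J^T J (Gauss-Newton) and add damping:
    # large lam behaves like gradient descent, small lam like Gauss-Newton.
    step = np.linalg.solve(J.T @ J + lam * np.eye(2), -J.T @ r)
    x = x + step
print(x)   # converges to the root (3, 2) from this starting point
```

At a root the gradient J^T r vanishes, so damping changes the path but not the fixed point.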
@EmtiyazKhan
Emtiyaz Khan
6 years
Today, I will give a talk on "uncertainty in deep learning" at "AI meets life science" workshop at Karolinska Institutet in Stockholm (). Slides at I am excited to chat with life science researchers about Bayesian deep learning.
2
39
121
@EmtiyazKhan
Emtiyaz Khan
2 years
Lagrange multipliers are an (under-appreciated) part of almost all ML methods. E.g., this thread shows the “temperature parameter” in Gibbs posteriors as a Lagrange parameter. Such parameters capture “sensitivity to perturbations”, making “duality” inherent to ML.
2
11
121
@EmtiyazKhan
Emtiyaz Khan
2 years
Listening to @vincefort who is telling us about the "importance of priors in Bayesian deep learning" at our reading group.
Tweet media one
2
8
119
@EmtiyazKhan
Emtiyaz Khan
7 months
We have two open positions in our group in Tokyo (start date is April 2024). Postdoc/research-scientist: Postdoc or tech-staff (pre-PhD): Help me spread the word!
0
57
120
@EmtiyazKhan
Emtiyaz Khan
2 years
Bayesian principles suggest adding the Shannon entropy to the expected loss. This allows you to sample around a minimizer more often than other places. A foundational principle.
@gabrielpeyre
Gabriel Peyré
2 years
Gibbs distributions are maximum entropy distributions. Used in conjunction with MCMC methods to generate samples close to minimisers of the functional.
5
133
704
1
10
119
@EmtiyazKhan
Emtiyaz Khan
6 years
Check out our new paper at ICLR on “variational message passing for deep structured models”. It’s a VAE-like amortized algorithm (we call it SAN) which also performs natural-gradient inference for the structured model. Poster Tue May 1, from 11-1.
Tweet media one
1
39
116
@EmtiyazKhan
Emtiyaz Khan
3 years
In about 12 hrs, my talk on "K-priors: a general principle of adaptation" will be streamed at the workshop on "theory of continual learning". Video: Slides: Paper: , joint work with @siddharthswar A🧵👇🏿1/18
Tweet media one
3
24
117
@EmtiyazKhan
Emtiyaz Khan
3 years
A few months ago I donated to @arxiv and I was very happy to receive this “hand written” postcard, with a very personal “correction” included which makes it very authentic. Please consider donating and wish them happy 30th birthday.
Tweet media one
2
2
115
@EmtiyazKhan
Emtiyaz Khan
5 years
An out-of-the-box idea by @hardmaru and team! For a Bayesian, this is like choosing an architecture such that the posterior distribution is uniform (contains no information at all)!!
@hardmaru
hardmaru
5 years
Weight Agnostic Neural Networks 🦎 Inspired by precocial species in biology, we set out to search for neural net architectures that can already (sort of) perform various tasks even when they use random weight values. Article: PDF:
55
671
2K
3
18
114
@EmtiyazKhan
Emtiyaz Khan
4 years
Our recent work on continual deep learning.
@siddharthswar
Siddharth Swaroop
4 years
Recent work on functional regularisation for continual learning with @EmtiyazKhan and others. Since network outputs depend on weights in a complex way, function-regularisation may be better than previous weight-regularisation. 1/7
Tweet media one
7
55
246
0
20
111
@EmtiyazKhan
Emtiyaz Khan
2 years
If you are looking for a position to work on deep-learning and related fields, including theory/practice of it, feel free to reach out. We are open to hiring 1-2 more people at postdoc or RA (pre-PhD) level.
2
32
110
@EmtiyazKhan
Emtiyaz Khan
4 years
Tutorial starts at 8:30 am (sorry not at 9am). Slides and a paper with some of the proofs are here. Slides might take a while to download (heavy!)
@EmtiyazKhan
Emtiyaz Khan
4 years
Excited for the tutorial tomorrow (Dec 9) at 9am at #NeurIPS2019 If you are at the conference and would to chat, please send me an email (also, if you are interested in a post-doc position in our group at Tokyo).
6
22
142
3
23
104
@EmtiyazKhan
Emtiyaz Khan
2 years
How does Bayes favor flat regions of the loss? I made a small animation where the original loss (shown in gray) has three minima. Bayes uses the expected loss (shown in red as a function of the mean m with fixed variance), where the less flat minima disappear as the variance is increased 1/
2
22
104
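The effect described above is easy to reproduce numerically: the Bayes objective E_{w~N(m,σ²)}[loss(w)], viewed as a function of m, lets a sharp minimum lose to a flat one once σ is large enough. A sketch with an illustrative two-minimum loss (not the one from the animation).

```python
import numpy as np

# Toy loss: a sharp global minimum near -2 and a flat, slightly higher one near +2
w = np.linspace(-6, 6, 1201)
loss = np.minimum(40 * (w + 2) ** 2, 0.5 * (w - 2) ** 2 + 0.5)

def expected_loss(m, sigma):
    # E_{w ~ N(m, sigma^2)}[loss(w)] by quadrature: the Bayes objective in m
    eps = np.linspace(-4 * sigma, 4 * sigma, 401)
    wt = np.exp(-0.5 * (eps / sigma) ** 2)
    return np.sum(wt / wt.sum() * np.interp(m + eps, w, loss))

ms = np.linspace(-5, 5, 201)
sharp = min(ms, key=lambda m: expected_loss(m, sigma=0.05))   # argmin, small sigma
flat = min(ms, key=lambda m: expected_loss(m, sigma=1.0))     # argmin, large sigma
print(sharp, flat)   # small sigma keeps the sharp minimum; large sigma favors the flat one
```

With σ = 0.05 the smoothing barely changes anything and the sharp minimum near -2 wins; with σ = 1 the sharp well averages to a large value and the flat minimum near +2 takes over.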
@EmtiyazKhan
Emtiyaz Khan
1 year
I haven't been very active on Twitter lately, but I thought of giving you a few updates from my life in the last few months. Here is a long thread with news about - our new baby - achievements of our group - new papers - open positions, etc. 1/11
10
2
102
@EmtiyazKhan
Emtiyaz Khan
5 years
This is really amazing. I always say that “missing values” in reality are not always a problem but they can also be features. Perhaps we learn to predict things better when we know observations will be mostly missing. Congratulations @hardmaru and coauthors!
@hardmaru
hardmaru
5 years
Learning to Predict Without Looking Ahead: World Models Without Forward Prediction Rather than hardcoding forward prediction, we try to get agents to *learn* that they need to predict the future. Check out our #NeurIPS2019 paper!
12
309
1K
2
22
105
@EmtiyazKhan
Emtiyaz Khan
2 years
Our 7-yr old daughter likes making short stories with illustrations, and she has made >200 over the past few years. I thought I would share the one she made today. It made my day and I hope you enjoy it too. So "Story time! Wake up!"
Tweet media one
3
5
103
@EmtiyazKhan
Emtiyaz Khan
2 years
DC duality is very closely related to message-passing algorithms, and very useful for Bayesian methods. The Bayesian learning rule (BLR) generalizes these kinds of procedures. I wrote an explainer in the thread below. 1/
@gabrielpeyre
Gabriel Peyré
2 years
Difference of convex (DC) programming are non-convex problem enjoying a nice duality theory and thus a simple optimization algorithm.
Tweet media one
5
101
521
1
12
100
@EmtiyazKhan
Emtiyaz Khan
2 years
Of the 100 or so papers in my lot as SAC at @NeurIPSConf , 34 were accepted. A few behind-the-scenes details in the thread below. You will be happy to know that nobody (me, chairs, ACs) ever bothered about the "acceptance rate". We all tried to push for acceptance. 1/
@liyzhen2
yingzhen
2 years
@ericjang11 My batch also has many >=5.5 papers and my SAC @EmtiyazKhan explicitly told me to accept as many as I can
2
0
13
5
8
101
@EmtiyazKhan
Emtiyaz Khan
5 years
I am in Europe and then UK giving several talks. If you are around and would like to meet, DM me. List of talks: Sep 9 TU Berlin Sep 12 EPFL Sep 16 Imperial College Sep 18 UEdinburgh Sep 19 Gatsby Unit Sep 20 DeepMind Sep 23 MSR Cambridge Sep 2? UCambridge
15
7
100
@EmtiyazKhan
Emtiyaz Khan
4 years
There is one by James Martens on natural gradients. Enjoy.
@MIT_CSAIL
MIT CSAIL
4 years
DeepMind just released a new set of free lecture videos w/ @ucl covering machine learning, AI, image recognition & more. Watch them all here: #ML #CV #CVPR #AI #DeepLearning #WednesdayWisdom
Tweet media one
5
242
591
0
16
96
@EmtiyazKhan
Emtiyaz Khan
3 years
This paper is really good. A very simple trick (sec 2.2) leading to model selection through linear regression (sec 3.3). These kinds of papers originally motivated me to write Bayes through simple “conjugate computations”, the main ingredient of my work.
@gabrielpeyre
Gabriel Peyré
3 years
Oldies but goldies: O Banerjee, L El Ghaoui, A d’Aspremont, Model selection through sparse maximum likelihood estimation for multivariate Gaussian, 2008. Introduces the now called Graphical Lasso to estimate a sparse graph by l^1 penalizing the maximum likelihood estimator.
2
43
225
1
12
95
@EmtiyazKhan
Emtiyaz Khan
2 years
Maximum-likelihood estimate you found the other day.
@OdedRechavi
Oded Rechavi
2 years
Work-life balance
Tweet media one
11
660
3K
2
2
96
@EmtiyazKhan
Emtiyaz Khan
4 years
Learning about the Bregman divergence changed my life. It is an extremely useful concept to learn after convex conjugates. Duality rocks!
@FrnkNlsn
Frank Nielsen
4 years
Common formula for statistical (dis)similarities between any two densities of a same exponential family (incl. Gaussian, Beta, Dirichlet): Implement those formula *easily* from legacy statistical library APIs. Slides: Report:
Tweet media one
1
20
101
2
7
96
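For readers new to the concept: the Bregman divergence of a convex φ is D_φ(p, q) = φ(p) − φ(q) − ⟨∇φ(q), p − q⟩, and familiar distances drop out of familiar choices of φ. A quick self-contained check (the vectors are arbitrary examples):

```python
import numpy as np

def bregman(phi, grad_phi, p, q):
    # D_phi(p, q) = phi(p) - phi(q) - <grad_phi(q), p - q>
    return phi(p) - phi(q) - np.dot(grad_phi(q), p - q)

# phi(x) = 0.5 * ||x||^2 gives half the squared Euclidean distance
sq = bregman(lambda x: 0.5 * np.dot(x, x), lambda x: x,
             np.array([1.0, 2.0]), np.array([0.0, 0.0]))   # = 2.5

# phi(x) = sum_i x_i log x_i (negative entropy) gives the KL divergence
p, q = np.array([0.7, 0.3]), np.array([0.5, 0.5])
kl = bregman(lambda x: np.sum(x * np.log(x)), lambda x: np.log(x) + 1, p, q)
print(sq, kl)   # kl equals sum_i p_i * log(p_i / q_i)
```

The exponential-family connection in the quoted tweet comes from exactly this construction, with φ taken to be the log-partition function (or its conjugate).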
@EmtiyazKhan
Emtiyaz Khan
5 years
I just heard that my intern, who is from Iran but lives in Switzerland, has had his visa denied. Reason? They are not sure if he will leave Canada after. He is one of the sweetest people I know. He deserves better. I am angry. I am upset. @timnitGebru @ulrichpaquet @NandoDF @AlanMackworth
This never ends. This year, so far, 15 out of 44 people to attend @black_in_ai workshop at @NeurIPSConf (which is still in Canada) have been denied visas. That's 33%. We had all this press last year, they were supposed to help us this year.
58
487
974
10
17
93
@EmtiyazKhan
Emtiyaz Khan
5 years
I will be giving a talk on Dec. 2 at 4pm in the Symposium on Advances in Approximate Bayesian Inference (AABI) 2018 in Montreal. Talk title "Fast yet simple natural gradient descent in variational inference". Slides []. Hope to see you there. 1/3
Tweet media one
3
14
97
@EmtiyazKhan
Emtiyaz Khan
3 years
Our team ApproxBayes has a new webpage. Check it out at
Tweet media one
2
19
95
@EmtiyazKhan
Emtiyaz Khan
4 years
Don’t know who needs this now but here it is.
1
5
95
@EmtiyazKhan
Emtiyaz Khan
2 years
I am starting to teach after a 3 yr break! My course at @OISTedu on Foundations of ML starts tomorrow, where I will teach: some main principles, a few basic methods, how they relate to each other, and why they work.
1
6
94
@EmtiyazKhan
Emtiyaz Khan
2 years
If I had known earlier, I could have done this before. They didn't direct me to the appointment page until I finished the documents. Emailed the embassy and got a usual response. F*** this s***. I don't miss physical conferences. They are for people with better passports.
8
7
92
@EmtiyazKhan
Emtiyaz Khan
2 years
Sunday Yokohama sunset. Mount Fuji at the right.
Tweet media one
4
1
90
@EmtiyazKhan
Emtiyaz Khan
4 years
I always write the title and abstract first. I do this right after we decide to start a project. I change it all the time and I keep it there so that the group stays focused on the story. For me it has worked great, so I recommend it.
@moinnadeem
Moin Nadeem
4 years
@zacharylipton Hi Professor Lipton! Undergrad here who really benefited from . Quick question: some advise writing the abstract last, since your story in the paper may change as you write it. Do you have any thoughts / advice? I noted you listed it first
3
11
47
4
7
93
@EmtiyazKhan
Emtiyaz Khan
3 years
Applicants, I hope your applications don’t go through such shitty process. You deserve much better.
@fadeladib
Fadel Adib
3 years
How to choose your letter writers? The best LoRs I've seen usually come from a faculty/research who publishes in highly selective venues If you are in CS, you can use to see what are considered the most selective CS venues (ignore rankings for now) 6/13
7
8
144
4
2
89
@EmtiyazKhan
Emtiyaz Khan
9 months
We had a blast yesterday in this duality workshop. I was unsure if a workshop full of math would still be interesting in the age of DL, but >50 people chose to attend the workshop on the last day of the conference instead of going to the beach! Gives me hope for the ML community.
Tweet media one
@EmtiyazKhan
Emtiyaz Khan
10 months
At #ICML2023 we will have a workshop on Duality Principles. "Duality is a principle, it gives two different views of the same object" Duality is not a niche topic, it's for everybody. Hope to see you there! @tmoellenhoff @ZeldaMariet @mblondel_ml
Tweet media one
1
32
157
0
2
92
@EmtiyazKhan
Emtiyaz Khan
4 years
Check out this new book by @NisheethVishnoi on convex optimization. I recommend reading chapters on duality and mirror descent. Mirror descent on Bayes covers plenty of existing algorithms. There are so many that I still haven’t finished my paper 😂 1/2
@NisheethVishnoi
Nisheeth Vishnoi
4 years
1/6 Excited to announce that a draft of my book Algorithms for Convex Optimization (to be published by Cambridge) is available for download: Intended audience includes undergraduates, graduates, and researchers from TCS, optimization, and machine learning
Tweet media one
13
323
1K
3
23
89
@EmtiyazKhan
Emtiyaz Khan
3 months
Got emergency surgery for appendicitis over the weekend. I knew appendices were useless, so I am happy to get rid of them. Recovering fast now; all the needles are out and I am getting discharged in a few hours (my ICML plans all vaporized; I sure won't write any appendices there).
22
0
92
@EmtiyazKhan
Emtiyaz Khan
4 years
I will be giving two talks this week on "Bayesian principles of learning machines", also highlighting new work on continual learning. [Nov 3] Waterloo AI [Nov 5] TU Darmstadt Thanks to @compthink @Jan_R_Peters for inviting me.
2
13
89
@EmtiyazKhan
Emtiyaz Khan
5 years
I have an unfortunate news. On Aug 25, 2019, my father passed away. I rarely share personal news here, but I will make an exception. Below is a thread about my father and me, and a few other things. 1/11
12
0
89
@EmtiyazKhan
Emtiyaz Khan
3 years
.... I find that the AC entered the meta-reviews just a day before the deadline! WTF? At least, send an email! Also, no accountability for being absent throughout the rebuttal period. WTF? There goes my weekend for nothing. This person is famous. I guess they can do whatever.
4
0
87
@EmtiyazKhan
Emtiyaz Khan
2 years
So after avoiding Covid for 2.5yrs, I now have it. First 3 days were the hardest but now feeling much better. Isolating and recovering fast. Nobody in the family is showing symptoms yet! 🤞🏾 Going to take it easy on work for a few weeks and focus on recovering.
29
0
89
@EmtiyazKhan
Emtiyaz Khan
5 years
When people ask me how to choose a PhD program, I tell them to make the choice of supervisor their top priority. Somebody who cares about your career more than their own. The next question is usually “how do I know?” This 👇🏽answers that question.
@krwedemeyer
Dr. Katie Wedemeyer-Strombel, PhD
5 years
YES. Absolutely. Don’t join a lab until you talk in-person, or on the phone, with current and former students. Ask, specifically, “if you could work with them again, would you?”.
3
25
107
3
14
87
@EmtiyazKhan
Emtiyaz Khan
4 years
Natural grad is a 1st-order approx of the Riemannian gradient. Just like NGD, it takes a form similar to RMSprop for Bayes with a Gauss approx, but the steps stay within the “pos def” constraint. In this (rare) case the exp map is easy. See this paper of ours
@FrnkNlsn
Frank Nielsen
4 years
Natural gradient uses *steepest descent* but may leave the manifold. The Riemannian gradient always stays on the manifold (but the exponential map is difficult to calculate). Natural gradient = approximation of the Riemannian gradient using a simple retraction.
Tweet media one
0
42
232
0
11
86
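On a quadratic loss, the RMSprop-like form mentioned above can be written in a few lines: the precision of the Gaussian posterior plays the role of RMSprop's scaling vector. The quadratic problem and step size below are illustrative; see the VON/Vadam papers for the general stochastic update.

```python
import numpy as np

# Natural-gradient VI for q = N(m, 1/s) on the loss l(w) = 0.5*a*(w - wstar)^2.
# For a quadratic, E_q[l'(w)] = a*(m - wstar) and E_q[l''(w)] = a exactly.
a, wstar = 4.0, 1.5
m, s = 0.0, 1.0        # initial mean and precision of q
rho = 0.1              # step size
for _ in range(200):
    g, h = a * (m - wstar), a
    s = (1 - rho) * s + rho * h    # precision: an EMA, like RMSprop's scaling
    m = m - rho * g / s            # mean: a preconditioned gradient step
print(m, 1 / s)   # m -> wstar = 1.5, posterior variance -> 1/a = 0.25
```

Because the EMA of the Hessian keeps s positive whenever h is, the iterates respect the positive-definiteness constraint that the quoted tweet mentions.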
@EmtiyazKhan
Emtiyaz Khan
3 years
Excited to announce new positions in Tokyo for @BayesDuality , focusing on adaptive, robust, and lifelong deep learning. Get in touch if interested (URMs highly encouraged). Also available PhD positions @OISTedu (I think due soon). RTs appreciated. Help us spread the word
@BayesDuality
The Bayes Duality Project
3 years
Up to 4 new open positions in ApproxBayes group @RIKEN_AIP_EN , Tokyo with @EmtiyazKhan and @PierreAlquier Email us at jobs-bayes-duality (at) googlegroups (dot) com, if interested. Also, check our webpage RTs appreciated.
3
29
56
3
34
85
@EmtiyazKhan
Emtiyaz Khan
2 years
View for the next 6-day vacation. Hoping it will be a safe vacation.
Tweet media one
2
0
85