We have ≥$10k to support talented 14-18 year olds whose studies were interrupted by war in Ukraine.
We especially would like to hear from IMO, EGMO, MEMO, IOI, EGOI, IPhO, IChO contestants. If you're one or know one, here's the form to apply:
BREAKING: OpenAI reveals that ptrblck user who answers every single question on pytorch user forum has in fact been powered by superhuman ChatGPT since 2021
Judea Pearl claims all we do in ML is curve fitting. I wrote this post to explain that claim and introduce the basics of causal inference to ML folks.
Machine Learning beyond Curve Fitting: An Intro to Causal Inference and do-Calculus
2010: some people put papers on ArXiv
2012: we put papers on ArXiv after peer-review is done
2016: we put papers on ArXiv the day after deadline
2018: we just put stuff on ArXiv
2020: you wake up with a headache and wonder if you drunk-posted something on ArXiv you will regret
I came to do my PhD in the UK (and stayed to eventually pay more taxes than 99% of Brits) only because my partner could move with me.
As a Cambridge academic, I am losing out on great students, top global talent, who choose Germany instead because they have a partner.
From today, the majority of foreign university students cannot bring family members to the UK.
In 2024, we’re already delivering for the British people.
2018: GANs are failing at AI. Look, they can't even generate a consistent bedroom.
2022: DALL-E2 fails at AI, look it can't even generate "A donkey is playing tug-of-war against an octopus. The donkey holds the rope in its mouth. A cat is jumping over the rope."
I'm designing an introductory AI short course, split into four sessions:
1. linear regression
2. convnets
3. transformers
4. consciousness
Did I leave anything out?
I’m happy to reveal that I will be joining the Cambridge CS Department (@Cambridge_CL) later this year, working with @lawrennd and @carlhenrikek to build a new ML group.
This should be an awesome place to do an ML PhD in the coming years 😉!
A follow-up to my introduction to causal inference and do-calculus. This is based on my lectures at MLSS Africa last week. I'm turning that material into a series of posts, stay tuned.
Causal Inference 2: Illustrating Interventions via a Toy Example
MIT research on Jenga-playing robots. If this was a @DeepMindAI project we would all be watching a 2.5-hour live stream right now of AlphaJenga vs the Jenga World Champion, with professional Jenga commentators.
Colab notebook from my Causal Inference practical at MLSS2019.
Illustrates generative processes, interventions and counterfactuals through structural equation models.
You can make a copy and play around with it.
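The notebook itself isn't reproduced here, but the kind of thing it illustrates can be sketched in a few lines: a toy structural equation model where conditioning on x and intervening on x (do(x)) give different answers for y. All variable names and coefficients below are made up for illustration, not taken from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, do_x=None):
    """Sample from a toy SEM with a confounder: z -> x, z -> y, x -> y.

    If do_x is given, the structural equation for x is replaced by the
    constant do_x -- this is the intervention do(X = do_x), which cuts
    the arrow from z into x.
    """
    z = rng.normal(size=n)
    x = 2.0 * z + rng.normal(size=n) if do_x is None else np.full(n, do_x)
    y = x + 3.0 * z + rng.normal(size=n)
    return x, y

# Observational: E[y | x ~= 1] picks up the confounder z through x.
x_obs, y_obs = sample(100_000)
y_given_x1 = y_obs[np.abs(x_obs - 1.0) < 0.05].mean()   # ~2.2

# Interventional: E[y | do(x = 1)] leaves z at its marginal, so ~1.0.
_, y_do = sample(100_000, do_x=1.0)
```

The gap between the two estimates (about 2.2 vs about 1.0 with these coefficients) is exactly the conditioning-vs-intervening distinction the notebook walks through.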
New post in which I attempt to explain counterfactuals: they are powerful yet weird and difficult to grasp. Third post in a tutorial series on causal inference, following the material in my MLSS lectures.
The Hype of Deep Learning:
1. Write a post with ML, AI or GAN in the title.
2. Post appears at the top of Hacker News (despite your best efforts)
3. HN drives tens of thousands of clicks
4. "what's with all the maths? show me pretty pics"
5. <=1% stay for longer than a minute
LOL, the landmark 2005 paper that made mRNA therapies (incl. vaccines) possible wouldn't make it into the top 60 highest-cited CVPR papers (1209 citations). In case you needed any more evidence that citations are a stupid measure of impact.
Now that everyone is fatigued by GPT-4 hot takes and blocked the keyword "LLM", here's the blog post with my current view on the topic, and how my views changed:
I am uncomfortable with C++ because I don’t know how my code maps precisely to machine code.
This is why I naturally prefer C to deploy my pile of linear algebra whose parameters are found by billion-dimensional stochastic optimisation to drive my car.
@jamesdouma @RadarMoron @JeffTutorials @karpathy Transformers are replacing C heuristics for post-processing of the vision NN’s “giant bag of points”.
[Side note: I hate the bloated mess that is modern C++, but love simple C, as you know what it will compile to in terms of actual CPU operations.]
New post on iMAML: Meta Learning with Implicit Gradients
some animations, discussing potential limitations and of course a Bayesian/variational interpretation
We’re announcing the Inverse Scaling Prize: a $100k grand prize + $150k in additional prizes for finding an important task where larger language models do *worse*.
Link to contest details:
🧵
The quantum physics community makes the ML community look like a bunch of beginners.
While we're arguing about the importance of reproducibility, they *experimentally prove that there is no such thing as observer-independent objective truth*.
Linear Transformers Are Secretly Fast Weight Memory Systems
Shows the formal equivalence of linearised self-attention mechanisms and fast weight memories from the early ’90s.
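The equivalence can be illustrated on unnormalised linear attention: computing attention in parallel with a kernel feature map gives the same outputs as a recurrent net that stores key-value associations in an outer-product "fast weight" matrix. A minimal numpy sketch, where the feature map and sizes are arbitrary choices of mine rather than the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 4
Q, K, V = rng.normal(size=(3, T, d))

phi = lambda x: np.maximum(x, 0.0) + 1e-2  # any positive feature map works here

def linear_attention(Q, K, V):
    """Parallel form: softmax replaced by the kernel phi(q) . phi(k)."""
    A = phi(Q) @ phi(K).T                 # (T, T) unnormalised scores
    A = np.tril(A)                        # causal mask
    A = A / A.sum(axis=1, keepdims=True)  # normalise over past positions
    return A @ V

def fast_weight(Q, K, V):
    """Recurrent form: S accumulates key-value outer products (fast weights)."""
    S, z = np.zeros((d, d)), np.zeros(d)
    out = np.zeros((T, d))
    for t in range(T):
        S += np.outer(phi(K[t]), V[t])    # Hebbian-style outer-product update
        z += phi(K[t])                    # running normaliser
        out[t] = phi(Q[t]) @ S / (phi(Q[t]) @ z)
    return out

assert np.allclose(linear_attention(Q, K, V), fast_weight(Q, K, V))
```

Both functions compute sum_s phi(q_t)·phi(k_s) v_s over past positions s, just organised differently: one as a masked matrix product, one as a constant-memory recurrence.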
Any suggested material out there on how to skim-read research papers (especially in ML)? Eventually, students get this, but this feels like potentially something teachable.
Prior work (Doe et al., 2019) has considered the problem of parrot walking; however, the proposed method had severe limitations. The approach presented in this paper is novel and versatile. To our knowledge it is the first work considering multiple parrots simultaneously.
Deep reinforcement learning algorithms are impressive, but only when they work. In reality, they are largely unreliable and can yield very different results.
@larocheromain proposes two ways to achieve reliability in RL: #ICML2019
Happy to announce that I've rejoined @Twitter as an academic advisor/part-time researcher, working specifically with the META (ML Ethics, Transparency and Accountability) team under @quicola
Every dish you eat came out of Italy 1840-1965. Nothing was made from 1965-2020. The culture was so broken. Pineapple, overcooked pasta, deep pan and entitlement.
But the culture is changing. Wild food will be cooked in the next 10 years. Are you in or out?
We’re excited to reveal a new partnership between Twitter and UC Berkeley: a new lab, led by @mrtz and @beenwrekt, dedicated to understanding and improving how ML systems work inside social systems.
Live streaming in China is so insane.
This woman is known for spending less than 3 seconds promoting each product she sells.
On average she sells ~$19 million USD of products per week.
Unpopular myth-busting opinion:
deep learning DOES NOT do away with feature engineering. In certain high-D dense domains such as images or sounds, convolution-like things do well on what we call “raw” data. Elsewhere, input representation matters.
Let the flame wars commence.
We used to design features. Deep Learning learns features instead. Now, we design learning-algorithms. The next step is to learn learning-algorithms instead.
Introducing a new approach for training #ML models using noisy data that works by dynamically assigning importance weights to both individual instances and class labels, thus reducing the impact of noisy examples. Learn more about it at
The Generalization Mystery: Sharp and Flat Minima, SGD and how it's all related.
A critical look at recent work plus some of my own ideas on how to predict generalization performance.
I’m explaining transformers to someone and I genuinely don’t know: why do we use self-attention and not cross-attention there? I.e. why are keys and queries computed from the same tokens?
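For what it's worth, keys and queries are not tied in standard self-attention: they come from the same tokens but through separate learned projections; "self" just means queries and keys/values are computed from the same sequence, as opposed to cross-attention. A sketch where the shapes and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 8
X = rng.normal(size=(T, d))              # one sequence of token embeddings
Wq, Wk, Wv = rng.normal(size=(3, d, d))  # separate learned projections

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Xq, Xkv):
    """Generic attention: queries come from Xq, keys/values from Xkv.

    Self-attention is the special case Xq is Xkv -- but Wq != Wk,
    so keys and queries still differ for each token.
    """
    Q, K, V = Xq @ Wq, Xkv @ Wk, Xkv @ Wv
    return softmax(Q @ K.T / np.sqrt(d)) @ V

self_att = attention(X, X)                          # self-attention
cross_att = attention(X, rng.normal(size=(4, d)))   # cross-attention to another sequence
```

Tying Wq and Wk would force the score matrix to be symmetric before masking, which is a real restriction: "A attends to B" would imply "B attends to A" equally.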
🎂🎁🎈David Duvenaud Birthday Special:
Meta-Learning Millions of Hyper-Parameters Using the Implicit Function Theorem. New post on recent work by @JonLorraine, @PaulVicol and @DavidDuvenaud
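The implicit function theorem trick can be checked on a scalar toy problem: differentiate the validation loss through the inner optimum w*(λ) without unrolling any optimisation. The problem and numbers below are made up for illustration and are not from the paper:

```python
import numpy as np

a, b, lam = 3.0, 1.0, 0.5

# Inner problem: w*(lam) = argmin_w (w - a)^2 + lam * w^2 (closed form here,
# so we can verify the IFT answer exactly).
w_star = a / (1.0 + lam)

# Outer (validation) loss, evaluated at the inner optimum.
val = lambda lam: (a / (1.0 + lam) - b) ** 2

# IFT: dw*/dlam = -(d^2 L_train / dw dlam) / (d^2 L_train / dw^2)
#              = -(2 w*) / (2 + 2 lam)
dw_dlam = -2.0 * w_star / (2.0 + 2.0 * lam)

# Chain rule through the inner optimum gives the hypergradient.
hypergrad = 2.0 * (w_star - b) * dw_dlam

# Sanity check against a finite difference of the outer loss.
eps = 1e-6
fd = (val(lam + eps) - val(lam - eps)) / (2 * eps)
assert np.isclose(hypergrad, fd, atol=1e-4)
```

The point of the paper is that the same identity scales: the inverse Hessian is approximated rather than formed, so the hypergradient of millions of hyperparameters costs a few extra Hessian-vector products instead of unrolled training.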
I expect about 30% of NeurIPS papers this year to be something like “towards solving finger-collapse in diffusion-based generative models using doubly conditioned augmented hypernetworks”
🤣This made my week.
Figure 1: Hungarian government propaganda poster advertising their new family welfare program (notice the stock photo choice)
Figure 2: distracted boyfriend meme
My note on Smith et al (2021): On the Origin of Implicit Regularization in Stochastic Gradient Descent - a cool paper about modeling the behaviour of SGD just accepted to ICLR
"deep learning uncertainty in real-world applications":
* will my model ever converge?
* Is it too late to switch to PyTorch?
* mitigating uncertainty of the review process
* registering to NIPS
* asymptotically minimal effort experiment design for conference submissions
Awesome line-up for this year's Bayesian deep learning workshop @NipsConference, with this year's theme "deep learning uncertainty in real-world applications"
This @TuringTumble thing is really awesome. It starts with simple enough challenges, then introduces memory and registers. My favourite new toy by far.
A few years went by, and GANs produce beautiful stuff. What I find fascinating is that we still celebrate pretty pictures and inception scores, and there is little we can say about the generalisation or usefulness of any of this (pardon me if that’s incorrect, hit me with citations)
Over the weekend reports of racial/gender bias in Twitter's AI-based image cropping have started blowing up. I wanted to add some context from my perspective as an ex-employee and as a contributor to the research the product is based on.
A few days ago I tweeted things I should not have. It was bad; I regret it and apologize.
This sort of stuff undermines the effort of colleagues, and my own, to articulate the important role various disciplines play in taking ML forward and to create a welcoming and healthy community.
Invariant Risk Minimization: an Information Theoretic View.
My post on Arjovsky et al's latest paper, with a slightly different derivation of the IRM objective:
New massive recommender system dataset from Twitter! We hope this will stimulate more research on recommender systems, a field which didn't really have public datasets of this size.
We are releasing the biggest ever (160 million samples) recommender system public dataset today with @recsyschallenge 2020. Please go check out the website:
Google develops AI to optimize the layout of its next generation AI chip. (a) human-designed layout (b) AI-designed layout is 30% more power-efficient, includes four legs, an on-chip battery and a syringe of 5G activated nanorobots.