Can't refuse a direct request from the Cho-sen one.
🧵 TIME FOR A YEAR END RETROSPECTIVE 🧵
wherein I get to brag about the honour of having been included on papers by my students, who, by now, are generally more clued in than me about what's what.
Let's go! [1/24]
I will be posting (probably next week) some job listings for a new team I’m hiring into at
@GoogleDeepMind
. I’ll be looking for some research scientists and engineers with a strong engineering background to help build increasingly autonomous language agents. Watch this space.
Sorry,
@DeepMind
, it looks like we got there first! Proud to announce that our Jointly Optimized Kernel Evaluator agent achieved AGI this morning around 20 mins past 4am GMT. The solution was delightfully simple and yet elegant. It won’t fit in this tweet so THREAD (1/70)
I'm happy to announce I've joined
@facebookai
(FAIR) as a research scientist, working out of the London office. Get in touch if you want to chat about internships, PhDs, working together, etc.
I don’t want to brag, but I wrote (basically almost published) some homework when I was 5 that used multiplication (basically almost linear algebra), which was used in DL methods like LSTM, GANs, etc so you can thank me later for basically inspiring the most cited papers in ML.
Pleased as punch (the drinky kind, not the hurty kind) to be returning to Google
@DeepMind
as Director of Research today. It's an exciting time to be helping develop general agents that can adapt to open-ended environments, communicate with us, and help us in novel ways!
@maosbot
Not defending the view you’re objecting to, but Japan had a good pre-pandemic attitude towards mask-wearing e.g. when ill, during other epidemics, etc. I was rather hoping we’d develop some of this attitude in the west after the pandemic, but here we are 😔
Want to do a PhD with me at
@facebookai
(FAIR) +
@uclcs
? I am recruiting up to one student this year for a special FAIR-UCL studentship. FAIR will pay your fees (inc. international fees) + an extremely generous stipend, plus access to FAIR compute. Interested? Read on... (1/5)
Today I’m delighted to announce that I have joined
@CohereAI
as Head of ML, to help further develop ML R+D across the company and build up the new London office! 🎉 Excited to start a new journey with this brilliant team 🥰
Today we’re welcoming aboard
@egrefen
and
@pblunsom
to lead the grand opening of our new London office! Interested in tagging along? We’re currently hiring in London and across the 🌍! Check out open positions at , or drop us a line at talent@cohere.com
I think that the string "Pre-AGI" is the shortest number of characters that gives me an instant rage aneurysm. I despair for the future of our once approximately scientific field, now become the world's most well-financed techno-cult.
🧵THREAD 🧵
Are you looking to do a 4 year Industry/Academia PhD? I am looking for 1 student to pioneer our new FAIR-Oxford PhD programme, spending 50% of their time at
@UniofOxford
, and 50% at
@facebookai
(FAIR) while completing a DPhil (Oxford PhD). Interested? Read on… 1/9
Probably one of the more important DL papers of the last 5 years: it shows that the DL community has been good at rushing to flag-plant by creating flashy new models, but terrible at evaluating them by training good baselines. Can you trust YOUR model’s results in <insert task>?
Strongly (and respectfully) disagree with
@NandoDF
here. ML has a deep reproducibility problem. "Successful" methods are often unstable and require tricks not described in paper. Rarely have I been able to repro results without looking at assumptions in code but not in paper.
Research is not about re-running code, adding many comparisons where we know who wins, using citations purely for credit assignment, writing papers that no one remembers. We don’t want Neurips2020 to reproduce Neurips2019. We want it to be different and full of new ideas.
Announcing TorchBeast, an IMPALA-inspired
@pytorch
platform for distributed RL research. Used in a growing number of projects here at
@facebookai
. Project led by Heinrich Küttler, with major effort by
@nntsn
et al.
Paper:
Code:
Clearly
@DeepMindAI
is primarily about fundamental research so of course it's loss making. Why is anyone (triumphantly) surprised about this? Do we look at profit margins of academic groups? Would the world be a better place if groups like DM changed focus to generate income?
🚨 JOB ALERT 🚨
We're hiring research scientists/engineers to conduct research on next-generation assistant technologies to power increasingly autonomous agents which strive to support humans
Research Scientist:
Research Engineer:
Got a complicated RL exploration problem? Sparse/no reward? It's dangerous to go alone: bring an AMIGo! This thread introduces work done by Andres Campero, with
@robertarail
, Josh B. Tenenbaum,
@HeinrichKuttler
,
@_rockt
and me during Andres' internship at FAIR London. [1/5]
Happy to announce that I’ve been “promoted” to Honorary Professor of
@UCL
. I will continue to support research at
@ucl_dark
and within
@ai_ucl
in general.
I think there's something fundamentally wrong and unscientific with
@AndrewYNg
's "heroes of deep learning" series. It's a terrible culture to assume a field has a few superstars rather than building on the work of many. Encourages West Coast-style self-promotion above rigour.
I cannot believe my eyes, as
@NeurIPSConf
2020 AC, that these are suitable reasons for desk rejection. The first and third points are things that should be evaluated by several reviewers. The third point (esp re presentation) is not a good sole reason for rejection. But… (cont.)
The idea that an ML paper should be written with a plot twist really tickles me. Like “oh yeah this method sucks but let’s do it lip service and talk about some evals and WAIT WHAT’S THIS?! IT’S SOTA ENTERING THE RING WITH A STEEL CHAIR!!”.
No no no no no no no no no.
Thankfully, this advice was ignored by the authors. But this widespread but unspoken belief is why NeurIPS/ICML/ICLR reviewing for empirical papers is totally broken.
@EylonALevy
This is the sort of shit you get to say if you've had the *privilege* of not being affected by her racist, xenophobic, and classist policies as both homesec and PM. Not everyone has that luxury.
Yours,
A citizen of nowhere.
To wrap up 2018, I can now announce that after four interesting years there, I left
@DeepMindAI
back in November. I am thankful for the friendships and collaborations I formed along the way. I will miss many, many of my former colleagues, but am excited for my next adventure! 😀
In parallel with this paper,
@facebookai
has released higher, a library for bypassing limitations on taking higher-order gradients over an optimization process.
Library:
Docs:
Contributions very welcome.
Doing a PhD in a CS/ML related field? We're looking for several(!) interns to come work with the likes of
@_rockt
,
@riedelcastro
, and me (+ others) on NLU, RL+Language, Program Induction/Synthesis, and Metalearning at FAIR London in 2020. Get in touch!
Many machine reading datasets only require extracting a short span/entity. To drive research on systems that can read and understand complex narratives, we introduce NarrativeQA: human questions & answers about entire books/plays/movies. Upcoming in TACL.
@archer_rs
The more I read this, the more I think it must be some sort of indirect poe, but ultimately I don’t care: If it’s all true, it’s a beautiful schadenfreude-inducing story, and if it’s not, this is some top class writing. Either way, I await the next tweet with bated breath.
Lots of interest in meta-learning/differentiable optimization at
#ICLR2020
. We're happy to announce v0.2 of higher, a
@PyTorch
library for writing meta-learning research code in near-native pytorch. This is a fairly big update addressing some key blockers.
Happy to share our new
@DeepMindAI
paper on AGILE, a method for training agents to follow language instructions by jointly learning a reward model from examples. No more template languages, or problems with hard/impossible to code reward functions!
For the next conf, I’m contemplating taking LSD and engaging in an hour of Dadaist automatic writing. I’ll throw in some figures with a few buzz words, some unparsable maths with Greek letters you haven’t even HEARD of, TeX it all up, and submit. 100% novelty. Strong accept.
This surprising result should serve as a moment of reckoning for RL research. Reward may be enough in theory (if only) but an astounding amount of domain knowledge can, and probably must, be exploited in order to tractably solve complex problems.
Proud to present this short report on the outcomes of and learnings from
#NetHackChallenge21
, held at
@NeurIPSConf
. Did DeepRL win the day, or did symbolic challengers surprise us all? What do the results tell us about next steps for AI? Read to find out!
At AMLD GenAI,
@armandjoulin
is telling us how the building of custom language models is increasingly going to be within the reach of smaller teams and orgs. Paired with Angela Fan’s and
@jefrankle
’s talks yesterday, this paints a picture of a future where LLMs proliferate.
I just don’t get this attitude of saying something won’t work until you’re red in the face. Conversational search is a cool idea. If the tech isn’t ready, or the idea was cool but not useful, people just won’t flock to it. Chill out and let some notion of utility be the judge.
Good (deep) RL work shows stddev and mean across many seeds to demonstrate the reliability of the method, rather than top-k (out of ???) runs. Most papers I've read do not do anything nearly as sound. Maybe I'm reading the wrong papers...
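The reporting practice above is trivially cheap to do. A minimal sketch (the returns below are made-up illustrative numbers, not from any real experiment):

```python
# Sketch: report mean ± std of final return across seeds, not the best run.
import statistics

returns_per_seed = [0.82, 0.75, 0.91, 0.78, 0.85]  # final return of each of 5 seeds

mean = statistics.mean(returns_per_seed)
std = statistics.stdev(returns_per_seed)  # sample std across seeds

print(f"{mean:.2f} ± {std:.2f} over {len(returns_per_seed)} seeds")
```

The point is that the spread across seeds is part of the result: a method whose best seed is great but whose std is huge is a different claim from one that is reliably good.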
I think what a lot of senior people posting meta about NeurIPS acceptance/rejection/excitement might be forgetting is that it was a lot easier (and dare I say more fun) to get speculative/exploratory (or just any) work published circa 2010-2014 than now. (1/4)
Final version of our ICLR paper is out.
Key points:
* ConvNets good for vision, not so good for tree structure.
* Explicit conditioning on syntax helps.
* Nets can learn model-theoretic solution to entailment.
* Attention is not (always) all you need.
Want to help push the boundaries of RL research? Need a rich, difficult, and procedurally-generated environment with loads of structure and intricacy? An astounding amount of human play data? Sophisticated strategies and documentation? We got you (and it's faster than ALE!) [1/6]
A brilliant
@PyTorch
implementation of continuous stacks, extending work we did on unbounded neural memory at
@DeepMindAI
, and also inspired by related work on algorithm induction by
@armandjoulin
and Mikolov of
@facebookai
.
Super cool
@PyTorch
reimplementation (+ new stuff) of our
@DeepMindAI
differentiable stacks/queues/etc (NIPS'15) by
@Yale
undergrad(!) Will Merrill. Check it out!
I should probably be deeply ashamed by admitting this, but I've only *just* realised that the gradient of MSE loss (with 1/2 coef) is equivalent to taking the gradient of the KLD of two gaussians with the prediction and target values as means (and variance 1), wrt the prediction.
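Spelling out the identity in the tweet above (a quick sketch, taking both variances to be 1):

```latex
% Gradient of the half-MSE loss wrt the prediction:
\[
L_{\mathrm{MSE}}(\hat{y}) = \tfrac{1}{2}(\hat{y} - y)^2
\quad\Rightarrow\quad
\frac{\partial L_{\mathrm{MSE}}}{\partial \hat{y}} = \hat{y} - y
\]
% KL divergence between two unit-variance Gaussians centred on the
% prediction and the target (the log-variance and constant terms vanish):
\[
\mathrm{KL}\big(\mathcal{N}(\hat{y}, 1)\,\|\,\mathcal{N}(y, 1)\big)
= \frac{1 + (\hat{y} - y)^2}{2} - \frac{1}{2}
= \tfrac{1}{2}(\hat{y} - y)^2
\quad\Rightarrow\quad
\frac{\partial}{\partial \hat{y}}\,\mathrm{KL} = \hat{y} - y
\]
```

So the two gradients coincide term for term, which is the observation in the tweet.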
THREAD (application process at bottom)
We are looking for research interns to work with me,
@_rockt
,
@HeinrichKuttler
et al. at
@facebookai
(FAIR) London. Applicants should be doing a PhD, and ideally be interested in a project aligned with the topics of our recent pubs. [1/4]
On this most auspicious day, I am happy to end the speculation and announce that I will be setting up the world’s first AI-first beet farm in rural Pennsylvania. Looking forward to producing some fresh beets using the best gradients money can buy.
Those who have left the company include Edward Grefenstette, a research scientist that led Meta’s efforts on a branch of AI known as reinforcement learning, who departed in February.
I say this lovingly, because I have nothing but respect for Andrej, however…
Silicon Valley is one hell of a drug if *human communication* now receives this sort of characterisation.
Reading a tweet is a bit like downloading an (attacker-controlled) executable that you instantly run on your brain. Each one elicits emotions, suggests knowledge, nudges world-view.
In the future it might feel surprising that we allowed direct, untrusted information to brain.
Pleased to have been awarded, with
@LittleBimble
, the 2021 IJCAI-JAIR Best paper prize. Through a series of unfortunate events and miscommunications, we were sadly unable to be at IJCAI to receive the prize or present the work, but thank the committee for recognising our work.
Man these
@OpenAI
DALL·E 2 samples are getting more and more impressive. Note how despite the absurdity of the prompt, the photorealism is almost lifelike. Amazing…
And now, for something a little different…
We show that robust, interpretable latent rules can be synthesised by backprop. Data efficient, good generalisation. Can be trained end-to-end within a larger neural network. Upcoming in JAIR.
w/
@LittleBimble
One thing to highlight is the FiLM² layer introduced in §4.1 of the paper, which is a particularly simple-yet-powerful way of cross-conditioning from 2+ modalities. We did text/vision but in principle this works for anything…
@pytorch
code for this layer:
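For orientation, here is a minimal sketch of the plain FiLM operation that FiLM² generalises (this is *not* the released FiLM² code, and all values below are illustrative): features from one modality are modulated elementwise by a scale and shift predicted from the other modality.

```python
# Sketch of plain feature-wise linear modulation (FiLM): x' = gamma * x + beta.
# FiLM², per the paper, cross-conditions in both directions between modalities;
# this one-directional version is just for intuition. Names/values are made up.

def film(x, gamma, beta):
    # Elementwise modulation of features x by conditioning-derived gamma, beta.
    return [g * xi + b for xi, g, b in zip(x, gamma, beta)]

x = [1.0, 2.0, 3.0]      # e.g. visual features
gamma = [0.5, 1.0, 2.0]  # scales predicted from a text embedding
beta = [0.0, -1.0, 1.0]  # shifts predicted from a text embedding

print(film(x, gamma, beta))  # [0.5, 1.0, 7.0]
```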
The code for our RTFM task suite and text2π architecture (in
@PyTorch
) is now available at ! Great work by
@hllo_wrld
!
Read the paper:
and a blog post about the work:
"Going" to
@icmlconf
? Come hear about the future of language+RL at the
#LaReL2020
workshop on Language in Reinforcement Learning, held July 18. Here's a short thread introducing some of the highlights. [1/9]
Delighted that our paper on neural nets and logic was accepted to ICLR'18.
Summary: on highly and heterogeneously structured tasks (detecting logical entailment), models are ranked: semantically-aware > syntactically-aware > LSTM > ConvNet.
"AI is the new electricity"
"AI = gradients + pray + love"
Etc.
Is there something in the water in silicon valley (and occasionally elsewhere) which compels people to write what are, at best, empty ML platitudes, and at worst (to paraphrase Pauli) "not even wrong"?
Awesome new "learning to reason" (mathematically) dataset and task suite (which I helped out with a bit in my last months at
@DeepMindAI
), in a project envisioned, led, and primarily executed by the brilliant
@dwsaxton
. Give it a look!
Today we're releasing a large-scale extendable dataset of mathematical questions, for training (and evaluating the abilities of) neural models that can reason algebraically.
Paper:
Code and data:
Predictably, the useless leeches at
@ElsevierConnect
show how little they understand or care about science and scholarship. I can't wait for this industry to die out.
Your taxes paid for the research, so why let these muppets profit from it? Support open access publication.
This is a weird take. By the same token, is CS a branch of logic or of statistics? Is physics a branch of maths, or chemistry a branch of physics? Just because a field evolves to use the methods of another doesn’t make it a branch of it, if the focus of the field is different.
Within 10-20 years, nearly every branch of science will be, for all intents and purposes, a branch of computer science.
Computational physics, comp chemistry, comp biology, comp medicine... Even comp archeology. Realistic simulations, big data analysis, and ML everywhere
Excellent article by
@andrey_kurenkov
. A great and healthy step back from recent successes of DeepRL to ask "can we do even better?" by questioning whether methods unbiased by prior experience and instruction can scale.
Happy to have played a part in the design of
@facebookai
's NetHack Learning Environment, a project led by Heinrich Küttler and
@_rockt
, with significant contributions by
@nntsn
and many others. This env will help push the boundaries of RL research.
[1/7]
There's some weapons-grade auto-back-patting coming out of
@OpenAI
this week. Sure, you *often* need dedication/effort/rigour to make progress in any domain. But (a) it's toxic to suggest this is the only way to do so, (b) it's just wrong to equate effort with working a 90h week.
Many machine reading datasets only require extracting a short span/entity. To drive research on systems that can read and understand complex narratives, we introduce NarrativeQA: human questions & answers about entire books/plays/movies. Upcoming in TACL.
... and this joke is somewhat ruined by the fact that Twitter won’t let me suppress the video preview when I link to a video explaining our fabulous method.
Oh well, here you go anyway...
One of the greatest moments of disappointment in my life was when I discovered that the expression “balls to the wall” refers to the operation of levers in a train or plane, rather than to testicles. It immediately made saying it 80% less funny.
For the last year or so, I’ve been saying (to anyone willing to listen), that—modulo best eng practices and appropriate scale—most research and progress in AI is going to come from rethinking how we evaluate models and use data. A short 🧵
The second point *really* gets my goat. Sure, we don’t want to send some mad rambling to reviewers, but who will be most affected by desk rejections on this point? ESL folk. Non-traditional submitters. Small labs in countries with few native English speakers. (cont.)
Disappointing to see so little discussion happening on
@iclr_conf
2020 papers. What's the point of a rebuttal phase if reviewers dump their opinion and have no intention of changing it or discussing it?
There should be a term like “Dunning-Kruger by proxy” to reference people who somehow still think Elon Musk is smart despite everything that’s happened in the last few months.
Not expecting a Turing award off the back of this one, but here's a neat little study—done with
@DeepMindAI
colleagues—of the effect of naive ensembling *during* adversarial training on the robustness of neural networks to adversarial examples. Enjoy.
Love this. Take a look if you're doing a PhD in NLP and thinking about what to work on next. Take a look if you think "BeRt WiLl SoLvE lAnGuAgE lol!11" and need someone to point you to actual research topics.
so, here are a bunch of stuff i find interesting. no particular order. and definitely not comprehensive.
- creative ways to apply massive LMs. Sure we can fine-tune them with extra supervision. What else can we do with them?
@idavidrein
@NathanpmYoung
True. Although the absolute pinnacle of mathematical beauty is the square packing singularity: the optimal way of packing a square in a square.
Hong Kong:
- Life expectancy: 84 years
- Meat consumption per capita: 153 kg (world's highest)
India:
- Life expectancy: 68 years
- Meat consumption per capita: 4 kg (world's lowest)
Awesome result:
@facebookai
(FAIR), in collaboration with
@ucl_nlp
, takes the top (and often second as well) place in *every* competition in EfficientQA. Great job, everyone!
Mind-boggling results on the final EfficientQA leaderboard: The best system beat the REALM baseline by almost 20 points, and a 30 megabyte model got > 25% accuracy! Looking forward to hearing more about these systems at NeurIPS.
After giving it a try for a few weeks, I’ve deleted clubhouse. One of the main perks of working in tech in London is I *don’t* have to deal with inane Silicon Valley navel-gazing on the daily, so I have no idea why I thought it was a good idea to invite it into my living room 🤮
Want to work with me at FAIR London? I'm looking to take up to 2 interns in 2022 (flexible start) for 4 months. You must be actively pursuing a PhD, and ideally in the last 2 years.
Email me your research interests with [Internship 2022] in the subject line, at egrefen
@fb
.com.
In general, experts—or people who are highly regarded in any field—are often hooked on their own self-importance.
But sometimes status or accomplishment in one realm has no relevance in another.
Toy Semantic Search
(1/4) We frequently want to search through documents to find the answer to a question. Examples include technical documentation, like docs for a programming language, or a company wiki to find out how to set up our laptops ↓
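The idea sketched in miniature (a real system would use a learned embedding model; bag-of-words vectors stand in here, and every name below is made up for illustration):

```python
# Toy semantic search: embed the query and each document, return the
# document with the highest cosine similarity to the query.
from collections import Counter
import math

def embed(text):
    # Stand-in "embedding": a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing keys
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query, docs):
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

docs = [
    "To set up your laptop, install the base image then enroll in MDM.",
    "The language docs cover syntax, the standard library, and tooling.",
]
print(search("how do I set up my laptop", docs))
```

Swapping the bag-of-words `embed` for a neural sentence encoder turns this into the document-search setup described above; the retrieval loop stays the same.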
The datasets (and code to generate more data) for our ICLR paper “Can Neural Networks Understand Logical Entailment?” is now online. Enjoy (and test your sequence models on it).