Here’s what I’ve been working on recently: @anthropicai. I’ll be spending a lot of my time on measurement and assessment of our AI systems, as well as thinking of ways govs/others can assess AI tech. There’s a lot to do!
As someone who has spent easily half a decade staring at AI arXiv each week and trying to articulate the rate of progress, I still don't think people understand how rapidly the field is advancing. Benchmarks are becoming saturated at ever-increasing rates.
A mental model I have of AI: it made roughly linear progress from the 1960s to 2010, then exponential progress from 2010 into the 2020s, and has started to display 'compounding exponential' properties from 2021/22 onwards. In other words, the next few years will yield progress that intuitively feels nuts.
In the last decade:
- figured out cut & paste for DNA (CRISPR)
- reusable rockets (SpaceX)
- crude but generally useful AI systems (LLMs, image/video models, RL for inventory)
- promising fusion approaches (Helion, etc.)
This decade is going to be so wild. It's very exciting.
Stable Diffusion: $600k to train.
I'm impressed and somewhat surprised - I figured it'd have cost a bunch more.
Also, AI is going to proliferate and change the world quite quickly if you can train decent generative models with less than $1m.
I am delighted to announce that I've been appointed to the National AI Advisory Committee, which will advise the President and the National AI Initiative Office on matters relating to AI.
The next five years of AI will see systems diffuse into the world that act on culture, which will feed back into human society, changing it irrevocably. Some thoughts from this morning:
Many of the problems in AI policy stem from the fact that economy-of-scale capitalism is, by nature, anti-democratic, and capex-intensive AI is therefore anti-democratic. No one really wants to admit this. It's awkward to bring it up at parties (I am not fun at parties).
Personal news announcement: I am now the Policy Director for @OpenAI. This reflects my focus (where I spend the majority of my time), and also several recent hires (e.g., @apilipis who is going to be handling a growing chunk of our comms). I'm psyched!
Import AI is skipping this week because I am exploring a new universe of emotions with my expanded family (and changing many diapers... so many diapers.)
One gnawing worry I have about the rise of LLMs is that, for me, writing IS thinking. One reason I spend so much time writing my newsletter each week is I haven't figured out a better way to think about AI than to sit down and write about it regularly.
Google CEO writes letter re @timnitGebru
Sundar: "learning from our experiences like the departure of Dr. Timnit Gebru"
Translated Sundar: "analyzing why I let us fire Timnit Gebru and am now desperately trying to position myself as a bystander"
Note that the reason The Register got this monster market-moving story was because it employs (and trains) extremely technical journalists. If you don't understand tech you get lied to. If you can read code it's way harder to get lied to. Other news orgs should follow!
Like 95% of the immediate problems of AI policy are just "who has power under capitalism", and you literally can't do anything about it. AI costs money. Companies have money. Therefore companies build AI. Most talk about democratization is PR-friendly bullshit that ignores this.
Today, I testified to the U.S. Senate Committee on Commerce, Science, & Transportation @commercedems. I used an @AnthropicAI language model to write the concluding part of my testimony. I believe this marks the first time a language model has 'testified' in the U.S. Senate.
The real danger in Western AI policy isn't that AI is doing bad stuff, it's that governments are so unfathomably behind the frontier that they have no notion of _how_ to regulate, and it's unclear if they _can_
Facebook is deploying multi-trillion parameter recommendation models into production, and these models are approaching the computational intensity of powerful models like BERT. I wrote about the research in Import AI 245:
Paper here:
I typically stay out of stuff like this, but I'm absolutely shocked by this email. It uses the worst form of corporate writing to present @timnitGebru's firing as something akin to a weather event - something that just happened. But real people did this, and they're hiding.
The most depressing thing about conspiracy theories is that they tend to rely on governments being incredibly competent, technically advanced, and astonishingly well run. This is rarely the case.
I've moved on from OpenAI to work on something new with some colleagues (). I'm also going to be continuing a lot of my work on technology assessment with @indexingai and the @OECD, and am very excited about stuff in the pipeline there!
AI really is going to change the world. Things are going to get 100-1000X cheaper and more efficient. This is mostly great. However, historically, when you make stuff 100X-1000X cheaper, you upend the geopolitical order. This time probably won't be different.
The default outcome of current AI policy trends in the West is we all get to live in a Libertarian Snowcrash wonderland where a small number of companies rewire the world. Everyone can see this train coming and no one can work out how to stop it.
There's pretty good evidence for the extreme part of my claim - recently, language models got good enough that we can build new datasets out of LM outputs, train LMs on them, and get better rather than worse performance. E.g., this Google paper:
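The loop being described - a model labels data, and you retrain on its own confident outputs - can be sketched with a toy stand-in. This is an illustration of the general self-training idea using a nearest-centroid classifier on 1-D points, not the LM setup from the Google paper:

```python
# Toy self-training sketch: pseudo-label unlabeled points with the current
# model, keep only confident labels, then refit and repeat. A real LM
# version would generate text and filter by quality instead.

def centroid(points):
    return sum(points) / len(points)

def self_train(labeled, unlabeled, rounds=3, margin=1.0):
    """labeled: dict label -> list of points; unlabeled: list of points."""
    for _ in range(rounds):
        cents = {lab: centroid(pts) for lab, pts in labeled.items()}
        still_unlabeled = []
        for x in unlabeled:
            # distance from x to each class centroid, nearest first
            dists = sorted((abs(x - c), lab) for lab, c in cents.items())
            (d1, lab1), (d2, _) = dists[0], dists[1]
            if d2 - d1 >= margin:          # confident pseudo-label: keep it
                labeled[lab1].append(x)
            else:
                still_unlabeled.append(x)  # too ambiguous, retry next round
        unlabeled = still_unlabeled
    return labeled

data = self_train({"low": [0.0, 1.0], "high": [10.0, 11.0]},
                  [0.5, 1.5, 9.5, 10.5, 5.4])
# Confident points get absorbed into the training set; the ambiguous
# midpoint (5.4) is never pseudo-labeled.
```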
Things are getting... Extremely weird. Think about what this graph may look like in spring 2023 (was published April 2021). From the excellent Dynabench paper
AI is such a rapidly growing field that I think we forget how juvenile it was within recent memory; back in 2014 the received wisdom was that basic computer vision was an impossible task. Now it's a commodity deployed to users on their phones. (Has issues, e.g. bias, but still... wild)
Arrival of increasingly general AI systems means next few years will be defined by a massive expansion in the ways we measure the impacts and capabilities of AI systems, how humans use them, and how AI systems influence the world. Measurement is crucial to effective AI policy.
The stock photo industry is probably not ready for generative AI. Generative AI seems better for 80% of use-cases. In other words, the NYT is still gonna use illustrators, but a random website will probably find the economics of gen models more attractive than a Shutterstock subscription.
It's covered a bit in the above podcast by people like @katecrawford - there are huge implications to industrialization, mostly centering around who gets control of the frontier when the frontier becomes resource intensive. So far control is accruing to the private sector (uh oh!)
Stability AI (the people behind Stable Diffusion and an upcoming Chinchilla-optimal code model) now have 5,408 GPUs, up from 4,000 earlier this year - per @EMostaque in a Reddit AMA
People in AI like to complain about the standard of journalistic coverage of AI. It is therefore v confusing to me that #NeurIPS2018 has banned journalists from attending workshops. That's where the debates and new stuff are. How do we get better coverage without sharing more?
In late May, I had back spasms for 24 hours, then couldn't walk for a week, then spent a month+ recovering. It was one of the worst experiences of my life and I'm glad I seem to now be mostly recovered. Here are some things that happened that seemed notable during that time:
Here's a thread about doing things for yourself vs doing things the world thinks you should do. As I've got older, I've noticed that the more time I spend on the things that make sense to me, the more stable and fulfilled I am.
Every senior politician or military official in any nuclear-armed nation should be forced to read Annie Jacobsen's "Nuclear War: A Scenario". Easily the most frightening thing I have ever read (fiction or otherwise). A brilliant, factual account of the infernal logic of MAD.
Many AI policy teams in industry are constructed as basically the second line of brand defense after the public relations team. A huge % of policy work is based around reacting to perceived optics problems, rather than real problems.
It's ironic to me that more and more of Google's papers reference JFT, a secret in-house image dataset. JFT is going to be the 'fuel' for a significant number of Google's AI advances (e.g, DM just pre-trained on it to set a new ImageNet SOTA.) Yet...
Many technologists (including myself) are genuinely nervous about the pace of progress. It's absolutely thrilling, but the fact it's progressing at like 1000X the rate of gov capacity building is genuine nightmare fuel.
Google's 'Talk to Books' AI experiment is... uncanny. Talk to a library like a person and have the library reply like a person. A good example of how AI can reframe interactions between us and data to make data more of an active protagonist. Spooky!
Microsoft trains a 530-billion parameter GPT-3-style language model. This is the largest LM in existence. (There's also the mysterious multi-modal 1.5-trillion+ parameter 'Wu Dao' MoE model, but little is known about it.) Microsoft trained on 'The Pile' dataset.
Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning
abs:
Introduces DeepNash, an autonomous agent capable of learning to play the imperfect-information game Stratego from scratch, up to a human expert level
Modern AI development highlights the tragedy of letting the private sector lead AI invention - the future is here but it's mostly inaccessible due to corporations afraid of PR & policy risks. (This thought sparked by Google not releasing its music models, but the trend is general.)
Most people working on AI massively discount how big of a deal human culture is for the tech development story. They are aware the world is full of growing economic inequality, yet are very surprised when people don't welcome new inequality-increasing capabilities with joy.
GPT-4 should be analyzed as a political artifact just as much as a technological artifact. AI systems are likely going to have societal influences far greater than those of earlier tech 'platforms' (social media, smartphones, etc).
A surprisingly large fraction of AI policy work at large technology companies is about doing 'follow the birdie' with government - getting them to look in one direction, and away from another area of tech progress
Anyway, how I'm trying to be in 2023 is 'mask off' about what I think about all this stuff, because I think we have a very tiny sliver of time to do various things to set us all up for more success, and I think information asymmetries have a great record of messing things up.
I'd sum up 2023 for me with these two pictures: In one, I'm speaking to the UN Security Council about AI and its immense impact on the world. In the other, I'm passed out with my baby. The second photo was taken about 30 minutes after the first one.
Thrilled to announce I've become a Research Fellow @ the Center for Security and Emerging Technology in Washington, DC:
I'll be hanging my hat there sometimes when I'm in DC, and will be figuring out creative ways to publicly bridge SV&DC re AI policy
US AI researchers: Big models have loads of problems and it's mostly not appropriate for academia to develop them.
Chinese AI researchers: Here's a 200 page roadmap for why big models are really important and why we should develop them
Today I'm testifying in Congress about AI and public policy. I'm going to discuss the importance of developing shared ethical norms, the need to support AI development & education, and why we need government-led measurement and forecasting of AI.
One of the amazing and also frightening things about AI is how it magnifies and repeats the 'culture' that it is trained on, where culture is a bunch of implicitly ideological choices on the part of the people that create the underlying datasets.
Both of these results were published TODAY. These results happen at a delay, so this is probably old information, on the order of 3-9 months. There are easily 5 labs, and probably 10, with enough compute to play at this level. Imagine what we don't know right now.
Language models have shown strong performance on a variety of #NLU tasks but are weaker at solving tasks that involve quantitative reasoning. Learn how #Minerva uses step-by-step reasoning to achieve a new state of the art on quantitative reasoning tasks→
I'm not sure that having a liberal arts degree has done fantastic things for my career, but it does bring me joy every day when I read AI research papers and think 'Baudrillard would love this!' or 'this is a Borges story!' or 'these LM outputs read like Amy Hempel'.
Will write something longer, but if the best ideas for AI policy involve depriving people of the 'means of production' of AI (e.g. H100s), then you don't have a hugely viable policy. (I 100% am not criticizing @Simeon_Cps here; his tweet highlights how difficult the situation is.)
Wow guys, Falcon-40B (SOTA open-source, probably already dangerous from a misuse perspective) has been trained with ~400 A100s over 2 months😮.
It means that if we want to avoid a random org training a SOTA open-source system in less than 1y and releasing or leaking it bc…
I'll one day write about the experience of trying to explain LLMs before anyone gave a fuck and how strange and alienating it was, but not today! Just remember - 4 years between gpt2 and where we are right now. Prepare for the next four.
Working in AI right now feels like how I imagine it was to be a housing-debt nerd in the run-up to the global financial crisis. You can sense that weird stuff is happening in the large, complicated underbelly of the tech ecosystem.
Christmas can be an incredibly hard time of year for some people, so if anyone is out there who wants to talk (it doesn't even have to be about AI!) I'm available - just drop me a line in email in bio or DM. Happy Holidays to all, there will be a (festive) newsletter tomorrow.
Nice analysis of AlphaCode from Scott Aaronson. Yes, AIs make all kinds of mistakes and have various deficits, but the field has massively advanced in recent years.
Universities need access to computational infrastructure at a scale equivalent to the private sector's, or I worry about the long-term democratic governance of technology. I gave a short talk @ Stanford today about why I think a National Research Cloud would help.
@LondonBreed @OpenAI Hiya, I worked at OpenAI for many years and I've got to say that some of my colleagues didn't feel safe walking home at night because of how dangerous chunks of the Mission were. I think SF got extremely lucky with OpenAI but you really need to get a handle on crime and housing.
AI systems are solving math olympiad problems, becoming competitive with humans in programming competitions, speeding up science via improved protein structure predictions, writing poems and fiction that people enjoy, and revolutionizing our ability to monitor a changing climate
Many of the immediate problems of AI (e.g., bias) are so widely talked about because they're at least somewhat tractable (you can make measures, you can assess, you can audit). Many of the long-term problems aren't discussed because no one has a clue what to do about them.
I've been keeping some form of journal for 15-20 years or so - some years have been incredibly sparse and some years have involved writing stuff every day. Through this, I've discovered a meaningful link between journaling and mental health. Here's a thread of what I've learned:
Playing around with the latest ChatGPT replication (OpenChatApp) and it's a) quite good, and b) neatly illustrates how crazy-powerful instruction-tuned models are compared to stock LLMs. Compare OpenChatGPT (20B params) on the left to OPT (a GPT-3 replication, 175B params) on the right.
China has a much better-developed AI policy approach than Europe and the United States. China is actually surprisingly good at regulating things like, say, synthetic media.
One thing I regularly obsess about is how 'today's AI systems are more powerful than they appear' - here's a nice example for Claude via @AnthropicAI of how, by prompting your system more intelligently, you can eke out significant performance boosts
We’ve published a quantitative case study on prompt engineering for one of our most popular features, Claude’s industry-leading 100K token context window.
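One common long-context tactic along these lines is to place the question after the long document, right next to where the model starts generating, rather than before it. The template below is an illustrative sketch of that idea, not the exact format or results from the case study:

```python
# Minimal prompt-construction sketch for long-context Q&A. The tag names
# and wording here are illustrative assumptions, not an official template.

def build_prompt(document: str, question: str) -> str:
    return (
        "Here is a document:\n\n"
        f"<document>\n{document}\n</document>\n\n"
        # Asking the question at the end keeps it adjacent to the answer,
        # which tends to help when the document runs to many thousands
        # of tokens.
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_prompt("...long report text...", "Who wrote the memo?")
```

Reordering the same pieces (question first, document last) is the kind of cheap A/B comparison a quantitative prompt-engineering study runs at scale.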
If AI is a new industrial revolution shouldn't we be... incredibly concerned? The industrial revolution was a time of great chaos and misery for millions of people, and it occurred during a period with less extreme weather and a less connected world.
Me: OK brain time to read these AI research papers!
Brain: CLIMATE CHANGE MEANS YOUR CHILDREN WILL LIVE TO SEE HELL.
Me: OK brain that's enough of that let's think about neural architecture search!
Brain: THE DESTINY OF HUMANITY IS DEATH BY ITS OWN TOOLS.
I'm hiring someone to work with me at @OpenAI on scientific communications. This will involve reading a lot of papers and helping to write and edit the OpenAI blog, as well as creating educational materials for policymakers / VIPs and (soon) the general public. Is this you?
A lot of the criticisms people use to talk about AI (influence on inequality, monopoly-burnishing capabilities, bias towards underrepresented people, mostly opaque to the public, etc) are also equally valid criticisms of neoliberalism. AI augments the system it is deployed in.
People wildly underestimate how much influence individuals can have in policy. I've had a decent amount of impact by just turning up and working on the same core issues (measurement and monitoring) for multiple years. This is fun, but also scares the shit out of me.
Pretty eerie: AI models learn to reflect user views back at them (since, I figure, getting low loss rewards modeling the _context_ of whatever emitted the input tokens). Pretty weird to see it in the wild. LLMs seek to reflect the views of the people who talk to them.
pulling out my favorite chart here: large models are really, really keen to tell you what you want to hear
if you thought the last decade's echo chambers were bad, hooo boy
Thrilled about these new models - I've been playing around with Claude 3 Opus a lot and it's very capable and useful. Like with most frontier models, it has chewed through a bunch of evals so we need to now build more complicated evals to better understand its capabilities.
Today, we're announcing Claude 3, our next generation of AI models.
The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision.