It's such an honor to work on Project Astra with such an amazing team from across Gemini and Google DeepMind! While the
#GoogleIO
keynote was happening we had a last minute idea of watching the keynote with Project Astra. Check it out!
Gemini and I also got a chance to watch the
@OpenAI
live announcement of GPT-4o, using Project Astra! Congrats to the OpenAI team, super impressive work!
Learning to represent objects is a major research direction towards representing the causal structure of the world.
In our oral at
#iclr2022
workshop on Objects, Structure & Causality, we present a new way to conceptualize objects: as stable points of a fixed-point procedure: 👇
Does modularity improve transfer efficiency?
In our ICML’21 paper (long oral), we analyze the causal structure of credit assignment from the lens of algorithmic information theory to tackle this question
w/
@SKaushi16236143
,
@svlevine
,
@cocosci_lab
1/N
My dissertation talk can be viewed here:
Thank you to my advisors
@svlevine
and
@cocosci_lab
for a PhD journey that has been the most intense and fulfilling growth experience of my life.
Thank you to all my friends and family for your love and support.
Fun weekend project - a visualization of the generation process of a Bayesian flow network modeling a single point (-0.8, 0.8) (blue star). Pink dots are samples from the sender distribution, the white trajectory traces the Bayesian updates over the parameters mu, and the green trajectory shows the network's predictions over time.
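The Bayesian updates in that visualization can be sketched in a few lines. Assuming the standard conjugate-Gaussian update with a toy constant accuracy alpha (a simplification, not the paper's actual accuracy schedule), a minimal numpy version:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([-0.8, 0.8])   # the single data point being modeled (blue star)

mu, rho = np.zeros(2), 1.0  # belief over x: Gaussian mean and precision
alpha = 0.5                 # toy per-step sender accuracy (assumed constant)

for _ in range(50):
    y = x + rng.normal(scale=alpha ** -0.5, size=2)  # sample from sender dist
    mu = (rho * mu + alpha * y) / (rho + alpha)      # precision-weighted update
    rho = rho + alpha                                # precision accumulates
# mu has drifted toward x, and rho = 1 + 50 * 0.5 = 26
```

Each step the sender adds noise with precision alpha and the receiver folds the sample into its belief by precision weighting, so the belief mean drifts toward the data point much like the white trajectory in the visualization.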
📣 BFNs: A new class of generative models that
- brings together the strengths of Bayesian inference and deep learning
- trains on continuous, discretized, or discrete data with a simple end-to-end loss
- places no restrictions on the network architecture
🤖🤖Multi-Agent Dialogue Simulations🤖🤖
Ever wondered how your favorite characters would interact in new contexts? How about putting Harry Potter and Argus Filch on the same team?
New ex. in
@LangChainAI
for simulating stories of *multiple* characters
♊️ Gemini is out! ♊️
What an honor it's been to work as part of the Gemini multimodal and eval teams with such amazingly talented and high-velocity colleagues!
Seeing its multimodal coding capabilities makes me ever more excited for the future of human-computer interfaces! 🚀
The Gemini era is here. Thrilled to launch Gemini 1.0, our most capable & general AI model. Built to be natively multimodal, it can understand many types of info. Efficient & flexible, it comes in 3 sizes, each best-in-class & optimized for different uses
🤖Data-Driven Character Chatbots🤖
Running out of creativity creating
@character_ai
character definitions?
Introducing data-driven-characters, a repo built on
@LangChainAI
for easily creating, debugging, and running your own character chatbots grounded in any story corpus 🧵
🔗 LangChain x Gymnasium 🤖
Chatbots have mostly been used as dialogue agents, but they can also be adapted for standard RL envs.
New ex. in
@LangChainAI
showing how to integrate chat models with Gymnasium (formerly
@OpenAI
gym) from
@FaramaFound
(it doesn't do RL yet though).
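The loop being described can be sketched with stand-ins: `StubEnv` and `chat_policy` below are hypothetical stubs (not the LangChain example's code and not the real Gymnasium API), but the 5-tuple step interface mirrors Gymnasium's.

```python
import random

class StubEnv:
    """Stand-in for a Gymnasium env: guess-the-number, one step, then done."""
    def reset(self, seed=None):
        random.seed(seed)
        self.target = random.randint(0, 3)
        return "Guess a number between 0 and 3.", {}

    def step(self, action):
        reward = 1.0 if action == self.target else 0.0
        # Gymnasium-style 5-tuple: obs, reward, terminated, truncated, info
        return f"You guessed {action}.", reward, True, False, {}

def chat_policy(observation: str) -> int:
    """Stand-in for a chat model: a real version would send the observation
    as a message and parse an integer action out of the model's reply."""
    return 2

env = StubEnv()
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(chat_policy(obs))
```

The point is just the plumbing: observations become messages, replies are parsed back into actions, and no learning happens (matching the "doesn't do RL yet" caveat).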
🗳️Decentralized speaker selection🗳️
How to implement multiagent dialogue without a fixed schedule for who speaks when? Let the agents bid to speak!
New ex. in
@LangChainAI
of a fictitious presidential debate b/w Donald Trump, Kanye West, Elizabeth Warren:
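The bidding mechanic can be sketched as follows; the bid rule here (bid high when your surname was just mentioned) is a made-up illustration, not the actual example's heuristic:

```python
import random

def select_speaker(agents, bid_fn):
    """Decentralized selection: every agent bids; the top bid speaks.
    Ties are broken uniformly at random."""
    bids = {name: bid_fn(name) for name in agents}
    top = max(bids.values())
    return random.choice([n for n, b in bids.items() if b == top])

# Hypothetical bid rule: bid high if your surname was just mentioned,
# otherwise bid a small random amount.
last_message = "What do you think, Warren?"
def bid_fn(name):
    return 10 if name.split()[-1] in last_message else random.randint(0, 5)

agents = ["Donald Trump", "Kanye West", "Elizabeth Warren"]
speaker = select_speaker(agents, bid_fn)  # "Elizabeth Warren" wins the floor
```

No central schedule is needed: whoever values the floor most at this moment speaks next.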
Hard to contain my excitement about what we've been working on: it's exhilarating to see how quickly what would've blown my mind a short time ago -- like fine-grained analysis of hours of video -- become table stakes of what we expect from these models when things "just work."
Introducing Gemini 1.5: our next-generation model with dramatically enhanced performance. It also achieves a breakthrough in long-context understanding.
The first release is 1.5 Pro, capable of processing up to 1 million tokens of information. 🧵
1/ Check out our latest work on societal decision-making which we will present next week at ICML 2020. Very grateful to my collaborators and advisors
@SKaushi16236143
, Matt Weinberg, Tom Griffiths, and
@svlevine
.
Can we view RL as a series of economic transactions between primitives/actions? We present an RL method based on auctions, where primitives "buy" states and "sell" next states
w/
@mmmbchang
,
@SKaushi16236143
, Weinberg, Griffiths
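The auction step at the core of this idea can be sketched with a second-price (Vickrey) auction, the mechanism the work builds on; the primitive names and valuations below are hypothetical:

```python
def vickrey_auction(bids):
    """Second-price (Vickrey) auction: the highest bidder wins but pays the
    second-highest bid, which makes truthful bidding a dominant strategy."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price

# Hypothetical primitives bidding their learned valuation of the current state.
bids = {"grasp": 0.9, "push": 0.4, "lift": 0.7}
winner, price = vickrey_auction(bids)
# "grasp" wins and pays 0.7; it executes its policy, then "sells" the
# resulting next state in the following round's auction.
```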
"Object Representations as Fixed Points: Training Iterative Refinement Algorithms with Implicit Differentiation" is now out on arXiv!
w/ my advisors
@cocosci_lab
,
@svlevine
arxiv:
web:
youtube:
🤖Dialogue Agents x Tools🔨
Equipping dialogue agents w/ browsing tools enables more grounded discussions.
New ex. in
@LangChainAI
showing how to augment dialogue agents w/ tools in a fictitious debate between an AI accelerationist & an AI alarmist:
🔗 LangChain x PettingZoo 🤖🤖🤖
Not only can you integrate LLMs with standard RL envs, you can now do so with multi-agent RL envs too.
New ex. in
@LangChainAI
showing how to integrate chat models in PettingZoo (multi-agent Gymnasium) from
@FaramaFound
The code for Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions is now released: . Thank you to my collaborators and advisors
@SKaushi16236143
, Matthew Weinberg, Tom Griffiths (
@cocosci_lab
),
@svlevine
!
I'm now on the job market, looking for postdoc and industry positions. Interested in (1) developing AI for automatically modeling and manipulating systems, and (2) developing AI for powering next-generation human-computer interfaces. If you know of any opportunities, please let me know!
How to repurpose previous knowledge for new problems is a major question in developing agents that automatically model and manipulate systems.
In our oral at
#NeurIPS2022
Attention workshop we study this question in the context of object rearrangement: 👇
Looking forward later today to the
#Tribeca2024
premiere of The Thinking Game - a new documentary about the story of
@GoogleDeepMind
, AGI & AlphaFold, by Greg Kohs with music by Dan Deacon; it’s a sequel of sorts to the award-winning AlphaGo documentary
🤖Two Agent Simulators🤖
Ever wanted to explore hypothetical variants of your favorite stories in role-playing games?
Introducing a new example in
@LangChainAI
for simulating two-agent dialogues, showing how to implement simple RPGs similar to D&D:
We will present our work with Abhishek Gupta,
@svlevine
, and Tom Griffiths
@iclr2019
on Wed 11am
#83
. Come see how composing representation transformations improves over learning flat input-output mappings when we want to extrapolate to harder compositionally structured problems.
Missed the agent simulations panel at the summit last Fri?
No worries, we got you.
Here's an agent simulation of the agent simulations panel itself, letting you re-simulate the discussion w/ the panelists about any topic (w/ 🔊!)
Model-based RL with models that factorize over entities; these can discover object-like representations and can be used to plan how to construct structures out of parts.
w/ R. Veerapaneni, JD Co-Reyes, M. Chang, M. Janner,
@chelseabfinn
, J. Wu, J. Tenenbaum
🖼️Solving visual analogies w/ in-context learning🖼️
In-context learning has mostly been shown in language; how can we transfer this capability to the visual domain? The key challenge is learning tokens at the appropriate level of abstraction.
Our new paper led by
@bhish_98
:
Tired of engineering language prompts for your favorite text2img model? What if you could generate images directly from image prompts?
Im-Promptu shows how you can learn to compose in-context from images; * no * language instructions required. Devil in the details 🧵 -->
Check out our
#ICML2021
long oral Thurs 7/22, where we apply causal analysis to the structure of RL algorithms to better understand transfer. w/
@SKaushi16236143
,
@svlevine
,
@cocosci_lab
Talk
Poster
Links (arxiv, youtube)👇
Come check out our virtual poster session/Q&A for "Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions" tomorrow (July 16) at either 9:00am or 8:00pm Pacific Time
👑Authoritarian speaker selection👑
Instead of having all agents bid to speak, we can also have a privileged agent direct who speaks when.
New example in
@LangChainAI
of how this can be done, with a fictitious Daily Show episode:
Both the process of knowledge creation and of programming rely on the cycle of conjecture and criticism, aka debugging. If AGI is the capacity to create new knowledge, then the artificial programmer is the drosophila of AGI. Very excited for
@cognition_labs
on taking this first step!
Today we're excited to introduce Devin, the first AI software engineer.
Devin is the new state-of-the-art on the SWE-Bench coding benchmark, has successfully passed practical engineering interviews from leading AI companies, and has even completed real jobs on Upwork.
Devin is
We will present our work next week at
#NeurIPS2022
on "Object Representations as Fixed Points" on Nov 29 4:30-6:00 Central Time, Hall J Poster 505:
w/
@svlevine
and
@cocosci_lab
Paper, website, and talk below:
"Object Representations as Fixed Points: Training Iterative Refinement Algorithms with Implicit Differentiation" is now out on arXiv!
w/ my advisors
@cocosci_lab
,
@svlevine
arxiv:
web:
youtube:
Sayan Gul was a wonderful friend to me. I always enjoyed my conversations with him, during which he made me feel how beautiful research was. Please consider donating to a new
@cogsci_soc
travel fund for undergraduates in honor of Sayan, who passed away on his way to
#cogsci2018
.
If you attend
@cogsci_soc
, please consider donating to a new travel fund for undergraduates in honor of Sayan Gul, who was a student in my former lab and who tragically passed away this year on his way to
#cogsci2018
.
There's huge demand for AI companions, as shown by
@stuffyokodraws
AI companion kit
We recently hosted a webinar (w/
@Metropolize_AI
,
@AkashSamant4
,
@mmmbchang
) on creating AI characters with LangChain - now on YouTube for some fun Friday viewing!
Excited to give an invited talk
@GRASPlab
tomorrow about neural software abstractions! The talk will cover the following papers:
-
-
-
-
and will feature my advisors & collaborators:
Join us TOMORROW for a GRASP SFI presentation by Michael Chang from
@UCBerkeley
who will be presenting "Neural Software Abstractions: Learning Abstractions for Automatically Modeling and Manipulating Systems".
For more info, please visit:
Presented our poster with Sjoerd van Steenkiste on "Relational Neural Expectation Maximization: Unsupervised Discovery of Objects and their Interactions" at
#ICLR2018
today.
“Relational Neural Expectation Maximization: Unsupervised Discovery of Objects and their Interactions” (;
) at
#ICLR2018
this Mon 11am-1pm, East Meeting level; 1,2,3
#13
. With Sjoerd van Steenkiste, Klaus Greff, Jürgen Schmidhuber.
Congrats
@mmmbchang
! So cool to see all the progress of Project Astra in the last months! Amazing to build these next gen Multimodal Models with such an awesome team!
I am so lucky to have spent the past couple months with the amazing
@LangChainAI
team. I had the opportunity to build on their incredible infrastructure to explore agent simulations () & data-driven characters. Thank you for the wonderful time, LangChain team!
6/
Yes we can: slot attention can be trained as a deep equilibrium model ()!
We can use any root-finding solver to find the fixed point for the forward pass, and any method to directly compute the implicit gradient in the backward pass.
The code for reproducing this visualization, as well as for reproducing the other figures in Section 4 of the BFN paper (), can be found here: . Please let me know if you find any bugs!
New work from Kevin Ellis is a breakthrough for learning to synthesize programs. DreamCoder learns and extends a DSL and uses it to solve new problems faster. Rediscovers human-like languages: physical laws, vector algebra & functional programming:
Come check out our work on "Modularity in RL via Algorithmic Independence" in
#ICLR2021
ws!
Generalization: 1pm PT:
Learning to learn: 8:40am PT: .
Grateful to my collaborators & advisors
@SKaushi16236143
,
@svlevine
,
@cocosci_lab
Can causality and algorithmic independence help RL transfer better? Tmrw,
@mmmbchang
will present "Modularity in RL via algorithmic independence" in
#ICLR2021
ws:
Generalization beyond... 1 pm PT:
Learning to learn 8:40 am PT:
Thrilled to share
#Lyria
, the world's most sophisticated AI music generation system. From just a text prompt Lyria produces compelling music & vocals. Also: building new Music AI tools for artists to amplify creativity in partnership w/ YT & the music industry
Check out “Doing more with less: meta-reasoning and meta-learning in humans and machines” w/ Tom Griffiths, Frederick Callaway,
@ermgrant
, Paul Krueger,
@FalkLieder
where we argue that computation and data constraints are intrinsic to building intelligence
12/
There's more work to do to understand how implicit differentiation affects object-centric models, but what is clear is that object-centric learning has potentially deep connections to other research areas (meta-learning, causality, fast-weights) that have yet to be explored.
11/
One interesting finding is that the attention masks for implicit SLATE appear to be more smeared out.
At first we couldn’t understand why, but folks at
@genintelligent
suggested that it could be learning to capture not only the objects but also their shadows.
3/
Inferring from observation a set of representations that are a priori symmetric and independent requires a method for breaking symmetry.
For many object-centric models, this is done through an iterative refinement process that is structurally similar to the EM algorithm.
5/
We empirically notice that slots of slot attention tend to remain generally stable.
Can we treat these iterative algorithms as fixed-point procedures?
If so, we can leverage recent advances in implicit differentiation techniques to stabilize training and improve learning.
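The fixed-point view can be sketched as follows; `refine` below is a toy contraction standing in for a slot-attention cell, not the actual architecture:

```python
import numpy as np

def refine(slots, inputs):
    """Toy refinement step standing in for a slot-attention cell: a
    contraction (Lipschitz constant 0.5) pulled toward a function of inputs."""
    return 0.5 * slots + 0.5 * np.tanh(inputs @ np.ones((2, 2)) / 4)

def fixed_point(f, z0, inputs, tol=1e-8, max_iters=200):
    """Iterate z <- f(z, inputs) until the update falls below tol."""
    z = z0
    for _ in range(max_iters):
        z_next = f(z, inputs)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z

inputs = np.array([[0.3, -0.1], [0.2, 0.5]])
z_star = fixed_point(refine, np.zeros((2, 2)), inputs)
# z_star is (numerically) a stable point: refine(z_star, inputs) ≈ z_star
```

Viewing the converged slots this way is what lets the backward pass use implicit differentiation instead of unrolling every refinement step.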
🦜LangChain 🤝Gemini♊️
Gemini API access is out! Access it through LangChain with our first standalone integration package:
`pip install langchain-google-genai`
We're also launching an integration guide showing how to:
🎏Stream results
🖼️Use its multimodal capabilities
Modularity is the capacity for components of a system to be independently modified. The most rigorous formalization of this we know comes from causality: modularity = algorithmic independence of mechanisms (Janzing &
@bschoelkopf
, 2010).
3/N
We are excited to have a range of interesting speakers at the
#ICMLmodeling
workshop () this Friday 6/14! Please submit your questions for our fantastic panel here: .
4/
The problem is that these iterative models have been difficult to train because they learn by backpropagating through an unrolled optimization procedure.
The Jacobian norm of the slot attention cell (in red) blows up during training, leading to poor optimization.
8/
One method is particularly simple: just truncate the backprop.
This requires only one extra line of code and has O(1) space and time complexity in the backward pass.
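For a scalar fixed point the trade-off is easy to see. Assuming f(z, x) = a·z + x (a toy map, chosen for illustration), the implicit function theorem gives the exact gradient, while truncation keeps only the last step's contribution:

```python
# Scalar illustration with f(z, x) = a * z + x, whose fixed point is
# z*(x) = x / (1 - a). The implicit function theorem gives
#     dz*/dx = (1 - df/dz)^(-1) * df/dx,
# while truncating backprop to the final step keeps only df/dx.
a = 0.5
df_dz, df_dx = a, 1.0

exact_grad = df_dx / (1.0 - df_dz)  # full implicit gradient: 2.0
truncated_grad = df_dx              # one-step, O(1) approximation: 1.0

# The truncated gradient is biased but cheap, and has the right sign
# whenever f is a contraction in z (|df/dz| < 1).
```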
See also , , ,
7/
It turns out that many different combinations of solvers and approximations for implementing implicit differentiation train much more stably than vanilla slot attention.
Our work (w/
@TomerUllman
, Torralba, and Tenenbaum) on the Neural Physics Engine is featured alongside Interaction Networks by
@PeterWBattaglia
,
@DeepMindAI
in
@sciencemagazine
as a step towards endowing intelligent agents with the same sense of intuitive physics as humans have:
For folks going to
#NeurIPS2023
,
@bhish_98
will present our latest work that shows how we can train models to make visual analogies through in-context visual prompting! Come check it out!
Excited to be presenting Im-Promptu at NeurIPS 2023! Catch us at Great Hall on Tue, 12/12, from 6:15 PM onwards. Will enjoy chatting about slot-centric methods and where this piece fits in the Generative AI puzzle 🧩
2/
Objects reflect two very general properties about the causal structure of the physical world.
The first is symmetry: the same physical laws apply to all objects.
The second is independence: objects can be locally acted upon without affecting other objects.
Multimodal error correction over code has been my dream for a while, and it's so wild to see
@mckaywrigley
demonstrate the beginnings of such capabilities with Gemini 1.5 Pro! Crazy to think that things like this will soon become the default that people expect from all models.
The future of fixing bugs?
Just record them.
I filmed 3 separate bugs in an app and gave the videos to Gemini 1.5 Pro with my entire codebase.
It correctly identified & fixed each one.
AI is improving insanely fast.
14/
Check out our posters at these
#iclr
workshops Apr 29:
Objects, Structure, Causality:
Gamification and Multiagent:
Deep Generative Models for Highly Structured Data:
Can we understand intrinsic motivation as an arms race between policies that minimize and maximize surprise?
Check out this work led by
@arnaudfickinger
,
@natashajaques
, and
@parajuli_samyak
at the Unsupervised Reinforcement Learning workshop at
#ICML2021
this Friday July 23!
Effective unsupervised reinforcement learning requires a balance between seeking novelty and familiarity. How can we build an algorithm that strikes this balance?
Paper:
Project Page:
Code:
9/
Using implicit differentiation for slot attention improves qualitative reconstructions, as can be seen from this comparison with the state-of-the-art SLATE architecture ()
@stevesi
"When you are thinking of the great ideas of history, often what you really want to be thinking about is the great representations that enabled people to think those ideas." -- Bret Victor (
@worrydream
)
Key idea:
By representing the computation of learning algorithms as one giant algorithmic causal graph, we show that to get independently modifiable components, we need a credit assignment mechanism whose causal structure makes independent modification possible.
6/N
Announcing Genmo Video, a generative media platform with a new text-to-video model that can generate immersive live artwork from any prompt or any image.
What will you create? 🎨▶️
Free public access:
Discord:
👇1/n
Excited to announce the Object Representations for Learning and Reasoning workshop at NeurIPS 2020! We will host child developmentalists, roboticists, and machine learning researchers on what objects are, what object representations should do, and what the challenges in applying them are.
A popular hypothesis in machine learning is that modularity could enable efficient transfer. But it is an open question how to determine whether a learning algorithm is modular. Without a formal definition of modularity for learning systems, we can’t test this hypothesis.
2/N
We can finally empirically test our hypothesis, and we find it survives the experimental test. Below, a modular on-policy RL algorithm (red) has higher transfer efficiency than its non-modular counterpart (blue). This trend appears consistent across many transfer topologies.
11/N
@tobias_rees
"Your first draft isn’t an unoriginal idea expressed clearly; it’s an original idea expressed poorly, and it is accompanied by your amorphous dissatisfaction, your awareness of the distance between what it says and what you want it to say."
Gamma-models are dynamics models without a fixed time step. Instead, gamma models predict discounted averages of future state visitations, allowing us to train "infinite horizon" models with TD.
w/
@michaeljanner
&
@IMordatch
->
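The quantity a gamma-model predicts has a simple closed form for a known Markov chain, which makes the idea concrete (the two-state chain below is a toy example, not from the paper):

```python
import numpy as np

# Toy two-state Markov chain; P[i, j] = prob of transitioning i -> j.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
gamma = 0.9

# A gamma-model predicts the discounted future state-visitation distribution
#     mu = (1 - gamma) * sum_t gamma^t * P^t,
# which has the closed form (1 - gamma) * (I - gamma * P)^(-1).
occupancy = (1 - gamma) * np.linalg.inv(np.eye(2) - gamma * P)

# Each row is a proper distribution over future states (rows sum to 1),
# with no single prediction horizon baked in.
```

Training such a model with TD amounts to learning this discounted occupancy directly, rather than composing many single-step predictions.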
8/ We show evidence that the local credit assignment mechanisms of our societal decision-making framework produce more efficient learning than the global credit assignment mechanisms of the monolithic framework.
See our paper w/
@SKaushi16236143
,
@svlevine
,
@cocosci_lab
, NeurIPS DeepRL workshop Dec 11: causal analysis on structure of an RL algorithm, towards formalizing modular transfer in RL
Poster:
Paper:
Video:
At deepRL WS,
@mmmbchang
will present “Modularity in Reinforcement Learning: An Algorithmic Causality Perspective on Credit Assignment” how causal models help us understand transfer in RL!
Poster:
Paper:
Vid:
10/
The quantitative metrics improve too: in terms of mean squared error, the implicit version has almost a 7x improvement over its vanilla counterpart.
Neural Bucket Brigade largely inspired our work "Decentralized RL: Global Decision Making via Local Economic Transactions", which showed that the Vickrey auction mechanism connects optimal local behavior to optimal emergent global behavior. Thread here:
Re: more biologically plausible "forward-only" deep learning. 1/3 of a century ago, my "neural economy" was local in space and time (backprop isn't). Competing neurons pay "weight substance" to neurons that activate them (Neural Bucket Brigade, 1989)
It would be nice if we could generalize this customized character chat experience to allow the user to chat with any character based on any corpus, with full control over the design and data used to create the characters.
Introducing data-driven-characters: