Prithviraj (Raj) Ammanabrolu

@rajammanabrolu

4,704 Followers
520 Following
324 Media
2,404 Statuses

Interactive & grounded AI, RL, NLP. Assistant Prof @UCSanDiego. Research Scientist @DbrxMosaicAI. Prev: @allen_ai, @GeorgiaTech

San Diego, CA
Joined April 2019
Pinned Tweet
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
6 months
The PEARLS Lab at @UCSD_CSE is now open for business! I'm recruiting Fall 24 PhD students in all things interactive and grounded AI, RL, and NLP!! Join us in the land of 🏖️ beach (🧋pearl tea included). Apply by Dec 20. Please help spread the word! More:
Tweet media one
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
11 months
Soon™, I'll be an Asst Prof @UCSanDiego @UCSD_CSE focusing on interactive & grounded AI, RL, NLP. I will also be a research scientist @MosaicML helping lead efforts to make tech like RLHF more accessible. Looking for PhD students & research eng/scientists to join me in ☀️SoCal🏖️
Tweet media one
Tweet media two
Tweet media three
75
41
548
8
65
242
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
4 years
Using GPT-3 instead of regex
@JamesFarmer87
James Farmer
4 years
Well this has made my day.
Tweet media one
173
6K
42K
2
44
456
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
I haven't been home in years. I stay up at night thinking of all the people I'll never see again. I'd like to have a home to go back to. All I can do is donate/RT so I'm boosting #CovidIndia posts that can help. If this bothers you, pls mute/unfollow. Don't send me DMs like this
Tweet media one
22
18
430
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
The secret to aligning LMs to human preferences is reinforcement learning. But Why&How is it used? Announcing 💻RL4LMs: library to train any @huggingface LM w/ RL 👾GRUE: benchmark of 6 NLP tasks+rewards 📈NLPO: new RL alg 4 LMs 🌐
6
115
417
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
If it doesn't work with seed 42, it'll never work.
8
8
254
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
4 years
I have a language modeling joke, but it's too dangerous to be released.
@criticalneuro
Ida Momennejad
4 years
I have a reinforcement learning joke, but not sure it's rewarding.
31
75
852
3
16
244
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Why do ML academics have such knee-jerk reactions to writing rules or engines to ground and control an ML system? "It won't work in the real world" is such an unsubstantiated argument. Have you ever actually put an ML system in production?? How do you think those work???
9
17
239
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
4 years
How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual Worlds. This is Q*BERT, an agent that explores using an intrinsic motivation to learn a knowledge graph of the world by asking questions. Paper: Code:
Tweet media one
3
51
230
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
Do embodied agents dream of 🤖pixelated sheep🐏? Meet DECKARD, an agent that "dreams" a world model hypothesizing how to achieve tasks via a LLM. Efficiently training more generally capable RL agents by grounding LMs with actions in a world! In #ICML2023
5
62
225
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
9 months
I spent my high school and half my undergrad like this (minus the sleep and serenity). The 16 hour workday grindset is a helluva drug. Took me years to recover. It's ineffective, inefficient, and will just plain leave you miserable
@airkatakana
Air Katakana
9 months
Meanwhile the first thing Berkeley CS undergrads are shown:
Tweet media one
68
174
2K
6
9
219
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
Bit of a life update. Starting this fall (after I defend), I'll be at @allen_ai doing interactive #NLProc things with @YejinChoinka @HannaHajishirzi and the rest of the amazing @ai2_mosaic team!! Much excite!!!
28
5
207
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
18 days
NeurIPS first-author paper to get into high school soon. This is a questionable move from NeurIPS. The number of emails I get from some of the high schools in Cali is insane. They're all from "top" high schools and most directly from parents who know the game.
@thegautamkamath
Gautam Kamath
18 days
NeurIPS 2024 will have a track for papers from high schoolers.
Tweet media one
79
91
593
6
12
203
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
4 years
Hearing all the sirens outside my window and trying to go back to writing papers just makes me think, "Man, what a bubble we live in."
2
9
195
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
ACL achievement unlocked: get both a "this short paper should've been long" and a "this long paper should've been short" review
5
5
182
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
Chaos in the Metaverse
Tweet media one
5
22
182
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
The first paper of my PhD from three years ago with @mark_riedl has 100 citations! Not much by today's ML/NLP standards perhaps but it means a lot to me especially because of how non "mainstream" the work is
Tweet media one
8
4
181
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 months
Finally! A very natural next step, I'm glad someone tried. Here's one more free paper idea: use NLPO instead of PPO to mask out next tokens during generation based on compiler syntax feedback to make exploration more efficient. V bullish on LLM+RL+CodeGen
Tweet media one
@_akhaliq
AK
3 months
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback paper page: The advancement of large language models (LLMs) has significantly propelled the field of code generation. Previous work integrated reinforcement learning…
Tweet media one
0
56
250
2
26
145
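A rough sketch of the token-masking idea in the tweet above, assuming a hypothetical boolean list `allowed` that marks which next tokens a syntax checker would accept. This is generic action masking over logits, not NLPO's actual masking procedure, and `masked_softmax` is an illustrative name rather than anything from RL4LMs or StepCoder.

```python
import math


def masked_softmax(logits, allowed):
    """Softmax over logits where disallowed positions get probability 0.

    `allowed` is a hypothetical per-token flag (e.g. from a syntax check).
    """
    if not any(allowed):
        raise ValueError("at least one token must be allowed")
    # Disallowed tokens are pushed to -inf so exp() maps them to exactly 0.
    masked = [x if ok else -math.inf for x, ok in zip(logits, allowed)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
print(probs)  # the middle token has probability 0.0
```

Sampling from `probs` keeps exploration inside the allowed set, which is the efficiency argument the tweet is making.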
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 month
Unsolicited advice for (academics) interested in Big Model capabilities/scaling research. Stay grounded in reality (industry) and write fewer papers. Honestly, very very few recent papers in the area actually matter
3
8
153
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
How can we get language based reinforcement learning agents to act in more altruistic and less harmful ways towards themselves and others? One way is to constrain their actions with social commonsense. New #NAACL2022 paper on social value alignment 🧵👇
Tweet media one
Tweet media two
5
28
151
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 month
In under 4 hours of release, the community has taken our model and made it run locally on an M2 laptop. Open source LFG!!!
@awnihannun
Awni Hannun
1 month
4-bit quantized DBRX runs nicely in MLX on an M2 Ultra. PR:
29
114
735
1
14
152
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
4 years
🚨New Paper Alert🚨 Having trouble keeping your (AI) dragon motivated? Same here. So we figured out how to teach it, interactively w/ RL & lang pretraining, to act consistently + talk naturally wrt its motivations when questing in a fantasy text game. 1/4
Tweet media one
3
29
146
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
Now accepted to #ICLR2023! Look forward to our talk on open source, efficient natural language RLHF algorithms in Kigali, Rwanda!!!
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
The secret to aligning LMs to human preferences is reinforcement learning. But Why&How is it used? Announcing 💻RL4LMs: library to train any @huggingface LM w/ RL 👾GRUE: benchmark of 6 NLP tasks+rewards 📈NLPO: new RL alg 4 LMs 🌐
6
115
417
4
21
142
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
A major use of conversational AI is looking for information but most unrealistically rely on the user to figure out how to ask exactly the right question. The new INSCIT benchmark evals how agents can take the initiative to guide users to the info they need
Tweet media one
5
25
137
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
Announcement time! I'm on the academic job market this cycle! Please reach out if I'm a good fit! I make trustworthy and safe AI agents that communicate with language, build world models, and learn from human and environmental feedback. More:
Tweet media one
3
36
133
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
7 months
Two #NeurIPS2023 spotlights accepted!! 1. Our work on how to improve the Human Feedback portion of RLHF to be more effective, a direction which I believe is the clear future of feedback learning. And ...
@zeqiuwu1
Ellen Wu
11 months
F in RLHF is overall preference, which conveys limited info🙁 We introduce Fine-Grained RLHF🚀and train LMs with explicit feedback like "sentence 1 is not factual", "sentence 2 is toxic" More effective & enables LM customization
Tweet media one
10
133
574
2
11
132
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Announcing the 1st Workshop on 🎨Creative AI Across Modalities🎶 at AAAI 2023! Come chat and learn about the latest in creative AI for Art, Music, Narrative, Poetry, Sciences and so much more from the entire community! 4-8 pg submissions due: Nov 4 More:
Tweet media one
1
34
129
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 months
If you're a student trying to do Big Model AI right now, my one piece of advice is to take and pay attention in both a Systems (covering GPUs) course and something covering Human Participant Study Design. These are basic prereqs for Every Thing Else.
5
4
131
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
Made it through my final 5 phase boss fight. Even got a title for my trouble. Hi, I'm Dr. Prithviraj, at your service.
@mark_riedl
Mark Riedl
3 years
. @rajammanabrolu kicks off his PhD dissertation defense
Tweet media one
2
0
51
24
0
126
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
My 280+ page magnum opus, my dissertation, has been submitted to the committee!! Only D(efense)-Day left 🤞
Tweet media one
2
3
124
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Check out our new easy to use, off policy reinforcement learning algorithm to selectively *un*learn *un*desirable behaviors in LMs without sacrificing other capabilities!
@_akhaliq
AK
2 years
Quark: Controllable Text Generation with Reinforced Unlearning abs: introduce Quantized Reward Konditioning (Quark), an algorithm for optimizing a reward function that quantifies an (un)wanted property, while not straying too far from the original model
Tweet media one
0
24
163
1
20
120
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Open letter to all game devs on here. In honor of mother's day, I want more video games where I can summon my amma to hit people being mean to me with a chappal. Thanks.
@OuterloopGames
Thirsty Suitors 💦🛹🔪 out now!
2 years
Call your amma!
32
353
2K
1
22
117
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
5 months
DPO is nice and easy to get running but I have yet to see it outperform an(y) online actor critic RL algo with large scale (noisyish) human feedback data. I've burned too many GPU hours. No exploration or reward means it's not well suited to initial RLHF training on real data
12
11
118
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
Massive updates to RL4LMs, library + paper! Early Christmas! 🗣️New task: chitchat dialogue 🧑‍💻Human preference collection UIs released. 🔁Continual deployment expts: how to budget expert demos vs feedback, feedback formats, reward training, and more!
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
The secret to aligning LMs to human preferences is reinforcement learning. But Why&How is it used? Announcing 💻RL4LMs: library to train any @huggingface LM w/ RL 👾GRUE: benchmark of 6 NLP tasks+rewards 📈NLPO: new RL alg 4 LMs 🌐
6
115
417
1
25
117
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
4 years
Announcing "Wordplay: When Language Meets Games" at #NeurIPS2020, your one stop workshop for all things interactive (narrative + language learning + AI). Now with an amazing set of speakers and organizers spanning all these fields!
Tweet media one
2
29
116
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
From the GPT4 tech report: "This report contains no further details about the architecture (including model size), ... dataset ... training method, or similar." It's a product. Not science. That's fine. I better not see *any* ACL prompting papers on it.
3
12
114
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Was just told my approach to mentoring students "is not scalable". Why are we scaling again??? They're people not LLMs????
4
5
111
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
5 months
Do people see why we shouldn't allow all of AI development to be closed source in the hands of like three companies yet?
1
12
108
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 months
For a small fee, I will attend your enemy's job talk and ask "isn't this just fancy prompt engineering on GPT-4?"
@analisapackham
Analisa Packham
3 months
For a small fee, I will attend your enemy’s job talk and ask “how is this economics?”
37
152
2K
4
3
108
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 months
Only a year ago that our paper on when and how to use RL in NLP was accepted to ICLR-23. We're now at 100 citations and 2k GitHub stars! Less about numbers and just how excited I am that so many people are working on RL for NLP! Only a few years ago this was unimaginable!
Tweet media one
Tweet media two
4
9
109
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
Have you met the entire nation of India? People who actually know how to make vegetarian food?
@ezraklein
Ezra Klein
3 years
There is no doubt that being veg is less delicious. People who argue otherwise are kidding themselves. But a lot of that is because there are fewer options on menus, so much less money driving creativity. The more plant-based eaters and chefs there are, the tastier it'll get.
594
62
2K
3
6
107
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
7 months
Starting to sink in that I'll never need a winter jacket again 🥹
Tweet media one
6
0
106
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
The power of powers of 2!! We noticed this while building encoder decoder models like T5 into our RL4LMs open source RLHF toolkit, just snapping vocab size to the nearest power of 2 significantly improves run times!!
@karpathy
Andrej Karpathy
1 year
The most dramatic optimization to nanoGPT so far (~25% speedup) is to simply increase vocab size from 50257 to 50304 (nearest multiple of 64). This calculates added useless dimensions but goes down a different kernel path with much higher occupancy. Careful with your Powers of 2.
86
360
5K
2
7
105
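The arithmetic behind the two tweets above can be checked with a short sketch. The helper names are illustrative assumptions, not taken from nanoGPT or RL4LMs; the only figures from the source are 50257, 50304, and the multiple of 64.

```python
def round_up_to_multiple(n: int, multiple: int) -> int:
    """Smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


padded = round_up_to_multiple(50257, 64)
print(padded)          # 50304, the figure quoted in the tweet
print(padded - 50257)  # 47 unused padding rows
```

Rounding up to a multiple of 64 adds only 47 rows here, whereas `next_power_of_two(50257)` gives 65536, so a true power of 2 can cost far more padding than a multiple of 64.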
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 month
I'm writing this cause I'm a bit salty. We've implemented so many seemingly promising, published & popular papers only for them to utterly flop. At least I like to think that my personal bs Big Model paper classifier is now pretty good given my extensive training data.
4
1
101
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Basically every year but hits harder this time #EMNLP2021 😭
Tweet media one
1
8
100
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 month
The sacrifices to the twin gods of compute and crowdworking worked! In under 3 months we built the best commercially viable open weight LLM. We're committed to opening up AI research again by giving *you* the result of our efforts! We're just getting started
@jefrankle
Jonathan Frankle
1 month
Meet DBRX, a new sota open llm from @databricks . It's a 132B MoE with 36B active params trained from scratch on 12T tokens. It sets a new bar on all the standard benchmarks, and - as an MoE - inference is blazingly fast. Simply put, it's the model your data has been waiting for.
Tweet media one
34
269
1K
7
9
96
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
4 line review I got today - Line 1: "Method lacks novelty." Line 4: "Actually if I think about it, this is very novel. I see no issues." Reject. How do you even respond to this? T_T
5
4
95
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
7 months
An AI dungeon master that delivers actually engaging experiences you say? We're already on it! Just hit up @peizNLP and @zhuexe
Tweet media one
Tweet media two
@willknight
Will Knight
7 months
Hmm a dungeon master that won’t talk about weapons is not ideal
Tweet media one
15
17
255
3
11
90
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
RL4LMs has 500 stars on GitHub! Thanks for the support for your one stop shop for all things RLHF! 3000+ expts over 7 NLP tasks, 4 RL algos, any Huggingface generative LM, 20+ metrics, human preference collection UIs, continual deployment, and more!
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
The secret to aligning LMs to human preferences is reinforcement learning. But Why&How is it used? Announcing 💻RL4LMs: library to train any @huggingface LM w/ RL 👾GRUE: benchmark of 6 NLP tasks+rewards 📈NLPO: new RL alg 4 LMs 🌐
6
115
417
2
25
91
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
9 months
A common misconception is that I'm an AI researcher. Actually, my job... is just beach. I am an Assistant Professor of Beach. Soon in San Diego! 🤙
Tweet media one
6
0
91
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
I'm quite tired of industry papers that don't have any released data/models/code + not enough implementation details making them impossible to reproduce. The blanket reason being "Company IP". Just don't publish then.
2
6
90
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
Introducing the JerichoWorld Dataset! Designed to measure textual world modeling agents' situated knowledge representation and commonsense reasoning skills. Thousands of autoannotated (text→knowledge graph+actions) pairs across dozens of text games.
Tweet media one
Tweet media two
4
29
88
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
4 months
This week for Christmas I got two grant proposals 🎉rejected🎉 cause they are "5+ year moonshots that are not worth wasting resources on" 🥰 New to the whole professing thing, can I eventually send them the paper with the caption "here's your 5+ year moonshot, took us 1 🚀" ?
5
0
88
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 months
PhD visit days at @ucsd_cse covering all the essentials!!
Tweet media one
2
0
84
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 months
Someone understands my pain. The root of suffering is tokenization. A lot of the things people point out as "LLMs can't do X" are actually tokenizer issues. This becomes really obvious really fast once you spend time low level and see how messed up all forms of encodings are
@karpathy
Andrej Karpathy
2 months
We will see that a lot of weird behaviors and problems of LLMs actually trace back to tokenization. We'll go through a number of these issues, discuss why tokenization is at fault, and why someone out there ideally finds a way to delete this stage entirely.
Tweet media one
59
280
3K
1
7
83
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
In other news, I've finally moved to Seattle and today is my first day working at @allen_ai !!
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
Bit of a life update. Starting this fall (after I defend), I'll be at @allen_ai doing interactive #NLProc things with @YejinChoinka @HannaHajishirzi and the rest of the amazing @ai2_mosaic team!! Much excite!!!
28
5
207
3
0
83
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
Finally got around to updating my website. It's now a lot more ✨me✨
Tweet media one
5
2
83
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
5 months
I'm attending #NeurIPS2023 !! Presenting two spotlights and recruiting PhD students for my PEARLS lab at @ucsd_cse & research engineers/scientists @MosaicML / @databricks !! A heavy focus on LLMs+RL(HF) & Embodied NLP. Email me! 🧋 🌐
Tweet media one
Tweet media two
Tweet media three
1
19
82
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 months
We need to unify rules that jobs should only consider your top 3 papers. Otherwise, job hunting (PhD) students have too much pressure to publish lots (even if faculty don't)
@arankomatsuzaki
Aran Komatsuzaki
3 months
I hate when arXiv casually releases like 500 papers almost every day. My energy runs out around when I scan 350 papers 😅
26
4
149
8
5
81
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
11 months
Our latest work answering the question: Do you really need the RL in RLHF? Yes! You really do. But it requires work on improving the HF portion to go from very sparse pairwise preferences to something more informative and fine grained. Let's build better rewards!!
@zeqiuwu1
Ellen Wu
11 months
F in RLHF is overall preference, which conveys limited info🙁 We introduce Fine-Grained RLHF🚀and train LMs with explicit feedback like "sentence 1 is not factual", "sentence 2 is toxic" More effective & enables LM customization
Tweet media one
10
133
574
6
14
81
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
Our paper on Multimodal RL(AI)F is now accepted to #CVPR2023 especially thanks to @YoungjaeYu3 and @JiwanChung . Tune your language models to understand multimodal inputs with RL while keeping their zero shot language abilities intact!!
@_akhaliq
AK
2 years
Multimodal Knowledge Alignment with Reinforcement Learning abs:
Tweet media one
1
24
96
0
9
77
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
The Wordplay: When Language Meets Games workshop is back y'all!!! 3rd edition will be held at #NAACL2022 in Seattle (+hybrid virtual). Your one stop shop for all things interactive language learning, narrative, and more!! Much excite!!! More updates soon:
Tweet media one
1
23
75
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
5 months
If one more person asks me PPO vs DPO I swear I'm gonna blow a gasket. The answer (like everything else) is that it depends on your data
9
0
74
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
11 months
I missed multiple NeurIPS early in my PhD cause I couldn't get a Canadian visa. I know many who can't do CVPR or ACL this year. Statements on "fostering inclusivity" are just theatre unless conference locations are moved outside Canada/US
@val_iisc
Vision and AI Lab, IISc
11 months
Indian PhD students from @iiscbangalore , who have first-authored papers at prestigious conferences like @CVPR , are facing unjust denials of Canadian visas. With Shocking reasons "limited employment possibilities in India" and "purpose of visit not consistent with a temp. stay."
69
422
2K
2
8
72
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
5 months
Forget MMLU, AGI is only achieved when it can fully finish my visa applications
3
0
71
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
Me trying to mute all the AI hypefluencers on my feed
3
8
69
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
New #SIGDIAL2021 paper on #DnD style storytelling through multi-user dialogue! Predicting relationship types between characters via sentiment while learning to talk helps #AI models be better DnD players!! Paper: 🧵👇1/3
Tweet media one
Tweet media two
2
18
67
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 months
PSA for PhD applicants: US School offers will start going out very soon. Exploding/short deadlines are NOT a thing, you have until April 15 to make a decision. Top schools will have visit days in March. Go to those, talk to ppl, make an informed choice
@natolambert
Nathan Lambert
3 months
WTF! I've heard multiple accounts that "exploding offers" on the 1-2 week timescale are now a regular occurrence in AI Ph.D. application process. Not okay. If you're not in the application cycle and hear about this, speak up! Honestly, I'll help coach people on this negotiation.
15
2
71
3
6
67
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Two #NeurIPS2022 accepted papers!! Bless the ACs!! See y'all in New Orleans and let's chat interaction, language, grounding, and reinforcement learning!! 1/2 HEX-RL Explainable RL in Natural Language using Knowledge Graphs! Led by the amazing @beckypeng6 !
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
🚨Preprint Alert🚨 "Inherently Explainable RL in Natural Language" The Hierarchically Explainable (HEX) RL agent that thinks out loud to tell us why decisions are made by pointing to the facts in its internal state that most influence its actions. Paper:
Tweet media one
1
10
64
5
10
65
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
10 months
The Dungeon Meowsters are live in Toronto for #ACL2023 to talk all things: #DnD , theory of mind, multi agent grounded dialogue, reinforcement learning, table top games, and more!! Catch @peizNLP at 4 pm today at Session 8!!
Tweet media one
@peizNLP
Pei Zhou
1 year
📍Introducing an AI Dungeon Master’s Guide🧙‍♂️, or how to make a #DnD DM dialogue agent trained with intents and theory of mind-inspired💭reinforcement learning. Predicting how your players will react to you ahead of time makes for a better DM! 📃
Tweet media one
3
32
114
1
8
65
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 months
Advice from senior faculty in my dept: "Got a grant rejected eh, fck it. Go beach. Paper rejected? Same thing. Beach, then try again." 🥺😭
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
4 months
This week for Christmas I got two grant proposals 🎉rejected🎉 cause they are "5+ year moonshots that are not worth wasting resources on" 🥰 New to the whole professing thing, can I eventually send them the paper with the caption "here's your 5+ year moonshot, took us 1 🚀" ?
5
0
88
7
1
65
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
How is it that a relatively early stage startup has a 4000 A100 GPU cluster seemingly effortlessly when the best funded academic institutes struggle to pay for a small fraction of that?
9
1
64
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
🚨Preprint Alert🚨 "Inherently Explainable RL in Natural Language" The Hierarchically Explainable (HEX) RL agent that thinks out loud to tell us why decisions are made by pointing to the facts in its internal state that most influence its actions. Paper:
Tweet media one
1
10
64
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
7 months
Following time honored academic traditions: I'm happy to announce that I have taken the profile pic that will stay on my website until I ascend to full professor (at least).
@ucsd_cse
UCSD CSE
7 months
A warm welcome to an impressive cohort of new faculty members who will be joining CSE over the next two academic years 👏🎉. @creamyoki and @qip_liu started this Fall. In Fall 2024, we welcome @rajammanabrolu , @Lianhuiq , @AlexTamkin , and @_kumarde !
Tweet media one
0
9
65
1
1
63
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 months
The boba tea shops in SD do not mess around, "light of my life, my sin, my soul" is so extra 🤣
Tweet media one
8
2
63
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Two really cool AI classes I found out about recently: 1. Interactive fiction and AI run by @ccb , @LangTechLara at UPenn 2. World Models and Intelligence run by @Matsuo_Lab , @shaneguML , @shade_tree2112 (and many others!) at UTokyo
2
11
63
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
#AAAI2021 Come chat about C2PO: the causal, commonsense plot ordering storyteller and *how* ppl think about causality using commonsense expectations in stories. Sat. 2/6 8:45-10:30, 4:45-6:30 PST. Paper: Site: w/ EILab, @mark_riedl
Tweet media one
2
17
61
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
"Your English is pretty good, where are you 𝙧𝙚𝙖𝙡𝙡𝙮 from?"
@pinkkatydid
The Linguist Formerly Known As Yate
2 years
Tweet media one
10
311
1K
2
15
60
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 months
The correct answer to "what online RL algo should you use" has always been and will always be "whatever you know how to tune the hyperparameters for best"
Tweet media one
@aahmadian_
Arash Ahmadian
2 months
PPO has been cemented as the de facto RL algorithm for RLHF. But… is this reputation + complexity merited?🤔 Our new work revisits PPO from first principles🔎 📜 w @chriscremer_ @mgalle @mziizm @KreutzerJulia Olivier Pietquin @ahmetustun89 @sarahookr
Tweet media one
13
95
474
2
3
60
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
“You're in an open field to the west of a white House. There's a mailbox here.” First scene of Zork1 materialized thanks to the new #dalle by @jmhessel . Automated text game -> visual novel pipeline when??
Tweet media one
4
10
60
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Can confirm, @YejinChoinka definitely favors exploration over exploitation and encourages others (me at least for sure) to also "be adventurous and live like a game character"!! A very well deserved award!
@uwcse
Allen School
2 years
#UWAllen @uwnlp 's @YejinChoinka aims to develop #AI with the ability to reason and communicate about the world in physical and abstract terms, like humans can do. As a 2022 #MacFellow , she looks forward to taking the “adventurous route” in her research:
Tweet media one
9
66
390
1
2
58
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
10 months
Embodied agents and pixelated sheep 🐑 To be presented at #ICML2023 next Thursday 27th in Hawaii by @kolbytn and me! Come chat with us about language grounding, embodied AI, world models, RL+NLP and more!!
Tweet media one
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
Do embodied agents dream of 🤖pixelated sheep🐏? Meet DECKARD, an agent that "dreams" a world model hypothesizing how to achieve tasks via a LLM. Efficiently training more generally capable RL agents by grounding LMs with actions in a world! In #ICML2023
5
62
225
2
11
58
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Our new work on language agents that augment their action space with symbolic modules! Basically, don't teach your LM to be a calculator when it can just use an existing one instead. A step towards Neurosymbolic LM tool use for math, navigation, and more!
@peterjansen_ai
Peter Jansen
2 years
Transformers are robust reasoners, but frustratingly lack the ability for accurate math, navigation, & other easily coded tasks. In our new work "Behavior Cloned Transformers are Neurosymbolic Reasoners", we show you can have the best of both worlds. 1/3
Tweet media one
5
50
271
0
7
56
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 years
Join us! I'm looking for interns next summer @allen_ai to work in the areas of RL for NLP to learn from human feedback and also grounding language in envs like text games, Minecraft, NetHack, etc. for open ended RL agents
@ai2_mosaic
MOSAIC
2 years
📢📢 Looking for a Summer 2023 research internship? Apply to the Mosaic team @allen_ai !! 📢📢 topics include: commonsense, language generation, vision+language, RL, + more! Applications due Nov 13th!
1
43
165
3
8
55
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
6 months
So you burned a lot of money and trained a really good RLHF model for your existing users' preferences. Now, a new user comes along with very different preferences? How do you scale effectively to new RLHF use cases without wasting everything? New paper!
@jang_yoel
Joel Jang
6 months
🎯 Tired of one-size-fits-all AI chatter? ChatGPT tends to generate verbose & overly informative responses. This is because the current RLHF pipeline only allows aligning LLMs to the general preferences of the population. However, in the real world, people may have multiple,…
Tweet media one
2
67
299
0
6
53
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
5 months
Evaling LLMs is hard but the interesting thing is that AI/ML people seem weirdly determined to get rid of humans entirely in the eval and RLHF processes. Pls ground your metrics to something real pls thanks
@maxisawesome538
Max ⛅
5 months
pls i need respite
Tweet media one
5
15
218
2
3
54
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
18 days
I'm mostly just worried that they're the only type who will be able to submit to this. Esp cause "Highschool paper" will definitely get used as a metric. Also, seriously, let the kids go touch sand, plenty of time to be in a lab later
0
1
53
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
7 months
Current industry landscape in the fight for open source AI (personal opinion)
Tweet media one
@martin_casado
martin_casado
7 months
I feel like we’ve all been pulled into a fucked up alternate timeline … I can’t believe we have to fight again for open source …
59
101
1K
2
7
53
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
7 months
I regret nothing.
Tweet media one
@Mahoukarp
Norin 🦈 🛒 Open
8 months
i have a lot of plushies
22
2K
7K
3
0
53
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
The Worldformer will now appear at #NeurIPS2021 at the main track alongside the JerichoWorld benchmark in the benchmarks track. Get ready for a NeurIPS where I talk about world models not once but twice!!! 🎉🎉 Wformer: Benchmark:
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
Parte the second thread, as promised: Here's the Worldformer: a sota text game world model that multi-task learns to generate all possible lang actions and the *difference* between world states as a knowledge graph, using it to learn env dynamics! Paper:
3
8
35
2
7
52
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 month
Pretty funny when you get a paper review saying "method won't be of practical value" when it's been deployed in production serving millions in industry for a couple months already 🤡
2
0
51
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
When I, a vegetarian, go out to eat with my friends
Tweet media one
1
3
52
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
5 months
(Most) Academic Labs are sleeping on selling lab merch to keep themselves funded. The PEARLS Lab is not! (Innovating funding schemes cause making merch is fun+easy!! And NSF is... not) 🌐
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
6 months
The PEARLS Lab at @UCSD_CSE is now open for business! I'm recruiting Fall 24 PhD students in all things interactive and grounded AI, RL, and NLP!! Join us in the land of 🏖️ beach (🧋pearl tea included). Apply by Dec 20. Please help spread the word! More:
8
65
242
2
2
52
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
Now in #ACL2023 !! Look forward to @peizNLP 's presentation! See y'all in Toronto and let's chat #DnD dialogue, theory of mind, and all things interactive NLP!! Camera ready soon!
@peizNLP
Pei Zhou
1 year
📍Introducing an AI Dungeon Master’s Guide🧙‍♂️, or how to make a #DnD DM dialogue agent trained with intents and theory of mind-inspired💭reinforcement learning. Predicting how your players will react to you ahead of time makes for a better DM! 📃
3
32
114
0
11
51
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
2 months
lol. lmao even.
Tweet media one
@lmsysorg
lmsys.org
2 months
@HeinrichKuttler Sorry for the mistake. We recognize the issue and are indeed pushing a v1.1 to fix them. Originally the errors in reference answer were left intentionally as we wanted to demonstrate the limitation of gpt-4 judge in the paper. However, since MT-Bench has become widely used, those…
1
2
11
2
3
49
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
Yes! Language as an interface!! Conversational information search works best when LMs are grounded in an underlying info source. See our recent TACL paper led by @zeqiuwu1 for more on this idea
@fchollet
François Chollet
1 year
I'm pretty optimistic that the LLM reliability / factualness issue can be fixed. The key is to use LLMs as a dialog interface and not as a store of knowledge. LLMs as the query layer between a human user and a knowledge graph with sources (which can be hybrid generated/curated).
57
131
1K
1
7
49
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
Now accepted to #NAACL2021 !! Fairly impressed by the thoroughness of the reviews, camera ready coming soon!!
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
4 years
🚨New Paper Alert🚨 Having trouble keeping your (AI) dragon motivated? Same here. So we figured out how to teach it, interactively w/ RL & lang pretraining, to act consistently + talk naturally wrt its motivations when questing in a fantasy text game. 1/4
3
29
146
1
3
49
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 month
Academics will get fully priced out of Big Model Sota research v soon. Even fine tuning won't be possible for the best OSS models. The best funded unis have 1000 times less compute than Big Tech
@jackclarkSF
Jack Clark
1 month
Facebook is targeting 350,000 H100s by end of this year.
10
23
237
7
5
48
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
1 year
@timo_schick @LukeZettlemoyer Very interesting work! You might be interested in our very related work on LMs that use tools in interactive settings, to be presented at EACL this year.
@peterjansen_ai
Peter Jansen
2 years
Transformers are robust reasoners, but frustratingly lack the ability for accurate math, navigation, & other easily coded tasks. In our new work "Behavior Cloned Transformers are Neurosymbolic Reasoners", we show you can have the best of both worlds. 1/3
5
50
271
0
3
46
@rajammanabrolu
Prithviraj (Raj) Ammanabrolu
3 years
This is fascinating but also worrying for deep RL in general in some ways. If agents can be observation permutation invariant, what can we even really claim about how they learn env dynamics/semantics?
@hardmaru
hardmaru
3 years
The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning We explore RL agents that still work even when their observations get shuffled around a lot! A fun paper w/ @yujin_tang web pdf
28
240
1K
4
7
47