Medhini Narasimhan

@medhini_n

1,138
Followers
404
Following
20
Media
142
Statuses

Researcher @googledeepmind working on Veo. Prev: Ph.D. @berkeley_ai, MS @IllinoisCS. Intern @GoogleAI @MetaAI @Zillow

Berkeley, CA
Joined June 2009
Pinned Tweet
@medhini_n
Medhini Narasimhan
4 months
Feel privileged to be a part of the amazing team behind Veo!!
@GoogleDeepMind
Google DeepMind
4 months
Introducing Veo: our most capable generative video model. 🎥 It can create high-quality, 1080p clips that can go beyond 60 seconds. From photorealism to surrealism and animation, it can tackle a range of cinematic styles. 🧵 #GoogleIO
147
942
4K
0
3
36
@medhini_n
Medhini Narasimhan
1 year
After 4 wonderful years at Berkeley, I'm Phinally Done :) Huge thanks to everyone who supported me, especially my advisor @trevordarrell This week, I'm starting as a Research Scientist at Google Labs with Steve Seitz! Excited to continue my research on video understanding :)
13
3
219
@medhini_n
Medhini Narasimhan
4 months
👩‍🎓🐻
Tweet media one
Tweet media two
Tweet media three
13
1
127
@medhini_n
Medhini Narasimhan
3 years
Want to create short summaries of long videos? Check out our work, “CLIP-It! Language-Guided Video Summarization”. Joint work with Anna Rohrbach and @trevordarrell Paper: Project Page: Results:
8
23
112
@medhini_n
Medhini Narasimhan
3 years
Excited to share our work, “Strumming to the Beat: Audio-Conditioned Contrastive Video Textures” with Shiry Ginosar, @andrewhowens , Alyosha Efros, and @trevordarrell Website: Talk: Paper:
5
12
69
@medhini_n
Medhini Narasimhan
2 years
Excited to share our ECCV 2022 work “TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency” Joint work w @NagraniArsha @jesu9 @miki_rubinstein A. Rohrbach @trevordarrell @CordeliaSchmid Paper: Website:
5
5
62
@medhini_n
Medhini Narasimhan
3 years
Happy to share that CLIP-It will be presented at #NeurIPS2021 !
@medhini_n
Medhini Narasimhan
3 years
Want to create short summaries of long videos? Check out our work, “CLIP-It! Language-Guided Video Summarization”. Joint work with Anna Rohrbach and @trevordarrell Paper: Project Page: Results:
8
23
112
0
2
56
@medhini_n
Medhini Narasimhan
9 months
Thrilled to see our work, “Modular Visual Question Answering via Code Generation” featured in Google’s year-in-review! w/ @sanjayssub @kushaltk1248 Kevin Yang @NagraniArsha @CordeliaSchmid @andyzeng_ @trevordarrell Dan Klein
@demishassabis
Demis Hassabis
9 months
It's always great to look back on the year in a year-in-review blogpost with @JeffDean & James Manyika. It's been an amazingly productive year for us, doing awesome research, shipping products and advancing science - 2024 is going to be incredible!
38
140
948
0
5
51
@medhini_n
Medhini Narasimhan
2 years
Never fly @lufthansa !! They canceled my flight and haven't issued a refund yet. It's been 6 months and I haven't received a response from their customer relations team. How can airlines get away with this?!
14
6
42
@medhini_n
Medhini Narasimhan
1 year
🚨I'll be @ICCVConference next week, presenting my work at the @cveu_workshop on Oct 2nd! Excited to share two new works on learning from instructional videos. Please stop by for the talk/poster or reach out if you'd like to connect! Oral: 1:45 - 2:30 PM Poster: 12 - 1:45 PM
2
1
38
@medhini_n
Medhini Narasimhan
10 months
There couldn’t be a better time to join @GoogleDeepMind !! Excited for Monday :)
@GoogleDeepMind
Google DeepMind
10 months
We’re excited to announce 𝗚𝗲𝗺𝗶𝗻𝗶: @Google ’s largest and most capable AI model. Built to be natively multimodal, it can understand and operate across text, code, audio, image and video - and achieves state-of-the-art performance across many tasks. 🧵
171
2K
6K
0
1
31
@medhini_n
Medhini Narasimhan
4 months
Come try what we've been building :)
@Google
Google
4 months
🎥Introducing Veo, our new generative video model from @GoogleDeepMind . With just a text, image or video prompt, you can create and edit HQ videos over 60 seconds in different visual styles. Join the waitlist in Labs to try it out in our new experimental tool, VideoFX #GoogleIO
39
346
1K
1
0
28
@medhini_n
Medhini Narasimhan
3 years
@colorado_reed and I are starting an AI + Climate Change reading group to meet every two weeks, Tuesday at 5 pm. Here’s a website with more info: You can join the meeting announcement list there if you’re interested in attending!
2
4
21
@medhini_n
Medhini Narasimhan
2 years
We’re excited to announce the Berkeley AI Research Climate Initiative! The BCI aims to foster the development of fundamental AI research through directly working on impactful problems related to the most pressing issue of our time: climate change.
0
1
15
@medhini_n
Medhini Narasimhan
2 years
Will be in person @eccvconf presenting at the poster session on Tuesday, Oct 25th, 3:30-5:30 PM, Hall B poster 12! Please do stop by if you’d like to learn more about our work or just chat!
@medhini_n
Medhini Narasimhan
2 years
Excited to share our ECCV 2022 work “TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency” Joint work w @NagraniArsha @jesu9 @miki_rubinstein A. Rohrbach @trevordarrell @CordeliaSchmid Paper: Website:
5
5
62
0
0
14
@medhini_n
Medhini Narasimhan
1 year
My team @GoogleAI is hiring research interns doing multimodal research #GenAI ! Please DM if you're a PhD student and interested in working with us!
4
0
12
@medhini_n
Medhini Narasimhan
1 year
Here are links to my talk () and thesis () if you're interested!
2
0
9
@medhini_n
Medhini Narasimhan
3 years
Join our #AI4Climate reading group today @5PM PST to hear from @sarameghanbeery ! “Computer Vision for Global-Scale Biodiversity Monitoring - Scaling Geospatial and Taxonomic Coverage Using Contextual Clues” Where:  More info:
0
2
7
@medhini_n
Medhini Narasimhan
3 years
A generic video summary is an abridged version of a video that conveys the whole story and features the most important scenes. Yet, the importance of scenes in a video is often subjective, and users should have the option of customizing the summary by using language.
Tweet media one
1
0
6
@medhini_n
Medhini Narasimhan
3 years
Nice illustrations!
@araffin2
Antonin Raffin
3 years
Although not new (Siamese nets are from the 90s), Self-Supervised image representation Learning (SSL) is getting a lot of attention recently, as it gets closer to supervised learning performance. ⬇️ A thread trying to summarize recent advances and their challenges ⬇️
Tweet media one
5
170
604
0
1
6
@medhini_n
Medhini Narasimhan
3 years
We show that our model outperforms baselines on human perceptual scores, can handle diverse input videos, and can combine semantic and audiovisual cues in order to synthesize videos that synchronize well with an audio signal. Enjoy more examples here:
1
0
5
@medhini_n
Medhini Narasimhan
3 years
We draw on recent techniques from self-supervised learning to learn this distance metric, allowing us to compare frames in a manner that scales to more challenging dynamics, and to condition on other data, such as audio.
1
0
5
@medhini_n
Medhini Narasimhan
3 years
Our work is inspired by Video Textures, which showed that new videos could be generated from a single one by stitching its frames together in a novel yet consistent order. However, due to its use of hand-designed distance metrics, it was limited to simple, repetitive videos.
1
0
5
@medhini_n
Medhini Narasimhan
3 years
Existing models for generic summarization have not exploited available language models, which can serve as an effective prior for saliency. We propose a single framework for addressing both generic and query-focused video summarization, typically approached separately.
1
0
5
@medhini_n
Medhini Narasimhan
3 years
CLIP-It is a language-guided multimodal transformer that learns to score frames in a video based on their importance relative to one another and their correlation with a user-defined query (query-focused summ.) or an automatically generated dense video caption (generic summ.)
Tweet media one
1
0
4
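The scoring idea described in the tweet above can be illustrated with a toy sketch. This is not the authors' implementation: the embeddings are stand-ins for what CLIP-It's vision and language encoders would produce, and a real model uses a multimodal transformer rather than plain cosine similarity.

```python
import math

def softmax(xs):
    """Numerically stable softmax, so scores are relative across frames."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def score_frames(frame_embs, query_emb):
    """Score each frame by similarity to the language query (user-defined
    or an auto-generated dense caption), normalized over all frames."""
    sims = [cosine(f, query_emb) for f in frame_embs]
    return softmax(sims)

# Toy 2-d embeddings: frame 0 matches the query best.
frames = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
query = [1.0, 0.0]
scores = score_frames(frames, query)
assert scores[0] == max(scores)
```

A summary would then keep the top-scoring frames; the transformer in the actual model additionally contextualizes each frame's score against the others.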
@medhini_n
Medhini Narasimhan
3 years
We learn representations for video frames and frame-to-frame transition probabilities by fitting a video-specific model trained using contrastive learning. Our model also naturally extends to an audio-conditioned setting without requiring any finetuning.
1
0
4
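The transition-probability idea in the tweet above can be sketched as follows. This is a toy illustration, not the paper's model: the embeddings here are hand-written stand-ins for representations learned contrastively, and the temperature value is arbitrary.

```python
import math

def transition_probs(emb, i, temperature=0.1):
    """Probability of jumping from frame i to each other frame, as a
    temperature-scaled softmax over embedding similarities (dot products)."""
    sims = [sum(a * b for a, b in zip(emb[i], e)) for e in emb]
    sims[i] = float("-inf")  # disallow self-transition
    m = max(sims)
    es = [math.exp((s - m) / temperature) for s in sims]
    z = sum(es)
    return [e / z for e in es]

# Toy embeddings: frame 1 is the most natural successor of frame 0.
emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
probs = transition_probs(emb, 0)
```

Sampling from these probabilities at each step yields an endless, nonparametric video texture; conditioning on audio would reweight the similarities with audio features.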
@medhini_n
Medhini Narasimhan
1 year
@sarameghanbeery Love this! Excited to see some birds
0
0
3
@medhini_n
Medhini Narasimhan
3 years
The Transformer design enables effective contextualization across frames. We demonstrate the impact of language guidance on generic summarization. We establish the new state-of-the-art on both generic and query-focused datasets in supervised and unsupervised settings.
Tweet media one
0
0
4
@medhini_n
Medhini Narasimhan
3 years
We introduce a nonparametric approach for infinite video texture synthesis using a representation learned via contrastive learning.
Tweet media one
1
0
4
@medhini_n
Medhini Narasimhan
2 years
Tweet media one
0
0
4
@medhini_n
Medhini Narasimhan
3 years
@ICCV_2021
ICCV2021
3 years
PCs Update: Preparing for authors notification tomorrow. To avoid CMT traffic, will first publish list of accepted papers as customary. Expect this late afternoon PST. Will tweet! We will then publish status, meta-reviews and final reviews on CMT - platform warned about traffic!
2
10
115
0
0
3
@medhini_n
Medhini Narasimhan
4 years
@ilyasut
Ilya Sutskever
4 years
Synthetic capybaras in different styles
Tweet media one
12
91
667
0
1
3
@medhini_n
Medhini Narasimhan
5 years
@icra2020 Many countries/universities have imposed travel restrictions. Consulates are closed and we cannot apply for visas now even if these bans were to be lifted later. Please consider those who have health issues and don't want to travel. Could we get a refund on the registration?
0
0
2
@medhini_n
Medhini Narasimhan
2 years
We propose an instructional video summarization network that combines a context-aware temporal video encoder and a segment scoring transformer. Using pseudo summaries as weak supervision, our network constructs a visual summary for any instructional video using video and speech
Tweet media one
1
0
2
@medhini_n
Medhini Narasimhan
3 years
Hoping to see you at our first meeting on May 4th at 5PM!
0
0
2
@medhini_n
Medhini Narasimhan
2 years
Evaluation: We collect a high-quality test set, WikiHow Summaries, by scraping WikiHow articles that contain video demonstrations and visual depictions of steps.
Tweet media one
1
0
2
@medhini_n
Medhini Narasimhan
4 years
@pathak2206 @RonghangHu Great work and really cool visualizations!
0
0
2
@medhini_n
Medhini Narasimhan
3 years
We’re especially excited to have folks from the earth, atmospheric, and climate sciences join from Berkeley, so even if it’s not directly related to your research, it will be a nice chance to look for potential connections to your work. This is open to folks outside Berkeley too!
1
0
2
@medhini_n
Medhini Narasimhan
11 months
🫠
@CVPR
#CVPR2024
11 months
#CVPR2024 LLM policy Reviewer Guidelines:
Tweet media one
2
14
71
0
0
1
@medhini_n
Medhini Narasimhan
2 years
Anyone who has tried to follow a recipe from a YouTube video would agree that most videos contain irrelevant filler content! In this work, we introduce an approach for creating short visual summaries of instructional videos containing only the most important steps
Tweet media one
1
0
2
@medhini_n
Medhini Narasimhan
2 years
We outperform several baselines and a state-of-the-art video summarization model on this new benchmark!
Tweet media one
1
0
2
@medhini_n
Medhini Narasimhan
4 years
@xiaolonw Congratulations! :)
0
0
2
@medhini_n
Medhini Narasimhan
2 years
@colorado_reed I’m glad we got to work on BCI and thanks for all the fun discussions! Good luck on your journey :)
0
0
1
@medhini_n
Medhini Narasimhan
2 years
(i) relevant steps are likely to appear in multiple videos of the same task (Task Relevance), and (ii) they are more likely to be described by the demonstrator verbally (Cross-Modal Saliency)
Tweet media one
1
0
1
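The two assumptions above can be combined into a toy pseudo-labeling rule. This is a simplification for illustration, not the TL;DW pipeline: the step strings and the `min_videos` threshold are hypothetical, and the real method matches steps across videos and speech automatically rather than by exact string equality.

```python
from collections import Counter

def pseudo_label_steps(videos, speech_mentions, min_videos=2):
    """Keep a step if it appears in at least `min_videos` videos of the
    same task (Task Relevance) and is mentioned in the demonstrator's
    speech (Cross-Modal Saliency)."""
    counts = Counter(step for video in videos for step in video)
    return sorted(step for step, c in counts.items()
                  if c >= min_videos and step in speech_mentions)

# Three videos of the same task; only spoken, recurring steps survive.
videos = [["crack eggs", "whisk", "pour batter"],
          ["crack eggs", "whisk", "flip"],
          ["crack eggs", "pour batter", "flip"]]
speech = {"crack eggs", "whisk"}
steps = pseudo_label_steps(videos, speech)
assert steps == ["crack eggs", "whisk"]
```

These pseudo summaries then serve as the weak supervision for training the segment-scoring network.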
@medhini_n
Medhini Narasimhan
6 months
@Amey__Joshi Congratulations!!
0
0
1
@medhini_n
Medhini Narasimhan
4 years
Always a nice reminder after a deadline! 🙄
@DlCountdown
AI Conference DL Countdown
4 years
ECML-PKDD 2021: 15 days left! NeurIPS 2021 (abstract): 62 days left! NeurIPS 2021 (paper): 69 days left!
0
2
10
0
0
1
@medhini_n
Medhini Narasimhan
2 months
@_amirbar @TelAvivUni @berkeley_ai @AIatMeta @ylecun Congratulations Amir!! 🎉🎉 Will visit you in NYC :)
1
0
1
@medhini_n
Medhini Narasimhan
3 years
@ak92501 Thank you for sharing!
1
0
1
@medhini_n
Medhini Narasimhan
2 years
@lufthansa I've done both of these already... but I just messaged you again.
0
0
1
@medhini_n
Medhini Narasimhan
3 years
@trevordarrell Thanks so much for your support!
0
0
1
@medhini_n
Medhini Narasimhan
2 years
Data: Existing video summarization datasets rely on manual frame-level annotations, making them subjective and limited in size. To overcome this, we first automatically generate pseudo summaries for a corpus of instructional videos by exploiting two key assumptions
1
0
1