Introducing Veo: our most capable generative video model. 🎥
It can create high-quality 1080p clips that go beyond 60 seconds.
From photorealism to surrealism and animation, it can tackle a range of cinematic styles. 🧵
#GoogleIO
After 4 wonderful years at Berkeley, I'm Phinally Done :) Huge thanks to everyone who supported me, especially my advisor
@trevordarrell
This week, I'm starting as a Research Scientist at Google Labs with Steve Seitz! Excited to continue my research on video understanding :)
Want to create short summaries of long videos? Check out our work, “CLIP-It! Language-Guided Video Summarization”. Joint work with Anna Rohrbach and
@trevordarrell
Paper:
Project Page:
Results:
Excited to share our work, “Strumming to the Beat: Audio-Conditioned Contrastive Video Textures” with Shiry Ginosar,
@andrewhowens
, Alyosha Efros, and
@trevordarrell
Website:
Talk:
Paper:
It's always great to look back on the year in a year-in-review blogpost with
@JeffDean
& James Manyika. It's been an amazingly productive year for us, doing awesome research, shipping products and advancing science - 2024 is going to be incredible!
Never fly
@lufthansa
!! They canceled my flight and haven't issued a refund yet. It's been 6 months and I haven't received a response from their customer relations team. How can airlines get away with this?!
🚨I'll be
@ICCVConference
next week, presenting my work at the
@cveu_workshop
on Oct 2nd! Excited to share two new works on learning from instructional videos. Please stop by for the talk/poster or reach out if you'd like to connect!
Oral: 1:45 - 2:30 PM
Poster: 12 - 1:45 PM
We’re excited to announce 𝗚𝗲𝗺𝗶𝗻𝗶:
@Google
’s largest and most capable AI model.
Built to be natively multimodal, it can understand and operate across text, code, audio, image and video - and achieves state-of-the-art performance across many tasks. 🧵
🎥Introducing Veo, our new generative video model from
@GoogleDeepMind
.
With just a text, image, or video prompt, you can create and edit high-quality videos over 60 seconds long in a range of visual styles. Join the waitlist in Labs to try it out in our new experimental tool, VideoFX.
#GoogleIO
@colorado_reed
and I are starting an AI + Climate Change reading group to meet every two weeks, Tuesday at 5 pm. Here’s a website with more info: You can join the meeting announcement list there if you’re interested in attending!
We’re excited to announce the Berkeley AI Research Climate Initiative! The BCI aims to foster fundamental AI research by working directly on impactful problems related to the most pressing issue of our time: climate change.
Will be in person
@eccvconf
presenting at the poster session on Tuesday, Oct 25th, 3:30-5:30 PM, Hall B poster 12! Please do stop by if you’d like to learn more about our work or just chat!
Join our
#AI4Climate
reading group today
@5PM
PST to hear from
@sarameghanbeery
!
“Computer Vision for Global-Scale Biodiversity Monitoring - Scaling Geospatial and Taxonomic Coverage Using Contextual Clues”
Where:
More info:
A generic video summary is an abridged version of a video that conveys the whole story and features the most important scenes. Yet, the importance of scenes in a video is often subjective, and users should have the option of customizing the summary by using language.
Although not new (Siamese nets are from the '90s), self-supervised image representation learning (SSL) has been getting a lot of attention recently, as it gets closer to supervised-learning performance.
⬇️ A thread trying to summarize recent advances and their challenges ⬇️
We show that our model outperforms baselines on human perceptual scores, can handle diverse input videos, and can combine semantic and audiovisual cues in order to synthesize videos that synchronize well with an audio signal.
Enjoy more examples here:
We draw on recent techniques from self-supervised learning to learn this distance metric, allowing us to compare frames in a manner that scales to more challenging dynamics, and to condition on other data, such as audio.
Our work is inspired by Video Textures, which showed that new videos could be generated from a single one by stitching its frames together in a novel yet consistent order. However, due to its use of hand-designed distance metrics, it was limited to simple, repetitive videos.
Existing models for generic summarization have not exploited available language models, which can serve as an effective prior for saliency. We propose a single framework for addressing both generic and query-focused video summarization, which are typically approached separately.
CLIP-It is a language-guided multimodal transformer that learns to score frames in a video based on their importance relative to one another and their correlation with a user-defined query (query-focused summarization) or an automatically generated dense video caption (generic summarization).
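For intuition, here is a minimal sketch of how such language-conditioned frame scoring could be wired up. This is an illustration, not the released CLIP-It code: the module sizes, the cross-attention design, and the use of CLIP-style embeddings as inputs are all assumptions.

```python
# Hypothetical sketch of language-guided frame scoring (not the official CLIP-It code).
import torch
import torch.nn as nn

class LanguageGuidedFrameScorer(nn.Module):
    """Scores video frames conditioned on a text query (or dense captions)."""

    def __init__(self, dim=512, num_heads=8, num_layers=2):
        super().__init__()
        # Cross-attention: frame features attend to the language embedding.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Self-attention over frames to score them relative to one another.
        encoder_layer = nn.TransformerEncoderLayer(dim, num_heads, batch_first=True)
        self.frame_encoder = nn.TransformerEncoder(encoder_layer, num_layers)
        self.score_head = nn.Linear(dim, 1)

    def forward(self, frame_feats, text_feats):
        # frame_feats: (B, T, D) image embeddings, one per frame
        # text_feats:  (B, L, D) text embeddings of the query / captions
        fused, _ = self.cross_attn(frame_feats, text_feats, text_feats)
        contextualized = self.frame_encoder(fused)
        return self.score_head(contextualized).squeeze(-1)  # (B, T) importance scores

# Toy usage with random features standing in for real CLIP embeddings.
scorer = LanguageGuidedFrameScorer()
frames = torch.randn(1, 120, 512)   # 120 frames
query = torch.randn(1, 16, 512)     # tokenized query embedding
scores = scorer(frames, query)
summary_idx = scores.topk(k=15, dim=1).indices  # keep the 15 highest-scoring frames
```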
We learn representations for video frames and frame-to-frame transition probabilities by fitting a video-specific model trained using contrastive learning. Our model also naturally extends to an audio-conditioned setting without requiring any finetuning.
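Read one way, a contrastive (InfoNCE-style) objective pulls each frame's embedding toward its true successor, and a softmax over pairwise similarities then gives frame-to-frame transition probabilities. A rough sketch under those assumptions, not the paper's implementation:

```python
# Rough sketch of contrastive frame embeddings and transition probabilities
# (an illustration of the idea, not the paper's code).
import torch
import torch.nn.functional as F

def contrastive_transition_loss(frame_emb, temperature=0.1):
    """InfoNCE-style loss: frame t should be most similar to frame t+1.

    frame_emb: (T, D) embeddings of consecutive frames from one video.
    """
    z = F.normalize(frame_emb, dim=-1)
    sim = z[:-1] @ z[1:].T / temperature     # (T-1, T-1) pairwise similarities
    targets = torch.arange(sim.size(0))      # positive = the true next frame
    return F.cross_entropy(sim, targets)

def transition_probabilities(frame_emb, temperature=0.1):
    """Softmax over similarities -> probability of jumping from frame i to frame j."""
    z = F.normalize(frame_emb, dim=-1)
    return F.softmax(z @ z.T / temperature, dim=-1)   # (T, T)

# Toy usage: embeddings from a small per-video encoder would go here.
emb = torch.randn(200, 128, requires_grad=True)
loss = contrastive_transition_loss(emb)
loss.backward()
probs = transition_probabilities(emb.detach())
next_frame = torch.multinomial(probs[0], num_samples=1)  # sample a transition from frame 0
```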
The Transformer design enables effective contextualization across frames. We demonstrate the impact of language guidance on generic summarization. We establish the new state-of-the-art on both generic and query-focused datasets in supervised and unsupervised settings.
PCs update:
Preparing for author notifications tomorrow.
To avoid CMT traffic, we will first publish the list of accepted papers, as is customary. Expect this late afternoon PST. Will tweet!
We will then publish the status, meta-reviews, and final reviews on CMT - the platform has warned us about traffic!
@icra2020
Many countries/universities have imposed travel restrictions. Consulates are closed and we cannot apply for visas now even if these bans were to be lifted later. Please consider those who have health issues and don't want to travel. Could we get a refund on the registration?
We propose an instructional video summarization network that combines a context-aware temporal video encoder and a segment scoring transformer. Using pseudo summaries as weak supervision, our network constructs a visual summary for any instructional video using video and speech.
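A compact sketch of the two components named here: a temporal encoder that contextualizes frame features, followed by a transformer that scores pooled segments. The layer sizes, mean-pooling, and weakly supervised loss below are assumptions for illustration, not the released model.

```python
# Illustrative sketch of a context-aware temporal encoder + segment scoring
# transformer (a simplified reading of the description, not the released model).
import torch
import torch.nn as nn

class SegmentScorer(nn.Module):
    def __init__(self, dim=512, num_heads=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, num_heads, batch_first=True)
        self.temporal_encoder = nn.TransformerEncoder(layer, num_layers=2)    # frame-level context
        self.segment_transformer = nn.TransformerEncoder(layer, num_layers=2) # segment-level scoring
        self.head = nn.Linear(dim, 1)

    def forward(self, frame_feats, segment_ids):
        # frame_feats: (B, T, D) visual (and/or speech) features per frame
        # segment_ids: (T,) long tensor mapping each frame to a segment index
        ctx = self.temporal_encoder(frame_feats)
        num_segments = int(segment_ids.max()) + 1
        # Mean-pool contextualized frame features within each segment.
        seg_feats = torch.stack(
            [ctx[:, segment_ids == s].mean(dim=1) for s in range(num_segments)], dim=1
        )
        scored = self.segment_transformer(seg_feats)
        return self.head(scored).squeeze(-1)  # (B, num_segments) importance scores

# Toy usage with pseudo-summary labels as weak supervision.
model = SegmentScorer()
frames = torch.randn(1, 300, 512)
segment_ids = torch.arange(300) // 30          # 10 segments of 30 frames each
pseudo_labels = torch.randint(0, 2, (1, 10)).float()
scores = model(frames, segment_ids)
loss = nn.functional.binary_cross_entropy_with_logits(scores, pseudo_labels)
```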
Evaluation: We collect a high-quality test set, WikiHow Summaries, by scraping WikiHow articles that contain video demonstrations and visual depictions of steps.
We’re especially excited to have folks from the earth, atmospheric, and climate sciences join from Berkeley, so even if it’s not directly related to your research, it will be a nice chance to look for potential connections to your work. This is open to folks outside Berkeley too!
Anyone who has tried to follow a recipe from a YouTube video would agree that most videos contain irrelevant filler content! In this work, we introduce an approach for creating short visual summaries of instructional videos containing only the most important steps.
Data: Existing video summarization datasets rely on manual frame-level annotations, making them subjective and limited in size. To overcome this, we first automatically generate pseudo summaries for a corpus of instructional videos by exploiting two key assumptions:
(i) relevant steps are likely to appear in multiple videos of the same task (Task Relevance), and (ii) they are more likely to be described by the demonstrator verbally (Cross-Modal Saliency).
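As a toy illustration of how these two cues could be turned into pseudo-summary labels; the scoring inputs, weights, and threshold below are made-up assumptions, not the paper's procedure:

```python
# Toy illustration of combining Task Relevance and Cross-Modal Saliency into
# pseudo-summary labels; the weighting and threshold are assumptions.
from dataclasses import dataclass

@dataclass
class Step:
    task_relevance: float       # e.g., fraction of same-task videos containing a similar step
    crossmodal_saliency: float  # e.g., similarity between the step's visuals and the narration

def pseudo_label(steps, weight=0.5, threshold=0.6):
    """Mark a step as summary-worthy if the weighted combination of cues is high."""
    labels = []
    for step in steps:
        score = weight * step.task_relevance + (1 - weight) * step.crossmodal_saliency
        labels.append(1 if score >= threshold else 0)
    return labels

# Example: three candidate steps from one instructional video.
steps = [Step(0.9, 0.8), Step(0.2, 0.3), Step(0.7, 0.6)]
print(pseudo_label(steps))  # -> [1, 0, 1]
```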