I’m thrilled to announce that I'll be joining @iitmadras as an Assistant Professor in April 2024!
I’m immensely grateful to my amazing mentors, family, and friends for their unwavering support. (1/4)
How can we measure the gap between machine text and human text?
We introduce MAUVE, a new comparison measure for open-ended text generation, in our upcoming oral presentation at NeurIPS 2021.
Paper:
Pip package:
1/n
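For readers who want to try it, here's a minimal usage sketch based on the open-source `mauve-text` pip package (exact arguments may vary by version; the sample strings below are placeholders):

```python
# Minimal sketch using the `mauve-text` pip package; API details may differ by version.
import mauve

human_texts = ["The quick brown fox jumps over the lazy dog."] * 100  # human-written samples
model_texts = ["A quick brown fox leaps over a sleepy dog."] * 100    # model generations

out = mauve.compute_mauve(p_text=human_texts, q_text=model_texts)
print(out.mauve)  # scalar in (0, 1]; higher means model text is closer to human text
```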
I'll be building a new group focusing on the theory and practice of ML & AI research, exploring exciting areas like:
- Privacy-preserving ML & federated learning
- Robust & generalizable models and their optimization
- Generative AI, LLMs, and their applications
(2/4)
Returning home to India and contributing to the nation's vibrant academic community fills me with immense gratitude and excitement. I look forward to collaborating with talented students and researchers to make a positive impact. (3/4)
If you're interested in joining my group or collaborating, please feel free to contact me! I’ll also be at #NeurIPS in person and am happy to chat! (4/4)
#IITMadras #Research @iitmcse
Calling motivated students interested in pursuing MS/PhD in ML/AI, specifically privacy & generative AI!
The research group I'm starting at @iitmadras has openings!
Apply by *Mar 31* directly to @DSAI_IITM or @iitmcse at !
Excited to present our work "Federated Learning with Partial Model Personalization" at @icmlconf #ICML2022!
Poster: Thu 7/21, 6-8 pm EDT in Hall E #724
Spotlight: Thu 7/21, 4:40 pm in Hall G (Applications/Optim.)
1/n
Interested in how federated learning scales to foundation models?
Concerned that we don't have suitably large federated / group-partitioned / user-stratified datasets?
We've got you covered at #NeurIPS poster #1217 at 5 pm today (Thursday).
1/ Our work is out! 🚨
Towards Federated Foundation Models: Scalable Dataset Pipelines for Group-Structured Learning
We push federated learning research closer to LLM scales.
Paper:
Joint with @nicki_mitch & @KrishnaPillutla
Thread below.
📢📢At the last minute, I decided to go on the job market this year!!!
Grateful for RTs & promotion at your univ.😇
CV & Statements:
Will be at #NeurIPS2023, presenting AdANNS, Priming, Objaverse & MADLAD. DM if you're around, would love to catch up👋
In a related NeurIPS 2021 paper led by @UWStat grad student Lang Liu, we also study the theory of MAUVE and divergence frontier methods in general. More details on this are coming up shortly.
Paper: 8/8
Super excited about our recent work on federated foundation models!
- New LLM-scale federated datasets (+ software to create your own from TFDS and @huggingface datasets)
- Federated training of a GPT-2-sized LM (from scratch) shows meta-learning!
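To illustrate the group-structured idea, here's a toy sketch of partitioning records into per-user "clients" (an illustration only, not the actual pipeline API; the record fields are hypothetical):

```python
from collections import defaultdict

# Toy records keyed by user; in practice these would stream from a TFDS or
# Hugging Face dataset at LLM scale. Field names here are hypothetical.
records = [
    {"user": "u1", "text": "first post by user 1"},
    {"user": "u2", "text": "a post by user 2"},
    {"user": "u1", "text": "second post by user 1"},
]

# Group examples by user: each group then acts as one federated "client".
clients = defaultdict(list)
for rec in records:
    clients[rec["user"]].append(rec["text"])

for user, texts in clients.items():
    print(user, len(texts))
```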
If you're a graduating or recent undergrad/MS and you'd like to spend two years learning from the best, experimenting, and creating large-scale impact - come join our Predoctoral Researcher Program. Apply before Dec 18!
Thrilled to share that I will start a tenure-track assistant professor position in the CS department at Boston University! I am looking for PhD students in the area of programming languages with applications to distributed systems, cryptography, and machine learning.
Given neural text and human text, MAUVE yields a scalar measure of the gap between them.
It directly compares the learnt distribution from a text generation model to the distribution of human-written text using information divergence frontiers. 3/n
Finally, we show that MAUVE correlates better with human judgments than existing metrics for evaluating generations, e.g., generation perplexity, Self-BLEU, and Jensen-Shannon divergence. 6/n
I'll be at #ICLR at these posters:
Correlated Noise Provably Beats Independent Noise for Differentially Private Learning (#217 @ 4:30 pm on Wednesday)
Distributionally Robust Optimization with Bias and Variance Reduction (Spotlight, #154 @ 10:45 am on Thursday)
The geometric median, a multivariate generalization of the median, is similarly robust to outliers. It needs to be computed numerically via convex optimization.
Package features:
* GPU support (PyTorch only)
* Compatible w/ backprop (PyTorch only)
* Blazing fast algorithm
2/3
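For intuition, the classic numerical approach is the Weiszfeld fixed-point iteration; here's a minimal PyTorch sketch (not necessarily the exact algorithm the package implements):

```python
import torch

def geometric_median(points: torch.Tensor, eps: float = 1e-6, max_iter: int = 100) -> torch.Tensor:
    """Weiszfeld iteration for the geometric median of the rows of `points` (n, d)."""
    median = points.mean(dim=0)  # initialize at the coordinate-wise mean
    for _ in range(max_iter):
        dists = torch.norm(points - median, dim=1).clamp(min=eps)  # guard against division by zero
        weights = 1.0 / dists
        new_median = (weights[:, None] * points).sum(dim=0) / weights.sum()
        if torch.norm(new_median - median) < eps:  # converged
            return new_median
        median = new_median
    return median

# The median barely moves under a single large outlier, unlike the mean.
pts = torch.tensor([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [100.0, 100.0]])
print(geometric_median(pts))
```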
This captures two types of errors: (I) where the model assigns high probability to sequences which do not resemble human-written text, and (II) where the model distribution does not cover the human distribution, i.e., it fails to yield diverse samples. 4/n
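Concretely, with human distribution P, model distribution Q, and mixtures R_λ = λP + (1−λ)Q, the divergence frontier is built from softened KL divergences (my paraphrase of the paper's formulation; c > 0 is a scaling hyperparameter):

```latex
\mathcal{C}(P, Q) = \left\{ \left( e^{-c \, \mathrm{KL}(Q \| R_\lambda)}, \;
                                   e^{-c \, \mathrm{KL}(P \| R_\lambda)} \right)
                            : \lambda \in (0, 1) \right\}
```

MAUVE is the area under this curve: KL(Q ‖ R_λ) grows with type I errors (model mass on non-human-like text), and KL(P ‖ R_λ) grows with type II errors (human text the model fails to cover).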
We empirically show that MAUVE quantifies known properties of generated text with respect to text length, model size, and decoding algorithm more accurately and with fewer restrictions than existing distributional evaluation metrics. 5/n
Through extensive experiments, we find that the best layers to personalize depend on where the diversity lies in the task:
* next word pred.: diverse outputs => personalize output layer
* speech: diverse inputs => personalize input layer
* label skew => architectural solutions
6/n
We applied the geometric median to federated learning in our paper .
At the time, we did not find any Python package for the geometric median. We open-sourced this package to fill the gap. Hope you find it useful!
We compare two local update algorithms from previous work. We establish 1/sqrt(t) convergence rates for both in the smooth nonconvex case, but the alternating update of personal and shared params has better constants (see the sketch below).
Its analysis is also technically challenging due to dependent random variables.
4/n
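As a rough illustration of the alternating scheme (a minimal sketch assuming a toy two-layer split and plain SGD, not the paper's exact algorithm):

```python
import torch

# Toy client: a shared representation layer plus a personalized head.
shared = torch.nn.Linear(10, 8)    # averaged across clients by the server
personal = torch.nn.Linear(8, 2)   # stays on the client

def client_round(x, y, lr=0.1, steps=5):
    loss_fn = torch.nn.CrossEntropyLoss()
    # Phase 1: step only the personal params; shared params receive gradients
    # but are not updated.
    opt_p = torch.optim.SGD(personal.parameters(), lr=lr)
    for _ in range(steps):
        opt_p.zero_grad()
        loss_fn(personal(shared(x)), y).backward()
        opt_p.step()
    # Phase 2: step only the shared params; personal params are not updated.
    opt_s = torch.optim.SGD(shared.parameters(), lr=lr)
    for _ in range(steps):
        opt_s.zero_grad()
        loss_fn(personal(shared(x)), y).backward()
        opt_s.step()
    # Only the shared parameters are sent back to the server for averaging.
    return shared.state_dict()
```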
We study personalizing a few layers of the model. We provide:
(1) convergence theory of optimization algorithms for personalized federated learning, and
(2) extensive empirical experiments in text, vision, and speech.
3/n
Excited to present our work “Unleashing the Power of Randomization in Auditing Differentially Private ML” at #NeurIPS at poster #1609 at 10:45 am today (Thursday).
Joint w/ Galen Andrew, @KairouzPeter, Brendan McMahan, @AlinaMOprea, @sewoong79 (1/n)
There are several interesting technical details, including:
- a bias-variance trade-off for random canaries!
- our very own novel concentration inequalities!
- lots of cool math!
See you at the poster or drop me a line for more details!
(5/n)