So proud of @mitchellgordon and his brilliant PhD defense talk on “Human-AI Interaction Under Societal Disagreement”. Great lead advising by @msbernst! Excellent example of human-centered AI research for @StanfordHAI. On to his faculty position @MITEECS! Congrats!
I’m recruiting PhD students and there are still a few days left to apply! If you’re excited about working at the intersection of HCI and AI, come join my new group @MITEECS. Please submit by 12/15!
1/n What should ML models do when a dataset’s annotators — the people that models are trying to emulate — disagree? In today’s typical supervised learning pipeline, we model an aggregate pseudo-human, predicting the majority vote label while ignoring annotators who disagree.
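Here’s a minimal sketch of that aggregation step (illustrative only; the labels and names below are invented):

```python
from collections import Counter

# Three annotators label the same comment for toxicity (0 = keep, 1 = remove).
annotations = {"alice": 1, "bob": 0, "carol": 1}

# Today's typical pipeline collapses everyone into one majority-vote label...
majority_label = Counter(annotations.values()).most_common(1)[0][0]  # -> 1

# ...so the model only ever trains on `1`; bob's disagreement is discarded.
print(majority_label)
```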
Hi, I’ll be at #CHI2022! So excited to present jury learning. Please say hi if you’re interested in talking about human-centered AI, re-thinking today’s ML pipeline, evaluation metrics, or anything else!
I’m so excited and honored to be included among Apple’s first class of PhD fellows in AI/ML — and grateful for the opportunity to work with incredible mentors like @foil.
We’re presenting “HYPE: A Benchmark for Human eYe Perceptual Evaluation of Generative Models” as an oral at #NeurIPS2019 at 4:50 today in West Exhibition Hall C + B3!
2/n Or maybe we do a bit better and have our AI predict a distribution (e.g., 40% of annotators would think A, 60% would think B). But if you’re a practitioner who then has to make a single decision (e.g., do I remove this comment or not?), what do you DO with that information?
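For concreteness, here’s what that leaves the practitioner holding (a hypothetical sketch, not anyone’s real pipeline; the threshold is invented):

```python
# A model that outputs the distribution of annotator opinions for one comment.
model_output = {"keep": 0.40, "remove": 0.60}  # 40% would think A, 60% B

# The practitioner still has to collapse this into a single action, typically
# by picking a threshold -- and the distribution alone can't justify the pick.
THRESHOLD = 0.5  # why 0.5 and not 0.4, or 0.7? That's the open question.
action = "remove" if model_output["remove"] >= THRESHOLD else "keep"
print(action)  # -> "remove"
```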
3/n There’s not really a good answer, so we tried asking a different question: whose voices is our model emulating? Datasets are ultimately made up of individual people. So when annotators disagree, instead of modeling some aggregate pseudo-human, let’s model individual people.
4/n We introduce jury learning, a new supervised learning architecture — a technical and normative approach — that models the individual voices in your training dataset...
5/n ...enabling us to then design a system for interactively exploring, tuning, and shifting the behavior of the classifier by explicitly choosing which annotators it will emulate, and in what proportion.
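Roughly how that composition works at decision time (a sketch under assumptions: `predict_label` and `jury_verdict` are hypothetical stand-ins, and the toy rule inside `predict_label` is invented, not the paper’s model):

```python
import random
from collections import Counter

def predict_label(annotator_id: str, example: str) -> int:
    """Stand-in for a per-annotator model that predicts how a *specific*
    annotator would label a *specific* example (e.g., a deep model
    conditioned on an annotator embedding). Here: an arbitrary toy rule."""
    return (len(annotator_id) + len(example)) % 2

def jury_verdict(example: str, pool: dict, composition: dict) -> int:
    """Sample a jury matching `composition` (group -> number of seats),
    predict each juror's label, and return the majority vote
    (ties broken arbitrarily in this sketch)."""
    jurors = []
    for group, seats in composition.items():
        jurors += random.sample(pool[group], seats)
    votes = [predict_label(juror, example) for juror in jurors]
    return Counter(votes).most_common(1)[0][0]

# The practitioner decides whose voices the classifier emulates, e.g. a
# 12-seat jury split evenly between two annotator groups:
pool = {"group_a": [f"a{i}" for i in range(20)],
        "group_b": [f"b{i}" for i in range(20)]}
print(jury_verdict("example comment", pool, {"group_a": 6, "group_b": 6}))
```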
Join us at the upcoming #UIST2023 workshop, Architecting Novel Interactions with Generative AI Models. Featuring a keynote by Will Wright (creator of The Sims and SimCity) and Lauren Elliott (Where in the World Is Carmen Sandiego?)!
Beyond excited to share our #FAccT2022 paper "Sensible AI: Re-imagining Interpretability and Explainability using Sensemaking Theory". Building on incredible recent work in this space, this paper is about *who* interpretability and XAI are intended for. 🧵
@ani_nenkova @msbernst @tatsu_hashimoto For instance, while the Kaggle competition’s most popular model achieved a precision of .527 and recall of .827 over standard aggregated labels, our approach adjusted that down to a precision of .514 and a recall of .499.
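The direction of that adjustment is easy to see in a toy example (invented labels, not the Jigsaw data): scoring the same predictions against each individual annotator’s label, rather than the majority vote, surfaces disagreement as errors.

```python
from sklearn.metrics import precision_score, recall_score

# Four comments, each labeled by three annotators; one model prediction each.
annotator_labels = [[1, 1, 0], [0, 0, 0], [1, 0, 0], [1, 1, 1]]
predictions = [1, 0, 0, 1]

# Standard pipeline: score against the majority-vote label per comment.
majority = [int(sum(labels) * 2 > len(labels)) for labels in annotator_labels]
print(precision_score(majority, predictions),  # 1.0
      recall_score(majority, predictions))     # 1.0

# Disaggregated: score the same prediction once per individual label.
flat_true = [label for labels in annotator_labels for label in labels]
flat_pred = [pred for labels, pred in zip(annotator_labels, predictions)
             for _ in labels]
print(precision_score(flat_true, flat_pred),   # ~0.83: dissenters become FPs...
      recall_score(flat_true, flat_pred))      # ~0.83: ...and FNs
```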
@ani_nenkova @msbernst @tatsu_hashimoto Agreed that Jigsaw's class imbalance makes AUC a pretty bad metric for that task, even ignoring the issue of disagreement. Though I'd suggest that precision and recall still aren't doing enough to encode the task's subjectivity.