I've been wrestling with the gap between HCI curricula, which tend to focus on the design process, and the big theories and ideas that animate us in HCI. Here's a model that has started working well at
@StanfordHCI
:
Despite the promise of AIs improving human decision making, a frustratingly resilient research result has found that people with AI decision-making aids are no better than people alone or AIs alone. Why? Overreliance: we take the AI's recommendations even when we shouldn't.
Congratulations to
@joon_s_pk
and team on the
#UIST2023
Best Paper Award for Generative Agents!
Happily, we already know that the agents can throw a party in celebration.
The
#UIST2023
best paper awards and honorable mentions have been announced:
Congrats to all the authors and thanks to Andy Wilson, who served as Awards Chair. Excited to see these papers and more next week - last chance to register at Standard Rate!
They kept saying that I looked like an “older, retired Tom Holland”. And now my PhD students have celebrated my birthday with the most on-brand custom bobblehead possible.
Fully 80% of papers at UIST from 1990-2010 have been cited by US patents. 20% overall at CHI+CSCW+Ubicomp+UIST; 13% of papers at all
@sigchi
sponsored venues.
AAAI/IJCAI are 5%, ACL/EMNLP/NAACL are 11%, CVPR/ICCV/ECCV are 25%.
What’s the industrial impact of human-computer interaction (
#HCI
) research? Does HCI research contribute to technological inventions and products? Or are most of its insights ignored by the industry? Our
#CHI2023
paper provides new evidence for these long-standing questions.
What unifies, and what distinguishes, social media designs? Are all the Twitter spinoffs actually meaningfully different designs from each other? Form-From is a design space from
@amyxzh
, me,
@karger
, and
@answergarden
that will appear at
#cscw2024
People using explainable AIs are empirically no better at decision making than people working alone.
@HelenasResearch
, an Honorable Mention at
#CSCW2023
, demonstrates that this failure is because AI explanations typically require lots of cognitive effort to verify.
I'm presenting tomorrow (Monday, October 15th) from 11am-11:20am at
#CSCW2023
during the XAI session in the Regency Room.
I will be presenting our paper "Explanations Can Reduce Overreliance on AI Systems During Decision-Making", which has won best paper honorable mention🤠.
And overreliance persists even when the explanations make clear that the AI is wrong! Helena Vasconcelos's new
#cscw2023
paper finally—finally!—demonstrates that explanations can reduce overreliance and increase decision-making performance even without forcing functions:
🥳👏👏Congratulations to the 2024 ACM
#SIGCHI
awardees!
We recognize 13 SIGCHI members receiving lifetime research and practice, societal impact, and outstanding dissertation awards, as well as 10 being inducted into the SIGCHI Academy!
1/6🧵
I agree with the challenges facing centralized online platforms, but jumping to decentralized as the solution strikes me as a misdirection around the core design question at stake: governance.
Join us at the upcoming
#UIST2023
workshop, Architecting Novel Interactions with Generative AI Models. Featuring a keynote by Will Wright (creator of The Sims and SimCity) and Lauren Elliot (Where In The World Is Carmen Sandiego)!
Curation modeling,
@WanrongHe
’s Best Paper winner at CSCW, demonstrates a method for social media to better maintain norms by naming curators and using community upvotes to estimate whether those curators would want each piece of content in the community.
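The idea above can be sketched in a few lines. This is not Cura's actual model (the paper trains a predictor on upvote data); it's a toy illustration, with made-up member names and agreement rates, of estimating whether a named curator would want a post from who upvoted it:

```python
def curator_would_keep(upvoters, agreement, threshold=0.5):
    """Naive estimate of whether a named curator would want this post.

    upvoters:  set of member ids who upvoted the post
    agreement: member id -> historical rate of agreeing with the curator
    """
    if not upvoters:
        return False
    # Average the upvoters' historical agreement with the curator.
    score = sum(agreement.get(u, 0.0) for u in upvoters) / len(upvoters)
    return score >= threshold

# Hypothetical data: three members with known agreement rates.
agreement = {"ana": 0.9, "bo": 0.8, "cy": 0.1}
keep = curator_would_keep({"ana", "bo"}, agreement)  # high-agreement upvoters
drop = curator_would_keep({"cy"}, agreement)         # low-agreement upvoter
```

The point is the mechanism: community upvotes become a noisy signal for the curator's judgment, so norms scale without the curator reviewing every post.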
🎉So excited that my first HCI paper "Cura: Curation at Social Media Scale" has won Best Paper Award at CSCW!
I want to give a huge thank you to my collaborators
@msbernst
@mitchellgordon
@lindsaypopowski
!
I'll be presenting this work at the conference. See you soon!
#cscw2023
Stanford Engineering just launched an open-rank faculty search for researchers studying ethics and society in engineering:
The search can appoint faculty in any department in engineering, including Computer Science.
Instead of centralized vs. decentralized platforms, we need to start with: what's the governance we want? Let the infrastructure fall out from that decision. Otherwise it feels like we're yelling about vim vs. emacs instead of what we ought to actually build.
We do so many social media analyses of single platforms like Twitter. But any single platform is only a small part of the online cultural ecosystem. What are we missing by focusing on single platforms? Upcoming
#cscw2022
paper: . Web-wide meme science!
So proud of
@mitchellgordon
and his brilliant PhD defense talk on “Human-AI Interaction Under Societal Disagreement”. Great lead advising by
@msbernst
! Excellent example of human-centered AI research for
@StanfordHAI
. On to his faculty position
@MITEECS
! Congrats!
Too often, we train only skills that are easy to practice and measure. By creating a conflict simulation environment with an agent for feedback and practice,
@oshaikh13
opened the door to new skills—observing that people double their use of effective conflict strategies!
#chi2024
Before taking it out on your roommate for leaving dirty dishes out, you probably want to practice your conflict resolution skills first. Expert conflict resolution trainers, however, are EXPENSIVE. What if we practiced with an angry LLM instead? 😈
#CHI24
CSCW AC, 2011, when I was in grad school: “I see little in this paper that I would call research. […] if it was any good it should be a commercial system by now.”
Distressed by how social media algorithms can amplify antisocial behavior? Experts from computer science, communications, psychology, and law seek to develop an approach that encodes societal values into social media AI. (2/7)
Thrilled to announce the keynote speakers for our Michigan AI 2023 Symposium, Michael Bernstein
@msbernst
(Stanford) & Alexei Efros (UC Berkeley)!
Thanks to our sponsors: LG AI Research (silver) and Jane Street, KLA,
@Voxel51
(bronze).
Free registration:
Very cool work by
@tatsu_hashimoto
and colleagues: ask LLMs questions from Pew Surveys in order to measure whose opinions the model's outputs most closely reflect.
We know that language models (LMs) reflect opinions - from internet pre-training, to developers and crowdworkers, and even user feedback. But whose opinions actually appear in the outputs? We make LMs answer public opinion polls to find out:
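The core measurement is a comparison between two answer distributions over the same multiple-choice options: the survey respondents' and the model's. A minimal sketch using total variation distance, with made-up numbers (the paper's own metric and data differ):

```python
def total_variation(p, q):
    """Total variation distance between two distributions over the same options."""
    assert abs(sum(p) - 1) < 1e-9 and abs(sum(q) - 1) < 1e-9
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Hypothetical three-option poll question.
human_poll = [0.35, 0.40, 0.25]  # survey respondents' answer shares (made up)
lm_answers = [0.10, 0.70, 0.20]  # model's probabilities over the options (made up)

divergence = total_variation(human_poll, lm_answers)  # 0 = perfect match, 1 = disjoint
```

Averaging such a divergence across many poll questions, and across demographic subgroups of respondents, is what lets you ask *whose* opinions the model tracks most closely.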
When 5% of Reddit comments ought not to be there by the subreddits' own moderation strategies—e.g., misogyny, extreme vulgarity, bigotry, personal attacks—prosocial norms fall apart. This is why I see more hope for platform design shifts than for deus ex content removal.
Our new research estimates that *one in twenty* comments on Reddit are violations of its norms: anti-social behaviors that most subreddits try to moderate. But almost none are moderated.
🧵 on my upcoming
#cscw2022
paper w/
@josephseering
and
@msbernst
:
Social media designs fail when we mis-imagine how they'll behave at scale. But LLMs have learned social behaviors both good and bad—enabling us to create social simulacra that populate a prototype with thousands of accounts. High and low effort posts, trolls, rulebreaking, ...
I think that this leaves us with a challenge: we don't have many explainability techniques that are substantially less work to verify than completing the task manually.
It's been great working with the folks from
@AvidMldb
to launch a public version of IndieLabel, our prototype end-user auditing system (from our CSCW22 paper)! We hope this demo can seed further discussion and future work on user-driven AI audits ✨
Postdoc position: How should people and communities articulate how AIs should navigate difficult tradeoffs? Prof.
@sanmikoyejo
and I have a jointly mentored postdoctoral scholar position open at
@Stanford
CS starting in the fall. Information here:
Many objectives that have historically been out of reach for feed ranking algorithms may now be possible. As an example,
@JiaChenyan
and
@michelle123lam
demonstrated that social science constructs around democracy attitudes can be well modeled in ranking!
Can we design AI systems to consider democratic values as their objective functions? Our new
#CSCW24
paper w/
@michelle123lam
, Minh Chau Mai,
@jeffhancock
,
@msbernst
introduces a method for translating social science constructs into social media AIs (1/12)
Our hunch was that many of the null results fell into a valley where the cognitive costs of verifying the AI's explanation were roughly the same as the costs of doing the task yourself. When they're both nontrivially costly, people ignore the explanations and just rely on the AI.
Thrilled to share that I have accepted an offer to join the CS department
@Stanford
as an assistant professor, starting this Sept. We will continue to work on socially aware and positive
#NLP
. Very excited to explore new opportunities and collaborations at Stanford 😀
I’m recruiting PhD students and there are still a few days left to apply! If you’re excited about working at the intersection of HCI and AI, come join my new group
@MITEECS
. Please submit at by 12/15!
In a governance metaphor, centralized platforms ain't democracies: by
@margaretlevi
@henryfarrell
@timoreilly
But decentralized platforms are essentially anarchic, and there are good reasons governance doesn't work that way either:
super excited to share that I'll soon be starting a PhD in computer science at
@Stanford
, supported by the Stanford Graduate Fellowship and
@NSF
fellowship!!
Helena demonstrated through a series of five studies that if we manipulate the difficulty of the task and explanation, people do a cost-benefit tradeoff: they reduce overreliance and increase accuracy when explanations save significant effort.
Model sketching lets you rapidly iterate over the concepts that you want your model to capture, rather than the details of the prompt or the learning rate. Upcoming work at
#chi2023
by
@michelle123lam
!
When building ML models, we often get pulled into technical implementation details rather than deliberating over critical normative questions. What if we could directly articulate the high-level ideas that models ought to reason over?
#CHI2023
🧵
A policy brief conveying some of the lessons of our survey experiment comparing the perceived legitimacy of different content moderation strategies.
I'm not sure any of the student leads are on Twitter, so instead I'll just take this moment to remind you that
@amyxzh
is great.
NEW POLICY BRIEF: Policymakers play an important role in shaping the future of online speech and content moderation. Our latest brief seeks to understand people’s perceptions of content moderation legitimacy, providing a pathway for better online platforms.
2 (rather overdue but still very exciting) announcements: I’ll be joining
@UMichCSE
as an assistant professor in fall 2024! And this year, I’ll be a postdoc at
@StanfordHAI
working with
@sanmikoyejo
and
@msbernst
!
Beyond thrilled for these next steps 😊
Chain of Thought reasoning prompts—like "Let's think step by step"—make large language models more performant. Including, it turns out, at spewing out toxic and biased content. In our preprint, we evaluate zero-shot CoT on harmful questions & stereotypes:
This is how the internet gets a screenshot of me writing "2+2=5" on a whiteboard.
Check out the video about
@HelenasResearch
! Her project on XAI overreliance will appear at CSCW 2023.
Do you want to learn about how explanations can help reduce overreliance on AIs?
Watch this fantastic, out-of-this-world, one-of-a-kind, spectacular, etc. short video explaining our work! We put a lot of ❤️ into it and would appreciate the views.
“Can we get a new text analysis tool?”
“No—we have Topic Model at home”
Topic Model at home: outputs vague keywords; needs constant parameter fiddling🫠
Is there a better way? We introduce LLooM, a concept induction tool to explore text data in terms of interpretable concepts🧵
After 3+ years, today is the day that my book “The Worlds I See” gets to see the world itself. It is a science memoir of the intertwining histories of me becoming an
#AI
scientist, and the making of the modern AI itself. All versions are now on Amazon 1/
@JessicaHullman
While I’ll never be as cool as
@cfiesler
, once there were over 400k views on this TikTok it was clear that the correct response was for me to just own the Holland vibes
I am honored to be named a 2022 Microsoft Research PhD Fellow!
Thank you so much to my advisers,
@msbernst
and
@percyliang
, as well as
@merrierm
,
@kkarahal
, and everyone who shaped me as a researcher. I'm really excited to continue exploring the intersection of HCI and AI!
Stanford's
@stanfordsymsys
program is hiring its first lecturer! Symbolic Systems was my undergraduate major, focused on combining minds and machines: its foci include AI, cognitive science, NLP, neuroscience, and HCI.
Information and application here:
Mass rejections have been a thorn in Mechanical Turk workers' side for a decade. In the meantime, other marketplaces have developed more mature mechanisms for handling complaints, appeals, and bad actors, which
@amazon
could draw on and implement. Let's make this happen.
@IanArawjo
The methodological pluralism of HCI, which is one source of its strength, also makes it challenging to learn the individual methodological traditions well. It’s like we come in and try to play jazz, pop rock, classical, and podcasting all in a year or two.
We're looking for a broad set of scholars, ranging across qualitative and critical studies, quantitative data science or experimental research, policy, social computing platform design and deployment, and AI. Apply by February 20.
Methodologically, hooray on getting a paper with bootstrap estimation into
#cscw2022
! It's interesting to consider this approach as empowering audits beyond algorithms, e.g., of larger sociotechnical processes that require complicated sampling strategies.
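For readers unfamiliar with the method: the bootstrap estimates uncertainty by resampling the observed data with replacement, which is exactly what makes it portable to audits with complicated sampling strategies. A minimal percentile-bootstrap sketch (not the paper's code):

```python
import random

def bootstrap_ci(data, stat, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary statistic."""
    rng = random.Random(seed)
    estimates = sorted(
        stat([rng.choice(data) for _ in range(len(data))])  # resample w/ replacement
        for _ in range(n_resamples)
    )
    lo = estimates[int((alpha / 2) * n_resamples)]
    hi = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Example: 95% CI for the mean of a small sample.
sample = [4, 8, 15, 16, 23, 42]
mean = lambda xs: sum(xs) / len(xs)
lo, hi = bootstrap_ci(sample, mean)
```

Because `stat` can be any function of the resampled data, the same machinery works for audit statistics that have no closed-form standard error.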
Research takes unpredictable turns. Joon and I were inspired by work that
@unignorant
began 9 years ago (!) Here's a screenshot from our original rejected paper in 2014. That rejected research later evolved into Augur ().
Something something persistence?
@hcomp_conf
and
@ci_acm
are integrating more deeply this year, with a joint CFP and two program subcommittees. CI adds archival publication, and HCOMP adds a non-archival option. Abstracts due 6/2.
In blind comparisons, participants were rarely able to distinguish social simulacra from real content. In evaluation with social computing designers,
@joon_s_pk
and
@lindsaypopowski
saw designers use social simulacra to refine communities' goals, rules, and moderation responses.
I've been interested in the recession of simulation/modeling as a method in HCI over the last 2+ decades. Since it bears repeating, don't use models instead of talking to actual people: that's A Bad Idea. But I do think that LLMs offer a fascinating lens onto human behavior.
One of my favorite little investigations was coding the designs of the systems listed in a Wikipedia social media timeline () as of their launch date using Form-From. The pattern weaves back and forth between flat systems and threaded systems over time.
@munmun10
@IanArawjo
I’m a moderator on arxiv for the cs.HC category. There’s one set of (typically brief) holds where the system worries the paper may not compile correctly or be a real paper. Staff clear those pretty quickly I think.
So just what is this thing? It's been a long time since CSCW introduced Johansen's Time-Space matrix, and at this point the vast majority of social media would fall into the different time - different place quadrant, making it not very productive as a design or theory tool.
Emerging from maternity leave to learn this wonderful news — especially bc Viviana Zelizer has been a mentor and advocate since the first day I arrived in the US as an international student!
How can we enable a broader review of algorithms’ impacts? Stanford researchers created a tool to put algorithmic auditing in the hands of impacted communities.
If you don't like our 2x3 space, there's also a full 62-dimension treatment in the appendix from our inductive process, as well as a set of 11 categories those fall into. They're more useful for fine-grained distinctions.
@joon_s_pk
quietly put the paper on arXiv back in April after the paper deadline and then planned to take a long-overdue vacation. Some folks found the paper, and the attention it got was incredible. We are so thankful for the enthusiasm.
Joon did eventually take that vacation.
@munmun10
@IanArawjo
There’s another set of holds where arXiv’s category classifier disagrees with what the authors selected, and they have to wait for the primary category mod to come in, review it as complete and relevant, and mark it as good to go. That can be slower.
Form-From asks two questions: (1) What is the principal shape, or form, of the content: threaded or flat? (2) From where, or from whom, might one receive content: spaces, networks, or the commons?
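Those two questions give the 2x3 design space. A minimal sketch of the space as code, where the example codings (forum, follower feed) are my own hypothetical illustrations rather than the paper's dataset:

```python
from dataclasses import dataclass
from enum import Enum

class Form(Enum):          # principal shape of the content
    THREADED = "threaded"
    FLAT = "flat"

class From_(Enum):         # where or from whom content arrives
    SPACES = "spaces"      # bounded places you enter
    NETWORKS = "networks"  # people you follow or connect to
    COMMONS = "commons"    # a shared global pool

@dataclass(frozen=True)
class Design:
    name: str
    form: Form
    source: From_

# Hypothetical codings for illustration only:
designs = [
    Design("classic forum", Form.THREADED, From_.SPACES),
    Design("follower feed", Form.FLAT, From_.NETWORKS),
]
```

Coding each platform as one (form, source) pair is what makes it possible to ask whether two designs—say, Twitter and its spinoffs—actually occupy different cells.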
@im__jane
Do they all need to be run simultaneously? I often suggest onboarding in waves, so that you can deal with issues without getting 100 simultaneous confused issues.
So who is originating all the image memes that get shared? It's not the periphery: internet meme production has become incredibly centralized. The more central the community to the overall web, the more diffusion events its memes originated. In our dataset, Reddit was
#1
.
@ishtiaqueSIA
Reaching consensus on an online governance design feels like a stretch to me. But I think there are several interesting and viable governance alternatives, as
@EthanZ
discusses, that at least to me feel like a great start.