We spent over a year scavenging for AI audit tools and interviewing audit practitioners about their process.
What we found: the audit process is more complicated than we think, and the tasks we need tooling for extend far beyond just evaluation.
See:
🚀New Paper from @ryanbsteed, @brianavecchione, @Abebab, @rajiinio and myself. 🚀
Through interviews with 35 practitioners and analysis of 390 AI audit tools, we examine the gaps between existing tooling and effective AI accountability.
🪡 1/n
ICYMI: Twitter conducted one of the largest platform audits ever (randomized trial on 58 million users) to assess "political bias" of their algorithm. In 6 of 7 countries, "the mainstream political right enjoys higher algorithmic amplification than the mainstream political left."
These are the four most popular misconceptions people have about race & gender bias in algorithms.
I'm wary of wading into this conversation again, but it's important to acknowledge the research that refutes each point, despite it feeling counter-intuitive.
Let me clarify.👇🏾
FOUR things to know about race and gender bias in algorithms:
1. The bias starts in the data
2. The algorithms don't create the bias but they do transmit it
3. There are a huge number of other biases. Race and gender bias are just the most obvious
4. It's fixable! 🧵👇
ICYMI: @jovialjoy + @AOC = 🔥🔥🔥 My fave exchange from House Oversight hearing on facial recognition:
"We saw that these algorithms are effective to.. different degrees. Are they most effective on women?"
"No"
"Are they most effective on people of color?"
"Absolutely not."
We can't keep regulating AI as if it... works.
Most policy interventions start with the assumption that the technology lives up to its claims of performance, but policymakers & critical scholars need to stop falling for the corporate hype and scrutinize these claims more.
I understand the concern over Timnit’s resignation from Google. She’s done a great deal to move the field forward with her research. I wanted to share the email I sent to Google Research and some thoughts on our research process.
In our upcoming paper, we use a children's picture book to explain how bizarre it is that ML researchers claim to measure "general" model capabilities with *data* benchmarks - artifacts that are inherently specific, contextualized and finite.
Deets here:
The troubling thing about data is that, if you're Black, it's likely to contain lies about you.
This piece gets at the heart of something that so much alarmed me when I first got into the field, surrounded by people calling these lies "ground truth".
I'm starting a CS PhD @Berkeley_EECS this Fall, working with @beenwrekt & @red_abebe!
Doing a PhD is a casual decision for some - that wasn't the case for me.
I appreciate everyone that respected me enough to understand this & continued to work with me to figure things out. 💕
This is solid evidence to dismantle the "conservative bias" narrative being pushed in certain policy circles (ahem @elonmusk & crew). And it makes sense - right-wing creators disproportionately manipulate these rec engines to drive content engagement:
If ML researchers want to be taken seriously, they should start by taking themselves seriously. The lack of thought around the tasks the field chooses to rally around astounds me - we need to begin critically reflecting on what kind of problems this tech can meaningfully solve.
If you are a reporter, DON'T YOU DARE report on Amazon's decision today without acknowledging the literal hell advocates went through for even this small step to happen.
If you named us when Amazon was being defensive & dismissive then you need to name us now.
More than annoyed with the "AI is conscious" thing. The inappropriate anthropomorphism we see in AI causes serious problems - it clouds reasoning & obscures responsibilities.
AI systems are engineered artifacts, much like a toaster. There are consequences to misrepresenting this.
This investigation is wild - in the one city they could get the data for, it turns out PredPol was correct only ~1% (!!) of the time.
It literally enrages me how rampant this kind of dysfunctional "AI" technology is. What a dangerous waste of money.
1. Bias can start anywhere in the system - pre-processing, post-processing, with task design, with modeling choices, etc., in addition to issues with the data. The system arrives as the result of a lot of decisions, and any of those decisions can result in a biased outcome.
I'm one of @techreview's 35 Under 35 Innovators!
I've known since early May, but it feels esp. meaningful to be seen & have this work featured in this moment.
Thanks to @jovialjoy, @timnitGebru, @mmitchell_ai for guiding my growth from the beginning.💞
It’s seriously strange how we choose to dehumanize data but anthropomorphize AI systems.
Algorithms do not have “intent” or “moral responsibility”. They do not “think”, “learn” or “perceive”. That’s us - we do that and then project this behaviour onto the systems we create.
I cannot count the number of times @timnitGebru has encouraged us, spoken out for us, defended us & stuck her neck out for us. She has made real sacrifices for the Black community.
Now it's time to stand with her!
Sign this letter to show your support.
People need to understand that what's happening to @timnitGebru & @mmitchell_ai at Google can happen to any Black woman and their allies, anywhere. I'd especially appreciate if people chilled on framing this as a corporation specific thing - it's not. This happens everywhere.
Once we characterize AI as a person, we heap ethical expectations we would normally have of people - to be fair, to explain themselves, etc - onto the artifact, and then try to build those characteristics into the artifact, rather than holding the many humans involved accountable.
Facial recognition was literally conceived with the task of matching mugshot photos in a book of "suspects" & was ignited through government efforts to deploy frictionless biometric surveillance they would not have to disclose. It's a carceral technology that defaults to abuse.
Ok, I can *finally* announce this:
Thrilled to be a full-time @mozilla fellow this year! 🥺
I'll be finally given the freedom to dedicate 100% of myself to a research agenda that has come to mean so much to me! It's kind of a scary time to be independent rn but boy am I ready.
Truly pisses me off how often ppl will try to erase @jovialjoy & @timnitGebru's role in waking up the entire facial recognition industry to their obvious bias problem.
They will literally cite everything & everyone before citing these two and it makes zero sense.
#CiteBlackWomen
This is actually unbelievable. In the UK, students couldn't take A-level exams due to the pandemic, so scores were automatically determined by an algorithm.
As a result, most of the As this year - way more than usual - were given to students at private/independent schools. 😩
Looks like sixth form and FE colleges have particularly lost out in the standardization process for this year's A Level results. Private schools reaping the benefits with a huge increase in grade A and above.
Algorithmic audits are *incredibly* difficult to execute successfully.
The toolkit for auditors is scattered - fragmented by closed corporate-captured development processes, & limited flexibility.
Excited to launch a new Mozilla project to address this!
To all the people scammed by @sirajraval that found themselves out of $199 and still without any idea of what ML is, I suggest @fastdotai. It’s free, it’s a solid start and @jeremyphoward and @math_rachel are amongst the loveliest people I’ve ever met.
I wish we talked more to the affected population. They're often not the customer or user, so it's easy to ignore them.
Right now, the ML community builds tools for teachers, doctors, landlords, police. These are the "domain experts". It's never about students, patients, tenants
We reviewed 100+ ML survey papers & discovered a pattern of evaluation failures distorting performance reporting across various subfields.
Often framed as a one-off casual consideration, ML eval is rarely presented as what it is - a chained *process*, rife w/ measurement hazards
Serious question: Why is @nature publishing phrenology?
I'm alarmed that someone wrote this but more alarmed that multiple reviewers somehow let this get accepted. The conclusions are not just offensive but highly questionable. So many biased and completely unfounded claims.
Anyways, this is my cue to log off for a while. I literally can't stomach seeing something like this, and I have no interest in engaging with whatever excuses him and his followers come up with. That first statement is *racist* - not to mention deeply hurtful and dehumanizing.
A reminder that all of machine learning, not just facial recognition, encourages the proliferation of surveillance infrastructure of all sorts. Our hunger to hoard & capture data of all forms, whether through web cookies or a camera, is tied to an increasing data requirement.
"Bombshell Stanford study finds ChatGPT & Google’s Bard answer medical questions with racist, debunked theories that harm Black patients"
We don't talk enough about how so much of the misinfo perpetuated by these models is specifically about minorities.
It's wild to see that paper about Wiki pages for women being regularly deleted for not being "notable" enough, while seeing tweets from women and people of color being denied Twitter verification status for not being "notable" enough...
Sadly, people use these misconceptions to escape responsibility (ie."I fixed the data bias, there's nothing I can do.") so it's important to be careful. Algorithmic discrimination requires socio-technical systems level thinking - that should affect how we think about all of this.
This is not just about offensive language - the underlying beliefs being expressed here are repugnant and untrue. It's genuinely terrifying to me that he cannot explicitly and unequivocally denounce every aspect of these beliefs, even today.
Lists like this can feel silly but I don't take for granted any opportunity to highlight the work I'm fortunate to do.
Feeling especially grateful for my @mozilla teammates @Abebab, @brianavecchione, @ryanbsteed & @OjewaleV 🙏🏿 Excited for what's to come!
I said this because I believe it: “It was really easy for them to fire her because they didn’t value the work she was doing”.
Unfortunately for Google, much of the research community disagrees. Proud to continue standing by @timnitGebru & @mmitchell_ai.
My eldest brother passed away from cancer last week. He was our north star - in moments like this, he would go out of his way to talk about how proud he was of *us*, how we could graduate too. Words can't describe how much he was loved & how much he will be missed. He was 30. ❤️
The most tragic lesson I've learnt doing any kind of algorithmic auditing is that companies will not bother to make their product work, if they know the group it doesn't work for is powerless to stop them.
🗣️Finally, it's out! It's here!
In this paper, me, @morganklauss & @amironesei present work on a topic people remain hesitant to discuss - the **power dynamic between disciplines** & how that shapes poor AI ethics education and practice.
Check it out:
4. Framing interventions as "fixes" can be unhelpful. If we truly had a "fix", this wouldn't even be a problem anymore.
All data embeds a worldview & all models have some bias. Most interventions just try to make the model biased towards more inclusive (& non-illegal) outcomes.
Can't believe we live in a world where some would rather see an AI system as human before acknowledging the humanity of the marginalized actual people around them.
3. Race & gender can be the *least* obvious biases to detect. As legally protected attributes, they're the least likely to be included in meta-data & we don't succeed at identifying accurate proxies.
Here's a recent @FAccTConference paper about this:
My main takeaway from this workshop is how much the ML field is literally exporting its chaos into other scientific disciplines. Well-meaning researchers see ML as a way to make sense of their data, but then fall into one of the many trap doors to invalid or nonsensical models.
Dozens of scientific fields that have adopted machine learning face a reproducibility crisis. At a Princeton workshop today, ten experts discuss how to find and fix ML flaws. We had to cap Zoom signups at 1,600, but we’ve added a livestream. Starting NOW:
This is quite literally the thesis of our paper "Data and its (dis)contents: A survey of dataset development and use in machine learning research".
I hope ML researchers will take a look!
In general, there is very little research done on best practices for data curation / cleaning / annotation, even though these steps have more impact on applications than incremental architecture improvements. Preparing the data is an exercise left to the reader
So much harm is caused by AI systems that don't work. I get that the research is exciting but it really unsettles me how quick we are to deploy it everywhere. The technology is premature and probably should not be affecting so many people's actual lives.
PSA for those hoping to bandwagon into Ethical AI research : This isn’t like when GANs were the hot new thing - this is a field that means something to people. It’s personal and always has been. If you don't come ready to listen, you will have nothing meaningful to say.
My controversial opinion is that it's not actually that much harder to learn new things when you're older - it's just that, as we age, we often become more self-conscious about mistakes and thus hesitant to engage with the often humiliating process of learning something new.
I'll be giving a lecture on the risks of "AI for Social Good", talking about how even good intentions can lead to bad outcomes with AI.
This is a Stanford course turned workshop but anyone interested can register here:
There's something important that no one seems to be saying about this article so I'm putting my thoughts here to get it off my chest.
tl;dr I ended up reading this and thinking not so much about Facebook but much more about policy & incentives.
(THREAD)
Speaking out against censorship is now "inconsistent with the expectations of a Google manager".
She did that because she cares more and will risk everything to protect those she has hired to work under her - a team that happens to be more diverse than any other at Google.
However, we believe the end of your employment should happen faster than your email reflects because certain aspects of the email you sent last night to non-management employees in the brain group reflect behavior that is inconsistent with the expectations of a Google manager.
“It’s quite obvious that we should stop training radiologists,”
- Geoffrey Hinton, 2016.
“radiologists should be worried about their jobs”
- Andrew Ng, 2018.
2019: "... Clinically Meaningful Failures in Machine Learning for Medical Imaging"
lol
Technologists proclaiming that AI will make various professions obsolete is like if the inventor of the typewriter had proclaimed that it would make writers and journalists obsolete, failing to recognize that professional expertise is more than the externally visible activity.
Even those that don't own a phone & don't use the Internet can still experience algorithmic harm. Algorithms are always being imposed on people that didn't ask for it, that are often unaware of how that harmful tech impacts their life. You don't escape this by "just logging off".
For example, "automation bias" occurs when just the introduction of an algorithm results in increasing the bias of human discretion (ie. model predictions being perceived differently in one context vs. another, leading to biased outcomes for minorities).
They paid them and asked for consent. This is larger & more ethically sourced than any other evaluation dataset for testing demographic bias so far.
Functionality is just one of *many* issues, of course, but now there's literally no excuse for a tool not to work on minorities.
As part of our ongoing efforts to help surface fairness issues in #AI systems, we’ve open sourced a new data set of 45,186 videos to help evaluate fairness in #computervision and audio models across age, gender, apparent skin tone, & ambient lighting.
2. Much of the de-biasing work makes it clear that algorithmic design choices can lead to *more fair* outcomes (w/ a "fixed" dataset) so it shouldn't be surprising that algorithmic design can also lead to *less fair* outcomes.
@sarahookr explains it here:
Yesterday, I ended up in a debate where the position was "algorithmic bias is a data problem".
I thought this had already been well refuted within our research community but clearly not.
So, to say it yet again -- it is not just the data. The model matters.
1/n
Not at all an exaggeration to say that I'm in research today (and will hopefully continue) because of the support of groups like @black_in_ai.
For anyone volunteering for any affinity group, know that your work is important & making a huge difference.
Representation matters!
11/n Congratulations to Deb Raji @rajiinio from the University of Toronto.
Deb is joining @Berkeley_EECS @berkeley_ai as a PhD student working on evaluation, audits, & accountability.
Deb shares "Thanks to BAI for all their hard work in supporting students in various ways...
A lot of practical modeling decisions have to do with data, so naturally we focus on those decisions when talking about bias as well. But this doesn't mean those are the *only* decisions being made nor the primary opportunities for human judgements to influence a model's outcome.
Still can't believe what happened. It's unreal.
@mmitchell_ai fought for me. She would remind me to speak up & take credit for things I did, when others in a meeting would talk over or ignore me.
@timnitGebru hypes me up every time. I wanted to quit, she's the reason I didn't.
As I process the abrupt firing of my manager @mmitchell_ai, in the wake of @timnitGebru’s firing, I keep coming back to the incredible feat they achieved in building such an incredibly diverse team, and one that truly thrived for a time,... 1/
It annoys me how much those advocating for existential risk expect us to believe them based on pure ethos (ie. authority of who says it)... do you know how many *years* of research it took to convince people machine learning models *might* be biased? And some are still in denial!
I wrote "The Discomfort of Death Counts" for Cell's data science journal @Patterns_CP, for their series on COVID-19 & data.
I thought I would be analyzing figures - but I didn’t. Instead, it's about seeing data as humans so we can properly mourn them. 💔
I catch glimpses of this perspective often - it genuinely surprises me. AI critics are not haters reluctant to give tech ppl their due credit. This in fact has nothing to do with tech ppl, is not at all personal. This is about those impacted & the harms they are now vulnerable to.
I think some are reticent to be impressed by AI progress partly because they associate that with views they don't like--e.g. that tech co's are great or tech ppl are brilliant.
But these are not nec related. It'd be better if views on AI were less correlated with other stuff.🧵
To put this in context:
Facebook pioneered the use of deep learning for facial recognition with the DeepFace model in 2014, because they had more face data than anyone else at the time ().
For them to now (finally) reject this technology is a big deal.
My jaw dropped to the floor when I got this news. Facebook is shutting down the facial recognition system that it introduced more than 10 years ago and deleting the faceprints of 1 billion people:
Unlike five years ago, when Gebru began raising questions, there’s now a well-established movement questioning what AI should be...This isn’t a coincidence. It’s very much a product of...the simple act of inviting more Black researchers into the field. 🔥
Why are we yet to enforce notice of use for AI systems? We should legally require criminal defendants subject to a risk assessment to be notified of its use in their case, require each job applicant to be informed when their application is being processed by an algorithm, etc.
is an interesting read on whether someone who expresses an intent to resign but whose employer stops paying them before a provided date has resigned or been fired under California law
The thing is - @AllysonEttinger wrote a whole paper years ago warning about this exact situation.
In those experiments, BERT failed *every single negation test*!
Did that deter deployment? Somehow, no. At some point, these institutions need to take basic responsibility.😕
No, we shouldn't let people publish whatever they want.
Ethical reflection is a basic part of the scientific process and part of what it means to do "good science". Pretty much every other applied research field has to think about this so what exactly makes ML exempt?
so, let people in academia publish on whatever they want. unless it is bad science, and then, sure, don't publish it.
and put the emphasis where it matters: regulating and controlling the *actual deployment* of systems.
Why should something need to be framed as an "existential risk" in order to matter? Is extinction status that necessary for something to be important?
The need for this "saving the world" narrative over a "let's just help the next person" kind of worldview truly puzzles me.
Why do people assume that "bias" in LLMs is just the model occasionally engaging in bigoted name calling? This is certainly bad, but I'm honestly much more concerned about the integration of these models (& their engrained biased stereotypes) in consequential decision-making.
Final word: the fact that pretty much everyone agrees that this is an incomplete, partial apology, but the divide is between "yes, that is unacceptable" and "let me try to convince you that your race is intellectually inferior" is really throwing me for a loop right now.
Something many often don't consider when discussing "Ethical AI" is the power differential - there is a multi-billion dollar apparatus marketing this technology as flawless and only recently has a critical mass of scholars & advocates come together to share an alternative view.
I feel like CS academics in particular are underestimating the amount of politicking & relentless advocacy it takes to get to something like this. Is this perfect? Of course not. Multiple different groups fought over every word!
But is this a rare & monumental win? Absolutely.
Today, @POTUS and I announced an Executive Order to ensure our nation leads the world in artificial intelligence.
This historic EO is the most significant action taken by any government in the world to ensure AI advances innovation and protects rights and safety.
"Raji didn’t see people like herself onstage or in the crowded lobby. Then an Afroed figure waved from across the room. It was @timnitGebru. She invited Raji to the inaugural @black_in_ai workshop..."
Will never forget this! Timnit impacted so many. ❤️
Amazing news!🎉🎉🎉
Rashida Richardson is one of my favorite scholars.
Her work has been transformational in the AI accountability space - she's revolutionized how we talk about data & race!
For those unfamiliar, here are some of her papers I learnt a lot from. (THREAD)⬇️
I am pleased to welcome Rashida Richardson to the Science and Society team @WHOSTP. Her combination of legal, policy, and civil rights insight into emerging technologies and automated systems is incisive and unparalleled.
In all the excitement of #FAccT22, I forgot to actually announce what I'll be presenting! Fortunate to be involved with 3 papers, centered on my interest in algorithmic accountability.
The first: We discuss how current AI policy ignores the fact that many AI systems don't work.
One culprit for these common misconceptions is the use of that term "bias". Some hear "algorithmic bias" and map this to "statistical bias" (as in the simple notion of bias vs. variance). Others hear it and think of algorithmically enabled discrimination and harm. Both are right.
After Schumer’s forum, so much of the focus from media & government officials was anchored to the speculations of the wealthy men in the room.
So I wrote in @TheAtlantic about another view -- the grounded complexity brought by unions & civil society:
The story of what happened to @timnitGebru goes far beyond Google. My brother David (who's only ever half-interested in anything I do) was so shook by the situation, he wrote a blog post about it!
Some good stuff there - if so inclined, check it out!
A reminder that it's not just academic researchers that operate as third party external auditors to AI products. Investigative journalists, civil society, regulators, law firms and so many others can also play this role - and *all* of us need to be legally protected & supported.
I'm getting tired of this pattern.
At this point, @jovialjoy has to spend almost as much time actively fighting erasure as she does just doing her work. It's a waste of everyone's energy & so frustrating to watch her, the spark of this conversation, being consistently overlooked!
Adding this to clarify that my goal is not some unjust character assassination of Bostrom. It's upsetting that someone would write this at all but what is *most* upsetting is how he currently remains equivocal about beliefs that are harmful & prejudiced.
@flotsam70272377 @thebirdmaniac @nsaphra This is not about his past comments but his present ones. In his present day apology, he does not denounce the first statement of the original email and remains equivocal about something that is understood to be prejudiced and harmful.
Starting to notice a trend - the "Tech Savior Complex". This is not to be confused with Techno-solutionism.
It’s one thing to think tech is the answer; it’s another and perhaps more unsettling thing to think that the technologist holds all the answers (technical or not).
AI branded tools currently make consequential, unregulated decisions - in healthcare, criminal justice, education, etc. - causing harm to so many real people.
Yet for some reason, this speculative scenario is what some people think "AI as a policy issue" is about.
Frustrating.
I'm sorry, I just can't bring myself to care about A.I. as a policy issue. If the human race somehow ends up getting killed by a fancy logistic regression, then we were never gonna make it.
Me & @Abebab wrote about how, with the new large language models, everything old is new again - we're still talking about the harms observed since 2016 with products like BERT-enabled search & the chatbot Tay. And facing the same pushback to critique.
With all due respect, I can understand why they think that way.
You were one of the biggest obstacles to NeurIPS being able to develop & adopt a Code of Ethics -- your critique was along the lines of "this is over-reach" & "outside of what ML researchers should care about".
Some folks seem to think I ignore or don't care about AI ethics, safety, and alignment.
I need to remind them that a group of us co-founded the Partnership on AI in 2016 precisely to study, discuss, and address questions of AI ethics, safety, and alignment back in 2016.…
This always bothered me. Adding a sprinkle of ethics to CS education feels incomplete.
Why do we think it's that easy for tech folks to learn social science? While also believing that social scientists will "never understand the tech"?
We need to learn to collaborate instead.
@rajiinio Like if we just give CS students enough ethics and society style CS classes, we can entirely replace historians, STS scholars, feminist scholars, etc. Tech will be magically fixed by the already valued.
We also don't do a good job categorizing race & gender (see: , , ) - that makes it even harder. In fact, people only focus on race & gender, despite the difficulties, because it's tied to anti-discrimination law
There's also a reality of various tradeoffs (ie. diversifying data may require privacy violations, measuring fairness could increase liability, etc.) & contradicting definitions (see:, ). In real life, this is far from a solved issue.
Increasingly uncomfortable with "AI safety" adherents positioning themselves as the more "technical" & "apolitical" camp when addressing AI harms. ML fairness research is clearly more theoretically mature & empirically grounded than the philosophizing in eg. LessWrong blog posts.
For those who seem to be confused:
“AI safety” (as a field) has nothing to do with the woke Gemini debacle. That is a result of “AI ethics” - a completely different thing:
AI ethics: focussed on stuff like algorithmic bias. Very woke & left-leaning. Dislike transhumanism & EA &…
@flotsam70272377 I am responding here only because I know you and your peers will do your best to defend this. Apologizing for the use of a racial slur does not excuse his failure to apologize for the underlying prejudice of those beliefs.