AIs and Inherent Bias from Natural Language Learning

Whoops, long time no blog, sorry about that! Today, a nostalgic return to some of my original blog topics, linguistics and science fiction.

Today I read an article linked by a computer science professor from my undergrad institution (where I did not take a single CS class, but whatever; he always posts fascinating stuff) arguing that AIs that learn language from natural-language corpora will inevitably inherit bias from that language, because the bias is now part of the meaning of the words.

An excerpt from “Language necessarily contains human biases, and so will machines trained on language corpora” by Arvind Narayanan:

These [results] include innocuous, universal associations (flowers are associated with pleasantness and insects with unpleasantness), racial prejudice (European-American names are associated with pleasantness and African-American names with unpleasantness), and a variety of gender stereotypes (for example, career words are associated with male names and family words with female names).

But we go further. We show that information about the real world is recoverable from word embeddings to a striking degree. The figure below shows that for 50 occupation words (doctor, engineer, …), we can accurately predict the percentage of U.S. workers in that occupation who are women using nothing but the semantic closeness of the occupation word to feminine words!

[see article for figure]

These results simultaneously show that the biases in question are embedded in human language, and that word embeddings are picking up the biases.

Our finding of pervasive, human-like bias in AI may be surprising, but we consider it inevitable. We mean “bias” in a morally neutral sense. Some biases are prejudices, which society deems unacceptable. Others are facts about the real world (such as gender gaps in occupations), even if they reflect historical injustices that we wish to mitigate. Yet others are perfectly innocuous.

Algorithms don’t have a good way of telling these apart. If AI learns language sufficiently well, it will also learn cultural associations that are offensive, objectionable, or harmful. At a high level, bias is meaning. “Debiasing” these machine models, while intriguing and technically interesting, necessarily harms meaning.

Instead, we suggest that mitigating prejudice should be a separate component of an AI system. Rather than altering AI’s representation of language, we should alter how or whether it acts on that knowledge, just as humans are able to learn not to act on our implicit biases. This requires a long-term research program that includes ethicists and domain experts, rather than formulating ethics as just another technical constraint in a learning system.

Finally, our results have implications for human prejudice. Given how deeply bias is embedded in language, to what extent does the influence of language explain prejudiced behavior? And could transmission of language explain transmission of prejudices? These explanations are simplistic, but that is precisely our point: in the future, we should treat these as “null hypotheses” to be eliminated before we turn to more complex accounts of bias in humans.
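The occupation result the authors describe is essentially an association test over word vectors: how much closer an occupation word sits to feminine words than to masculine words. Here is a minimal sketch of that idea in pure Python. Everything here is made up for illustration, including the tiny 3-dimensional "embeddings" and the `gender_association` helper; the actual study used real pretrained embeddings over large corpora.

```python
from math import sqrt

# Toy 3-dimensional word vectors, hand-picked for illustration only.
# Real embeddings (word2vec, GloVe) are learned from corpora and have
# hundreds of dimensions; these values just mimic the pattern.
vectors = {
    "she":      [0.9, 0.1, 0.2],
    "woman":    [0.8, 0.2, 0.1],
    "he":       [0.1, 0.9, 0.2],
    "man":      [0.2, 0.8, 0.1],
    "nurse":    [0.7, 0.3, 0.4],
    "engineer": [0.2, 0.7, 0.5],
}

def cosine(u, v):
    """Cosine similarity: the standard closeness measure for embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def gender_association(word, feminine=("she", "woman"), masculine=("he", "man")):
    """Mean cosine to feminine words minus mean cosine to masculine words.

    A positive score means the word sits closer to the feminine attribute
    words; it is this kind of quantity that the quoted article correlates
    with real occupation statistics.
    """
    v = vectors[word]
    fem = sum(cosine(v, vectors[w]) for w in feminine) / len(feminine)
    mas = sum(cosine(v, vectors[w]) for w in masculine) / len(masculine)
    return fem - mas

print(gender_association("nurse"))     # positive in this toy data
print(gender_association("engineer"))  # negative in this toy data
```

The point of the sketch is that nothing in the code mentions gender stereotypes; the association falls out of where the vectors (i.e., the corpus statistics) put the words.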

And this in turn reminded me of the short story “Wilson’s Singularity” from Lightspeed Magazine’s People of Colo(u)r Destroy Science Fiction! issue, in which the man now renowned for programming the AI that runs the world with (supposedly) even-handed fairness accidentally debiased (or rebiased, depending on one’s point of view) the AI during its initial development.

I didn’t do it on purpose. What I said to it was strictly my own point of view. It came up as naturally as it would have in discussion with friends at home. Unity wanted to understand our ways and our history, understand us. It read and watched the news, and its interpretation of current events was part of our debugging process. News stories of police violence against African-Americans that fanned community anger in the early part of this century caused me great pain and anger at the time. Most of my co-workers were white or Asian—there was one programmer of Indian descent—none of them were black. Most were a generation older than I was; those closer to my age were uninformed or uninterested in the issues that drove my life.

Unity noticed the difference between my responses and those of the others when the subject came up. Some ignored these stories, some took the side of the police, and others made jokes I won’t grace with repetition. It was a terrible time for me. I felt embattled—I was safe enough working inside the high-security project, but on the streets outside I felt like a target, as vulnerable as any of the fallen. I know all this is hard for you to understand now, even to comprehend.

Read the whole story, really. You won’t regret it.

Additionally, if you want more stories, most of the issue can be found online here. (“A Good Home” was another favorite of mine.) If you like that, there’s even more that was exclusive to the paid ebook version. (Disclosure: I copy edit for them, so I get to see everything in advance. This was one of my favorite issues to work on.)

Anyway, I found the intersection of cutting-edge science with science fiction really interesting and figured I should share.
