AI Is No Magic Bullet for Policing Hateful Content

Detailing the last 15 months at Facebook, Wired’s Nicholas Thompson and Fred Vogelstein penetrate the PR veil constructed by the company’s 300-person-strong communications team to illuminate the turmoil of that period. They lay out the logic behind decisions that seem baffling from afar, like top executives’ choice to wait five days before addressing the Cambridge Analytica scandal, and they toss out some true gems about the company’s less-than-savory culture, such as the name of Sheryl Sandberg’s conference room: Only Good News. (Even if that were intentionally ironic, which I doubt, it is the stuff of Aaron Sorkin’s dreams for his Social Network sequel screenplay.)

Wired’s story touches on a very public—and problematic—solution to one of Facebook and the contemporary Internet’s gravest issues. Beloved among Facebook execs and Mark Zuckerberg in particular, that solution is the use of AI to police harmful content, including hate speech. Describing the challenge facing Facebook as it attempts to use AI to identify hate speech, Thompson and Vogelstein write:

“Even a basic machine-learning system can pretty reliably identify and block pornography or images of graphic violence. Hate speech is much harder. A sentence can be hateful or prideful depending on who says it. ‘You not my bitch, then bitch you are done,’ could be a death threat, an inspiration, or a lyric from Cardi B. Imagine trying to decode a similarly complex line in Spanish, Mandarin, or Burmese. False news is equally tricky. Facebook doesn’t want lies or bull on the platform. But it knows that truth can be a kaleidoscope. Well-meaning people get things wrong on the internet; malevolent actors sometimes get things right.”

At the core of the struggle to identify and eliminate hateful content using machines—even ones capable of learning and improving—is a profound question: can human verbal language be reduced to math or code? Can a machine determine, with high accuracy and efficiency, that a user’s language constitutes hate speech without unduly striking down, say, every instance of a profane word?

Zuckerberg is clearly hoping and betting that the answer is yes. He has touted AI’s potential to clean up the Internet, including Facebook’s fake news- and hate-contaminated ecosystem. Facebook’s coffers reflect his unsurprising faith in technical solutions. “Over the past several years, the core AI team at Facebook has doubled in size annually,” Wired reports.

Yet Facebook has already learned that not all inappropriate content is created equal. The company’s systems flag spam and posts supporting terror before human users catch them 99% of the time, and they catch nudity before users do 96% of the time. For hate speech, the figure is strikingly low by comparison: just 52%.

In the lucid paragraph quoted above, Wired’s exposé explains well enough why hate speech is so hard to identify. To put it simply, language is slippery. Hate speech cannot consistently be boiled down to code. This is not only because the very words we generally take to be most profane can be meant and received as compliments or terms of affection depending on the context in which they are used. It is also because the meaning of a given word can turn completely on who is reading or listening to it. In daily life, offline as well as online, moments of communication that we tend to regard as intuitive often hinge on interpretations informed by individual experience and highly variable values. This has perhaps never been more evident than in the polarized age of Donald Trump’s presidency. Words taken as expressions of higher truth by many are intolerable, even hate speech, to others.
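To see why this cannot be boiled down to code, consider a deliberately naive sketch (my own illustration, not Facebook’s actual system): a keyword filter applied to the Cardi B example from the Wired quote. Because the filter sees only words, a death threat and a song lyric composed of the identical sentence receive the identical verdict.

```python
# Illustrative sketch only — not Facebook's classifier. A naive keyword
# filter flags posts by word lists, so it cannot tell a threat from a
# lyric: the context that decides the meaning never enters the function.

PROFANITY = {"bitch"}  # hypothetical word list for the example

def naive_flag(text: str) -> bool:
    """Flag any post containing a listed word, regardless of intent."""
    words = {w.strip(".,!?'\"").lower() for w in text.split()}
    return bool(words & PROFANITY)

sentence = "You not my bitch, then bitch you are done"

# Whether this sentence is a death threat or a Cardi B lyric depends
# entirely on who says it and where — information the filter never sees.
print(naive_flag(sentence))  # True either way
```

Real moderation systems are of course far more sophisticated than this, but the underlying problem scales with them: the signal that distinguishes hate from pride lives outside the text itself.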

“These are the kinds of problems that Facebook executives love to talk about,” Thompson and Vogelstein write, because, “They involve math and logic, and the people who work at the company are some of the most logical you’ll ever meet.”

All of that may be true. But if Facebook has failed egregiously in certain cases to moderate hateful content on its platform, a failure that has contributed to genocide in Myanmar and to other significant crimes around the world, it is perhaps precisely because the company’s executives view the problem of hate speech moderation through the lens of “math and logic.” If that were the appropriate primary lens, it would follow that Facebook could eliminate hate speech through the technical extension of those paradigms: more investment in programmers and machines.

On the contrary, the task Facebook must take up as it attempts to police hateful content is inseparable from political values, human judgment, and the interpretation of statements that must be parsed by well-trained eyes and bright minds with a stomach for horror to boot. While machines will play an indispensable role in content moderation on a platform of Facebook’s scale, they will be far from sufficient. That is because monitoring hate speech touches on nothing less than some of humanistic inquiry’s age-old questions: the violence of words, the status of truth, and the foundations of meaning in language.

If we treat monitoring hate speech as a highly complicated human problem and not simply as an obstacle to corporate growth, a PR issue, or a problem set, we realize that it is the kind of task that calls for a department of highly skilled workers unto itself. Avoiding future disasters like the events in Myanmar will no doubt require hiring tens of thousands of moderators, as Facebook has already done and continues to do, but those moderators will not be doing a job fit for meager pay, few if any benefits, and contractual labor. They will be taking on one of the great quandaries of the Information Age, and they will require educational backgrounds and professional training that demand more than the $28,800 earned by contractors at one of the firms from which Facebook pulls content moderators. (That income, reported by The Verge, is just a smidge above 10% of the $240,000 median compensation received by the company’s full-time employees.)

AI alone cannot fix Facebook and remove hate speech from the Internet. But acknowledging the more than technical scope of combating hate speech, establishing strong values in the context of divisive issues, and assembling teams of adequately paid moderators with complex understandings of culture, politics, and language would mark the start of a good-faith effort to do so. It would also set an example for smaller platforms and, in the long run, might just save lives.

Joe Zappa is the Managing Editor of Street Fight. He has spearheaded the newsroom's editorial operations since 2018. Joe is an ad/martech veteran who has covered the space since 2015. You can contact him at [email protected]