Breaking Down

How bias can be baked into machine learning

Francis Scialabba


Stay up to date on emerging tech

Drones, automation, AI, and more. The technologies that will shape the future of business, all in one newsletter.

Our lives are increasingly managed by software, which makes it tempting to trust the algorithms and focus on other things, like: why haven’t I had coffee yet? But outputs are only as good as inputs. When bias creeps into training data for machine learning and AI, it can amplify existing inequalities.

Top of mind

Facial recognition has recently made headlines, so let’s start there. In a 2019 study, researchers at Brookings estimated that the subjects of common facial recognition training datasets are 75% male and 80% white. Also in 2019, the National Institute of Standards and Technology (NIST) found “demographic differentials” in performance across the majority of 189 facial recognition algorithms from 99 vendors.

  • For one-to-one tests, such as unlocking a phone or checking a passport, algorithms generally had higher false positive rates for Black and Asian faces. Translation: There’s a higher likelihood another individual could pass as you.
  • For one-to-many tests, such as identifying a face from a database of photos, algorithms tended to produce higher false positive rates for Black women. Translation: There’s a greater chance you could be identified as a person of interest for a crime you weren’t involved in.
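
To make "higher false positive rates" concrete, here is a minimal Python sketch of how a per-group false positive rate could be tallied. The function name, the record layout, and the toy data are all hypothetical, made up for illustration; this is not NIST's methodology or code, just the basic arithmetic behind the term.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Return {group: false positive rate}.

    Each record is a (group, is_true_match, predicted_match) tuple.
    This layout is an assumption made for this sketch.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, is_true_match, predicted_match in records:
        # Only comparisons that should NOT match can yield a false positive.
        if not is_true_match:
            negatives[group] += 1
            if predicted_match:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items() if n > 0}


# Toy data, invented for illustration only.
records = [
    ("group_a", False, False),
    ("group_a", False, False),
    ("group_a", False, True),
    ("group_a", False, False),
    ("group_b", False, True),
    ("group_b", False, True),
    ("group_b", False, False),
    ("group_b", False, False),
    ("group_a", True, True),  # a genuine match, ignored in the rate
]
rates = false_positive_rate_by_group(records)
print(rates)
```

In this toy example, group_b's rate comes out at twice group_a's. A gap of that kind between groups is what a "demographic differential" refers to.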

As NIST noted, facial recognition isn’t a monolith: “Different algorithms perform differently.” Though the technology continues to get more accurate, a growing chorus of voices is calling for strict legal guardrails.

A sampling of other biases in algorithms

  • In October 2019, a study published in Science found “significant racial bias” in a commercial algorithm that guides healthcare treatment decisions for millions of U.S. patients.
  • A 2016 ProPublica investigation discovered racial disparities in criminal risk assessment software used by U.S. courtrooms to predict recidivism rates and guide sentencing decisions.
  • Amazon scrapped an internal machine learning recruiting engine after realizing it favored men’s resumes.

Looking forward

There are reasons to be optimistic. “Fairness,” “bias,” “consent,” and “ethics” have rapidly entered the lexicon of AI researchers. Tech teams are challenged to avoid the path of least resistance by using diverse datasets and auditing algorithms before shipping. There are also calls to diversify technical teams that have remained stubbornly homogeneous.
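
One simple form a pre-shipping dataset check might take is sketched below in Python: count how much of a training set each group makes up and flag any group under a chosen floor. The function names, the 40% floor, and the toy labels are assumptions for illustration, not an established standard or any team's actual audit.

```python
from collections import Counter


def group_shares(labels):
    """Return each group's share of the dataset as a fraction of the total."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def underrepresented_groups(labels, min_share):
    """List groups whose share falls below min_share (a hypothetical floor).

    Groups that are missing from the data entirely will not appear here,
    so a real audit would also compare against an expected list of groups.
    """
    shares = group_shares(labels)
    return sorted(g for g, share in shares.items() if share < min_share)


# Toy labels, invented to mirror a 75/25 split.
labels = ["men"] * 75 + ["women"] * 25
shares = group_shares(labels)
flagged = underrepresented_groups(labels, min_share=0.4)
print(shares, flagged)
```

A check like this only catches imbalance in the labels you thought to record. It complements, rather than replaces, testing a model's error rates group by group.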

Bottom line: AI is a tool with few creators but billions of users. There’s low tolerance for faulty tech with real-world impacts.

+ For more on the topic, many of you recommended Cathy O'Neil’s Weapons of Math Destruction. Or, for a shorter read, check out Vox’s excellent article on an algorithmic bill of rights.
