Artificial intelligence is quickly taking over many functions of our day-to-day lives. It can already schedule appointments, look things up on the internet, and compose texts and emails for us. But what happens when AI takes over more serious parts of our lives (though, texting is pretty important)?
That’s the challenge researchers and developers are facing. They want to develop more robust AI solutions, but they are running into a serious roadblock – bias.
Our artificial intelligence solutions are currently being trained by humans using data. The trouble is that the data (and sometimes the humans) doesn’t give the AI an unbiased place to start its work. As a friend of mine says, “It’s garbage in, garbage out.”
Examples of Bias in Artificial Intelligence
The first example is COMPAS, and it’s been well-documented. COMPAS was a tool that scored criminal offenders on a scale of 1 to 10, estimating how likely each one was to be re-arrested while awaiting trial. This is a common decision made during bond call in criminal cases: judges have had to weigh factors and estimate an offender’s likelihood of reoffending for decades, and COMPAS was designed to help them.
The trouble was that COMPAS gave higher risk scores to black offenders than to white offenders. There is a wonderful breakdown of how the algorithm worked from MIT, here. It’s important to note that the program did not explicitly factor in race, but it had the effect of unfairly targeting black offenders anyway. Without going into the math, the basic problem was that black offenders were more likely to be arrested in the first place (due to current and previous racial discrimination in policing), so the program predicted a higher chance of re-arrest for them, and that higher predicted chance of re-arrest translated into higher risk scores. Predictably, those predictions were also wrong more often for black offenders: black offenders who did not go on to reoffend were far more likely than their white counterparts to be labeled high risk, which consistently led to a higher percentage of black offenders being held in jail unnecessarily.
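To make that mechanism concrete, here is a minimal toy simulation in Python. It is not the actual COMPAS model (which is proprietary), and every number in it is invented purely for illustration: two groups reoffend at exactly the same rate, but one group is policed more heavily, so it racks up more recorded arrests. A risk score built only from prior arrests – with no race variable anywhere in the model – still flags far more of the over-policed group’s non-reoffenders as high risk.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two groups, A and B, with the SAME underlying rate of reoffending.
group = rng.integers(0, 2, n)        # 0 = group A, 1 = group B
reoffends = rng.random(n) < 0.30     # true behavior, identical for both groups

# Group B is policed more heavily, so its reoffenses are far more likely to be
# *recorded* as re-arrests. Recorded re-arrests are all the training data sees.
p_recorded = np.where(group == 0, 0.5, 0.9)
rearrested = reoffends & (rng.random(n) < p_recorded)

print(f"True reoffense rate   A: {reoffends[group == 0].mean():.2f}  "
      f"B: {reoffends[group == 1].mean():.2f}")       # both ~0.30
print(f"Recorded re-arrests   A: {rearrested[group == 0].mean():.2f}  "
      f"B: {rearrested[group == 1].mean():.2f}")      # ~0.15 vs ~0.27

# A "race-blind" input: prior arrest count. It also reflects policing intensity,
# so it runs higher for group B even though behavior is identical.
priors = rng.poisson(np.where(group == 0, 1.0, 2.0))

# A crude risk score: flag anyone with two or more prior arrests as high risk.
high_risk = priors >= 2

def false_positive_rate(g):
    """Share of group g that did NOT reoffend but was still flagged high risk."""
    did_not_reoffend = (group == g) & ~reoffends
    return high_risk[did_not_reoffend].mean()

print(f"False positive rate   A: {false_positive_rate(0):.2f}  "
      f"B: {false_positive_rate(1):.2f}")             # ~0.26 vs ~0.59
```

The point of the sketch is that dropping race from the inputs does not drop race from the data: the arrest records and prior counts carry the bias in with them.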
A program created to combat the prejudice that exists in our court system ended up failing at exactly that.
Our second example is the Allegheny Family Screening Tool, which was created to help humans determine whether a child should be removed from their family because of abusive circumstances. The designers knew that the data was biased from the start: it was likely to show that children from black or biracial homes were more likely to need intervention from the state, not because those children were actually at greater risk, but because those families were over-represented in the records the tool learned from.
The engineers couldn’t get around the faulty data. Data is the primary way that we train artificial intelligence, and the developers could not bypass this necessary training or fudge the numbers. Because they didn’t feel like they could combat the bias in the numbers, they opted to educate those with the most influence over the numbers going forward, explaining the faulty data and implicit bias to the users of the system – mostly judges (article, here).
This is a good example of how bias in the data can be challenging to overcome.
My last example comes from current facial recognition software. The top gender-classification systems from IBM, Microsoft, and Megvii (a Chinese company) can all correctly identify a person’s gender more than 99% of the time – if that person is a lighter-skinned man. For a dark-skinned woman, the error rate climbs to roughly 35% (article, here).
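One reason gaps like this stay hidden is that accuracy usually gets reported as a single overall number. The sketch below uses entirely synthetic data – the per-group accuracies are made up just to mirror the pattern described above, and are not real benchmark results – to show how a respectable-looking headline number can conceal a huge disparity once results are broken out by subgroup.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic audit results for a hypothetical gender classifier: one entry per test
# image, recording the subject's subgroup and whether the prediction was correct.
# The per-group accuracies (~99% and ~65%) are invented to mirror the gap above.
groups = np.array(["lighter-skinned men"] * 1000 + ["darker-skinned women"] * 1000)
correct = np.concatenate([
    rng.random(1000) < 0.99,
    rng.random(1000) < 0.65,
])

# A single headline number looks respectable...
print(f"Overall accuracy: {correct.mean():.1%}")        # ~82%

# ...until the same results are broken out by subgroup.
for g in np.unique(groups):
    print(f"{g}: {correct[groups == g].mean():.1%}")
```

Audits like the one behind the numbers above work in essentially this way: the same model is scored separately on each demographic subgroup rather than on the test set as a whole.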
There is no doubt that facial recognition software has a long way to go. That’s why it is so disturbing to see it being used heavily by law enforcement. Perhaps we will also see its use in contact-tracing for COVID. I believe this technology is likely to start trampling on our privacy rights over the next few years.
Why does it matter?
Bias in artificial intelligence matters because the very reason we want to use AI for these decisions is to avoid the biases that naturally exist in all humans. Computers could offer a genuinely consistent way to treat everyone fairly. We see how our courts, schools, and banks are biased on the basis of race and gender, and AI could provide us with a way past these prejudices. Then, as people who have traditionally been held down are lifted up, we may see some of these implicit biases melt away.
But we cannot train an AI to avoid bias with biased data. That is the challenge for developers today.
Garbage in, garbage out.