Bias in Machine Learning
Learn more about the challenges that come with machine learning.
by Jesse Bracho
Machine learning algorithms can be characterized as pattern finders, and the quality of those patterns is often evaluated by their predictive ability. These algorithms, and the utility their predictions provide, have been adopted with stunning speed across many industries. They are used to make life-changing decisions: they affect hiring processes,1, 2 medical diagnoses,3 and even which prisoners are selected for parole.4 Applied irresponsibly, machine learning algorithms can reproduce and reinforce systemic bias, with detrimental effects on the lives of individuals.
The problem of bias boils down to a fundamental issue: machine learning algorithms base their predictions on associative (correlative) inferences. They group together things that appear to be related, and the differences between many of the algorithms boil down to how they group things. It has been repeated to the point of exhaustion, but it is still extremely important: “correlation does not imply causation.” It is impossible to determine cause and effect between phenomena based on association alone, and machine learning algorithms, however sophisticated, do nothing to establish a causal link. Causal inference is the most difficult kind of inference to make, and it requires structuring a problem in a specific way. For example, it has been observed that machine learning algorithms have a particularly difficult time with medical diagnoses.
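To make the correlation-versus-causation point concrete, here is a small Python sketch of my own construction (the data and variable names are invented, not drawn from any cited study): two quantities come out strongly correlated only because a third, hidden cause drives both of them.

```python
import random

random.seed(1)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hidden common cause: temperature drives both variables below.
temps = [random.uniform(0, 35) for _ in range(500)]
ice_cream = [2.0 * t + random.gauss(0, 3) for t in temps]   # caused by heat
drownings = [0.5 * t + random.gauss(0, 2) for t in temps]   # also caused by heat

# Strong correlation, yet neither variable causes the other.
print(pearson(ice_cream, drownings))
```

A purely associative model handed `ice_cream` as a feature would happily use it to predict `drownings`; only the way the data was generated, which the model never sees, reveals that temperature is the true cause.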
“... why do existing approaches struggle with differential diagnosis? All existing diagnostic algorithms ... rely on associative inference—they identify diseases based on how correlated they are with a patient's symptoms and medical history. This is in contrast to how doctors perform diagnosis, selecting the diseases which offer the best causal explanations for the patients symptoms… We argue that diagnosis is fundamentally a counterfactual inference task.”3
So how are machine learning algorithms effective at all? Because the person using the algorithm puts effort, to the best of their ability, into helping it internalize the most predictive associations possible. The hope is that the algorithm internalizes an “underlying” pattern and comes up with associations that hold up, or predict well, “in most cases.” An entire category of problems is dedicated to describing this challenge.
Here are four common machine learning challenges:
“Garbage in, garbage out.” These algorithms do not evaluate the plausibility of their associations; they simply associate. Erroneous data will lead to erroneous associations, and outliers and extremes will skew them. This is where many data scientists spend much of their time, trying to “clean up” the data.
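As a toy illustration of “garbage in, garbage out” (a constructed example, not from the article’s sources), consider how a single erroneous record skews even the simplest learned summary, a mean:

```python
def mean(values):
    """The simplest possible 'model' of a quantity: its average."""
    return sum(values) / len(values)

clean = [21.0, 22.5, 20.8, 23.1, 21.9]  # plausible temperatures (°C)
dirty = clean + [999.0]                  # one data-entry error slips in

clean_estimate = mean(clean)   # about 21.9 — a sensible summary
dirty_estimate = mean(dirty)   # about 184.7 — wildly skewed by one row
print(clean_estimate, dirty_estimate)
```

More sophisticated algorithms degrade in the same way, just less visibly, which is why so much effort goes into data cleaning before any model is trained.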
Overfitting occurs when a machine learning model learns associations that are too specific to its training data. For example, you might predict rain in certain seasons based on the part of the world you live in, and your predictions may be quite accurate there. But those same predictions would be inaccurate when applied to other parts of the world. Your observations, your “training data,” are too limited, despite your predictions performing well locally.
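An extreme form of overfitting is pure memorization. This toy sketch (my own construction, using invented place names) shows a “model” that is perfect on its training data but whose memorized rule transfers poorly anywhere else:

```python
# Training data the model has memorized exactly.
train = {("Seattle", "winter"): "rain", ("Seattle", "summer"): "dry"}

def memorizing_model(city, season):
    # Overfit: return the exact remembered label; blindly guess "dry" otherwise.
    return train.get((city, season), "dry")

print(memorizing_model("Seattle", "winter"))  # "rain" — perfect in-sample
print(memorizing_model("Mumbai", "summer"))   # "dry" — wrong during the monsoon
```

Real models rarely memorize this literally, but the failure mode is the same: associations tuned so tightly to the training set that they break on data drawn from anywhere else.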
Underfitting occurs when the model is too simple to make accurate predictions. Any associations the model learns are bound to be inaccurate because the underlying reality of the world is complex. Going back to the weather example, if your rule for predicting rain is to associate it with a specific day of the week, it doesn’t matter how much data you look at - the way you’re thinking about the weather is fundamentally oversimplified.
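The day-of-the-week rule can be sketched directly. In this synthetic setup of my own (the data is invented so that rain depends only on the season), no amount of extra data rescues a model restricted to the wrong feature:

```python
# Synthetic observations: rain is driven by season, never by weekday.
data = [(day, season, season == "wet")
        for day in range(7)
        for season in ("wet", "dry")
        for _ in range(100)]  # 1400 rows

# Underfit rule: predict rain only on the single "rainiest" weekday.
rain_by_day = {}
for day, _, rained in data:
    rain_by_day[day] = rain_by_day.get(day, 0) + rained
rainiest = max(rain_by_day, key=rain_by_day.get)

acc_day = sum((day == rainiest) == rained
              for day, _, rained in data) / len(data)

# A season-aware rule succeeds on the very same volume of data.
acc_season = sum((season == "wet") == rained
                 for _, season, rained in data) / len(data)

print(acc_day, acc_season)  # 0.5 vs 1.0
```

The weekday model sits at coin-flip accuracy however many rows it sees, because the representation of the problem, not the quantity of data, is what is broken.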
So you might say there is more to the weather than whether it rained and the day of the week; you want to take more things into account. The items you decide to take into account are known as features. With better features, a model can potentially make better associations, but you might also include useless features. You might expand your weather feature set to include humidity levels, how cloudy it is, and whether or not your car was running that day. One of these is not like the others.
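One way to spot the odd feature out is to check how well each one predicts the outcome on its own. This is an illustrative sketch of my own (the data is synthetic, generated so that humidity drives rain while the car is pure noise):

```python
import random

random.seed(42)

rows = []
for _ in range(1000):
    humidity = random.random()            # 0..1, the real driver of rain
    rain = humidity > 0.6
    car_running = random.random() > 0.5   # irrelevant to the weather
    rows.append((humidity, car_running, rain))

def accuracy(predict):
    """Fraction of rows where a single-feature rule matches the outcome."""
    return sum(predict(h, c) == r for h, c, r in rows) / len(rows)

acc_humidity = accuracy(lambda h, c: h > 0.6)  # informative feature
acc_car = accuracy(lambda h, c: c)             # useless feature, ~chance
print(acc_humidity, acc_car)
```

The humidity rule tracks the outcome almost perfectly, while the car rule hovers near 50%, no better than guessing, which is exactly the signal a data scientist uses to prune features that carry no information.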
The common thread among all these problems is the data - and what data is selected is at the discretion of the data scientist. It’s important to keep in mind that these algorithms are not some Hitchhiker’s Guide style supercomputer with the ability to divine answers. Rather, they are finicky, very input-dependent tools that require human supervision along their entire lifecycle.
Current machine learning algorithms are really just a form of automated and sometimes productive bias. “Here again, we encounter bias as a generative and inevitable precondition for ML, as labelling (making predictions) both assumes and decides that the distribution of data is not random but matters deeply with respect to the patterns sought. There is nothing wrong with this, as long as one is aware of potential blind spots, skewed mappings and the ensuing invisibilities.”6
Ethically Problematic Bias
The bias mechanism can also be ethically problematic. In today’s society, nearly all sensitive and personal data about a person is collected and, in all likelihood, used in algorithms: features such as race, ethnicity, gender, sexual orientation or political opinion.6 It is tacitly accepted that machine learning algorithms can be allowed to work on and generalize across these features because of the aura of objectivity these algorithms provide.
“We have turned to machine learning, an ingenious way of disclaiming responsibility for anything. Machine learning is like money laundering for bias. It's a clean, mathematical apparatus that gives the status quo the aura of logical inevitability. The numbers don't lie.”5
There is also the problem of autonomy. These algorithms do not operate in a vacuum; they often produce outcomes that influence or “nudge” the choices of end users, choices which are in turn fed back into the algorithm as more data.
“This term coined by Thaler and Sunstein (2008) refers to the practice of influencing choice by “organizing the context in which people make decisions” (Thaler et al., 2013, p. 428; see also nudge). A frequently mentioned example is how food is displayed in cafeterias, where offering healthy food at the beginning of the line or at eye level can contribute to healthier choices. Choice architecture includes many other behavioral tools that affect decisions, such as defaults, framing, or decoy options.”6
The Technical Problem of Bias
As we can see, machine learning is not protected from bias - and it’s more than just a question of ethics. The field of machine learning is incomplete and still iterating. As engineers and scientists, it’s important for us to consider the impact of bias in our own work and in that of others. While it may be difficult to avoid bias, we can work to correct it and build models that take it into account. Machine learning algorithms aren’t foolproof because their creators are not foolproof. Adding human review can help, as can level-setting with your team. At the end of the day, machine learning is a tool in our arsenal, not a magic solution to all data problems.