What is bias in machine learning


We all have biases. They might be unconscious, but they’re there nonetheless. And while they might not always be bad, they can sometimes lead to unintended consequences in our lives and work. In this blog post, we will explore what bias is and what it means for the field of machine learning. We’ll also look at ways to identify and overcome our own biases so that we can make better decisions in the future.

What is bias in machine learning?

Bias in machine learning is a term used to describe the ways that artificial intelligence (AI) systems can be unintentionally discriminatory. Several distinct types of bias exist in machine learning.

All of them arise because AI systems learn patterns from data rather than from explicit human instruction. Consequently, these systems inherit whatever skews and gaps exist in the data they are given, and can favour particular data and interpretations.

The three main types of bias in machine learning are:

1) Selection bias: This occurs when the data the system learns from over-represents some cases and leaves out other data that may be relevant. This can lead to inaccurate predictions based on the unrepresentative sample.

2) Appraisal bias: This occurs when the AI system evaluates data in a biased way, often favouring information that it has been explicitly taught to recognise as important. Over-weighting that favoured information can lead to inaccurate predictions or decisions.

3) Generalisation bias: This happens when the AI system makes assumptions about how things will behave based on limited experience or knowledge. These assumptions can lead to incorrect predictions or decisions if they are used in future situations where those assumptions may not hold true.
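To make selection bias concrete, here is a minimal sketch in pure Python using synthetic, made-up numbers: an estimate computed from a sample that only ever covers one group is systematically off for the population as a whole.

```python
import random

random.seed(42)

# Synthetic "population": two groups with different underlying means
# (the group names and numbers are invented for illustration).
group_a = [random.gauss(5.0, 1.0) for _ in range(1000)]
group_b = [random.gauss(8.0, 1.0) for _ in range(1000)]
population = group_a + group_b

true_mean = sum(population) / len(population)  # close to 6.5

# Selection bias: the sample the system learns from only contains group A,
# so its estimate is systematically low for the population.
biased_sample = group_a[:200]
biased_mean = sum(biased_sample) / len(biased_sample)  # close to 5.0

print(round(true_mean, 1))
print(round(biased_mean, 1))
```

No amount of extra data from group A fixes this: the error is systematic, not random, which is what makes it a bias rather than noise.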

Types of bias in machine learning

Machine learning algorithms are often accused of exhibiting bias. What does this mean and why is it a problem?

In the simplest terms, bias in machine learning refers to any systematic deviation from correct predictions caused by the algorithm itself, rather than by random noise. One common manifestation is underfitting: the model is too simple to capture how particular features relate to the outcome, so its errors follow a consistent pattern. In practice, this means that the algorithm will tend to predict outcomes more accurately for certain classes of data (e.g. those belonging to a certain group or those with certain features) than for others. Why is this problematic?
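To make "systematic deviation" concrete, here is a minimal sketch (pure Python, synthetic data) of underfitting: a straight line fitted to curved data errs in a consistent pattern, overshooting in one region and undershooting in another, rather than erring randomly.

```python
# True signal is y = x^2, but the model family is a straight line --
# too simple to represent the signal, i.e. a high-bias model.
xs = [x / 10 for x in range(-20, 21)]  # inputs in [-2, 2]
ys = [x * x for x in xs]

# Least-squares straight line y = a*x + b (closed form, no libraries).
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
mid = [r for x, r in zip(xs, residuals) if abs(x) < 1]
edge = [r for x, r in zip(xs, residuals) if abs(x) >= 1]

# The errors are systematic: the line overshoots in the middle
# (negative residuals) and undershoots at the edges (positive residuals).
print(sum(mid) / len(mid) < 0)
print(sum(edge) / len(edge) > 0)
```

If the errors were random noise rather than bias, the average residual in each region would hover around zero instead of showing this consistent pattern.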

First and foremost, biased systems make incorrect predictions. For instance, if an algorithm built to predict cancer rates systematically gives low risk estimates for people with dark skin colour, those errors could delay diagnosis and endanger them.

Secondly, biased predictions can harm individual users and groups of users alike. For example, if an algorithm predicts that certain people have a high chance of developing diabetes, institutions relying on it might unfairly penalise those already at risk. Machine learning circles refer to this kind of harm as algorithmic discrimination, and numerous studies have documented it.

Lastly, biased algorithms might over-weight specific types of data to boost headline performance. This can have unforeseen consequences because different types of data tend to behave differently, so gains on the favoured data often come at the expense of everything else.


Bias in machine learning refers to the unintended effects that can occur when a computer is making decisions based on data. These effects can dramatically reduce the accuracy and usefulness of predictions that a machine learning algorithm makes. It can even cause it to draw incorrect conclusions. In order to avoid bias, you need to understand how it works and take steps to reduce its impact. By doing so, you will be able to harness the power of machine learning while minimizing its potential flaws.
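One concrete step towards reducing that impact is a simple audit: break accuracy out per group rather than trusting a single aggregate number. The sketch below uses entirely hypothetical data, where each row is (group, true label, predicted label).

```python
# Hypothetical evaluation data: (group, true label, predicted label).
predictions = [("a", 1, 1), ("a", 1, 1), ("a", 0, 0), ("a", 1, 1),
               ("b", 1, 0), ("b", 0, 1), ("b", 1, 1), ("b", 0, 0)]

def accuracy_by_group(rows):
    """Return the fraction of correct predictions for each group."""
    totals, hits = {}, {}
    for group, truth, pred in rows:
        totals[group] = totals.get(group, 0) + 1
        hits[group] = hits.get(group, 0) + (truth == pred)
    return {g: hits[g] / totals[g] for g in totals}

print(accuracy_by_group(predictions))
# {'a': 1.0, 'b': 0.5}
```

The overall accuracy here is 75%, which looks respectable, yet the model is perfect for group "a" and no better than a coin flip for group "b". A single aggregate metric would have hidden exactly the kind of bias this post describes.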


