What is bias in machine learning

We all have biases. They might be unconscious, but they’re there nonetheless. And while they are not always harmful, they can lead to unintended consequences in our lives and work. In this blog post, we will explore what bias is, what it means for the field of machine learning, and why recognising it helps us make better decisions in the future.

What is bias in machine learning?

Bias in machine learning is a term used to describe the ways that artificial intelligence (AI) systems can produce systematically skewed, and sometimes unintentionally discriminatory, results.

There are different types of bias in machine learning, but most of them stem from the fact that AI systems learn patterns from data. If that data is incomplete, unrepresentative, or shaped by earlier human decisions, the system can end up biased towards certain kinds of data and interpretations over others.

Three types of bias that come up in machine learning are:

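1) Selection bias: This occurs when the data the AI system learns from is not representative of the cases it will later be applied to, for example because relevant data was left out when the dataset was put together. This can lead to inaccurate predictions, because they reflect the chosen data rather than the full picture.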

2) Appraisal bias: This occurs when data is evaluated in a skewed way, often because the AI system favours information that it has been explicitly taught to recognise as important. This can result in inaccurate predictions or decisions based on what is seen as valuable data. The name is not a standard one; the idea overlaps with what is more often called measurement or evaluation bias.

3) Generalisation bias: This happens when the AI system makes assumptions about how things will behave based on limited experience or knowledge. These assumptions can lead to incorrect predictions or decisions when the system is later used in situations where they do not hold.
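
To make selection bias a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the two groups, their typical values, and the 90/10 split of the skewed sample do not come from any real dataset. It only shows how an estimate drawn from an unrepresentative sample can drift away from the value for the whole population.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: two equally sized groups with different typical values.
group_a = [random.gauss(50, 5) for _ in range(5000)]
group_b = [random.gauss(70, 5) for _ in range(5000)]
population = group_a + group_b

# Representative sample: every member has the same chance of being drawn.
representative = random.sample(population, 500)

# Skewed sample: 90% of the rows come from group A, only 10% from group B.
skewed = random.sample(group_a, 450) + random.sample(group_b, 50)

true_mean = statistics.mean(population)
rep_error = abs(statistics.mean(representative) - true_mean)
skew_error = abs(statistics.mean(skewed) - true_mean)

print(f"population mean:             {true_mean:.1f}")
print(f"error, representative sample: {rep_error:.1f}")
print(f"error, skewed sample:         {skew_error:.1f}")
```

The skewed sample under-represents group B, so its average sits much closer to group A's typical value than to the population average. A model trained on such a sample would inherit the same distortion.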

Types of bias in machine learning

Machine learning algorithms are often accused of exhibiting bias. What does this mean and why is it a problem?

In the simplest terms, bias in machine learning refers to any systematic deviation of a model's predictions from the true outcomes, caused by the algorithm itself or by the data it was trained on. In the statistical sense, high bias shows up as under-fitting, where the model is too simple to capture how particular features relate to the outcome; over-fitting is the opposite problem, where the model follows noise in its training data too closely. In practice, bias often means that outcomes for certain classes of data (e.g. those belonging to a certain group or those with certain features) will tend to be predicted more accurately than others. Why is this problematic?
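
One simple way to check whether some groups are predicted more accurately than others is to compute accuracy separately for each group. The sketch below assumes a tiny, hand-written set of records; the group names, labels, and predictions are hypothetical, and `accuracy_by_group` is a helper written for this post rather than a function from any library.

```python
from collections import defaultdict

# Hypothetical records: (group, true_label, predicted_label).
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0), ("A", 1, 1),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 0), ("B", 0, 1), ("B", 1, 1),
]


def accuracy_by_group(rows):
    """Return {group: fraction of rows where the prediction matches the label}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, y_true, y_pred in rows:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


per_group = accuracy_by_group(records)
overall = sum(int(y_true == y_pred) for _, y_true, y_pred in records) / len(records)
gap = max(per_group.values()) - min(per_group.values())

print(f"overall accuracy: {overall:.2f}")
for group, acc in sorted(per_group.items()):
    print(f"group {group} accuracy: {acc:.2f}")
print(f"accuracy gap between groups: {gap:.2f}")
```

On this toy data the overall accuracy is 0.70, which hides the fact that group A is predicted perfectly while group B is right only 40% of the time. Looking at the overall number alone would miss the gap.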

First and foremost, it can lead to incorrect predictions: for example, if an algorithm is designed to predict cancer rates but mistakenly predicts low rates for groups of people with dark skin colour, that would be wrong and could potentially put them at risk. Secondly, biased predictions can have a negative impact on individual users and groups of users alike. For example, an algorithm that predicts who has a high chance of getting diabetes may unfairly single out people who are already at risk and make their situation worse. When the people affected are defined by a characteristic such as race or sexual orientation, the result is a form of discrimination, sometimes described as algorithmic discrimination, and cases of it have been documented in published studies. Finally, biased algorithms can favour specific types of data over others in order to improve performance. This can have unforeseen consequences because different types of data tend to behave ...


Bias in machine learning refers to the unintended effects that can occur when a computer makes decisions based on data. These effects can seriously reduce the accuracy and usefulness of the predictions made by a machine learning algorithm, and can even lead to incorrect conclusions being drawn. Bias can rarely be removed entirely, so the goal is to understand how it arises and take steps to reduce its impact. By doing so, you will be able to harness the power of machine learning while minimising its potential flaws.


