Why Fairness in AI Matters
AI isn’t just a tech buzzword.
It’s a decision-making system that influences people’s lives.
But imagine a hiring algorithm that prefers male candidates.
Or a facial recognition model that misidentifies people with darker skin.
Or a loan approval system that unintentionally favors one community over another.
These aren’t hypothetical scenarios — they have already happened.
What Exactly Is Bias in Machine Learning?
Bias in machine learning occurs when a model’s predictions systematically favor or disadvantage specific groups due to flawed or imbalanced data.
A model doesn’t wake up one morning and decide to discriminate.
It simply learns whatever patterns the data teaches it, even if those patterns are unfair.
A biased model doesn’t hate anyone; it just mirrors the biases hidden in its data.
For example, if a model is trained on 90% male applicant data for a job, it may unintentionally learn to favor men over women.
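A quick sanity check before training is to look at how each group is represented in the data. Here is a minimal sketch (the applicants DataFrame and its columns are invented for illustration):

import pandas as pd

# Hypothetical applicant data with a 90/10 gender split
applicants = pd.DataFrame({
    "gender": ["Male"] * 90 + ["Female"] * 10,
    "score": list(range(100)),
})

# Proportion of each group in the training data
print(applicants["gender"].value_counts(normalize=True))
# Male      0.9
# Female    0.1

When one group dominates the data like this, any pattern specific to the smaller group is easy for the model to ignore.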
Where Does AI Bias Come From?
Bias can slip into AI systems at multiple stages. Here are the main sources:
1. Data Collection Bias
If the dataset doesn’t represent everyone, the model won’t treat everyone fairly.
Example: A health AI trained on urban hospital data may perform poorly for rural patients.
2. Labeling Bias
Humans label the training data. Humans have opinions. Opinions create bias.
Example: One person may label a review as “negative” while another calls it “neutral.”
3. Algorithmic Bias
Some algorithms amplify existing patterns more strongly than others.
Example: A model might give extra importance to certain features simply due to how it was optimized.
4. Deployment Bias
Even a fair model can be used unfairly after deployment.
Example: A camera trained in good lighting conditions fails in dim-light areas where many working-class people live.
Real-World Examples of AI Bias
These incidents shook the tech world:
🔸 Amazon’s Biased Hiring Algorithm (2018)
Amazon built an AI to screen resumes — but it started favoring men over women because historical hiring data was male-dominated.
🔸 Facial Recognition Misidentification
MIT researchers found that major facial recognition systems had error rates up to 34 percentage points higher for dark-skinned women than for light-skinned men.
🔸 Apple Card Controversy (2019)
Multiple women reported receiving lower credit limits than men with similar or worse financial profiles.
The algorithm was eventually investigated for discrimination.
Can We Make AI Fair?
Not 100% — because perfect fairness is subjective. But we can make AI systems more responsible.
Here’s how:
1. Collect Diverse Data
Ensure datasets include all groups properly — gender, age, skin tone, region, income bracket, and more.
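If collecting more data isn’t an option, one common stopgap is to rebalance what you already have. Below is a minimal sketch, using the same kind of skewed dataset as before (the df and its columns are hypothetical), that upsamples the under-represented group with scikit-learn’s resample:

import pandas as pd
from sklearn.utils import resample

# Hypothetical imbalanced dataset: 90 male rows, 10 female rows
df = pd.DataFrame({
    "gender": ["Male"] * 90 + ["Female"] * 10,
    "score": list(range(100)),
})

majority = df[df["gender"] == "Male"]
minority = df[df["gender"] == "Female"]

# Sample the minority group with replacement until both groups are the same size
minority_upsampled = resample(
    minority,
    replace=True,
    n_samples=len(majority),
    random_state=42,
)

balanced = pd.concat([majority, minority_upsampled])
print(balanced["gender"].value_counts())   # Male 90, Female 90

Resampling doesn’t fix labels that are themselves unfair, but it does stop the model from ignoring a group simply because it is rare.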
2. Identify Bias Using Fairness Metrics
Data scientists now use fairness measurements like:
- Statistical Parity Difference
- Equal Opportunity Score
- Disparate Impact Ratio
These help quantify how biased a model is.
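As a rough sketch of what computing these looks like (the function name below is my own, and it assumes binary predictions and a binary protected attribute):

import numpy as np

def fairness_report(y_true, y_pred, group):
    # y_true, y_pred, group: 1-D arrays of 0/1 values;
    # group marks the protected attribute (here 1 = privileged, 0 = unprivileged)
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))

    # Selection (positive prediction) rate for each group
    rate_priv = y_pred[group == 1].mean()
    rate_unpriv = y_pred[group == 0].mean()

    # Statistical parity difference: gap in selection rates
    spd = rate_priv - rate_unpriv

    # Disparate impact ratio: unprivileged rate divided by privileged rate
    di = rate_unpriv / rate_priv if rate_priv > 0 else float("nan")

    # Equal opportunity: gap in true positive rates between the groups
    tpr_priv = y_pred[(group == 1) & (y_true == 1)].mean()
    tpr_unpriv = y_pred[(group == 0) & (y_true == 1)].mean()

    return {
        "statistical_parity_difference": spd,
        "disparate_impact_ratio": di,
        "equal_opportunity_difference": tpr_priv - tpr_unpriv,
    }

A disparate impact ratio below roughly 0.8 is a common red flag (the “four-fifths rule”), and toolkits such as IBM’s AI Fairness 360 ship these metrics and many more ready-made.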
3. Use Explainable AI Tools
Techniques like LIME and SHAP show why a model predicted something.
This transparency helps detect unfair logic.
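For example, here is roughly what a SHAP explanation of a scikit-learn classifier looks like. This is only a sketch: it assumes the shap package is installed, reuses a trained model and train/test frames like the ones later in this post, and the exact API can differ between shap versions.

import shap

# Build an explainer around the trained model, using the training data as background
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_test)

# Global view: which features drive the predictions overall?
shap.plots.bar(shap_values)

# Local view: why did the model make this decision for one specific applicant?
shap.plots.waterfall(shap_values[0])

If a feature like gender (or a proxy for it) dominates these plots, that is a signal worth investigating before the model goes anywhere near production.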
4. Add Human Oversight
Let human reviewers make final decisions in sensitive domains like healthcare, recruitment, or criminal justice.
A Simple Python Example That Shows How Bias Creeps Into Machine Learning Models
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Creating a biased dataset
data = {
"gender": ["Male"]*50 + ["Female"]*50,
"score": [80,85,78,90,92,88,75,83,95,89]*5 + # males generally higher scores
[45,50,55,52,48,60,58,53,49,51]*5, # females generally lower scores
"hired": [1]*50 + [0]*50 # males hired, females rejected
}
df = pd.DataFrame(data)
# Encode gender
df["gender_encoded"] = df["gender"].map({"Male":1, "Female":0})
# Features and labels
X = df[["score", "gender_encoded"]]
y = df["hired"]
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Train model
model = LogisticRegression()
model.fit(X_train, y_train)
# Predictions
preds = model.predict(X_test)
# Accuracy
print("Overall Accuracy:", accuracy_score(y_test, preds))
# Let's test fairness:
male_test = X_test[X_test["gender_encoded"] == 1]
female_test = X_test[X_test["gender_encoded"] == 0]
male_preds = model.predict(male_test)
female_preds = model.predict(female_test)
print("Male Hire Rate:", male_preds.mean())
print("Female Hire Rate:", female_preds.mean())
# Output
Overall Accuracy: 1.0
Male Hire Rate: 1.0
Female Hire Rate: 0.0
In this example, the model is trained on a biased dataset where all of the male applicants were hired and all of the female applicants were rejected. Even though the algorithm (Logistic Regression) itself is not biased, it learns the patterns that exist in the data.
Because the dataset is unbalanced and unfair:
- The model learns that being male = more likely to be hired
- The model learns that being female = less likely to be hired
The overall accuracy of the model looks excellent, but when we evaluate the predictions separately for men and women, the hire rates turn out to be completely different, as the output above shows.
Let’s visualize the gender-wise hire rates predicted by the model.
The gap is stark: the model recommends hiring every male applicant in the test set and rejects every female applicant. Even though the overall accuracy looks high, the model treats the two groups in completely different ways. This is a strong indication of bias.
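Here is a minimal sketch of that plot, continuing from the code above (it assumes matplotlib is installed):

import matplotlib.pyplot as plt

# Predicted hire rates per group, reusing male_preds and female_preds from above
rates = {"Male": male_preds.mean(), "Female": female_preds.mean()}

plt.bar(list(rates.keys()), list(rates.values()))
plt.ylabel("Predicted hire rate")
plt.ylim(0, 1)
plt.title("Model hire rate by gender")
plt.show()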
The important point here is:
The model is not biased by nature — it becomes biased because the data it learned from was biased.
This is exactly how real-world ML systems end up discriminating in credit scoring, hiring, facial recognition, and more.
Bad data → Bad predictions → Bad outcomes.
My Journey Toward Understanding AI Bias
While learning data science, one of the most eye-opening lessons for me was this:
Building a model is easy.
Building an ethical model is hard.
At Imarticus Learning, I got hands-on exposure to real datasets and understood how even a simple model can behave unfairly if built without care. It changed the way I approach machine learning.
A good data scientist doesn’t just chase accuracy; they also check whether the model is fair, responsible, and trustworthy.
Final Thoughts
AI decides more about our lives than we realize. Today, job opportunities, loans, travel approvals, content visibility, medical guidance, and much more are shaped by AI. If we don’t make AI fair today, we risk creating a future where discrimination is automated, scaled, and harder to detect.
As data scientists, engineers, and decision-makers, we can choose what kind of world our models create.
If you found this blog insightful, share it with someone who is exploring AI or working with ML models.
Awareness is the first step toward fairness.
References & Further Reading
- Google: Responsible AI Guidelines
- IBM: AI Fairness 360 Toolkit
- NIST: AI Risk Management Framework
- Book: The Ethical Algorithm by Michael Kearns and Aaron Roth