The Ethics of AI: Bias and Fairness
AI is making consequential decisions. Who gets a loan. Who gets an interview. Who gets parole. And increasingly, these systems are making biased decisions.
This isn’t a future problem. It’s happening now.
Real-World Examples
Amazon’s Hiring Algorithm
Amazon built a system to score resumes. It learned from 10 years of hiring data. The problem? Historical hiring was male-dominated in tech roles.
The model penalized resumes containing the word “women’s” (e.g., “women’s chess club captain”) and demoted graduates of all-women’s colleges.
Amazon scrapped the system.
COMPAS Recidivism Algorithm
Courts use COMPAS to predict recidivism (likelihood of re-offending). A 2016 ProPublica analysis found it was biased against Black defendants:
- Black defendants were falsely labeled high-risk at nearly twice the rate of white defendants
- White defendants were falsely labeled low-risk at higher rates
Roughly the same overall accuracy, but a very different distribution of harm.
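Disaggregating the confusion matrix is what makes this kind of disparity visible. A minimal sketch in pure Python, using entirely hypothetical group names and numbers (not the actual COMPAS data):

```python
from collections import defaultdict

def group_error_rates(records):
    """False positive rate (FPR) and false negative rate (FNR) per group,
    from (group, actual, predicted) triples where 1 = high-risk."""
    counts = defaultdict(lambda: {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
    for group, actual, predicted in records:
        if actual == 0:
            counts[group]["fp" if predicted == 1 else "tn"] += 1
        else:
            counts[group]["tp" if predicted == 1 else "fn"] += 1
    rates = {}
    for group, c in counts.items():
        fpr = c["fp"] / (c["fp"] + c["tn"]) if (c["fp"] + c["tn"]) else 0.0
        fnr = c["fn"] / (c["fn"] + c["tp"]) if (c["fn"] + c["tp"]) else 0.0
        rates[group] = {"fpr": fpr, "fnr": fnr}
    return rates

# Toy records (illustrative only): (group, actual, predicted)
records = [
    ("a", 0, 1), ("a", 0, 1), ("a", 0, 0), ("a", 0, 0), ("a", 1, 1),
    ("b", 0, 1), ("b", 0, 0), ("b", 0, 0), ("b", 0, 0), ("b", 1, 0),
]
rates = group_error_rates(records)
# group "a" has twice group "b"'s false positive rate (0.5 vs 0.25),
# even though each group's errors look unremarkable in aggregate
```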
Google Photos
In 2015, Google Photos auto-tagged photos of Black people as “gorillas.” The algorithm learned from training data with insufficient diversity.
How Bias Enters AI Systems
Data Collection Bias
If your training data is biased, your model will be biased:
- Historical data reflects historical discrimination
- Underrepresented groups have less data
- Collection methods can exclude populations
Labeling Bias
Human annotators bring their biases:
- Different cultural interpretations
- Inconsistent labeling across demographics
- Subjective categories
Feature Selection Bias
Choosing what to measure matters:
- Using zip codes as proxies for race
- “Years of experience” favoring traditionally employed groups
- Seemingly neutral features with disparate impact
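A quick way to spot a proxy is to check how strongly a “neutral” feature correlates with a protected attribute. A sketch with hypothetical numbers, using a plain Pearson correlation:

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (pstdev(xs) * pstdev(ys))

# Hypothetical data: zip-code median income vs. protected group membership (0/1).
# Values are invented purely to illustrate the check.
zip_income = [32, 35, 30, 80, 85, 90]
group      = [1, 1, 1, 0, 0, 0]
r = pearson(zip_income, group)
# r is strongly negative here: the "neutral" income feature
# almost perfectly encodes group membership
```

A strong correlation doesn’t prove disparate impact on its own, but it flags features that deserve scrutiny before training.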
Algorithmic Bias
Even with perfect data, algorithms can:
- Optimize for majority groups
- Amplify small biases
- Create new discriminatory patterns
Defining Fairness
There’s no single definition. Common framings:
Demographic Parity
Positive outcomes should be equal across groups.
P(Ŷ=1 | A=0) = P(Ŷ=1 | A=1)
Problem: Ignores actual differences in base rates.
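Measured directly from model outputs, the demographic parity gap is just a difference in positive-prediction rates. A minimal sketch with hypothetical group names and predictions:

```python
def demographic_parity_gap(preds_by_group):
    """Gap between the highest and lowest P(Y_hat = 1 | A = a) across groups.
    Each value in preds_by_group is a list of 0/1 predictions."""
    rates = {g: sum(p) / len(p) for g, p in preds_by_group.items()}
    return max(rates.values()) - min(rates.values()), rates

# Toy predictions (1 = positive outcome, e.g. "approve loan")
gap, rates = demographic_parity_gap({
    "group_0": [1, 1, 0, 0, 1],   # 60% positive
    "group_1": [1, 0, 0, 0, 0],   # 20% positive
})
# gap of 0.4: demographic parity is badly violated
```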
Equal Opportunity
True positive rates should be equal across groups.
P(Ŷ=1 | Y=1, A=0) = P(Ŷ=1 | Y=1, A=1)
Fair for qualified individuals.
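Equal opportunity is checked by restricting to the truly positive examples (Y = 1) and comparing true positive rates. A sketch with hypothetical (actual, predicted) pairs:

```python
def true_positive_rates(data_by_group):
    """P(Y_hat = 1 | Y = 1, A = a): per-group TPR from (actual, predicted) pairs."""
    tprs = {}
    for group, pairs in data_by_group.items():
        preds_on_positives = [pred for actual, pred in pairs if actual == 1]
        tprs[group] = sum(preds_on_positives) / len(preds_on_positives)
    return tprs

# Toy data: (actual, predicted) per group
tprs = true_positive_rates({
    "group_0": [(1, 1), (1, 1), (1, 0), (0, 1)],   # TPR = 2/3
    "group_1": [(1, 1), (1, 0), (1, 0), (0, 0)],   # TPR = 1/3
})
# qualified members of group_1 are recognized half as often
```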
Calibration
Predictions should mean the same thing across groups.
P(Y=1 | Ŷ=p, A=0) = P(Y=1 | Ŷ=p, A=1) = p
A 70% confidence should mean 70% for everyone.
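A rough calibration check: among examples that received the same score, compare observed outcome rates per group. A sketch with hypothetical scores and outcomes (real code would bin scores rather than matching them exactly):

```python
def observed_rate_at_score(examples, target_score):
    """P(Y = 1 | score = p, A = a) per group, for one fixed predicted score."""
    by_group = {}
    for group, score, outcome in examples:
        if score == target_score:  # toy exact match; bin in practice
            by_group.setdefault(group, []).append(outcome)
    return {g: sum(v) / len(v) for g, v in by_group.items()}

# Toy examples: (group, predicted score, actual outcome)
examples = [
    ("a", 0.7, 1), ("a", 0.7, 1), ("a", 0.7, 1), ("a", 0.7, 0), ("a", 0.7, 0),
    ("b", 0.7, 1), ("b", 0.7, 0), ("b", 0.7, 0), ("b", 0.7, 0), ("b", 0.7, 0),
]
rates = observed_rate_at_score(examples, 0.7)
# a "0.7" score corresponds to a 0.6 outcome rate in group "a"
# but only 0.2 in group "b": the score means different things per group
```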
The Impossibility Result
You can’t satisfy all of these definitions at once when base rates differ between groups: calibration and equal error rates are mutually incompatible unless prediction is perfect, and demographic parity conflicts with both even when it is.
Trade-offs are unavoidable.
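One of these tensions is easy to demonstrate with toy numbers: a perfect predictor is trivially calibrated and satisfies equal opportunity, yet it violates demographic parity whenever base rates differ. A minimal sketch with hypothetical data:

```python
def rates(records):
    """Positive-prediction rate and TPR from (actual, predicted) pairs."""
    ppr = sum(pred for _, pred in records) / len(records)
    preds_on_positives = [pred for actual, pred in records if actual == 1]
    tpr = sum(preds_on_positives) / len(preds_on_positives)
    return ppr, tpr

# A *perfect* predictor (predicted == actual) on two groups
# whose base rates differ: 0.4 vs 0.8
group_a = [(y, y) for y in [1, 1, 0, 0, 0]]
group_b = [(y, y) for y in [1, 1, 1, 1, 0]]

ppr_a, tpr_a = rates(group_a)
ppr_b, tpr_b = rates(group_b)
# Equal opportunity holds (TPR = 1.0 for both groups) and every score is
# trivially calibrated, yet demographic parity fails: 0.4 vs 0.8
```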
Technical Mitigations
Pre-processing
Fix the data before training:
- Re-sample to balance demographics
- Transform features to remove bias
- Augment underrepresented groups
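One common pre-processing step is random oversampling. A minimal sketch with hypothetical rows; note that duplicating examples balances counts but adds no new information:

```python
import random

def oversample_minority(rows, group_key):
    """Duplicate examples from underrepresented groups (sampled with
    replacement) until every group matches the largest group's size."""
    by_group = {}
    for row in rows:
        by_group.setdefault(row[group_key], []).append(row)
    target = max(len(members) for members in by_group.values())
    balanced = []
    for members in by_group.values():
        balanced.extend(members)
        balanced.extend(random.choices(members, k=target - len(members)))
    return balanced

# Toy dataset: 8 examples of group "a", only 2 of group "b"
rows = ([{"group": "a", "x": i} for i in range(8)]
        + [{"group": "b", "x": i} for i in range(2)])
balanced = oversample_minority(rows, "group")
# both groups now contribute 8 examples each
```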
In-processing
Add fairness constraints during training:
# Penalize demographic disparity
# ("lambda" is a reserved word in Python; name the weight explicitly)
loss = classification_loss + fairness_weight * fairness_penalty
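A runnable sketch of this idea, in pure Python with a toy dataset (all names and numbers here are hypothetical): a logistic regression whose loss adds a squared-gap penalty between the two groups' mean predicted scores, a smooth surrogate for demographic parity.

```python
import math

def train_fair_logreg(X, y, groups, lam=0.0, lr=0.1, epochs=2000):
    """Logistic regression trained with loss = log-loss + lam * gap^2, where
    gap is the difference in the two groups' mean predicted scores.
    Pure-Python teaching sketch, not production code."""
    d, n = len(X[0]), len(X)
    w, b = [0.0] * d, 0.0
    set0 = {i for i, g in enumerate(groups) if g == 0}
    set1 = {i for i, g in enumerate(groups) if g == 1}
    for _ in range(epochs):
        preds = [1 / (1 + math.exp(-(sum(wj * xj for wj, xj in zip(w, x)) + b)))
                 for x in X]
        gw, gb = [0.0] * d, 0.0
        # gradient of the mean log-loss
        for i, x in enumerate(X):
            err = (preds[i] - y[i]) / n
            for j in range(d):
                gw[j] += err * x[j]
            gb += err
        # gradient of lam * gap^2, chained through each sigmoid
        gap = (sum(preds[i] for i in set0) / len(set0)
               - sum(preds[i] for i in set1) / len(set1))
        for i, x in enumerate(X):
            s = preds[i] * (1 - preds[i])
            coef = 2 * lam * gap * (s / len(set0) if i in set0 else -s / len(set1))
            for j in range(d):
                gw[j] += coef * x[j]
            gb += coef
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b

def mean_pred_gap(w, b, X, groups):
    """Absolute gap between the groups' mean predicted scores."""
    preds = [1 / (1 + math.exp(-(sum(wj * xj for wj, xj in zip(w, x)) + b)))
             for x in X]
    m0 = [p for p, g in zip(preds, groups) if g == 0]
    m1 = [p for p, g in zip(preds, groups) if g == 1]
    return abs(sum(m0) / len(m0) - sum(m1) / len(m1))

# Toy data: the single feature happens to equal group membership,
# so an unconstrained model scores the groups very differently
X = [[0.0], [0.0], [0.0], [1.0], [1.0], [1.0]]
y = [0, 0, 1, 0, 1, 1]
groups = [0, 0, 0, 1, 1, 1]

gap_plain = mean_pred_gap(*train_fair_logreg(X, y, groups, lam=0.0), X, groups)
gap_fair = mean_pred_gap(*train_fair_logreg(X, y, groups, lam=5.0), X, groups)
# gap_fair < gap_plain: the penalty pulls the score distributions together
```

Raising `lam` trades accuracy for parity; in practice you would tune it against disaggregated validation metrics rather than pick it by hand.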
Post-processing
Adjust predictions after training:
- Calibrate thresholds per group
- Reject option classification
- Equal odds adjustment
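Per-group threshold calibration can be sketched as follows: given held-out scores, pick each group's threshold so that both groups reach the same true positive rate. Toy data with hypothetical scores:

```python
import math

def threshold_for_tpr(scored, target_tpr):
    """Highest threshold whose TPR on truly positive examples meets the
    target. `scored` is a list of (score, actual) pairs."""
    pos_scores = sorted((s for s, y in scored if y == 1), reverse=True)
    k = math.ceil(target_tpr * len(pos_scores))
    return pos_scores[k - 1]  # accept the top-k positive examples

# Hypothetical held-out scores per group; equalize TPR at 0.75
group_a = [(0.9, 1), (0.8, 1), (0.6, 1), (0.4, 1), (0.7, 0), (0.3, 0)]
group_b = [(0.7, 1), (0.5, 1), (0.4, 1), (0.2, 1), (0.6, 0), (0.1, 0)]
t_a = threshold_for_tpr(group_a, 0.75)
t_b = threshold_for_tpr(group_b, 0.75)
# the thresholds differ (0.6 vs 0.4) precisely so that
# both groups end up with the same 0.75 true positive rate
```

Whether group-specific thresholds are acceptable is itself a policy and legal question, not just a technical one.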
Available Tools
- Fairlearn (Microsoft): Fairness assessment and mitigation
- AI Fairness 360 (IBM): Comprehensive bias detection
- What-If Tool (Google): Visualization for model fairness
- Aequitas: Bias and fairness audit toolkit
Beyond Technical Fixes
Technical solutions aren’t enough:
Problem Selection
Should we automate this decision at all? Some domains may be inappropriate for algorithmic decision-making.
Stakeholder Involvement
Affected communities should shape system design. Technical teams alone can’t define fairness.
Transparency
If an algorithm affects someone’s life, they deserve to understand how.
Accountability
Who’s responsible when an AI system causes harm? Clear ownership matters.
Ongoing Monitoring
Bias can emerge over time as populations and contexts change. Continuous auditing is essential.
Practical Steps
For ML practitioners:
- Audit your data: Check demographic distributions
- Test across groups: Disaggregate performance metrics
- Document assumptions: What does fairness mean for this use case?
- Involve diverse perspectives: Technical skills aren’t enough
- Consider not building: Some applications shouldn’t exist
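For the “test across groups” step, a minimal sketch of disaggregated accuracy (hypothetical records): a healthy aggregate number can hide a badly served subgroup.

```python
from collections import defaultdict

def disaggregated_accuracy(records):
    """Accuracy per group from (group, actual, predicted) triples."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, actual, predicted in records:
        totals[group] += 1
        hits[group] += actual == predicted
    return {g: hits[g] / totals[g] for g in totals}

# Toy records: overall accuracy is 75%, but it splits 100% / 50%
records = [
    ("a", 1, 1), ("a", 0, 0), ("a", 1, 1), ("a", 0, 0),
    ("b", 1, 0), ("b", 0, 0), ("b", 1, 1), ("b", 0, 1),
]
acc = disaggregated_accuracy(records)
```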
The Bigger Picture
AI fairness isn’t just a technical challenge. It reflects society’s biases encoded in data and amplified by algorithms.
We’re not just building systems. We’re encoding values, setting defaults, and shaping futures.
The responsibility is significant. Take it seriously.
What we build reflects who we are.