2.4 ML models with bias
Models can end up biased; why is that?
[source: https://www.youtube.com/watch?time_continue=1&v=tlOIHko8ySg&feature=emb_logo]
With an unsuitable reward function, an undesired result can occur
Framing the problem
- The goal is a business objective, not fairness or avoidance of discrimination
- The goal might lead to unwanted side effects: https://openai.com/blog/faulty-reward-functions/
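The faulty-reward-function effect can be sketched in a few lines. Everything below (the greedy agent, the point targets, the finish line at position 5) is invented for illustration; it only mirrors the pattern in the linked OpenAI post, where an agent maximizes a proxy reward instead of the intended goal:

```python
# Toy sketch (all details made up): a greedy agent maximizes whatever
# reward it is given. With the intended reward it finishes the course;
# with a faulty proxy reward it loops on point targets forever.

def run_episode(reward, steps=10):
    """Greedy agent: at each step, pick the action with higher reward."""
    position, points = 0, 0
    for _ in range(steps):
        # Two actions: 'advance' moves toward the finish line,
        # 'loop' circles a respawning point target.
        if reward("loop", position, points) >= reward("advance", position, points):
            points += 1      # circling scores a point, no progress
        else:
            position += 1    # progress toward the finish (at position 5)
    finished = position >= 5
    return finished, points

# Intended reward: only progress toward the finish counts.
intended = lambda action, pos, pts: 1 if action == "advance" else 0
# Faulty proxy reward: point targets score, finishing scores nothing.
proxy = lambda action, pos, pts: 1 if action == "loop" else 0

print(run_episode(intended))  # (True, 0)  -- agent finishes the course
print(run_episode(proxy))     # (False, 10) -- agent only collects points
```

The reward function fully determines the behavior: the same agent produces the desired or the undesired result depending only on what it is rewarded for.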
Collecting data
- Unrepresentative of reality
- Collecting images of zebras only when the sun shines => the model might look for shadows to classify a zebra
- Reflects existing prejudices
- Historical data might lead recruiting tools to dismiss female candidates
Preparing the data
- Selecting attributes to be considered might lead to bias
- The attribute gender might introduce bias
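Attribute selection is not enough on its own: a sketch of how a dropped sensitive attribute can leak back in through a correlated attribute. The columns and values (a "hobby" column correlated with gender) are invented for illustration:

```python
# Hypothetical rows; in this made-up data, hobby correlates with gender.
rows = [
    {"gender": "f", "hobby": "netball",  "score": 80},
    {"gender": "f", "hobby": "netball",  "score": 85},
    {"gender": "m", "hobby": "football", "score": 70},
    {"gender": "m", "hobby": "football", "score": 75},
]

# Attribute selection: keep everything except the sensitive column.
selected = [{k: v for k, v in r.items() if k != "gender"} for r in rows]

# The dropped attribute can still be reconstructed from what remains:
leak = {r["hobby"]: row["gender"] for r, row in zip(selected, rows)}
print(leak)  # {'netball': 'f', 'football': 'm'} -- gender leaks via hobby
```

Dropping the column removes the attribute, not the information; a model can rebuild it from any correlated attribute that was kept.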
2.4.0.1 How to avoid bias
Avoiding bias is harder than you might think
Unknown unknowns
- Gender might be deduced by a recruiting tool from the use of language
- Imperfect processes
- Test data has the same bias as the training data
- Bias not easy to discover
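A sketch of why a test split carved from the same biased pool cannot reveal the bias. The CV proxy word and all data below are invented; the model is assumed to have already learned the proxy:

```python
# Hypothetical samples: ((uses_proxy_word, qualified), hired).
# In this made-up historical pool, hiring follows a proxy word in the
# CV text (correlated with gender), not the qualification.

def evaluate(model, samples):
    return sum(model(x) == y for x, y in samples) / len(samples)

biased_pool = [((1, q), 1) for q in (0, 1, 1, 0)] + \
              [((0, q), 0) for q in (1, 1, 0, 1)]

# A model that learned the proxy from the biased pool.
proxy_model = lambda x: x[0]

# Held-out test split from the SAME biased pool: looks perfect.
test = biased_pool[6:]
print(evaluate(proxy_model, test))  # 1.0 -- the bias is invisible

# A representative sample where hiring follows qualification.
fair_sample = [((0, 1), 1), ((1, 0), 0), ((0, 1), 1), ((1, 1), 1)]
print(evaluate(proxy_model, fair_sample))  # 0.25 -- proxy model fails
```

High test accuracy only says the model matches the test distribution; when train and test share the same bias, the metric cannot discover it.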
2.4.0.2 Human bias
Machine learning models can be biased for several reasons, as shown above; what about humans?
Study in Germany
- Judges read a description of a shoplifter
- They then rolled a pair of loaded dice
- Dice showed 3 => average sentence of 5 months in prison
- Dice showed 9 => average sentence of 8 months in prison