2.4 ML models with bias
Models can end up biased; why is that?
[source: https://www.youtube.com/watch?time_continue=1&v=tlOIHko8ySg&feature=emb_logo]
With an unsuitable reward function, an undesired result can occur
Framing the problem
- The goal is a business objective, not fairness or avoidance of discrimination
- The goal might lead to unwanted side effects: https://openai.com/blog/faulty-reward-functions/
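The faulty-reward-function effect can be sketched in a few lines. Everything below (the greedy agent, the point targets, the finish line at position 5) is invented for illustration; it only mirrors the pattern in the linked OpenAI post, where an agent maximizes a proxy reward instead of the intended goal:

```python
# Toy sketch (all details made up): a greedy agent maximizes whatever
# reward it is given. With the intended reward it finishes the course;
# with a faulty proxy reward it loops on point targets forever.

def run_episode(reward, steps=10):
    """Greedy agent: at each step, pick the action with higher reward."""
    position, points = 0, 0
    for _ in range(steps):
        # Two actions: 'advance' moves toward the finish line,
        # 'loop' circles a respawning point target.
        if reward("loop", position, points) >= reward("advance", position, points):
            points += 1      # circling scores a point, no progress
        else:
            position += 1    # progress toward the finish (at position 5)
    finished = position >= 5
    return finished, points

# Intended reward: only progress toward the finish counts.
intended = lambda action, pos, pts: 1 if action == "advance" else 0
# Faulty proxy reward: point targets score, finishing scores nothing.
proxy = lambda action, pos, pts: 1 if action == "loop" else 0

print(run_episode(intended))  # (True, 0)  -- agent finishes the course
print(run_episode(proxy))     # (False, 10) -- agent only collects points
```

The reward function fully determines the behavior: the same agent produces the desired or the undesired result depending only on what it is rewarded for.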
Collecting data
- Unrepresentative of reality
- Collecting images of zebras only when the sun shines => the model might look for shadows to classify a zebra
- Reflects existing prejudices
- Historical data might lead recruiting tools to dismiss female candidates
Preparing the data
- Selecting attributes to be considered might lead to bias
- The attribute gender might introduce bias
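Attribute selection is not enough on its own: a sketch of how a dropped sensitive attribute can leak back in through a correlated attribute. The columns and values (a "hobby" column correlated with gender) are invented for illustration:

```python
# Hypothetical rows; in this made-up data, hobby correlates with gender.
rows = [
    {"gender": "f", "hobby": "netball",  "score": 80},
    {"gender": "f", "hobby": "netball",  "score": 85},
    {"gender": "m", "hobby": "football", "score": 70},
    {"gender": "m", "hobby": "football", "score": 75},
]

# Attribute selection: keep everything except the sensitive column.
selected = [{k: v for k, v in r.items() if k != "gender"} for r in rows]

# The dropped attribute can still be reconstructed from what remains:
leak = {r["hobby"]: row["gender"] for r, row in zip(selected, rows)}
print(leak)  # {'netball': 'f', 'football': 'm'} -- gender leaks via hobby
```

Dropping the column removes the attribute, not the information; a model can rebuild it from any correlated attribute that was kept.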
2.4.0.1 How to avoid bias
Avoiding bias is harder than you might think
Unknown unknowns
- Gender might be deduced by a recruiting tool from the use of language
- Imperfect processes
- Test data has the same bias as the training data
- Bias not easy to discover
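A sketch of why a test split carved from the same biased pool cannot reveal the bias. The CV proxy word and all data below are invented; the model is assumed to have already learned the proxy:

```python
# Hypothetical samples: ((uses_proxy_word, qualified), hired).
# In this made-up historical pool, hiring follows a proxy word in the
# CV text (correlated with gender), not the qualification.

def evaluate(model, samples):
    return sum(model(x) == y for x, y in samples) / len(samples)

biased_pool = [((1, q), 1) for q in (0, 1, 1, 0)] + \
              [((0, q), 0) for q in (1, 1, 0, 1)]

# A model that learned the proxy from the biased pool.
proxy_model = lambda x: x[0]

# Held-out test split from the SAME biased pool: looks perfect.
test = biased_pool[6:]
print(evaluate(proxy_model, test))  # 1.0 -- the bias is invisible

# A representative sample where hiring follows qualification.
fair_sample = [((0, 1), 1), ((1, 0), 0), ((0, 1), 1), ((1, 1), 1)]
print(evaluate(proxy_model, fair_sample))  # 0.25 -- proxy model fails
```

High test accuracy only says the model matches the test distribution; when train and test share the same bias, the metric cannot discover it.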
2.4.0.2 Human bias
Machine learning models can be biased for several reasons, as shown above; what about humans?
Study in Germany
- Judges read a description of a shoplifter
- They then rolled a pair of loaded dice
- Dice showed 3 => average sentence of 5 months in prison
- Dice showed 9 => average sentence of 8 months in prison