I am currently working on a project where the data concerns people, and the dataset contains personal data with protected/sensitive attributes (typically age, sex, disability, race).
As far as I can tell, there are three main options for modelling:
- Exclude the protected attributes from the features. This is usually considered somewhat problematic, because other features can still be correlated with them (hidden proxies).
- Include the protected attributes in the model and do nothing afterwards. This is usually considered very bad, as the model will explicitly learn the biases present in the data.
- Include the protected attributes in the model, then correct the decision afterwards, based on those attributes, to ensure fairness (a minimal post-processing sketch follows this list).
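To make the third option concrete, here is a minimal sketch of what I mean by "correcting the decision afterwards", assuming a binary classifier that outputs scores and a single protected attribute. The per-group thresholds are chosen so that each group's selection rate roughly matches a target (demographic parity), which is only one of several possible fairness criteria, and the function name is just a placeholder of mine:

```python
import numpy as np

def equalize_selection_rates(scores, groups, target_rate):
    """Post-processing sketch: pick a per-group threshold so that each
    group's positive-prediction rate roughly matches `target_rate`.
    `scores` are model probabilities, `groups` the protected attribute."""
    decisions = np.zeros_like(scores, dtype=bool)
    for g in np.unique(groups):
        mask = groups == g
        # Threshold at the (1 - target_rate) quantile of this group's scores,
        # so roughly `target_rate` of the group ends up above it.
        thr = np.quantile(scores[mask], 1 - target_rate)
        decisions[mask] = scores[mask] >= thr
    return decisions
```

Libraries such as fairlearn implement more principled versions of this kind of post-processing (e.g. threshold optimization for equalized odds), but the idea is the same.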
I am curious whether there is a general rule of thumb for evaluating the impact of model performance on bias. One argument is that a stronger model (think GBDT over a linear model) will be better overall. But the counter-argument may also hold, depending on the approach: the "stronger model" argument seems to apply mainly in the last case, because in the first two cases a stronger model will also learn the biases better, hidden or not.
Is there any rule of thumb on this issue, and on whether it is worth implementing a stronger model?
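For concreteness, this is roughly how I picture measuring the trade-off: train a linear model and a GBDT, then compare both accuracy and some bias metric. It is only a sketch, with synthetic data standing in for my real dataset and the demographic parity gap as just one possible bias metric:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def demographic_parity_gap(y_pred, sensitive):
    """Difference in positive-prediction rates across the protected groups."""
    rates = [y_pred[sensitive == g].mean() for g in np.unique(sensitive)]
    return max(rates) - min(rates)

# Synthetic stand-in for the real data: X, y plus one binary protected
# attribute correlated with a feature, just to make the script runnable.
X, y = make_classification(n_samples=5000, n_features=10, random_state=0)
rng = np.random.default_rng(0)
sensitive = (X[:, 0] + rng.normal(scale=1.0, size=len(y)) > 0).astype(int)

X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(
    X, y, sensitive, test_size=0.3, random_state=0)

for name, model in [("linear", LogisticRegression(max_iter=1000)),
                    ("gbdt", GradientBoostingClassifier(random_state=0))]:
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    print(f"{name}: accuracy={accuracy_score(y_te, pred):.3f}, "
          f"DP gap={demographic_parity_gap(pred, s_te):.3f}")
```

Is comparing models this way (accuracy vs. a fairness gap, before and after any correction) a reasonable framing, or is there a more standard rule of thumb?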