Question 13 of 24
How do we measure and mitigate algorithmic bias?
Standardized metrics for testing whether a model unfairly discriminates against protected groups, and processes for remediation when bias is found.
Defining bias in measurable terms
Algorithmic bias is not a single phenomenon. It can manifest as disparate treatment (the model uses a protected characteristic as an input), disparate impact (the model produces systematically different outcomes for protected groups even without using the characteristic directly), or intersectional bias (the model performs differently for individuals who belong to multiple protected groups simultaneously).
Before testing, define what fairness means for your specific use case. Several mathematically precise fairness definitions exist, and they are often mutually incompatible. Demographic parity requires equal selection rates across groups. Equalized odds requires equal true positive and false positive rates. Individual fairness requires that similar individuals be treated similarly. Choosing a definition involves value judgments about what kind of error is most harmful, and that choice should be made explicitly rather than left to a default.
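To make the distinction concrete, the sketch below computes a demographic parity gap and the two equalized-odds gaps for a pair of groups; the function name, group labels, and data are illustrative placeholders, not part of any standard tooling.

```python
import numpy as np

def fairness_gaps(y_true, y_pred, group):
    """Compare selection rates and error rates between two groups (binary 0/1 labels)."""
    rates = {}
    for g in np.unique(group):
        mask = group == g
        yt, yp = y_true[mask], y_pred[mask]
        rates[g] = {
            "selection_rate": yp.mean(),                                # P(pred = 1 | group)
            "tpr": yp[yt == 1].mean() if (yt == 1).any() else np.nan,   # true positive rate
            "fpr": yp[yt == 0].mean() if (yt == 0).any() else np.nan,   # false positive rate
        }
    a, b = list(rates)[:2]  # assumes exactly two groups for simplicity
    gaps = {
        # Demographic parity: selection rates should match across groups.
        "demographic_parity_diff": abs(rates[a]["selection_rate"] - rates[b]["selection_rate"]),
        # Equalized odds: both TPR and FPR should match across groups.
        "tpr_diff": abs(rates[a]["tpr"] - rates[b]["tpr"]),
        "fpr_diff": abs(rates[a]["fpr"] - rates[b]["fpr"]),
    }
    return rates, gaps

# Illustrative data: 10 individuals, binary outcome, hypothetical group attribute A/B.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 0, 0])
group  = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])
print(fairness_gaps(y_true, y_pred, group))
```

A model can satisfy one of these gaps while failing the others on the same data, which is why the choice of definition has to be made before testing begins.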
Standardized testing metrics
The four-fifths rule (also called the 80% rule) from the EEOC's Uniform Guidelines provides a widely accepted starting point for employment contexts: if the selection rate for any group is less than 80% of the rate for the group with the highest selection rate, adverse impact is indicated. This is a screening tool, not a legal standard, but it provides a defensible threshold for triggering further investigation.
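A minimal sketch of the four-fifths screening check, assuming selection counts per group have already been tallied; the group names, counts, and the helper itself are illustrative.

```python
def adverse_impact_ratios(selected, total, threshold=0.8):
    """Flag groups whose selection rate falls below `threshold` times the
    highest group's selection rate (the four-fifths screening rule)."""
    rates = {g: selected[g] / total[g] for g in selected}
    highest = max(rates.values())
    return {
        g: {"rate": r, "ratio": r / highest, "adverse_impact_indicated": (r / highest) < threshold}
        for g, r in rates.items()
    }

# Illustrative counts: number selected out of number of applicants, per group.
selected = {"group_a": 48, "group_b": 30}
total    = {"group_a": 100, "group_b": 100}
print(adverse_impact_ratios(selected, total))
# group_b: 0.30 / 0.48 ≈ 0.625 < 0.8, so adverse impact is indicated and
# further investigation would be triggered.
```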
For binary classification models, compute confusion matrices separately for each protected group and compare false positive rates, false negative rates, and overall accuracy. Significant differences in error rates across groups indicate bias that may produce discriminatory outcomes even if overall accuracy is high. For scoring models used in credit or risk assessment, compare score distributions and cutoff outcomes across groups.
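One way to run the per-group comparison for a binary classifier is sketched below using scikit-learn's `confusion_matrix`; the arrays and group labels are placeholders for a real held-out evaluation set.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def per_group_error_rates(y_true, y_pred, group):
    """Confusion-matrix-derived error rates computed separately for each protected group."""
    report = {}
    for g in np.unique(group):
        mask = group == g
        tn, fp, fn, tp = confusion_matrix(y_true[mask], y_pred[mask], labels=[0, 1]).ravel()
        report[g] = {
            "false_positive_rate": fp / (fp + tn) if (fp + tn) else float("nan"),
            "false_negative_rate": fn / (fn + tp) if (fn + tp) else float("nan"),
            "accuracy": (tp + tn) / (tp + tn + fp + fn),
        }
    return report

# Illustrative evaluation data; in practice this comes from a held-out test set.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 0, 0])
group  = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])
print(per_group_error_rates(y_true, y_pred, group))
```

Large gaps in false positive or false negative rates between groups are the signal to investigate, even when the pooled accuracy looks acceptable.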
Remediation and documentation
Bias remediation options depend on where the bias originates. Training data bias may be addressed by resampling, reweighting, or augmenting the dataset. In-processing techniques modify the learning algorithm to incorporate fairness constraints during training. Post-processing techniques adjust model outputs after prediction to achieve fairness criteria.
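As an illustration of the pre-processing option, the sketch below computes per-example weights in the style of the Kamiran–Calders reweighing scheme, so that group membership and outcome label become statistically independent in the weighted training set; the data and function name are illustrative assumptions.

```python
import numpy as np

def reweigh(y, group):
    """Per-example weights w(g, y) = P(g) * P(y) / P(g, y), which removes the
    statistical dependence between group membership and label in the weighted data."""
    weights = np.empty(len(y), dtype=float)
    for g in np.unique(group):
        for label in np.unique(y):
            cell = (group == g) & (y == label)
            joint = cell.mean()                                   # P(g, y)
            expected = (group == g).mean() * (y == label).mean()  # P(g) * P(y)
            weights[cell] = expected / joint if joint else 0.0
    return weights

# Illustrative labels and group membership; the resulting weights would typically be
# passed to a learner through its sample_weight argument during training.
y     = np.array([1, 1, 1, 0, 0, 1, 0, 0, 0, 0])
group = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])
print(reweigh(y, group))
```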
Document every bias finding and every remediation step. Record the metrics before and after remediation, the technique applied, and the rationale for choosing it. Bias testing and remediation records should be retained as part of the model's audit trail and reviewed whenever the model or its deployment context changes materially.
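One possible shape for such an audit-trail entry, sketched as a small dataclass; the field names and the example values are an assumed schema for illustration, not a prescribed format.

```python
from dataclasses import dataclass, asdict
from datetime import date

@dataclass
class BiasRemediationRecord:
    """A single audit-trail entry for a bias finding and the remediation applied."""
    model_id: str
    test_date: date
    fairness_metric: str      # e.g. "demographic parity difference"
    metric_before: float
    metric_after: float
    technique: str            # e.g. "reweighting of training data"
    rationale: str
    reviewer: str

# Illustrative entry, retained alongside the model's other audit records.
record = BiasRemediationRecord(
    model_id="credit-risk-v3",
    test_date=date(2024, 5, 1),
    fairness_metric="demographic parity difference",
    metric_before=0.18,
    metric_after=0.04,
    technique="reweighting of training data",
    rationale="Training data under-represented approvals for one group.",
    reviewer="model-risk-committee",
)
print(asdict(record))
```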
