In February 2024, Lamin Juwara joined us to discuss the results of his recently published study, “An evaluation of synthetic data augmentation for mitigating covariate bias in health data,” which evaluates techniques for mitigating bias in datasets.
Bias in real-world datasets is common. For example, we often see biases based on gender, race, and socioeconomic status in health data. When these biased datasets are used in regression modeling, the result can be imprecise predictions and inconsistent estimates. There are also instances of algorithmic bias, where algorithms trained on biased datasets make potentially harmful or discriminatory decisions.
Lamin presents an evaluation of common bias mitigation techniques alongside a synthetic data augmentation method, synthetic minority augmentation (SMA). Using simulations and evaluations on four different datasets, the study shows which methods are effective, which metrics they excel on, and how much bias these techniques can mitigate. Based on these findings, he offers recommendations for reducing the negative impact of such biases in real-world datasets.
Lamin Juwara is a Postdoctoral Fellow at the University of Ottawa, where he works on applying machine learning methods to synthetic data generation. His research focuses on the development of statistical and computational methods for analyzing distributed biomedical data under privacy restrictions.