Fair AI-based decision support in healthcare

By Siri van der Meijden

Estimated Reading Time: 4 min

Responsible AI

Artificial Intelligence (AI) is rapidly transforming healthcare, moving from research labs to practical clinical applications. Machine Learning, a subset of AI, plays a crucial role by creating prediction models that can assist healthcare professionals in their decision-making, often referred to as Clinical Decision Support. At Healthplus.ai, we are committed to harnessing the power of AI responsibly. Our AI-based prediction model, PERISCOPE, exemplifies this commitment. PERISCOPE is designed to predict the risk of postoperative infection within 7 and 30 days of a surgical procedure. By providing a probability of infection, PERISCOPE aims to support clinical actions related to diagnosis and treatment, ultimately reducing the severity and impact of these infections.

Lack of a unified definition of fairness for AI-based decision support

However, the integration of AI into healthcare is not without its challenges, particularly concerning fairness. While AI models hold the potential to reduce health disparities, they can also inadvertently exacerbate them due to biases present in data sources, collection procedures, algorithm design, and decision-making processes. The risk of discrimination underscores the critical need for fair AI in healthcare.

Defining and achieving fairness in AI is a complex endeavor. Unlike bias, fairness explicitly considers the ethical implications of using AI in diverse populations. The challenge lies in the fact that there is no single, universally accepted definition of fairness. Numerous notions, metrics, and frameworks exist, often conflicting and requiring careful trade-offs. As highlighted in our recent preprint, “Navigating Fairness in AI-based Prediction Models: Theoretical Constructs and Practical Applications”, we identified 27 different definitions of fairness in recent literature, underscoring the lack of consensus in this field.
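To make this lack of consensus concrete, here is a minimal sketch (in Python with NumPy, using made-up toy data and hypothetical group labels "A" and "B", not PERISCOPE data) of two commonly cited group-fairness metrics. When outcome rates differ between groups, the two notions generally take different values and cannot both be satisfied by the same model, which is exactly the kind of trade-off described above.

import numpy as np

def demographic_parity_diff(y_pred, group):
    # Difference in positive prediction rates between groups A and B.
    return y_pred[group == "A"].mean() - y_pred[group == "B"].mean()

def equal_opportunity_diff(y_true, y_pred, group):
    # Difference in true positive rates (sensitivity) between groups A and B.
    tpr_a = y_pred[(group == "A") & (y_true == 1)].mean()
    tpr_b = y_pred[(group == "B") & (y_true == 1)].mean()
    return tpr_a - tpr_b

# Toy data: binary predictions for two groups with different outcome base rates.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0])
group = np.array(["A"] * 5 + ["B"] * 5)

print(demographic_parity_diff(y_pred, group))   # gap in who gets flagged
print(equal_opportunity_diff(y_true, y_pred, group))  # gap in who is correctly flagged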

Furthermore, improving fairness can sometimes lead to a decrease in overall model accuracy, a phenomenon known as the fairness-accuracy trade-off. Conversely, optimizing solely for accuracy without considering fairness can result in unjustified inequalities in outcomes across certain patient groups. This inherent tension necessitates a careful and nuanced approach to evaluating and deploying AI in healthcare settings.

How we approach AI fairness at Healthplus.ai

At Healthplus.ai, we recognize the complexity of AI fairness and are dedicated to addressing it in the development and deployment of PERISCOPE. Our approach is guided by the principles outlined in our research. PERISCOPE is an informative prediction tool that provides a risk score (0-100%) rather than a direct classification, and its intended use is to inform clinical decisions related to postoperative infection management. We therefore prioritize fairness evaluation methods that align with these characteristics and with relevant ethical principles. Our framework emphasizes the model’s intended use, the type of decision influenced, and principles of distributive justice when selecting appropriate fairness metrics.

Monitoring subgroup performance for PERISCOPE

A cornerstone of our fairness approach for PERISCOPE is thorough subgroup monitoring. We may not always have access to every legally or ethically protected attribute that could introduce bias, such as socioeconomic status or race, but we diligently evaluate the model’s performance and calibration across the relevant subgroups that are available. For instance, as detailed in our study, we assessed fairness metrics for PERISCOPE in gender subgroups, as men are known to have higher infection rates. Our analysis revealed positive clinical utility overall and within gender subgroups, so no further bias mitigation was immediately necessary for these specific groups.

This example illustrates our commitment to continuously monitoring PERISCOPE’s performance across different patient groups to identify and address potential inequities. By focusing on metrics such as calibration (ensuring that predicted probabilities match the observed outcome rate across groups) and clinical utility (assessing the benefit of the prediction model for patient outcomes), we strive for equal performance and equal benefit across the subgroups for which we have data.
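As a rough illustration of what such a check can look like (a sketch assuming NumPy and scikit-learn, hypothetical held-out validation data, illustrative function names, and an assumed 20% risk threshold, not our actual monitoring pipeline), the snippet below reports discrimination, a simple calibration-in-the-large check, and decision-curve net benefit for each subgroup.

import numpy as np
from sklearn.metrics import roc_auc_score

def calibration_in_the_large(y_true, y_prob):
    # Mean predicted risk minus observed event rate; 0 means well calibrated on average.
    return y_prob.mean() - y_true.mean()

def net_benefit(y_true, y_prob, threshold):
    # Decision-curve net benefit of acting on all patients above the risk threshold.
    treat = y_prob >= threshold
    n = len(y_true)
    tp = np.sum(treat & (y_true == 1))
    fp = np.sum(treat & (y_true == 0))
    return tp / n - (fp / n) * (threshold / (1 - threshold))

def monitor_subgroups(y_true, y_prob, subgroup, threshold=0.2):
    # Report discrimination, calibration, and clinical utility per subgroup.
    for g in np.unique(subgroup):
        m = subgroup == g
        print(f"{g}: AUC={roc_auc_score(y_true[m], y_prob[m]):.2f}, "
              f"calibration-in-the-large={calibration_in_the_large(y_true[m], y_prob[m]):+.3f}, "
              f"net benefit@{threshold:.0%}={net_benefit(y_true[m], y_prob[m], threshold):.3f}")

On a held-out validation set, comparable AUC, near-zero calibration-in-the-large, and positive net benefit in every subgroup would correspond to the goal of equal performance and equal benefit described above.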

Challenges

It is crucial to understand that achieving perfect fairness across all possible dimensions is often not feasible due to data limitations and the inherent trade-offs between different fairness notions and accuracy. However, our ongoing efforts in subgroup monitoring and our framework for selecting relevant fairness metrics provide a robust foundation for developing and maintaining a more equitable AI-based solution like PERISCOPE.

We invite you to learn more about the complexities of fairness in AI-based prediction models by reading our preprint: https://www.medrxiv.org/content/10.1101/2025.03.24.25324500v1.full.pdf+html.

At Healthplus.ai, we believe that by acknowledging the challenges and proactively implementing thorough evaluation and monitoring strategies, we can continue to advance the field of AI in healthcare in a way that promotes equitable performance and ultimately improves patient outcomes for everyone.
