Explainable AI for Medical Debt Forecasting: Integrating Healthcare and Fintech Data for Risk Prediction
Abstract
Medical debt is increasingly recognized as one of the most pressing financial risks confronting households worldwide, particularly in the United States where it remains a leading driver of personal bankruptcy and long-term financial distress (Himmelstein et al., 2019). In emerging economies, similar dynamics are unfolding as healthcare expenditures rise faster than income growth, leaving households exposed to sudden clinical shocks and fragmented insurance coverage. Conventional credit scoring systems often fail to account for these medically induced shocks, as they rely primarily on historical repayment behavior and credit utilization. Conversely, healthcare analytics rarely incorporate indicators of financial resilience, liquidity management, or digital finance behaviors that directly influence a patient’s ability to manage debt obligations. This disjunction highlights a critical methodological gap: the absence of an integrated, explainable, and fair framework that links healthcare utilization with fintech-derived financial signals for predicting medical debt default. To address this gap, this study introduces a novel risk-prediction framework that integrates multi-source data from healthcare, insurance, and digital finance ecosystems. Specifically, we fuse clinical utilization records (e.g., emergency department admissions, inpatient stays), chronic illness burden scores (Charlson comorbidity index), and insurance continuity metrics with fintech-derived features such as transaction volatility, liquidity proxies from digital wallets, repayment friction measures, and alternative credit metadata. The framework operationalizes explainable artificial intelligence (XAI) models—comparing logistic regression with advanced ensemble methods such as Random Forests and XGBoost—and evaluates predictive performance across multiple dimensions: discrimination (AUC-ROC), calibration, interpretability, and fairness. 
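The model comparison described above (logistic regression versus gradient-boosted ensembles, evaluated on AUC-ROC) can be sketched as follows. This is a minimal illustration on synthetic data: the feature matrix merely stands in for the fused healthcare/fintech features (emergency admissions, Charlson scores, insurance continuity, transaction volatility, and so on), and scikit-learn's `GradientBoostingClassifier` is used as a stand-in for XGBoost; all names and parameters here are illustrative, not the study's actual pipeline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the fused healthcare/fintech feature matrix.
# In the study, columns would correspond to clinical utilization,
# comorbidity burden, insurance continuity, and fintech signals;
# the class imbalance mimics a minority default outcome.
X, y = make_classification(
    n_samples=2000, n_features=12, n_informative=6,
    weights=[0.85, 0.15], random_state=0,
)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "boosted": GradientBoostingClassifier(random_state=0),  # stand-in for XGBoost
}
aucs = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    # Discrimination is scored on held-out predicted probabilities.
    aucs[name] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC-ROC = {aucs[name]:.3f}")
```

On real data, calibration (e.g., reliability curves) would be assessed alongside AUC, since a well-discriminating but miscalibrated model can still misprice default risk.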
Our results, derived from a de-identified, representative, multi-source panel dataset, demonstrate that gradient-boosted models (XGBoost) outperform traditional logistic regression by approximately 6–10 percentage points in AUC while also reducing false negatives, which is critical for catching high-risk cases that would otherwise go undetected. Global feature importance measures (gain and impurity indices) and local explanations via SHAP values reveal that insurance discontinuities, high out-of-pocket expenditure ratios, recent acute encounters (emergency or inpatient), and spending volatility across fintech channels are the most salient drivers of medical debt default. Importantly, SHAP-based local interpretations provide case-level transparency, enabling lenders, hospitals, and insurers to justify risk classifications to stakeholders and regulators. We extend the analysis by embedding fairness evaluation criteria, including equal opportunity difference and demographic parity. While XGBoost improves predictive performance, disparities emerge across income and racial subgroups, underscoring the need for algorithmic governance and bias auditing. Counterfactual simulations provide further insights: scenarios that close insurance coverage gaps and smooth liquidity through point-of-care microcredit mechanisms reduce modeled default probabilities by 12–20% among the highest-risk decile. These findings underscore the potential of combining AI-driven forecasting with targeted financial and policy interventions to proactively mitigate household financial distress. The study contributes to both academic and policy debates by positioning explainable AI as a practical and ethical tool for managing medical debt risk. For healthcare providers, the framework offers a basis for early identification of financially vulnerable patients and the design of tailored repayment or relief plans.
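The two fairness criteria named above have simple operational definitions: demographic parity difference compares positive-prediction rates across a protected group boundary, and equal opportunity difference compares true-positive rates. A minimal NumPy sketch, using randomly generated labels, predictions, and a hypothetical binary group indicator purely for illustration:

```python
import numpy as np

def demographic_parity_diff(y_pred, group):
    """Absolute gap in positive-prediction rates between the two groups."""
    g = np.asarray(group, dtype=bool)
    return abs(y_pred[g].mean() - y_pred[~g].mean())

def equal_opportunity_diff(y_true, y_pred, group):
    """Absolute gap in true-positive rates (recall) between the two groups."""
    g = np.asarray(group, dtype=bool)
    tpr = lambda mask: y_pred[mask & (y_true == 1)].mean()
    return abs(tpr(g) - tpr(~g))

# Hypothetical data: binary outcomes, binary predictions, and a
# binary protected-group indicator (e.g., an income or demographic split).
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 500)
y_pred = rng.integers(0, 2, 500)
group = rng.integers(0, 2, 500)

dp = demographic_parity_diff(y_pred, group)
eo = equal_opportunity_diff(y_true, y_pred, group)
print(f"demographic parity difference: {dp:.3f}")
print(f"equal opportunity difference:  {eo:.3f}")
```

In an audit of the study's kind, these gaps would be computed per subgroup (income, race) on held-out predictions; values near zero indicate parity, while large gaps flag the disparities the text describes.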
For fintech lenders and insurers, the integration of clinical and financial data broadens underwriting horizons while maintaining accountability through transparent explanations. For regulators, the framework demonstrates how XAI can balance innovation with fairness, highlighting opportunities for algorithmic auditing, bias mitigation, and responsible financial inclusion. By bridging healthcare analytics with fintech data and deploying explainable AI methodologies, this paper provides a comprehensive and ethically aligned blueprint for forecasting medical debt risk. The results illustrate not only performance improvements but also actionable strategies that align with broader societal objectives of financial inclusion, healthcare access, and regulatory oversight. In doing so, the framework sets the stage for future research on hybrid models, real-time predictive deployment, and cross-market validation to ensure robustness across diverse healthcare and financial systems.