Predicting Missed Appointments in Primary Care: A Personalized Machine Learning Approach [Original Research]

INTRODUCTION

Continuity of care plays a vital role in primary care practice, helping to achieve better health outcomes and minimize overall medical costs.1-3 The longitudinal nature of continuity of care enables clinicians to optimize treatment strategies via a comprehensive view of the patient’s medical history and socioeconomic contexts that affect the patient’s ongoing illness and future health.4 Failure to attend primary care visits can disrupt essential care for patients, interrupt clinical workflows, and strain health care resources.2,5,6 Delayed or missed appointments for essential care can lead to serious health consequences and increased disparities among populations already experiencing health inequities.6-8 Despite major efforts to maintain the care continuum by building patient-clinician relationships and offering more access options, many health systems continue to struggle with patients missing their appointments.

The health services literature often identifies missed appointments when patients fail to show up or cancel within 24 hours before their appointments,5,9 though the nature of and contributing factors for no-shows and late cancellations could be different and require separate measurements.10,11 Whereas recent studies have provided insights into the demographic and clinical-related underpinnings of missed appointments,9,12 the effects of the care continuum and geosocial characteristics on missed appointments are not well understood.11 In addition, research on missed appointments has mainly relied on conventional statistical analyses of information available in electronic health records (EHRs) or patient surveys. There is limited knowledge regarding how to leverage machine learning (ML) to optimize previsit planning and appointment adherence in the primary care setting.

The objective of this study was to apply ML modeling using personal, health care utilization, geosocial, and climate data to assess the risk of no-shows and late cancellations among patients managed by primary care practices. This study could provide valuable insights into the barriers underlying missed appointments, enabling care teams to design personalized interventions that improve patient appointment adherence.

METHODS

Study Design and Setting

We applied a retrospective longitudinal design using integrated clinical, geosocial, and climate data. We retrieved patients’ clinical characteristics and appointment information from the EHR database of a regional academic medical center in southcentral Pennsylvania, which were then geocoded and linked to corresponding US Census Bureau statistics and national weather reporting databases. Our analysis included all primary care visits and consultations scheduled from January 2019 to June 2023 at 15 family medicine clinics of the medical center. The outcome of each appointment was extracted from the scheduling table of the EHR database and categorized as no-show, late cancellation (cancelled within 24 hours before appointment), or completed visit. Persons deceased or living outside of Pennsylvania were excluded from the study. This research was approved by the academic medical center’s institutional review board.

Risk Factors for Missed Appointments

Demographic and Clinical Characteristics

Patient characteristics included sex, age at visit, race/ethnicity, whether English was the preferred speaking language, insurance type, and comorbid conditions (more details are listed in Supplemental Table 1). Race/ethnicity was classified into the following 5 subgroups: Asian, Hispanic, non-Hispanic Black, non-Hispanic White, and other. Insurance types consisted of commercial, Medicare, Medicaid, and uninsured. The complexity of each patient’s health condition was measured by the number of health conditions present according to the Elixhauser comorbidity measure (29 medical, psychiatric, and lifestyle-related indicators),13,14 which is considered a reliable predictor of health care needs.15

Health Care Utilization

Health care utilization characteristics included scheduled lead time (number of days from a patient’s appointment request to the appointment date), prior appointment history, mode of visit (in-person, telemedicine), clinician type (physician, physician assistant/nurse practitioner), clinician’s years of practice, primary care physician (PCP) relationship, and continuity of care indices. Prior appointment history was measured by prior completed visit, no-show, and late cancellation rates, representing the ratio of the number of specific outcome events to the number of all prior appointments within 3 years. In addition, 3 continuity indices, including the usual provider of care index, the continuity of care index, and the sequential continuity of care index, were used to assess the density, dispersion, and sequential nature of care continuity within a 3-year period (more details are listed in Supplemental Table 1).16
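For illustration, the 3 continuity indices can be computed from a patient’s chronologically ordered list of visit clinicians. The following is a minimal sketch using the conventional formulas for the usual provider of care (UPC), Bice-Boxerman continuity of care (COC), and sequential continuity (SECON) indices; the study’s exact implementation is described only in its supplemental material, so this should be read as a standard-formula approximation.

```python
from collections import Counter

def continuity_indices(providers):
    """Continuity indices from one patient's chronologically ordered
    list of visit clinicians (requires at least 2 visits).

    UPC   = share of visits to the most-seen clinician (density)
    COC   = Bice-Boxerman index: (sum n_k^2 - N) / (N * (N - 1)) (dispersion)
    SECON = share of consecutive visit pairs with the same clinician (sequence)
    """
    n = len(providers)
    if n < 2:
        raise ValueError("continuity indices need at least 2 visits")
    counts = Counter(providers)
    upc = max(counts.values()) / n
    coc = (sum(c * c for c in counts.values()) - n) / (n * (n - 1))
    secon = sum(a == b for a, b in zip(providers, providers[1:])) / (n - 1)
    return upc, coc, secon
```

For example, a visit sequence of clinicians A, A, B, A yields UPC = 0.75, COC = 0.5, and SECON = 1/3; all 3 indices equal 1 when every visit is with the same clinician.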

Geosocial and Environmental Contexts

Each patient’s home address was geocoded and matched to a corresponding census block group. The distance from each patient’s home to their primary care clinic was estimated in miles using the Euclidean method. Each patient’s rural status was determined by linking their address to the Census Bureau’s urban and rural classification mapping file.17 Area deprivation index was extracted at the census block–group level to represent the composite socioeconomic status (eg, poverty, education, and housing quality) of the patient in the neighborhood context.18 Two additional Census statistics at the block-group level, the percentage of people without a vehicle and the percentage of people who graduated from high school, were obtained as proxies for transportation access and education. We extracted local climate information for each clinic, including daily average temperature, rain precipitation (inches), and snowfall (inches), from the national weather database.19

Machine Learning Modeling and Evaluation

Model Selection

We used multiclass ML modeling approaches, including gradient boost, random forest, neural network, and least absolute shrinkage and selection operator logistic regression, to predict appointment outcomes (no-show, late cancellation, and completion). We randomly partitioned the study sample at the patient level into a training data set (80% of the total sample) and a testing data set (20%). Prediction performance on the testing set was evaluated using the area under the receiver operating characteristic curve (AUROC) and F1 score for each class (vs other classes). The macro-average AUROC and macro-average F1 score were also calculated as the averages of the AUROC and F1 score across all classes. We further calculated the misclassification rates of no-shows and late cancellations, respectively, to denote the percentage of incorrect predictions. The 95% CI of each metric was calculated to quantify the precision of the performance measure. Details on model development are provided in Supplemental Appendix 1.
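The one-vs-rest AUROC and its macro average can be computed directly from predicted class probabilities. The sketch below is a minimal, self-contained illustration using the rank-based (Mann-Whitney) formulation of AUROC with tie-averaged ranks; the study’s actual evaluation pipeline is not specified here, and in practice a library routine such as scikit-learn’s would typically be used.

```python
def auroc(y_true, scores):
    """Binary AUROC via the Mann-Whitney U statistic (ties get averaged ranks)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1  # extend the block of tied scores
        avg_rank = (i + j) / 2 + 1  # 1-based average rank within the tie block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, y_true) if y)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def macro_auroc(y_true, prob_matrix, classes):
    """One-vs-rest AUROC per class, plus the unweighted (macro) average."""
    per_class = {}
    for c_idx, c in enumerate(classes):
        binary = [1 if y == c else 0 for y in y_true]
        per_class[c] = auroc(binary, [row[c_idx] for row in prob_matrix])
    return per_class, sum(per_class.values()) / len(classes)
```

The macro average weights each class equally, which is appropriate here because no-shows and late cancellations are rare (about 7% each) relative to completed visits.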

Feature Importance

The Shapley Additive Explanations (SHAP)20 value was calculated on the basis of game theory to assess each factor’s contribution to the predicted output at the patient level.20,21 Factors positively influencing the predicted outcome generally have positive SHAP values for that outcome and vice versa. The global importance of each factor was also calculated by averaging absolute SHAP values across all individuals to indicate the overall importance of each factor for predicted appointment outcomes. In addition, SHAP values captured the heterogeneous effects of each factor, enabling a personalized interpretation of predictive factors and outcomes.
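The study relied on the SHAP library’s efficient estimators, but the underlying game-theoretic definition can be illustrated exactly for a small number of features: the Shapley value of a feature is its weighted average marginal contribution over all coalitions of the other features, with absent features set to a background (reference) point. The brute-force sketch below enumerates all coalitions and is feasible only for a handful of features; it is an illustration of the definition, not the study’s implementation.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, background):
    """Exact Shapley values of each feature for one prediction f(x).

    v(S) evaluates the model with features in coalition S taken from x
    and all remaining features taken from the background point.
    """
    n = len(x)

    def v(subset):
        z = [x[i] if i in subset else background[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1
            for coalition in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += weight * (v(set(coalition) | {i}) - v(set(coalition)))
    return phi
```

By construction the values sum to f(x) minus f(background), so each patient’s predicted risk decomposes exactly into per-feature contributions; averaging their absolute values across patients gives the global importance described above.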

Fairness Check

To assess for potential bias of the ML models associated with sensitive features, we conducted stratified analysis by evaluating the prediction performance for each subpopulation (sex, race, ethnicity) to ensure the fairness of the final model. Feature importance analysis was also reassessed by each subgroup.
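The stratified analysis amounts to evaluating the same performance metric separately within each subgroup defined by a sensitive attribute and comparing the results. A minimal sketch (using accuracy as a stand-in metric; the study’s metrics were AUROC and F1, and the subgroup labels here are illustrative):

```python
from collections import defaultdict

def stratified_metric(y_true, y_pred, groups, metric):
    """Evaluate `metric` separately within each subgroup (eg, sex, race)."""
    by_group = defaultdict(lambda: ([], []))
    for y, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(y)
        by_group[g][1].append(p)
    return {g: metric(ys, ps) for g, (ys, ps) in by_group.items()}

def accuracy(ys, ps):
    """Fraction of correct predictions (stand-in for AUROC or F1)."""
    return sum(y == p for y, p in zip(ys, ps)) / len(ys)
```

Comparable metric values across subgroups, as reported below, suggest the model does not systematically underperform for any sensitive group, though parity in AUROC alone does not rule out other forms of bias.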

RESULTS

Study Cohort Characteristics

The study included 109,328 patients and 1,118,236 appointments, including 77,322 (6.9%) no-shows and 75,545 (6.8%) late cancellations (Table 1). Overall, patients who missed appointments tended to be female, younger, under/uninsured, less fluent in English, and in ethnic minority groups. They also experienced longer lead times, greater prior missed appointment rates, and more socioeconomic challenges. Clinicians whose appointments were missed tended to have fewer years of practice. The percentages of no-shows, late cancellations, and completed visits also varied by visit mode, clinician type, and whether the visit was with the patient’s PCP.

Table 1.

Characteristics of Study Cohort by Appointment Outcome

Performance of Prediction Model

The gradient boost model showed the best performance compared with the other ML models (Table 2). The model achieved a macro-averaged AUROC of 0.881 (95% CI, 0.881-0.882) for overall model performance, along with an AUROC of 0.852 (95% CI, 0.851-0.852) for predicting no-shows, 0.921 (95% CI, 0.920-0.921) for late cancellations, and 0.872 (95% CI, 0.871-0.872) for completed appointments (Figure 1). The gradient boost model also showed better balance between precision and recall, with the highest F1 scores, as well as low misclassification rates for no-shows and late cancellations (Table 2).

Table 2.

Comparison of Performance in Predicting Missed Appointments by Different Machine Learning Models

Figure 1.

ROC Curve for Each Outcome vs Others and Macro-Averaged ROC Curve of 3 ROC Curves as the Overall Model Performance

AUROC = area under the ROC curve; ROC = receiver-operating characteristic curve.

Feature Importance

The global importance of key factors, expressed as average absolute SHAP values, in predicting appointment outcomes is shown in Figure 2. The most influential factors included scheduling lead time, clinicians’ years of practice, patient age at visit, and whether the visit was with the patient’s PCP, followed by the history of prior appointment outcomes (completed, no-show rates), geosocial contexts (area deprivation index, distance from home to clinic), and environmental factors (temperature). The degree of importance of each factor also varied by target outcome. For example, patient age at visit and prior no-show rate were found to play a strong role in predicting no-shows but were less important for predicting late cancellations.

Figure 2.

Feature Importance of Top 15 Most Impactful Predictors in Gradient Boost Prediction Model

PCP = primary care physician; SECOC = sequential continuity of care index; UPC = usual provider of care index.

Figure 3 shows the distribution of the SHAP values for the top contributing factors on the no-show risk at the individual patient level. The wide divergence of SHAP values for several factors indicates large heterogeneity in the study population. For example, although lead time played an important role in predicting no-shows, its effect on the predicted outcome could vary by person, depending on that person’s other characteristics. Feature importance analysis allowed clinicians to evaluate the individualized effect and how the predicted outcomes were driven by each person’s unique features, for which we provide more examples in Supplemental Figure 1 and Supplemental Figure 2.

Figure 3.

Heterogeneity of SHAP Values on Predicted No-Show Risk for 15 Most Influential Predictors Across Individuals

ADI = area deprivation index; COCI = continuity of care index; SHAP = Shapley Additive Explanations.

Note: Each point corresponds to an individual visit in the data set, with the color denoting the numeric value of the corresponding predictor feature. The horizontal position of each point represents the individual’s SHAP value of the corresponding predictor feature, which illustrates the size and direction of how it affects the no-show risk.

Fairness Check

The fairness of the gradient boost model among sex and racial/ethnic subgroups was validated by stratified analyses. Figure 4 shows that the macro-averaged AUROC was consistently high across sex (0.881-0.882) and race/ethnicity subgroups (0.846-0.882). The AUROC for each target appointment outcome remained >0.8, with modest variation within ±0.25 across all subgroups. All subgroup analyses also showed similar results in other performance measures (Supplemental Table 2) and shared important features among the 10 most influential factors for each subpopulation (Supplemental Figure 3).

Figure 4.

Stratified Analysis of Prediction Performance Measured by Area Under the ROC Curve Across Subpopulations

ROC = receiver-operating characteristic.

DISCUSSION

Missed appointments have long posed challenges to optimizing the care continuum, clinical workflow, and resource allocation for primary care practices. Our analysis showed a 13.7% missed appointment rate, within the 8% to 24% range reported in other primary care settings.22,23 The differences could be attributed to variations in demographics, practice settings, or health care access. This study developed multiclass ML models using integrated clinical, demographic, geosocial, and climate data to predict appointment outcomes at primary care clinics. Whereas all of the proposed ML models achieved good performance, with an overall AUROC >0.8 in predicting appointment outcomes,24 the gradient boost model outperformed random forest, neural network, and logistic regression models. In addition, a fairness check on the model showed similar model performance after stratifying by sex and ethnic/racial subgroups, indicating that predicted results were not biased against specific patient characteristics.

Feature importance analysis revealed that lead time was the most important predictor of appointment outcome, consistent with prior research.9,25 Our study found large variability in how lead time influences predicted risk. Essentially, longer lead times (>60 days) are associated with a greater risk of missed appointments, whereas the effect of shorter lead times (<30 days) can be complex, owing to potential interactions with factors such as age, race/ethnicity, socioeconomic status, and prior no-show rates. Given the strong effect of lead time, clinics could prioritize shorter wait times for high-risk patients. Given that short lead times had variable effects, ML-based analytics could help clinicians anticipate patient-specific needs, personalize outreach efforts, and proactively facilitate appointment scheduling. For example, patients with no-show history or from disadvantaged backgrounds might benefit from additional outreach such as text reminders or transportation assistance.

Several demographic characteristics, including female sex, younger age, non-English speaking, number of chronic conditions, and distance to the clinic, were also identified as key contributors to appointment outcomes. Essentially, patients’ sex, age, and language fluency played an important role in predicting no-shows, whereas home-to-clinic distance had a strong influence on late cancellations. The findings suggest that health systems should prioritize strategies for decreasing lead time, increasing interpreter availability, and offering transportation assistance for better appointment adherence. Whereas telemedicine was not identified among the most influential predictors of appointment adherence in our analysis, health systems should continue offering it as a convenient and accessible option26,27 to decrease missed appointments and improve access to care.

Among geosocial factors, the socioeconomic deprivation level of the neighborhood where patients lived was found to be the most important predictor of missed appointments. Studies have linked missed appointments to low socioeconomic status, which is often associated with a lack of health care literacy, financial stability, and social support.5,28 Among environmental factors, temperature was shown to have the greatest effect, especially on late cancellations. The influence of these factors on missed appointments has been reported9,29; however, it might not be viable for health systems to tackle these risks directly within the scope of health care operations. Nonetheless, incorporating geosocial and weather data into predictive modeling is beneficial for accounting for seasonality effect, offering valuable insights into patient behavior that can inform clinic operations and patient navigation efforts.

Several care utilization factors were also shown to be important predictors of appointment outcome. The clinician’s years of practice, prior completed visit rates, continuity of care indices, and whether an appointment was the first visit at a clinic were strong indicators of whether future visits would be completed. Previous no-show and late cancellation rates were considered strong predictors of no-shows and late cancellations, respectively. Primary care practices generally view these factors as core measures of continuity of care, closely associated with the quality, cost, and outcome of health care. Our analysis revealed that the history of care continuum could further serve as a powerful predictor of missed appointments, which is intuitive and practical given that this metric is already monitored in day-to-day clinical operations. Health systems could enhance continuity of care via better clinician-patient communication, customized team-based care,30 patient engagement, and shared decision making.31

Unlike earlier studies that relied on a single EHR or administrative database to predict patient no-shows,32-36 the present study augmented clinical information with geosocial/environmental contexts and leveraged ML modeling, particularly its ability to effectively capture complex nonlinear relations within high-dimensional data, accommodating the substantial heterogeneity of the study population. In contrast, traditional regression models emphasize global effects across the entire population, potentially overlooking key insights for individual patients. Integrating personalized predictive models with system-wide initiatives, such as automated reminders, patient navigation, and late cancellation fees, offers opportunities to help care teams create more targeted, effective, and personalized interventions for better appointment adherence.

Study Limitations

Our study has several limitations. First, despite a large sample size, the study population was drawn from a single academic health care center, potentially limiting the findings’ generalizability. Second, our study period encompassed the COVID-19 pandemic, which might have affected patients’ appointment adherence. Our supplementary analysis (Supplemental Appendix 1) showed that despite changes in appointment adherence before and after the pandemic, the prediction model performed robustly across periods, supporting the framework’s feasibility for future predictions. Third, ML models provide a limited explanation of how the models make predictions. Thus, we applied SHAP analysis at both the population and individual levels to describe the factors’ contributions to the prediction results. Fourth, this analysis was based on a single gradient boost model, which had the best performance internally validated with our data. Future research should explore ensemble learning to optimize prediction accuracy and robustness. Lastly, our analysis predicted the outcomes of each patient’s appointments separately instead of treating them as time-series data. To capture the temporal nature of appointment data, researchers could consider more sophisticated deep learning approaches, such as recurrent neural networks, to assess an individual’s dynamic risk of missed appointments, helping health systems make more informed decisions.
