This was a single-center retrospective study using high-fidelity data from the Sickbay (Medical Informatics Corporation, Houston, TX) platform. The primary aim of the study was to characterize the change in vasoactive score after a bolus dose of methylene blue. Secondary analyses included characterize the change in heart rate, respiratory rate, arterial saturation, end tidal carbon dioxide, cerebral near infrared spectroscopy, and renal near infrared spectroscopy.
This study protocol was approved by the institutional review board and in concordance with the Helsinki declaration.
As machine learning/artificial intelligence tools were utilized in this study, this study was done in concordance with the TRIPOD-AI checklist.
Patient InclusionAll pediatric patients (under 18 years of age) who were cared for in the pediatric cardiac intensive care unit from January 1, 2021 to October 1, 2025 were eligible for inclusion if they received a dose of methylene blue. Those receiving mechanical circulatory support or without high fidelity physiologic data from Sickbay were excluded.
VariablesThe following physiologic variables were collected: heart rate, respiratory rate, arterial saturation, end tidal carbon dioxide, cerebral near infrared spectroscopy, renal near infrared spectroscopy. These were collected from the Sickbay platform at 1-second intervals. Heart rate was sourced from telemetry in beats per minute, respiratory rate was sourced from the pulse oximeter in respirations per minute, the arterial saturation was sourced from the pulse oximeter in %, end tidal carbon dioxide was sourced from the end tidal device, cerebral and renal near infrared spectroscopy were sourced from the INVOS 5100 (Medtronic, Minneapolis, MN).
Infusion doses of epinephrine, norepinephrine, vasopressin, and calcium chloride were collected from the electronic health record. They were all collected in mcg/kg/min except for calcium chloride which was collected in mg/kg/hr and vasopressin which was collected in milliunits/kg/min.
Infusion doses of morphine, hydromorphone, fentanyl, midazolam, and dexmedetomidine were also collected from the electronic health record.
Fluid balance was collected at 30-minute intervals.
Age at infusion, weight at infusion, and dose of methylene blue were all collected. The circulatory physiology was also collected as one of the following: fully septated biventricular circulation, left to right shunt, limitation to pulmonary blood flow, limitation to systemic blood flow, functionally univentricular multidistributive, functionally univentricular Glenn, functionally univentricular Fontan, transposition, or ventricular dysfunction.
Statistical AnalysesRelative time was calculated for each datapoint in relation to the time of methylene blue administration for that patient. Thus, a relative time of 0 was the time at which methylene blue was administered.
To demonstrate the change in physiologic variables over time, locally estimated scatterplot smoothing regression was utilized. The y-axis of these plots was percent change from baseline while the x-axis was relative time. Data across all administrations were aggregated by mean percent change at each time point. The 95% confidence intervals were calculated from the standard error of the mean and plotted as well. This was done to allow for visualization of the nonlinear, nonparametric tends in the high-fidelity data with ease.
Next, to examine potential causal relationships between a bolus dose of methylene blue and physiologic variables, a counterfactual intermittent time-series modeling framework was used. This is conceptually similar to an average treatment effect approach. Percent change from baseline of each physiologic variable was modeled as a function of relative time, methylene blue dose, patient weight, epinephrine dose, norepinephrine dose, vasopressin dose, calcium chloride dose, sedation doses, fluid balance, and circulatory physiology. Models were fitted using bootstrap resampling (1,000 iterations) to generate empirical 95% confidence intervals around the estimated coefficients. Counterfactual modeling was selected to allow more than just descriptive association, enabling quasi-causal inference by estimating what the physiologic outcome might have been in the absence of the methylene blue exposure while controlling for measured covariates. The findings of such analyses represent estimated causal associations conditional on the included variables, not definitive mechanistic causality.
Figure 1 illustrates the time-varying physiologic effect of the intervention using a counterfactual time-series framework. For each patient, time was re-centered so that the moment of intervention corresponds to relative time 0. Thus, the period before 0 represents the pre-intervention physiologic state, and the period after 0 represents the post-intervention course. The solid orange line (“Observed,” before intervention) shows the natural, noisy fluctuations in the physiologic variable prior to the intervention. This same trajectory is then projected forward into the post-intervention period as a thin, semi-transparent, dashed continuation labeled “Counterfactual (no intervention).” This projected dashed line represents the model-based estimate of how the physiology would have continued over time in the absence of the intervention. The solid blue line (“Observed,” after intervention) represents the actual post-intervention trajectory. In this schematic, no abrupt step change at time 0 is assumed; instead, the observed post-intervention course shows a progressive downward slope over time. The arrow labeled “Trend change” quantifies the divergence between the observed post-intervention trajectory (blue line) and the counterfactual no-intervention trajectory (dashed orange line) at later time points. This difference reflects not only an instantaneous shift in level but a sustained change in slope over time, consistent with an ongoing treatment effect rather than a transient perturbation. Analytically, this framework goes beyond descriptive averaging: by modeling the physiologic response as a function of relative time, intervention status, and relevant clinical covariates (e.g., concomitant therapies, circulatory physiology, fluid balance), it estimates what the physiologic variable would have been without the intervention and compares that estimate to what was actually observed. The resulting contrast can be interpreted as an intervention effect under the assumptions of the model and conditional on the included covariates, but it should not be taken as definitive mechanistic causality.
Fig. 1
A graphical depiction of counterfactual time series analysis
Next, an unsupervised hierarchical cluster analysis was performed using Ward’s minimum variance linkage and Euclidean distance. This machine learning approach was utilized to determine if there were specific phenotypes of physiologic response. The input feature matrix consisted of the mean percent change from baseline to 4 h after administration of methylene blue for each physiologic variable. Clinical data was not included in the cluster, but clinical characteristics of each cohort were explored after cluster creation. The optimal number of clusters was determined by maximizing the silhouette coefficient, supported by additional internal validity metrics including the Calinski-Harabasz index, Davies-Bouldin index, and Dunn index. Cluster robustness was assessed using bootstrapped Jaccard similarity with 100 resamples. Cluster analysis was chosen to identify latent physiologic response phenotypes, mathematically, without human bias. Because clustering was unsupervised, the findings represent structure based purely on inter-variable relationships, independent of any predefined clinical labels. Interpretation of clusters is associative and not causal as these clusters describe co-varying patterns rather than direct mechanistic relationships.
Finally, random forest analyses were conducted to determine the association of vasoactive medication doses prior to methylene blue administration and decrease in vasoactive score, increase in mean arterial blood pressure, and increase in near infrared spectroscopy. Each model used 600 decision trees with unrestricted dept and a minimum leaf size of two samples. Predictors were standardized using a column transformer pipeline. Models were trained and tested using an 80/20 hold-out split when more than 10 evaluable administrations were available. Model performance was assessed using the coefficient of determination and root mean square error on the hold-out set. Internal validity was estimated using k-fold cross validation. Feature importance was computed as the mean decrease in impurity across all trees. Partial dependence plots were generated for each predictor to visualize its marginal relationship with the dependent variable while holding other predictors constantly.
All analyses were done using Python in a local Jupyter environment. A p-value of less than 0.05 was considered statistically significant. This study adhered to the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis + Artificial Intelligence (TRIPOD + AI) guidelines, ensuring comprehensive and transparent reporting of the clinical prediction model development and evaluation processes (PMID: 38626948).
Comments (0)