High-fidelity measurement of pulse arrival time in critically ill children using standard bedside monitoring equipment

Blood pressure (BP) is a critical vital sign for monitoring patient health status, reflecting the adequacy of blood flow to vital organs and serving as an essential indicator of cardiovascular function. Abnormalities in BP as a result of various conditions (e.g. septic or hemorrhagic shock) can lead to increased risk of morbidity and mortality due to impacts on vital organ perfusion (Guarracino and Bertini 2022, Schuurmans et al 2024). In high-acuity settings such as the operating room or an intensive care unit (ICU), patients can experience significant physiologic derangement that can lead to rapid changes in BP, making real-time and continuous BP measurement essential for detecting and managing instability. This data is typically obtained using peripherally inserted catheters in the arteries, which is not always feasible or risk-free (Lucet et al 2010). Meanwhile, non-invasive blood pressure measurements using a manual or automatic sphygmomanometer provide intermittent information and may not always be accurate. As such, there has been growing interest in the development of noninvasive approaches to continuous BP monitoring (Henry et al 2024).

One such method relies on the measurement of pulse arrival time (PAT) or pulse transit time (PTT), both of which describe the time it takes for the pulse to travel from one measurement site to another within the body. In this work, we define PAT as the time between the electrical pulse starting a heartbeat and the arrival of the pulse at a peripheral site, while PTT is the time it takes for the pulse wave to travel between two peripheral sites. Both PAT and PTT are inversely proportional to pulse wave velocity (PWV), which itself is highly correlated with blood pressure (Fung et al 2004, Foo and Wilson 2009, Schaanning and Skjaervold 2020, Finnegan et al 2021). Hospitalized patients are routinely monitored using standard bedside monitoring equipment that collects signals like electrocardiography (ECG) and photoplethysmography (PPG). These signals enable noninvasive measurement of the pulse wave at different sites, such as the heart for ECG and a peripheral site like a fingertip or toe for PPG (Smith et al 1999).

Large datasets of physiological waveforms have become more common in recent years (Lee et al 2011, 2022, Goodwin et al 2020, Johnson et al 2023, Miller et al 2023, Samad et al 2023), leading many researchers to investigate ways of estimating BP using PAT as a feature in data-driven models (Cattivelli and Garudadri 2009, Wang et al 2014, 2023, Xu et al 2017, Yoon et al 2017, Zhang et al 2017, Escobar-Restrepo et al 2018, Liang et al 2018, Dash et al 2019, Lee et al 2019, Bote et al 2020, Schaanning and Skjaervold 2020, Yang et al 2020, Finnegan et al 2021, Sanches et al 2024). However, the temporal integrity of the ECG and PPG waveforms in these databases has often been taken for granted. Signals collected from bedside monitoring equipment and clinical data aggregation systems can contain timing errors resulting from clock drift, digital filtering, and signal buffering (Goodwin et al 2021). For example, Lee et al observed a delay of approximately 350 ms between the ECG and PPG signals in the VitalDB database (Lee et al 2019). This issue is so salient that documentation for the often-used MIMIC dataset states that it is not designed for inter-waveform analysis (Johnson et al 2023), with Liang et al stating ‘one of the next steps is to investigate the synchronicity over the asynchronous MIMIC database’ (Liang et al 2019). Pulse oximetry from different models or manufacturers may preprocess PPG signals in different ways (Foo et al 2005), and medical devices may also alter the signals by applying proprietary pre-processing algorithms (Feldman et al 2021), potentially rendering the data unsuitable for some applications (Feldman 2006).

Ignoring the potential for offsets between the ECG and PPG signals may result in PAT values that are outside a physiologically plausible range. Some researchers have simply filtered out unusual PAT values to use values within what they believe to be a plausible range (He et al 2013, Finnegan et al 2021), but this practice has been inconsistently applied across analyses and has likely led to biased results. Therefore, standardized approaches to measurement must be developed to provide a foundation for research and analysis involving PAT.

To the best of our knowledge, nobody has deeply investigated and addressed the multitude of system-related timing errors that affect noninvasive PAT measurement. In this paper, we present a methodology for quantifying and correcting the discrepancies between ECG and PPG signals collected from standard bedside monitoring equipment. We apply these techniques to a large, retrospective dataset of physiological waveforms collected from over 10 000 critically ill children admitted to an ICU. We first utilize this wealth of data to support our assertion that these errors are derived from the system’s hardware. We then compare the distribution of PAT measurements before and after applying our methodology to demonstrate that addressing system-related timing errors is necessary to extract clinically meaningful data.

2.1. Naïve PAT measurement

PAT measurement requires the identification of the timestamps of fiducial points in both the ECG and the PPG signals. The R-peak is typically used as the timestamp of beat origination according to ECG because of its prominence relative to other components in the waveform, while in PPG, a wide range of possible choices for fiducial point are available (Mejía-Mejía et al 2022, Goda et al 2024, Poliński et al 2024).

If the ECG and PPG signals are perfectly synchronized, it is reasonable to assume that every beat in the ECG signal can be paired with the following beat in the PPG signal as PAT is being measured. This pairing, denoted as PAT0 in figure 1, is the default way of measuring PAT. However, the existence of system-related timing errors between the ECG and PPG signals can break this assumption, meaning that ECG beats should not necessarily be paired with the following beat in the PPG signal. Instead, they should possibly be paired with one further away in time; such possible pairings are denoted PAT1, $\dots$, PATn in figure 1. This issue is particularly salient when the delay in the PPG timestamps is large and/or when the heart rate is high. The remainder of this section outlines several factors that can result in errors in the PPG timestamps.
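The pairing problem described above can be sketched in a few lines of Python. This is a minimal illustration, not the measurement code used in this work: the function name `naive_pat`, the `skip` argument, and the example timings (400 ms IBI, 200 ms true PAT, 900 ms PPG delay) are all hypothetical.

```python
import numpy as np

def naive_pat(r_peaks_ms, ppg_onsets_ms, skip=0):
    """Pair each R-peak with the (skip+1)-th PPG onset that follows it.

    skip=0 is the naive 'next beat' pairing (PAT0 in figure 1);
    skip=n corresponds to PATn.
    """
    r_peaks_ms = np.asarray(r_peaks_ms, dtype=float)
    ppg_onsets_ms = np.asarray(ppg_onsets_ms, dtype=float)
    # Index of the first onset strictly after each R-peak, then skip ahead.
    idx = np.searchsorted(ppg_onsets_ms, r_peaks_ms, side="right") + skip
    valid = idx < len(ppg_onsets_ms)
    return ppg_onsets_ms[idx[valid]] - r_peaks_ms[valid]

# Hypothetical example: 400 ms IBI, 200 ms true PAT, 900 ms PPG delay.
r_peaks = np.arange(0, 4000, 400)
onsets = r_peaks + 200 + 900
pat0 = naive_pat(r_peaks, onsets)          # naive 'next beat' pairing
pat2 = naive_pat(r_peaks, onsets, skip=2)  # pairing two onsets later
```

In this example the naive pairing settles at 300 ms, which looks plausible but is wrong, whereas the pairing two onsets later recovers the full 1100 ms (true PAT plus delay). The first few values of each series are start-up transients because no earlier PPG onsets exist in the example.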

Figure 1. This figure shows the potentially incorrect measurement of PAT from ECG and PPG signals. The R-peak from the first ECG heartbeat (labeled Peak0) is selected as the first of two fiducial points that are required for the PAT calculation. Four possible PAT values are displayed, PAT0, PAT1, PAT2, and PAT3 measured to the 1st, 2nd, 3rd, and 4th heartbeat onsets in the PPG respectively. The typical measurement method for calculating PAT seen in prior work naïvely selects PAT0 (i.e. the ‘next beat’ in the PPG). However, characterization and correction of the system-related delays in the PPG timestamps will generally shift the PPG signal to the left. This results in a situation where, prior to synchronizing these signals, R-peaks may not be physiologically associated with the ‘next’ beat in the PPG signal.

2.2. Sources of PAT measurement error

PAT measurement is subject to many sources of error. Some of these errors can be attributed to algorithmic deficiencies like poor fiducial point detection, yet there are others that depend on the software and hardware used to record and collate signals prior to that step. Our dataset comes from an ICU setting where ECG and PPG signals are collected using hardware from two different device manufacturers connected to a bedside monitor (see section 3.1 for more details). We describe the contributors to the timing errors in this scenario below:

2.2.1. Sample rate imprecision and re-synchronization

After being timestamped by their respective hardware, third-party sensor devices pass their signals through a module connected to the bedside monitor, which then reassigns timestamps according to the monitor’s clock. Since the sample rates supplied by both device manufacturers are not precisely specified, there is a non-zero relative drift rate between the timestamps that are generated when using the nominal sample frequencies (Artoni et al 2018, Vollmer et al 2019). This relative drift rate is periodically corrected by synchronizing the two signals, but the continuous drift-and-correct cycle leads to a sawtooth-shaped series of timing errors observed by other researchers (Zhu et al 2015, Bennis et al 2019, Lin et al 2019); an example of this artifact is shown later in Panel B of figure 2. Upon observing this artifact, Bennis et al suggest, ‘This sawtooth, together with a large module-dependent absolute difference in PTT, renders the thus-derived PTT insufficient for clinical purposes’ (Bennis et al 2019). If the signal is passed through multiple devices (e.g. an extension module before going to the monitor), multiple sawtooth artifacts could be present within the same signal.
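To make the drift-and-correct cycle concrete, the sketch below generates a timing error that grows linearly and returns to zero at each re-synchronization. The drift rate and re-synchronization period are hypothetical values chosen for illustration, not measurements from our hardware.

```python
import numpy as np

def sawtooth_offset(t_s, drift_ms_per_s, resync_period_s):
    """Timing error (ms) that accumulates linearly and resets at each re-sync."""
    t_s = np.asarray(t_s, dtype=float)
    return drift_ms_per_s * np.mod(t_s, resync_period_s)

# Hypothetical values: 0.1 ms/s relative drift, corrected every 300 s.
t = np.arange(0, 900, 1.0)
offset = sawtooth_offset(t, drift_ms_per_s=0.1, resync_period_s=300.0)
```

A signal that passes through two interfaces could be modelled as the sum of two such terms with different periods and drift rates.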

Figure 2. The left side of the figure provides an overview of the methodology we employed for precise PAT measurement, while the right side shows the results of this procedure on a real-world PAT timeseries. (A) Naïve PAT measurements are taken between the ECG and the PPG assuming that ECG R-peaks should be matched with the ‘next beat’ in the PPG. (B) The beat-matched PAT is calculated using our beat-matching algorithm. (C) A Kalman filter is used to identify the sawtooth artifact that appears due to clock drift and synchronization. (D) After removing the first sawtooth artifact, a second one appears in our dataset due to multi-stage hardware. (E) The beat-matched PAT minus the synchronization-related delay accounts for the cumulative effects of the sawtooth artifacts. (F) The corrected PAT excludes the filter-based and buffering delays, which are both derived from hardware profiling. (G) The latent PAT signal is recovered by addressing measurement errors resulting from the finite temporal resolution of the fiducial points.

2.2.2. Filter-based delays

Clinical monitors typically apply digital filters to signals in an attempt to enhance signal quality for downstream visualization and analysis (Tompkins 1993, Mejía-Mejía et al 2022). PPG signals particularly require aggressive averaging and smoothing due to the amount of noise in the raw signals (Wolling and Van Laerhoven 2020). A byproduct of this process is delays and other timing errors that can impact the relative timing of fiducial points within the PPG signal itself and with respect to other signals (Liu et al 2021, Liao et al 2022, Wang et al 2022, Lapitan et al 2024). The characteristics of these filters are often considered proprietary and are likely to vary across different medical devices. This means that delays of different magnitudes may be present in signals collected from different medical device manufacturers.

2.2.3. Digital buffering

The main focus of a monitoring device is to provide a steady output visualization to facilitate clinical care. To this end, many sensors, clinical monitors, and data aggregation systems may buffer measurements to avoid pauses and discontinuities in their continuous output (Bracco and Backman 2012). The magnitude of the delays introduced by this buffering process may vary across devices and can be further compounded by the presence of intermediary data aggregation systems (Foo et al 2005).

2.2.4. Temporal resolution

Digital signal quantization and filtering algorithms can limit the temporal resolution of fiducial points that are used to measure PAT. The ECG and PPG signals in our setting are nominally quantized to 2 ms and 8 ms given their respective sampling rates of 500 Hz and 125 Hz. However, we will later explain that the fiducial points in the PPG signals in our dataset actually have a resolution of 16 ms. This means that while the actual fiducial point may lie anywhere within a 16 ms quantization window, the timestamp at which it is recorded is shifted to the end.
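The effect of recording a fiducial point at the end of its quantization window can be illustrated as follows. This is a sketch under the assumption that windows are aligned to multiples of the resolution from time zero; the function name and the example timestamp are hypothetical.

```python
import math

def quantize_to_window_end(t_ms, resolution_ms):
    """Recorded timestamp: the end of the quantization window containing t_ms."""
    return math.ceil(t_ms / resolution_ms) * resolution_ms

true_time = 1003.0                               # hypothetical onset time (ms)
recorded = quantize_to_window_end(true_time, 16) # 16 ms effective PPG resolution
error = recorded - true_time                     # always in [0, 16) ms
```

Here an onset at 1003 ms is recorded at 1008 ms, a 5 ms error. Under this model the error is never negative, which is why the quantization behaves as a one-sided shift rather than zero-mean noise.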

2.3. Summary

Timestamps in PPG signals can be altered by a range of different effects including clock drift, digital filters, and signal buffering. We will later demonstrate that the magnitude of system-related delays in the PPG signal can exceed 1300 ms. In pediatric populations, interbeat intervals (IBIs) can range from 350 ms to 750 ms (Fleming et al 2011, Eytan et al 2017), while PAT may vary from 100 to 370 ms depending on the size and age of the patient, sensor locations, and other factors (Deshmukh et al 2022). The delays introduced into the PPG signal’s timestamps result in a situation where heartbeats in the ECG and PPG signals can be out of phase by one or more beats, particularly in children with higher heart rates.
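A short calculation shows how quickly the signals fall out of phase. The sketch assumes a steady rhythm with a constant IBI and a hypothetical PAT of 250 ms. The function name is illustrative, and the case where the total delay is an exact multiple of the IBI is ignored.

```python
def beats_out_of_phase(pat_ms, delay_ms, ibi_ms):
    """Index n of the pairing PATn that recovers the correct beat,
    i.e. how many PPG onsets lie between an R-peak and its own pulse."""
    return int((pat_ms + delay_ms) // ibi_ms)

# Hypothetical PAT of 250 ms combined with the 1300 ms delay quoted above.
fast = beats_out_of_phase(250, 1300, 350)    # 350 ms IBI -> 4 beats
slow = beats_out_of_phase(250, 1300, 750)    # 750 ms IBI -> 2 beats
aligned = beats_out_of_phase(250, 0, 350)    # no system delay -> 0 beats
```

With these numbers, the naive ‘next beat’ pairing would be wrong by four beats at the fastest heart rates and by two beats at the slowest.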

In order to correctly measure PAT, we require a method that is capable of matching an R-peak in the ECG with its corresponding fiducial point in the PPG signal. Once this match has been found, we can measure the time between the R-peak and the fiducial point in the corresponding beat in the PPG, which we call the beat-matched PAT. This quantity also includes contributions from the data collection system, which we have categorized as follows:

Equation (1)

Once these systematic delays have been quantified, they can be subtracted from the beat-matched PAT to leave our corrected PAT as follows:

Equation (2)

Furthermore, the corrected PAT values are subject to measurement noise as per the following equation:

Equation (3)
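Read together, equations (1) to (3) describe an additive decomposition. One way to write that structure is sketched below. The subscript labels (bm, sync, filter, buffer, quant, true) are shorthand assumed here for the beat-matched PAT, the three delay sources in sections 2.2.1 to 2.2.3, the quantization noise in section 2.2.4, and the latent PAT; the additive form of the noise term is likewise an assumption.

```latex
\begin{align}
\mathrm{PAT}_{\mathrm{bm}} &= \mathrm{PAT}_{\mathrm{corrected}}
  + \mathrm{Delay}_{\mathrm{sync}} + \mathrm{Delay}_{\mathrm{filter}} + \mathrm{Delay}_{\mathrm{buffer}} \\
\mathrm{PAT}_{\mathrm{corrected}} &= \mathrm{PAT}_{\mathrm{bm}}
  - \left( \mathrm{Delay}_{\mathrm{sync}} + \mathrm{Delay}_{\mathrm{filter}} + \mathrm{Delay}_{\mathrm{buffer}} \right) \\
\mathrm{PAT}_{\mathrm{corrected}} &= \mathrm{PAT}_{\mathrm{true}} + \mathrm{Noise}_{\mathrm{quant}}
\end{align}
```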

The left side of figure 2 illustrates the various steps in our approach for generating corrected PAT measurements derived from ECG and PPG collected using standard bedside monitoring equipment. The right side of figure 2 illustrates the results of this process on a real-world PAT timeseries. Once the timing properties of the system have been characterized, this information can be used to correct inter-signal timing errors between the ECG and PPG, thereby synchronizing the two signals. We use these techniques to synchronize signals in our database, thereby allowing us to utilize traditional approaches for PAT measurement. We describe our approach both at a high level and in the context of the specific dataset we analyze in this paper.

3.1. Dataset

Physiological waveforms were collected from a 42-bed ICU at the Hospital for Sick Children (SickKids) in Toronto, Canada between 2017 and 2024. This data collection effort was approved by the hospital’s Research Ethics Board (REB#10 000 68 499) and complies with the Declaration of Helsinki. Explicit participant consent was not required for our protocol due to the nature of our study. The ECG and PPG signals were recorded at nominal frequencies of 500 Hz and 125 Hz respectively. These signals were stored on a secure server in an AtriumDB database (Goodwin et al 2020). Prior to 2022, signals were collected from Philips IntelliVue MP70 patient monitors coupled with Masimo PPG sensors that were connected through an X2 IntelliBridge device interface. In mid-2022, the monitoring hardware was upgraded to Philips MX750 monitors with an X3 IntelliBridge device interface.

3.2. Fiducial point detection

We used the findpeaks() function from the Neurokit toolbox to identify R-peaks in the ECG waveform. Meanwhile, we used a method developed by Kavsaoğlu et al (2016) and integrated in the BioSPPy package to detect the onset of beats in the PPG waveform. While the peak of the pulse wave is also often used as the fiducial point in PPG waveforms, we elected to adopt the onset of each pulse wave since that has been shown to be more reliable (Finnegan et al 2021, Mejía-Mejía et al 2022).

Critically ill children generally have substantially higher heart rates than adults, which required changing the default parameters in these libraries to extend the range of allowable detected heart rates. We adopted an adaptive approach based on the 1 Hz pulse rate signal provided by the Philips patient monitor. If the maximum heart rate from this signal was less than 250 bpm, we used that value as the setting for the max_bpm parameter in BioSPPy, and we used its reciprocal for the mindelay parameter in Neurokit. Otherwise, we set max_bpm to 250 bpm and mindelay to 240 ms. To avoid performing calculations on windows with missed or spurious heartbeats, we removed windows for which the standard deviation in either the ECG or PPG IBIs was greater than 300 ms. We also removed windows for which the difference in mean IBI between the ECG and PPG was greater than 200 ms.

3.3. Temporal quantization of fiducial points and IBIs

The temporal resolution of PAT measurements is limited by the sampling rate of the signals (Williams 2016, Herndon 2017). There is also a causal relationship between the phenomena being recorded by the signals and the conversion of the digital measurement itself (Goodwin 2023), so we posit that measurements describe events that happened sometime prior to their quantized timestamps. As illustrated in figure 3, each sample in the 500 Hz ECG signal summarizes the heart’s electrical activity over the preceding 2 ms. This means that any measured R-peaks may have actually occurred up to 2 ms prior. Similar logic would lead one to believe that each sample in the 125 Hz PPG summarizes blood volume changes over the preceding 8 ms. However, a close examination of the PPG IBIs contradicted this expectation.

Figure 3. The impact of quantization noise on the discrepancy between the measured PAT and the latent PAT. While the temporal resolution of fiducial points in the 500 Hz ECG is 2 ms, the effective resolution of the 125 Hz PPG is 16 ms when PAT is being measured.

Figure 4 shows the distribution of differences between consecutive IBIs in the ECG and PPG signals. While the separation between ECG IBIs was always a multiple of 2 ms, the separation between PPG IBIs was highly likely (roughly 95%) to be an even number of samples. In other words, the resolution of the beat onsets extracted from the PPG signal was 16 ms, or two consecutive 8 ms samples. This observation persisted when different techniques were used for fiducial point detection, confirming that this was a property of the signal and not a limitation of the algorithms. Therefore, the effective resolution of the pulse onsets in the PPG in our dataset is actually 16 ms when we are measuring PAT, which we treat as quantization noise in our methodology. We address this source of error in section 3.7.
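The check described above can be reproduced with a short function that counts how often the difference between consecutive IBIs spans an even number of samples. This is an illustrative sketch; the function name and the two example onset sequences are hypothetical.

```python
import numpy as np

def even_sample_fraction(onsets_ms, sample_period_ms):
    """Fraction of consecutive-IBI differences spanning an even number of samples."""
    ibis = np.diff(np.asarray(onsets_ms, dtype=float))
    diffs = np.diff(ibis)
    n_samples = np.rint(diffs / sample_period_ms).astype(int)
    return float(np.mean(n_samples % 2 == 0))

# Hypothetical onsets that all fall on a 16 ms grid.
coarse = even_sample_fraction([0, 400, 816, 1216, 1648], sample_period_ms=8)
# Hypothetical onsets that use the full 8 ms grid.
fine = even_sample_fraction([0, 400, 808, 1208], sample_period_ms=8)
```

A fraction close to 1 for a 125 Hz signal indicates an effective resolution of 16 ms rather than the nominal 8 ms.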

Figure 4. The distribution of differences between consecutive IBIs derived from (A) ECG and (B) PPG.

3.4. Beat matching

We extend the algorithm proposed by Goodwin et al (2023) to match beats across the ECG and PPG signals. This technique relies on identifying segments with similar sequences of IBIs between the two signals. For a given beat i, we generated a unique signature $\mathrm{Sig}_i = \{\mathrm{IBI}_1, \mathrm{IBI}_2, \dots, \mathrm{IBI}_{20}\}$ corresponding to the IBIs of the 20 beats after it. We discarded any signatures containing an IBI that would have resulted in an instantaneous heart rate measurement below 50 bpm or above 250 bpm since they were likely to be compromised by missed or spurious heartbeats.
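A minimal sketch of the signature construction is given below. The function name and the example beat sequence are hypothetical, and the dictionary keyed by beat index is only one possible data structure.

```python
import numpy as np

def build_signatures(beat_times_ms, length=20, min_bpm=50.0, max_bpm=250.0):
    """Map beat index i to the vector of the `length` IBIs that follow beat i."""
    ibis = np.diff(np.asarray(beat_times_ms, dtype=float))
    lo, hi = 60000.0 / max_bpm, 60000.0 / min_bpm   # 240 ms to 1200 ms
    signatures = {}
    for i in range(len(ibis) - length + 1):
        sig = ibis[i:i + length]
        # Discard signatures containing a physiologically implausible IBI.
        if np.all((sig >= lo) & (sig <= hi)):
            signatures[i] = sig
    return signatures

# Hypothetical example: 30 IBIs of 400 ms with one 2000 ms gap (a missed beat).
example_ibis = [400.0] * 30
example_ibis[25] = 2000.0
beat_times = np.concatenate(([0.0], np.cumsum(example_ibis)))
sigs = build_signatures(beat_times)
```

Only the signatures for beats 0 to 5 survive in this example, because every later signature contains the 2000 ms gap.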

Signatures with more drastic changes in IBI were assumed to be more unique, so we used the following as a score of the signature’s uniqueness:

Equation (4)

Beyond calculating this score, we discarded signatures for which the 75th percentile of IBI differences fell below the corresponding signal’s quantization (2 ms for ECG, 16 ms for PPG). Signatures that failed this criterion were not necessarily low-quality, but they were challenging to match across signals due to a lack of uniqueness.

We compared the remaining signatures across the two signals according to their Euclidean distance. In our specific implementation, we bounded this search procedure to only compare signatures within 6 beats of one another given empirical observations that inter-signal timing delays typically did not exceed that duration. Matches between signatures with low uniqueness scores were less reliable, so we prioritized matches between vectors with higher IBI variability. Therefore, applying our beat-matching algorithm for a given beat i in the ECG entailed minimizing the following expression to identify the corresponding beat j in the PPG:

Equation (5)
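The bounded search can be sketched as follows. This simplified version uses the plain Euclidean distance only and omits the uniqueness weighting, so it is not the expression in equation (5). It also assumes that beat indices in the two signals start from a roughly common origin. The function name and the synthetic IBI sequence are hypothetical.

```python
import numpy as np

def match_beat(i, ecg_sigs, ppg_sigs, max_offset=6):
    """Return (j, distance) for the PPG signature closest to ECG signature i."""
    best_j, best_d = None, np.inf
    for j in range(i - max_offset, i + max_offset + 1):
        if j in ppg_sigs:
            d = float(np.linalg.norm(ecg_sigs[i] - ppg_sigs[j]))
            if d < best_d:
                best_j, best_d = j, d
    return best_j, best_d

# Hypothetical, deterministic IBI sequence with beat-to-beat variability.
ibis = 400.0 + ((np.arange(60) ** 2) % 41)
ecg_sigs = {i: ibis[i:i + 20] for i in range(30)}
# The PPG carries the same sequence, displaced by three beats.
ppg_sigs = {i + 3: ibis[i:i + 20] for i in range(30)}
j, d = match_beat(10, ecg_sigs, ppg_sigs)
```

In this example ECG beat 10 is matched to PPG beat 13 with zero distance, recovering the three-beat displacement.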

Once all the beats were matched, we calculated the PAT for all the matched beats within a 60-minute frame. The distribution of these beat-matched PATs contained multiple peaks roughly spaced according to the average IBI. We assumed that the most prominent peak corresponds to correctly matched beats, while the remaining peaks contain mismatches. The mismatches could hypothetically be corrected instead of discarded, but for this work, we elected to discard these cases in order to provide the most robust signal for subsequent steps in the analysis. We algorithmically performed this final filtering step by discarding PATs beyond two standard deviations from the frame’s mean.
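The final filtering step amounts to a two-standard-deviation cut within each frame. A minimal sketch follows, with a hypothetical function name and example values.

```python
import numpy as np

def filter_frame(pats_ms, n_std=2.0):
    """Keep PATs within n_std standard deviations of the frame's mean."""
    pats = np.asarray(pats_ms, dtype=float)
    keep = np.abs(pats - pats.mean()) <= n_std * pats.std()
    return pats[keep]

# Hypothetical frame: 19 correctly matched beats and one mismatch that is
# displaced by a full 400 ms IBI.
frame = [1100.0] * 19 + [1500.0]
kept = filter_frame(frame)
```

The single mismatched value lies well outside two standard deviations of the frame mean and is discarded, leaving the 19 consistent measurements.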

3.5. Characterization of clock drift artifacts

As mentioned in section 2.2.1, clock drift and re-synchronization can produce artifacts in the PAT time series; in fact, there can be multiple sawtooth artifacts superimposed on one another when signals are passed through multiple interfaces. The periods and amplitudes of these artifacts can be identified via manual inspection of a beat-matched PAT timeseries. While we used manual inspection to confirm the presence of two sawtooth artifacts in our dataset, we propose an automated approach that estimates the characteristics (amplitude, period, and phase) of a sawtooth artifact in the presence of physiological variation and other measurement noise.

As outlined in algorithm 1, the technique centers around a Kalman filter that tracks the linear rise of beat-matched PAT measurements over time. We chose a Kalman filter because it provides a principled way to estimate the underlying state of a system in the presence of noise, making it well-suited for tracking gradual, noisy trends like clock-induced drift. Its adaptive nature allows real-time processing and dynamic adjustment in response to gradual physiological changes or abrupt synchronization resets. When the difference between the predicted state and a new measurement (i.e. the measurement innovation) exceeds 16 ms, the Kalman filter resets under the assumption that a clock synchronization event has occurred. This threshold is based on the resolution of PATs in our dataset. To make this algorithm robust to outliers, the Kalman filter only resets after two samples exceed the threshold. Once a cycle of the sawtooth artifact has been identified, the estimated measurements are used to fit a line. The line is subtracted from the PAT measurements that were fed into the Kalman filter to suppress the drift during that cycle.
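The reset-and-detrend logic can be sketched in Python as below. This is an illustration under several assumptions that are not taken from our implementation: a scalar random-walk Kalman filter, hypothetical noise parameters q and r, and a correction that subtracts only the rise of the fitted line relative to the start of each cycle so that the PAT level is preserved. Isolated outliers are left uncorrected.

```python
import numpy as np

def remove_sawtooth(t_s, pat_ms, threshold_ms=16.0, q=0.01, r=4.0):
    """Track PAT with a scalar Kalman filter and detrend each cycle between resets."""
    t_s = np.asarray(t_s, dtype=float)
    pat_ms = np.asarray(pat_ms, dtype=float)
    corrected = pat_ms.copy()
    x, p = pat_ms[0], r            # state estimate and its variance
    period, states = [0], [x]      # sample indices and KF states in this cycle
    reset_count, resets = 0, []

    def detrend(idx, est):
        if len(idx) >= 2:
            slope, _ = np.polyfit(t_s[idx], est, 1)
            corrected[idx] -= slope * (t_s[idx] - t_s[idx[0]])

    for k in range(1, len(pat_ms)):
        innovation = pat_ms[k] - x
        if abs(innovation) < threshold_ms:
            p = p + q                          # predict (random-walk model)
            gain = p / (p + r)
            x = x + gain * innovation          # update
            p = (1.0 - gain) * p
            period.append(k)
            states.append(x)
            reset_count = 0
        else:
            reset_count += 1
            if reset_count > 1:                # two consecutive exceedances
                detrend(period, states)
                resets.append(k - 1)           # first sample after the re-sync
                x, p = pat_ms[k], r
                period = [k - 1, k]
                states = [pat_ms[k - 1], pat_ms[k]]
                reset_count = 0
    detrend(period, states)
    return corrected, resets

# Hypothetical noiseless example: 200 ms PAT, 0.1 ms/s drift, re-sync every 300 s.
n = np.arange(2250)
t = 0.4 * n
pat = 200.0 + 0.04 * (n % 750)
corrected, resets = remove_sawtooth(t, pat)
```

In this example the two re-synchronization events are detected at samples 750 and 1500, and the roughly 30 ms sawtooth is reduced to a residual well under 1 ms.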

Algorithm 1. An abridged version of the linear Kalman filter (KF) technique used to track a sawtooth artifact in a PAT time series.

Initialize KF parameters
$x_0 \gets PAT_0$         $\triangleright$ Initialize KF state with first PAT measurement
$PeriodPoints \gets []$         $\triangleright$ Initialize array to store all points in sawtooth period
$ResetCount \gets 0$         $\triangleright$ Initialize check for synchronization reset
for all $PAT_k$ do
  $\tilde{y}_k = PAT_k - x_{k-1}$         $\triangleright$ Calculate measurement innovation
  if $|\tilde{y}_k| \lt$ threshold then
    Calculate $x_k$ using linear KF update
    $PeriodPoints \gets PeriodPoints + [x_k]$         $\triangleright$ Add KF state to list of points in sawtooth period
    $ResetCount \gets 0$         $\triangleright$ Reset counter to continue tracking
  else
    $ResetCount \gets ResetCount + 1$
    if $ResetCount > 1$ then
      Calculate the line of best fit from PeriodPoints
      Correct $PAT_k$ in PeriodPoints using line of best fit
      Reset KF parameters
    end if
  end if
end for

3.6. Hardware profiling

The combined filter-based and buffering delay can be inferred if the synchronization-related delay and the expected distribution of PATs are known. Since we did not have access to the latter, we conducted explicit experiments with the hardware at our institution in order to empirically estimate the filter-based delay and the buffering delay. We first synthetically generated ECG and PPG waveforms representative of cardiac function. This entailed using a PS420 multi-parameter simulator for synthetic ECG and a Masimo SET Tester for synthetic PPG. We then injected simultaneous voltage step changes into both signals before they arrived at the transducers. The voltage changes were introduced every 20 s, alternating between stepped increases and decreases in voltage, and this was done out of phase with the heartbeats to avoid coincidental alignment that would affect our observations.

As shown in figure 5, the voltage step changes appeared at different times in the two signals collected by the monitor. We visually annotated and measured two intervals associated with each artifact. The first was the relative difference between the time that the artifact arrived in each signal ($\Delta t_1$), while the second was the relative time difference between the introduction of the artifact into the ECG signal and the moment the PPG waveform morphology began to deviate from its baseline pattern ($\Delta t_2$). These measurements incorporate different combinations of system-related delays:

Figure 5. Example output from the hardware profiling experiment. Artificially generated artifacts were simultaneously introduced into the ECG and PPG signals. $\Delta t_1$ represents the relative time difference between the expression of the artifact in the ECG and PPG signals. $\Delta t_2$ describes the time from the introduction of the artifact into the ECG signal to the moment the PPG signal begins to deviate from the repeating pattern in the synthetically generated signal. The ECG and PPG signals were generated using different heart rates, but this did not affect the experiment since the artifact detection process is independent of physiology.

Equation (6)
