Objective. Heart failure (HF) is considered a global pandemic because of increasing prevalence, high mortality rate, frequent hospitalization, and associated economic burden. This study explores a noninvasive method that may help in managing HF patients by predicting HF readmission. Methods. Seismocardiogram (SCG) signal is the low-frequency chest vibration produced by the mechanical activity of the heart. SCG signal was acquired from 101 patients with HF, including those readmitted to the hospital during the study period. SCG signals were segmented into heartbeats and clustered based on respiration phases. Features were extracted from each cluster. Several conventional machine learning (ML) models were developed using selected SCG and heart rate variability features. Furthermore, SCG signals were transformed into images using a time–frequency distribution method. Images were used to train a deep learning model. The models were able to predict the readmission status of HF patients. Results. ML algorithms achieved higher accuracy than the deep learning model in classifying the readmitted and non-readmitted HF patients. K-nearest neighbor achieved the highest classification accuracy (89.4% accuracy, 87.8% sensitivity, 90.1% specificity, 78.2% precision, and 82.7% F1-score). A detailed discussion of the extracted features was provided, correlating them with HF conditions. Conclusions. The study results suggest that SCG signals may be useful for readmission prediction of HF patients.
Heart failure (HF) is a chronic progressive medical condition marked by the diminished capacity of the heart to effectively pump blood. HF is a major global health concern with an estimated 64 million cases worldwide (Savarese and Lund 2017) and 6 million in the United States (Virani et al 2020). The US figure is projected to rise to 8.5 million by 2030 (Bozkurt et al 2023). This increasing prevalence is mainly attributable to aging populations, who are at greater risk of developing HF. Advances in medical diagnosis and treatment have improved survival rates, prolonging life in individuals with HF (Savarese and Lund 2017). Nevertheless, the mortality rate related to HF is still very high. A meta-analysis by Jones et al in 2018 showed that the 1- and 5-year survival rates of HF are 86.5% and 56.7%, respectively (Jones et al 2019). According to a more recent study by Bozkurt et al, 28% of 263 525 patients died during the first year after their first HF hospitalization (2023). The healthcare costs related to HF are also substantial (Lesyuk et al 2018). The total cost of HF in the US was estimated at $43.6 billion and is projected to increase to $70 billion by 2030 (Urbich et al 2020, Heidenreich et al 2022). The main driver of HF healthcare cost is hospitalization (Shafie et al 2018), as HF is associated with very high hospital readmission rates. After discharge, about 25% and 50% of HF patients are readmitted within 30 d and 6 months, respectively (Virani et al 2020, Khan et al 2021). With the increase in HF prevalence, the readmission rate and associated costs are likely to increase in the coming years. Therefore, early readmission prediction may allow interventions that reverse patient deterioration and avoid readmission.
HF can be classified based on left ventricular ejection fraction (LVEF). LVEF is the fraction of blood pumped out of the heart’s left ventricle (LV) during systole. It provides a measurement of LV systolic function, which is responsible for ejecting oxygenated blood from the heart to the rest of the body. The normal range of LVEF is 50%–70% (Lang et al 2015). The classification of HF by LVEF is given in table 1.
Table 1. Classification of HF according to left ventricular ejection fraction (LVEF). Here, HFrEF is HF with reduced ejection fraction, HFmrEF is HF with mildly reduced ejection fraction, and HFpEF is HF with preserved ejection fraction (Heidenreich et al 2022).
| HF class | LVEF |
|---|---|
| HFrEF | ⩽40% |
| HFmrEF | 41%–49% |
| HFpEF | ⩾50% |

HFrEF comprises approximately 50% of total HF cases (Murphy et al 2020). Patients with HFrEF have a higher mortality rate than those with HFpEF (Somaratne et al 2009, Burkhoff 2012). Although all-cause readmission is higher in HFpEF, HF readmission is higher in HFrEF (Cui et al 2020). In addition, the cost of readmission is higher in HFrEF patients (Sheikh et al 2021). Regardless of the HF class, the high readmission rate is avoidable with preventive measures (Desai and Stevenson 2012). Stauffer (2011) demonstrated that a post-discharge transitional care program can greatly reduce the HF readmission rate and the associated cost. Taking this into account, continuous efforts have been made to build an early and reliable HF readmission prediction model that may help clinicians make timely, targeted interventions to prevent readmissions.
Electronic health records (EHRs) and wearable sensors are the main data sources that have been used to predict HF readmission. EHRs include patient demographics, medications, vital signs, medical history, laboratory data, etc. Intrathoracic impedance, electrocardiogram (ECG), and seismocardiogram (SCG) signals can be acquired with wearable devices and used as predictors of HF readmission. The predictive accuracy reported by these studies varies widely. Shameer et al (2016) used EHR data and achieved 83.19% accuracy in 1068 patients. In another study, a sensitivity of 48% and a specificity of 70% were achieved using medical data of 10 757 HF patients (Awan et al 2019). A review article by Liu et al showed that B-type natriuretic peptide (BNP) and N-terminal pro-brain natriuretic peptide (NT-proBNP) are the most used predictors from the EHR data (Liu et al 2022).
Other authors used sensor data to predict HF readmission. Intrathoracic impedance-based models obtained variable predictive accuracy, ranging from 21% to 76%, which indicates the uncertainty in predicting HF readmission (Yu et al 2005, Cleland and Antony 2011, Heist et al 2014, Stehlik et al 2020). In Stehlik et al (2020), ECG, skin impedance, temperature, and other signals were acquired from 100 patients at home with a multisensor patch for 3 months. High prediction accuracy was achieved (sensitivity = 86%, specificity = 87.5%) using the sensor data, although the study required baseline data for analysis. Boehmer et al used defibrillators implanted in patients to acquire data to predict hospitalization (Boehmer et al 2017). Invasive accelerometer-acquired heart sounds (similar to SCG), heart rate, intrathoracic impedance, respiration rate, and tidal volume data were collected from the implanted device, which was able to alert clinicians before HF hospitalization (sensitivity = 70%). In another SCG-based study, Lin et al identified HF patients by calculating LVEF from SCG and ECG signals (Lin et al 2018). In that study, 40 subjects were enrolled (25 HF and 15 healthy). The ratio of pre-ejection period to left ventricular ejection time was calculated from SCG and ECG signals and was found to be inversely correlated with LVEF (correlation coefficient 0.73). A threshold ratio of 0.33 distinguished HF from healthy participants with 96% accuracy (sensitivity 98% and specificity 94%). Inan et al used SCG signals to distinguish between compensated and decompensated HF patients (2018). The patients needed to perform the 6 min walk test (6MWT) in this study. Similarity between SCG signals before and after the test was used as a metric to differentiate the two groups. Higher similarity was found in decompensated patients, suggesting their reduced cardiovascular reserve.
Although the above studies had several limitations, such as requiring baseline data, requiring patients to perform the 6MWT, or using invasive measurements, they demonstrated the merit of the SCG signal in predicting HF readmission. The current study investigates the feasibility of using SCG and machine learning (ML) algorithms for HF readmission prediction when baseline measurements are not available.
2.1. Data acquisition
The dataset used in this study was collected at Advent Health Orlando after IRB approval by the University of Central Florida (protocol number: BIO-16-12783; the date of approval: March 6, 2023). The study was carried out according to the principles outlined in the Declaration of Helsinki. HF patients were recruited after their discharge from the hospital. Overall, 101 patients were included in this study. Data was acquired in single or multiple sessions per patient, following their provision of written informed consent. After an observer manually checked the data, 24 recording sessions were excluded due to poor quality of the acquired signals (zero voltage or noisy signal). This resulted in the exclusion of 20 patients from the study. Data analysis was performed in the remaining 81 patients who had a total of 142 sessions. The demographic information of the subjects is shown in table 2.
Table 2. Available demographics. Age information was not available.
| Category | Details |
|---|---|
| Gender | Male: 62, female: 19 |
| Height (m) | 1.74 ± 0.11 |
| Weight (kg) | 101.7 ± 30.7 |
| BMI (kg m−2) | 33.3 ± 9.4 |
| HF status | HFrEF: 75, HFpEF: 6 |
| NYHA classification^a | I: 2, II: 14, III: 28, IV: 19 |

^a NYHA classification was reported in 63 subjects.
After the initial discharge, 22 patients (who attended 41 recording sessions) were readmitted to the hospital during the window of data acquisition (six months). The protocol included 3 min of data acquisition in each session when patients were sitting on a 45° inclined exam table with their legs extended. The following three signals were acquired from the patients:
i. SCG: Acquired using a tri-axial accelerometer (Model: 356A32, PCB Piezotronics, Depew, NY) placed on the chest surface at the 4th intercostal space near the left lower sternal border. The signal was amplified using a signal conditioner (Model: 482C, PCB Piezotronics, Depew, NY) with a gain of 100. The x, y, and z axes of the accelerometer point in the lateral (left to right), caudocranial (head to toe), and dorsal–ventral (normal to chest surface) directions, respectively. This study analyzes the z-axis of the accelerometer.
ii. ECG: Acquired by IX-B3G bio-potential recorder (iWorx Systems, Inc., Dover, NH).
iii. Galvanic skin response (GSR): Provides an estimate of lung volume (Azad et al 2018). Acquired by IX-B3G bio-potential recorder.
All the signals were acquired at a sampling rate of 10 kHz. A schematic representation of data acquisition is shown in figure 1.
Figure 1. Schematic of experiment setup.
2.2. Data analysis
Overview: The workflow diagram of data analysis is shown in figure 2. The process started with filtering raw signals (band pass = 0.5–100 Hz), followed by the segmentation of SCG and ECG signals (Azad et al 2023). After that, SCG beats were clustered using an unsupervised clustering method (k-medoids clustering) (Gamage et al 2020). The clustering was correlated to the respiration phases, which were obtained from the GSR signal. This clustering provides a medoid SCG beat for each cluster. Clustering features (described below) were extracted using the relationship between the medoid SCG beats and the rest of the SCG beats. Other time- and frequency-domain features were extracted from the cluster ‘representative’ beats (described below). Conventional ML models were trained and tested using selected SCG features along with a few heart rate variability (HRV) features. This concludes the first approach of analysis, which utilizes conventional ML.
Figure 2. Flow diagram of data analysis.
In the second approach, a few SCG beats (3–5) that were closest (in terms of waveform shape) to the medoid beats were transformed into images using a time–frequency distribution method (polynomial chirplet transform or PCT). The images were fed to a CNN model for training and testing.
2.2.1. Preprocessing
After visually checking the signal quality, noisy portions of the data were discarded. This noise mainly came from patient movements. The rest of the data (usually 100–140 s) was considered for analysis. The raw ECG, SCG, and GSR signals were downsampled to 1 kHz. After that, ECG and SCG signals were forward–backward filtered using a 4th order Chebyshev type 2 bandpass filter with cutoff frequencies of 0.5 and 100 Hz. The GSR signal was detrended, and a flow rate signal was calculated by differentiating the GSR signal.
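The downsampling and zero-phase filtering steps can be sketched as follows. This is a minimal illustration using SciPy, not the authors' code: the stopband attenuation `rs_db`, the use of `decimate` for downsampling, and the function name `preprocess` are assumptions, since the text specifies only the filter type, order, and cutoff frequencies. Note that for a Chebyshev type 2 design, SciPy interprets the band edges as stopband edges.

```python
import numpy as np
from scipy import signal

FS_RAW = 10_000  # acquisition rate stated in the text (Hz)
FS = 1_000       # rate after downsampling (Hz)


def preprocess(x, rs_db=40.0):
    """Downsample 10 kHz -> 1 kHz, then zero-phase band-pass 0.5-100 Hz."""
    # decimate low-pass filters the signal before keeping every 10th sample
    x_ds = signal.decimate(x, FS_RAW // FS, ftype="fir", zero_phase=True)
    # 4th-order Chebyshev type II band-pass; rs_db is an assumed value
    sos = signal.cheby2(4, rs_db, [0.5, 100.0], btype="bandpass",
                        fs=FS, output="sos")
    # forward-backward filtering gives zero phase distortion
    return signal.sosfiltfilt(sos, x_ds)


# Synthetic check: a 10 Hz component plus a DC offset and a 300 Hz tone
t = np.arange(0, 20.0, 1 / FS_RAW)
raw = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 300 * t) + 2.0
clean = preprocess(raw)
```

In this synthetic case the offset and the 300 Hz tone fall in the stopbands, so only the 10 Hz component should remain.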
2.2.2. Segmentation and normalization
The R-peaks of the ECG signal were detected using the Pan–Tompkins algorithm (Tompkins 1985). SCG and ECG beats were chosen to start 0.1 s before the ECG R-wave and end 0.1 s before the next R-wave. After segmentation, each SCG beat was normalized by its peak-to-peak amplitude.
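A minimal sketch of the segmentation and normalization rule, assuming the R-peak sample indices are already available (R-peak detection itself is not shown). The function name `segment_beats` and the handling of a first beat that would start before the recording are assumptions.

```python
import numpy as np


def segment_beats(scg, r_peaks, fs=1000, pre_s=0.1):
    """Cut beats from 0.1 s before each R-peak to 0.1 s before the next one."""
    pre = int(round(pre_s * fs))
    beats = []
    for r0, r1 in zip(r_peaks[:-1], r_peaks[1:]):
        start, stop = r0 - pre, r1 - pre
        if start < 0:
            continue  # assumed: skip a beat that would begin before sample 0
        beat = np.asarray(scg[start:stop], dtype=float)
        p2p = beat.max() - beat.min()
        if p2p > 0:
            beats.append(beat / p2p)  # normalize by peak-to-peak amplitude
    return beats


# Example with a synthetic signal and hypothetical R-peak locations
scg = 3.0 * np.sin(2 * np.pi * 5 * np.arange(3000) / 1000)
beats = segment_beats(scg, r_peaks=[150, 1000, 1900])
```

Because the beats follow the R-R intervals, their lengths differ from beat to beat.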
2.2.3. Unsupervised clustering (k-medoid clustering)
Previous studies reported that SCG signals show morphological variability (Azad et al 2019, Sandler et al 2019, Gamage et al 2020). Clusters of similar SCG beats were found to correlate with the respiration phases. It was suggested that grouping SCG beats into two clusters lowers the variability and makes feature extraction more accurate (Gamage et al 2020). To group SCG beats with similar morphology, the k-medoids clustering method was used. This unsupervised method requires two initial beats, and efficient clustering depends on good initialization. In the current study, the SCG beats were initially divided into two groups based on either lung volume (high and low) or flow rate (high and low). SCG beats are considered more similar when the distance between them is smaller. Dynamic time warping (DTW) and cross-correlation were the two methods chosen to measure the distance (i.e. morphological dissimilarity) between SCG beats. After dividing the beats into two groups, the center beat of each group was chosen as the beat with the minimum sum of distances to the other beats in the same group. These two center beats, referred to as the initial medoids, were used to initialize the k-medoids method. The clustering process then began: the algorithm repeatedly updated the cluster medoids by calculating the sum of distances and then updated the clusters by grouping beats with similar morphology, as measured by DTW distance. The algorithm stopped when the assignment of SCG beats to clusters did not change in two consecutive iterations. As there were two bases of grouping (lung volume and flow rate) and two distance measures (DTW and cross-correlation), all four combinations for obtaining the initial medoids were tried.
The combination that produced the best clustering of SCG beats was selected. Clustering quality was also checked by plotting the clustered beats in a lung volume-flow rate space (figure 3). A decision boundary was drawn to visualize the separation of the beats into two clusters.
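The clustering loop described above can be sketched as follows. This is an illustrative reading of the procedure rather than the authors' implementation: `dtw_distance` is a plain textbook DTW with an absolute-difference cost, and the names `medoid_of`, `two_medoids`, and `init_groups` are hypothetical. The initial groups stand in for the lung-volume or flow-rate split.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic dynamic time warping distance with |a_i - b_j| local cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m]


def medoid_of(indices, dist):
    """Member with the minimum sum of distances to the rest of its group."""
    indices = np.asarray(indices)
    sub = dist[np.ix_(indices, indices)]
    return int(indices[np.argmin(sub.sum(axis=1))])


def two_medoids(dist, init_groups, max_iter=100):
    """Two-cluster k-medoids started from the medoids of two initial groups."""
    medoids = [medoid_of(g, dist) for g in init_groups]
    labels = None
    for _ in range(max_iter):
        new_labels = np.argmin(dist[:, medoids], axis=1)  # nearest medoid
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments unchanged in two consecutive iterations
        labels = new_labels
        for k in range(2):
            members = np.flatnonzero(labels == k)
            if len(members) > 0:
                medoids[k] = medoid_of(members, dist)
    return labels, medoids


# Eight synthetic "beats" of two shapes, with a deliberately imperfect start
t = np.linspace(0.0, 1.0, 30)
beats = [np.sin(2 * np.pi * t) * (1 + 0.01 * k) for k in range(4)]
beats += [np.sin(4 * np.pi * t) * (1 + 0.01 * k) for k in range(4)]
dist = np.array([[dtw_distance(a, b) for b in beats] for a in beats])
labels, medoids = two_medoids(dist, init_groups=[[0, 1, 2, 4], [3, 5, 6, 7]])
```

Working from a precomputed distance matrix keeps the loop independent of the distance measure, so a cross-correlation-based distance could be substituted for DTW when forming the initial medoids.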
Figure 3. K-medoid clustering of SCG beats of a representative recording session (Subject 25, 3rd session) in lung volume-flow rate space. Blue circles and red triangles are the beats of the two clusters. A decision boundary (dashed line) is plotted to show the clear separation between the two clusters.
After obtaining the cluster medoids, the 15% of SCG beats closest (measured by DTW distance) to the medoid signal in each cluster were averaged to create an SCG beat that is representative of that cluster. Features were extracted from both cluster medoids and cluster representatives.
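A short sketch of how such a representative beat might be formed. It assumes the beats of a cluster have already been brought to a common length (the text does not say how beats of different durations were aligned before averaging) and that their DTW distances to the medoid are given. The function name `cluster_representative` is hypothetical.

```python
import numpy as np


def cluster_representative(beats, dist_to_medoid, fraction=0.15):
    """Average the given fraction of beats that are closest to the medoid."""
    beats = np.asarray(beats, dtype=float)         # shape: (n_beats, n_samples)
    k = max(1, int(round(fraction * len(beats))))  # at least one beat
    nearest = np.argsort(dist_to_medoid)[:k]
    return beats[nearest].mean(axis=0)


# 20 constant "beats" whose value equals their distance to the medoid
beats = np.tile(np.arange(20, dtype=float)[:, None], (1, 5))
rep = cluster_representative(beats, dist_to_medoid=np.arange(20))
```

With 20 beats, 15% corresponds to the three nearest beats, so the representative here is their sample-wise mean.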
2.2.4. Feature extraction and selection
In total, 63 SCG features were extracted. These include clustering, time-domain, and frequency-domain features. In addition, 8 HRV features were added to complete the feature set. The random forest (RF) algorithm was employed for feature selection. RF is a widely used algorithm that belongs to the embedded family of feature selection methods. Embedded methods combine the benefits of the other two families (filter and wrapper): they interact with the classifier, like wrapper methods, while being computationally lighter, and they can produce better classification results (Guo et al 2019, Pudjihartono et al 2022). Eleven features were selected (7 SCG and 4 HRV features). A list of selected features is given in table 3, and the feature importance scores are provided in figure 4.
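Gini importance, the score used for ranking, accumulates the impurity decrease that each feature achieves at the tree splits where it is used. The sketch below shows that building block for a single threshold split on binary labels. It is a hand-written illustration of the quantity, not a random forest, and the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)  # equals 1 - p^2 - (1 - p)^2


def gini_decrease(values, labels, threshold):
    """Impurity decrease obtained by splitting one feature at a threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
    return gini(labels) - weighted


labels = [0, 0, 1, 1]
informative = gini_decrease([0.1, 0.2, 0.8, 0.9], labels, threshold=0.5)
uninformative = gini_decrease([0.1, 0.8, 0.2, 0.9], labels, threshold=0.5)
```

A feature that separates the classes perfectly removes all impurity, while one that leaves both child nodes as mixed as the parent contributes nothing.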
Figure 4. Feature importance scores of input variables computed using random forest based on Gini importance.
Table 3. Selected SCG (1–7) and HRV (8–11) features with short descriptions. The features (4–7) are obtained by averaging the features from the two cluster representative waveforms (for each recording session).
| Feature index | Feature name | Description |
|---|---|---|
| SCG features | | |
| 1 | Intra-session waveform variability before clustering (WVbc) | The dissimilarity among the SCG beats within a session, calculated using the dynamic time warping (DTW) distance. Symbols: C, medoid beat before clustering; Xi, ith SCG beat; li, warping path length; n, number of SCG events in a session. |
| 2 | Inter-cluster waveform variability (WVinter) | Average dissimilarity between the medoid of a cluster and the SCG beats of the other cluster. Symbols: n1, n2, number of events in clusters 1 and 2; C1, C2, SCG medoids of clusters 1 and 2; Xi1, Xi2, ith SCG event of clusters 1 and 2. |
| 3 | Intra-cluster waveform variability (WVintra) | Average dissimilarity between the medoid and the SCG beats of the same cluster. |
| 4 | Average RMS amplitude of instantaneous frequency (Fins) | Fins was calculated as the frequency first moment of the time–frequency distribution (PCT), normalized by the integral of PCT at that time instant: $F_{ins}(t) = \int f\,\mathrm{PCT}(t,f)\,\mathrm{d}f \,/ \int \mathrm{PCT}(t,f)\,\mathrm{d}f$. The RMS of Fins was then calculated over the duration of the beats under consideration. |
| 5 | Average turning point ratio (TPR) | Quantifies the randomness in a time-series signal. |
| 6 | Average sample entropy (SmEn) | $\mathrm{SmEn} = -\ln(A/B)$, where B and A are the numbers of matched template pairs of length m and m + 1 in the waveform, respectively (Richman and Moorman 2000). |
| 7 | Average Higuchi dimension (DH) | Measures the irregularity in a time-series signal (Higuchi 1988). |
| HRV features | | |
| 8 | Low frequency power (LFP) | Spectral power of heart rate (HR) in the 0.04–0.15 Hz frequency band. |
| 9 | High frequency power (HFP) | Spectral power of HR in the 0.15–0.4 Hz frequency band. |
| 10 | Total power (TP) | Total spectral power of HR in the 0–0.4 Hz frequency band. |
| 11 | pNN50 | Proportion of successive RR intervals that differ by more than 50 ms. |

2.2.5. Image construction using time–frequency conversion
For the deep learning approach (approach 2 in figure 2), the PCT (a time–frequency distribution, TFD, method) of the SCG signals was calculated to produce images. Depending on the length of the session data, 3–5 SCG beats closest (as measured by DTW) to the medoid signals were processed by PCT. This resulted in 2D images with time and frequency information on the horizontal and vertical axes, respectively (figure 5(b)). The PCT coefficient values were displayed using the ‘Parula’ colormap. PCT has been found to be better suited than other TFD methods for SCG and heart sound-related studies (Taebi and Mansy 2017, Bao et al 2023).
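The same time–frequency distribution underlies feature 4 of table 3 (instantaneous frequency). A minimal sketch of the first-moment computation is given below, assuming the distribution is already available as a non-negative array `tfd[frequency, time]` on a uniform frequency grid. The PCT itself is not implemented here, and the array layout and function names are assumptions.

```python
import numpy as np


def instantaneous_frequency(tfd, freqs):
    """First frequency moment of a TFD at each time instant.

    tfd   : array of shape (n_freqs, n_times), non-negative energy values
    freqs : array of shape (n_freqs,), uniformly spaced frequencies in Hz
    """
    tfd = np.asarray(tfd, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    numerator = (freqs[:, None] * tfd).sum(axis=0)  # sum of f * TFD(t, f)
    denominator = tfd.sum(axis=0)                   # sum of TFD(t, f)
    return numerator / denominator  # the frequency step cancels in the ratio


def rms(x):
    """Root-mean-square value of a sequence."""
    return float(np.sqrt(np.mean(np.square(x))))


# Toy distribution: all energy sits in the 20 Hz bin at every time instant
freqs = np.arange(0.0, 101.0, 1.0)
tfd = np.zeros((len(freqs), 50))
tfd[20, :] = 1.0
f_ins = instantaneous_frequency(tfd, freqs)
f_ins_rms = rms(f_ins)
```

The sketch assumes every time column carries some energy; columns that sum to zero would need separate handling.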
Figure 5. (a) SCG medoid beat of a representative subject (subject 3, session 3), and (b) the corresponding time–frequency distribution coefficient heatmap as calculated by PCT.
2.2.6. Conventional ML algorithms
Three different ML algorithms were employed to evaluate the efficacy of the feature set in predicting HF readmission. These methods are k-nearest neighbor (KNN), multilayer perceptron neural network (MLP-NN), and extreme gradient boosting (XGBoost). Since there was an imbalance in the number of observations between the two classes, the decision threshold governing the conversion of the prediction probability to a class label was shifted from the default value of 0.5 and tuned to 0.7 to maximize sensitivity. The leave-one-subject-out cross-validation (LOOCV) approach was used for testing to avoid subject bias.
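The two evaluation mechanics in this paragraph, subject-wise folds and a shifted decision threshold, can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, and the text does not state which class the thresholded probability refers to, so `to_label` simply treats its input as the probability of the class coded as 1.

```python
import numpy as np


def leave_one_subject_out(subject_ids):
    """Yield (train_idx, test_idx) pairs, one fold per subject.

    All sessions of the held-out subject go to the test set, so no
    subject contributes to both training and testing in a fold.
    """
    subject_ids = np.asarray(subject_ids)
    for s in np.unique(subject_ids):
        test = np.flatnonzero(subject_ids == s)
        train = np.flatnonzero(subject_ids != s)
        yield train, test


def to_label(prob_class1, threshold=0.7):
    """Convert predicted probabilities to labels with a tuned threshold."""
    return (np.asarray(prob_class1) >= threshold).astype(int)


# Six sessions from three hypothetical subjects
subjects = [1, 1, 2, 3, 3, 3]
folds = list(leave_one_subject_out(subjects))
labels = to_label([0.6, 0.7, 0.9])
```

Splitting by subject rather than by session matters here because many patients contributed several sessions, and sessions from one patient are likely to resemble each other.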
2.2.7. Convolutional neural network
For image classification, the Residual Networks (ResNet-34) model was used. ResNets have been widely used in image classification since being introduced by He et al (2015). Several ResNet-based time–frequency image classification tasks have been studied previously (Diker et al 2019, Zhang et al 2021, Liu et al 2022). In this study, a 34-layer CNN network, ResNet-34, was used. Images were resized to 224 by 224 pixels with nearest neighbor interpolation to match the input requirement of ResNet-34. Image augmentation was performed by transformations such as random flips (horizontal and vertical) and rotation. The Adam optimizer with a learning rate of 0.000 008 was chosen. Cross-entropy loss metric was used for performance measurement. The number of epochs was 30 with a batch size of 8.
A balanced dataset, including all the readmitted patients and a subset of non-readmitted patients, was created to address the class imbalance issue for the CNN. The number of observations in the two classes was balanced by randomly undersampling the majority class (non-readmitted patients). This dataset had 38 patients with 90 sessions (22 readmitted with 41 sessions), on which the model was trained and tested using LOOCV. The remaining 43 non-readmitted patients with 52 sessions were not included in the training and were only used for out-of-sample testing. These patients were tested using a model trained on data from all the sessions of the 38 patients. This also mimics a real-life application of the developed deep learning model, where the model is trained using the available HF patient data, and the trained model predicts the readmission of future HF patients.
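The patient counts above imply that 16 of the 59 non-readmitted patients (81 minus 22) joined the balanced set and the other 43 were held out. A minimal sketch of such a patient-level split is shown below. The patient identifiers, the seed, and the function name are hypothetical, and how the 16 patients were actually chosen is not described in the text.

```python
import random


def undersample_patients(readmitted, non_readmitted, n_keep, seed=0):
    """Keep all readmitted patients plus a random subset of the others."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    kept = rng.sample(sorted(non_readmitted), n_keep)
    held_out = sorted(set(non_readmitted) - set(kept))
    balanced = sorted(readmitted) + sorted(kept)
    return balanced, held_out


# Hypothetical patient identifiers: 22 readmitted, 59 non-readmitted
readmitted = [f"R{i:02d}" for i in range(22)]
non_readmitted = [f"N{i:02d}" for i in range(59)]
balanced, held_out = undersample_patients(readmitted, non_readmitted, n_keep=16)
```

Sampling whole patients rather than individual sessions keeps every session of a held-out patient out of training.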
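Five metrics were used to report the results (equations (1)–(5)), where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively:
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \qquad (1)$$
$$\mathrm{Specificity} = \frac{TN}{TN + FP} \qquad (2)$$
$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (3)$$
$$\mathrm{F1\text{-}score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Sensitivity}}{\mathrm{Precision} + \mathrm{Sensitivity}} \qquad (4)$$
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (5)$$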
The results obtained are presented in tables 4 and 5, and the ROC curves are shown in figure 6.
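As a worked check of the five metrics, the sketch below computes them from confusion-matrix counts. The counts used (TP = 36, FN = 5, TN = 91, FP = 10) are not given in the paper. They are inferred here as one set of counts that fits the 41 readmitted and 101 non-readmitted sessions and agrees with the KNN figures reported in the abstract to within 0.1 percentage point.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard confusion-matrix metrics for a binary classifier."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Inferred (not reported) counts for the KNN model: 41 positive sessions
# (readmitted) and 101 negative sessions (non-readmitted), 142 in total.
m = classification_metrics(tp=36, fn=5, tn=91, fp=10)
```

With these counts, ten false positives against 36 true positives account for the precision being noticeably lower than the sensitivity and specificity.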
Figure 6. ROC curves for (a) the machine learning models (b) CNN models. The optimum thresholds are indicated by the yellow points (0.7 for ML and 0.5 for CNN).
Table 4. Performance of the conventional machine learning models. Boldface marks the best value in each column.
| Model | Sensitivity | Specificity | Precision | F1-score | AUC | Accuracy |
|---|---|---|---|---|---|---|
| KNN | 0.88 | 0.90 | 0.78 | 0.83 | 0.88 | 0.89 |
| MLP-NN | 0.88 | 0.81 | 0.65 | 0.75 | 0.84 | 0.83 |
| XGBoost | 0.85 | 0.80 | 0.64 | 0.73 | 0.83 | 0.82 |

Table 5. Performance metrics for the CNN model. The 1st row shows the leave-one-subject-out cross-validation (LOOCV) metrics for the balanced dataset (38 patients). The 2nd row shows combined results after adding out-of-sample test set results.
| ResNet-34 | Sensitivity | Specificity | Precision | F1-score | AUC | Accuracy |
|---|---|---|---|---|---|---|
| LOOCV | 0.80 | 0.82 | 0.78 | 0.79 | 0.86 | 0.81 |
| Combined | 0.80 | 0.81 | 0.63 | 0.70 | 0.87 | 0.81 |

These results suggest that conventional ML algorithms performed better than the deep neural network (DNN) model, with higher sensitivity. Specifically, KNN outperformed all other models with close to 90% accuracy.
The quantitative comparisons of different HF readmission prediction models are presented in table 6.
Table 6. Performance comparison of various methods for predicting patient readmission. Boldface marks the best value in each column.
| Methods | Subjects (readmitted) | Data type | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC |
|---|---|---|---|---|---|---|
| Shameer et al (2016) | 1068 (178) | Electronic medical record | 83.19 | — | — | 0.78 |
| Awan et al (2019) | 10 757 (2546) | Electronic health record | 64.9 | 48.42 | 70.01 | 0.63 |
| Cleland and Antony (2011) | 501 (58) | Thoracic impedance | — | 42.1 | | |
| Stehlik et al (2020) | 100 (49) | ECG, accelerometry, skin impedance, temperature, activity, posture | — | 87.5 | 86.0 | 0.89 |
| Yu et al (2005) | 33 (10) | Thoracic impedance | — | 76.9 | | — |
| Boehmer et al (2017) | 900 (146) | Heart sounds, thoracic impedance, heart rate, activity, respiration rate | — | 70 | 85.7 | — |
| This study | 81 (22) | SCG, ECG, GSR | 88.9 | 87.8 | 90.1 | 0.88 |

A non-invasive approach to predicting HF readmission was proposed and tested in this study. The linear acceleration in the dorsal–ventral direction was analyzed and used to classify HF patients (readmitted vs non-readmitted). Data analysis was performed using two different approaches: (a) conventional ML and (b) deep learning. In the first approach, features were first extracted from SCG beats and HRV. Feature selection was performed, followed by classification with three different ML algorithms. For the second approach, a time–frequency distribution (PCT) was applied to convert the time-domain signal into a 2D image with time and frequency information. The images were resized and fed into a CNN network (ResNet-34) for classification.
Results showed that handcrafted features provided better accuracy than the CNN method. One reason for this may be the inclusion of HRV features in the feature set, which were not provided to the CNN model. Given the higher performance of conventional ML models (with the SCG and HRV features), a discussion relating these features to HF conditions may be useful. The focus here is on the SCG clustering features and the HRV features.
The first three features in table 3 are the SCG clustering features. Th