Comparative study of dermatologists and deep learning model on diagnosing childhood vitiligo

Objective

To explore the performance of a deep learning (DL) model based on dermoscopy images in diagnosing childhood vitiligo.

Methods

A total of 474 pediatric patients (223 with vitiligo and 251 without vitiligo) were enrolled. Three types of imaging data were collected: dermoscopic images, Wood’s lamp images, and standard clinical photographs. Two diagnostic evaluation approaches were established: (1) clinician-based assessment, in which eight dermatologists performed a double-blind evaluation using dermoscopic images; and (2) DL-based assessment, in which ResNet152 and DenseNet121 models were trained on 3896 dermoscopic images (with an 8:2 split between the development set and validation set). The evaluation metrics were the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, specificity, F1-score, and accuracy. Additionally, the correlation between clinicians’ diagnostic performance and their years of experience was analyzed.
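For readers less familiar with the threshold-based metrics listed above, the following minimal sketch shows how they are conventionally derived from the four cells of a binary confusion matrix. The function name and the example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    sensitivity: float  # also called recall: TP / (TP + FN)
    specificity: float  # TN / (TN + FP)
    precision: float    # TP / (TP + FP)
    accuracy: float     # (TP + TN) / all cases
    f1: float           # harmonic mean of precision and sensitivity


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> BinaryMetrics:
    """Compute standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return BinaryMetrics(sensitivity, specificity, precision, accuracy, f1)


# Hypothetical counts for illustration only (not the study's data).
example = binary_metrics(tp=40, fp=10, tn=40, fn=10)
print(example)
```

AUC is not included in this sketch because it depends on the model's continuous output scores across all thresholds rather than on a single confusion matrix.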

Results

In ROC curve analysis with the training questionnaire as the control group, dermatologists diagnosing vitiligo from dermoscopy images alone achieved an AUC of 0.77 (95 % CI: 0.51–1.00), a sensitivity of 0.88 (95 % CI: 0.53–0.99), and a specificity of 0.75 (95 % CI: 0.41–0.96). The confusion matrix for the ResNet152 model indicated an accuracy of 83.08 %, a recall of 86.84 %, a precision of 81.08 %, a specificity of 79.22 %, and an F1 score of 0.8386, with an AUC of 0.91. The confusion matrix for the DenseNet121 model indicated an accuracy of 81.41 %, a recall of 83.41 %, a precision of 82.03 %, a specificity of 79.12 %, and an F1 score of 0.8271, with an AUC of 0.89.
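As a quick consistency check, the reported F1 scores can be recomputed from the reported precision and recall using the standard harmonic-mean definition. The sketch below uses only the values stated above; the function and variable names are illustrative.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as stated in the Results above.
reported = {
    "ResNet152": (0.8108, 0.8684, 0.8386),
    "DenseNet121": (0.8203, 0.8341, 0.8271),
}

recomputed = {}
for name, (precision, recall, f1_reported) in reported.items():
    recomputed[name] = f1_from_precision_recall(precision, recall)
    print(f"{name}: recomputed F1 = {recomputed[name]:.4f}, reported F1 = {f1_reported:.4f}")
```

For both models the recomputed F1 agrees with the reported value to four decimal places.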

Conclusion

Both DL models based on dermoscopy images exhibit high overall classification performance in the diagnosis of childhood vitiligo.
