ViT-WoundNet: An Explainable Deep Learning Framework for Wound Type and Severity Classification
DOI: https://doi.org/10.65091/icicset.v2i1.16

Abstract
Advances in deep learning have transformed automated wound assessment, yet convolutional neural networks (CNNs) struggle with inter-class similarity, illumination variation, and limited generalization across diverse datasets. This work introduces ViT-WoundNet, a Vision Transformer (ViT)-based framework that leverages self-attention to capture long-range spatial dependencies and global context, directly addressing these limitations. Evaluated on 2,372 multi-class images drawn from five public datasets, ViT-WoundNet achieves 91.3% accuracy for wound-type classification and 88.4% for severity grading, outperforming the CNN baselines ConvNeXt (85.4%), EfficientNet (86.7%), and ResNet50 (74.1%). Confusion matrices and Grad-CAM visualizations confirm superior inter-class separability and clinically interpretable feature localization, positioning ViT-WoundNet as a robust, explainable solution for telemedicine applications.
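To make the self-attention mechanism referenced above concrete, the following is a minimal NumPy sketch of single-head scaled dot-product attention over a sequence of image patch tokens. It is an illustrative sketch of the generic mechanism, not the authors' implementation; the token count, embedding dimension, and function name are assumptions for the example.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Single-head attention; q, k, v are (num_tokens, d) arrays.

    Every output token is a weighted mix of ALL value tokens, which is
    how a ViT captures long-range spatial dependencies in one layer.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                  # pairwise token affinities
    scores -= scores.max(axis=-1, keepdims=True)   # softmax numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # rows sum to 1
    return weights @ v, weights

# Toy example: 4 "patch tokens" with 8-dimensional embeddings (assumed sizes).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out, attn = scaled_dot_product_attention(x, x, x)
print(out.shape)   # (4, 8)
```

In a full ViT, the returned attention weights are also what Grad-CAM-style visualizations draw on to show which image regions drove a prediction.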