ViT-WoundNet: An Explainable Deep Learning Framework for Wound Type and Severity Classification

Authors

  • Rupesh Kumar Sah, Indian Institute of Technology, Patna
  • Anil Verma, Indian Institute of Technology, Patna
  • Kumar Prasun, Indian Institute of Technology, Patna
  • Rajiv Misra, Indian Institute of Technology, Patna

DOI:

https://doi.org/10.65091/icicset.v2i1.16

Abstract

Deep learning advances have revolutionized automated wound assessment, yet convolutional neural networks (CNNs) struggle with inter-class similarities, illumination variations, and generalization across diverse datasets. This work introduces ViT-WoundNet, a Vision Transformer (ViT)-based framework that leverages self-attention to capture long-range spatial dependencies and global context, effectively addressing these limitations. ViT-WoundNet outperforms CNN baselines—ConvNeXt (85.4%), EfficientNet (86.7%), and ResNet50 (74.1%)—achieving 91.3% accuracy for wound-type classification and 88.4% for severity classification across 2,372 multi-class images drawn from five public datasets. Confusion matrices and Grad-CAM visualizations confirm superior inter-class separability and clinically interpretable feature localization, positioning ViT-WoundNet as a robust, explainable solution for telemedicine applications.

Published

2025-12-24

How to Cite

[1]
R. K. Sah, A. Verma, K. Prasun, and R. Misra, “ViT-WoundNet: An Explainable Deep Learning Framework for Wound Type and Severity Classification”, ICICSET2025, vol. 2, no. 1, Dec. 2025.