Dual-Scale Transformer-Guided Attention Network for Efficient Multi-OAR Segmentation in Head and Neck Radiotherapy
DOI:
https://doi.org/10.58190/imiens.2025.126
Keywords:
Deep learning, segmentation, organ-at-risk, CT images, convolutional neural networks, dual-scale transformer-guided attention network
Abstract
Accurate segmentation of organs-at-risk (OARs) in head and neck CT images is crucial for radiotherapy planning, but it remains challenging because of anatomical complexity, low soft-tissue contrast, and the presence of small, variable structures. We propose DSTANet, a dual-scale transformer-guided attention network that integrates multi-resolution encoding, transformer-based global context fusion, and anatomically guided attention refinement for multi-OAR segmentation. Unlike traditional CNN-based methods, DSTANet models long-range spatial dependencies while preserving high-resolution boundary detail. On the HNSCC-3DCT-RT dataset, DSTANet achieved a mean Dice score of 97.5% and a mean 95th-percentile Hausdorff distance (HD95) of 2.32 mm; on the MICCAI 2015 benchmark dataset, it achieved a Dice score of 90.0%, surpassing several state-of-the-art approaches in both overlap and geometric accuracy. Together with an inference time under 20 seconds, these results indicate that DSTANet is a robust and clinically viable solution for automated head and neck OAR segmentation.
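For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a Dice score and an HD95 value can be computed for a pair of binary masks. It is an illustration only, not the evaluation code used for DSTANet. The function names, the toy masks, and the voxel spacing are hypothetical. Two simplifying assumptions are made: distances are taken over all foreground voxels rather than extracted surface voxels, and the 95th percentile is taken over the pooled distances from both directions (some toolkits instead take the maximum of the two directed percentiles).

```python
import math


def dice_score(pred, truth):
    """Dice overlap between two masks given as sets of voxel indices."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def _nearest_distances(src, dst, spacing):
    """Distance in mm from each voxel in src to its nearest voxel in dst."""
    def to_mm(p):
        return tuple(c * s for c, s in zip(p, spacing))

    dst_mm = [to_mm(q) for q in dst]
    return [min(math.dist(to_mm(p), q) for q in dst_mm) for p in src]


def percentile(values, q):
    """Percentile with linear interpolation between the closest ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def hd95(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """95th percentile of pooled nearest-neighbour distances, both directions.

    Assumes both masks are non-empty. Brute force, so only suitable for
    small illustrative inputs.
    """
    pooled = (_nearest_distances(pred, truth, spacing)
              + _nearest_distances(truth, pred, spacing))
    return percentile(pooled, 95)


# Toy example: two 2x2 patches that overlap in half of their voxels.
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
truth = {(0, 0, 1), (0, 1, 1), (0, 0, 2), (0, 1, 2)}

dice = dice_score(pred, truth)                        # 0.5
hd_unit = hd95(pred, truth)                           # 1.0 (unit spacing)
hd_scaled = hd95(pred, truth, spacing=(1, 1, 2.5))    # 2.5 (hypothetical spacing)
```

Dice measures volumetric overlap and is dimensionless, whereas HD95 is expressed in millimetres and depends on the voxel spacing, which is why the two are usually reported together.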

Copyright (c) 2025 Intelligent Methods In Engineering Sciences

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.