Dual-Scale Transformer-Guided Attention Network for Efficient Multi-OAR Segmentation in Head and Neck Radiotherapy

Authors

U. Nawaz, H. M. Ubaidullah, Z. Saeed, C. M. Ali Nawaz

DOI:

https://doi.org/10.58190/imiens.2025.126

Keywords:

deep learning, segmentation, organ-at-risk, CT images, convolutional neural networks, dual-scale transformer-guided attention network

Abstract

Accurate segmentation of organs-at-risk (OARs) in head and neck CT images is crucial for radiotherapy planning, but it remains challenging because of anatomical complexity, low soft-tissue contrast, and the presence of small, variable structures. We propose DSTANet, a novel dual-scale transformer-guided attention network that integrates multi-resolution encoding, transformer-based global context fusion, and anatomically guided attention refinement to deliver precise multi-OAR segmentation. Unlike traditional CNN-based methods, DSTANet effectively models long-range spatial dependencies while preserving high-resolution boundary detail. On the HNSCC-3DCT-RT dataset, DSTANet achieved a mean Dice score of 97.5% and a mean 95th percentile Hausdorff distance (HD95) of 2.32 mm. On the MICCAI 2015 benchmark dataset, it achieved a Dice score of 90.0%. These results surpass several state-of-the-art approaches in both overlap and geometric accuracy. Combined with a sub-20-second inference time, they establish DSTANet as a robust and clinically viable solution for automated head and neck OAR segmentation.
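
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how a Dice score and an HD95 value can be computed for a single organ. It is an illustration only and not the evaluation code used for DSTANet. The function names, the representation of a mask as a collection of voxel coordinates, the linear-interpolation percentile, and the choice of taking the larger of the two directed 95th percentiles are all assumptions made for this example. Evaluation toolkits typically restrict the distance computation to surface voxels and may define the percentile differently.

```python
import math


def dice_score(pred, truth):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) for two collections of voxel coordinates."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))


def percentile(values, q):
    """Percentile with linear interpolation between ranks, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def directed_distances(src, dst, spacing):
    """Distance in mm from every point in src to its nearest point in dst."""
    def dist(a, b):
        return math.sqrt(sum(((x - y) * s) ** 2 for x, y, s in zip(a, b, spacing)))
    # Brute force: fine for a toy example, too slow for full CT volumes.
    return [min(dist(a, b) for b in dst) for a in src]


def hd95(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Symmetric HD95: the larger of the two directed 95th-percentile distances."""
    pred, truth = list(pred), list(truth)
    if not pred or not truth:
        raise ValueError("HD95 is undefined when either point set is empty")
    forward = percentile(directed_distances(pred, truth, spacing), 95)
    backward = percentile(directed_distances(truth, pred, spacing), 95)
    return max(forward, backward)


# Toy example: a 2x2 block of voxels and the same block shifted by one voxel.
truth_voxels = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)]
pred_voxels = [(1, 0, 0), (1, 1, 0), (2, 0, 0), (2, 1, 0)]
toy_dice = dice_score(pred_voxels, truth_voxels)
toy_hd95 = hd95(pred_voxels, truth_voxels, spacing=(1.0, 1.0, 1.0))
```

In this toy case half of each block overlaps the other, so the Dice score is 0.5, and the non-overlapping voxels lie one voxel away from the other block. The `spacing` argument converts voxel offsets into millimetres, which is why HD95 is reported in mm while Dice is a unitless ratio.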

Published

2025-08-31

Section

Research Articles

How to Cite

[1]
U. Nawaz, H. M. Ubaidullah, Z. Saeed, and C. M. Ali Nawaz, “Dual-Scale Transformer-Guided Attention Network for Efficient Multi-OAR Segmentation in Head and Neck Radiotherapy”, Intell Methods Eng Sci, vol. 4, no. 2, pp. 38–53, Aug. 2025, doi: 10.58190/imiens.2025.126.
