Clinical validation of AI‐assisted contouring in prostate radiation therapy treatment planning: Highlighting automation bias and the need for standardized quality assurance

Najmeh Arjmandi (Gerash University of Medical Sciences), Ahmadreza Sebzari (Birjand University of Medical Sciences), Fatemeh Molaei (Birjand University of Medical Sciences), Saeid Rezaei (Tehran University of Medical Sciences), Maryam Rezaie-Yazdi (Birjand University of Medical Sciences), Malihe Rezaie-Yazdi (Birjand University of Medical Sciences)
Journal of Applied Clinical Medical Physics
December 19, 2025
Cited by 1 | Open Access

Abstract

Purpose: This study evaluated the impact of a commercial AI‐assisted contouring tool on intra‐ and inter‐observer variability in prostate radiation therapy and assessed the dosimetric consequences of geometric contour differences.

Methods: Two experienced radiation oncologists independently delineated the clinical target volume (CTV) and organs at risk (OARs) for prostate cancer patients. Manual contours (C_man) and AI‐generated contours (C_AI) were compared with adjusted AI contours (C_AI,adj). A consensus reference (C_ref) served as the benchmark. To evaluate clinical impact, treatment plans were recalculated and replanned on each contour set under identical beam geometries to assess dose–volume histogram (DVH) parameters.

Results: AI‐assisted contouring significantly improved both intra‐ and inter‐observer agreement. In the inter‐observer analysis, the Dice similarity coefficient (DSC) for the CTV increased from 0.78 (± 0.11) for C_man to 0.89 (± 0.09) for C_AI,adj. Similarly, in the intra‐observer analysis, both oncologists showed significantly higher DSCs for C_AI,adj than for C_man. Geometric comparison against C_ref showed that although adjusting C_AI improved accuracy, C_AI,adj generally did not surpass C_man for the CTV and rectum. Dosimetric analyses demonstrated that, under fixed plan geometry, both C_man and C_AI,adj yielded lower planning target volume (PTV) D95% values than C_ref, whereas after replanning, all plans met institutional criteria with no clinically significant differences among contour sets.

Conclusion: AI‐assisted contouring in prostate radiotherapy reduced intra‐ and inter‐observer variability and improved contouring consistency. However, C_AI,adj did not consistently surpass C_man, especially for the CTV and rectum, where automation bias or selective clinical acceptance may have influenced edits. Fixed‐plan recalculations revealed dose differences arising from minor geometric deviations. These findings underscore the importance of structured quality assurance (QA) and human oversight to mitigate automation bias while leveraging AI's efficiency. The single‐institution design, with two oncologists and one AI software package, limits generalizability and underscores the need for multi‐observer validation.
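
The two quantities the abstract reports, the Dice similarity coefficient between contours and the PTV D95%, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `dice` and `d_percent` are hypothetical, contours are assumed to be already rasterized into flat binary voxel masks on a common grid, and all voxels are assumed to have equal volume so that a volume fraction equals a voxel fraction.

```python
from math import ceil


def dice(mask_a, mask_b):
    """Dice similarity coefficient 2|A n B| / (|A| + |B|) for two flat 0/1 masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same voxel grid")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        # Assumed convention: two empty structures count as perfect agreement.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)


def d_percent(voxel_doses, percent=95.0):
    """Minimum dose received by the hottest `percent` of the structure volume.

    Assumes equal-volume voxels; `voxel_doses` holds one dose value per voxel
    inside the structure (for example, the PTV).
    """
    if not voxel_doses:
        raise ValueError("structure contains no voxels")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ranked = sorted(voxel_doses, reverse=True)      # hottest voxel first
    k = ceil(len(ranked) * percent / 100.0)         # voxels needed to cover `percent`
    return ranked[k - 1]


# Toy data, purely illustrative (not taken from the study).
manual_mask = [1, 1, 1, 0]
adjusted_ai_mask = [0, 1, 1, 1]
agreement = dice(manual_mask, adjusted_ai_mask)

ptv_doses = [float(d) for d in range(1, 21)]        # 20 voxels, arbitrary dose units
ptv_d95 = d_percent(ptv_doses, 95.0)
```

Under this definition, a fixed‐plan recalculation amounts to re‐evaluating `d_percent` on the same dose grid with a different structure mask, which is why small geometric shifts at the target edge can lower D95% even when the DSC remains high.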

