GPU-Accelerated Monte Carlo Methods for Proton Therapy: Integrating Heterogeneous Runtime Systems and Unified Memory Strategies for Clinical-Grade Dose Calculation

Dr. Marcellus V. Carter

Authors

Dr. Marcellus V. Carter, Department of Computational Medicine, University of Edinburgh

Keywords:

Proton therapy, Monte Carlo simulation, GPU acceleration, unified memory

Abstract

This article investigates the integration of GPU-accelerated Monte Carlo (MC) dose calculation for proton therapy with modern heterogeneous runtime systems and unified memory strategies. The work synthesizes principles from GPU programming, runtime compilation, managed runtime systems, and medical physics MC simulation to propose a cohesive framework for clinical-grade, high-throughput proton dose calculation. The motivation is that proton therapy demands highly accurate dose calculations that account for complex particle transport physics while delivering results within clinical time constraints. Traditional CPU-based MC codes offer high fidelity but limited throughput; GPU implementations have demonstrated orders-of-magnitude speedups, yet they bring challenges in memory management, precision, platform heterogeneity, and security. The paper reviews the computational and physical foundations of MC simulation in proton therapy, examines GPU programming models and thread hierarchies, surveys existing GPU-based MC systems and their verification and validation efforts, analyzes unified memory and its performance implications, and proposes a methodology that couples just-in-time (JIT) GPU compilation, managed runtimes, and careful algorithmic restructuring to reconcile precision, performance, and maintainability. The results section provides a descriptive analysis of expected performance gains, trade-offs among memory strategies, and implications for clinical deployment. The discussion explores limitations, potential failure modes, and avenues for future work, including validation workflows, regulatory considerations, and hybrid CPU–GPU orchestration. The conclusion synthesizes these threads into actionable recommendations for researchers and system designers seeking to produce clinically viable, reproducible, and secure GPU-accelerated Monte Carlo proton therapy dose engines.
Keywords: proton therapy, Monte Carlo simulation, GPU acceleration, unified memory, just-in-time compilation, heterogeneous runtime, clinical dose calculation.
