This paper presents a GPU-accelerated limited-area atmospheric simulation of a tropical cyclone. The atmospheric model is ported to the GPU using the OpenACC directive-based programming model, and the GPU implementation of its main functions and kernels is discussed. The GPU-accelerated code produces high-fidelity simulations of a realistic tropical cyclone forced by observational latent heating. Performance tests show that the GPU-accelerated code yields energy-efficient simulations and scales well in both the strong-scaling and weak-scaling limits.