April 10, 2018
Over the past year, Prof. Chandrasekaran and Robert Searles have teamed up with researchers Wayne Joubert and Oscar Hernandez of Oak Ridge National Lab in an effort to accelerate a miniapp called Minisweep. Minisweep is representative of the main computational kernel of a production Sn radiation transport application called Denovo, which is used for nuclear reactor neutronics modeling. This code is instrumental for nuclear scientists because modeling neutron flow helps them avoid a nuclear meltdown, and it provides them with information that is essential in designing an effective shielding system around the reactor to protect workers from excessive radiation exposure.
Denovo was one of six applications selected for early application readiness on ORNL’s Titan system under the Center for Accelerated Application Readiness (CAAR) project, and it is part of the Exnihilo code suite, which received an R&D 100 award for modeling the Westinghouse AP1000 reactor. It is currently used by a DOE Innovative and Novel Computational Impact on Theory and Experiment (INCITE) project to model the International Thermonuclear Experimental Reactor (ITER) fusion reactor.
Studying and accelerating Minisweep is extremely important because Minisweep’s sweep kernel is responsible for 80-99% of Denovo’s overall runtime. Accelerating this kernel allows domain scientists to run more configurations of Denovo in a fixed walltime. The more configurations scientists are able to run, the better their reactor modeling will be. Minisweep’s computation is also representative of a code structure called wavefront. Wavefront codes are very common in computational science applications, so the study of Minisweep has broader impact outside of the domain of nuclear reactor modeling. An animation (Image Credit: Evan Krape, UDEL) of this wavefront computational pattern is shown below.
Robbie successfully accelerated Minisweep using OpenACC, a directive-based programming model that lets programmers insert abstract annotations in their code, which a compiler can then translate into code that runs on a specified target hardware architecture. This means that it will no longer be necessary to rewrite Minisweep every time a new type of hardware appears. Robbie’s results below compare his OpenACC implementation to Wayne’s OpenMP and CUDA implementations of the code. The fact that the OpenACC version achieved performance in the same ballpark as the CUDA version has created excitement around this work.
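To make the wavefront idea and the directive-based approach a little more concrete, here is a minimal sketch in C. It is not Minisweep's kernel, which is considerably more involved; the function name wavefront_sweep, the 2D grid, and the simple "north plus west" update are all hypothetical choices made for illustration. Each cell depends on its northern and western neighbors, so cells on the same anti-diagonal are independent of one another and can be processed in parallel, one diagonal (wavefront) at a time. The OpenACC annotations are one plausible way to express that, not necessarily the ones used in this work.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical 2D wavefront sweep (illustration only, not Minisweep code).
 * grid is a row-major n x n array whose row 0 and column 0 already hold
 * boundary values. Every interior cell is set to the sum of its northern
 * and western neighbors.
 */
void wavefront_sweep(double *grid, int n)
{
    assert(grid != NULL && n >= 1);

    /* Keep the grid on the accelerator for the whole sweep. */
    #pragma acc data copy(grid[0:n*n])
    {
        /* d = i + j indexes the anti-diagonals; they must run in order. */
        for (int d = 2; d <= 2 * (n - 1); ++d) {
            /* Range of rows i whose column j = d - i is an interior cell. */
            int i_lo = (d - (n - 1) > 1) ? d - (n - 1) : 1;
            int i_hi = (d - 1 < n - 1) ? d - 1 : n - 1;

            /* Cells on one diagonal are independent, so this loop is parallel. */
            #pragma acc parallel loop present(grid[0:n*n])
            for (int i = i_lo; i <= i_hi; ++i) {
                int j = d - i;
                grid[i * n + j] = grid[(i - 1) * n + j] + grid[i * n + (j - 1)];
            }
        }
    }
}
```

A compiler without OpenACC support ignores the two pragmas and the same source runs serially, which is the portability argument made above. With an OpenACC-capable compiler (for example, GCC with -fopenacc), the inner loop can be offloaded. As a quick sanity check, filling a 4 x 4 grid with 1.0 and calling wavefront_sweep(g, 4) leaves 20.0 in the bottom-right corner.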
For more information, check out the UDaily article about this work:
Alternatively, check out the Oak Ridge Leadership Computing Facility (OLCF) article about this work: