Highly Efficient FFT for Exascale: HeFFTe v2.3
At a minimum, heFFTe requires a C++11-capable compiler, an implementation of the Message Passing Interface (MPI), and at least one backend FFT library. The heFFTe library can be built either with CMake 3.10 or newer, or with a simple GNU Make build engine. CMake is the recommended way to use heFFTe, since dependencies and options are much easier to export to user projects and not all options can be cleanly implemented in the rigid Makefile.
Tested compilers and MPI implementations:

| Compiler / MPI | Tested versions |
|----------------|-----------------|
| gcc            | 7 - 11          |
| clang          | 5 - 14          |
| icc            | 18              |
| dpcpp          | 2021.2          |
| OpenMPI        | 4.0.3           |
Tested backend libraries:
| Backend       | Tested versions |
|---------------|-----------------|
| stock         | all             |
| fftw3         | 3.3.7 - 3.3.10  |
| mkl           | 2016            |
| oneapi/onemkl | 2021.4          |
| cuda/cufft    | 10 - 11         |
| rocm/rocfft   | 4.0 - 5.6       |
The listed versions are the ones covered by the continuous integration and nightly build systems, but heFFTe may also work with other compilers and backend versions.
Typical CMake build follows the steps:
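The sequence below is a sketch of those steps; the repository URL and directory names are illustrative and should be adjusted to your setup:

```
git clone https://github.com/icl-utk-edu/heffte.git
mkdir heffte-build
cd heffte-build
cmake <cmake-options> ../heffte
make -j
make install
```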
Typical CMake build command:
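For example, a release build enabling the FFTW and cuFFT backends might look like the following sketch; the Heffte_ENABLE_* naming follows the options discussed below, the paths are placeholders, and the exact option list should be verified against your release:

```
cmake \
    -D CMAKE_BUILD_TYPE=Release \
    -D BUILD_SHARED_LIBS=ON \
    -D CMAKE_INSTALL_PREFIX=<path-for-installation> \
    -D Heffte_ENABLE_FFTW=ON \
    -D FFTW_ROOT=<path-to-fftw3-installation> \
    -D Heffte_ENABLE_CUDA=ON \
    -D CUDA_TOOLKIT_ROOT_DIR=<path-to-cuda-installation> \
    <path-to-heffte-source-code>
```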
The standard CMake options are also accepted:
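For instance:

```
-D CMAKE_CXX_COMPILER=<C++ compiler>
-D CMAKE_CXX_FLAGS=<additional compiler flags>
-D CMAKE_BUILD_TYPE=<Debug/Release>
-D CMAKE_INSTALL_PREFIX=<installation path>
-D BUILD_SHARED_LIBS=<ON/OFF>
```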
Additional heFFTe options:
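A representative subset is sketched below; the authoritative list for a given release is in the top-level CMakeLists.txt:

```
-D Heffte_ENABLE_FFTW=<ON/OFF>     # enable the fftw3 backend
-D Heffte_ENABLE_MKL=<ON/OFF>      # enable the mkl backend
-D Heffte_ENABLE_ONEAPI=<ON/OFF>   # enable the oneapi/onemkl backend
-D Heffte_ENABLE_CUDA=<ON/OFF>     # enable the cuda/cufft backend
-D Heffte_ENABLE_ROCM=<ON/OFF>     # enable the rocm/rocfft backend
-D Heffte_ENABLE_AVX512=<ON/OFF>   # vectorization flags, see below
-D Heffte_ENABLE_MAGMA=<ON/OFF>    # MAGMA helper methods
```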
Additional language interfaces and helper methods:
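For example (the option names here follow recent heFFTe releases; check your version):

```
-D Heffte_ENABLE_FORTRAN=<ON/OFF>   # Fortran interface
-D Heffte_ENABLE_PYTHON=<ON/OFF>    # Python interface
```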
See the Fortran and Python sections for details.
- `Heffte_ENABLE_AVX512` enables all the vectorization flags. Using vectorization is optional and not required for using the stock backend.
- `MKL_ROOT` defaults to the environment variable `MKLROOT` (chosen by Intel). The additional variable `Heffte_MKL_THREAD_LIBS` allows choosing the MKL threaded backend, tested with `mkl_gnu_thread` and `mkl_intel_thread`; the default is to use GNU threads with the GCC compiler and Intel threads otherwise. Note that `mkl_intel_thread` also requires `libiomp5.so`, and heFFTe will search for it in the system paths and `LD_LIBRARY_PATH`, unless the variable `Heffte_MKL_IOMP5` is defined and points to `libiomp5.so`. GNU threads do not use `libiomp5.so` but the GNU `libgomp.so`, which CMake finds automatically.
- `Heffte_ONEMKL_ROOT` defaults to the environment variable `ONEAPI_ROOT`. The Intel oneAPI framework provides support for both CPU and GPU devices from the same code base, thus the oneMKL libraries require the CPU MKL libraries, and the MKL options will be enabled automatically if oneMKL is selected.

Note: Only one of the GPU backends can be enabled (CUDA, ROCM, or ONEAPI), since the three backends operate with arrays allocated in GPU device memory (or, alternatively, shared/managed memory). By default, when using any GPU backend, heFFTe assumes that the MPI implementation is GPU-Aware; see the next section.
Different implementations of MPI can provide GPU-Aware capabilities, where data can be sent and received directly from GPU memory. OpenMPI provides CUDA-Aware capabilities if compiled with the corresponding options, e.g., see CUDA-Aware OpenMPI. Both CUDA and ROCm support such an API; however, the MPI implementation available to the user may lack the capability for various reasons, e.g., insufficient hardware support. HeFFTe can be compiled without GPU-Aware capabilities with the CMake option:
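The option name below follows the heFFTe releases around v2.3; verify it against your version, e.g., with cmake -LA:

```
-D Heffte_DISABLE_GPU_AWARE_MPI=ON
```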
Note: The GPU-Aware capabilities can also be disabled at runtime by setting the corresponding option in heffte::plan_options. On some platforms, the GPU-Aware MPI calls have significantly larger latency, and moving the buffers to the CPU before initiating the MPI calls can improve performance.
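As a minimal C++ sketch of the runtime switch, assuming a heFFTe 2.x plan_options with a use_gpu_aware member (verify the member name against your heffte.h):

```cpp
#include "heffte.h"

// Single-rank example: build a cuFFT plan that stages the MPI buffers
// through the CPU instead of relying on GPU-Aware MPI.
void make_plan(MPI_Comm comm){
    // indexes of a 16x16x16 grid held entirely by this rank
    heffte::box3d<> inbox({0, 0, 0}, {15, 15, 15});
    heffte::box3d<> outbox = inbox;

    heffte::plan_options options = heffte::default_options<heffte::backend::cufft>();
    options.use_gpu_aware = false; // move buffers to the CPU before the MPI calls

    heffte::fft3d<heffte::backend::cufft> fft(inbox, outbox, comm, options);
}
```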
HeFFTe installs a CMake package-config file under the installation prefix, which allows user projects to discover the library with find_package().
Typical project linking to HeFFTe will look like this:
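A sketch of a consuming CMakeLists.txt; the exported target Heffte::Heffte and the package name follow the heFFTe package-config, while the project and file names are placeholders:

```cmake
cmake_minimum_required(VERSION 3.10)
project(heffte_user VERSION 1.0 LANGUAGES CXX)

# point CMake at the heFFTe installation if it is not in a default location
find_package(Heffte 2.3 REQUIRED PATHS <install-prefix>)

add_executable(fft_app fft_app.cpp)
target_link_libraries(fft_app Heffte::Heffte)
```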
An example is installed in `<install-prefix>/share/heffte/examples/`.
The package-config also provides a set of components corresponding to the different compile options, specifically:
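For example, a project that must have the FFTW and CUDA capabilities could request them as components; the component names here are assumed to mirror the compile options, so check the installed Heffte package-config for the exact list:

```cmake
find_package(Heffte 2.3 REQUIRED COMPONENTS FFTW CUDA)
```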
HeFFTe supports a GNU Make build engine, where dependencies and compilers are set manually in the included Makefile. Selecting the backends is done with:
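For example, building with the fftw and cufft backends (the backends= syntax follows the heFFTe Makefile):

```
make backends=fftw,cufft
```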
The backends should be separated by commas and must have correctly selected compilers, includes, and libraries. Additional options are available through the Makefile's help target, shown below.
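For instance (assuming the help target shipped with the heFFTe Makefile):

```
make help
```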
See also the comments inside the Makefile.
Testing is invoked with:
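For example (the ctest target name follows the heFFTe Makefile; the comments inside the Makefile list the exact targets for your copy):

```
make ctest
```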
The library will be built in `./lib/`.
The GNU Make build engine does not support all options, e.g., MAGMA or disabling the GPU-Aware calls, but is provided for testing and debugging purposes.