Highly Efficient FFT for Exascale: HeFFTe v2.3
Fast Fourier Transform


 Backend common wrappers
 Backend cufft
 Backend fftw3
 Backend mkl
 Backend oneAPI
 Backend rocfft
 Backend stock
 Box-geometry operations
 Complex types
 Event tracing
 Helper Methods from MAGMA
 Helper wrappers around MPI methods
 Miscellaneous helpers
 Packing/Unpacking operations
 Plan transformation logic
 Reshape operations
 Sine and Cosine Transforms


class  heffte::fft3d< backend_tag, index >
 Defines the plan for a 3-dimensional discrete Fourier transform performed on MPI distributed data.
class  heffte::fft3d_r2c< backend_tag, index >
 Similar to heffte::fft3d, but computes fewer redundant coefficients when the input is real.
struct  heffte::box3d< index >
 A generic container that describes a 3d box of indexes.
struct  heffte::plan_options
 Defines a set of tweaks and options to use in the plan generation.
struct  heffte::is_ccomplex< scalar_type >
 Struct to specialize to allow HeFFTe to recognize custom single precision complex types.
struct  heffte::is_zcomplex< scalar_type >
 Struct to specialize to allow HeFFTe to recognize custom double precision complex types.


template<typename backend_tag , typename index = int>
using heffte::fft2d = fft3d< backend_tag, index >
 Alias of heffte::fft3d to be used for a two dimensional problem.
template<typename backend_tag , typename index = int>
using heffte::rtransform = fft3d< backend_tag, index >
 Alias of heffte::fft3d to be more expressive when using Sine and Cosine transforms.
template<typename backend_tag , typename index = int>
using heffte::fft2d_r2c = fft3d_r2c< backend_tag, index >
 Alias of heffte::fft3d_r2c to be used for a two dimensional problem.
template<typename index = int>
using heffte::box2d = box3d< index >
 Alias for expressive calls to heffte::fft2d and heffte::fft2d_r2c.


enum class  heffte::scale { heffte::none , heffte::full , heffte::symmetric }
 Indicates the scaling factor to apply on the result of an FFT operation.
enum class  heffte::reshape_algorithm { heffte::alltoallv = 0 , heffte::alltoall = 3 , heffte::p2p_plined = 1 , heffte::p2p = 2 }
 Defines list of potential communication algorithms.


template<typename backend_tag , typename index >
fft3d< backend_tag, index > heffte::make_fft3d (box3d< index > const inbox, box3d< index > const outbox, MPI_Comm const comm, plan_options const options=default_options< backend_tag >())
 Factory method that auto-detects the index type based on the box.
template<typename backend_tag , typename index >
fft3d_r2c< backend_tag, index > heffte::make_fft3d_r2c (box3d< index > const inbox, box3d< index > const outbox, int r2c_direction, MPI_Comm const comm, plan_options const options=default_options< backend_tag >())
 Factory method that auto-detects the index type based on the box.
std::ostream & heffte::operator<< (std::ostream &os, plan_options const options)
 Simple I/O for the plan options struct.
template<typename backend_tag , bool use_r2c = false>
plan_options heffte::set_options (plan_options opts)
 Adjusts the user provided options to what can be handled by the backend.
template<typename backend_tag >
plan_options heffte::default_options ()
 Returns the default backend options associated with the given backend.

Detailed Description

HeFFTe C++11 API
Encapsulates all classes and methods for the C++11 API, most notably:

Typedef Documentation

◆ fft2d

template<typename backend_tag , typename index = int>
using heffte::fft2d = fft3d<backend_tag, index>

Alias of heffte::fft3d to be used for a two dimensional problem.

The internal logic of heFFTe is capable of recognizing directions with only a single index and ignoring redundant communication. Thus, a two dimensional transform is just an alias for the three dimensional one with heffte::box2d as input (which is also an alias).
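To illustrate the idea of a direction with a single index, the following is a minimal sketch that does not use the heFFTe headers. The names box3d_sketch and make_box2d_sketch are hypothetical stand-ins introduced only for this example; they assume inclusive low/high indexes and pin the third direction to index 0.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for a 3d box of indexes; this is NOT the heFFTe type.
struct box3d_sketch {
    std::array<int, 3> low;   // lowest index in each direction (inclusive)
    std::array<int, 3> high;  // highest index in each direction (inclusive)

    // Number of indexes along one direction.
    int extent(int dim) const { return high[dim] - low[dim] + 1; }

    // Total number of indexes in the box.
    long long count() const {
        return static_cast<long long>(extent(0)) * extent(1) * extent(2);
    }
};

// Build a two dimensional box by pinning the third direction to index 0
// (an assumption made for this sketch).
inline box3d_sketch make_box2d_sketch(std::array<int, 2> low, std::array<int, 2> high) {
    assert(low[0] <= high[0] && low[1] <= high[1]);
    return box3d_sketch{{low[0], low[1], 0}, {high[0], high[1], 0}};
}
```

A box built this way has an extent of one in the third direction, which is the kind of direction a three dimensional plan can recognize and skip.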

◆ rtransform

template<typename backend_tag , typename index = int>
using heffte::rtransform = fft3d<backend_tag, index>

Alias of heffte::fft3d to be more expressive when using Sine and Cosine transforms.

In addition to the standard Discrete Fourier Transform, heFFTe also supports the discrete Sine and Cosine transforms. The input/output arrays/vectors follow the same logic as in the heffte::fft3d class; in fact, heffte::rtransform is just an alias to that template. The difference lies in the way the name of the backend is selected and in the accepted types.
The type-tags associated with the Sine and Cosine transforms are named by taking a regular FFT tag and appending either _sin or _cos, e.g.,
Backend   Sine Transform                 Cosine Transform
Stock     heffte::backend::stock_sin     heffte::backend::stock_cos
FFTW      heffte::backend::fftw_sin      heffte::backend::fftw_cos
MKL       heffte::backend::mkl_sin       heffte::backend::mkl_cos
oneMKL    heffte::backend::onemkl_sin    heffte::backend::onemkl_cos
cuFFT     heffte::backend::cufft_sin     heffte::backend::cufft_cos
rocFFT    heffte::backend::rocfft_sin    heffte::backend::rocfft_cos
The tags can be enabled for either the heffte::fft3d template or the heffte::rtransform alias.
The Sine and Cosine transforms operate with real types, float and double for the two supported precisions. Similarly, the size of the workspace vector is measured in the corresponding real units.
Memory Requirements
In the current implementation, the real transforms require more workspace memory than the standard transforms, which can be counter-intuitive but is the expected behavior.
Relationship to FFTW
FFTW is probably the most widely used library for FFT algorithms, including the Sine and Cosine transforms. The algorithms implemented in heFFTe correspond to:
heFFTe Transform     FFTW Transform Type
Sine - forward       FFTW_RODFT10
Sine - backward      FFTW_RODFT01
Cosine - forward     FFTW_REDFT10
Cosine - backward    FFTW_REDFT01
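For reference, the two cosine kinds in the table can be written out directly from the FFTW definitions. The following is a naive O(n^2) sketch in one dimension; the function names redft10_naive and redft01_naive are hypothetical and the code calls neither FFTW nor heFFTe. It only shows the unnormalized formulas, under which applying the backward kind after the forward kind returns the input multiplied by 2n.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Naive DCT-II in the FFTW_REDFT10 convention:
//   Y[k] = 2 * sum_{j=0}^{n-1} X[j] * cos(pi * (j + 1/2) * k / n)
inline std::vector<double> redft10_naive(std::vector<double> const &x) {
    const double pi = std::acos(-1.0);
    const std::size_t n = x.size();
    std::vector<double> y(n, 0.0);
    for (std::size_t k = 0; k < n; k++)
        for (std::size_t j = 0; j < n; j++)
            y[k] += 2.0 * x[j] * std::cos(pi * (j + 0.5) * k / n);
    return y;
}

// Naive DCT-III in the FFTW_REDFT01 convention:
//   Y[k] = X[0] + 2 * sum_{j=1}^{n-1} X[j] * cos(pi * j * (k + 1/2) / n)
inline std::vector<double> redft01_naive(std::vector<double> const &x) {
    assert(!x.empty());
    const double pi = std::acos(-1.0);
    const std::size_t n = x.size();
    std::vector<double> y(n, x[0]);
    for (std::size_t k = 0; k < n; k++)
        for (std::size_t j = 1; j < n; j++)
            y[k] += 2.0 * x[j] * std::cos(pi * j * (k + 0.5) / n);
    return y;
}
```

The sine kinds (FFTW_RODFT10 and FFTW_RODFT01) follow the same pattern with sine in place of cosine and shifted indexes; the FFTW manual lists the exact formulas.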

Enumeration Type Documentation

◆ scale

enum heffte::scale

Indicates the scaling factor to apply on the result of an FFT operation.

See the description of heffte::fft3d for details.

none
No scale, leave the result unperturbed, similar to the FFTW API.

full
Apply the full scale, divide by the number of elements in the world box.

symmetric
Symmetric scaling, apply the square-root of the full scaling.
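The three scaling modes can be summarized as a multiplicative factor. The following is a small sketch under the assumption that the factor depends only on the total number of elements in the world box; scale_sketch and scale_factor_sketch are hypothetical names introduced for this example and are not part of the heFFTe API.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical mirror of the three scaling modes; NOT the heFFTe enum.
enum class scale_sketch { none, full, symmetric };

// Factor by which each output entry would be multiplied, given the total
// number of elements in the world box.
inline double scale_factor_sketch(scale_sketch s, long long num_elements) {
    assert(num_elements > 0);
    const double n = static_cast<double>(num_elements);
    switch (s) {
        case scale_sketch::full:      return 1.0 / n;             // divide by N
        case scale_sketch::symmetric: return 1.0 / std::sqrt(n);  // sqrt of the full scaling
        default:                      return 1.0;                 // leave the result as-is
    }
}
```

With symmetric scaling applied on both the forward and the backward transform, the two factors multiply to the full scaling.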

◆ reshape_algorithm

Defines list of potential communication algorithms.

Depending on the size of the data and the number of MPI ranks used in the FFT transform, a problem can be classified as either bandwidth-bound or latency-bound. The bandwidth-bound case comes close to the maximum throughput of the MPI interconnect, while the latency-bound case is more affected by the latency of the large number of small communications. As a short-hand we can call these small-problems (latency-bound) or large-problems (bandwidth-bound), although the specific cutoff point depends on the backend (and the version of the backend), the version of MPI, the machine interconnect, and the specific optimizations that have been implemented in MPI.

There are plans to add an auto-tuning framework to heFFTe to help users select the best possible set of options; however, currently users have to manually find the best option for their hardware. The expected "best" algorithm is:

reshape_algorithm::alltoallv : for larger FFT, many MPI ranks
reshape_algorithm::alltoall : for smaller FFT, many MPI ranks
reshape_algorithm::p2p_plined : for larger FFT, fewer MPI ranks
reshape_algorithm::p2p : for smaller FFT, fewer MPI ranks

Note that in the GPU case, the above algorithms are also affected by the GPU latency if MPI calls are made directly from the GPU. This can be controlled with the use_gpu_aware variable of the heffte::plan_options.
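The expected "best" choices listed above can be written as a small lookup. This is only a sketch: the enum below is a hypothetical mirror of the values shown for heffte::reshape_algorithm, the helper expected_best_sketch is not part of heFFTe, and the decision of what counts as a "large" FFT or "many" ranks is left to the caller because the cutoff depends on the machine.

```cpp
#include <cassert>

// Hypothetical mirror of the enumerator values listed on this page;
// this is NOT the heFFTe enum.
enum class reshape_algorithm_sketch { alltoallv = 0, alltoall = 3, p2p_plined = 1, p2p = 2 };

// Encodes the four rules of thumb from the list above. The two flags must
// be decided by the caller, e.g., from benchmarks on the target hardware.
inline reshape_algorithm_sketch expected_best_sketch(bool large_fft, bool many_ranks) {
    if (many_ranks) {
        return large_fft ? reshape_algorithm_sketch::alltoallv
                         : reshape_algorithm_sketch::alltoall;
    }
    return large_fft ? reshape_algorithm_sketch::p2p_plined
                     : reshape_algorithm_sketch::p2p;
}
```

The result is a starting point for manual tuning rather than a guarantee; measuring all four algorithms on the actual problem remains the reliable approach.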


alltoallv
Using the MPI_Alltoallv options, no padding on the data (default option).

alltoall
Using the MPI_Alltoall options, with padding on the data.

p2p_plined
Using MPI_Isend and MPI_Irecv, all sending, receiving, packing, and unpacking are pipelined.

p2p
Using MPI_Send and MPI_Irecv, receive is pipelined with packing and sending.

Function Documentation

◆ set_options()

template<typename backend_tag , bool use_r2c = false>
plan_options heffte::set_options ( plan_options  opts)

Adjusts the user provided options to what can be handled by the backend.

Some backends do not support all available options, e.g., they require the use_reorder option to be set on. This template makes the necessary adjustments so that the correct answer is always computed even if the user provides unsupported options.
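The adjustment described above could look roughly like the following sketch. Everything here is hypothetical and written only to illustrate the pattern: plan_options_sketch holds a single flag, the two backend tags are invented, and requires_reorder_sketch is an assumed trait. The actual heffte::set_options may inspect different options and use different logic.

```cpp
#include <cassert>

// Hypothetical options struct with one flag; NOT heffte::plan_options.
struct plan_options_sketch {
    bool use_reorder;
};

// Invented backend tags used only in this example.
struct flexible_backend_sketch {};
struct reorder_only_backend_sketch {};

// Assumed trait: does the backend require reordered (contiguous) data?
template<typename backend_tag>
struct requires_reorder_sketch { static constexpr bool value = false; };
template<>
struct requires_reorder_sketch<reorder_only_backend_sketch> { static constexpr bool value = true; };

// Overrides user options that the backend cannot handle and leaves the
// rest untouched, so a valid plan can always be built.
template<typename backend_tag>
plan_options_sketch set_options_sketch(plan_options_sketch opts) {
    if (requires_reorder_sketch<backend_tag>::value) opts.use_reorder = true;
    return opts;
}
```

The design choice is to correct the options silently instead of rejecting them, which matches the stated goal that the correct answer is always computed.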