|
| fft3d_r2c (box3d< index > const inbox, box3d< index > const outbox, int r2c_direction, MPI_Comm const comm, plan_options const options=default_options< backend_tag >()) |
| Constructor creating a plan for FFT transform across the given communicator and using the box geometry.
|
|
| fft3d_r2c (typename backend::device_instance< location_tag >::stream_type gpu_stream, box3d< index > const inbox, box3d< index > const outbox, int r2c_direction, MPI_Comm const comm, plan_options const options=default_options< backend_tag >()) |
| See the documentation for fft3d::fft3d()
|
|
| fft3d_r2c (int il0, int il1, int il2, int ih0, int ih1, int ih2, int io0, int io1, int io2, int ol0, int ol1, int ol2, int oh0, int oh1, int oh2, int oo0, int oo1, int oo2, int r2c_direction, MPI_Comm const comm, bool use_reorder, int algorithm, bool use_pencils) |
| Internal use only, used by the Fortran interface.
|
|
| fft3d_r2c (int il0, int il1, int il2, int ih0, int ih1, int ih2, int io0, int io1, int io2, int ol0, int ol1, int ol2, int oh0, int oh1, int oh2, int oo0, int oo1, int oo2, int r2c_direction, MPI_Comm const comm) |
| Internal use only, used by the Fortran interface.
|
|
| fft3d_r2c (int il0, int il1, int il2, int ih0, int ih1, int ih2, int ol0, int ol1, int ol2, int oh0, int oh1, int oh2, int r2c_direction, MPI_Comm const comm) |
| Internal use only, used by the Fortran interface.
|
|
long long | size_inbox () const |
| Returns the size of the inbox defined in the constructor.
|
|
long long | size_outbox () const |
| Returns the size of the outbox defined in the constructor.
|
|
box3d< index > | inbox () const |
| Returns the inbox.
|
|
box3d< index > | outbox () const |
| Returns the outbox.
|
|
size_t | size_workspace () const |
| Returns the size of the workspace that will be used, measured in complex numbers.
|
|
size_t | size_comm_buffers () const |
| Returns the size used by the communication workspace buffers (internal use).
|
|
template<typename input_type , typename output_type > |
void | forward (input_type const input[], output_type output[], scale scaling=scale::none) const |
| Performs a forward Fourier transform using two arrays.
|
|
template<typename input_type , typename output_type > |
void | forward (input_type const input[], output_type output[], output_type workspace[], scale scaling=scale::none) const |
| Overload utilizing a user provided buffer.
|
|
template<typename input_type , typename output_type > |
void | forward (int batch_size, input_type const input[], output_type output[], output_type workspace[], scale scaling=scale::none) const |
| Overload utilizing a batch transform.
|
|
template<typename input_type , typename output_type > |
void | forward (int batch_size, input_type const input[], output_type output[], scale scaling=scale::none) const |
| Overload utilizing a batch transform using internally allocated workspace.
|
|
template<typename input_type > |
output_buffer_container< input_type > | forward (buffer_container< input_type > const &input, scale scaling=scale::none) |
| Vector variant of forward() using input and output buffer_container classes.
|
|
template<typename input_type , typename output_type > |
void | backward (input_type const input[], output_type output[], scale scaling=scale::none) const |
| Performs a backward Fourier transform using two arrays.
|
|
template<typename input_type , typename output_type > |
void | backward (input_type const input[], output_type output[], input_type workspace[], scale scaling=scale::none) const |
| Overload utilizing a user provided buffer.
|
|
template<typename input_type , typename output_type > |
void | backward (int batch_size, input_type const input[], output_type output[], input_type workspace[], scale scaling=scale::none) const |
| Overload that performs a batch transform.
|
|
template<typename input_type , typename output_type > |
void | backward (int batch_size, input_type const input[], output_type output[], scale scaling=scale::none) const |
| Overload that performs a batch transform using internally allocated workspace.
|
|
template<typename scalar_type > |
real_buffer_container< scalar_type > | backward (buffer_container< scalar_type > const &input, scale scaling=scale::none) |
| Variant of backward() that uses buffer_container for RAII style of resource management.
|
|
double | get_scale_factor (scale scaling) const |
| Returns the scale factor for the given scaling.
|
|
| device_instance (void *=nullptr) |
| Empty constructor.
|
|
virtual | ~device_instance ()=default |
| Default destructor.
|
|
void * | stream () |
| Returns nullptr.
|
|
void * | stream () const |
| Returns nullptr (const case).
|
|
void | synchronize_device () const |
| Syncs the execution with the queue, no-op in the CPU case.
|
|
template<typename backend_tag, typename index = int>
class heffte::fft3d_r2c< backend_tag, index >
Similar to heffte::fft3d, but computes fewer redundant coefficients when the input is real.
- Overview
- Given real input data, there is no unambiguous way to distinguish between the positive and negative direction in the complex plane; therefore, by an argument of symmetry, all complex output must come in conjugate pairs. The heffte::fft3d class computes both numbers of each conjugate pair, whereas this class computes fewer redundant coefficients, reducing both flops and data movement. This is achieved by selecting one of the three dimensions and shortening the data in that dimension so that it contains only the unique (non-conjugate) coefficients.
- Boxes and Data Distribution
- Similar to heffte::fft3d, the data is organized in boxes using the heffte::box3d structs; however, in the real-to-complex case the global input and output domains do not match. If the original data sits in the box {{0, 0, 0}, {x, y, z}}, then depending on the dimension chosen for the shortening, the output data will form one of the boxes:
{{0, 0, 0}, {x/2 + 1, y, z}}
{{0, 0, 0}, {x, y/2 + 1, z}}
{{0, 0, 0}, {x, y, z/2 + 1}}
Thus, the union of the inboxes across all MPI ranks must add up to the global input box, and the union of the outboxes must add up to the shortened global box.
- Compatible Types
- The real-to-complex variant does not support complex input; the supported types are the ones with real input in the table of compatible types.