Highly Efficient FFT for Exascale: HeFFTe v2.3
Reshape algorithm based on the MPI_Alltoallv() method.
#include <heffte_reshape3d.h>
Public Member Functions | |
~reshape3d_alltoallv () | |
Destructor, frees the comm generated by the constructor. | |
void | apply (int batch_size, float const source[], float destination[], float workspace[]) const override final |
Apply the reshape operations, single precision overload. | |
void | apply (int batch_size, double const source[], double destination[], double workspace[]) const override final |
Apply the reshape operations, double precision overload. | |
void | apply (int batch_size, std::complex< float > const source[], std::complex< float > destination[], std::complex< float > workspace[]) const override final |
Apply the reshape operations, single precision complex overload. | |
void | apply (int batch_size, std::complex< double > const source[], std::complex< double > destination[], std::complex< double > workspace[]) const override final |
Apply the reshape operations, double precision complex overload. | |
template<typename scalar_type > | |
void | apply_base (int batch_size, scalar_type const source[], scalar_type destination[], scalar_type workspace[]) const |
Templated reshape3d_alltoallv::apply() algorithm for all scalar types. | |
Public Member Functions inherited from heffte::reshape3d_base< index > | |
reshape3d_base (index cinput_size, index coutput_size) | |
Constructor that sets the input and output sizes. | |
virtual | ~reshape3d_base ()=default |
Default virtual destructor. | |
index | size_intput () const |
Returns the input size. | |
index | size_output () const |
Returns the output size. | |
virtual size_t | size_workspace () const |
Returns the workspace size. | |
Public Member Functions inherited from heffte::backend::device_instance< location_tag > | |
device_instance (void *=nullptr) | |
Empty constructor. | |
virtual | ~device_instance ()=default |
Default destructor. | |
void * | stream () |
Returns nullptr, since the CPU backend has no stream. | |
void * | stream () const |
Returns nullptr (const overload). | |
void | synchronize_device () const |
Syncs the execution with the queue, no-op in the CPU case. | |
Friends | |
template<typename b , template< typename d > class p, typename i > | |
std::unique_ptr< reshape3d_alltoallv< b, p, i > > | make_reshape3d_alltoallv (typename backend::device_instance< b >::stream_type, std::vector< box3d< i >> const &, std::vector< box3d< i >> const &, bool, MPI_Comm const) |
Factory method; use it to construct instances of the class. | |
Additional Inherited Members | |
Public Types inherited from heffte::backend::device_instance< location_tag > | |
using | stream_type = void * |
The type for the internal stream, the cpu uses just a void pointer. | |
Protected Member Functions inherited from heffte::reshape3d_base< index > | |
template<typename scalar_type > | |
scalar_type * | cpu_send_buffer (size_t num_entries) const |
Allocates and returns a CPU send buffer, used when GPU-aware communication has been disabled. | |
template<typename scalar_type > | |
scalar_type * | cpu_recv_buffer (size_t num_entries) const |
Allocates and returns a CPU receive buffer, used when GPU-aware communication has been disabled. | |
Protected Attributes inherited from heffte::reshape3d_base< index > | |
index const | input_size |
Stores the size of the input. | |
index const | output_size |
Stores the size of the output. | |
std::vector< float > | send_unaware |
Temporary send buffer for the GPU-unaware algorithms. | |
std::vector< float > | recv_unaware |
Temporary receive buffer for the GPU-unaware algorithms. | |
Reshape algorithm based on the MPI_Alltoallv() method.
The communication plan for the reshape requires complex initialization, which is performed outside the class in a factory method. An instance of the class can be created only through the factory method heffte::make_reshape3d_alltoallv(). Because the constructor receives a fully computed plan, the class members can be declared const, and the temporary data used during planning is not retained, which reduces the memory footprint.
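The factory-plus-private-constructor arrangement described above can be illustrated with a small stand-alone sketch. Everything in it is hypothetical: `reshape_plan` and `make_reshape_plan` are illustration names, not HeFFTe classes or functions, and the "plan" is reduced to per-rank send counts and their offsets. The real factory also takes boxes, a stream and an MPI communicator, none of which are modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reshape plan; NOT the HeFFTe class.
class reshape_plan {
public:
    int total_send() const { return send_total_; }
    std::vector<int> const &send_offsets() const { return send_offset_; }

private:
    // Private constructor: only the friend factory below can create instances.
    reshape_plan(std::vector<int> send_size, std::vector<int> send_offset, int send_total)
        : send_size_(std::move(send_size)),
          send_offset_(std::move(send_offset)),
          send_total_(send_total) {}

    friend std::unique_ptr<reshape_plan> make_reshape_plan(std::vector<int> const &);

    // The plan is final once constructed, so every member can be const.
    std::vector<int> const send_size_;
    std::vector<int> const send_offset_;
    int const send_total_;
};

// Hypothetical factory: does the (here trivial) planning work, then hands the
// finished arrays to the private constructor. Planning temporaries die here.
std::unique_ptr<reshape_plan> make_reshape_plan(std::vector<int> const &counts) {
    std::vector<int> offsets(counts.size(), 0);
    int total = 0;
    for (std::size_t i = 0; i < counts.size(); i++) {
        assert(counts[i] >= 0);  // a count is a number of entries
        offsets[i] = total;      // exclusive prefix sum of the counts
        total += counts[i];
    }
    // std::make_unique cannot reach a private constructor, so use new directly.
    return std::unique_ptr<reshape_plan>(
        new reshape_plan(counts, std::move(offsets), total));
}
```

With counts `{3, 0, 5}` the sketch stores offsets `{0, 3, 3}` and a total of 8. The counts-and-offsets layout is the same one MPI_Alltoallv() expects for its send and receive arguments, which is why it was chosen for the illustration.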
location_tag | The location of the input/output buffers, either tag::cpu or tag::gpu |
packer | The packer algorithm used to arrange the sub-boxes into the global send/recv buffer; works with either heffte::direct_packer or heffte::transpose_packer |
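As a rough illustration of what "arranging the sub-boxes into the global send/recv buffer" can mean, the sketch below copies each sub-box of a local 3D array into one contiguous buffer, back to back, without reordering the entries. The `box` struct and the `pack_direct` and `pack_all` functions are hypothetical names introduced for this example; they are not heffte::box3d or heffte::direct_packer. The sketch also assumes inclusive index bounds and that dimension 0 is the fastest-varying one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical inclusive index box; NOT heffte::box3d.
struct box {
    int low[3];
    int high[3];
    int size(int d) const { return high[d] - low[d] + 1; }
    int count() const { return size(0) * size(1) * size(2); }
};

// Copies the entries of `sub` out of an array covering `local`
// (dimension 0 fastest) into `buffer`, keeping the same ordering.
template <typename scalar_type>
void pack_direct(box const &local, box const &sub,
                 scalar_type const data[], scalar_type buffer[]) {
    for (int d = 0; d < 3; d++)  // the sub-box must lie inside the local box
        assert(sub.low[d] >= local.low[d] && sub.high[d] <= local.high[d]);
    std::size_t out = 0;
    for (int k = sub.low[2]; k <= sub.high[2]; k++)
        for (int j = sub.low[1]; j <= sub.high[1]; j++)
            for (int i = sub.low[0]; i <= sub.high[0]; i++) {
                std::size_t idx =
                    (static_cast<std::size_t>(k - local.low[2]) * local.size(1)
                     + (j - local.low[1])) * local.size(0)
                    + (i - local.low[0]);
                buffer[out++] = data[idx];
            }
}

// Packs all sub-boxes back to back into one buffer and returns the offset
// of each packed sub-box, as would be needed for per-rank displacements.
template <typename scalar_type>
std::vector<int> pack_all(box const &local, std::vector<box> const &subs,
                          scalar_type const data[],
                          std::vector<scalar_type> &buffer) {
    std::vector<int> offsets;
    int total = 0;
    for (auto const &s : subs) {
        offsets.push_back(total);
        total += s.count();
    }
    buffer.resize(total);
    for (std::size_t p = 0; p < subs.size(); p++)
        pack_direct(local, subs[p], data, buffer.data() + offsets[p]);
    return offsets;
}
```

For a 4x2x1 local box holding the values 0 through 7, split into a left and a right 2x2x1 sub-box, the packed buffer is 0, 1, 4, 5, 2, 3, 6, 7 with offsets 0 and 4. A transposing packer would differ by changing the order of the entries during the copy; that variant is not shown here.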