Highly Efficient FFT for Exascale: HeFFTe v2.3
heffte::reshape3d_alltoallv< location_tag, packer, index > Class Template Reference

Reshape algorithm based on the MPI_Alltoallv() method.

#include <heffte_reshape3d.h>

Public Member Functions

 ~reshape3d_alltoallv ()
 Destructor, frees the comm generated by the constructor.
 
void apply (int batch_size, float const source[], float destination[], float workspace[]) const override final
 Apply the reshape operations, single precision overload.
 
void apply (int batch_size, double const source[], double destination[], double workspace[]) const override final
 Apply the reshape operations, double precision overload.
 
void apply (int batch_size, std::complex< float > const source[], std::complex< float > destination[], std::complex< float > workspace[]) const override final
 Apply the reshape operations, single precision complex overload.
 
void apply (int batch_size, std::complex< double > const source[], std::complex< double > destination[], std::complex< double > workspace[]) const override final
 Apply the reshape operations, double precision complex overload.
 
template<typename scalar_type >
void apply_base (int batch_size, scalar_type const source[], scalar_type destination[], scalar_type workspace[]) const
 Templated reshape3d_alltoallv::apply() algorithm for all scalar types.
 
- Public Member Functions inherited from heffte::reshape3d_base< index >
 reshape3d_base (index cinput_size, index coutput_size)
 Constructor that sets the input and output sizes.
 
virtual ~reshape3d_base ()=default
 Default virtual destructor.
 
index size_intput () const
 Returns the input size.
 
index size_output () const
 Returns the output size.
 
virtual size_t size_workspace () const
 Returns the workspace size.
 
- Public Member Functions inherited from heffte::backend::device_instance< location_tag >
 device_instance (void *=nullptr)
 Empty constructor.
 
virtual ~device_instance ()=default
 Default destructor.
 
void * stream ()
 Returns the nullptr.
 
void * stream () const
 Returns the nullptr (const case).
 
void synchronize_device () const
 Syncs the execution with the queue, no-op in the CPU case.
 

Friends

template<typename b , template< typename d > class p, typename i >
std::unique_ptr< reshape3d_alltoallv< b, p, i > > make_reshape3d_alltoallv (typename backend::device_instance< b >::stream_type, std::vector< box3d< i >> const &, std::vector< box3d< i >> const &, bool, MPI_Comm const)
 Factory method, use to construct instances of the class.
 

Additional Inherited Members

- Public Types inherited from heffte::backend::device_instance< location_tag >
using stream_type = void *
 The type for the internal stream, the cpu uses just a void pointer.
 
- Protected Member Functions inherited from heffte::reshape3d_base< index >
template<typename scalar_type >
scalar_type * cpu_send_buffer (size_t num_entries) const
 Allocates and returns a CPU buffer when GPU-Aware communication has been disabled.
 
template<typename scalar_type >
scalar_type * cpu_recv_buffer (size_t num_entries) const
 Allocates and returns a CPU buffer when GPU-Aware communication has been disabled.
 
- Protected Attributes inherited from heffte::reshape3d_base< index >
index const input_size
 Stores the size of the input.
 
index const output_size
 Stores the size of the output.
 
std::vector< float > send_unaware
 Temp buffers for the gpu-unaware algorithms.
 
std::vector< float > recv_unaware
 Temp buffers for the gpu-unaware algorithms.
 

Detailed Description

template<typename location_tag, template< typename device > class packer, typename index>
class heffte::reshape3d_alltoallv< location_tag, packer, index >

Reshape algorithm based on the MPI_Alltoallv() method.

The communication plan for the reshape requires complex initialization, which is placed outside of the class in a factory method. An instance of the class can be created only through the factory method heffte::make_reshape3d_alltoallv(), which allows for stronger const correctness and reduces the memory footprint.
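The pattern described above can be sketched in a few lines. This is a minimal illustration, not the heFFTe implementation: box_sketch, reshape_plan_sketch and make_reshape_plan_sketch are hypothetical names, the boxes use inclusive bounds by assumption, and no MPI call is made. The factory computes the per-rank send counts and offsets (the arrays that MPI_Alltoallv() expects) from the overlap of the local input box with each output box, and only then constructs the object, so every member can be const.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a 3D index box with inclusive bounds (not heffte::box3d).
struct box_sketch {
    int low[3], high[3];
    // Number of indexes in the box, zero when the box is empty in any dimension.
    int count() const {
        int n = 1;
        for (int d = 0; d < 3; d++) n *= std::max(0, high[d] - low[d] + 1);
        return n;
    }
    // Overlap of this box with another one, possibly empty.
    box_sketch collide(box_sketch const &other) const {
        box_sketch r;
        for (int d = 0; d < 3; d++) {
            r.low[d]  = std::max(low[d], other.low[d]);
            r.high[d] = std::min(high[d], other.high[d]);
        }
        return r;
    }
};

class reshape_plan_sketch {
public:
    // Per-rank counts and offsets, fixed for the lifetime of the object.
    std::vector<int> const send_size, send_offset;
private:
    // Private constructor: only the friend factory can create instances.
    reshape_plan_sketch(std::vector<int> sizes, std::vector<int> offsets)
        : send_size(std::move(sizes)), send_offset(std::move(offsets)) {}
    friend std::unique_ptr<reshape_plan_sketch>
    make_reshape_plan_sketch(int, std::vector<box_sketch> const &, std::vector<box_sketch> const &);
};

// Factory: performs the planning first, then builds the immutable object.
std::unique_ptr<reshape_plan_sketch>
make_reshape_plan_sketch(int me, std::vector<box_sketch> const &inboxes,
                         std::vector<box_sketch> const &outboxes) {
    assert(me >= 0 && me < static_cast<int>(inboxes.size()));
    std::vector<int> sizes(outboxes.size()), offsets(outboxes.size(), 0);
    // Rank "me" sends to rank r the part of its input box that r owns in the output.
    for (std::size_t r = 0; r < outboxes.size(); r++)
        sizes[r] = inboxes[me].collide(outboxes[r]).count();
    // Offsets are the running sum of the counts.
    for (std::size_t r = 1; r < outboxes.size(); r++)
        offsets[r] = offsets[r - 1] + sizes[r - 1];
    // std::make_unique cannot reach the private constructor, so use new directly.
    return std::unique_ptr<reshape_plan_sketch>(
        new reshape_plan_sketch(std::move(sizes), std::move(offsets)));
}
```

Because the temporary data used during planning lives only inside the factory, the resulting object stores just the final counts and offsets, which is one way to read the "reduced memory footprint" remark.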

Template Parameters
location_tag    the location of the input/output buffers, tag::cpu or tag::gpu
packer          the packer algorithm used to arrange the sub-boxes into the global send/recv buffer; works with either heffte::direct_packer or heffte::transpose_packer
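
To illustrate what a packer does, the sketch below copies a sub-box out of a larger local array into a contiguous send buffer, without reordering the entries. It is written in the spirit of a direct (non-transposing) packer and is not the heffte::direct_packer code: pack_plan_sketch and pack_sketch are hypothetical names, and the layout (first dimension contiguous in memory) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a sub-box inside a larger array (names are assumptions).
struct pack_plan_sketch {
    int size[3];       // extent of the sub-box: fast, middle and slow dimension
    int line_stride;   // distance in the source between consecutive middle-dimension lines
    int plane_stride;  // distance in the source between consecutive slow-dimension planes
};

// Copies the sub-box that starts at source[0] into a contiguous buffer.
// The caller is assumed to offset the source pointer to the first entry of the sub-box.
template<typename scalar_type>
void pack_sketch(pack_plan_sketch const &plan, scalar_type const source[], scalar_type buffer[]) {
    // The strides must be at least as large as the sub-box they step over.
    assert(plan.line_stride >= plan.size[0]);
    assert(plan.plane_stride >= plan.line_stride * plan.size[1]);
    std::size_t out = 0;
    for (int k = 0; k < plan.size[2]; k++)
        for (int j = 0; j < plan.size[1]; j++)
            for (int i = 0; i < plan.size[0]; i++)
                buffer[out++] = source[k * plan.plane_stride + j * plan.line_stride + i];
}
```

For a 4 x 3 x 2 array, the 2 x 2 x 2 sub-box starting at index (1, 1, 0) would use line_stride 4, plane_stride 12 and a source pointer offset by 5 entries. A transposing packer would differ only in the order in which the entries are written to the buffer.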

The documentation for this class was generated from the following file: