Highly Efficient FFT for Exascale: HeFFTe v2.3
Reshape operations

Classes

class  heffte::reshape3d_base< index >
 Base reshape interface. More...
 
class  heffte::reshape3d_alltoall< location_tag, packer, index >
 Reshape algorithm based on the MPI_Alltoall() method. More...
 
class  heffte::reshape3d_alltoallv< location_tag, packer, index >
 Reshape algorithm based on the MPI_Alltoallv() method. More...
 
class  heffte::reshape3d_pointtopoint< location_tag, packer, index >
 Reshape algorithm based on the MPI_Send() and MPI_Irecv() methods. More...
 
class  heffte::reshape3d_transpose< location_tag, index >
 Special case of the reshape that does not involve MPI communication but applies a transpose instead. More...
 

Functions

template<typename index >
void heffte::compute_overlap_map_transpose_pack (int me, int nprocs, box3d< index > const destination, std::vector< box3d< index >> const &boxes, std::vector< int > &proc, std::vector< int > &offset, std::vector< int > &sizes, std::vector< pack_plan_3d< index >> &plans)
 Generates an unpack plan where the boxes and the destination do not have the same order. More...
 
template<typename index >
size_t heffte::get_workspace_size (std::array< std::unique_ptr< reshape3d_base< index >>, 4 > const &shapers)
 Returns the maximum workspace size used by the shapers.
 
template<typename location_tag , template< typename device > class packer = direct_packer, typename index >
std::unique_ptr< reshape3d_alltoall< location_tag, packer, index > > heffte::make_reshape3d_alltoall (typename backend::device_instance< location_tag >::stream_type q, std::vector< box3d< index >> const &input_boxes, std::vector< box3d< index >> const &output_boxes, bool uses_gpu_aware, MPI_Comm const comm)
 Factory method that does all the necessary work to establish the communication patterns. More...
 
template<typename location_tag , template< typename device > class packer = direct_packer, typename index >
std::unique_ptr< reshape3d_alltoallv< location_tag, packer, index > > heffte::make_reshape3d_alltoallv (typename backend::device_instance< location_tag >::stream_type q, std::vector< box3d< index >> const &input_boxes, std::vector< box3d< index >> const &output_boxes, bool use_gpu_aware, MPI_Comm const comm)
 Factory method that does all the necessary work to establish the communication patterns. More...
 
template<typename location_tag , template< typename device > class packer = direct_packer, typename index >
std::unique_ptr< reshape3d_pointtopoint< location_tag, packer, index > > heffte::make_reshape3d_pointtopoint (typename backend::device_instance< location_tag >::stream_type q, std::vector< box3d< index >> const &input_boxes, std::vector< box3d< index >> const &output_boxes, reshape_algorithm algorithm, bool use_gpu_aware, MPI_Comm const comm)
 Factory method that does all the necessary work to establish the communication patterns. More...
 
template<typename backend_tag , typename index >
std::unique_ptr< reshape3d_base< index > > heffte::make_reshape3d (typename backend::device_instance< typename backend::buffer_traits< backend_tag >::location >::stream_type stream, std::vector< box3d< index >> const &input_boxes, std::vector< box3d< index >> const &output_boxes, MPI_Comm const comm, plan_options const options)
 Factory method to create a reshape3d instance. More...
 

Detailed Description

A reshape operation is one that modifies the distribution of the indexes across an MPI communicator. In a special case, the reshape can correspond to a simple in-node data transpose (i.e., no communication).

The reshape operations inherit from a common heffte::reshape3d_base class that defines the apply method for different data-types and the sizes of the input, output, and scratch workspace. Reshape objects are usually wrapped in std::unique_ptr containers, which handle the polymorphic calls at runtime and also indicate the special case of no-reshape when the container is empty.
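
The ownership pattern described above can be illustrated with a minimal, self-contained sketch. The mock_reshape and fixed_reshape classes and the max_workspace helper are hypothetical stand-ins, not the HeFFTe types; they only mirror the idea that an empty std::unique_ptr means "no reshape" and that the scratch workspace needed by a sequence of reshapes is the largest requirement among the non-empty ones.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a polymorphic reshape interface (not the HeFFTe class).
struct mock_reshape {
    virtual ~mock_reshape() = default;
    // Scratch workspace (number of entries) needed by this reshape.
    virtual std::size_t size_workspace() const = 0;
};

// Hypothetical concrete reshape with a fixed workspace requirement.
struct fixed_reshape : mock_reshape {
    explicit fixed_reshape(std::size_t n) : workspace(n) {}
    std::size_t size_workspace() const override { return workspace; }
    std::size_t workspace;
};

// Largest workspace over all shapers; an empty pointer means "no reshape" and is skipped.
inline std::size_t max_workspace(std::array<std::unique_ptr<mock_reshape>, 4> const &shapers) {
    std::size_t result = 0;
    for (auto const &s : shapers)
        if (s) result = std::max(result, s->size_workspace());
    return result;
}
```

Testing the pointer before each call is what lets the empty container double as the "no reshape" marker without a separate flag.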

Function Documentation

◆ compute_overlap_map_transpose_pack()

template<typename index >
void heffte::compute_overlap_map_transpose_pack ( int  me,
int  nprocs,
box3d< index > const  destination,
std::vector< box3d< index >> const &  boxes,
std::vector< int > &  proc,
std::vector< int > &  offset,
std::vector< int > &  sizes,
std::vector< pack_plan_3d< index >> &  plans 
)

Generates an unpack plan where the boxes and the destination do not have the same order.

This method does not make any MPI calls. It uses the set of boxes that define the current distribution of the indexes to compute the overlap and the proc, offset, and sizes vectors for the receive stage of an all-to-all-v communication pattern. In addition, a set of unpack plans is created where the order of the boxes and the destination are different, which will transpose the data. The plans have to be used in conjunction with the transpose packer.
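
The overlap bookkeeping described above can be sketched without MPI. The box, overlap_map, intersect, and compute_receive_map names below are hypothetical and assume boxes with inclusive index bounds; the sketch covers only the proc, offset, and sizes vectors and leaves out the unpack plans and the transposition.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Hypothetical box with inclusive index bounds (an assumption, not the HeFFTe box3d).
struct box {
    std::array<int, 3> low, high;
    bool empty() const {
        return high[0] < low[0] || high[1] < low[1] || high[2] < low[2];
    }
    int count() const {
        return empty() ? 0
                       : (high[0] - low[0] + 1) * (high[1] - low[1] + 1) * (high[2] - low[2] + 1);
    }
};

// Intersection of two boxes; the result is empty when they do not overlap.
inline box intersect(box const &a, box const &b) {
    box r;
    for (int d = 0; d < 3; d++) {
        r.low[d]  = std::max(a.low[d], b.low[d]);
        r.high[d] = std::min(a.high[d], b.high[d]);
    }
    return r;
}

// Receive-side map: which ranks overlap the destination, how many entries each
// contributes, and where each contribution starts in a contiguous receive buffer.
struct overlap_map {
    std::vector<int> proc, offset, sizes;
};

inline overlap_map compute_receive_map(box const &destination, std::vector<box> const &boxes) {
    overlap_map map;
    int running = 0; // running offset into the receive buffer
    for (int rank = 0; rank < static_cast<int>(boxes.size()); rank++) {
        box const overlap = intersect(destination, boxes[rank]);
        if (overlap.empty()) continue; // nothing to receive from this rank
        map.proc.push_back(rank);
        map.offset.push_back(running);
        map.sizes.push_back(overlap.count());
        running += overlap.count();
    }
    return map;
}
```

Because every rank holds the full list of boxes, each rank can compute its own map locally, which is consistent with the statement that no MPI calls are needed at this stage.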

◆ make_reshape3d_alltoall()

template<typename location_tag , template< typename device > class packer = direct_packer, typename index >
std::unique_ptr<reshape3d_alltoall<location_tag, packer, index> > heffte::make_reshape3d_alltoall ( typename backend::device_instance< location_tag >::stream_type  q,
std::vector< box3d< index >> const &  input_boxes,
std::vector< box3d< index >> const &  output_boxes,
bool  uses_gpu_aware,
MPI_Comm const  comm 
)

Factory method that does all the necessary work to establish the communication patterns.

The purpose of the factory method is to isolate the initialization code and ensure that the internal state of the class is minimal and const-correct, i.e., objects do not hold onto data that will not be used in a reshape apply and the data is labeled const to prevent accidental corruption.

Template Parameters
location_tag    the location for the input/output buffers for the reshape operation, tag::cpu or tag::gpu
packer          the packer used to pack parts of boxes into the global send/recv buffer
Parameters
q               device stream
input_boxes     list of all input boxes across all ranks in the comm
output_boxes    list of all output boxes across all ranks in the comm
uses_gpu_aware  use MPI calls directly from the GPU (GPU backends only)
comm            the communicator associated with all the boxes
Returns
unique_ptr containing an instance of the heffte::reshape3d_alltoall

Note: the input and output boxes associated with this rank are located at position mpi::comm_rank() in the respective lists.

◆ make_reshape3d_alltoallv()

template<typename location_tag , template< typename device > class packer = direct_packer, typename index >
std::unique_ptr<reshape3d_alltoallv<location_tag, packer, index> > heffte::make_reshape3d_alltoallv ( typename backend::device_instance< location_tag >::stream_type  q,
std::vector< box3d< index >> const &  input_boxes,
std::vector< box3d< index >> const &  output_boxes,
bool  use_gpu_aware,
MPI_Comm const  comm 
)

Factory method that does all the necessary work to establish the communication patterns.

The purpose of the factory method is to isolate the initialization code and ensure that the internal state of the class is minimal and const-correct, i.e., objects do not hold onto data that will not be used in a reshape apply and the data is labeled const to prevent accidental corruption.

Template Parameters
location_tag   the location of the input/output buffers, tag::cpu or tag::gpu
packer         the packer used to pack parts of boxes into the global send/recv buffer
Parameters
q              device stream
input_boxes    list of all input boxes across all ranks in the comm
output_boxes   list of all output boxes across all ranks in the comm
use_gpu_aware  use MPI calls directly from the GPU (GPU backends only)
comm           the communicator associated with all the boxes
Returns
unique_ptr containing an instance of the heffte::reshape3d_alltoallv

Note: the input and output boxes associated with this rank are located at position mpi::comm_rank() in the respective lists.

◆ make_reshape3d_pointtopoint()

template<typename location_tag , template< typename device > class packer = direct_packer, typename index >
std::unique_ptr<reshape3d_pointtopoint<location_tag, packer, index> > heffte::make_reshape3d_pointtopoint ( typename backend::device_instance< location_tag >::stream_type  q,
std::vector< box3d< index >> const &  input_boxes,
std::vector< box3d< index >> const &  output_boxes,
reshape_algorithm  algorithm,
bool  use_gpu_aware,
MPI_Comm const  comm 
)

Factory method that does all the necessary work to establish the communication patterns.

The purpose of the factory method is to isolate the initialization code and ensure that the internal state of the class is minimal and const-correct, i.e., objects do not hold onto data that will not be used in a reshape apply and the data is labeled const to prevent accidental corruption.

Template Parameters
location_tag   the tag for the input/output buffers, tag::cpu or tag::gpu
packer         the packer used to pack parts of boxes into the global send/recv buffer
Parameters
q              device stream
input_boxes    list of all input boxes across all ranks in the comm
output_boxes   list of all output boxes across all ranks in the comm
algorithm      must be either reshape_algorithm::p2p or reshape_algorithm::p2p_plined
use_gpu_aware  use MPI calls directly from the GPU (GPU backends only)
comm           the communicator associated with all the boxes
Returns
unique_ptr containing an instance of the heffte::reshape3d_pointtopoint

Note: the input and output boxes associated with this rank are located at position mpi::comm_rank() in the respective lists.

◆ make_reshape3d()

template<typename backend_tag , typename index >
std::unique_ptr<reshape3d_base<index> > heffte::make_reshape3d ( typename backend::device_instance< typename backend::buffer_traits< backend_tag >::location >::stream_type  stream,
std::vector< box3d< index >> const &  input_boxes,
std::vector< box3d< index >> const &  output_boxes,
MPI_Comm const  comm,
plan_options const  options 
)

Factory method to create a reshape3d instance.

Creates a reshape operation from the geometry defined by the input boxes to the geometry defined by the output boxes. The boxes are spread across the given MPI communicator, where the boxes associated with the current MPI rank are located at input_boxes[mpi::comm_rank(comm)] and output_boxes[mpi::comm_rank(comm)].

Assumes that the order within each of the input and output geometries is consistent, i.e., input_boxes[i].order == input_boxes[j].order for all i, j.
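
The consistency assumption can be checked before calling the factory. The sketch below is a minimal illustration that uses a hypothetical ordered_box type holding only an order array (not the HeFFTe box3d) and a hypothetical order_is_uniform helper; it tests the stated condition that all boxes in a list share the same order.

```cpp
#include <algorithm>
#include <array>
#include <vector>

// Hypothetical box carrying only the dimension order (not the HeFFTe box3d).
struct ordered_box {
    std::array<int, 3> order;
};

// True when every box in the list uses the same order (also true for an empty list).
inline bool order_is_uniform(std::vector<ordered_box> const &boxes) {
    return std::all_of(boxes.begin(), boxes.end(),
                       [&](ordered_box const &b) { return b.order == boxes.front().order; });
}
```

A caller could apply such a check to the input list and to the output list separately and abort early if either check fails.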