Derived datatypes: MPI_Datatype

Questions

  • How can you use your own derived datatypes as content of messages?

Objectives

  • Understand how MPI handles datatypes.

  • Learn to send and receive messages using composite datatypes.

  • Learn how to represent homogeneous collections as MPI datatypes.

  • Learn how to represent your own derived datatypes as MPI datatypes.

The ability to define custom datatypes is one of the hallmarks of a modern programming language, since it allows programmers to structure their code in a way that enhances readability and maintainability. How can this be done in MPI? Recall that MPI is a standard describing a library to enable parallel programming in the message passing model.

In the C language, types are primitive constructs: they are defined by the standard and enforced by the compiler. The MPI datatypes are instead opaque handles, all of the single type MPI_Datatype: to the compiler they all look like the same type. This is a fundamental difference which influences the way custom datatypes are handled.
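
For instance, the predefined datatypes are just values of this one handle type:

MPI_Datatype a = MPI_INT;    // describes a C int to MPI
MPI_Datatype b = MPI_DOUBLE; // same type to the compiler, different meaning to MPI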

In the C language, you would declare a struct such as the following:

struct Pair {
 int first;
 char second;
};

struct Pair is a new type. From the compiler’s point of view, it has status on par with the fundamental datatypes introduced above. The C standard sets requirements on how it must be represented in memory, and the compiler will generate machine code to comply with them.

MPI does not know how to represent user-defined datatypes in memory by itself:

  • How much memory does it need? Recall that MPI deals with groups of processes. For portability, you can never assume that two processes share the same architecture!

  • How are the components of Pair laid out in memory? Are they always contiguous? Or are they padded?

The programmer needs to provide this low-level information, such that the MPI runtime can send and receive custom datatypes as messages over a heterogeneous network of processes.

Representation of datatypes in MPI

The representation of datatypes in MPI rests on a few low-level concepts. The type signature of a custom datatype is the list of its basic datatypes:

(1)\[\textrm{Type signature}[\texttt{T}] = [ \texttt{Datatype}_{0}, \ldots, \texttt{Datatype}_{n-1} ]\]

The typemap is the associative array (map) whose keys are the basic datatypes, as understood by MPI, and whose values are their displacements, in bytes:

(2)\[\textrm{Typemap}[\texttt{T}] = \{ \texttt{Datatype}_{0}: \textrm{Displacement}_{0}, \ldots, \texttt{Datatype}_{n-1}: \textrm{Displacement}_{n-1} \}\]

The displacements are relative to the buffer the datatype describes.

Assuming that an int takes 4 bytes of memory, the typemap for our Pair datatype would be: \(\textrm{Typemap}[\texttt{Pair}] = \{ \texttt{int}: 0, \texttt{char}: 4\}\). Note again that the displacements are relative.

Figure: Depiction of the typemap for the Pair custom type. The displacements are always relative.

Knowledge of the typemap and type signature is not enough for a full description of the type to the MPI runtime: the underlying programming language might mandate architecture-specific alignment of the basic datatypes. The data structure would then be laid out in memory inconsistently with the displacements in its typemap. We need a few more concepts. Given a typemap \(m\) we can define:

Lower bound

The first byte occupied by the datatype.

(3)\[\textrm{LB}[m] = \min_{j}[\textrm{Displacement}_{j}]\]
Upper bound

The last byte occupied by the datatype.

(4)\[\textrm{UB}[m] = \max_{j}[\textrm{Displacement}_{j} + \texttt{sizeof}(\textrm{Datatype}_{j})] + \textrm{Padding}\]
Extent

The amount of memory needed to represent the datatype, taking into account architecture-specific alignment.

(5)\[\textrm{Extent}[m] = \textrm{UB}[m] - \textrm{LB}[m]\]

The C language (and Fortran) requires that data occurs at well-defined addresses in memory: the data needs to be aligned. The address, in bytes, of any item must be a multiple of the size of that item in bytes; this is so-called natural alignment. For our Pair data structure, the first element is an int and occupies 4 bytes. An int aligns to 4-byte boundaries: when allocating a new int in memory, the compiler will insert padding to reach the next alignment boundary. The second member, second, is a char and requires just 1 byte. This gives:

\[\begin{split}\begin{aligned} \texttt{Pair.first} &\rightarrow \textrm{Displacement}_{0} = 0, \quad \texttt{sizeof}(\texttt{int}) = 4 \\ \texttt{Pair.second} &\rightarrow \textrm{Displacement}_{1} = 4, \quad \texttt{sizeof}(\texttt{char}) = 1 \end{aligned}\end{split}\]

To insert yet another Pair item, we first need to reach the alignment boundary with a padding of 3 bytes. Thus:

\[\begin{split}\begin{aligned} \textrm{LB}[\texttt{Pair}] &= \min[0, 4] = 0 \\ \textrm{UB}[\texttt{Pair}] &= \max[0+4, 4+1] + 3 = 8 \\ \textrm{Extent}[\texttt{Pair}] &= \textrm{UB}[\texttt{Pair}] - \textrm{LB}[\texttt{Pair}] = 8 \end{aligned}\end{split}\]
Figure: The relation between size and extent of a derived datatype in the case of Pair. The address alignment boundaries are shown as vertical red lines. The lower bound of the custom datatype is 4: first can be found with an offset of 4 bytes after the starting address. Notice the 3 bytes of padding, necessary to achieve natural alignment of Pair. The upper bound is 8: the next item of type Pair can be found with an offset of 8 bytes after the previous element. The total size is 5 bytes, but the extent, which takes the padding into account, is 8 bytes.
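
You can let the compiler confirm these numbers. Below is a minimal sketch using offsetof and sizeof; the values in the comments assume a typical architecture where an int occupies 4 bytes:

#include <stddef.h>
#include <stdio.h>

struct Pair {
  int first;
  char second;
};

int main(void) {
  // offsetof reports the displacement of each member within the struct;
  // sizeof reports the extent, padding included
  printf("first at offset %zu\n", offsetof(struct Pair, first));   // 0
  printf("second at offset %zu\n", offsetof(struct Pair, second)); // 4
  printf("sizeof(struct Pair) = %zu\n", sizeof(struct Pair));      // 8
  return 0;
}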

Which of the following statements about the size and extent of an MPI datatype is true?

  1. The size is always greater than the extent

  2. The size and extent can be equal

  3. The extent is always greater than the size

  4. None of the above

MPI offers functions to query the extent and size of its types: they all take an MPI_Datatype handle as argument.
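
These are MPI_Type_get_extent and MPI_Type_size, with the following C signatures:

int MPI_Type_get_extent(MPI_Datatype datatype, MPI_Aint *lb, MPI_Aint *extent);
int MPI_Type_size(MPI_Datatype datatype, int *size);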

Extents and sizes

We will now play around a bit with the compiler and MPI to gain further understanding of padding, alignment, extents, and sizes.

  1. What are extents and sizes for the basic datatypes char, int, float, and double on your architecture? Do the numbers conform to your expectations? What is the result of sizeof for these types?

    // char
    printf("sizeof(char) = %zu\n", sizeof(char));
    MPI_Aint lb, extent;
    int size;
    MPI_Type_get_extent(MPI_CHAR, &lb, &extent);
    MPI_Type_size(MPI_CHAR, &size);
    printf("For MPI_CHAR:\n  lowerbound = %ld; extent = %ld; size = %d\n", lb,
           extent, size);

    You can find the file with the complete source code in the content/code/day-1/03_basic-extent-size/solution folder.

  2. Let’s now look at the Pair data structure. We first need to declare the data structure to MPI. The following code, which we will study in much detail later on, achieves the purpose:

    // build up the typemap for Pair
    // an instance of Pair to probe addresses
    struct Pair my_pair;
    // the type signature for Pair
    MPI_Datatype typesig[2] = {MPI_INT, MPI_CHAR};
    // how many of each type in a "block" of Pair
    int block_lengths[2] = {1, 1};
    // displacements of data members in Pair
    MPI_Aint displacements[2];
    // MPI_Get_address is portable, unlike raw pointer arithmetic
    MPI_Aint base_address;
    MPI_Get_address(&my_pair, &base_address);
    MPI_Get_address(&my_pair.first, &displacements[0]);
    MPI_Get_address(&my_pair.second, &displacements[1]);
    // make the displacements relative to the start of the struct
    displacements[0] = MPI_Aint_diff(displacements[0], base_address);
    displacements[1] = MPI_Aint_diff(displacements[1], base_address);

    // create and commit the new type
    MPI_Datatype mpi_pair;
    MPI_Type_create_struct(2, block_lengths, displacements, typesig, &mpi_pair);
    MPI_Type_commit(&mpi_pair);

    What are the size and the extent? Do they match up with our pen-and-paper calculation? Try different combinations of datatypes and add other fields to the struct.

    You can find the file with the complete source code in the content/code/day-1/04_struct-extent-size/solution folder.

Any type you like: datatype constructors in MPI

The typemap concept allows us to provide a low-level description of any compound datatype. The MPI_Type_* class of functions offers facilities for portable type manipulations in the MPI standard. At a glance, each custom datatype goes through a well-defined lifecycle in an MPI application, sketched in code after the following list:

  • We construct our new datatype with a type constructor. The new type will be a variable with MPI_Datatype type.

  • We publish our new type to the runtime with MPI_Type_commit.

  • We use the new type in any of the MPI communication routines, as needed.

  • We free the new type from memory with MPI_Type_free.
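
A minimal sketch of this lifecycle, assuming a buffer double coords[3] on the sending process and a receiving process with rank 1:

MPI_Datatype vec3;
MPI_Type_contiguous(3, MPI_DOUBLE, &vec3);        // construct
MPI_Type_commit(&vec3);                           // publish
MPI_Send(coords, 1, vec3, 1, 0, MPI_COMM_WORLD);  // use
MPI_Type_free(&vec3);                             // free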

Figure: The lifecycle of user-defined datatypes in MPI. Calling any of the type constructors creates an object of type MPI_Datatype with the user-defined typemap. Before this custom datatype can be used in message passing, it must be published with MPI_Type_commit: the typemap is made known to the runtime, allowing it to handle messages of the new custom type. The programmer must take care to free the custom datatype object.

It is not always necessary to go all the way down to a typemap to construct new datatypes in MPI. The following types can be created with convenience functions, side-stepping the explicit computation of a typemap. In MPI nomenclature, these types are (their C signatures are collected after the list):

Contiguous

A homogeneous collection of a given datatype. The returned new type will describe a collection of count copies of the old type. Elements are contiguous: elements \(n\) and \(n-1\) are separated by the extent of the old type.

Vector

A slight generalization of the contiguous type: count elements in the new type can be separated by a stride that is an arbitrary multiple of the extent of the old type.

Hvector

A further generalization of the vector type. The separation between elements in a hvector is expressed in bytes, rather than as a multiple of the extent.

Indexed

This type allows non-homogeneous separations between the elements. Each displacement is intended as a multiple of the extent of the old type.

Hindexed

This is a generalization of the indexed type analogous to the hvector. The non-homogeneous separations between the elements are expressed in bytes, rather than as multiples of the extent.
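
For reference, the C signatures of these convenience constructors are:

int MPI_Type_contiguous(int count, MPI_Datatype oldtype, MPI_Datatype *newtype);
int MPI_Type_vector(int count, int blocklength, int stride,
                    MPI_Datatype oldtype, MPI_Datatype *newtype);
int MPI_Type_create_hvector(int count, int blocklength, MPI_Aint stride,
                            MPI_Datatype oldtype, MPI_Datatype *newtype);
int MPI_Type_indexed(int count, const int array_of_blocklengths[],
                     const int array_of_displacements[],
                     MPI_Datatype oldtype, MPI_Datatype *newtype);
int MPI_Type_create_hindexed(int count, const int array_of_blocklengths[],
                             const MPI_Aint array_of_displacements[],
                             MPI_Datatype oldtype, MPI_Datatype *newtype);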

Before using the output parameter newtype, it needs to be “published” to the runtime with MPI_Type_commit:
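
int MPI_Type_commit(MPI_Datatype *datatype);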

newtype is a variable of type MPI_Datatype. The programmer must also ensure that the memory it uses is properly released at the end of the program, by calling MPI_Type_free:
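
int MPI_Type_free(MPI_Datatype *datatype);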

In practice, it may happen that none of the previous convenience constructors is suitable for your application. As we glimpsed in a previous challenge, the general type constructor MPI_Type_create_struct will then suit your needs:
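
int MPI_Type_create_struct(int count, const int array_of_blocklengths[],
                           const MPI_Aint array_of_displacements[],
                           const MPI_Datatype array_of_types[],
                           MPI_Datatype *newtype);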

More message passing Pokémons

We will revisit the Pokémon example from above using custom datatypes.

You can find a scaffold for the code in the content/code/day-1/07_pokemon-type-create-struct folder. You will have to complete the source code to compile and run correctly: follow the hints in the source file. A working solution is in the solution subfolder.

  1. Define the C struct for a pokémon. This has to contain:

    • The attacking pokémon’s name: a char array.

    • How many life points it has: a double.

    • The damage its attack will inflict: an int.

    • A damage multiplier: a double.

  2. Create its corresponding MPI datatype. A sketch of these first two steps is given after this list.

  3. Print it out on the receiving process.
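
As a starting point, here is a minimal sketch of steps 1 and 2. The identifiers (pokemon_t, NAME_LEN, create_mpi_pokemon) are illustrative and may differ from those in the scaffold; a fixed-size name buffer is assumed:

#include <mpi.h>

#define NAME_LEN 16 /* assumed maximum length of a pokémon name */

typedef struct {
  char name[NAME_LEN]; /* the attacking pokémon's name */
  double life_points;  /* how many life points it has */
  int damage;          /* the damage its attack will inflict */
  double multiplier;   /* a damage multiplier */
} pokemon_t;

/* build and commit an MPI datatype mirroring pokemon_t */
MPI_Datatype create_mpi_pokemon(void) {
  pokemon_t probe;
  MPI_Datatype typesig[4] = {MPI_CHAR, MPI_DOUBLE, MPI_INT, MPI_DOUBLE};
  int block_lengths[4] = {NAME_LEN, 1, 1, 1};
  MPI_Aint displacements[4], base_address;
  MPI_Get_address(&probe, &base_address);
  MPI_Get_address(&probe.name, &displacements[0]);
  MPI_Get_address(&probe.life_points, &displacements[1]);
  MPI_Get_address(&probe.damage, &displacements[2]);
  MPI_Get_address(&probe.multiplier, &displacements[3]);
  // displacements must be relative to the start of the struct
  for (int i = 0; i < 4; ++i)
    displacements[i] = MPI_Aint_diff(displacements[i], base_address);

  MPI_Datatype mpi_pokemon;
  MPI_Type_create_struct(4, block_lengths, displacements, typesig,
                         &mpi_pokemon);
  MPI_Type_commit(&mpi_pokemon);
  return mpi_pokemon;
}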

Compile with:

mpicc -g -Wall -std=c11 pokemon-type-create-struct.c -o pokemon-type-create-struct

What happens if you don’t commit the type?

See also

  • The lecture covering MPI datatypes from EPCC is available on GitHub

  • Chapter 5 of the Using MPI book by William Gropp et al. [GLS14]

  • Chapter 6 of the Parallel Programming with MPI book by Peter Pacheco. [Pac97]

Keypoints

  • Typemaps are essential to enable MPI communication of complex datatypes.

  • MPI offers many type constructors to portably use your own datatypes in message passing.

  • Usage of the type constructors can be quite involved, but it ensures that your programs will be portable.