Derived datatypes: pack and unpack

Questions

  • How can you reduce the number of messages sent and received?

Objectives

  • Learn how to pack heterogeneous data into a single message.

MPI supports many of the basic datatypes recognized by the C and Fortran standards. However, programs often need to communicate data that requires more complex structures than the fundamental datatypes alone.

Basic datatypes in MPI and in the C standard. For a comprehensive explanation of the types defined in the C language, you can consult this reference.

MPI                          C
MPI_CHAR                     char
MPI_FLOAT                    float
MPI_DOUBLE                   double
MPI_LONG_DOUBLE              long double
MPI_WCHAR                    wchar_t
MPI_SHORT                    short
MPI_INT                      int
MPI_LONG                     long
MPI_LONG_LONG_INT            long long
MPI_SIGNED_CHAR              signed char
MPI_UNSIGNED_CHAR            unsigned char
MPI_UNSIGNED_SHORT           unsigned short
MPI_UNSIGNED                 unsigned int
MPI_UNSIGNED_LONG            unsigned long
MPI_UNSIGNED_LONG_LONG       unsigned long long
MPI_C_COMPLEX                float _Complex
MPI_C_DOUBLE_COMPLEX         double _Complex
MPI_C_LONG_DOUBLE_COMPLEX    long double _Complex
MPI_PACKED                   (no corresponding C type)
MPI_BYTE                     (no corresponding C type)

The MPI standard defines functions to extend the datatypes that can be used in MPI messages. In this episode, we will discuss how to collect heterogeneous data into a single MPI message, leaving the definition of your own types to the next episode, Derived datatypes: MPI_Datatype.

Packing and unpacking

MPI offers the possibility to pack data of known datatypes into a single contiguous memory buffer, and to unpack it again, without first having to define a corresponding datatype. This can be an extremely useful technique to reduce messaging traffic and could help with the readability and portability of the code. The resulting packed buffer is communicated with type MPI_PACKED and can contain any heterogeneous collection of basic datatypes recognized by MPI.

../_images/E01-pack_unpack.svg

MPI allows the programmer to communicate heterogeneous collections in a single message, without defining a full-fledged custom datatype. The data is packed into a buffer of type MPI_PACKED. On the receiving end, the buffer will be unpacked into its constituent components.

../_images/E01-pack.svg

The relation of inbuf, outbuf, and position when calling MPI_Pack. In this figure, outbuf already holds some data (the red shaded area). The data in inbuf is copied to outbuf starting at the address outbuf+*position. When the function returns, the position parameter will have been updated to refer to the first position in outbuf following the data copied by this call.

../_images/E01-unpack.svg

The relation of inbuf, outbuf, and position when calling MPI_Unpack. In this figure, inbuf holds some data. The data in inbuf, starting at the address inbuf+*position, is copied to outbuf. When the function returns, the position parameter will have been updated to refer to the first position in inbuf following the data just copied.

Message passing Pokémons

In the Pokémon trading card game, opponents face each other in duels using their pokémons. The game is played in turns, and at each turn a player can attack. For each attack, we have to send:

  • The attacking pokémon’s name: a char array.

  • How many life points it has: a double.

  • The damage its attack will inflict: an int.

  • A damage multiplier: a double.

You can find a scaffold for the code in the content/code/day-1/05_pokemon-pack-unpack folder. You will have to complete the source code so that it compiles and runs correctly: follow the hints in the source file. A working solution is in the solution subfolder.

  1. Pack the data in the message buffer.

  2. Unpack the message buffer into its component data.

Compile with:

mpicc -g -Wall -std=c11 pokemon-pack-unpack.c -o pokemon-pack-unpack

  • Why are we hardcoding the length of the pokémon’s name?

  • What is the purpose of the position variable? Print its value after each packing and unpacking. Do these values match your intuition?

  • Should packing and unpacking happen in the same order? What happens if not?

  • What happens when there is a mismatch of types between packing and unpacking?

  • We could have packed our data as char, int, double, and double. Is there a way to pack (unpack) the life points and the damage multiplier with one call to MPI_Pack (MPI_Unpack)?

See also

  • The lecture covering MPI datatypes from EPCC is available on GitHub

  • Chapter 5 of the Using MPI book by William Gropp et al. [GLS14]

  • Chapter 6 of the Parallel Programming with MPI book by Peter Pacheco [Pac97]

Keypoints

  • You can reduce message traffic by packing (unpacking) heterogeneous data together.

  • Packing/unpacking are straightforward to use, but might lead to less readable programs.