BDMPI - Big Data Message Passing Interface
Release 0.1
BDMPI is still at an early stage of development, and its implementation does not yet include robust parameter checking or error reporting. For this reason, when developing a BDMPI program it may be easier to focus first on the MPI aspect of the program and rely on MPICH's robust error checking and reporting. Once you have an MPI program that runs correctly, you can convert it to a BDMPI program by compiling it with bdmpicc or bdmpic++, and then optionally optimize it using any of the additional APIs provided by BDMPI.
We are aware of two cases in which sbmalloc's memory subsystem will lead to incorrect program execution.
The first involves MPI/BDMPI programs that are multi-threaded (e.g., they rely on OpenMP or Pthreads to parallelize the single-node computations). In such programs, memory that was allocated by sbmalloc and is accessed concurrently by multiple threads (e.g., within an OpenMP parallel region) needs to be pre-loaded before entering the parallel region. The application is responsible for doing this. See the API described in Storage-backed memory allocations, specifically the BDMPI_load() and BDMPI_loadall() functions, for how to do that.
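The ordering requirement above can be sketched as follows. This is only an illustration of where the pre-load step belongs, not BDMPI's own code: preload() is a hypothetical stand-in that reads one byte per assumed 4096-byte page so that the sketch compiles with a plain C compiler. In a real BDMPI program that step would be a call to BDMPI_load() or BDMPI_loadall(), whose exact signatures should be taken from the Storage-backed memory allocations API.

```c
#include <stddef.h>

/* Hypothetical stand-in for the pre-load step. In a BDMPI build this is
 * where BDMPI_load() or BDMPI_loadall() would be called; here we simply
 * read one byte per page (4096 bytes is an assumed page size). */
static void preload(const void *ptr, size_t nbytes)
{
    const volatile unsigned char *p = ptr;
    for (size_t i = 0; i < nbytes; i += 4096)
        (void)p[i];
}

/* Sum an array with OpenMP, pre-loading it in the serial part first. */
double sum_preloaded(const double *x, size_t n)
{
    double total = 0.0;

    /* Must happen before the parallel region, while one thread runs. */
    preload(x, n * sizeof *x);

    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)n; i++)
        total += x[i];

    return total;
}
```

The point of the sketch is the placement: the pre-load runs in the serial part of the program, so no thread inside the parallel region is the first to touch the buffer.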
The second involves functions from the standard library that block signals. Examples of such functions are the file I/O functions, such as read()/write() and fread()/fwrite(). If these functions are used to read or write data from or to memory that has been allocated by sbmalloc, the memory needs to have the appropriate access permissions (read or write). BDMPI provides wrappers for the above four functions that perform such permission changes automatically. However, there may be other functions in the standard library that block signals and for which BDMPI does not provide wrappers. If you encounter such functions, do the following:
- Use BDMPI_load() and BDMPI_loadall() to obtain read permissions.
- Use memset() to zero-fill the associated memory to obtain write permissions.
When a BDMPI program exits unsuccessfully (either due to a program error or an issue with BDMPI itself), there may be a number of files that need to be removed manually. These files include the following:
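- Files in the working directory specified by the -wdir option of bdmprun (see Options of bdmprun).
- Message queues under /dev/mqueue/.
- Shared memory regions under /dev/shm/.
Accessing the message queues requires creating and mounting the /dev/mqueue directory. The commands for that are: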
    sudo mkdir /dev/mqueue
    sudo mount -t mqueue none /dev/mqueue
More information is available in the mq_overview manpage (i.e., "man mq_overview").