This is OxMPI version 3.1. OxMPI is an Ox package enabling the development
of distributed Ox programs. The resulting Ox program can run on all sixteen processors
of a sixteen-core workstation (say), or on a cluster of machines.
Using multiple processes requires a message-passing library, MPI in this case,
together with some rewriting of the Ox code to make use of it. To make development
much easier, OxMPI provides a plug-in replacement for the Ox Simulator class.
At the core of the MPI-enabled Simulator class is a Loop class, which provides
a solution for distributed loops and for parallel use of random number generation.
This is intended to give the same results as the multithreaded version
(using the parallel for loop); see the Ox book. It is somewhat different
from the approach adopted in OxMPI version 2, which followed
Doornik, Hendry and Shephard (2003, 2006).
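The principle behind such a replicable distributed loop can be sketched in C (a simplified illustration of the idea only, not the actual OxMPI or Ox RNG code; `draw`, `run_block`, and `run_distributed` are hypothetical names): if every replication derives its random draws from its own index alone, the results do not depend on how the replications are partitioned over processes.

```c
#include <assert.h>
#include <stdint.h>

/* Illustration only: a per-replication generator seeded by the
   replication index (splitmix64-style scrambling), so a draw does
   not depend on which process runs the replication. */
static double draw(uint64_t i)
{
    uint64_t z = i + 0x9e3779b97f4a7c15ULL;
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    z ^= z >> 31;
    return (z >> 11) * (1.0 / 9007199254740992.0); /* uniform in [0,1) */
}

/* one "process" runs replications lo..hi-1 and stores its results */
static void run_block(double *out, int lo, int hi)
{
    for (int i = lo; i < hi; ++i)
        out[i] = draw((uint64_t)i);
}

/* block-partition m replications over nproc processes */
void run_distributed(double *out, int m, int nproc)
{
    for (int p = 0; p < nproc; ++p)
        run_block(out, p * m / nproc, (p + 1) * m / nproc);
}
```

Because each replication's draws depend only on its index, running with 1 process or with 8 produces identical per-replication results.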
Requirements:
One of the objectives of OxMPI is to maintain replicability of simulation results
when using the plug-in Simulator class.
Here are some timings (in seconds) for artest.ox and oxmpi/artest.ox with 10^6 replications. The hardware
consists of two old quad-core computers (the first with an Intel i5 750 at 2.66 GHz,
the second with an Intel Q6600 at 2.4 GHz), both running Windows 7. Also listed are results from a twelve-core AMD Ryzen 5900X running Windows 10.
The software is MPICH2 and Ox 7.0 for the quad-core computers, and MS-MPI and Ox 9 for the twelve-core machine:
Create an ox/packages/oxmpi folder and unzip the
package file there. The default location of Ox 9 is C:\Program Files\OxMetrics9\ox
under Windows, /Applications/OxMetrics9/ox under macOS,
and /usr/share/OxMetrics9/ox under Linux.
Note that:
OxMPI for Windows has been tested with Microsoft MPI 10.1.2.
By default, oxmpi.bat runs the same number of processes
as there are processors/cores, although in simulation experiments the
master process does not have much work to do.
Next you can try to run artest.ox, a distributed simulation experiment.
OxMPI for Linux has been tested with
OpenMPI under Fedora 7 (64-bit).
This section has not been updated for a while.
Under Linux, OxMPI_Init is called in the Ox driver replacement,
and skipped in the wrapper (although the wrapper OxMPI_Init should
still be called). The reason is that MPI performs complex manipulations
of the command-line arguments, which must be handled.
To make external C (or C++ or FORTRAN, etc.) functions callable from
Ox code, it is necessary to write a small C wrapper around the
external function. The task of this wrapper is to:
Ox variables are represented by an OxVALUE, which holds all
types, including:
Other types include functions, class objects, etc.
Access to the contents of an OxVALUE is through macros or C
functions.
The anatomy of a wrapper function is:
Here, rtn holds the return value, pv the cArg
arguments (Ox supports variable argument lists).
One important issue to note is that local OxVALUE variables
are not initialized. They must first be set to an integer, to prevent
subsequent use from causing a spurious deallocation of non-existent memory.
The arguments of the wrapper function, on the other hand, will be
initialized appropriately.
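The danger can be illustrated with a simplified stand-in for OxVALUE (a tagged union of my own devising; the real Ox definitions differ): if the type tag contains stack garbage, cleanup code may try to free a pointer that was never allocated, so the value must be set to an integer before anything else touches it.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for OxVALUE -- NOT the real Ox definition. */
enum { VAL_INT, VAL_MAT };

typedef struct
{
    int type;                         /* VAL_INT or VAL_MAT */
    union { int i; double *mat; } u;  /* payload depends on the tag */
} Value;

/* like setting an OxVALUE to an integer: now cleanup is safe */
void set_int(Value *v, int i)
{
    v->type = VAL_INT;
    v->u.i = i;
}

/* deallocates owned memory, trusting the tag: on an uninitialized
   Value this would free a garbage pointer */
void destroy(Value *v)
{
    if (v->type == VAL_MAT)
        free(v->u.mat);
    v->type = VAL_INT;
    v->u.i = 0;
}
```

A local `Value` with random stack contents might carry the matrix tag and a garbage pointer; calling `destroy` on it would crash, which is exactly why local OxVALUEs must first be set to an integer.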
The following example checks whether the first two arguments are integers
(if not, a type conversion is done when possible; otherwise
a run-time error results),
and then assigns the sum to the return value:
There is no need to check the argument count, provided the
function is properly declared in an Ox header file:
The Ox code could now use, e.g.:
So the new function is used like any other Ox function.
We now give some examples of how the C wrapper that provides
MPI calls to Ox is written, considering MPI_Init,
as well as the implementation of send/receive.
The wrapper is compiled into a
dynamic link library (oxmpi.dll under Windows, oxmpi.so under Linux),
which can be called from Ox.
The oxmpi.h header file provides the glue, and must
be included in Ox programs whenever MPI is used. It contains
for example:
defining the syntax for calls from Ox. MPI_Init resides
in oxmpi (.dll or .so), and is called
FnMPI_Init in the dynamic link library.
The content of FnMPI_Init is given in the listing below.
Unusually, it allows being called with NULL for the
rtn argument -- this is not necessary if the function will only
be called from Ox. Next, it checks that it has not already been called.
MPI 2 states that MPI_Init should be callable with NULL for both arguments,
in case it is called from a dynamic-link library. This is the case
with OxMPI, so -DOXMPI_NO_ARGV should be used when compiling OxMPI.
In some older MPI implementations, command-line arguments were used to pass
MPI arguments across to the different processes. In that case SKIP_MPI_Init
can be used to skip the call to MPI_Init, which must then be placed in
the main function so that the arguments can be manipulated. A new
oxl driver should then be compiled, as in oxl_mpi.c.
Next, we consider sending and receiving data. Ox variables
can have different types. Extra information is transmitted
in an array of three integers: the type and, if necessary, the dimensions.
If an integer is sent, the second value is the content, and only
one send suffices. Otherwise the first send is followed by the actual
data. An OX_ARRAY is a compound type, which can be sent
recursively.
Every MPI_Send must be matched by an MPI_Recv.
First the array of three integers is received. For types other
than integer, this allows the correct variable to be created
(i.e. memory allocated). The second receive then copies the actual
data into the new variable, which is returned to the Ox code.
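The two-phase protocol can be sketched in plain C, with an in-memory channel standing in for the matched MPI_Send/MPI_Recv pair (the type codes, the channel, and the function names here are illustrative assumptions, not the actual Ox or OxMPI definitions):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* illustrative type codes -- not the real Ox constants */
enum { T_INT = 1, T_DOUBLE, T_MATRIX };

/* in-memory "channel" standing in for matched MPI_Send/MPI_Recv */
static unsigned char chan[4096];
static size_t chan_put, chan_get;

static void ch_send(const void *p, size_t n)
{ memcpy(chan + chan_put, p, n); chan_put += n; }

static void ch_recv(void *p, size_t n)
{ memcpy(p, chan + chan_get, n); chan_get += n; }

/* sender: first a header of three ints (type and dimensions),
   then the payload as a second message */
void send_matrix(const double *m, int rows, int cols)
{
    int hdr[3] = { T_MATRIX, rows, cols };
    ch_send(hdr, sizeof hdr);
    ch_send(m, (size_t)rows * cols * sizeof(double));
}

/* receiver: the header arrives first, so the right amount of memory
   can be allocated before the payload is copied in */
double *recv_matrix(int *rows, int *cols)
{
    int hdr[3];
    ch_recv(hdr, sizeof hdr);
    assert(hdr[0] == T_MATRIX);
    *rows = hdr[1];
    *cols = hdr[2];
    double *m = malloc((size_t)hdr[1] * hdr[2] * sizeof(double));
    ch_recv(m, (size_t)hdr[1] * hdr[2] * sizeof(double));
    return m;
}
```

An integer would need only the header message (its value rides in the second slot), which is why FnMPI_Send returns early in the OX_INT case.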
Thanks to Jan Meyer and Charles Bos for improving the portability of OxMPI
and for further testing.
1. OxMPI Introduction
mpiexec -n 4 oxl -DOX_MPI -rp1 artest
mpiexec -n 2 oxl -DOX_MPI -rp1 artest
mpiexec -n 1 oxl -DOX_MPI -rp1 artest
oxl -rp1 artest
oxl artest
When Ox is run with the -rp1 switch, it is run as a single process, thus avoiding multithreading.
1 quad-core computer (Intel i5 750):
    oxl -rp1 (serial)       418
    OxMPI -n 5              124
    oxl (parallel for)      120
2 quad-core computers:
    OxMPI -n 9               77
twelve-core computer:
    oxl -rp12                34
    OxMPI -n 24              17

2. OxMPI Installation
2.1 OxMPI Windows 10 Installation using MS-MPI (MPICH)
Ox 9 is 64-bit software.
mpiexec -n 4 oxl -DOX_MPI -rp1 artest
-n 4 specifies the number of processes for MPI;
-rp1 restricts Ox to using a single thread: if the number of computational threads and
processes exceeds the logical core count, the program starts to slow down.
batch/oxmpi cpi.ox
which should give the Ox output.
If oxl.exe is not in the search path, you can set the OXBIN_PATH variable first, e.g.
set OXBIN_PATH="C:\Program Files\OxMetrics9\ox"
2.3 OxMPI Linux Installation using OpenMPI
mpiexec -n 2 uptime
to see if OpenMPI works.
mpiexec -n 4 oxl64 -DOX_MPI artest
This requires that oxmpi_64.so is in ox/packages/oxmpi.
3. OxMPI Package Contents
4. OxMPI Function Reference
OxMPI_Barrier
OxMPI_Bcast
OxMPI_Comm_rank
OxMPI_Comm_size
OxMPI_Finalize
OxMPI_Get_processor_name
OxMPI_Init
OxMPI_Iprobe
OxMPI_IsMaster
OxMPI_Probe
OxMPI_Recv
OxMPI_Reduce
OxMPI_MAX, OxMPI_MIN, OxMPI_SUM, OxMPI_PROD,
OxMPI_LAND, OxMPI_BAND, OxMPI_LOR, OxMPI_BOR,
OxMPI_LXOR, OxMPI_BXOR, OxMPI_MINLOC, OxMPI_MAXLOC
OxMPI_Send
OxMPI_SetMaster
OxMPI_Wtime
Appendix A: Calling MPI from Ox
A.1 Extending Ox
void OXCALL FnFunction1(OxVALUE *rtn, OxVALUE *pv, int cArg)
{
    /* .... */
}
void OXCALL FnEx1(OxVALUE *rtn, OxVALUE *pv, int cArg)
{
    OxLibCheckType(OX_INT, pv, 0, 1);  /* check pv[0]..pv[1] */
    OxInt(rtn, 0) = OxInt(pv, 0) + OxInt(pv, 1);
}
extern "dll_name,FnEx1" SumInt2(const i1, const i2);
x = SumInt2(10, 100);
A.2: The MPI wrapper
extern "oxmpi,FnMPI_Init" MPI_Init();
extern "oxmpi,FnMPI_Send" MPI_Send(const val, const iDest, const iTag);
extern "oxmpi,FnMPI_Recv" MPI_Recv(const iSource, const iTag);
void OXCALL FnMPI_Init(OxVALUE *rtn, OxVALUE *pv, int cArg)
{
    int iret;

    if (rtn)                            /* rtn may be NULL */
        OxInt(rtn,0) = OxMPI_SUCCESS;
    if (s_bMPI_Initialized)             /* if already done: don't do again */
        return;

#ifdef SKIP_MPI_Init
    /* assume already initialized in main */
    s_bMPI_Initialized = TRUE;
    OxRunMainExitCall(mpi_exit);        /* call MPI_Finalize at end of Ox main */
    return;
#endif

#ifdef OXMPI_NO_ARGV
    iret = MPI_Init(NULL, NULL);        /* MPICH2: don't pass Ox args to MPI */
#else
    {
        int argc; char **argv;
        OxGetMainArgs(&argc, &argv);    /* points to arguments for Ox program */
        iret = MPI_Init(&argc, &argv);  /* MPI may have prepended args */
        OxSetMainArgs(argc, argv);      /* and then remove them again */
    }
#endif
    iret = MPI2Ox_Error(iret);          /* translate error code to those used by Ox */
    if (rtn)
        OxInt(rtn,0) = iret;            /* set the return code */
    if (iret == OxMPI_SUCCESS)
    {
        s_bMPI_Initialized = TRUE;
        OxRunMainExitCall(mpi_exit);    /* call MPI_Finalize at end of Ox main */
    }
}
FnMPI_Init
void OXCALL FnMPI_Send(OxVALUE *rtn, OxVALUE *pv, int cArg)
{
    int dest, tag, aisend[3], len;

    OxLibCheckType(OX_INT, pv, 1, 2);
    dest = OxInt(pv,1);
    tag = OxInt(pv,2);
    aisend[0] = GETPVTYPE(pv);
    switch (GETPVTYPE(pv))
    {
        case OX_INT:
            aisend[1] = OxInt(pv,0);
            MPI_Send(aisend, 3, MPI_INT, dest, tag, s_iMPI_comm);
            return;                     /* finished */
        case OX_DOUBLE:
            len = 1;
            break;
        case OX_MATRIX:
            aisend[1] = OxMatr(pv,0);
            aisend[2] = OxMatc(pv,0);
            len = aisend[1] * aisend[2];
            break;
        case OX_STRING:
            len = aisend[1] = OxStrLen(pv,0);
            break;
        case OX_ARRAY:
            len = aisend[1] = OxArraySize(pv);
            break;
        default:
            return;
    }
    MPI_Send(aisend, 3, MPI_INT, dest, tag, s_iMPI_comm);
    if (len)
    {
        switch (GETPVTYPE(pv))
        {
            case OX_DOUBLE:
                MPI_Send(&(OxDbl(pv,0)), 1, MPI_DOUBLE, dest, tag, s_iMPI_comm);
                break;
            case OX_MATRIX:
                MPI_Send(OxMat(pv,0)[0], len, MPI_DOUBLE, dest, tag, s_iMPI_comm);
                break;
            case OX_STRING:
                MPI_Send(OxStr(pv,0), len, MPI_CHAR, dest, tag, s_iMPI_comm);
                break;
            case OX_ARRAY:
            {
                int i; OxVALUE pvarg[3];
                pvarg[1] = pv[1];       /* dest */
                pvarg[2] = pv[2];       /* tag */
                for (i = 0; i < len; ++i)
                {
                    pvarg[0] = OxArrayData(pv)[i];  /* array entry */
                    FnMPI_Send(rtn, pvarg, 3);
                }
                break;
            }
        }
    }
}
FnMPI_Send
Appendix B: Changes from previous versions
Changes from version 2
Changes from version 1
Acknowledgements
References
An earlier version, presented at the Royal Society, is still
available as Nuffield Economics Working Paper 2001-W22 (PDF).