Creating Motifs
Ember
Ember is a library that models various kinds of network communication. It accepts events from motifs to drive network activity.
Motif
Motifs are condensed, efficient generators for simulating different kinds of communication/computation patterns. The motif presented here does no real work, but more detailed motifs can be found in sst-elements/src/sst/elements/ember/mpi/motifs/.
Motifs are executed as follows:
- The motif generator is initialized (the constructor).
- The generate function is invoked, places events on the event queue, and returns either true or false.
- The events on the event queue are processed.
- If the generate function in step 2 returns false, go to step 2. Otherwise, the motif is complete.
Motifs are intended to be run as a 'job submission.' The generate function models an entire iteration of an application, using the event queue to mix compute and MPI operations.
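Conceptually, this loop can be sketched as follows. This is pseudocode, not the actual Ember driver code; motif, evQ, and processEvents are placeholders used only to illustrate the steps above.
std::queue<EmberEvent*> evQ;
bool done = false;
while ( ! done ) {
    done = motif->generate( evQ );   // step 2: the motif queues events and reports whether it is finished
    processEvents( evQ );            // step 3: the queued events are executed
}                                    // step 4: repeat until generate() returns true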
Register Subcomponent
Ember motifs must be registered as SST SubComponents. Motif SubComponents should implement the SST::Ember::EmberGenerator API. The ELI registration macro must be placed in a public section of the SubComponent header.
For example:
SST_ELI_REGISTER_SUBCOMPONENT(
    Example,
    "ember",
    "ExampleMotif",
    SST_ELI_ELEMENT_VERSION(1,0,0),
    "Performs an idle on the node. No traffic can be generated.",
    SST::Ember::EmberGenerator
);
The parameters of this macro are:
- Class associated with this Motif
- Library that the SubComponent belongs to
- Identifier of the Motif. This name, prefixed by the library name ('ember'), is used by SST to identify the SubComponent. Note that the identifier must always end with 'Motif'.
- SST elements version
- Comment describing what the motif does
- EmberGenerator API
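For context, a minimal motif header might look like the following sketch. The include path and class layout are assumptions rather than part of the ember library, and the class name must match the first argument of the macro.
#include <queue>
#include "embermpigen.h"   // assumed header providing EmberMessagePassingGenerator

namespace SST {
namespace Ember {

class Example : public EmberMessagePassingGenerator {
public:
    // ELI registration lives in a public section of the header
    SST_ELI_REGISTER_SUBCOMPONENT(
        Example,
        "ember",
        "ExampleMotif",
        SST_ELI_ELEMENT_VERSION(1,0,0),
        "Performs an idle on the node. No traffic can be generated.",
        SST::Ember::EmberGenerator
    );

    Example( SST::ComponentId_t id, Params& params );
    bool generate( std::queue<EmberEvent*>& evQ );
};

}
}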
Writing a constructor
Example::Example(SST::ComponentId_t id, Params& params) :
    EmberMessagePassingGenerator(id, params, "Example")
{
}
The constructor also reads parameters from the Python script. Parameters are passed through the Python file in the form ep.addMotif("Example firstParam=100 secondParam=200"); note that no space is allowed before or after the = operator. Parameters read from the Python file are prefixed with "arg." before being passed to the C++ code, i.e. "firstParam" becomes "arg.firstParam".
The constructor can be used to perform the setup operations necessary for the generate function. A parameter can be read in the C++ file with firstParam = params.find<uint32_t>("arg.firstParam", 100); where "firstParam" is the name of the parameter and 100 is the default value used if the parameter is not supplied.
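Putting these pieces together, a constructor that reads both parameters from the addMotif line above might look like the following sketch; the members m_firstParam and m_secondParam are illustrative, not part of the ember API.
Example::Example(SST::ComponentId_t id, Params& params) :
    EmberMessagePassingGenerator(id, params, "Example")
{
    // "firstParam=100" in the addMotif string arrives here as "arg.firstParam"
    m_firstParam  = params.find<uint32_t>("arg.firstParam", 100);
    m_secondParam = params.find<uint32_t>("arg.secondParam", 200);
}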
Writing a generate() function
bool Example::generate( std::queue<EmberEvent*>& evQ)
The generate function is the 'main' function of a Motif.
If the Python file adds the motif with addMotif, the generate function is invoked repeatedly until it returns true. Each time generate returns, the events queued in the evQ variable are performed.
The generate function takes an event queue as a parameter. The event queue allows the user to include computation operations and MPI events in the simulation. Motifs are intended to be designed so that every call to generate() mimics an entire iteration of the application. The events that can be added to the event queue are listed in subsequent sections.
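As a sketch, assuming illustrative member counters m_iteration and m_iterations set up in the constructor, a generate function that models one compute-then-synchronize iteration per call could look like:
bool Example::generate( std::queue<EmberEvent*>& evQ )
{
    enQ_compute( evQ, 1000 );          // model 1000 ns of local computation
    enQ_barrier( evQ, GroupWorld );    // then synchronize with the other ranks
    m_iteration++;
    // return true once all iterations have been issued; otherwise generate() is called again
    return ( m_iteration == m_iterations );
}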
User-defined events
These functions allow the user to control how the simulation estimates computation time.
- enQ_compute(Queue& q, uint64_t nanoSecondDelay) -- delays the simulation by the expected cost of the compute operation, in nanoseconds, without actually performing it
- enQ_compute(Queue& q, std::function<uint64_t()> func) -- the function is passed as a parameter and invoked when the event is processed; it returns a 64-bit unsigned integer giving the simulation delay, in nanoseconds, associated with the computation
- enQ_getTime(evQ, &m_startTime) -- sets m_startTime to the current simulation time
There are two types of compute functions that can be enqueued.
The first form takes a 64-bit integer giving the amount of time, in nanoseconds, needed to perform the computation; the simulator delays by that many nanoseconds as if the computation had taken place. The second form takes a function: when the event is processed, the function is invoked and the simulation is delayed by the number of nanoseconds it returns. That delay can be estimated with a heuristic or measured by running a representative computation.
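For example, the two compute forms and enQ_getTime could be queued as follows; estimateCost and m_messageSize are illustrative placeholders, not part of the ember API.
enQ_getTime( evQ, &m_startTime );   // record the current simulated time in m_startTime

// fixed-delay form: charge 1000 ns of compute time without doing any work
enQ_compute( evQ, 1000 );

// function form: the lambda runs when the event is processed and its return value,
// in nanoseconds, is charged as the compute delay
enQ_compute( evQ, [this]() -> uint64_t { return estimateCost( m_messageSize ); } );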
MPI Events
The supported MPI events are listed below; for more detail, see the MPI API documentation. A short usage sketch follows the list.
- enQ_init(evQ) -- MPI initialize
- enQ_fini(evQ) -- MPI finalize
- enQ_rank(evQ, m_newComm[0], &m_new_rank) -- MPI rank
- enQ_size(evQ, m_newComm[0], &m_new_size) -- MPI size
- enQ_send(evQ, x_neg, x_xferSize, 0, GroupWorld) -- MPI send
- enQ_recv(evQ, x_neg, x_xferSize, 0, GroupWorld) -- MPI recv
- enQ_isend(evQ, next_comm_rank, itemsThisPhase, 0, GroupWorld, &requests[next_request]) -- MPI isend
- enQ_irecv(evQ, xface_down, items_per_cell * sizeof_cell * ny * nz, 0, GroupWorld, &requests[nextRequest]) -- MPI irecv
- enQ_cancel(evQ, m_req[i]) -- MPI cancel
- enQ_sendrecv(evQ, m_sendBuf, m_messageSize, DATA_TYPE, destRank(), 0xdeadbeef, m_recvBuf, m_messageSize, DATA_TYPE, srcRank(), 0xdeadbeef, GroupWorld, &m_resp) -- MPI sendrecv (combined send and receive)
- enQ_wait(evQ, req) -- MPI wait
- enQ_waitany(evQ, m_req.size(), &m_req[0], &m_indx, &m_resp) -- MPI waitany
- enQ_waitall(evQ, 1, &m_req, (MessageResponse**)&m_resp) -- MPI waitall
- enQ_barrier(evQ, GroupWorld) -- MPI barrier
- enQ_bcast(evQ, m_sendBuf, m_count, DOUBLE, m_root, GroupWorld) -- MPI broadcast
- enQ_scatter(evQ, m_sendBuf, m_sendCnt, m_sendDsp.data(), LONG, m_recvBuf, m_count, LONG, m_root, GroupWorld) -- MPI scatter with constant message size
- enQ_scatterv(evQ, m_sendBuf, &m_sendCnts[0], m_sendDsp.data(), LONG, m_recvBuf, m_count, LONG, m_root, GroupWorld) -- MPI scatter with varying message sizes
- enQ_reduce(evQ, m_sendBuf, m_recvBuf, m_count, DOUBLE, MP::SUM, m_redRoot, GroupWorld) -- MPI reduce
- enQ_allreduce(evQ, m_sendBuf, m_recvBuf, m_count, DOUBLE, m_op, GroupWorld) -- MPI allreduce
- enQ_alltoall(evQ, m_sendBuf, m_colSendCnt, &m_colSendDsp_f[0], DOUBLE, m_colComm) -- MPI alltoall with constant message size
- enQ_alltoallv(evQ, m_sendBuf, &m_colSendCnts_f[0], &m_colSendDsp_f[0], DOUBLE, m_colComm) -- MPI alltoall with varying message sizes
- enQ_allgather(evQ, m_sendBuf, m_count, INT, m_recvBuf, m_count, INT, GroupWorld) -- MPI allgather with messages of constant size
- enQ_allgatherv(evQ, m_sendBuf, m_sendCnt, INT, m_recvBuf, &m_recvCnts[0], &m_recvDsp[0], INT, GroupWorld) -- MPI allgather with messages of varying sizes
- enQ_commSplit(evQ, *comm, color, key, newComm) -- splits an MPI communicator
- enQ_commCreate(evQ, GroupWorld, m_rowGrpRanks, &m_rowComm) -- creates an MPI communicator
- enQ_commDestroy(evQ, m_rowComm) -- destroys an MPI communicator
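As a usage sketch, assuming m_rank was filled in by an earlier enQ_rank call and m_messageSize is an illustrative member, a simple two-rank exchange could be queued inside generate() as:
if ( m_rank == 0 ) {
    enQ_send( evQ, 1, m_messageSize, 0, GroupWorld );   // send to rank 1 with tag 0
    enQ_recv( evQ, 1, m_messageSize, 0, GroupWorld );   // then receive the reply from rank 1
} else if ( m_rank == 1 ) {
    enQ_recv( evQ, 0, m_messageSize, 0, GroupWorld );   // receive from rank 0
    enQ_send( evQ, 0, m_messageSize, 0, GroupWorld );   // send the reply back to rank 0
}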