Profiling simulation performance

SST itself offers several ways to gather information about simulation performance. The simplest is to use the available command-line options, which report generic information about the simulation as a whole, such as total memory usage and execution time. For advanced use cases, you can also use ProfileTools to gather more detailed information about specific components or operations within the simulation. The destination for all profiling output can be controlled using the --profiling-output command-line option.

Command-line options

These command-line options provide profiling information for the simulation as a whole. They do not break results down by component or operation. Run $ sst --help to see usage information for each option.

Timing information (--timing-info)

Timing information shows the breakdown of real (wall-clock) time spent in each stage of simulation, along with the maximum memory usage observed during that stage.

Heartbeats (--heartbeat-period, --heartbeat-wall-period, --heartbeat-sim-period)

Heartbeats are outputs that SST generates at regular wall-clock and/or simulated-time intervals. They can be used to track the progress of a simulation. Each heartbeat prints the current simulated time and wall-clock time, as well as the memory used for events.

Output (--profiling-output)

The profiling output option allows users to specify where profiling output should be sent. By default, if profiling is enabled and the output option is not used, SST will print to stdout. The output option can also accept a filename to write data to. The filename must have an extension of .txt or .json indicating that the output is to be written in plaintext or JSON format, respectively. Both stdout and a file may be provided, separated by a comma.
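
The extension rule above can be illustrated with a short Python sketch that splits a comma-separated destination list and labels each entry. This is not part of SST: the function name, the file name profile.json, and the assumption that the console destination is spelled exactly stdout are illustrative only.

```python
import os

# Formats named in the text above, keyed by file extension.
FORMATS = {".txt": "plaintext", ".json": "json"}


def classify_destinations(value):
    """Split a comma-separated destination list and label each entry."""
    labelled = []
    for dest in value.split(","):
        dest = dest.strip()
        # Assumption: the console destination is the literal word "stdout".
        if dest == "stdout":
            labelled.append((dest, "console"))
            continue
        ext = os.path.splitext(dest)[1]
        if ext not in FORMATS:
            raise ValueError(f"{dest!r} must end in .txt or .json")
        labelled.append((dest, FORMATS[ext]))
    return labelled


# Hypothetical value for --profiling-output: console plus a JSON file.
print(classify_destinations("stdout,profile.json"))
```

A file name with any other extension, such as profile.csv, raises ValueError in this sketch, mirroring the requirement that the extension be .txt or .json.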

Detailed profiling using ProfilePoints

In addition to simulation-level profiling, SST can also collect information about specific operations within a simulation. For example, you might want to profile how much time certain components spend handling events or measure the time spent by SST in synchronization. For this, SST provides ProfileTools and ProfilePoints. By attaching a ProfileTool to one of the ProfilePoints that SST provides, you can gather detailed information about the simulation.

Info: The API for ProfileTools is not yet finalized and may change between releases. Additional ProfilePoints may be added and the syntax for enabling tools may change. See $ sst --help=enable-profiling for the most up-to-date usage and information.

SST provides three profiling points:

  • clock: profiles calls to clock handlers
  • event: profiles calls to event handlers
  • sync: profiles calls to SST's SyncManager (only valid for parallel simulations)

Enabling ProfileTools

You can connect one of the built-in profiling tools to these points, or a custom profile tool from your own element library. Multiple tools can be connected to a single point.

Be aware that profiling tools can perturb performance. The SST team has observed that most profile tools add only a small overhead, but profiling event sends can have a noticeable effect on performance. Profiling event receives does not have the same impact.

To enable tools, pass them to SST using the command line option --enable-profiling. Multiple tools can be enabled by passing the option multiple times or by passing a list of tools to the option.

The format for enabling ProfileTools is a semicolon-separated list in which each item describes one tool using the following format: name:type(params)[point].

  • name: name of the tool, as shown in the output
  • type: type of profiling tool in ELI format (lib.type)
  • params: optional parameters to pass to the profiling tool, in the format key=value,key=value...
  • point: profiling point to load the tool into (i.e. clock, event, or sync)

For example:

--enable-profiling="events:sst.profile.handler.event.time.high_resolution(level=component)[event]"
--enable-profiling="clocks:sst.profile.handler.clock.count(level=subcomponent)[clock]"
--enable-profiling="sync:sst.profile.sync.time.steady[sync]"
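
When launching SST from a script, it can be convenient to assemble the option value programmatically. The following is a minimal Python sketch of the name:type(params)[point] format described above. The helper names profile_tool_spec and enable_profiling_arg are hypothetical and are not part of SST.

```python
VALID_POINTS = ("clock", "event", "sync")


def profile_tool_spec(name, tool_type, point, params=None):
    """Render one tool as name:type(params)[point]."""
    if point not in VALID_POINTS:
        raise ValueError(f"unknown profiling point: {point}")
    # Parameters are optional; leave out the parentheses when there are none.
    param_text = ""
    if params:
        param_text = "(" + ",".join(f"{k}={v}" for k, v in params.items()) + ")"
    return f"{name}:{tool_type}{param_text}[{point}]"


def enable_profiling_arg(specs):
    """Join several tool specs into one semicolon-separated option value."""
    return "--enable-profiling=" + ";".join(specs)


events = profile_tool_spec(
    "events",
    "sst.profile.handler.event.time.high_resolution",
    "event",
    {"level": "component"},
)
sync = profile_tool_spec("sync", "sst.profile.sync.time.steady", "sync")
print(enable_profiling_arg([events, sync]))
```

If the resulting string is pasted into a shell, it should be quoted, because semicolons, parentheses, and square brackets are special characters there. Passing it as a single element of an argument list (for example to Python's subprocess.run) avoids the need for quoting.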

Available tools

Currently, each ProfilePoint has its own set of valid profiling tools as listed below. These can also be viewed by running $ sst-info sst.

clock

Three tools are available. Each takes a level parameter.

  • profile.handler.clock.count: Counts the number of times the clock handler was called
  • profile.handler.clock.time.high_resolution: Times the clock handler using the high resolution clock
  • profile.handler.clock.time.steady: Times the clock handler using the steady clock (note that for many systems, the high resolution and steady clocks are the same)

Parameters

  • level: Level at which to track the profile data (default: type). Level can be one of the following:
    • global: all data will be tracked globally in a single value
    • type: data will be tracked by Component/SubComponent type. All elements of the same type (lib.element) will be tracked in a single value
    • component: data will be tracked as one value for all handlers in an instance of a Component and all of its SubComponents
    • subcomponent: data will be tracked as one value for all handlers in an instance of a Component or SubComponent (i.e. data will not be aggregated for the entire component; the Component and each of its SubComponents will all be tracked independently)
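
To make the difference between the four levels concrete, the sketch below groups a handful of made-up handler calls the way each level is described above. It only illustrates the grouping; it does not reproduce SST's implementation or its output format, and the element types and instance names (lib.cpu, cpu0, cpu0:l1, and so on) are hypothetical.

```python
from collections import Counter

# Hypothetical handler calls:
# (element type, owning Component instance, element instance)
calls = [
    ("lib.cpu", "cpu0", "cpu0"),       # handler in the Component itself
    ("lib.cache", "cpu0", "cpu0:l1"),  # handler in a SubComponent of cpu0
    ("lib.cpu", "cpu1", "cpu1"),
    ("lib.cache", "cpu1", "cpu1:l1"),
]


def bucket(level, elem_type, component, instance):
    """Return the key under which one call is tracked at the given level."""
    if level == "global":
        return "global"      # everything lands in a single value
    if level == "type":
        return elem_type     # one value per lib.element type
    if level == "component":
        return component     # SubComponents fold into their Component
    if level == "subcomponent":
        return instance      # every instance is tracked on its own
    raise ValueError(f"unknown level: {level}")


def count_calls(level):
    """Count the sample calls per tracking key for the given level."""
    return Counter(bucket(level, *call) for call in calls)


for level in ("global", "type", "component", "subcomponent"):
    print(level, dict(count_calls(level)))
```

With this sample data, global yields one value, type and component yield two values each, and subcomponent yields four, one per Component or SubComponent instance.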

event

Three tools are available. They can accept any of the parameters listed below.

  • profile.handler.event.count: Counts the number of times event handlers were called
  • profile.handler.event.time.high_resolution: Times the event handlers using the high resolution clock
  • profile.handler.event.time.steady: Times the event handlers using the steady clock (note that for many systems, the high resolution and steady clocks are the same)

Parameters

  • level: Level at which to track the profile data (default: type). Level can be one of the following:
    • global: all data will be tracked globally in a single value
    • type: data will be tracked by Component/SubComponent type. All elements of the same type (lib.element) will be tracked in a single value
    • component: data will be tracked as one value for all handlers in an instance of a Component and all of its SubComponents
    • subcomponent: data will be tracked as one value for all handlers in an instance of a Component or SubComponent (i.e. data will not be aggregated for the entire component; the Component and each of its SubComponents will all be tracked independently)
  • track_ports: Controls whether to track by individual ports (default: false)
  • profile_sends: Controls whether sends are profiled (default: false). NOTE: Due to the location of this profiling point in the code, turning on send profiling will incur relatively high overhead. Timing sends only measures code paths in SST-Core and is likely of limited use to end users.
  • profile_receives: Controls whether receives are profiled (default: true)
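
The documented defaults can be captured in a small table so that a launch script only spells out the values it changes. The Python sketch below is illustrative only: the helper names are hypothetical, and writing booleans as the strings true and false is an assumption based on how the defaults are shown above.

```python
# Defaults as documented above (boolean spelling is an assumption).
EVENT_DEFAULTS = {
    "level": "type",
    "track_ports": "false",
    "profile_sends": "false",
    "profile_receives": "true",
}
LEVELS = ("global", "type", "component", "subcomponent")


def event_params(**overrides):
    """Merge overrides into the defaults, rejecting unknown keys and levels."""
    unknown = set(overrides) - set(EVENT_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    params = {**EVENT_DEFAULTS, **overrides}
    if params["level"] not in LEVELS:
        raise ValueError(f"unknown level: {params['level']}")
    return params


def render_params(params):
    """Render key=value pairs, skipping values that equal the default."""
    return ",".join(
        f"{key}={value}"
        for key, value in params.items()
        if value != EVENT_DEFAULTS[key]
    )


print(render_params(event_params(level="component", track_ports="true")))
```

The rendered text is what would go between the parentheses of an event tool specification. Leaving profile_sends at its default keeps the higher overhead of send profiling out of the run unless it is asked for explicitly.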

sync

Three tools are available. They do not take any parameters.

  • profile.sync.count: Counts the number of times the SyncManager is called
  • profile.sync.time.high_resolution: Times calls to the SyncManager using the high resolution clock
  • profile.sync.time.steady: Times calls to the SyncManager using the steady clock (note that for many systems, the high resolution and steady clocks are the same)