Application Performance Analysis and Architectural Exploration with the Structural Simulation Toolkit (SST)

Location: IPDPS 2025, Milan, Italy

Dates: June 3, 2025 (Afternoon) and June 4, 2025 (Morning)

Organizers: Scott Hemmert, Clay Hughes, Joe Kenny, Gwen Voskuilen (Sandia National Laboratories); Dave Donofrio, John Leidel (Tactical Computing Laboratories)

Overview

As the science of high-performance computing (HPC) evolves, there is a growing need to understand and quantify the performance and compositional value of emerging technologies. Modeling and simulation techniques are well positioned to serve this purpose. The Structural Simulation Toolkit (SST) is a parallel discrete-event simulation framework that provides tools to enable co-design of HPC systems, from application to architecture. Tutorial participants will be introduced to key facets of conducting reproducible simulations of HPC architectures and infrastructures. The tutorial is split into two self-contained parts: one focused on system-level modeling with an emphasis on application performance analysis, and one focused on node-level modeling with an emphasis on architectural research.

If you are new to SST, we recommend attending at least the “Introduction to SST” portion of Part 1 before attending Part 2.

Tutorial Part 1: Introduction to SST and application modeling in full-system simulations

June 3, 2025, Afternoon

The scale of modern distributed systems makes fully detailed simulation of relevant distributed applications impractical. SST provides two workload modeling environments that allow researchers to create motifs/skeletons, which can drive network simulations and serve as scaffolding for incorporating more detailed simulations. The Ember environment is the mainstay for workload modeling within SST and has underpinned many large-scale system studies using SST. Mercury is a new environment within SST, based on SST/macro, that allows an approach more closely resembling direct execution. In this section of the tutorial, we will examine the use of both workload modeling environments in the context of system-scale network simulations that use the Merlin network components.
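To give a rough feel for what a motif/skeleton is, the following is a minimal, self-contained Python sketch. It does not use Ember's or Mercury's actual API; the names `Event` and `halo_motif` and all parameters are hypothetical and chosen only for illustration. The idea shown is that a motif replaces an application's real computation with modeled delays while keeping its communication pattern, producing a stream of events that a network simulation could consume.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Event:
    """One step of a skeleton application (hypothetical type, not an SST class)."""
    kind: str        # "compute", "send", or "recv"
    peer: int = -1   # partner rank for send/recv; -1 for compute
    size: int = 0    # message size in bytes for send/recv
    nanos: int = 0   # modeled compute time in nanoseconds


def halo_motif(rank: int, nranks: int, iterations: int,
               msg_bytes: int, compute_ns: int) -> Iterator[Event]:
    """Yield the event stream for one rank of a 1-D ring halo exchange.

    Real computation is replaced by a fixed modeled delay; only the
    communication pattern of the application is preserved.
    """
    left = (rank - 1) % nranks
    right = (rank + 1) % nranks
    for _ in range(iterations):
        yield Event("compute", nanos=compute_ns)
        yield Event("send", peer=right, size=msg_bytes)
        yield Event("recv", peer=left, size=msg_bytes)


# Event stream for rank 0 of 4 ranks over two iterations.
events = list(halo_motif(rank=0, nranks=4, iterations=2,
                         msg_bytes=1024, compute_ns=500))
for ev in events:
    print(ev)
```

In a sketch like this, only the send and receive events would place load on the simulated network, which is what keeps system-scale studies tractable compared with simulating the full application in detail.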

Agenda

  1. Introduction to SST
    • Use cases
    • What is SST
    • How to use SST
    • Basic simulation workflow
      • Running SST
      • Analyzing output
  2. Full system modeling
    • Modeling networks
    • Modeling applications
      • Ember
      • Mercury

Tutorial Part 2:

June 4, 2025, Morning

As novel compute architectures gain prominence in the evolving landscape of high-performance computing, tools that enable accurate modeling and simulation of these technologies are crucial for advancing both research and design. The balar GPU component, integrated into SST, provides scalable, trace- and execution-driven simulations of GPU systems by leveraging GPGPU-Sim. Similarly, the Rev RISC-V component from Tactical Computing Laboratories (TCL) and the native SST vanadis CPU simulator extend SST’s capabilities, allowing for detailed exploration of emerging CPU architectures. Together, these tools let researchers model and evaluate both CPU- and GPU-centric systems within a single simulation framework. Participants will gain insights into the integration and use of balar for GPU performance modeling, as well as the Rev RISC-V component for node-level simulations.

Agenda

  1. Brief overview/recap of introduction from the Part 1 Tutorial
  2. Use cases in architectural exploration
  3. Intermediate simulation workflow
    • Setting up and modifying simulation input
    • Running SST: scalability and optimization
    • Experimental methodology
    • Output analysis
  4. SST features and limitations
  5. Modeling RISC-V architectures using Rev
  6. Modeling CPU-GPU architectures using balar and vanadis

Tutorial materials

Before the tutorial, we will update this page with instructions for accessing the slides, code, and precompiled binaries used in the hands-on exercises. In the meantime, explore SST’s general documentation.