The PACE Project is developing an architecture-aware compiler environment. Rice University is the lead site, with active participants at ET International, Ohio State, Stanford, and Texas Instruments.


An Overview of the PACE Compiler

The PACE compiler is a source-to-source optimizing compiler that tailors application code for efficient execution on a specific target system. It accepts as input a parallel program written in C with MPI and/or OpenMP parallel extensions. It analyzes and transforms the code, with transformations that focus on optimizing locality and parallelism, and produces as output a C program tailored to the performance parameters of the target system and to the specific details of its processor cores and memory hierarchies.

The PACE compiler uses information from several sources.

  • Characterization: The PACE characterization tools measure performance characteristics of the combined hardware/software stack of the target system. The PACE compiler uses those characteristic values both to drive optimizing transformations and to estimate the performance of alternative optimization strategies on the target system.
  • Machine Learning: The PACE machine learning tools will provide suggestions to the compiler to guide the optimization process. The specific nature of those suggestions is a subject of ongoing research.
  • Runtime System: The PACE runtime system will provide the compiler with measured performance data from application executions. This data will include detailed performance profile information. The runtime system will identify rate-limiting resources for loop nests and functions that use a significant fraction of execution time.
  • Optimization Plan: The PACE compiler will coordinate its internal components, in part, by using an explicit but malleable optimization plan. The optimization plan is a concrete representation of the individual transformations that the compiler will apply to the code, along with any parameters to those transformations.
  • Configuration File: The configuration file is provided by the system installer. It contains critical facts about the target system and its software configuration that the PACE tools, including the PACE compiler, need to interact with that software.
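
To make the idea of an explicit optimization plan more concrete, the following is a minimal sketch in C of how a plan might be held as data and rendered as text. It is an illustration only, not the PACE compiler's actual representation: the `PlanStep` type, the `plan_to_string` function, the transformation names, and the `name(p1,p2);name(p1)` text format are all assumptions introduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PARAMS 4   /* hypothetical limit on parameters per step */

/* Hypothetical plan entry: one transformation plus its integer parameters. */
typedef struct {
    const char *name;               /* transformation name, e.g. "tile" */
    int         nparams;            /* number of entries used in params */
    int         params[MAX_PARAMS]; /* parameters to the transformation */
} PlanStep;

/* Append formatted text at buf + *used; returns 0, or -1 if it does not fit. */
static int append(char *buf, size_t cap, size_t *used, const char *fmt,
                  const char *s, int v)
{
    int n = (s != NULL) ? snprintf(buf + *used, cap - *used, fmt, s)
                        : snprintf(buf + *used, cap - *used, fmt, v);
    if (n < 0 || (size_t)n >= cap - *used)
        return -1;
    *used += (size_t)n;
    return 0;
}

/* Render the plan as "name(p1,p2);name(p1)" (an assumed format).
 * Returns the length of the text, or -1 if buf is too small. */
int plan_to_string(const PlanStep *steps, int nsteps, char *buf, size_t cap)
{
    size_t used = 0;

    assert(steps != NULL && buf != NULL);
    if (cap == 0)
        return -1;
    buf[0] = '\0';

    for (int i = 0; i < nsteps; i++) {
        /* Separate consecutive steps with ';'. */
        if (i > 0 && append(buf, cap, &used, "%s", ";", 0) < 0)
            return -1;
        if (append(buf, cap, &used, "%s(", steps[i].name, 0) < 0)
            return -1;
        for (int j = 0; j < steps[i].nparams; j++) {
            /* Separate consecutive parameters with ','. */
            if (j > 0 && append(buf, cap, &used, "%s", ",", 0) < 0)
                return -1;
            if (append(buf, cap, &used, "%d", NULL, steps[i].params[j]) < 0)
                return -1;
        }
        if (append(buf, cap, &used, "%s", ")", 0) < 0)
            return -1;
    }
    return (int)used;
}
```

Under these assumptions, a two-step plan that tiles with parameters 32 and 32 and then unrolls by 4 renders as the text `tile(32,32);unroll(4)`. Keeping the plan as plain data with a text form is what makes it easy both to adjust ("malleable") and to record alongside performance measurements.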

The compiler produces, as output, executable C code that will interact with the runtime system. It embeds into the executable the information that the runtime system needs. For example:
  • The compiler creates a string representation of the optimization plan so that the runtime system can retrieve that information and associate it with the performance data, a necessary part of creating useful information for feedback-directed optimization and the machine learning tools.
  • The compiler inserts calls to the API for runtime optimization. These calls register the locations of runtime optimization parameters and suggested value ranges for them. The API allows the compiler and runtime system to coordinate this behavior.
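
The two examples above can be sketched in C as follows. This is a hedged illustration, not the PACE runtime API: every name here (`pace_plan_text`, `rt_register_param`, `rt_set_param`, `rt_plan_text`, `kernel_init`, `tile_size`) is hypothetical, and the choice to clamp values to the suggested range is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: plan text that the compiler embeds in the generated code. */
static const char pace_plan_text[] = "tile(32,32);unroll(4)";

/* Hypothetical registry entry for one runtime optimization parameter. */
typedef struct {
    const char *name;     /* parameter name used by the runtime system */
    int        *location; /* address of the variable in generated code */
    int         lo, hi;   /* suggested inclusive value range */
} RtParam;

#define RT_MAX_PARAMS 16
static RtParam rt_params[RT_MAX_PARAMS];
static int     rt_nparams = 0;

/* Called from generated code: register a parameter's location and range.
 * Returns 0 on success, -1 on a bad argument or a full table. */
int rt_register_param(const char *name, int *location, int lo, int hi)
{
    if (name == NULL || location == NULL || lo > hi ||
        rt_nparams == RT_MAX_PARAMS)
        return -1;
    rt_params[rt_nparams++] = (RtParam){ name, location, lo, hi };
    return 0;
}

/* Called by the runtime system: write a new value through the registered
 * location, clamped to the suggested range (clamping is an assumption).
 * Returns 0 if the parameter is known, -1 otherwise. */
int rt_set_param(const char *name, int value)
{
    for (int i = 0; i < rt_nparams; i++) {
        if (strcmp(rt_params[i].name, name) == 0) {
            if (value < rt_params[i].lo) value = rt_params[i].lo;
            if (value > rt_params[i].hi) value = rt_params[i].hi;
            *rt_params[i].location = value;
            return 0;
        }
    }
    return -1;
}

/* Called by the runtime system: retrieve the embedded plan text so it can
 * be associated with the performance data for this execution. */
const char *rt_plan_text(void)
{
    return pace_plan_text;
}

/* Generated-code side: a tunable tile size and its registration call. */
static int tile_size = 32;

int kernel_init(void)
{
    return rt_register_param("tile_size", &tile_size, 8, 128);
}

int kernel_tile_size(void)
{
    return tile_size;
}
```

In this sketch the generated code calls `kernel_init` once at startup. The runtime system can then call `rt_set_param("tile_size", 64)` to adjust the parameter in place, and `rt_plan_text` to obtain the plan that produced the code, without consulting any external store.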

By embedding this information directly in the executable code, PACE provides a simple solution to the storage of information needed for feedback-directed optimization and for the application of machine learning to the selection of optimization plans. It avoids the need for a centralized store, like the central repository in the Rn system of the mid-1980s. It also avoids the complication of creating a unique name for each compilation, recording those names in some central repository, and ensuring that each execution can contact the central repository.