The PACE Project is developing an architecture-aware compiler environment. Rice University is the lead site, with active participants at ET International, Ohio State, Stanford, and Texas Instruments.


Resource Characterization in the PACE Project

Resource characterization plays a critical role in the PACE Project's strategy for building an optimizing compiler that adapts and tunes itself to new systems.

The PACE Compiler and the PACE Runtime System need access to measurements of a variety of performance-related characteristics of the target computing system. The goal of the PACE Resource Characterization subproject is to produce those measured values. They include cache characteristics such as the size of each level of the cache hierarchy, instruction costs and latencies, and the availability of vector operations. The overarching design requirement of this subproject is that we measure only those architectural characteristics that affect the code produced by the PACE compiler. A characteristic such as the size of the first level of cache is important because the compiler can use it to guide optimizations such as loop blocking. By contrast, while it may be intellectually interesting to discover the length of a processor's instruction pipeline, we do not spend time measuring it if the compiler cannot take advantage of that information. The pipeline example is instructive: knowing the pipeline length might improve some optimizations (e.g., it indicates the cost of a mispredicted branch, which may be useful for instruction scheduling), but the PACE compiler is limited by the requirement that it produce C code -- rather than native machine code -- as its output, so it has little direct control over such instruction-level decisions.

One of the fundamental design goals of the PACE compiler is that it must be able to adapt to many different architectures, both those currently in production and those that have not yet been designed. This means that resource characterization cannot rely on existing technologies and interfaces. For example, some commodity microprocessors provide programming interfaces that allow a piece of code to discover many of the characteristics we are looking for. The problem with relying on these interfaces is that they are inherently idiosyncratic, both in the kind of information offered and in the format of the calls and the information returned. Moreover, the information is usually based on physical capacity rather than usable limits. For example, we measured a significant slowdown when trying to use more than five megabytes of a cache that can hold eight megabytes. The reason for the discrepancy is that the cache is shared among cores and processes, so although the hardware reports a particular size, the practical capacity (i.e., the amount we can use before the code slows down, which is the value the compiler actually needs) can be quite different. Further, future architectures may well have different interfaces and levels of information. Clearly, the goals of the PACE project require a more generic, universal solution.

As a result, our approach to resource characterization is to use microbenchmarks: small pieces of code designed to expose architectural behavior. Each microbenchmark is tightly focused on discovering the characteristics of a specific architectural feature. The result is a library of codes, along with a simple interface that allows the rest of the PACE tools to access the results of resource characterization.
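
As a rough illustration of what a cache-focused microbenchmark might look like, the C sketch below times a chain of dependent loads over a working set of a chosen size, then picks the largest size whose latency stays close to the baseline. It is not taken from the PACE code base. The function names chase_ns_per_access and effective_capacity, the xorshift generator, the use of clock() for timing, and the tolerance heuristic are all assumptions made for this example. A production benchmark would likely also control for cache-line size, page size, and timer resolution.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Small xorshift64 generator (illustrative choice), used instead of rand()
   so the shuffle does not depend on the platform's RAND_MAX. */
static uint64_t xorshift64(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return *state = x;
}

/* Hypothetical microbenchmark: average time in nanoseconds per dependent
   load while walking a working set of roughly `bytes` bytes.
   Returns -1.0 if the buffer cannot be allocated. */
double chase_ns_per_access(size_t bytes, size_t steps)
{
    assert(steps > 0);
    size_t n = bytes / sizeof(size_t);
    if (n < 2) n = 2;
    size_t *next = malloc(n * sizeof *next);
    if (next == NULL) return -1.0;

    /* Sattolo's algorithm: a random permutation that forms one cycle of
       length n, so the walk visits every element in a hard-to-prefetch order. */
    uint64_t rng = 0x9E3779B97F4A7C15ULL;
    for (size_t i = 0; i < n; i++) next[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)(xorshift64(&rng) % i);   /* 0 <= j < i */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }

    size_t p = 0;
    for (size_t s = 0; s < n; s++) p = next[p];      /* warm-up pass */

    clock_t start = clock();
    for (size_t s = 0; s < steps; s++) p = next[p];  /* each load depends on the last */
    clock_t stop = clock();

    volatile size_t sink = p;  /* keep the compiler from discarding the walk */
    (void)sink;
    free(next);
    return 1e9 * (double)(stop - start) / (double)CLOCKS_PER_SEC / (double)steps;
}

/* Assumed heuristic: the largest working-set size whose latency stays within
   `tolerance` times the latency of the smallest size. sizes[] must be in
   ascending order; returns 0 when there are no measurements. */
size_t effective_capacity(const size_t *sizes, const double *ns,
                          size_t count, double tolerance)
{
    size_t best = 0;
    for (size_t i = 0; i < count; i++) {
        if (ns[i] > tolerance * ns[0]) break;
        best = sizes[i];
    }
    return best;
}
```

A driver would call chase_ns_per_access for a range of increasing sizes and pass the results to effective_capacity. For instance, given made-up latencies of 1.0, 1.1, 1.2, and 3.0 ns at 1, 2, 4, and 8 megabytes and a tolerance of 1.5, effective_capacity returns 4 megabytes. Measuring the slowdown directly, rather than asking the hardware for its cache size, is what lets a benchmark of this kind report usable rather than physical capacity.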