The PACE Project is developing an architecture-aware compiler environment. Rice University is the lead site, with active participants at ET International, Ohio State, Stanford, and Texas Instruments.


Compiler Optimization Through Machine Learning

Consider a problem in the PACE context: given a program, a target platform, and a compiler, predict a good compiler configuration, i.e., a sequence of optimizations that yields fast execution for the program (or another advantageous property, such as minimal memory use). The sequence of optimizations that yields the best performance depends on

  • the characteristics of the program being compiled,
  • the characteristics of the target system, and
  • the characteristics of the compiler. 

A human designer achieves this optimization by drawing on past experience: by recalling and applying a configuration of compiler flag settings that worked well for similar programs, or by constructing a good configuration from trial runs of the program of interest. The designer's success therefore depends on the ability to remember past experience; to distill, abstract, and generalize knowledge from it; and to spot patterns in a complex multidimensional space. This, in itself, is a formidable task. Furthermore, all this experience and knowledge might become irrelevant if the target platform changes, and re-acquiring it would take massive effort. The parameter spaces that compiler optimization should ideally draw on for knowledge extraction are extremely large, which makes the problem intractable for the human mind. Automation is needed to characterize, effectively and efficiently, the interactions between programs, platforms, and compilers, and their relation to observed performance in a complex system that evolves over time.
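To give a rough sense of how quickly such a parameter space grows, the sketch below counts ordered optimization sequences. The pass counts and sequence lengths are hypothetical values chosen for illustration, not figures from PACE, and the function name `count_sequences` is likewise an assumption.

```python
# Hypothetical numbers, for illustration only: not PACE measurements.
def count_sequences(num_passes: int, max_length: int) -> int:
    """Count ordered optimization sequences of length 1..max_length,
    where any pass may appear more than once in a sequence."""
    return sum(num_passes ** length for length in range(1, max_length + 1))

# Tiny case that can be checked by hand: 3 + 3*3 = 12 sequences.
small = count_sequences(num_passes=3, max_length=2)

# A modest compiler-like case: 50 passes, sequences of up to 10 passes.
large = count_sequences(num_passes=50, max_length=10)

print(small)
print(f"{large:.2e}")
```

Even this count ignores pass parameters and differences between target platforms, each of which would enlarge the space further.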

Machine Learning aims to develop models of such complex relationships by learning from available data (from past experience or from controlled experiments). The learned models facilitate the discovery of complex patterns and/or the recognition of patterns of known character in huge, unorganized, high-dimensional parameter spaces, thereby making the optimization task tractable and aiding intelligent decision making.
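As a minimal sketch of learning from past experience, the example below predicts a configuration for a new program by reusing the one recorded for the most similar earlier program (a 1-nearest-neighbor lookup). The feature vectors, the flag names, and the function `predict_configuration` are all hypothetical placeholders; this is not the representation or the method used in PACE.

```python
import math

# Hypothetical past experience: (program feature vector, best-known sequence).
# Feature values and optimization names are placeholders for illustration.
PAST_EXPERIENCE = [
    ((0.9, 0.1, 0.2), ("loop-unroll", "vectorize")),
    ((0.1, 0.8, 0.3), ("inline", "dead-code-elim")),
    ((0.2, 0.2, 0.9), ("tile", "prefetch")),
]

def predict_configuration(features, experience=PAST_EXPERIENCE):
    """Return the sequence recorded for the most similar past program,
    using Euclidean distance between feature vectors (1-nearest neighbor)."""
    nearest = min(experience, key=lambda record: math.dist(features, record[0]))
    return nearest[1]

# A new program whose features resemble the first recorded program.
prediction = predict_configuration((0.85, 0.15, 0.25))
print(prediction)
```

A lookup of this kind mirrors what the human designer does when recalling similar programs; a learned model would generalize beyond the stored examples rather than only retrieving them.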

The Machine Learning Group of the PACE effort is concerned with developing techniques to learn from the complex multidimensional data spaces that characterize the interactions between programs, target systems, and compiler optimizations. The result of the learning -- the knowledge, captured in learned models of relevant optimization scenarios -- can then be deployed and used in a variety of PACE-related tasks such as program optimization (for speed, for memory usage, etc.) or resource characterization. Moreover, with certain machine learning techniques, the models deployed after satisfactory initial training could continue to learn in a run-time environment. This not only enables their use as oracles, but also allows ongoing improvement of their knowledge based on feedback about optimization success. Thus the central objective of the PACE project, which is to provide portable performance across a wide range of new and old systems and to reduce the time required to produce high-quality compilers for new computer systems, can be greatly helped by machine learning approaches.
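The idea of a deployed model that keeps improving from feedback can be sketched as follows. This is a deliberately simple stand-in (an incrementally updated mean speedup per configuration), not the technique used by the PACE group; the class name `OnlineConfigModel`, the configuration labels, and the speedup values are all hypothetical.

```python
from collections import defaultdict

class OnlineConfigModel:
    """Keeps a running mean of the observed speedup for each configuration."""

    def __init__(self):
        self.count = defaultdict(int)
        self.mean_speedup = defaultdict(float)

    def update(self, config, observed_speedup):
        """Fold one new feedback measurement into the running mean."""
        self.count[config] += 1
        n = self.count[config]
        self.mean_speedup[config] += (observed_speedup - self.mean_speedup[config]) / n

    def best(self):
        """Act as an oracle: return the configuration with the highest mean.
        Requires at least one prior call to update()."""
        return max(self.mean_speedup, key=self.mean_speedup.get)

model = OnlineConfigModel()
# Hypothetical run-time feedback: (configuration label, measured speedup).
for config, speedup in [("A", 1.2), ("B", 1.5), ("A", 2.0), ("B", 1.1)]:
    model.update(config, speedup)

print(model.best())
```

Because each update needs only the previous mean and a count, the model can absorb feedback indefinitely without storing past measurements.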

Machine Learning for compiler optimization is a relatively new area with much unexplored territory. This reflects the complexity of the challenges in both compiler optimization and the applicable Machine Learning algorithms, as well as the large opportunities to develop new technologies for automating code optimization. The mission of the PACE Machine Learning Group is to respond to these challenges and opportunities.