Synthetic Perturbation Tuning

We are developing tools to be integrated with the S-Check system developed at NIST. S-Check, based on Synthetic Perturbation Tuning (SPT), offers the unique potential for tuning parallel programs with little or no instrumentation, so the observed behavior is largely free of the intrusion that instrumentation introduces. SPT is also easy to use: the user simply identifies the points in the computation suspected of causing performance problems, and S-Check reports which of those points has the greatest effect on execution time. With this information, the user can focus on optimizing that specific point, confident that it offers the greatest benefit among the points examined.
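The idea of ranking user-chosen points by their effect on execution time can be illustrated with a minimal sketch. It assumes a two-level full factorial experiment in which a synthetic delay at each point is either on or off; the function names, the point labels, and the toy timing model are hypothetical and are not taken from S-Check itself.

```python
from itertools import product

def main_effects(points, run_time):
    """Estimate each point's main effect on run time.

    points:   labels of the candidate code points (hypothetical names)
    run_time: callable mapping {point: delay_on} -> measured execution time
    """
    # One run for every on/off combination of the synthetic delays.
    runs = []
    for levels in product([False, True], repeat=len(points)):
        setting = dict(zip(points, levels))
        runs.append((setting, run_time(setting)))

    # Main effect = mean time with the delay on minus mean time with it off.
    effects = {}
    for p in points:
        on = [t for s, t in runs if s[p]]
        off = [t for s, t in runs if not s[p]]
        effects[p] = sum(on) / len(on) - sum(off) / len(off)
    return effects

def toy_program(setting):
    """Stand-in for a timed run: two parallel branches, then a serial tail."""
    delay = 1.0
    branch_a = 10.0 + (delay if setting["loop_a"] else 0.0)  # on the critical path
    branch_b = 4.0 + (delay if setting["loop_b"] else 0.0)   # has slack
    tail = 2.0 + (delay if setting["reduce"] else 0.0)       # serial section
    return max(branch_a, branch_b) + tail

effects = main_effects(["loop_a", "loop_b", "reduce"], toy_program)
for point, effect in effects.items():
    print(f"{point}: {effect:+.2f}")
```

In this toy model the delay in the branch with slack is absorbed at the join and shows no effect, while delays on the critical path and in the serial section pass straight through to the total time. A real experiment would replace `toy_program` with timed runs of the application and would need replication to separate effects from run-to-run noise.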

However, the statistical analysis on which SPT is based can require many runs of the application. This may be undesirable for long-running applications: for a program that runs for several hours, executing it 200 times to collect enough data may not be feasible. In addition, large programs may have proportionately more points that the user wishes to analyze, requiring still more runs. Production parallel programs are often both large and long-running, which compounds the problem. High-performance parallel machines may also be in high demand, making a large number of runs even less desirable.
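To give a feel for how the cost grows with the number of points, the sketch below counts runs under one assumed design, a replicated two-level full factorial. The text does not say which design is used, and fractional designs would need fewer runs; the 3-hour run length is likewise only an illustrative figure.

```python
def factorial_runs(num_points, replicates=1):
    """Runs needed for a two-level full factorial design (an assumed design)."""
    return (2 ** num_points) * replicates

HOURS_PER_RUN = 3.0  # illustrative run length, not a measured value

for k in (3, 5, 8):
    runs = factorial_runs(k, replicates=2)
    print(f"{k} points: {runs} runs, {runs * HOURS_PER_RUN:.0f} machine-hours")
```

Under this assumption each additional point doubles the number of runs, which is why screening out unimportant points before experimentation matters.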

SPT has several advantages that mitigate some of the above concerns (e.g., because SPT does not change the computed results, the experimental runs can be actual production runs producing useful results). However, minimizing the number of runs is a central concern for performance tuning systems based on SPT. The proposed work will augment SPT techniques via analysis of parallel programs to:

identify points in the computation that are likely (and/or unlikely) to substantially affect performance by developing heuristics to predict the relative sensitivities of factors based on program structure,
determine alternate responses that can provide more detailed information on the behavior of the parallel program, and
interpret the results of SPT experiments in terms of program structure.

There are several areas of research in program modeling and analysis that can be adapted to provide the proposed analysis. In particular, a recent area of research, Delay Propagation Modeling (DPM), will be considered. DPM was developed to quantify the effects of instrumentation inserted in parallel programs and is based on static and dynamic analysis of parallel program structure.

DPM techniques will be used to develop pre-processors and post-processors that "wrap" around the SPT "engine". The pre-processing tools will identify potentially important factors by statically (and dynamically) analyzing program structure. They will also analyze the synchronization structure of the program to determine which synchronizations are likely to play an important role in determining overall execution time. The amount of idle time incurred at each of these synchronizations is an additional response that SPT can analyze. Post-processing tools will interpret the results of SPT experiments and automatically direct future experimentation. The additional knowledge gained from analyzing program structure and behavior has the potential to substantially enhance S-Check and SPT analysis.
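As an illustration of idle time used as a per-synchronization response, the following sketch computes it from barrier arrival times. It assumes that each synchronization is a barrier released when the last process arrives; the trace format, barrier names, and function name are hypothetical and do not describe the actual DPM or S-Check tools.

```python
def barrier_idle_times(arrivals):
    """Total idle time per barrier.

    arrivals: {barrier_name: [arrival time of each process]}
    Assumes the barrier releases when the last process arrives, so each
    process idles for (latest arrival - its own arrival).
    """
    idle = {}
    for name, times in arrivals.items():
        release = max(times)
        idle[name] = sum(release - t for t in times)
    return idle

# Hypothetical trace for four processes and two barriers.
trace = {
    "barrier_1": [4.0, 9.0, 5.0, 9.0],
    "barrier_2": [12.0, 12.5, 12.0, 13.0],
}

idle = barrier_idle_times(trace)
ranked = sorted(idle, key=idle.get, reverse=True)
for name in ranked:
    print(f"{name}: {idle[name]:.1f} process-seconds idle")
```

Each idle-time total could then be recorded for every experimental run and analyzed alongside overall execution time, with the barriers that accumulate the most idle time being natural candidates for closer study.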
