The ProDev WorkShop Performance Analyzer helps you understand how your program performs so that you can correct any problems. In performance analysis, you run experiments to capture performance data and see how long each phase or part of your program takes to run. You can then determine if the performance of the phase is slowed down by the CPU, I/O activity, memory, or a bug, and you can attempt to speed it up.
A menu of predefined tasks is provided to help you set up your experiments. With the Performance Analyzer views, you can conveniently analyze the data. These views show CPU utilization and process resource usage (such as context switches, page faults, and working set size), I/O activity, and memory usage (to capture such problems as memory leaks, bad allocations, and heap corruption).
The Performance Analyzer has three general techniques for collecting performance data:
Counting: counts the exact number of times each function or basic block has been executed. This requires instrumenting the program, that is, inserting code into the executable file to collect counts.
Profiling: periodically examines and records a program's program counter (PC), call stack, and resource consumption.
Tracing: traces events that affect performance, such as reads and writes, MPI calls, system calls, page faults, floating-point exceptions, and mallocs, reallocs, and frees.
The Performance Analyzer can record a number of different performance experiments, each of which provides one or more measures of code performance.
To set up a performance experiment, select a task from the Select Task submenu on the Perf menu in the Debugger Main View. The Select Task menu lets you select among several predefined experiment tasks. If you have not formed an opinion of where performance problems lie, select either the Profiling/PC Sampling task or the User Time/Callstack Sampling task. They are useful for locating general problem areas within a program.
Start the program by clicking the Run button in Main View.
After the experiment has finished running, you can display the results in the Performance Analyzer window by selecting Performance Analyzer from the Launch submenu in any ProDev WorkShop Admin menu or by typing the following:
% cvperf -exp experimentname
Results from a typical performance analysis experiment appear in Figure 4-1, the main Performance Analyzer window, and Figure 4-2, which shows a subset of the graphs in the Usage View (Graphs) window. From the graphs, you should be able to determine where execution phases occur so that you can set traps between them to sample performance data at specific points and events during the experiment.
Setting traps to sample data between execution phases isolates the data to be analyzed on a phase-by-phase basis. To set a sample trap, select Sample, Sample at Function Entry, or Sample at Function Exit from the Set Trap submenu in the Traps menu in the Debugger Main View or through the Traps Manager window.
At this point you need to form a hypothesis about the source of the performance problem and select an appropriate task from the Select Task menu for your next experiment.
Select the next experiment from the Task menu in the Performance Pane and run it by clicking the Run button in Main View.
When the results of the second experiment are returned to you, you can analyze the results by using the Main View, any of its views, or Source View with performance data annotations displayed.
The Performance Analyzer provides results in the windows listed in Table 4-1.
Performance Analyzer Window
Performance Analyzer Main View
Function list with performance data, usage chart showing general resource usage over time, and time line for setting scope on data
Call Stack View
Call stack recorded when selected event occurred
Usage View (Graphs)
Specific resource usage over time, shown as graphs
Usage View (Numerical)
Specific resource usage for selected (by caliper) time interval, shown as numerical values
Call Graph View
A graph showing functions that were called during the time interval, annotated by the performance data collected
I/O View
A graph showing I/O activity over time during the time interval
Malloc View
A list of all mallocs, their sizes and number of occurrences, and, if selected, their corresponding call stack within the selected time interval
Malloc Error View
A list of malloc errors, their number of occurrences, and, if selected, their corresponding call stack within the time interval
Leak View
A list of specific leaks, their sizes and number of occurrences, and, if selected, their corresponding call stack within the time interval
Heap View
A generalized view of heap memory within the time interval
Source View
The ProDev WorkShop text editor window showing source code annotated by the performance data collected
Working Set View
The instruction coverage of dynamic shared objects (DSOs) that make up the executable, which shows instructions, functions, and pages that were not used within the time interval
Butterfly View
The callers and callees of designated functions
MPI Stats View (Graphs)
A display of various MPI information in the form of graphs
MPI Stats View (Numerical)
A display of various MPI information in the form of text
The Cord Analyzer is not actually part of the Performance Analyzer, but it works with data from Performance Analyzer experiments. It lets you arrange functions in different orders to determine the effect on performance.
The following table details where to find more information about the Performance Analyzer in the ProDev WorkShop: Performance Analyzer User's Guide.
General Performance Analyzer information
Chapter 1, "Introduction to the Performance Analyzer"
Chapter 2, "Performance Analyzer Tutorial"
Setting up experiments
Chapter 3, "Setting Up Performance Analysis Experiments" for details and Chapter 4, "Selecting Performance Tasks" for a summary
Setting sample traps
Chapter 3, "Setting Sample Traps" subsection
Performance Analyzer main window
Chapter 4, "The Performance Analyzer Main Window" subsection
Usage View (Graphs) window
Chapter 4, "The Usage View (Graphs) Window" subsection
Watching an experiment using Process Meter
Chapter 4, "The Process Meter Window" subsection
Tracing I/O calls using the I/O View window
Chapter 4, "The I/O View Window" subsection
Call Graph View window
Chapter 4, "The Call Graph View Window" subsection
Finding memory problems
Chapter 4, "Analyzing Memory Problems" subsection
Call Stack View window
Chapter 4, "The Call Stack Window" subsection