Chapter 4. Pinpointing Performance Problems

The ProDev WorkShop Performance Analyzer helps you understand how your program performs so that you can correct any problems. In performance analysis, you run experiments to capture performance data and to see how long each phase or part of your program takes to run. You can then determine whether a phase is slowed down by the CPU, I/O activity, memory usage, or a bug, and attempt to speed it up.

A menu of predefined tasks is provided to help you set up your experiments. With the Performance Analyzer views, you can conveniently analyze the data. These views show CPU utilization and process resource usage (such as context switches, page faults, and working set size), I/O activity, and memory usage (to capture such problems as memory leaks, bad allocations, and heap corruption).

The Performance Analyzer has three general techniques for collecting performance data: counting, profiling, and tracing.

Performance Analyzer User Model

The Performance Analyzer can record a number of different performance experiments, each of which provides one or more measures of code performance.

  1. To set up a performance experiment, select a task from the Select Task submenu on the Perf menu in the Debugger Main View. The Select Task menu lets you select among several predefined experiment tasks. If you have not yet formed an opinion about where the performance problems lie, select either the Profiling/PC Sampling task or the User Time/Callstack Sampling task; both are useful for locating general problem areas within a program.

  2. Start the program by clicking the Run button in Main View.

  3. After the experiment has finished running, you can display the results in the Performance Analyzer window by selecting Performance Analyzer from the Launch submenu in any ProDev WorkShop Admin menu or by typing the following:

    % cvperf -exp experimentname

    Results from a typical performance analysis experiment appear in Figure 4-1, the main Performance Analyzer window, and Figure 4-2, which shows a subset of the graphs in the Usage View (Graphs) window. From the graphs, you should be able to determine where execution phases occur so that you can set traps between them to sample performance data and events at specific points during the experiment.

    Figure 4-1. Performance Analyzer Main View

    Figure 4-2. Usage View (Graphs) Window: Lower Graphs

  4. Setting traps to sample data between execution phases isolates the data to be analyzed on a phase-by-phase basis. To set a sample trap, select Sample, Sample at Function Entry, or Sample at Function Exit from the Set Trap submenu in the Traps menu in the Debugger Main View or through the Traps Manager window.

  5. Select your next experiment from the Task menu in the Performance Pane and run it by clicking the Run button in Main View.

    At this point you need to form a hypothesis about the source of the performance problem and select an appropriate task from the Select Task menu for your next experiment.

  6. When the results of the second experiment are available, you can analyze them in Main View, in any of the Performance Analyzer views, or in Source View with performance data annotations displayed.

    The Performance Analyzer provides results in the windows listed in Table 4-1.

    Table 4-1. Performance Analyzer Views and Data

    Performance Analyzer Main View: Function list with performance data, a usage chart showing general resource usage over time, and a time line for setting the scope of the data

    Call Stack View: The call stack recorded when the selected event occurred

    Usage View (Graphs): Specific resource usage over time, shown as graphs

    Usage View (Numerical): Specific resource usage for the time interval selected by the calipers, shown as numerical values

    Call Graph View: A graph of the functions called during the time interval, annotated with the performance data collected

    I/O View: A graph of I/O activity over time during the time interval

    Malloc View: A list of all mallocs, their sizes and numbers of occurrences, and, if selected, their corresponding call stacks within the selected time interval

    Malloc Error View: A list of malloc errors, their numbers of occurrences, and, if selected, their corresponding call stacks within the time interval

    Leak View: A list of specific leaks, their sizes and numbers of occurrences, and, if selected, their corresponding call stacks within the time interval

    Heap View: A generalized view of heap memory within the time interval

    Source View: The ProDev WorkShop text editor window, showing source code annotated with the performance data collected

    Working Set View: The instruction coverage of the dynamic shared objects (DSOs) that make up the executable, showing the instructions, functions, and pages that were not used within the time interval

    Butterfly View: The callers and callees of designated functions

    MPI Stats View (Graphs): Various MPI statistics, shown as graphs

    MPI Stats View (Numerical): Various MPI statistics, shown as text

    Cord Analyzer: Although not actually part of the Performance Analyzer, the Cord Analyzer works with data from Performance Analyzer experiments. It lets you arrange functions in different orders to determine the effect on performance.

The following table shows where to find more information about the Performance Analyzer in the ProDev WorkShop: Performance Analyzer User's Guide.

Table 4-2. Performance Analyzer Details



General Performance Analyzer information: Chapter 1, "Introduction to the Performance Analyzer"

General tutorial: Chapter 2, "Performance Analyzer Tutorial"

Setting up experiments: Chapter 3, "Setting Up Performance Analysis Experiments," for details, and the Chapter 4 heading "Selecting Performance Tasks" for a summary

Setting sample traps: Chapter 3, "Setting Sample Traps" subsection

Main View: Chapter 4, "The Performance Analyzer Main Window" subsection

Usage View (Graphs) window: Chapter 4, "The Usage View (Graphs) Window" subsection

Watching an experiment using the Process Meter: Chapter 4, "The Process Meter Window" subsection

Tracing I/O calls using the I/O View window: Chapter 4, "The I/O View Window" subsection

Call Graph View window: Chapter 4, "The Call Graph View Window" subsection

Finding memory problems: Chapter 4, "Analyzing Memory Problems" subsection

Call Stack View window: Chapter 4, "The Call Stack Window" subsection