Chapter 4. Monitoring System Performance

This chapter describes the performance monitoring tools available in Performance Co-Pilot (PCP). This product provides a group of commands and tools for measuring system performance. Each tool is described completely by its own man page. The man pages are accessible through the man command. For example, the man page for the tool pmchart is viewed by entering the following command:

man pmchart

The following major sections are covered in this chapter:

Further monitoring tools covering performance visualization and automated reasoning about performance are described in Chapter 5, “System Performance Visualization Tools” and Chapter 6, “Performance Metrics Inference Engine”.

The following sections describe the various graphical and text-based PCP tools used to monitor local or remote system performance.

Other tools and commands available in PCP are described in Chapter 3, “Common Conventions and Arguments”, in the Performance Co-Pilot User's and Administrator's Guide.

The pmchart Tool

The pmchart utility supports interactive selection and plotting of trends over time for arbitrarily selected performance metrics from one or more hosts and one or more domains of performance metrics. To start it, enter the following command:

pmchart 

You then see the Performance Co-Pilot Chart window shown in Figure 4-1.

Figure 4-1. pmchart Performance Co-Pilot Chart Window

Normally, pmchart operates in live mode where performance metrics are fetched in real time and plotted against a time axis. The user can choose performance metrics and monitor the current values for these metrics from any host that is accessible on the network and has the PMCD server running.

When launched with the -a command line option, pmchart can also replay PCP archive logs of performance metrics created by pmlogger.

The man page for pmchart explains how to configure charts based on performance metrics, using either the Open View... or the New Plot... option of the File menu. Once charts have been configured and applied, they are placed in an expanded Performance Co-Pilot Chart window, as shown in Figure 4-2.

Figure 4-2. Two Charts and Metrics from Three Hosts in pmchart

All metrics in the Performance Metrics Name Space (PMNS) with numeric value semantics can be graphed. By default, pmchart initially allows the user to select metrics to be plotted from the local host. However, the graphical user interface allows other hosts or archives to be chosen at any time as alternate sources of performance metrics and all metrics (independent of their source) are plotted on a common time axis.

For horizontal lines at major tick marks, see “Displaying Horizontal Lines”.

The -h command line option nominates an alternate default host to be used in preference to the local host.

The -a command line option may be used to start pmchart processing performance metrics from one or more PCP archive logs. The first named archive becomes the default source of performance metrics. This mode is particularly useful for retrospective comparisons and for postmortem analysis of performance problems, where a remote system is not directly accessible or a performance analyst is not available on site.

The pmchart utility examines the semantics of selected metrics and, where sensible, uses the metadata provided by the Performance Metrics Collection Subsystem (PMCS) to convert fetched metric values to a rate before plotting. Where different metrics are plotted in the same chart (that is, against a common Y-axis), the metrics must have the same dimension (taking into account any automatic rate conversion), but pmchart may scale metric values where necessary to produce comparable values with common units and scale.
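
The rate conversion and scaling described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the pmchart implementation: the function names, the scale table, and the sample values are all hypothetical.

```python
def counter_to_rate(prev_value, prev_time, curr_value, curr_time):
    """Convert two samples of a cumulative counter into a per-second rate."""
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (curr_value - prev_value) / elapsed


def to_common_scale(value, unit, scale_table):
    """Rescale a value to the base unit so plots on one Y-axis are comparable."""
    return value * scale_table[unit]


# Hypothetical scale table for metrics that share the dimension "bytes".
BYTE_SCALES = {"byte": 1, "Kbyte": 1024, "Mbyte": 1024 * 1024}

# Two samples of a hypothetical cumulative counter (in Kbytes), 2 seconds apart.
rate_kbytes = counter_to_rate(1000, 10.0, 1500, 12.0)  # Kbytes per second
rate_bytes = to_common_scale(rate_kbytes, "Kbyte", BYTE_SCALES)
print(rate_kbytes, rate_bytes)
```

A metric whose semantics are already instantaneous (for example, a load average) would be plotted as fetched, with no rate conversion.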

When replaying archive logs, the user may interactively control the current replay time, direction of replay, and replay rate, using the PCP time control dialog.

Mouse Controls

The pmchart tool uses the mouse buttons as follows:

Left 

The primary mouse button may be used to select the current chart by clicking anywhere in a specific chart. The current chart always has a border drawn around the graph area and its legend of metric names rendered in red. The Edit menu contains a variety of choices that operate only on the current chart. This mouse button also interacts with menus and dialog boxes in the usual manner.

Middle 

The middle mouse button is unused.

Right 

The secondary mouse button may be used to display metric values in a dialog box. Click this mouse button in the graph drawing area of any chart to display information about the nearest metric and its value at that point as plotted. The Metric Value Information dialog box remains visible until you dismiss it, and can be refreshed with new metric values by clicking this mouse button again, or updated automatically using the Show most Recent toggle button.

pmchart Select Performance View

A view in pmchart is a predefined collection of charts, typically constructed to display some common performance scenario. Default views are included in the PCP distribution, others are part of the various PCP add-on products, and others may be created by the pmchart end user. The Open View... option in the File menu launches a Select Performance View dialog box similar to Figure 4-3.

Figure 4-3. pmchart Select Performance View Dialog

You may use this dialog to select one of the available views. The default PCP views include the following:

BufferCache 

Cumulative amount of data read and written between system buffers and user memory or block devices.

CPU 

Processor utilization (user, system, memory break, interrupt, I/O wait, and idle time) aggregated over all CPUs.

NUMAlinks 

Usage of NUMAlink node connectors, if this hardware is present.

Disk 

Cumulative number of read and write transfers for all disk devices.

DiskCntrls 

Cumulative number of read and write transfers for all drives attached to each disk controller on the system.

FileSystem 

Percentage of each filesystem in use (percent full).

LoadAvg 

System load averaged over intervals of 1, 5, and 15 minutes.

Memory 

Memory used by the kernel, filesystems, user processes, and free space.

NetBytes 

Network interface activity: octets transmitted on various interfaces.

NetConnDrop 

TCP drops, connection drops, timeout drops, and TCP accepts.

NetPackets 

Rate of TCP and UDP packets received and sent.

NetTCPcongestion 

TCP packets retransmitted, retransmit timeouts, and TCP packets sent.

NFS2, NFS3 

Client and server NFS operation rates.

Overview 

Composite charts of CPU, LoadAvg, Memory, Disk, and NetBytes.

Paging 

Page-in and page-out rates from the virtual memory subsystem.

PMCD 

Message rates and CPU time used by PMCD or associated PMDAs.

Swap 

System swap space allocated, reserved, and unused.

Syscalls 

Rate of exec, fork, read, write, and total system calls.

You can create your own custom views using the metric selection facilities, and save your views for later using the Save View... option of the File menu.

Displaying Horizontal Lines

You can have pmchart display horizontal lines, usually in a lighter background color, at major tick marks by calling pmchart with the following arguments (quotes required):

% pmchart -xrm "PmChart*xrtYGridUseDefault: True" 

For greater convenience, you can place the following line in your $HOME/.Xresources file, to have pmchart always display horizontal lines:

PmChart*xrtYGridUseDefault: True

pmchart Metric Selection

The pmchart Metric Selection window, shown in Figure 4-4, allows interactive navigation of the Performance Metrics Name Space (PMNS) to create new chart configurations.

Figure 4-4. pmchart Metric Selection Dialog

You can choose metrics, display information about metrics, change the current host or archive, select metric instances, and plot metric values on a common time axis. You bring up this window by choosing New Plot... from the File menu of pmchart.

Metric selection proceeds by navigating through the tree-structured PMNS. If you enter a partial metric specification in the Path field in the Metric Selection dialog, you can avoid having to navigate through the PMNS for the metrics you need. For example, if you enter network.interface, the window changes dynamically, as shown in Figure 4-5.

Figure 4-5. Further Metric Selection

You can continue the selection process by choosing non-leaf nodes from the Nodes list, and finally a leaf node from the Metrics list. At this stage, the Path corresponds to a leaf node in the PMNS, as shown in Figure 4-6.

Figure 4-6. Selecting a Leaf Node in the PMNS (Performance Metric)

Once a metric has been selected, the Info button in the Metric Selection dialog launches the Metric Information dialog, as shown in Figure 4-7.

Figure 4-7. Metric Information Dialog

This dialog displays the name, unit, and semantics for the currently selected metric, along with the verbose help text that describes the metric, and optionally a description of the underlying instance domain.

Finally, you may have to select from several instances of a metric. In the example shown in Figure 4-7, you wish to monitor the input packet rate for some network interface(s). For the current source of performance metrics, there are two network interfaces configured. You must select one or more instances, as shown in Figure 4-8.

Figure 4-8. Selecting a Metric Instance

You can select multiple instances either by clicking and dragging up and down the list with the left mouse button, or by selecting the first instance and then using the Shift key (or Ctrl key) with the left mouse button to select one or more other instances.

Creating a PCP Archive from a pmchart Session

From the File menu of pmchart when running in live mode, the Record (Stop Recording) option may be used to start (or stop) the creation of a PCP archive log. The archive log is created using pmlogger and includes the update interval and all of the performance metrics in the current pmchart configuration when recording begins.


Note: Any changes made to the pmchart configuration after recording has been started will not be reflected in the archive log. For these to take effect, the recording must be stopped and restarted (thereby creating a second PCP archive log).

When recording is started, a File Chooser dialog is launched, and the user must provide the name of a new file to be used as the PCP archive folio for the new archive. The recording session produces multiple files in the same directory as the archive folio.

If necessary, pmchart creates directories on the path to the named archive folio.

It is often convenient to maintain one directory for each new folio, or else one directory for each group of folios related by collector host(s), service type, or chart selection.

When recording is active, a small red indicator appears in the time control button, as shown at the bottom left of Figure 4-9.

Figure 4-9. pmchart Display When Recording

If you choose File > Stop Recording, logging stops immediately. The red light in the lower left turns gray.

To start recording again, choose File > Record and specify a new archive folio name.

If you exit pmchart by choosing File > Quit, an Archive Recording Session-pmchart dialog similar to that shown in Figure 4-10 appears to remind you where the archive folio was created, and to confirm that recording should be terminated.

Figure 4-10. Archive Recording Session-pmchart Dialog

If you select Yes, recording stops immediately.

If you select No, recording continues. This is a useful way to continue archive logging without keeping pmchart active.

Changing pmchart Colors

When using a video projector, or when making presentations to a large group, or as a result of personal preference, the default pastel color scheme used by pmchart may be inappropriate.

The Colors option in the Edit menu allows arbitrary changes to the colors of individual charts. For more global changes, you can override the defaults using the X11 resources that pmchart honors.

For example, create or add the following entries in the $HOME/.xrdb file:

PmChart*xrtForegroundColor: "green"
PmChart*xrtBackgroundColor: "black"
PmChart*xrtGraphForegroundColor: "rgb:00/b0/00"
PmChart*xrtGraphBackgroundColor: "black"
PmChart*xrtHeaderForegroundColor: "green"
PmChart*xrtHeaderBackgroundColor: "black"
PmChart*pmDefaultColors: rgb:ff/ff/00 rgb:00/ff/00 rgb:00/00/ff \
                           rgb:ff/ff/00 rgb:00/ff/ff rgb:ff/00/ff

Use the following command to change the default color scheme for pmchart to one with bright primary colors on a black background:

xrdb -merge $HOME/.xrdb 

Other Chart Customizations

The pmchart Edit menu provides options and a dialog that you may use to change and customize the display as follows:

Chart Style 

Chooses from line, bar, stacked bar, area plot, and utilization.

Chart Title and Legend... 

Changes the chart title, and enables or disables the legend annotation at the top of each chart.

Y-Axis Scaling... 

Fixes the maximum and minimum values of the range on the Y-axis, or allows pmchart to adjust the range dynamically to reflect currently displayed values.

Colors... 

Customizes plot colors.

Delete 

Deletes all charts, a complete chart, or individual plots from a chart, according to your selection.

The pmchart Options menu provides another option for customizing the display:

Visible Points... 

Uses the slider to change the number of values along the time axis.

Time Control

The Options menu provides access to the PCP Time Control Dialog.

Show Time Control 

Exposes the dialog for the controlling pmtime instance, thereby allowing users to change the sampling interval.

Selecting the Time Control button in the lower left corner of the main pmchart window also exposes the Time Control dialog. If the current source of performance metrics is one or more PCP archive logs, this same dialog may be used for temporal navigation within the archive(s).

New Time Control 

Detaches pmchart from the controlling pmtime instance and launches a new pmtime instance, initially dedicated to this pmchart.

Launch New Pmchart 

Starts a new pmchart, with shared pmtime control.

Taking Snapshots of pmchart Displays and Value Dialogs

The Print option in the File menu enables the current pmchart display to be printed in a variety of PostScript styles. The output can be saved in a file or sent directly to a printer.

The -o option for pmchart also provides the facility to produce Graphics Interchange Format (GIF) image snapshots of the pmchart display.

It is often convenient to publish performance summary information for the users of a particular computing environment. The pmchart tool, in combination with the pmsnap script and its associated control files, can be used to produce high-quality performance summary snapshot images in GIF format. These images can be incorporated into Web pages, reports, e-mail, or presentation material.

The following files and utilities are included in support of this feature:

/var/pcp/config/pmsnap/Summary
 

This file contains a summary of the performance metrics used in the example snapshot.

/var/pcp/config/pmsnap/Summary.html
 

An example HTML page suitable for publishing images from the Summary pmsnap example via a Web server.

/var/pcp/config/pmsnap/control
 

This file controls the snapshot parameters.

/var/pcp/config/pmlogger/config.Summary
 

This configuration file specifies an archive log suitable for use with any pmview-type tool, and the example Summary snapshot configuration.

/usr/pcp/bin/pmsnap
 

The pmsnap script is designed to be periodically run by the cron command to process the control file /var/pcp/config/pmsnap/control and generate snapshot images according to the specifications therein. The pmsnap(1) man page describes the command line options for selecting the control lines to process, the default directory for the output files, the X display to use, and other parameters.

Instructions for configuring pmsnap are in the man page. There is also a verbose comment at the head of the control file. The pmchart(1) man page is also useful.

More Information

The annotated examples in the pmchart chapter of the PCP Tutorial provide a guided illustration of a typical user's interactions with pmchart. The PCP Tutorial can be optionally installed as the pcp.man.tutorial subsystem. To view the pmchart chapter of the tutorial, open the following URL with your Web browser:

file:/var/pcp/Tutorial/pmchart.html

The pmgadgets Command

The pmgadgets tool creates a small window containing a collection of graphical gadgets driven by performance metrics supplied by the PCP framework. Any numeric metric supported by PCP can be displayed.


Note: In the current PCP release, pmgadgets is constrained to process performance metrics from real-time sources (and not PCP archive logs), although metrics from several different hosts may be displayed simultaneously in the same window.

The layout of the gadgets and the performance metrics that lie behind them are specified in a configuration file, and pmgadgets is typically run on an existing configuration file or in conjunction with an application that automatically generates a configuration file. For example, pmgsys generates a configuration file for various IRIX performance metrics and feeds it to pmgadgets. The resulting display depends on the host configuration, but the display shown in Figure 4-11 is representative of a system with four CPUs, eleven disks on three controllers, and four network interfaces.

Figure 4-11. Representative pmgadgets Display Using pmgsys

Other pmgadgets front-end tools such as pmgcluster, pmgevctr, pmgcisco, and pmgshping are not described in this chapter. For information about these tools, see the pmgcluster(1), pmgevctr(1), pmgcisco(1), and pmgshping(1) man pages.

The pmgadgets tool displays much the same information as pmchart, but more compactly, and with less historical information.

The pmgadgets specification language provides the ability to define the following gadgets and components:

_bar 

Displays a single performance metric value as a rectangle. This rectangle is filled from left to right or bottom to top in proportion to the ratio of the metric's value to some maximum.

_multibar 

Is similar to the bar gadget, but displays several performance metrics at the same time (same as stacked bar). Each is allocated a color and the gadget's rectangle is filled with an amount of each color proportional to the ratio of the corresponding performance metric's contribution to a maximum value.

_bargraph 

Displays a simple xload style strip chart of a performance metric's values over time.

_led 

Is a circular gadget whose color is modulated using the value of a single performance metric.

_line 

Is a solid rectangle, not modulated by any performance metric, useful for highlighting connectivity between gadgets.

_label 

Provides textual annotation in the display.

_actions 

Provides customized menus of “drill-down” actions that may be associated with any gadget. Using the right mouse button over a visible gadget causes any associated action menu to pop up.

_colorlist 

Provides a list of X11 color specifications.

_legend 

Provides the association between color and range of performance metric values for use in a _led gadget.

Each visible gadget must be assigned a Cartesian position in the pmgadgets display.
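
The proportional fill used by the _bar and _multibar gadgets can be illustrated with a short Python sketch. It is not taken from pmgadgets itself: the function names are hypothetical, a positive maximum is assumed, and clipping of values that exceed the maximum is an assumption about how an overfull gadget would be drawn.

```python
def bar_fill(value, maximum):
    """Fraction of a _bar rectangle to fill for one metric value."""
    if maximum <= 0:
        raise ValueError("this sketch assumes a positive maximum")
    # Clip to the range [0, 1] so the fill never leaves the rectangle.
    return min(max(value / maximum, 0.0), 1.0)


def multibar_fill(values, maximum):
    """Fraction of a _multibar rectangle given to each metric, in order."""
    if maximum <= 0:
        raise ValueError("this sketch assumes a positive maximum")
    fractions = []
    remaining = 1.0  # space still unfilled in the rectangle
    for value in values:
        fraction = min(max(value / maximum, 0.0), remaining)
        fractions.append(fraction)
        remaining -= fraction
    return fractions


# Three hypothetical CPU components against a maximum of 100.
print(bar_fill(25.0, 100.0))
print(multibar_fill([20.0, 10.0, 5.0], 100.0))
```

Each fraction would be drawn in the color at the same position in the gadget's color list.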

By way of an example, the pmgadgets specification shown in Example 4-1 includes CPU, disk, and load average information from two hosts, and produces a customized pmgadgets display like the one shown in Figure 4-12.

Example 4-1. Specification File for pmgadgets

_colourlist cpuColours (blue3 red3 yellow3 cyan3 green3)
_legend diskLegend (
       _default  green3
       15        yellow
       40        orange
       75        red
)
# host moomba
_label 70 12 "moomba"
_multibar 5 5 30 6
       _metrics (
           moomba:kernel.all.cpu.user
           moomba:kernel.all.cpu.sys
           moomba:kernel.all.cpu.intr
           moomba:kernel.all.cpu.wait.total
           moomba:kernel.all.cpu.idle
       )
       _maximum 0.0
       _colourlist cpuColours
_bargraph 40 5 25 20
       _metric moomba:kernel.all.load["1 minute"]
       _max 1.0
_led 12 16 6 6
       _metric moomba:disk.all.read _legend diskLegend
_led 25 16 6 6
       _metric moomba:disk.all.write _legend diskLegend
# host gonzo
_label 70 39 "gonzo"
_multibar 5 32 30 6
       _metrics (
           gonzo:kernel.all.cpu.user
           gonzo:kernel.all.cpu.sys
           gonzo:kernel.all.cpu.intr
           gonzo:kernel.all.cpu.wait.total
           gonzo:kernel.all.cpu.idle
       )
       _maximum 0.0
       _colourlist cpuColours
_bargraph 40 32 25 20
       _metric gonzo:kernel.all.load["1 minute"]
       _max 1.0
_led 12 43 6 6
       _metric gonzo:disk.all.read _legend diskLegend
_led 25 43 6 6
       _metric gonzo:disk.all.write _legend diskLegend
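
To show how a legend such as diskLegend might map a metric value to an LED color, the following Python sketch uses the thresholds from Example 4-1. The lookup rule (the color of the highest threshold that the value has reached, otherwise the default) is an assumption made for illustration, and the names DISK_LEGEND and led_color are hypothetical. See the pmgadgets(1) man page for the exact rule.

```python
# Thresholds copied from the diskLegend definition in Example 4-1.
DISK_LEGEND = {
    "default": "green3",
    "thresholds": [(15, "yellow"), (40, "orange"), (75, "red")],
}


def led_color(value, legend):
    """Pick the LED color for a metric value (assumed lookup rule)."""
    color = legend["default"]
    # Walk the thresholds in ascending order; the last one reached wins.
    for threshold, name in sorted(legend["thresholds"]):
        if value >= threshold:
            color = name
    return color


for sample in (3, 15, 50, 200):
    print(sample, led_color(sample, DISK_LEGEND))
```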


Figure 4-12. Customized pmgadgets Display

In addition to the drill-down capabilities of pmgadgets, positioning the cursor over a gadget and entering a space character causes an information dialog to be exposed. This dialog tracks the current values of the performance metrics that are associated with the gadget as illustrated by the pmgadgets information dialog in Figure 4-13.

Figure 4-13. pmgadgets Dialog

The pmgadgets(1) man page provides a complete description of the gadget specification language and the user interface controls of pmgadgets.

The pmdumptext Command

The pmdumptext command displays performance metrics in ASCII tables, suitable for export into databases or report generators. It is a flexible command. For example, the following command provides continuous memory statistics on a host named serv:

pmdumptext -imu -h serv -f '%H:%M:%S' mem.util
Metric          kernel  fs_ctl  _dirty  _clean    free    user
Units                b       b       b       b       b       b
20:14:28        99.14M   6.03M   0.85M  98.42M   0.17G   0.16G

See the pmdumptext(1) man page for more information.
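
Because the output is plain whitespace-separated columns, a report generator can read it with very little code. The Python sketch below parses the sample output shown above. It assumes that the time format contains no spaces, that the first two lines are the Metric and Units headers, and that the M and G suffixes are multiples of 1024; the last point is an assumption rather than documented behavior. The names parse_value and parse_report are hypothetical.

```python
# Assumed multipliers for the scaled values printed by pmdumptext.
SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

SAMPLE = """\
Metric          kernel  fs_ctl  _dirty  _clean    free    user
Units                b       b       b       b       b       b
20:14:28        99.14M   6.03M   0.85M  98.42M   0.17G   0.16G
"""


def parse_value(text):
    """Turn a value such as '99.14M' into a plain number of bytes."""
    if text[-1] in SUFFIXES:
        return float(text[:-1]) * SUFFIXES[text[-1]]
    return float(text)


def parse_report(text):
    """Return one dictionary per data row, keyed by column name."""
    lines = text.splitlines()
    names = lines[0].split()[1:]  # drop the leading "Metric" label
    rows = []
    for line in lines[2:]:  # skip the Metric and Units header lines
        fields = line.split()
        if not fields:
            continue
        row = {"time": fields[0]}
        for name, value in zip(names, fields[1:]):
            row[name] = parse_value(value)
        rows.append(row)
    return rows


rows = parse_report(SAMPLE)
print(rows[0]["time"], rows[0]["free"])
```

The resulting dictionaries can be passed to a database driver or written out with the standard csv module.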