Chapter 6. Fine-tuning Events with PEP

EnlightenDSM has two powerful tools for fine-tuning and managing networks. The Programmable Events Processor (PEP) is used to centrally manage Events so that it reports only on pre-determined issues.

The Enterprise Management Database (EMD) is an application manager in charge of all communication between the database and other EnlightenDSM components.

This chapter describes how to use these two tools.

Dispatching and Managing New Events

EnlightenDSM has a Programmable Events Processor (PEP) for centrally dispatching and managing new Events. Use PEP to have Events report only on issues you want to see. PEP can also communicate with external applications, such as Remedy.

Applications can inform PEP of noteworthy events by using a Remote Procedure Call (RPC)-based Application Programming Interface (API). PEP then performs whatever actions have been previously defined for that particular event. PEP can also log these events in its common logging facility.

PEP is implemented in EnlightenDSM as a daemon. There must be at least one instance of PEP per network, but no more than one instance per host machine. Most installations of EnlightenDSM will use a single PEP daemon, or a primary and secondary PEP daemon.

The diagram in Figure 6-1 shows how PEP interacts with the other main components of the EnlightenDSM package.

Figure 6-1. PEP interaction with EnlightenDSM components


Functionality

PEP plays a central role in determining how the parts of EnlightenDSM work. PEP can do the following:

  • Accept event notifications via an RPC-based API.

  • Write events to the common log via the Enterprise Management Database (EMD).

  • Determine how to dispatch the event (using a rule-based policy engine).

  • Dispatch events.

  • Manage multiple Status Map GUIs reflecting the event activity.

  • Manage the actions between GUIs; for example, acknowledging an event on one GUI causes it to be acknowledged on all other GUIs.

  • Forward events to another PEP (see Figure 6-1), allowing scalability for large networks.

  • Help you query for an event's relationship to pools.

Policy Engine

PEP uses a rule-based dispatch table as its policy engine. Events are dispatched based on an ordered matching scheme. You can also add rules to define when this matching stops.

Each incoming event is matched against all defined rules (unless a stopmatch command is encountered). Events can be matched based on the following fields:

  • Actual event ID (test name)

  • Host where the event was generated

  • Severity level of the event

  • Application class of the event

  • Application type

  • Time the event occurred

  • The description of the event
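The ordered matching scheme described above can be modeled in a few lines of Python. This is only an illustrative sketch of the behavior, not PEP code: the Rule class, the dispatch function, and the STOPMATCH sentinel are hypothetical names introduced here, and events are represented as plain dictionaries.

```python
from dataclasses import dataclass
from typing import Callable, List

STOPMATCH = "stopmatch"  # hypothetical sentinel standing in for the stopmatch command


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]   # predicate over the event fields
    action: Callable[[dict], object]  # returns STOPMATCH to end matching


def dispatch(event: dict, rules: List[Rule]) -> List[str]:
    """Try every rule in definition order; stop early when a rule asks to."""
    fired = []
    for rule in rules:
        if not rule.matches(event):
            continue
        fired.append(rule.name)
        if rule.action(event) == STOPMATCH:
            break
    return fired


rules = [
    Rule("log_all", lambda e: True, lambda e: None),
    Rule("severe_only", lambda e: e["severity"] == "Severe", lambda e: STOPMATCH),
    Rule("catch_rest", lambda e: True, lambda e: None),
]

print(dispatch({"severity": "Severe"}, rules))  # ['log_all', 'severe_only']
print(dispatch({"severity": "Info"}, rules))    # ['log_all', 'catch_rest']
```

Because rules are scanned in order, a rule that stops matching should be placed after any rule that must always run, such as a logging rule.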

Actions

Based on the policy you define (or the defaults), PEP can take one or more of the following actions:

  • Send e-mail

  • Send notification to the Status Map

  • Start a process you defined and run it on any managed host

  • Log the event

  • Forward the event to another instance of PEP (via RPC)

  • Query for the relationship to a pool

Status Map GUI Dispatching

PEP maintains connections with all running GUIs to allow the inter-operability between them. This includes:

  • Maintaining a list of current on-line GUIs

  • Dispatching events to all GUIs

  • Marshalling events' acknowledgments from the acknowledging GUI to all other connected GUIs

Policy Definition

The policy definition is a rules-based system contained in an ASCII file, so rules can be defined and changed dynamically as the system grows. Rules are scanned in the order they are defined.

Rules-Based Language

Using this rules-based method, new events can be generated and complex relationships between events can be created. The policy engine supports the following elements:

  • Complex, nested if/then/else syntax

  • Variables (local, global, static, and dynamic)

  • Timers with one-second granularity

  • Pattern matching on string fields

  • Time of day, day of week, date matching

  • Modifying the elements of the current event

  • Querying for pool relationships

Timers can be used to set up rules based on multiple occurrences of an event within a given time period. Portions of an event can also be modified; this effectively creates a new event from an existing one, which can then be further dispatched (see “Modifying Events”).
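The idea of acting on multiple occurrences of an event within a given time period can be sketched in Python as follows. This models the logic only; it uses plain timestamps in seconds rather than PEP timers, and the OccurrenceWindow class is a hypothetical name that is not part of EnlightenDSM. The sketch assumes a positive window length.

```python
from collections import deque


class OccurrenceWindow:
    """Report when `limit` occurrences arrive within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.times = deque()

    def record(self, timestamp: int) -> bool:
        """Record one occurrence; return True once the limit is reached."""
        self.times.append(timestamp)
        # Forget occurrences that have fallen out of the time window.
        while timestamp - self.times[0] >= self.window_seconds:
            self.times.popleft()
        return len(self.times) >= self.limit


window = OccurrenceWindow(limit=3, window_seconds=60)
print([window.record(t) for t in (0, 10, 100, 110, 120)])
# [False, False, False, False, True]
```

The first two occurrences expire before a third arrives, so only the burst at 100, 110, and 120 seconds reaches the limit.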

Common Logging

All data logs go directly into the EMD. All event logs go through PEP for additional dispatching.

Manually Killing/Invoking PEP

Occasionally, you may want to manually kill or invoke PEP. This is necessary, for example, when bringing down the system for maintenance or upgrading to a new release of EnlightenDSM.

The rest of this section details how to manually kill and restart PEP.

Terminating PEP

The following CLI command terminates the pep daemon:

kill -2 <pid#>

This command sends an interrupt to the pep daemon, allowing it to shut down gracefully.

Starting PEP

The following CLI command invokes the pep daemon:

./pep &

This command starts the daemon in the background. If you have changed the pathname for the rules.txt file (from $ENLIGHTEN/contrib/rules.txt), you need to use the following command instead:

./pep -f yourpathname/contrib/rules.txt &

where yourpathname shows the full pathname to the rules.txt file.

Program Start-up

When the program is started, a “startup” event is passed through the system. The event's name is set to the constant EventInit. This event should be used to explicitly initialize data such as timer states.

Program Layout

PEP programs must adhere to a certain layout. The general layout is:

# Comments start with “#” and
# Terminate at the end of the line
#
Global variable declarations
Timer declarations

Rule declaration # Comments can start anywhere
   Local Variable Declarations
   Program
;

Rule declaration
   Local Variable Declarations
   Program
;

More specifically:

  • The program allows comments to begin anywhere on the line and extend all the way to the end of the line.

  • Any global data must be declared before any rules are defined.

  • Timers are implicitly global. They must come after any global variable declarations.

  • Rules are terminated with a semicolon (;). You can use as many rules as necessary.

Program Execution

The engine program is executed every time PEP receives an event. External processes, such as SysAdmin agents or Events agents, can send an event. PEP can also generate events itself by using timers (see “Timers” for more details).

There is a special event called a start-up event that gets generated internally at start-up. You should use this event to initialize timers.

Event Properties

Anything that is sent to PEP, most commonly by AgentMon in the form of an alarm, is considered an event.

PEP itself can generate events to itself in the form of:

  • PEP Startup where the event.name = EventInit

  • One for every timer expiration

Events are identified, inspected, and modified by their properties. Each property of a non-timer event is referenced by the keyword event, followed by the dot (.) operator and the property name, as in:

              event.property

where property is one of the properties described in the table below.

Timer events can be identified by testing the event by the timer identifier. See Timer Declarations later in this chapter.

Constants

Event constants are values for some event properties. You can use constants to compare event properties and timer states. Constants are strings and integers that are preset to a certain state, depending on the event. All constants begin with an uppercase letter. The table below contains constants for those event properties that use them.

Properties:

event.name

Data type: string

Name of the test that generated the event. Test names are described in Chapter 10 of the EnlightenDSM Reference Manual.

event.time

Data type: date/time

Time when event was generated in system local time

event.hostname

Data type: string

Name of the host that caused the event

event.severity

Data type: constant

Severity level of the event. Can be tested against a constant in either its string or numeric representation.

     Constant    Numeric Value
     OK          1
     Info        2
     Warning     3
     Error       4
     Severe      5

event.description

Data type: string

Descriptive text string about the event. Consists of the following components:

testname measured-value threshold-type threshold-value timestamp

event.state

Data type: string

Threshold type such as:

  • low

  • high

  • +jump / -jump

  • +rate / -rate

  • Age

For Timer events, event states have one of the following constants:

  • Set

  • Expired

  • Cleared

event.threshold

Data type: string

The value of the threshold against which the measured value was compared resulting in an alarm.

event.measurement

Data type: string

The value that resulted in the alarm.

event.totalValStr 

Data type: string

Represents the total measured value for certain tests. The following tests support total value.

     Test          Subtest                totalValStr
     File System   blocks totalValue      Max filesystem size
                   inodes totalValue      Max number of inodes
     File          modified totalValue    File size in bytes
                   accessed totalValue    File size in bytes
     Swap          totalValue             Max swap space capacity
     Inventory     hardware totalValue    Total number of hardware items
                   software totalValue    Total number of software items
     CPU           idle totalValue        Total number of CPUs
                   user totalValue        Total number of CPUs
                   kernel totalValue      Total number of CPUs
                   wait totalValue        Total number of CPUs
     CPU Load      totalValue             Total number of CPUs
     File Clamp    totalValue             First line in the file to match the pattern searched for

event.appclass

Data type: constant

The application class which issued the event.

     Constant      Source
     EventClass    Events issued by AgentMon
     CLIClass      Events issued via EventsCLI (Events Command-line Interface)
     AdminClass    Unused

event.units

Data type: string

The unit type of the event as depicted in the Events Configure Window of the EnlightenDSM GUI.

Values such as:

  • Unit

  • Blocks

  • Inodes

  • Matches

event.application

Data type: constant

Application type that sent the event. Evaluate this property to determine the type of operation that triggered the event.

     Constant           Meaning
     EventAckOne        Event generated when a single event is acknowledged from the GUI.
     EventAckAll        Event generated when a “Host Acknowledge” function is performed from the GUI.
     EventDeletePool    Event generated when a host is deleted from a pool via the GUI.
     EventChangeMap     Event generated when the background image is changed in the Status Map.
     EventInit          Initialization event upon PEP startup.
     EventTimer         Event generated by PEP itself when a timer event changes state. To identify an event as a timer event, test event.application for the value EventTimer.


Modifying Events

Events can be modified by assigning values or constants to their event properties. A typical scenario is to raise the severity of an event based on the measured value contained in the event. Let's say we want a CPU Load alarm, which is configured with Error severity (level 4) and a High threshold of 50, to escalate to Severe if the CPU load reaches 75 or higher. A rule to this effect would look like this:

rule escalateCpuLoad
     if ( event.name == "cpu load" and event.measurement >= 75 )
     then
         event.severity = Severe
     endif
;

In the above example the event was configured to be event.severity=Error before it passed through PEP. As PEP inspects the actual measurement it changes the severity to the next severity level by using the constant Severe.

Any event property can be modified to effect a particular outcome of the event.
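For readers who want to trace the escalation logic outside PEP, the following Python sketch mirrors the escalateCpuLoad rule. It is an illustration under stated assumptions, not PEP code: the event is modeled as a dictionary, the SEVERITY mapping copies the severity table in this chapter, and the function name is hypothetical.

```python
# Severity constants and their numeric values, copied from the event.severity table.
SEVERITY = {"OK": 1, "Info": 2, "Warning": 3, "Error": 4, "Severe": 5}


def escalate_cpu_load(event: dict) -> dict:
    """Raise the severity to Severe when a CPU load alarm measures 75 or more."""
    # event.measurement is a string property, so convert it before comparing.
    if event["name"] == "cpu load" and float(event["measurement"]) >= 75:
        event["severity"] = "Severe"
    return event


alarm = {"name": "cpu load", "measurement": "80", "severity": "Error"}
escalate_cpu_load(alarm)
print(alarm["severity"], SEVERITY[alarm["severity"]])  # Severe 5
```

An alarm with a measurement below 75, or with a different test name, passes through unchanged.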

When evaluating and assigning values to variables and event properties using constants, do not enclose the constant keyword in quotes, as in:

     if ( event.severity == Error )
     then . . .

          or

     timer.state = Set

Data Types

The PEP language supports several different data types:

  • integer numbers

  • floating point numbers

  • strings

  • date/time

  • timers

  • events

  • identifiers

  • constants

Integer Numbers

The following are examples of valid integer declarations:

-1  32  42235523  09

The following are examples of invalid integer declarations:

+24  +2442533

Floating Point Numbers

The following are examples of valid floating point number declarations:

4.2   0.332   -.442  -2.44   -0.243   89.0

The following are examples of invalid floating point number declarations:

3.2e02  +5.2 
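The integer and floating point examples above suggest a simple grammar: an optional leading minus sign, no plus sign, and no exponent. The Python sketch below encodes that reading as two regular expressions. The patterns are inferred from the examples only and are an assumption, not PEP's actual parser; for instance, whether a value such as "4." is accepted is not stated in this chapter.

```python
import re

# Optional minus sign followed by digits; a leading plus sign is rejected.
INTEGER = re.compile(r"-?\d+")
# Optional minus sign, optional integer part, a dot, then at least one digit.
FLOAT = re.compile(r"-?\d*\.\d+")


def is_integer(text: str) -> bool:
    """Return True if text looks like a valid PEP integer literal."""
    return INTEGER.fullmatch(text) is not None


def is_float(text: str) -> bool:
    """Return True if text looks like a valid PEP floating point literal."""
    return FLOAT.fullmatch(text) is not None


for literal in ("-1", "09", "+24", "-.442", "89.0", "3.2e02", "+5.2"):
    print(literal, is_integer(literal), is_float(literal))
```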

Strings

Strings are anything in quotes ("") except the quote character itself. All string data types in PEP support regular expression matching, including wildcard matching. The following are examples of valid string declarations:

"Hello There"
"He said *"
"123"
"I can punctuate...,,,"

The following are examples of invalid string declarations:

"what "?"
" No end in site
was"sup

Date/Time

PEP supports partial time and date declarations. You can use the keyword timeofday to read the system's local time. Add the designator AM or PM to specify an absolute time; the engine uses this designator to distinguish between relative and absolute time references.

Dates are declared in U.S. notation and assume the local time of the host system. The notation is as follows:

month day year OR
month year

It does not support the European or Asian date notation of:

day month year OR
year month day

A few different styles of notations and separators are supported. The following are examples of valid date/time declarations:

12/95   12-95   12/3/1990   12-3-90
timeofday
Dec 4, 1992
Wed 9:00 PM
3 am
12:24:31 AM
Tuesday

The following are examples of invalid date/time declarations:

12:30
1995/12/3
1:1
15/7/96

Timers

Timers are special data types that are an extension of the date/time data type. Timers support relative and absolute time specifications. Relative time is either hr:min:sec or min:sec. The following are examples of valid timer declarations:

0:30   1:00:00   1:00
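To make the two relative formats concrete, the following Python sketch converts a relative time specification to a number of seconds. It assumes only what is stated above, namely that two fields mean min:sec and three fields mean hr:min:sec; the function name is hypothetical and this is not how PEP itself parses timers.

```python
def relative_seconds(spec: str) -> int:
    """Convert 'min:sec' or 'hr:min:sec' to a total number of seconds."""
    parts = [int(field) for field in spec.split(":")]
    if len(parts) == 2:
        parts = [0] + parts  # no hour field given
    elif len(parts) != 3:
        raise ValueError(f"expected min:sec or hr:min:sec, got {spec!r}")
    hours, minutes, seconds = parts
    return hours * 3600 + minutes * 60 + seconds


for spec in ("0:30", "1:00:00", "1:00"):
    print(spec, relative_seconds(spec))
# 0:30 30
# 1:00:00 3600
# 1:00 60
```

Under this reading, 1:00 is one minute, not one hour; an hour must be written with all three fields.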

Remember that timers support all date/time declarations as well. Timers can also be manipulated programmatically. They have two fields:

     Field name    Data type    Description
     state         string       Current state of the timer (Set, Cleared, Expired). See “Constants”.
     time          timer        Absolute or relative time of timer.


Identifiers

You can use identifiers to name variables, rules, and timers. They must begin with a letter and can contain letters, numbers, or underscores (_). The following are examples of valid identifiers:

a1234  bbc  ABC  A_1234

The following are examples of invalid identifiers:

1A  A-Z  bor%d 

As a general rule, you should use lowercase letters for identifiers, since constants start with uppercase letters. This will help avoid confusion between identifiers and constants.
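The identifier rule (a leading letter followed by letters, numbers, or underscores) can be checked with a short regular expression. The Python sketch below is an illustration of that rule as stated in this section; the function name is hypothetical, and it assumes "letter" means an ASCII letter.

```python
import re

# A letter first, then any mix of letters, digits, and underscores.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_identifier(name: str) -> bool:
    """Return True if name satisfies the identifier rule described above."""
    return IDENTIFIER.fullmatch(name) is not None


for name in ("a1234", "bbc", "ABC", "A_1234", "1A", "A-Z", "bor%d"):
    print(name, is_identifier(name))
```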

Constants

You can use constants to compare event subfields and timer states to preset states. Constants are actually strings and integers that are hardwired to a certain state. All constants begin with an uppercase letter. The following is a list of constants and their appropriate uses:

Events have one of these five severity types:

Okay 
Info 
Severe 
Warning 
Error

Events have one of these seven application types:

EventAckOne 
EventAckAll 
EventDeletePool
EventMoveIcon 
EventChangeMap 
EventInit 
EventTimer

Events have one of these three application classes:

EventClass 
AdminClass 
CLIClass

Timers have one of these three states:

Set 
Expired 
Cleared

Program Syntax

The PEP language consists of:

  • variables

  • timer declarations

  • rules declarations

  • expressions

  • implicit conversions

Variables

Variables can be either dynamic or static. All variables must be explicitly assigned and are implicitly typed:

variable static x = 2

This statement declares a static variable x of integer type and assigns it the value 2. The following are more examples of valid variable definitions:

variable static y = 4.2 + 44.2
variable cc = "Hello World"
variable backup_time = 10:30 pm
variable now = timeofday

A variable's scope is determined by where it is declared. Global variables are always declared before any rule declarations and are accessible in any rule. Local variables are always declared after the rule declaration and are accessible only to the rule that declares them. A local variable with the same name as a global variable hides the global one until the local variable goes out of scope.

Timer Declarations

You can use timers to set the time-in and time-out of events in PEP. Time-outs generate events themselves. Timer states are implicitly set to Cleared and must be initialized explicitly using the start-up event:

timer backup = 4:30 am
timer april_fools = Apr 1
timer one_minute = 0:60
timer one_a_day = 24:00:00

Rules Declarations

Rules are declared with the keyword rule, followed by a name that is a valid identifier. Rules must have a body:

rule myrule
  variable static count = 0
  if ( event.severity > 0 )
  then
    increment count
  endif
;

The rule myrule uses a local static variable to count the number of events that have a positive severity.

Expressions

Assignments

Assignments are performed by using the equal sign (=).

     variable or event property  = expression

where

variable is any locally or globally declared variable. See the Variables section in this chapter.

event property is any event property described under the Event Properties section in this chapter.

expression is one of the following objects:

  • Integer

  • Floating point value

  • String

  • Variable

  • Constant

  • Date/Time expression

  • Event property

  • Complex arithmetic operation

  • Implicit conversions

Integer - A value of up to 2^31 for signed integers and 2^32 for unsigned integers.

Floating point value - PEP supports floating point values to two digits to the right of the decimal point.

String - Any alpha-numeric sequence enclosed in quotes (“).

Variable - A valid variable type supported by PEP.

Constant - A valid constant supported by PEP. See Constants in the Event Properties section in this chapter. Constants are not enclosed in quotes.

Date/Time expression - Any date/time expression supported by PEP. See the Date/Time section in this chapter.

Event Property - Any event property described in the Event Properties section in this chapter.

Complex arithmetic operation - The following arithmetic expressions are supported:

+, -, /, *, %

increment - Increments a counter.

increment mycounter is the same as mycounter = mycounter + 1

decrement - Decrements a counter.

decrement mycounter is the same as mycounter = mycounter - 1

String Conversions

Several event properties are string objects. Even though some of these properties have a numerical nature, such as event.threshold, PEP retrieves and processes them as string values because event properties have different meanings among the different types of tests.

Therefore, it is necessary to convert these string properties to integers when performing any type of value comparison or calculation. Because PEP evaluates expressions from left to right, converting a string element to an integer depends on the ordering of the elements in the expression. The first element in the expression must be of integer type:

variable a = 1
variable b = event.threshold

newval = a * event.threshold

               or put more simply

newval = 1 * event.threshold

Date/Time Comparisons

Absolute date/time comparisons compare only the common elements defined. For example, variables a, b, c, and d are set as follows:

variable a = 12/19/95
variable b = 2 pm

variable c = Tuesday
variable d = timeofday

Assume the current time is “Tuesday Dec 19, 1995 2:25pm”. Comparing variable a, b, or c to variable d would evaluate to true because each comparison only compares the common date/time components. However, comparing a to b, b to c, or a to c would result in a false comparison since none share common components. If we added an additional variable e,

variable e = 2:25 pm

comparing e to b would result in a true expression since the only common components are the hour and am/pm designator.
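The comparison behavior described above can be modeled by treating each date/time value as a set of named components and comparing only the components that both values define. The Python sketch below is an illustration of that idea, not PEP's implementation: the component names are hypothetical, and only equality is modeled.

```python
def same_moment(left: dict, right: dict) -> bool:
    """Compare two partial date/time values on their shared components only."""
    common = left.keys() & right.keys()
    if not common:
        return False  # no shared components, so the comparison is false
    return all(left[key] == right[key] for key in common)


a = {"month": 12, "day": 19, "year": 1995}       # 12/19/95
b = {"hour": 2, "meridiem": "pm"}                # 2 pm
c = {"weekday": "Tuesday"}                       # Tuesday
d = {"month": 12, "day": 19, "year": 1995,       # timeofday at
     "weekday": "Tuesday",                       # Tuesday Dec 19, 1995 2:25pm
     "hour": 2, "minute": 25, "meridiem": "pm"}
e = {"hour": 2, "minute": 25, "meridiem": "pm"}  # 2:25 pm

print(same_moment(a, d), same_moment(b, d), same_moment(c, d))  # True True True
print(same_moment(a, b), same_moment(b, c), same_moment(a, c))  # False False False
print(same_moment(e, b))                                        # True
```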

Unsupported Conversions

The “string to time” conversion is not supported at this time.

Query Events using in pool

PEP supports simple queries. Use the following syntax:

host in pool abc_pool

This command searches the pool abc_pool recursively to see if the host is in that pool or its subpools.

This is useful when you set up your own pools and subpools. You can write a rule to check if an event came from a certain pool of hosts. For example, the following code fragment will check if an event came from the pool “important_pool”:

if ( event.hostname in pool "important_pool" )
then
  event -> mail "user"
endif
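The recursive search through a pool and its subpools can be pictured with the following Python sketch. The pool layout, host names, and function name are hypothetical test data, not EnlightenDSM structures, and the sketch assumes pools form a tree with no cycles.

```python
# Hypothetical pool layout: each pool lists its own hosts and its subpools.
POOLS = {
    "important_pool": {"hosts": {"hosta"}, "subpools": ["db_pool"]},
    "db_pool": {"hosts": {"hostb", "hostc"}, "subpools": []},
    "test_pool": {"hosts": {"hostz"}, "subpools": []},
}


def in_pool(host: str, pool: str, pools: dict = POOLS) -> bool:
    """Return True if host is in the pool or in any of its subpools."""
    entry = pools[pool]
    if host in entry["hosts"]:
        return True
    return any(in_pool(host, sub, pools) for sub in entry["subpools"])


print(in_pool("hostb", "important_pool"))  # True, found in subpool db_pool
print(in_pool("hostz", "important_pool"))  # False
```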

Implicit Conversions - PEP supports manipulating different data types by using implicit data type conversion. Unlike C, which converts to the highest data type, PEP converts based on left-to-right evaluations of expressions.

Integer and floating point comparisons

variable  a = 2
variable  c = 4.2
variable  i = 0

In this first expression, a + c is evaluated as an integer added to a float. Since the integer is the first operand, c is also converted to an integer. The temporary result is an integer with a value of 6, thus causing the if expression to evaluate to false.

if ( a + c > 6 )
then
  increment i
endif

In this second expression, c + a is evaluated as a float added to an integer. Since the float is the first operand, a is converted to a float. The temporary result is a float with a value of 6.2. This temporary float also causes the constant 6 in the if expression to be promoted to a float of 6.0, thus causing the if expression to evaluate to true.

if ( c + a > 6 )
then
  increment i
endif
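The left-to-right conversion rule can be imitated in Python by always converting the right operand to the type of the left operand. The sketch below reproduces the two comparisons above and the string conversion shown under String Conversions. It is a model of the described behavior only; the helper names are hypothetical, and how PEP converts a string such as "4.2" to an integer is not covered by this sketch.

```python
def coerce(left, right):
    """Convert right to the type of left, as in left-to-right evaluation."""
    return type(left)(right)


def add(left, right):
    return left + coerce(left, right)


def multiply(left, right):
    return left * coerce(left, right)


def greater(left, right):
    return left > coerce(left, right)


a, c = 2, 4.2
print(greater(add(a, c), 6))  # False: 2 + int(4.2) is 6, and 6 > 6 is false
print(greater(add(c, a), 6))  # True: 4.2 + 2.0 is about 6.2, compared with 6.0
print(multiply(1, "50"))      # 50: the string is converted to an integer
```

The order of operands therefore changes the result, which is why the first element of an expression should have the type you want the whole expression to use.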

Decisions

You can make decisions by using an “if/then/else/endif” expression in conjunction with comparison and logical operators (==, !=, <, >, <=, >=, and, or). An if expression must have a then clause and a terminating endif statement; the else branch is optional.

variable count = 0
variable myval = 1

if ( event.hostname == "a*" )
then
  increment count
  if ( count >= 2 and myval < 3 )
  then
    increment myval
  else
    decrement myval
  endif
endif

Use the stopmatch command to stop a rule's execution. When PEP encounters a stopmatch command, it will “break out” of the program and discontinue execution until the next event occurs.

Send Events

PEP's primary function is to direct events to an appropriate location or mechanism. PEP's logic, structure, and rule elements are designed to ultimately control, dispatch and otherwise suppress events. The control of events is referred to as Send Events.

Events can be sent to six separate targets:

  • The Enterprise Management Database (EMD)

  • Status Map

  • Mail recipient

  • A shell command

  • PEP at another host

  • An SNMP destination

The syntax for sending an event is:

event -> destination

where destination is one of the following:

gui - Send the event to the Status Map portion of EnlightenDSM's Graphical Interface.

log - Send the event to the Enterprise Management Database (EMD). Without this action, the event is not stored.


Note: It is possible to configure an event such that AgentMon will not forward alarms to PEP. In this case the event is, by default, forwarded to the EMD and the GUI by AgentMon. When PEP is employed for an event, the burden of forwarding the event to the EMD or log, as well as to the gui, is placed on PEP.

mail - Send the event to an e-mail recipient or recipient list where the recipient or list is enclosed in quotes (“).

event -> mail "recipient [recipient]"

exec - Send the event to a shell command, where the shell command and its arguments are enclosed in quotes (“). Any valid shell command or script can be used. Note that the shell's $PATH variable must include the path of the command or script. Since the shell is launched by PEP, it inherits the $PATH value of PEP's launching shell.

event -> exec "shell-command-or-script [arg1 ... argN]"

Environment Variables

When an event is sent by PEP to a destination, it will forward the event with all event properties. Particularly with exec, PEP will pass these properties to the shell in the form of environment variables. Here is a list of environment variables passed by PEP. Refer to the section Event Properties for details on these properties.

     Event Property    Corresponding environment variable
     event.description $ENL_DESCRIPTION
     event.time        $ENL_EVENTTIME
     event.hostname    $ENL_HOSTNAME
     event.measurement $ENL_MEASUREMENT
     event.severity    $ENL_SEVERITY
     event.state       $ENL_STATE
     event.name        $ENL_TESTNAME
     event.threshold   $ENL_THRESH
     event.units       $ENL_UNITS
     event.appclass    $ENL_APPCLASS
     event.application $ENL_APPLICATION
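A script launched through exec can read these variables from its environment. The Python sketch below shows one way a handler script might build a one-line summary from them. The variable names come from the table above; the script itself, the summarize function, and the output format are hypothetical, and the format of each variable's value is not specified in this chapter.

```python
import os

# Environment variables passed by PEP, taken from the table above.
ENL_VARIABLES = [
    "ENL_TESTNAME", "ENL_HOSTNAME", "ENL_SEVERITY",
    "ENL_MEASUREMENT", "ENL_THRESH", "ENL_UNITS",
]


def summarize(env=os.environ) -> str:
    """Build 'key=value' pairs for every ENL_ variable present in env."""
    fields = [
        f"{name[len('ENL_'):].lower()}={env[name]}"
        for name in ENL_VARIABLES
        if name in env
    ]
    return " ".join(fields)


if __name__ == "__main__":
    # When run by PEP through exec, this prints the properties of the event.
    print(summarize())
```

Passing the environment as a parameter keeps the function easy to try by hand, for example summarize({"ENL_HOSTNAME": "hosta"}).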

pep at hostname - Send the event to PEP at another host. Requires that PEP be running at the other host. All event properties are passed along with the event so that PEP's rules at the other host can be invoked and take action on the event. This is known as PEP re-direction. The host name must be enclosed in quotes (“).

event -> pep at "hostname"

PEP Engine Program Example

The following is an example of a PEP engine program to test the event e-mail process:

# Rule to log all events
#
rule log_all_events
    event -> log
;
#
# During off-hours, page the on-call operator
#
rule page_after_hours
    if ( event.severity == Severe )
    then
        # Off-hours: 8:00 am or earlier, or 5:00 pm or later
        if ( timeofday <= 8:00 am or timeofday >= 5:00 pm )
        then
            event -> exec "/opt/enlighten/scripts/PageOnCallOperator"
        endif
        event -> gui
    endif
;
#
# Uses the "in pool" option to determine if event should go to
# all admins or just certain ones.
#
rule dispatch_to_host_owner
    if ( event.hostname in pool SENIOR_POOL )
    then
        event -> mail "[email protected]"
        stopmatch
    else
        event -> mail "[email protected] [email protected]"
    endif
;
#
# Counts the number of icmp_error events and escalates the otherwise
# lower-severity event to Severe on every tenth occurrence.
#
rule escalateICMPErrors
    variable static error_count = 0
    if ( event.hostname == "myhost" and event.name == "icmp_error" )
    then
        increment error_count
        if ( error_count >= 10 )
        then
            event.severity = Severe
            event -> gui
            event -> mail "[email protected]"
            error_count = 0
            stopmatch
        endif
    endif
;

Enterprise Management Database

The Enterprise Management Database (EMD) is in charge of all communication between the database and other EnlightenDSM components. It consists of three components:

  • A database engine with an ANSI-compliant database

  • ODBC drivers (currently supporting only Informix)

  • A daemon, emdd, that handles all requests from other EnlightenDSM components

As an RPC-based server, the emdd communicates with the following EnlightenDSM components: Events agent, SysAdmin agent, Status Map, PEP, and the user interface.

The communication layer between emdd and the database is ODBC compliant. The drivers are Core and Level 1 API conformant and Minimum and Core SQL conformant.

The emdd daemon also checks daily to see if any data in the database needs to be expired. See Chapter 2, “Configure,” in the EnlightenDSM

Relational Database

EnlightenDSM includes a relational database as part of EMD. The following data is stored in the database:

  • pool configurations

  • session preferences

  • host overrides

  • user authorizations

  • add user templates

  • archive device configurations

  • Events log data

  • Events alarm data

  • software and hardware inventory lists generated by Events

  • audit log

  • acknowledgment of events from the Status Map

  • backup catalogs

  • scheduling of backups

  • host notes

The following data is not stored in a database:

  • Events host-specific testtab files

  • Events AgentENL.config file data

  • snapshots of disks/file systems

Directory Structure

The EMD directory structure is:

  • INSTALL-PATH/bin

  • INSTALL-PATH/dbtables

  • INSTALL-PATH/odbc

  • INSTALL-PATH/informix

  • INSTALL-PATH/msg

  • INSTALL-PATH/log

where:

     bin         contains the emdd daemon and scripts such as start_enl_daemons
     dbtables    contains the database tables, the database's transaction log, and a script to create the database tables
     odbc        contains the ODBC drivers and other related components
     informix    contains the database products
     msg         contains any message catalog files
     log         contains the emdd log file, emdd.log

Database Access

The database and its tables are owned by the user dbenl. This user is created at installation time. User dbenl's home directory contains an ODBC initialization file, .odbc.ini, and its .cshrc or .profile file contains the environment variables necessary to run Informix and the ODBC driver.

By default, only user dbenl has access to the database. dbenl must start the daemon emdd or the daemon will not be able to access the database. The database administrator at your installation site may change the database privileges.


Warning: Do not modify or rename the .odbc.ini initialization file. Doing so will cause all subsequent EMD usage and connections to fail.


Manually Killing/Invoking EMD

Sometimes you may want to manually kill or invoke EMD; for example, when bringing down the system for maintenance or upgrading to a new release of EnlightenDSM.


Note: You must terminate EMD before backing up any database tables. You must manually re-invoke EMD afterward.

The rest of this section describes how to manually kill or invoke EMD.

Terminating EMD

You can bring down the emdd daemon with the following command:

stop_enl_daemons

This sends an interrupt to all EnlightenDSM daemons and allows the program to shut down gracefully. The daemon waits for any child processes to exit before shutting down. This may take up to 10 seconds.

Starting EMD

You can invoke the emdd daemon with the following CLI command:

start_enl_daemons

This starts the daemon in the background and sets the log level to 0. The log file is generated, but only error messages are logged (if any occur).