Chapter 3. Maintaining, Monitoring, Verifying, and Troubleshooting IRIX GSN

This chapter describes how to maintain, monitor, verify, and troubleshoot the IRIX GSN subsystem.

Commands Available for IRIX GSN

IRIX GSN can be monitored and maintained with the commands summarized in Table 3-1.

Table 3-1. Utilities for Monitoring and Maintaining IRIX GSN

Command

Function

/usr/etc/gsnarp

Configures ULA(s) for contacting the interface's HARP server(s) (see “Configure IRIX HARP Client” in Chapter 2).

Adds and deletes static entries in the HIPPI address resolution (HARP) lookup table (in kernel-resident memory). The HARP table maps ULAs to INET (IP/ST) addresses (see “Changing the HARP Database on the Fly” in Chapter 2).

Configures the address for contacting the HARP server (see “Adding Server Addresses for Client” in Chapter 2).

Displays the HARP servers that are configured and the content of the HARP table, with status flags for entries (see “Display Address Resolution Lookup Table”).

/usr/etc/harpdump

Displays all data of all HARP tables in several formats; used for troubleshooting.

/usr/etc/gsncntl

Provides control and status functions for the hardware and driver portions of the IRIX GSN subsystem, including display of the GSN port's ULA (see “Display ULA (MAC) Address” and “Check Status of Hardware”).

/usr/etc/gsntest

Verifies the functionality of the SGI GSN (HIPPI-6400) hardware, without using the IP or ST network interfaces (see “Verify the SGI GSN Hardware”).

/usr/etc/ttcp

Verifies the functionality of IP over any physical-layer subsystem (see “Verify an IP-over-GSN Interface”).

/usr/etc/ping

Verifies the functionality of IRIX IP network interfaces. Can be used to verify that a gsn# IP network interface is functioning (see “Verify an IP-over-GSN Interface”).

/etc/init.d/gsn

Reconfigures all IRIX GSN network interfaces with the HARP server information in the gsnarp.options file. This script calls the ifconfig and gsnarp commands (see “Build New HARP Server Addresses Into Network Interfaces”).

/usr/etc/ifconfig

Configures standard IP stack options for IRIX GSN network interfaces (that is, ifconfig gsn#). Each ifconfig up of an IRIX GSN network interface resets the SGI GSN hardware (see “Configure IRIX GSN Network Interfaces In Real Time” in Chapter 2).

/usr/etc/netstat

Displays network parameters (including INET and ULA/MAC addresses), traffic statistics, and status information for IP-over-GSN and ST-over-GSN. When an IRIX GSN IP/ST network interface (gsn#) is not configured, the disabled interface is listed, but without an INET address (see “Verify an IP-over-GSN Interface” and “Troubleshoot IP-over-GSN Interfaces”).


Instructions for Common Procedures

This section describes some procedures that are commonly used to monitor and maintain the IRIX GSN subsystem. All of the IRIX GSN utilities (gsnarp, gsncntl, and gsntest) require the user to have superuser (root) privileges. Table 3-2 lists the procedures:

Table 3-2. Common Procedures for Monitoring and Maintaining IRIX GSN

Procedure

Utility

Hardware:

 Check status of hardware

gsncntl (see “Check Status of Hardware”)

 Configure IRIX GSN to loopback through the board

none (see “Configure the SGI GSN Board for On–board (Internal) Loopback Operation”)

 Display ULA/MAC address for GSN hardware

gsncntl; netstat (see “Display ULA (MAC) Address”)

 Reset hardware (and network interface)

ifconfig (see “Disable or Enable IP/ST Interface”)

 Verify that software has located GSN hardware

hinv (see “Verify That the Board Has Been Located by the Software”)

 Verify/test GSN hardware

gsntest (see “Verify the SGI GSN Hardware”)

IP/ST Software:

 Enable/disable IP protocol stack

chkconfig (see “Enable Networking” in Chapter 2)

 Enable/disable each IP/ST–over–GSN network interface

ifconfig (see “Disable or Enable IP/ST Interface”)

 Check status of IP traffic/interface

netstat -p ip (see “Checking Status of IP-over-GSN Traffic”)

 Check status of ST traffic/interface

netstat -p stp (see “Checking Status of ST-over-GSN Traffic”)

HARP/ARP Software:

 Change address for contacting primary server

gsnarp -S (see “Configure IRIX HARP Client” in Chapter 2)

 Change address for alternate (backup) HARP server(s)

gsnarp -A (see “Configure IRIX HARP Client” in Chapter 2)

 Add/delete static entries to HARP table

gsnarp (see “Adding Static HARP Table Entries” in Chapter 2)

 Display current HARP server(s)

gsnarp -av (see “Display Client's Active HARP Server”)

 Check status of HARP table

gsnarp -a (see “Display Address Resolution Lookup Table”)


Disable or Enable IP/ST Interface

To enable or disable the network interface to an IRIX GSN port that services the IP protocol stack and the ST protocol stack, use the standard /usr/etc/ifconfig command:

# ifconfig  [gsn#] down
# ifconfig  [gsn#] up


Note: This sequence of commands also resets the XTALK interface(s) to the SGI GSN hardware, which results in a warm reset of the hardware.


Display ULA (MAC) Address

To display the ULA (MAC) address for the SGI GSN hardware, invoke this command:

% gsncntl [gsn#]  getmac
PROM ULA Address:   00:01:3b:ff:00:0e
Device ULA Address: 00:01:3b:ff:00:0e

This command displays both the ULA (MAC) address that is stored in (read from) the GSN board's PROM and the ULA that is currently being used for GSN communications, which is either the address assigned from the attached GSN switch or the address read from PROM. The IRIX GSN subsystem uses the “Switch ULA” for all communications.
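The comparison described above can be scripted. The following is a minimal sketch, assuming the getmac output has exactly the two-line form shown in the example; the function names parse_getmac and device_matches_prom are hypothetical and are not part of any IRIX GSN utility. A mismatch between the two addresses suggests that the ULA in use was assigned by the attached switch.

```python
import re

# Matches lines such as "PROM ULA Address:   00:01:3b:ff:00:0e".
# The line layout is an assumption taken from the sample output above.
ULA_LINE = re.compile(
    r"^(PROM|Device) ULA Address:\s*((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})$"
)


def parse_getmac(text):
    """Return a dict such as {'PROM': ula, 'Device': ula} from getmac-style text."""
    found = {}
    for line in text.splitlines():
        match = ULA_LINE.match(line.strip())
        if match:
            found[match.group(1)] = match.group(2).lower()
    return found


def device_matches_prom(text):
    """Return True/False for the comparison, or None if either line is missing."""
    ulas = parse_getmac(text)
    if "PROM" not in ulas or "Device" not in ulas:
        return None
    return ulas["PROM"] == ulas["Device"]


sample = (
    "PROM ULA Address:   00:01:3b:ff:00:0e\n"
    "Device ULA Address: 00:01:3b:ff:00:0e\n"
)
print(device_matches_prom(sample))
```

The sketch only parses text; capturing the output of gsncntl on a live system is left to the caller.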

You can also use the netstat command, which displays only the ULA that is currently being used by the GSN subsystem; netstat does not read from PROM:

% netstat -ina
gsn0  16256  <INET netid>  <complete INET address> ...
                         <ULA/MAC address>

For example:

% netstat -ina
gsn0  16256  192.0.113     192.0.113.1     ...
                             00:01:3b:ff:00:0e 

Configure the SGI GSN Board for On–board (Internal) Loopback Operation

To configure the SGI GSN board to loop outgoing traffic through the SHAC ASIC and back into the same system (host) that transmitted it, use the following procedure. Operating the board in this mode tests the software (that is, the host protocol stacks, the driver, and the firmware on the board) associated with the IRIX GSN subsystem as well as the SHAC ASIC on the board. This mode of operation does not utilize the GSN hardware (that is, the SuMAC ASIC and the panel plate receptacle).

  1. Edit the SGI GSN board's configuration file, as described in “Edit master.d/gsn File” in Chapter 2.

    Change this default entry:  gsn_use_loopback = 0 

    to this:  gsn_use_loopback = 1 

  2. Use ifconfig to disable then enable (reset) the GSN interface:

    # ifconfig gsn# down
    # ifconfig gsn# up
    

To return the board to normal operation, change this variable back to its default setting, then disable and enable the interface.
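The edit in step 1 can be sketched as a small text transformation. This is an illustration only: it assumes the setting appears exactly once in the configuration text in the form gsn_use_loopback = N, and the function name set_loopback is hypothetical. It does not locate or write the configuration file, and the interface must still be disabled and enabled afterward.

```python
import re

# Captures everything up to the value so that spacing is preserved.
SETTING = re.compile(r"(\bgsn_use_loopback\s*=\s*)\d+")


def set_loopback(config_text, enabled):
    """Return config_text with gsn_use_loopback set to 1 (enabled) or 0."""
    value = "1" if enabled else "0"
    new_text, count = SETTING.subn(lambda m: m.group(1) + value, config_text)
    if count != 1:
        # Refuse to guess when the setting is missing or duplicated.
        raise ValueError("expected exactly one gsn_use_loopback setting, found %d" % count)
    return new_text


print(set_loopback("gsn_use_loopback = 0", True))
```

Calling set_loopback(text, False) restores the default value for normal operation.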


Note: In this board–loopback mode, the functionality of the entire SGI GSN board cannot be verified. Use an external loopback connector for complete hardware verification.


Check Status of Hardware

There are utilities for checking the status of the following functions:

Verifying That GSN Hardware Was Found

To verify that the operating system located the SGI GSN hardware during startup, use this command:

# /bin/hinv -d gsn 
<display for single-board product>
GSN 1-XIO adapter: unit #, in module # I/O slot # 
<display for two-board product>
GSN 2-XIO adapter: unit #,
        XIO port 1 in module # I/O slot # 
        XIO port 2 in module # I/O slot # 

Displaying Firmware Version

To display the version of firmware that is currently running on SGI GSN hardware, use this command:

# gsncntl [gsn#] versions

where gsn# identifies the SGI GSN board for which you want information.

Displaying ULA (MAC) Address

See “Display ULA (MAC) Address”.

Displaying Status of Physical Link

To display the status of the SGI GSN board and the physical link attached to it, use the command below. The status information is described in Table 3-3.

# gsncntl gsn#  status device

 

Table 3-3. Status Information for GSN Physical Link

Status Item

Normal Value

Description

State:

LNK_RDY

The GSN states are sequential. They are listed here in the order in which they occur during startup.

 NULL

 

The driver has reset the GSN hardware but has not yet been able to configure it. This state is normal for a few seconds during startup.

 CFG

 

The IRIX GSN driver has successfully configured the GSN hardware, but has not yet made contact with the SuMAC ASIC. This state is normal for a few seconds during startup.

 SUMAC_POLL

 

The driver has started communication with the HIPPI-6400 (SuMAC) ASIC on the GSN board; HIPPI-6400 ADMIN micropackets are being exchanged. This state is normal for a few seconds during startup.

 EX_ELEM

 

The local GSN subsystem has transmitted an EXCHANGE_ELEMENT_FUNCTION Admin micropacket and received a reply from the local SuMAC; the local GSN subsystem is attempting to contact the system at the other end of its physical link. This state is normal for a few seconds during startup; if it persists, there may be a problem with the remote GSN system or the physical link.

 ULA_REQ

 

The local GSN subsystem has completed the EXCHANGE_ELEMENT_FUNCTION with its neighbor, has requested a ULA from the attached switch, and is waiting for a reply. This state is normal for a few seconds during startup; if it persists, there may be a problem with the attached switch.

 LNK_RDY

LNK_RDY

The local GSN subsystem (driver and hardware) is ready to operate. This is the normal operational state.

Flags:

  LNK_SWITCH


LNK_SWITCH


The local GSN subsystem believes that its neighbor element is a GSN switch. That is, the physical link (cable) is attached to a GSN switch.

  LNK_P-2-P

 

The local GSN subsystem believes that its neighbor element is another GSN endpoint (not a switch). That is, the physical link (cable) is attached to a HIPPI-6400 element, but is not attached to a HIPPI-6400 (GSN) switch.

  LNK_EXT_LOOP

 

The SGI GSN board is operating with a loopback connector/cable installed.

  LNK_INT_LOOP

 

The SGI GSN board is configured to operate in board–loopback mode.
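The sequential states in Table 3-3 lend themselves to a simple check when monitoring a link. The sketch below is an assumption-laden illustration: the state names and hints come from the table, but the 10-second grace period stands in for "a few seconds" and is a guess, and the function names are hypothetical.

```python
# States in the order in which Table 3-3 says they occur during startup.
STARTUP_STATES = ["NULL", "CFG", "SUMAC_POLL", "EX_ELEM", "ULA_REQ", "LNK_RDY"]

# Hints that Table 3-3 gives for states that persist.
PERSIST_HINTS = {
    "EX_ELEM": "possible problem with the remote GSN system or the physical link",
    "ULA_REQ": "possible problem with the attached switch",
}


def steps_remaining(state):
    """Number of state transitions still needed to reach LNK_RDY."""
    return len(STARTUP_STATES) - 1 - STARTUP_STATES.index(state)


def describe_state(state, seconds_in_state, grace_seconds=10.0):
    """Classify a state as ready, starting, or stalled (grace period is assumed)."""
    if state not in STARTUP_STATES:
        raise ValueError("unknown state: %r" % state)
    if state == "LNK_RDY":
        return "ready"
    if seconds_in_state <= grace_seconds:
        return "starting"
    hint = PERSIST_HINTS.get(state, "state has persisted beyond the startup window")
    return "stalled: " + hint


print(describe_state("ULA_REQ", 60))
```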


Displaying Status of Local GSN Hardware

To display status information for an SGI GSN board, use either of the following commands.

# gsncntl [gsn#] status 

or

# gsncntl -v [gsn#] status
<verbose mode expands general error counts into specific errors>

where gsn# identifies the SGI GSN board for which you want information.

The displayed information is described in Table 3-4. Most of the counted items are initialized to zero upon reset of the board and roll over to zero upon reaching 2**32 (that is, after counting to 4,294,967,295); exceptions are explained in the table. Check all of the cables and nodes between this system and the system(s) you tried to ping. “Verify the SGI GSN Hardware” describes how to do this for the local SGI GSN port and its link.
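Because most counters wrap to zero after 4,294,967,295, a difference between two samples should be computed modulo 2**32. The following is a minimal sketch; the function name counter_delta is hypothetical, and the result is only correct if the counter wrapped at most once and the board was not reset between the two samples.

```python
COUNTER_MODULUS = 2 ** 32  # counters hold values 0 through 4,294,967,295


def counter_delta(previous, current):
    """Events counted between two samples, allowing for one rollover."""
    for value in (previous, current):
        if not 0 <= value < COUNTER_MODULUS:
            raise ValueError("counter value out of 32-bit range: %r" % value)
    return (current - previous) % COUNTER_MODULUS


print(counter_delta(4294967290, 5))
```

In the usage line, the counter passed its maximum between the samples, so the difference is taken across the rollover rather than reported as a negative number.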

Table 3-4. GSN Status Information for Copper-Based XIO Hardware

Status Item

Description

Link state and flags

See Table 3-3.

ADMIN Packet Count and Errors

See Table 3-9.

SHAC Status

See Table 3-5 and Table 3-6. Display shows errors collected by the SHAC ASIC on the main SGI GSN board.

Hop Status

See Table 3-7. Display shows status for local HIPPI-6400-PH element (hop 0, SuMAC), as well as the remote (hop 1) link-end element.


Displaying Status of XIO and Network-layer Processing (SHAC ASIC)

To display status information for the SHAC ASIC, use this command. (The SHAC ASIC is the component on the main SGI GSN board that processes XIO [that is, XTALK] and network–layer traffic.) Table 3-5 describes the basic display; Table 3-6 describes the verbose display that occurs when the -v option is used.

# gsncntl [gsn#] status shac

or

# gsncntl -v [gsn#] status shac 
<verbose mode expands general error counts into specific errors>

Table 3-5. SHAC ASIC Status: Basic Listing

SHAC Status Item

Normal Value

Description

Error interrupt status

OK

A count of the total number of error interrupts generated by the SHAC ASIC.

Use the verbose option (-v) to list the specific errors.

QIDs with interrupts pending

None

A count of the total number of system interrupt queues (QIDs) that currently have interrupts (generated by the SHAC ASIC) awaiting processing by the operating system.

Use the verbose option (-v) to list the specific QIDs that have interrupts pending.

VC# receive error status

OK

A count of the total number of errors detected by the SHAC ASIC's receive logic on the identified GSN virtual channel (VC#).

Use the verbose option (-v) to list the specific errors.

Transmit error status

OK

A count of the total number of errors detected by the SHAC ASIC's transmit logic.

Use the verbose option (-v) to list the specific errors.


Table 3-6. SHAC ASIC Status: Verbose Listing

SHAC Status Item

Normal Value

Description

Error interrupt status

OK

Error interrupts generated by the SHAC ASIC.

  Xtalk0 error

0

An error occurred on the main GSN board's XIO slot (that is, XTALK interface 0).

  Xtalk1 error

0

An error occurred on the additional GSN board's XIO slot (that is, XTALK interface 1 located on the XTOWN board).

  MAC error

0

An error occurred on the SuMAC ASIC.

  IC error

0

An error was detected by the SHAC port that connects to the SuMAC ASIC.

  SSRAM parity error

0

An SSRAM parity error was detected.

  Receive VC# error

0

An error was detected by the receive logic for the indicated VC (0, 1, 2, or 3).

  Transmit error

0

An error was detected by the transmit logic.

  Synthetic port error

0

An error occurred during a read or write of the area in SSRAM where information is kept for IP processing and hardware interrupts to the driver.

  Total I desc. oflow

0

SHAC cannot keep up with the interrupts being generated (by all types of traffic). Specifically, one or more QIDs overflowed.

  ST I desc. oflow

0

SHAC cannot keep up with the ST traffic occurring on at least one of its QIDs. Specifically, the interrupts being generated by the ST traffic to a specific QID overflowed the queue.

  Total I desc. uflow

0

The counts of unprocessed interrupts on one or more QIDs kept by the GSN driver and the SHAC ASIC are out of sync with each other. SHAC's count has fewer unprocessed interrupts than the driver's count.

  ST I desc. uflow

0

The counts of unprocessed ST interrupts on one or more QIDs kept by the GSN driver and the SHAC ASIC are out of sync with each other. SHAC's count has fewer unprocessed ST interrupts than the driver's count.

  DMA engine error

0

The on-board DMA logic has been halted because an error was detected on a XTALK interface or the SSRAM interface. This is a secondary error indication.

QIDs with interrupts pending

None

Identifies the QIDs that currently have interrupts awaiting processing. 1=one or more interrupts are pending on this QID; 0=no interrupts pending on this QID.

  QID 0

0

 

  QID 1

0

 

  QID 2

0

 

  QID 3

0

 

  QID 4

0

 

  QID 5

0

 

  QID 6

0

 

  QID 7

0

 

VC# receive error status

OK

Errors detected by the SHAC ASIC's receive logic.
VC# identifies the GSN virtual channel (0, 1, 2, or 3).

  Sideband error

0

An incoming HIPPI-6400 micropacket had its ERROR bit set.

  Underrun error

0

A TAIL bit was received unexpectedly. That is, an incoming HIPPI-6400 micropacket had its TAIL bit set before the receive logic received the number of micropackets implied by the message's M_len field.

  Overrun error

0

A TAIL bit was not received when expected. That is, the TAIL indicator was not set in the final HIPPI–6400 micropacket for the message, as calculated from the message's M_len field.

  Fatal error

0

The SHAC ASIC's connection to the SuMAC ASIC detected a data or control parity error, a Data Valid error, or an external Link or Warm Reset. The RX logic cannot recover from these conditions. When one occurs, SHAC must be reset. All incoming data is discarded and all pending DMAs are aborted.

  SSRAM error

0

The receive logic detected an SSRAM parity error.

  STC invalid port error

0

The ST header on an incoming Control Operation contained an invalid entry: for example, an incorrect destination port or an invalid key.

  STC invalid key error

0

The ST header on an incoming Control Operation contained an invalid key for the destination port.

  STD invalid port error

0

The ST header on an incoming Data Operation contained an invalid destination port entry.

  STD invalid BID error

0

The ST header on an incoming Data Operation contained an invalid Mx parameter. Either the Mx parameter itself was invalid (for example, greater than 4095), or the local information for the specified Mx parameter is not valid.

  STD invalid key error

0

The ST header on an incoming Data Operation contained an invalid key for the destination Mx.

  STD invalid BUFX error

0

The ST header on an incoming Data Operation contained an invalid BUFX value. Either the BUFX value specified an unsupported striping feature or the local information for the specified BUFX is invalid.

  STD BUFX range error

0

The Bufx in the Schedule Header of an ST Data Operation does not fall in the range allowed for the B_id as specified by the Base_Bufx and Bufx_Range fields in the B_ID table entry. Any of the following conditions result in this error:

 

• ST Schedule Header's Bufx[31:29] does not equal Base_Bufx_31_29 in the B_ID table.

• ST Schedule Header Bufx[19:0] is less than Base_Bufx_19_0 from the B_ID table entry.

• ST Schedule Header Bufx[19:0] is greater than the sum of Base_Bufx_19_0 and Bufx_Range from the B_ID table entry.

  STD offset error

0

An ST Data Operation included incorrect offset or data length values. The Offset (from its Schedule Header) plus the length of the DMA portion of the STU is greater than the destination buffer size:

 Offset + (M_len - 0x30) > (2**Bufsize).

 

This error also occurs when Offset[6:0] is non-zero when Bufx[31:29] is non-zero; this is a violation of the alignment restriction for memory striping.

  STD STU num error

0

The STU sequence numbering for an ST Data Operation is not sequential with respect to the previous STU for the same B_id. This error can also occur when the Poisoned (P) bit in the B_ID table entry is set.

  STD read timeout

0

During memory striping or dual-path operation, a readback timeout occurred on one of the XIO ports.

  SRC port 0 oflow

0

SHAC dropped at least one ST Control Operation due to an overflow on its queue for Port 0.

  STD 6-way offset error

0

An error occurred during 6-way striping of data. This error occurs when, for a six-way striped ST Data Operation, the Offset (from the Schedule Header) plus the length of the DMA portion of the STU is greater than three-quarters of the destination buffer size:

Offset + (M_len - 0x30) > 0.75 * (2**Bufsize)

Transmit error status

OK

 

  SSRAM parity error

0

The transmit logic encountered an SSRAM parity error.

  STD invalid BUFX error

0

The BUFX table entry is invalid for the specified Src_Bufx parameter in a transmit descriptor for a Data Operation. The transmission was not performed.

  Xtalk read req. timeout

0

The transmit logic encountered a timeout error for a read request on the XTALK interface. This error is fatal to the GSN interface because the SHAC ASIC forces the SuMAC ASIC to drop all micropackets.

  Xtalk read access error

0

The transmit logic encountered an error in a read request on the XTALK interface. This error is fatal to the GSN interface because the SHAC ASIC forces the SuMAC ASIC to drop all micropackets.

  PULL desc. timeout

0

The transmit logic encountered a timeout error on the XTALK interface for a read request done in Pull Mode.

  PULL desc. access error

0

The transmit logic encountered an error in a read response on the XTALK interface.

  FIFO CR underrun error

0

A transmission was initiated by a user process when no credits had been allocated to the process. The transmission was not performed. This error cannot be produced by transmissions initiated by the kernel.

  FIFO underrun

0

The pointers indicated that there was a descriptor on SHAC's transmission queue, but SHAC found none when it tried to read the queue. This error can occur only if software writes the queue's consumer pointer, which is illegal during normal operation.

  FIFO overrun

0

The pointers indicated that there were empty slots on SHAC's ST transmission queue, but the queue was full when a user process tried to initiate a transmission (that is, write to the queue). This error can occur only if software writes the queue's producer pointer, which is illegal during normal operation. This error cannot be produced by transmissions initiated by the kernel.

  User virt. conn. error

0

A user process attempted to transmit an ST Control Operation. This is illegal. The transmission is not performed.

  User RAW error

0

A user process attempted to transmit a raw ST Operation. This is illegal. The transmission is not performed.

  User invalid port error

0

A user process initiated a transmission for an invalid PORT. The transmission is not performed. This error cannot occur when the initiator is the kernel.

  Source offset error

0

A user process initiated a transmission using an invalid Src_Offset value; the value was greater than or equal to the source buffer size specified in Src_Bufsize in the PORT table. The transmission is not performed. This error cannot occur when the initiator is the kernel.

  Buffer range error

0

A user process initiated a transmission and one of the following errors occurred:

 

• The Src_Bufx field is less than Base_Bufx in the PORT table entry.

• Src_Bufx[31:29] is not equal to Base_Bufx[31:29] in the PORT table entry.

• The data to be transmitted extends past the last valid Bufx in the series of valid Bufxs for the PORT.

 

The transmission was not performed. This error cannot be produced by a transmission initiated by the kernel.

  Stripe VC error

0

A user process initiated a transmission in which the Src_Bufx value indicated striping, but the transmission was to VC0, VC1, or VC2, which is illegal; striping is legal only on VC3. The transmission was not performed. This error cannot be produced by a transmission initiated by the kernel.

Transmit Error Source

 

Identifies the source (the cause) of the first error recorded in the Transmit error status section. (This section consists of the rows immediately above this row in this table.) The item marked with a 1 is the source; only one item is marked.

  VC0

0

When 1, indicates error was encountered on VC0.

  VC1

0

When 1, indicates error was encountered on VC1.

  VC2

0

When 1, indicates error was encountered on VC2.

  VC3

0

When 1, indicates error was encountered on VC3.

  IP VC1

0

When 1, indicates error was caused by IP traffic on VC1.

  H2F FIFO

0

When 1, indicates error was caused by the queue that holds host-to-board (that is, operating system to firmware) interrupts/commands.

  Unknown

0

When 1, indicates that the source for the captured error is unknown.

  None

0

When 1, indicates that no error source is identified.

Local/Remote Xtalk Error Status

OK

 

  Req. unsupported OP

0

bit 0: SHAC received a crosstalk request packet type that it does not support: Fetch and Operation, Store and Operation, Special Packet Request, or Reserved. Note: Even-numbered reserved packet types are considered by SHAC to be requests. Odd-numbered reserved packet types are considered by SHAC to be responses and will cause the XRESP_UNEXPECTED bit to be set.

  Req. unsupported size

0

bit 1: SHAC received a Crosstalk request packet type/size combination that it does not support. All register accesses must be a single word, i.e., the data enables must be 0x0F or 0xF0.

  Req. frame error

0

bit 2: The data size in the command word of a request did not match the actual size of the request packet as framed by the head and tail bits. This is a severe error. If it occurs, there may be other side effects, such as lost credits and spurious responses.

  Req. err/invalid pkt

0

bit 3: Either the ERROR bit was set in the command word of a request, or the Micro-Packet Invalid bit was set in the sideband of one or more micropackets in a Crosstalk request packet.

  Req. invalid address

0

bit 4: The requested address does not exist. This error will only occur when a request is made to a reserved location (i.e., a hole) in the SHAC register space. Accesses to unconfigured SSRAM or PROM locations will not set this bit.

  Req. access error

0

bit 5: A write request was received for a read-only register or a read request was received for a write-only register or descriptor FIFO tail.

  Req. fifo oflow

0

bit 6: A request was received when the request FIFO was full. This indicates a credit count problem, i.e., the Crossbow was programmed to reflect more than four request credits for SHAC. Because the request that overflowed the FIFO may have been partially written to the FIFO, this error will often be followed by a XREQ_FRAME_ERR.

  Req. drop mode enabled

0

bit 7: An error was detected for which it would be unsafe to continue executing write requests. All writes are dropped until SHAC is reset or until a clear register is read on the SHAC ASIC. To diagnose the failure, reads are allowed.

  Req. fatal error

0

bit 8: A condition was detected that indicates an inconsistency internal to SHAC.

  Resp. unexpected response

0

bit 9: An unexpected crosstalk response was received. A read response with a TNUM value that is not associated with an outstanding read request will cause this error. All write responses will also cause this error since SHAC does not generate Write w/Response requests. Also, read responses with data size = 3 (OVCL) will cause this error since SHAC does not send requests of this size.

  Resp. frame error

0

bit 10: The data size in the command word of a response did not match the actual size of the response packet as framed by the head and tail bits.

OR

A spurious micropacket without the sideband Head bit set was received when a packet was not in progress.

  Resp. err/invalid pkt

0

bit 11: Either the ERROR bit was set in the command word of a response, or the Micro-Packet Invalid bit was set in the sideband of one or more micropackets in a crosstalk response packet. Unrecoverable memory errors on DMA reads will result in this error.

  Xtalk req cnt oflow

0

bit 12: The counter that keeps track of the outstanding requests sent by SHAC has overflowed. This could be caused only by an internal error in SHAC, since it should never send a request if this counter is at its maximum value.

  Xtalk req cnt uflow

0

bit 13: The counter that keeps track of the outstanding requests sent by SHAC has underflowed. This will happen if the Crossbow returns more request credits to SHAC than the number of requests SHAC issued.

  Xtalk credit oflow

0

bit 14: The counter that keeps track of the credits available to SHAC for requests and responses to the Crossbow has overflowed. This will happen if the Crossbow returns more request and response credits to SHAC than the number of requests and responses SHAC issued.

  Xtalk credit uflow

0

bit 15: The counter that keeps track of the credits available to SHAC for requests and responses to the Crossbow has underflowed. This could be caused only by an internal error in SHAC, since it should never send a request or response if this counter is at zero.

  Xtalk req timeout

0

bit 16: A Crosstalk read request to the host has not received a response in at least the time configured on the SHAC ASIC.

  LLP retry timeout

0

bit 17: The LLP has retried sending a micropacket LLP_MAXRETRY times without success.

  LLP send retry == 256

0

bit 18: The LLP send retry counter (LLP_TX_CNT in XT_STATUS) has reached x'FF'.

  LLP send retry

0

bit 19: An LLP retry was required on the LLP send side.

  LLP recv errors == 256

0

bit 20: The LLP receive error counter (LLP_REC_CNT in XT_STATUS) has reached x'FF'.

  LLP recv CB error

0

bit 21: A receive check bit error was detected by the LLP.

  LLP recv SN error

0

bit 22: A receive sequence number error was detected by the LLP.
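The offset conditions given in Table 3-6 for the STD offset error and STD 6-way offset error rows can be worked through numerically. The sketch below simply restates those inequalities and the striping alignment restriction; the function and constant names are hypothetical, the values are treated as plain integers, and this is not a description of how the SHAC ASIC evaluates them internally.

```python
NON_DMA_BYTES = 0x30  # constant subtracted from M_len in the Table 3-6 formulas


def offset_error(offset, m_len, bufsize):
    """True when Offset + (M_len - 0x30) > 2**Bufsize."""
    return offset + (m_len - NON_DMA_BYTES) > 2 ** bufsize


def six_way_offset_error(offset, m_len, bufsize):
    """True when Offset + (M_len - 0x30) > 0.75 * 2**Bufsize.

    Both sides are multiplied by 4 to keep the comparison in integers.
    """
    return 4 * (offset + (m_len - NON_DMA_BYTES)) > 3 * 2 ** bufsize


def striping_alignment_error(offset, bufx):
    """True when Offset[6:0] is non-zero while Bufx[31:29] is non-zero."""
    return (offset & 0x7F) != 0 and ((bufx >> 29) & 0x7) != 0


# With Bufsize = 16 the destination buffer is 65536 bytes.
print(offset_error(1, NON_DMA_BYTES + 65536, 16))
print(six_way_offset_error(0, NON_DMA_BYTES + 49152, 16))
```

In the first call the DMA portion exactly fills the buffer and the one-byte offset pushes it past the end, so the condition holds. In the second call the DMA portion equals three-quarters of the buffer, which does not exceed the limit.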


Displaying Status of Closest GSN Elements (Hops)

To display status information for the two closest HIPPI-6400-PH (GSN) elements (hop 0 and hop 1), use this command:

# gsncntl gsn# status elements

The closest element (hop 0) is the SuMAC ASIC located on the local SGI GSN board; the next closest element (hop 1) is the link end located at the other side of the physical link. The identity of this element depends on the configuration of the GSN connection, as illustrated in Figure 3-1. When a loopback device is installed, the hop 1 element is the same local GSN element as hop 0; when a cable is connected, the hop 1 element is the element at the other end of the cable (for example, switch or endpoint). Table 3-7 describes the gsncntl status elements display. Figure 3-1 illustrates the physical position for each numbered GSN element; status messages refer to the items identified in this illustration.


Note: When the SGI GSN board is configured for internal loopback, the hop 0 element is the IRIX GSN driver. The local SuMAC ASIC is not available when the board is configured for internal loopback.

Figure 3-1. Physical Position of GSN Elements for Different Connection Scenarios


Table 3-7. HIPPI–6400–PH Element (Hop) Status

Status Item

Normal Values

Description

Hop # Status:

 

Status for the HIPPI–6400 physical element.
When # = 0, status is for the SuMAC ASIC on the local SGI GSN board.

When # = 1, status is for the first element on the other side of the physical link.

 Port to endpoint's system

ready

Current state of element's port that connects to upper layers. States are sequential and are listed here in the order in which they occur. For hop 0, this information is for the port that connects to the SHAC ASIC on the GSN board.


shutdown = the port is shut down.

reset = the port is being reset.

synchronizing ssr = the port is trying to synchronize itself with the port into the system.

ready = the port is operational.

 Port to HIPPI-6400 fabric

CLK2, TRAINED, LLP_OP, VC_OP

Status for element's port that connects to the HIPPI-6400 fabric. Multiple status items can be listed simultaneously.

CLK2 = the element detects the activity–monitor signal (CLOCK2) coming from the remote element.

TRAINED = the element has successfully completed its training sequence for dynamic adjustment of signal skew over the physical link.

LLP_OP = the port's link-level protocol is operational.

VC_OP = the port's virtual channels are operating according to protocol.

 Port to endpoint's system
 Error Status

OK

Errors detected on the element's port that connects to the upper layers and the system. Errors are listed alphabetically. For hop 0 (local element), this information is for the SuMAC port that connects to the SHAC ASIC on the GSN board.

 

 

2nd link = link errors (data/ctl parity or data invalid errors) have occurred on more than one clock cycle.

admin missing tail: VC hex_value = an admin PDU (either GSN micropacket or SGI-LLP message) did not end with Tail bit set. Table 3-8 explains VC hex_value.

credit-only on VC hex_value = an illegal credit-only (TYPE=0xA) micropacket was detected. Table 3-8 explains VC hex_value.

ctl parity = control parity error was detected.

data invalid = the signal that indicates the presence of invalid data was detected more than once in a single micropacket.

data parity in byte hex_value = data parity error detected. Bit 7 covers the most-significant byte of data; bit 0 covers the least-significant byte. When a bit is set to 1, the corresponding data byte had an error.

ext link = link error is asserted from system.

IC-SSR sample = this port's SSR layer experienced an illegal clock transition.

OK = no errors were detected since last startup/reset of hardware.

Missing end of msg on VC hex_value = a GSN DATA micropacket without Tail bit was followed by a HEADER or ADMIN micropacket. Table 3-8 explains VC hex_value.

Missing strt of msg on VC hex_value = a GSN micropacket with Tail bit was followed by a DATA micropacket (HEADER was missing). Table 3-8 explains VC hex_value.

Rcvr tail timeout on VC hex_value = receiver timed out before a GSN micropacket with Tail bit arrived. Table 3-8 explains VC hex_value.

RX VC buf overflow on VC hex_value = receiver's buffer for the VC overflowed. Table 3-8 explains VC hex_value.

SGI-LLP admin extra tail on VC hex_value = non-GSN protocol error in the SGI-LLP layer. Table 3-8 explains VC hex_value.

sndr credit=0 timeout on VC hex_value = the credit=0 timeout expired due to internal VC full assertion from system's receiving port. Table 3-8 explains VC hex_value.

undefined pkt type on VC hex_value = a GSN micropacket was detected whose type was not Header, Data, or Admin. Table 3-8 explains VC hex_value.

undefined val: hex_value = TYPE value (in hexadecimal notation) read from last detected undefined micropacket.

 Port to HIPPI-6400 fabric
 Error Status

OK

Errors that have been detected on the element's port that connects to the HIPPI-6400 fabric. Errors are listed alphabetically.

 

 

activity mon lost = activity monitor signal (CLOCK2) became inactive after port had been operational.

admin missing tail: VC hex_value = an admin PDU (either GSN micropacket or SGI-LLP message) did not end with Tail bit set. Table 3-8 explains VC hex_value.

credit ovfl: VC hex_value = more than 255 credits were received on the indicated virtual channel. Table 3-8 explains VC hex_value.

ECRC rcv err cnt = number of micropackets that were received with ECRC errors in which the micropacket's Error bit was not set.

ECRC snd err cnt = number of micropackets that were transmitted in which the ECRC received from the system was erroneous and the micropacket ERROR bit was not set.

LCRC err cnt = number of micropackets that were received with LCRC errors.

missing end of msg: VC hex_value = a micropacket of Type=data without Tail bit set was followed by Type=header or Type=admin. The micropacket with its Tail bit set was missing. Table 3-8 explains VC hex_value.

missing start of msg: VC hex_value = a micropacket with its Tail bit set was followed by a Type=data micropacket. The Type=header micropacket was missing. Table 3-8 explains VC hex_value.

OK = no errors were detected since last startup/reset of hardware.

rcv tail timeout: VC hex_value = receiving VC timed out before final micropacket (with its Tail bit set) for a message arrived. This error is called the “stall timeout error” in the HIPPI-6400-PH standard. Table 3-8 explains VC hex_value.

Retry cnt = number of retransmissions that have occurred.

retry failed = a retransmission failed, forcing port into shutdown.

RSEQ missing cnt = number of times the ACK timeout expired before an expected RSEQ was received, resulting in a retransmission.

RSEQ val err cnt = number of times an incoming RSEQ value fell outside the expected range, resulting in a retransmission.

rx vc buf overflow: VC hex_value = receiving VC's buffer overflowed. Table 3-8 explains VC hex_value.

SGI-LLP admin extra tail: VC hex_value = non-GSN protocol error in the SGI-LLP layer. Table 3-8 explains VC hex_value.

skew retrain failed = skew compensation retraining sequence failed (after it had been healthy).

skew retrain rst err = skew compensation has not been successful since last reset.

sndr credit=0 timeout: VC hex_value = local source's credit=0 timeout expired due to VC full assertion from the element's port to the system. This error is called the “credit timeout error” in the HIPPI-6400-PH standard. Table 3-8 explains VC hex_value.

TSEQ val err cnt = number of times a TSEQ value error was detected. TSEQ errors are not counted if a micropacket has not been accepted since the last TSEQ error.

undef pkt type: VC hex_value = a micropacket of an undefined type was detected on the indicated virtual channel. Table 3-8 explains VC hex_value.

undef val: hex_value = value of last detected undefined type of micropacket.


Table 3-8. Meaning of “VC hex_value” in Status Screen Displays

VC hex_value

Indicates Errors on These VCs

0x1

VC0

0x2

VC1

0x3

VC0, VC1

0x4

VC2

0x5

VC0, VC2

0x6

VC1, VC2

0x7

VC0, VC1, VC2

0x8

VC3

0x9

VC0, VC3

0xA

VC1, VC3

0xB

VC0, VC1, VC3

0xC

VC2, VC3

0xD

VC0, VC2, VC3

0xE

VC1, VC2, VC3

0xF

VC0, VC1, VC2, VC3
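
Table 3-8 is a 4-bit mask in which bit n stands for VCn. The following Python sketch decodes a value the same way; the function name decode_vc_mask is hypothetical and is not part of any IRIX GSN utility.

```python
def decode_vc_mask(hex_value):
    """Return the virtual channels flagged in a "VC hex_value" field.

    Bit 0 maps to VC0, bit 1 to VC1, and so on, as listed in Table 3-8.
    """
    mask = int(hex_value, 16)          # accepts "0xB" as well as "B"
    if not 0 <= mask <= 0xF:
        raise ValueError("expected a 4-bit mask, got %r" % hex_value)
    return ["VC%d" % bit for bit in range(4) if mask & (1 << bit)]


print(decode_vc_mask("0xB"))   # ['VC0', 'VC1', 'VC3']
```

A value of 0x0 yields an empty list; Table 3-8 does not list that value because an error entry always names at least one VC.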


Displaying Status of HIPPI-6400 ADMIN Traffic

See “Checking Status of ADMIN Traffic”.

Check Status of GSN Traffic

This section describes methods for monitoring the types of traffic that can move through an IRIX GSN subsystem: HIPPI-6400 ADMIN micropackets in “Checking Status of ADMIN Traffic”, IP datagrams in “Checking Status of IP-over-GSN Traffic”, and ST messages in “Checking Status of ST-over-GSN Traffic”.

Checking Status of ADMIN Traffic

To display status information about HIPPI-6400 ADMIN micropackets for an SGI GSN connection, use the gsncntl status admin command. This command displays counts for the various types of HIPPI-6400 ADMIN micropackets, as described in Table 3-9; each count includes both those transmitted and those received. The counts for pairs (for example, Ping and Ping_Response) should be very close.

# gsncntl [gsn#] status admin 

Table 3-9 describes the information that is displayed about ADMIN micropackets: 

Table 3-9. Local Element's HIPPI–6400 ADMIN Traffic Status

Status Item

Description

ADMIN Packet Count
(by command/type):

For each type of ADMIN micropacket, the display shows a count of request/command (Cmd) micropackets and a count of responses (Rsp).

 Ping

Simple “HELLO, I'm alive.” Response does same for remote endpoint.

 Set Element Address

Commands the element to use the supplied element address. (This address is different from the globally–unique ULA.) Response is an acknowledgment only.

 Exchange Element Function

Describes element's function: link-end, endpoint, or switch. Response does same.

 ULA Request

Requests assignment of a Universal LAN MAC address (ULA) to the requestor from the receiver; the receiver must be a switch. Response assigns a ULA.

 Read Register

Asks for current contents of element's ADMIN register. Response provides the contents. The local IRIX GSN driver is the main reader for local element's ADMIN register.

 Write Register

Asks element to update (write) one or more fields in the ADMIN register. The local IRIX GSN driver is the main writer for local element's ADMIN register.

 ULA List Request

Asks for a list of all connected ULAs. Response provides the list.

 Port Remap

Changes the port-to-ULA mapping.

 Port Map Request

Asks switch to provide the physical port that is necessary to open a connection to a particular ULA. Response is the port identification.

 Reset

Commands element to reset (reinitialize) itself. There is no response to this.

 Invalid Command

The ADMIN micropacket's command (that is, the type of ADMIN packet) was not recognized as one of those listed in this table. There is no response to this.

ADMIN Packet Errors:

 

 Admin cmd/rsp on vc0 or 3

An ADMIN micropacket was illegally detected on VC0 or VC3.

 Admin cmd on vc != 1

A request/command ADMIN micropacket was illegally detected on a VC other than VC1.

 Admin rsp on vc != 2

A response ADMIN micropacket was illegally detected on a VC other than VC2.

 Admin rsp w/bad status

A response ADMIN micropacket was received with one of its status flags set. All status flags indicate that some problem occurred with the request/command that was generated by the source on the local element: undefined operation, invalid key, parameter out of range, invalid address for ADMIN register, command failed, and so on.

 Rec'd pkt when shut down

When the physical link was shut down, an ADMIN micropacket arrived. The packet has been discarded.

 Rec'd unexp rd reg

An illegal Read Register request/command arrived. For example, the hop count asked that the micropacket be forwarded even though the local element's function is endpoint (not switch).

 Rec'd unexp rd reg rsp

A Read Register response arrived when no request/command has been sent to initiate the response.

 Rec'd unexp set elem addr rsp

A Set Element Address response arrived when no request/command has been sent to initiate the response.

 Rec'd unexp xchange elem func

An illegal Exchange Element Function arrived. For example, the hop count asked that the micropacket be forwarded, indicating that the sender believes the local element is a switch.

 Rec'd unexp xchange elem func rsp

An Element Function response arrived when no request/command has been sent to initiate the response.

 Rec'd unknown elem type

An Element Function response arrived with an unknown function. The known functions are switch, link-end, and endpoint.

 Rec'd unexp pkt in ULA_REQ state

While the local element was waiting for a response to its request for ULA assignment, an ADMIN micropacket arrived that seemed to require forwarding (for example, the hop count was not 0 or the destination element address was not 0xFFFFFFFF).

 Rec'd unexp pkt in LINK_RDY state

After the local element (an endpoint) was functional, an ADMIN micropacket arrived that required forwarding (for example, the hop count was not 0 or the destination element address was not 0xFFFFFFFF).

 Rec'd unexpected unknown admin pkt

An ADMIN micropacket was received that had an unknown command.
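
The text above says that the command and response counts for each ADMIN type should be very close. As a rough sketch of that check, the Python function below flags types whose counts diverge. It assumes the counts have been transcribed by hand into a dictionary; the function name, the tolerance value, and the sample numbers are all illustrative assumptions and do not come from gsncntl.

```python
# Types for which Table 3-9 states that no response is sent.
NO_RESPONSE = {"Reset", "Invalid Command"}


def unbalanced_pairs(counts, tolerance=2):
    """Return (type, cmd, rsp) tuples whose counts differ by more than tolerance.

    counts maps an ADMIN type name to a (cmd_count, rsp_count) pair.
    tolerance is an arbitrary threshold; the manual only says "very close".
    """
    suspects = []
    for name, (cmd, rsp) in sorted(counts.items()):
        if name in NO_RESPONSE:
            continue                     # no response expected, nothing to compare
        if abs(cmd - rsp) > tolerance:
            suspects.append((name, cmd, rsp))
    return suspects


# Invented numbers, for illustration only.
sample = {"Ping": (120, 119), "Read Register": (40, 12), "Reset": (1, 0)}
print(unbalanced_pairs(sample))   # [('Read Register', 40, 12)]
```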


Checking Status of IP-over-GSN Traffic

To list the configuration information for the IRIX GSN network interface (gsn#), use this command:

% netstat -ina

To display statistics about the IP, TCP, UDP, or ICMP protocol stacks, use these commands:

% netstat -p ip
% netstat -p tcp
% netstat -p udp
% netstat -p icmp

Checking Status of ST-over-GSN Traffic

To list the configuration information for the IRIX GSN network interface (gsn#) that handles ST traffic, use this command:

% netstat -ina

To display protocol statistics for the ST protocol stack, use this command:

% netstat -p stp

Checking Status for All Protocols Simultaneously

To simultaneously display statistics for the IRIX GSN network interface, and traffic for all the protocol stacks, use this command:

% netstat -C

Use the numbers (highlighted in the menu area on the bottom of the display) to select the protocol or information you wish to view. Notice that there are two menus; press the 9 key on your keyboard to toggle between them. The first menu has Interfaces and IP; the second menu has ST (listed in the menu as STP), TCP, and UDP.

Use these letters to change the time period over which the displayed statistics have been counted:

r 

displays totals collected (counted) since the last reset of the interface or operating system

z 

resets all statistics to zero and starts counting

d 

resets all statistics to zero every second and starts counting again

Build New HARP Server Addresses Into Network Interfaces

To reconfigure the GSN network, bring the interface(s) that you are working with down and then up again, as follows:

# ifconfig gsn# down
# ifconfig gsn# up

The HARP daemon will be informed of the device's changes in state and the daemon will configure the HARP tables accordingly.

Display Client's Active HARP Server

Use the following command line to display the address that IRIX HARP is currently using to contact the HARP server (marked “Primary”), all other registered HARP servers (marked “Alternate”), and any servers that are not registered (marked “Not Registered”).

% gsnarp -av 

Output is as follows:

HRAL for gsn0:
         00:10:3b:ff:00:25  Primary
         00:10:3b:ff:00:26  Alternate
HRAL for gsn1:
         00:10:3b:ff:01:25  Primary
         00:10:3b:ff:01:26  Alternate
IP address      ULA                 if    Resolves  Flags 
iridium-g0      00:10:3b:ff:00:28   gsn0      9028  LOCAL 
iridium-g0st    00:10:3b:ff:00:28   gsn0      9028  LOCAL 
iridium-g1      00:10:3b:ff:01:28   gsn1       390  LOCAL 
iridium-g1st    00:10:3b:ff:01:28   gsn1       390  LOCAL 
harp2-g1        00:10:3b:ff:01:26   gsn1      2994  SERVER 
harp2-g1st      00:10:3b:ff:01:26   gsn1      2994  SERVER 
harp1-g1        00:10:3b:ff:01:25   gsn1     71331  PRIMARY_SERVER 
harp1-g1st      00:10:3b:ff:01:25   gsn1     71331  PRIMARY_SERVER 
harp2-g0        00:10:3b:ff:00:26   gsn0       189  SERVER 
harp2-g0st      00:10:3b:ff:00:26   gsn0       189  SERVER 
harp1-g0        00:10:3b:ff:00:25   gsn0     10027  PRIMARY_SERVER 
harp1-g0st      00:10:3b:ff:00:25   gsn0     10027  PRIMARY_SERVER 
ruby-g1         00:10:3b:ff:01:13   gsn1        46 
ruby-g0         00:10:3b:ff:00:13   gsn0        45 
uranium-g1      00:10:3b:ff:01:27   gsn1      9488 
uranium-g0      00:10:3b:ff:00:27   gsn0         9  PERM 
copper-g0       00:10:3b:ff:00:08   gsn0      2492 

The gsnarp(1) man page provides a full example of the output from the gsnarp -av command.
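
When the output is long, it can help to check mechanically that every interface has a Primary server. The Python sketch below parses text laid out like the sample above; it assumes that layout holds (an HRAL block per interface, then a table under the IP address header), and the function names are hypothetical. The sample is abbreviated from the display above and edited so that gsn1 has no Primary.

```python
def parse_gsnarp_av(text):
    """Split gsnarp -av style text into HRAL lists and HARP table rows."""
    hral = {}          # interface name -> list of (ula, role)
    entries = []       # one dict per HARP table row
    current_if = None
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "HRAL":
            current_if = fields[2].rstrip(":")       # "gsn0:" -> "gsn0"
            hral[current_if] = []
        elif fields[:2] == ["IP", "address"]:
            in_table = True                          # header row of the table
        elif in_table:
            entries.append({"host": fields[0], "ula": fields[1],
                            "if": fields[2], "resolves": int(fields[3]),
                            "flags": fields[4:]})
        elif current_if is not None:
            # Role may be two words, for example "Not Registered".
            hral[current_if].append((fields[0], " ".join(fields[1:])))
    return hral, entries


def interfaces_without_primary(hral):
    """Return interfaces whose HRAL has no server marked Primary."""
    return [name for name, servers in hral.items()
            if not any(role == "Primary" for _, role in servers)]


SAMPLE = """\
HRAL for gsn0:
         00:10:3b:ff:00:25  Primary
         00:10:3b:ff:00:26  Alternate
HRAL for gsn1:
         00:10:3b:ff:01:26  Alternate
IP address      ULA                 if    Resolves  Flags
iridium-g0      00:10:3b:ff:00:28   gsn0      9028  LOCAL
ruby-g1         00:10:3b:ff:01:13   gsn1        46
"""

hral, entries = parse_gsnarp_av(SAMPLE)
print(interfaces_without_primary(hral))   # ['gsn1']
```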

Display Address Resolution Lookup Table

Use the following command line to display the HARP table that is currently loaded into memory and being used by IRIX HARP. This table maps INET addresses to physical/hardware addresses (ULAs):

# gsnarp -a 

Output is as follows:

IP address                 ULA                Flags
iron-g1.private            00:10:3b:ff:01:31  
iron-g0.private            00:10:3b:ff:00:31  
graphite-g1.private        00:10:3b:ff:00:1a  
copper-g1.private          00:10:3b:ff:01:08  
copper-g0.private          00:10:3b:ff:00:08  
quartz-g1.private          00:10:3b:ff:01:07  
harp2-g1.private           00:10:3b:ff:01:26  SERVER 
harp2-g1st.private         00:10:3b:ff:01:26  SERVER 
harp1-g1.private           00:10:3b:ff:01:25  PRIMARY_SERVER 
harp1-g1st.private         00:10:3b:ff:01:25  PRIMARY_SERVER 
harp2-g0.private           00:10:3b:ff:00:26  SERVER 
harp2-g0st.private         00:10:3b:ff:00:26  SERVER 
harp1-g0.private           00:10:3b:ff:00:25  PRIMARY_SERVER 
harp1-g0st.private         00:10:3b:ff:00:25  PRIMARY_SERVER 
ruby-g0.private            00:10:3b:ff:00:13  LOCAL 
ruby-g0st.private          00:10:3b:ff:00:13  LOCAL 
mercury-g0.private         00:10:3b:ff:00:2b  
ruby-g1.private            00:10:3b:ff:01:13  LOCAL 
ruby-g1st.private          00:10:3b:ff:01:13  LOCAL 
lead-g0.private            00:10:3b:ff:00:29  
quartz-g0.private          00:10:3b:ff:00:07  
iridium-g0.private         00:10:3b:ff:00:28  
iridium-g1.private         00:10:3b:ff:01:28  
uranium-g1.private         00:10:3b:ff:01:27  
uranium-g0.private         00:10:3b:ff:00:27  
lead-g1.private            00:10:3b:ff:01:29  

Flag descriptions are as follows:

Flag 

Description 

LOCAL 

Entry is the local host.

PRIMARY_SERVER 

Entry is for the HARP server currently in use.

SERVER 

Entry is for an alternate HARP server. Should the primary server fail to respond to requests, the host will select a different primary from among the alternate servers.

PERM 

Entry is permanent (static). Entry does not get revalidated, and cannot be changed by received HARP packets.

PERM_ULA 

Entry is for a nonremovable ULA, whose INET address is determined whenever the interface is activated by means of an InARP request. (For a full description, see the harpd(1) man page.)

PENDING 

Entry is incomplete and is awaiting a reply from the HARP server. (Entries should exist in this state for only a very short time.)

Installing a Loopback Device

To run a loopback verification test on an SGI GSN board, use one of the procedures described below to set up the loopback:

  1. Disable the IRIX GSN network interface:

    % ifconfig gsn# down 
    

    where # identifies the network interface you are going to verify.

  2. Use one of these methods for installing a loopback path:

    • Attach a loopback device to the HIPPI–6400 port on the board's I/O panel plate. The connector is keyed with a long and a short side to ensure correct alignment, as illustrated in Figure 3-2.

      Figure 3-2. Installing a HIPPI-6400 Loopback Connector


    • At the switch, configure the attached input port so that it loops back all traffic to the same port's output. For example, you might need to make all the entries in the port's routing (forwarding) table point to this port.

    • Configure the board for internal loopback, as described in “Configure the SGI GSN Board for On–board (Internal) Loopback Operation”.


      Note: With internal board loopback, the HIPPI-6400 connector and the GSN component on the board are not verified during the verification procedures.


  3. Enable the network interface:

    % ifconfig gsn# up 
    

    where # is the interface you disabled.


    Note: When you remove the loopback device or link and make a new connection (point-to-point or to a switch), you must invoke ifconfig down to disable, then ifconfig up to enable the network interface.


Verifying the IRIX GSN Subsystem

The most reliable method for verifying an IRIX GSN subsystem is to install a loopback device or looped back link (as described in “Installing a Loopback Device”), then run the gsntest hardware verification test, as described below. After the GSN hardware has been verified, further upper-layer verification and interconnectivity tests can be run (for example, the tests described under the headings “Verify an IP-over-GSN Interface”) by communicating with other GSN systems.


Note: Unlike many IRIX drivers, the IRIX GSN driver does not automatically route self-addressed packets through the local loopback interface (lo0), so that even the IP stack can be verified with the loopback link in place.


Verify That the Board Has Been Located by the Software

To verify that an SGI GSN board has been located by the operating system during the last reboot, use any of the following commands:

% hinv -d gsn 
GSN #-XIO adapter: unit #, in module # I/O slot # 
% hinv -mvv -d gsn 
. . .
Location: /hw/module/#/slot/io#/xio_gsn
 XIO_GSN Board: barcode ######     part 030-1361-00# rev # 
 Group ff Capability ffffffff Variety ff Laser 0000002adfaa 
. . . 
GSN #-XIO adapter: unit #, in module # I/O slot #  

% find /hw/module -name xio_gsn 
/hw/module/#/slot/io#/xio_gsn 

Each GSN solution may have multiple full-path entries in the IRIX hardware graph. For example, in the sample display shown below, the startup routine located two SGI GSN products installed into two different modules; one instance is the two-board product while the other is the single-board product. With the two-board product, the main GSN board is always identified as XIO port 1, and the additional board is XIO port 2.

% hinv -d gsn 
GSN 1-XIO adapter: unit 0, in module 1 I/O slot 8
GSN 2-XIO adapter: unit 1, 
   XIO port 1 in module 2 I/O slot 4 
   XIO port 2 in module 2 I/O slot 12 
% find /hw/module -name xio_gsn 
/hw/module/1/slot/io8/xio_gsn
/hw/module/2/slot/io12/xio_gsn
/hw/module/2/slot/io4/xio_gsn
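
To relate the find output to modules and slots, the paths can be grouped by module number. The Python sketch below does this for paths of the form shown above; the function name gsn_slots is hypothetical, and the path pattern is taken only from this sample display.

```python
import re

# Pattern taken from the sample paths above: /hw/module/<m>/slot/io<n>/xio_gsn
PATH_RE = re.compile(r"^/hw/module/(\d+)/slot/io(\d+)/xio_gsn$")


def gsn_slots(find_output):
    """Map each module number to the sorted I/O slots that hold an xio_gsn entry."""
    slots = {}
    for line in find_output.splitlines():
        match = PATH_RE.match(line.strip())
        if match:
            module, slot = int(match.group(1)), int(match.group(2))
            slots.setdefault(module, []).append(slot)
    return {module: sorted(found) for module, found in slots.items()}


sample = """/hw/module/1/slot/io8/xio_gsn
/hw/module/2/slot/io12/xio_gsn
/hw/module/2/slot/io4/xio_gsn
"""
print(gsn_slots(sample))   # {1: [8], 2: [4, 12]}
```

In this sample, module 2 has two entries, which matches the two-board product reported by hinv.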

Verify the SGI GSN Hardware

To verify the main SGI GSN board (without going through the IP or ST stack), use the /usr/etc/gsntest command. This test works only for an SGI GSN board that has a loopback device installed. (See “Installing a Loopback Device” for instructions.) The command requires the user to be superuser (root).


Note: Unlike many IRIX drivers, the IRIX GSN driver does not automatically route self-addressed IP/ST packets through the network stack's loopback interface (lo0); the GSN driver passes self-addressed packets to the hardware. This allows you to use standard IP utilities (such as ping <myaddress>) to test the IRIX GSN subsystem when it is physically looped back.

For a simple, quick verification test, use the commands below:

% cd /usr/etc 
% su 
Password: thepassword 
# hinv -d gsn 
<use the displayed unit number for # in the following command lines> 
# /usr/etc/gsncntl gsn# status device 
<verify that the STATE and FLAGS are correct as described in Table 3-3>

# /usr/etc/gsntest gsn# 
gsntest: PING 
GSN PING hop 0: Received ping cmd/response from element in 150.40 us
GSN PING hop 1: Received ping cmd/response from element in 117.60 us
GSN PING hop 2: Received ping cmd/response from element in 1707.20 us
<refer to Figure 3-1 to identify the physical element
associated with each hop #>

The gsntest utility sends HIPPI-6400 Admin “PING” micropackets. The command creates packets with the following nonconfigurable characteristics:

Key 

0

Hop Count 

0

Dest_Reg 

0

Dest_Addr 

0xFFFFFFFF

Command 

HIPPI–6400–SC Ping

Status_Flags 

0

Src_Reg 

0

Src_Addr 

Local interface's ULA obtained from switch, or if not available, as read from PROM.

Data_Reg 

Bytes 0-3 contain the PID of the process. Bytes 4-15 contain randomly generated data.
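
As an illustration of the Data_Reg layout only, the Python sketch below builds a 16-byte value with the PID in bytes 0-3 and random data in bytes 4-15. It is a model of the description above, not the gsntest source; the function name is hypothetical, and the big-endian byte order is an assumption because the text does not state one.

```python
import os
import struct


def build_data_reg(pid=None):
    """Return 16 bytes: PID in bytes 0-3, random data in bytes 4-15."""
    if pid is None:
        pid = os.getpid()
    # Assumption: 32-bit unsigned, big-endian. The manual gives no byte order.
    header = struct.pack(">I", pid & 0xFFFFFFFF)
    return header + os.urandom(12)


data_reg = build_data_reg(1234)
print(len(data_reg))                          # 16
print(struct.unpack(">I", data_reg[:4])[0])   # 1234
```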

Example:

This example illustrates the test when the local SGI GSN port is connected to a switch:

% /usr/etc/gsntest gsn# 
gsntest: PING
GSN PING hop 0: Received ping cmd/response from element in 148.80 us
GSN PING hop 1: Received ping cmd/response from element in 131.20 us
GSN PING hop 2: Received ping cmd/response from element in 13844.80 us


Note: The hop 1 response came from the switch while the hop 2 response came from the remote endpoint.

This example illustrates the test when a loopback device is installed on the local SGI GSN port:

% /usr/etc/gsntest gsn# 
gsntest: PING
GSN PING hop 0: Received ping cmd/response from element in 140.20 us
GSN PING hop 1: Received ping cmd/response from element in 148.80 us
GSN PING hop 2: Received ping cmd/response from element in xx us


Note: Hop 0 came from the local SuMAC's source logic. The hop 1 response came from the destination logic (on the “other side” of the external loopback device) within the local SuMAC ASIC, while the hop 2 response came from the local IRIX GSN driver.
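
When gsntest output is captured to a file, the per-hop times can be extracted for comparison between runs. The Python sketch below parses lines in the format shown in the examples above; the function name is hypothetical, and the line format is assumed to match these samples exactly.

```python
import re

HOP_RE = re.compile(
    r"^GSN PING hop (\d+): Received ping cmd/response from element in ([\d.]+) us$")


def parse_gsntest(text):
    """Return a dict mapping hop number to response time in microseconds."""
    hops = {}
    for line in text.splitlines():
        match = HOP_RE.match(line.strip())
        if match:
            hops[int(match.group(1))] = float(match.group(2))
    return hops


sample = """gsntest: PING
GSN PING hop 0: Received ping cmd/response from element in 148.80 us
GSN PING hop 1: Received ping cmd/response from element in 131.20 us
GSN PING hop 2: Received ping cmd/response from element in 13844.80 us
"""
hops = parse_gsntest(sample)
missing = [hop for hop in range(3) if hop not in hops]
print(missing)                      # []
print(max(hops, key=hops.get))      # 2
```

A hop that is absent from the result did not produce a response line in the captured text.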

The gsntest utility does not verify the data path through the second XIO slot of a two-board GSN solution. If the gsntest utility fails with an error message, locate the error message in the section “Error Messages from the gsntest Command” in Chapter 4 and follow the instructions.

Verify an IP-over-GSN Interface

To verify that each IP-over-GSN network interface is functional, follow the instructions in this section. This test assumes that the IRIX GSN subsystem has passed the gsntest verification, as described under the heading “Verify the SGI GSN Hardware”.


Note: Unlike many network products, the IRIX GSN software does not loop IP packets through the system's local loopback interface (lo0). All IP-over-GSN packets are passed to the GSN hardware.

To accomplish this verification, use /usr/etc/ping -r (lowercase -r, not -R) to make this station communicate with another GSN IP station (or itself) over the IRIX GSN subsystem.

  1. Obtain the IP network addresses for all the IP–over–GSN interfaces (gsn#) on this system. This information can be displayed with the command shown below. The network address is listed in the column labeled Network, as illustrated in Figure 3-3.

    % /usr/etc/netstat -ina
    

    Figure 3-3. The /usr/etc/netstat -ina Display


  2. Obtain the name (or IP address) of at least one remote station on each of these GSN network addresses. Two methods for obtaining station names are described below.

    • For a system connected to a local area network that provides name lookup service (NIS), use the commands below to create a file for each GSN network connection. Each file will contain the names and addresses of stations that share a particular network address:

      % ypcat hosts | grep gsn0_networkaddress > gsn0.s
      % ypcat hosts | grep gsn1_networkaddress > gsn1.s
      <do this for each GSN IP network address> 
      

      where each gsn#_networkaddress value is an address from the Network column of the netstat display (illustrated in Figure 3-3).

      Example:

      % ypcat hosts | grep 253.5.88 > gsn0.s
      

    • For a system that does not have access to NIS, use these commands to create a file for each network connection. Each file will contain the locally-known names and addresses of stations that share a particular network address:

      % grep gsn0_networkaddress /etc/hosts > gsn0.s
      % grep gsn1_networkaddress /etc/hosts > gsn1.s
      <do this for each IP–over–GSN network address>
      

      Example:

      % grep 253.5.88 /etc/hosts > gsn1.s
      

  3. Communicate with one station on the GSN network used by the gsn0 connection. For the variable gsn0_station, you can use any of the names or IP addresses from the gsn0.s file.

    % ping -r gsn0_station
    PING stationname (IPaddress): 56 data bytes 
    64 bytes from . . . time=x ms . . .
    <Ctrl><c>
    ----stationname PING Statistics----
    # packets trans,# pckts rcvd, x% packet loss
    


    Note: If a loopback link is in place, use the system's own IP address for the gsn0_station variable.


  4. If netstat lists more than one IRIX GSN (gsn#) network interface, communicate with one station on each of those networks. For the variable gsn#_station, you can use any of the names from the gsn#.s file.

    % ping -r gsn#_station
    PING stationname (IPaddress): 56 data bytes
    64 bytes from . . . time=x ms
    . . .
    <Ctrl><c>
    ----stationname PING Statistics----
    # packets trans, # pckts rcvd, x% packet loss
    


    Note: If a loopback link is in place on any of the ports, use the system's own IP address for the gsn#_station variable.


  5. If one ping on each network succeeds, you have completed the verification procedure. All the local network connections are functioning. Use the commands below to remove the files with the lists of stations:

    % rm gsn0.s
    % rm gsn1.s
    

    Run this command for each gsn#.s file that you created.

    If the ping on a network fails, follow the instructions in “Troubleshoot IP-over-GSN Interfaces” in the next section.

  6. To verify the throughput for a functional network interface, open an IRIX shell window for the local station and another one for a remote station. Then, invoke the following commands:

    In the IRIX shell for station #1, the receiver:
    % /usr/etc/ttcp -s -r -l49152 -b1048576

    In the IRIX shell for station #2, the transmitter:
    % /usr/etc/ttcp -s -t -l49152 -b1048576 -n262144 station1_IPaddress

    
    


    Note: In the previous example, the -l option is a lowercase letter L; it specifies the buffer length.
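
As a worked example of the transfer size implied by the transmitter command in step 6, the Python sketch below multiplies the buffer length (-l) by the number of buffers (-n), reading the options as they are defined in common ttcp versions. The throughput helper and its name are assumptions; the elapsed time must come from your own ttcp run.

```python
buffer_len = 49152       # -l49152: bytes per buffer
num_buffers = 262144     # -n262144: buffers written by the transmitter

total_bytes = buffer_len * num_buffers
print(total_bytes)               # 12884901888
print(total_bytes / 2**30)       # 12.0 (GiB)


def throughput_mbytes_per_s(nbytes, seconds):
    """Return throughput in units of 10**6 bytes per second."""
    return nbytes / seconds / 1e6
```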

     

Troubleshooting

This section provides basic procedures for troubleshooting IRIX GSN.

Troubleshoot SGI GSN Hardware

See the SGI GSN Hardware Owner's Guide.

If the gsntest utility fails with an error message, locate the error message in “Error Messages from the gsntest Command” in Chapter 4 and follow the instructions.

Troubleshoot IP-over-GSN Interfaces

This section describes separate sets of instructions for troubleshooting a system in which all the IP–over–GSN network interfaces are failing, and a system where only some of the network interfaces are failing.

All IP-over-GSN Network Interfaces Are Failing

If the ping verification tests fail for all the system's IP–over–GSN network connections, your system probably has been configured incorrectly. Verify the configuration by performing the steps below.


Note: This procedure assumes that the hardware has been verified by following the instructions in “Verify the SGI GSN Hardware” and has been found to function correctly.


  1. Use this command to verify that the local GSN board is not in loopback mode and that the physical link is operational:

    % gsncntl gsn# status device
    STATE: LNK_RDY
    FLAGS: LNK_SWITCH
    

  2. If the system is connected to a switch, use gsncntl to verify that the switch has assigned a ULA to the local system. The assigned address is the one labeled Device.

    % gsncntl gsn# getmac
    PROM ULA Address:   08:00:69:05:0d:2c4
    Device ULA Address:  00:01:3b:ff:00:04
    

  3. Verify that IP networking is enabled with the following command line:

    % /sbin/chkconfig | grep network
    network  on
    

  4. Use /usr/etc/netstat -ina to verify that the local IP–over–GSN network interfaces have been configured and enabled. The display should look similar to that shown in Figure 3-3.

    Refer to the online IRIX Admin:Networking and Mail guide for information about configuring and troubleshooting IP network interfaces.

  5. Use gsnarp -av to verify that the address resolution mechanism is functioning. Check that a primary HARP server is listed for each local GSN network interface and verify that the HARP table has correct entries for the remote system(s) that you tried to ping.

    # /usr/etc/gsnarp -av 
    HRAL for gsn0:
             00:10:3b:ff:00:25  Primary
             00:10:3b:ff:00:26  Alternate
    HRAL for gsn1:
             00:10:3b:ff:01:25  Primary
             00:10:3b:ff:01:26  Alternate
    IP address      ULA                 if    Resolves  Flags 
    iridium-g0      00:10:3b:ff:00:28   gsn0      9028  LOCAL 
    iridium-g0st    00:10:3b:ff:00:28   gsn0      9028  LOCAL 
    iridium-g1      00:10:3b:ff:01:28   gsn1       390  LOCAL 
    iridium-g1st    00:10:3b:ff:01:28   gsn1       390  LOCAL 
    harp2-g1        00:10:3b:ff:01:26   gsn1      2994  SERVER 
    harp2-g1st      00:10:3b:ff:01:26   gsn1      2994  SERVER 
    harp1-g1        00:10:3b:ff:01:25   gsn1     71331  PRIMARY_SERVER 
    harp1-g1st      00:10:3b:ff:01:25   gsn1     71331  PRIMARY_SERVER 
    harp2-g0        00:10:3b:ff:00:26   gsn0       189  SERVER 
    harp2-g0st      00:10:3b:ff:00:26   gsn0       189  SERVER 
    harp1-g0        00:10:3b:ff:00:25   gsn0     10027  PRIMARY_SERVER 
    harp1-g0st      00:10:3b:ff:00:25   gsn0     10027  PRIMARY_SERVER 
    ruby-g1         00:10:3b:ff:01:13   gsn1        46 
    ruby-g0         00:10:3b:ff:00:13   gsn0        45 
    uranium-g1      00:10:3b:ff:01:27   gsn1      9488 
    uranium-g0      00:10:3b:ff:00:27   gsn0         9  PERM 
    copper-g0       00:10:3b:ff:00:08   gsn0      2492 
    

  6. Use ttcp to verify that the local IP protocol stack is functioning:

    % ttcp -r -s & 
    % ttcp -t -s IPaddress 
    

    where IPaddress is the INET address (as displayed by the netstat command) for the local IP-over-GSN network interface.

  7. Verify that the remote system you tried to ping is operational.

Some IP-over-GSN Network Interfaces Are Failing

If the ping verification tests succeed for one GSN network connection, but others fail, the local IP stack is functioning, but one (or more) specific interface has a problem. To resolve the problem, follow the instructions below for each problematic network connection.

  1. Make sure that you know which IRIX GSN port is associated with the GSN network interface (gsn#) that you are troubleshooting.

  2. Use gsnarp -av to verify that a primary HARP server is listed for the problematic interface (Dev) and to verify that the HARP table has a correct entry for the remote system you tried to ping.

    # /usr/etc/gsnarp -av
    

    The output for this command is shown in step 5 in “All IP-over-GSN Network Interfaces Are Failing”.

  3. Verify that the other endpoint (IP host) is operational.

    Or, as an alternative, select a different station in this LIS, and use the ping -r command with the station's numerical address (instead of the name). If the ping works, the network connection is functional. If the ping fails, proceed to the next step.

  4. Verify that the network portion (leftmost digits) of each address you are attempting to ping matches the network address for the GSN interface you are troubleshooting. The network address for each GSN network interface can be displayed by the /usr/etc/netstat -in command.

  5. Check all of the cables and nodes between this system and the system(s) you tried to ping. “Verify the SGI GSN Hardware” describes how to do this for the local SGI GSN port and its link.
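
The network-portion comparison in step 4 can be sketched in Python with the standard ipaddress module. The network 253.5.88 is the example used earlier in this chapter; the 24-bit prefix length and the host addresses below are assumptions for illustration, so substitute the values that apply to your interface.

```python
import ipaddress


def on_interface_network(target, network):
    """Return True if the target address lies inside the given network."""
    return ipaddress.ip_address(target) in ipaddress.ip_network(network)


# Hypothetical hosts; the /24 mask is an assumption.
print(on_interface_network("253.5.88.17", "253.5.88.0/24"))   # True
print(on_interface_network("253.5.89.17", "253.5.88.0/24"))   # False
```

A result of False means the address belongs to a different network than the interface being checked, so a ping to it does not exercise that interface.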