Monitoring Data Flow Between Coservers

If the onstat -g xqs output shows data skew, use ISA or onstat -g xmf to monitor the high-speed interconnect between coserver nodes for possible bottlenecks.

Figure 19. Selected onstat -g xmf Output

XMF Information 
--------------- 
  Cosvr_id: 1    Domain_Cnt: 1  

  Poll Information: 

    Domain_ID  Interval   Current    Average    Cycle      Wk_Cycles  In_DG/Sig
    -----------------------------------------------------------------------
    0          10         10         10         621918498  113281   N / N  

  ...

  Coserver Information: 

    ID  X_Msgs  X_Bytes  R_Msgs  R_Bytes  X_Rtrns  R_Dupls  XOffs  XO_Cycls
    -----------------------------------------------------------------------
    1     1529  1109420    1529  1109420        0        0      0         0
    2     1244   323321  111658  1399806        3        4      0         0
    3      376   315612     334    57054       11      173      0         0

Follow these general guidelines:

Check the overall statistics in Coserver Information, which is the last section of the onstat -g xmf output.
Coserver Information displays the following statistics by coserver. The values in these fields indicate whether traffic between coservers is balanced or skewed toward any one coserver:
- The X_Msgs and X_Bytes fields display the number of messages and bytes that each coserver transmits.
- The R_Msgs and R_Bytes fields display the number of messages and bytes that each coserver receives.
- The X_Rtrns field displays the number of retransmits to this coserver.
- The R_Dupls field displays the number of duplicate messages that were received from this coserver.
Because some high-speed interconnects have limited kernel buffer space for each connection, their buffer space might be exhausted when interconnect traffic is heavy. When the buffer space is exhausted, some packets are dropped and must be retransmitted.

If you see a large number of retransmits (X_Rtrns) together with a low number of duplicate messages received (R_Dupls), packets are being transmitted but are not arriving at the remote end. In that case, consider adjusting the setting of the SENDEPDS configuration parameter. Otherwise, it is usually not worth altering this parameter. For more information about the SENDEPDS parameter, consult your machine notes file.
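The checks on Coserver Information can be sketched as a small Python script. This is only an illustration, not part of the database server or its utilities: the names `CoserverStats`, `parse_coserver_rows`, `byte_shares`, and `retransmit_suspects` are hypothetical, and the thresholds `min_rtrns` and `max_dupl_ratio` are arbitrary assumptions, because this guide does not define what counts as a "large" or "low" number. The parser assumes that it receives only the Coserver Information section and that each data row holds nine whitespace-separated integers in the column order of Figure 19.

```python
from dataclasses import dataclass


@dataclass
class CoserverStats:
    """One data row of the Coserver Information section (first seven columns)."""
    cosvr_id: int
    x_msgs: int
    x_bytes: int
    r_msgs: int
    r_bytes: int
    x_rtrns: int
    r_dupls: int


def parse_coserver_rows(text):
    """Parse data rows; skip the header, the separator, and any other lines.

    Assumption: a data row is exactly nine whitespace-separated integers.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 9 and all(f.isdigit() for f in fields):
            rows.append(CoserverStats(*(int(f) for f in fields[:7])))
    return rows


def byte_shares(rows):
    """Return each coserver's share of all received bytes (0.0 to 1.0)."""
    total = sum(r.r_bytes for r in rows)
    return {r.cosvr_id: (r.r_bytes / total if total else 0.0) for r in rows}


def retransmit_suspects(rows, min_rtrns=100, max_dupl_ratio=0.1):
    """Flag coservers with many retransmits but few duplicates received.

    Both thresholds are hypothetical; tune them for your interconnect.
    """
    return [
        r.cosvr_id
        for r in rows
        if r.x_rtrns >= min_rtrns and r.r_dupls <= max_dupl_ratio * r.x_rtrns
    ]


# Sample rows copied from Figure 19.
SAMPLE = """\
    ID  X_Msgs  X_Bytes    R_Msgs  R_Bytes   X_Rtrns R_Dupls XOffs XO_Cycls
    -----------------------------------------------------------------------
    1      1529    1109420      1529     1109420       0        0     0   0
    2      1244    323321     111658     1399806       3        4     0   0
    3      376     315612        334       57054      11      173     0   0
"""

rows = parse_coserver_rows(SAMPLE)
for cosvr_id, share in byte_shares(rows).items():
    print(f"coserver {cosvr_id}: {share:.1%} of received bytes")
print("retransmit suspects:", retransmit_suspects(rows))
```

With the Figure 19 values, no coserver reaches the assumed retransmit threshold, so the list of suspects is empty. The byte shares give a rough view of balance only; whether an uneven share indicates harmful skew depends on the workload.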
Check Poll Information in the first part of the onstat -g xmf output:
- The Average field displays the average poll-time interval.
- The Cycle field displays the number of polls.
- The Wk_Cycles field displays the number of polls that have resulted in work.
The values in the Average, Cycle, and Wk_Cycles fields indicate how often the database server checks the interconnect and how often those checks find work to do. The percentage of polls that result in work is calculated by the following formula:
```
percent_poll_work = (Wk_Cycles / Cycle) * 100
```
A low percentage by itself does not indicate a problem. However, if a query is taking a long time to complete and the percentage of polls that result in work is low, a problem exists somewhere. The problem might be that a coserver is down, a data skew exists, or the fragmentation strategy is incorrect. To locate the problem, check the status of the nodes and coservers, the statistics in the onstat -g dfm output, and other onstat utility output.
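As a worked example, the following Python sketch applies the poll-work calculation to the Cycle and Wk_Cycles values shown in Figure 19. The function name `percent_poll_work` mirrors the formula; the function is a hypothetical helper, not a database server utility. It multiplies the ratio Wk_Cycles / Cycle by 100 so that the result reads as a percentage.

```python
def percent_poll_work(wk_cycles, cycle):
    """Return the percentage of polls that resulted in work."""
    if cycle <= 0:
        raise ValueError("Cycle must be a positive number of polls")
    return 100.0 * wk_cycles / cycle


# Values from the Poll Information row in Figure 19.
pct = percent_poll_work(wk_cycles=113281, cycle=621918498)
print(f"{pct:.4f}% of polls resulted in work")  # prints 0.0182%
```

For the sample output, roughly 0.018 percent of polls found work. As noted above, such a low value matters only in combination with a long-running query.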