Troubleshooting IBM MQ
Documentation
The MQ library has links to documentation for all versions of MQ: http://www-01.ibm.com/software/integration/wmq/library/index.html
Basic Display Commands
dspmqinst lists the MQ installations on the machine
dspmqver shows the MQ version and patch level
dspmq lists queue managers on the local machine, and the status of each one
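When the status of several queue managers needs to be checked from a script, the dspmq output can be parsed. The following is a minimal sketch, assuming each output line has the form QMNAME(name) STATUS(status); the function name parse_dspmq and the sample text are hypothetical, not captured from a real system.

```python
import re

# Assumed line shape: QMNAME(<name>)   STATUS(<status>)
DSPMQ_LINE = re.compile(r"QMNAME\((?P<name>[^)]*)\)\s+STATUS\((?P<status>[^)]*)\)")


def parse_dspmq(output):
    """Map each queue manager name to its reported status."""
    return {m.group("name"): m.group("status") for m in DSPMQ_LINE.finditer(output)}


# Hypothetical sample text, for illustration only.
sample = (
    "QMNAME(QM_1)        STATUS(Running)\n"
    "QMNAME(QM_2)        STATUS(Ended normally)\n"
)
for name, status in parse_dspmq(sample).items():
    print(name, "->", status)
```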
Multiple Installations of MQ on the Same Machine
Starting with MQ v7.1, it is possible to install multiple copies of MQ (of the same or different version) on a single machine. (Prior to this, MQ directory pathnames had been hard-coded so it was not possible to install more than one copy of MQ.) Each separate instance of MQ on a machine is referred to as an "installation," and you can choose where in the filesystem each installation should be based, subject to a few restrictions. All installations still share a common MQ data directory tree -- it is only the MQ binaries which are kept separate for different installations. On Unix-like systems, the /etc/opt/mqm/mqinst.ini file lists the currently-existing installations, and the directory path to each one. The dspmqinst command also lists the installations on a machine.
The setmqenv command sets all appropriate environment variables to point to a particular installation, which determines which of the multiple MQ installations on a machine is referenced when you issue other MQ commands. For example, ". /opt/mqm/bin/setmqenv -s" on a Linux machine sets the MQ environment variables to refer to the copy of MQ that lives in /opt/mqm. If you are having problems with "command not found" errors or the like, you may need to issue the setmqenv command. Each queue manager is associated with a particular installation, so you may also need to issue setmqenv if you get errors saying that your queue manager is associated with a different installation.
Log Files
There is a "high-level" errors directory at the top of the MQ tree, and each queue manager also has its own errors directory. The high-level errors directory has messages that do not pertain to a specific queue manager. Note that the high-level MQ directory named "log" contains transaction logs, not error logs.
Unix default locations: /var/mqm/errors and /var/mqm/qmgrs/<QM_NAME>/errors
Windows prior to MQ v8.0: \Program Files\IBM\WebSphere MQ\errors and \Program Files\IBM\WebSphere MQ\qmgrs\<QM_NAME>\errors
Windows, MQ v8.0: C:\ProgramData\IBM\MQ\errors and C:\ProgramData\IBM\MQ\qmgrs\<QM_NAME>\errors; note that the C:\ProgramData directory is typically a "hidden" directory
Each "errors" directory always contains exactly 3 log files: AMQERR01.LOG, AMQERR02.LOG, and AMQERR03.LOG
MQ automatically rolls the log files, so AMQERR01.LOG is always most recent
Maximum size can be controlled via ErrorLogSize in the QMErrorLog stanza of qm.ini on Unix, or via MQ Explorer on Windows (queue manager Properties > Extended)
Application event log on Windows also contains MQ events
Location of error logs on all MQ platforms: http://www-01.ibm.com/support/docview.wss?uid=swg21172370
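The default locations listed above can be captured in a small lookup helper. This is only a sketch: the function name error_log_dirs is hypothetical, the paths are the defaults quoted above (a customized data directory will differ), and treating every MQ version from 8.0 onward like v8.0 on Windows is an assumption.

```python
ERROR_LOG_NAMES = ["AMQERR01.LOG", "AMQERR02.LOG", "AMQERR03.LOG"]


def error_log_dirs(platform, qmgr_name, mq_version=8.0):
    """Return (top-level errors dir, queue manager errors dir) as strings."""
    if platform == "unix":
        base, sep = "/var/mqm", "/"
    elif platform == "windows":
        # Assumption: versions after 8.0 use the same layout as 8.0.
        if mq_version < 8.0:
            base = r"\Program Files\IBM\WebSphere MQ"
        else:
            base = r"C:\ProgramData\IBM\MQ"
        sep = "\\"
    else:
        raise ValueError("platform must be 'unix' or 'windows'")
    top_level = sep.join([base, "errors"])
    per_qmgr = sep.join([base, "qmgrs", qmgr_name, "errors"])
    return top_level, per_qmgr


top_level, per_qmgr = error_log_dirs("unix", "QM_1")
# AMQERR01.LOG is the most recent file, so it is usually the first one to read.
print(per_qmgr + "/" + ERROR_LOG_NAMES[0])
```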
Reason Codes and Error Messages
The mqrc command can decode a 4-digit MQ reason code, for example: mqrc 2035
Understanding common MQ reason codes: http://www-01.ibm.com/support/docview.wss?uid=swg21167821
Common MQ error messages (AMQxxxx codes) and most likely causes: http://www-1.ibm.com/support/docview.wss?uid=swg21265188
2007 MQ Problem Determination presentation: http://www-01.ibm.com/support/docview.wss?uid=swg27009878
First-Failure Support Technology (FFST), First-Failure Data Capture (FDC)
Intended to log enough information about unexpected events (not routine MQ errors) that the problem can be resolved without further recreation and tracing.
Located in the top-level errors directory, plain text format, never purged by MQ.
Named like AMQnnnnn.x.FDC
Probe severity: 1 = Critical, 2 = Error, 3 = Warning, 4 = Informational
Issue the ffstsummary command from the errors directory to get a summary listing
IBM Hursley lab article on FFST files: https://hursleyonwmq.wordpress.com/2007/05/04/introduction-to-ffsts/
Tech note on FFST files: http://www-01.ibm.com/support/docview.wss?uid=swg21304647
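When an errors directory holds many FDC files, it can help to sort them by name and translate probe severities. The sketch below assumes that, in the AMQnnnnn.x.FDC pattern, nnnnn is the ID of the reporting process and x is a sequence number; that reading, and the helper name parse_fdc_name, are assumptions rather than something stated above. The severity table is the one listed above.

```python
import re

# Probe severity values as listed above.
SEVERITY = {1: "Critical", 2: "Error", 3: "Warning", 4: "Informational"}

# Assumed meaning: AMQ<process id>.<sequence number>.FDC
FDC_NAME = re.compile(r"^AMQ(\d+)\.(\d+)\.FDC$")


def parse_fdc_name(filename):
    """Return (process_id, sequence) for an FDC file name, or None if it does not match."""
    match = FDC_NAME.match(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_fdc_name("AMQ12345.0.FDC"))
print(SEVERITY[2])
```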
Tracing
MQ tracing can be started and stopped from the command line, and also from MQ Explorer.
Command-line options allow you to choose the desired level of detail
Output goes to the "trace" subdirectory at the top of the MQ tree
One active trace output file per MQ process; suffixes .TRC and .TRS are used for rollover (.TRC is more recent)
Unix requires an additional step, to format the trace output into human-readable form (.FMT files)
New in MQ v7: strmqtrc -c to start tracing and automatically stop when an FDC with a specific Probe ID is generated
Detailed tracing instructions for various MQ components on many OS platforms: http://www-1.ibm.com/support/docview.wss?uid=swg21174924
"Open mic:" MQ developers talk about MQ tracing: http://www-01.ibm.com/support/docview.wss?uid=swg27018159
Tracing and debugging 2035 authorization failures:
Commands to Enable and Disable Tracing
Enable tracing: strmqtrc
Reproduce the problem
End tracing: endmqtrc
On Unix: use dspmqtrc to translate binary trace output files to text format
Result: text files with names ending in .TRS and .TRC on Windows; binary .TRS and .TRC files and human-readable .FMT files on Unix
Real Time Monitoring
Checking queue manager and channel statistics while MQ is running
Must be enabled before MQ will start recording data (default is not to collect most of this information)
Queue manager attributes MONQ, MONCHL
NONE = disabled, no matter what the queues/channels say
OFF = off, but individual queues and channels can override
LOW, MEDIUM, HIGH = enabled, individual queues and channels can override
Queue attribute MONQ and channel attribute MONCHL
QMGR = use the queue manager attribute setting
OFF, LOW, MEDIUM, HIGH (LOW, MEDIUM, and HIGH are equivalent for queues)
Defaults are queue manager OFF, queue and channel = QMGR
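The interaction between the queue manager setting and the per-queue or per-channel setting can be expressed as a small resolution function. This is an illustrative sketch of the rules listed above, not MQ code; the function name effective_monitoring is hypothetical, and a disabled result is reported as "OFF".

```python
ENABLED_LEVELS = ("LOW", "MEDIUM", "HIGH")


def effective_monitoring(qmgr_level, object_level="QMGR"):
    """Resolve the monitoring level that applies to one queue or channel."""
    qmgr_level = qmgr_level.upper()
    object_level = object_level.upper()
    if qmgr_level not in ("NONE", "OFF") + ENABLED_LEVELS:
        raise ValueError("invalid queue manager level: " + qmgr_level)
    if object_level not in ("QMGR", "OFF") + ENABLED_LEVELS:
        raise ValueError("invalid queue/channel level: " + object_level)
    if qmgr_level == "NONE":
        # NONE disables monitoring no matter what the queue or channel says.
        return "OFF"
    if object_level == "QMGR":
        # The object inherits the queue manager setting.
        return qmgr_level
    # Otherwise the queue or channel setting overrides the queue manager.
    return object_level


print(effective_monitoring("OFF"))            # defaults: monitoring stays off
print(effective_monitoring("OFF", "MEDIUM"))  # the object overrides the queue manager
print(effective_monitoring("NONE", "HIGH"))   # NONE wins over the object setting
```

With the defaults (queue manager OFF, object QMGR) nothing is collected, which is why monitoring must be enabled explicitly before the status fields are populated.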
runmqsc
DISPLAY QSTATUS (queueName)
DISPLAY CHSTATUS (channelName)
MQ Explorer: right-click the queue name, click Status
Fields
MSGAGE: age of oldest message on the queue, in seconds
QTIME: average time in microseconds between put and get (recent average and long-term average)
LGETTIME and LGETDATE: time/date of last get operation
LPUTTIME and LPUTDATE: time/date of last put operation
UNCOM: pending uncommitted puts and gets
Some queue status attributes do not require monitoring to be enabled:
CURDEPTH: current queue depth (number of messages on the queue)
IPPROCS, OPPROCS: number of processes that have the queue open for input (can get messages) and for output (can put messages)
DISPLAY QL (queueName) CURDEPTH IPPROCS OPPROCS
MONCHL=off
STATUS, MCASTAT, SUBSTATE: channel and MCA state information
CURSEQNO: sequence number of last message sent or received
BYTSSENT, BYTSRCVD: number of bytes sent and received since the channel was started
MSGS: number of messages sent or received since the channel was started
LSTMSGTI, LSTMSGDA: time and date of last message sent or received
MONCHL=enabled
NETTIME: recent and long-term average network round-trip times in microseconds for request/response to/from the other end of the channel
Requires MONCHL = MEDIUM or HIGH
XQTIME: average times in microseconds that messages were on the transmission queue before being retrieved
Requires MONCHL = HIGH
Sender channels only (same with NETTIME)
Event Monitoring
An instrumentation event is a logical combination of conditions that is detected by a queue manager or channel instance. Such an event causes the queue manager or channel instance to put a special message, called an event message, on an event queue.
Event messages go to one of a small set of system-defined event queues (SYSTEM.ADMIN.*.EVENT), depending on their type. Event message payloads are in binary format, not human-readable text.
Decode
There is a sample program in the InfoCenter to partially decode them, and you could build on that program; OR
Use Support Pac MS0P: an extension to MQ Explorer that decodes event messages into readable text
Windows Perfmon can also be used to visually monitor queue depth
Queue Depth
Queue depth events, a type of performance event, will show up in the SYSTEM.ADMIN.PERFM.EVENT queue
Documented here:
Enable PERFMEV on the queue manager
Enable some or all of QDPMAXEV, QDPHIEV, QDPLOEV on the queue
Set MAXDEPTH, QDEPTHHI, QDEPTHLO (the last two are percentages) on the queue
ALTER QMGR PERFMEV (ENABLED)
DEFINE QLOCAL (MY_Q)
ALTER QL (MY_Q) MAXDEPTH (10) QDPMAXEV (ENABLED) +
QDEPTHHI (50) QDPHIEV (ENABLED) +
QDEPTHLO (30) QDPLOEV (DISABLED)
Now put messages on the queue (I attempted to put 11 messages, using amqsput; the 11th put failed, of course)
CURDEPTH of SYSTEM.ADMIN.PERFM.EVENT is incremented after the 5th and the 11th put
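The 5th and 11th puts are the ones that generate events because QDEPTHHI (50) is a percentage of MAXDEPTH (10), giving a high threshold of 5 messages, and the 11th put is rejected by a full queue. The following is a simplified model of that arithmetic, not MQ's actual implementation; the function name simulate_puts is hypothetical, and the sketch assumes each event type fires once and ignores the disabled queue depth low event.

```python
def simulate_puts(num_puts, maxdepth, qdepthhi_pct, hi_enabled=True, full_enabled=True):
    """Return a list of (put number, event name) for a run of consecutive puts."""
    depth = 0
    events = []
    for n in range(1, num_puts + 1):
        if depth >= maxdepth:
            # The put is rejected because the queue is already full.
            if full_enabled:
                events.append((n, "Queue Full"))
                full_enabled = False
            continue
        depth += 1
        # Compare as integers to avoid rounding the percentage.
        if hi_enabled and depth * 100 >= qdepthhi_pct * maxdepth:
            events.append((n, "Queue Depth High"))
            hi_enabled = False
    return events


print(simulate_puts(11, maxdepth=10, qdepthhi_pct=50))
# prints [(5, 'Queue Depth High'), (11, 'Queue Full')]
```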
MS0P
MS0P: http://www-01.ibm.com/support/docview.wss?uid=swg24011617
Installation is just a matter of unzipping into the right place, modifying one text file, then strmqcfg -c
After that, you can right-click a queue manager, then do Event Messages > Format Events...
Can watch individual queues, showing number of puts and gets, plus a bar graph of queue depth, every N seconds (configurable via Window > Preferences)
Not Authorized Events
"Queue manager events" include six types of "not-authorized" events
Messages appear in SYSTEM.ADMIN.QMGR.EVENT
To enable: ALTER QMGR AUTHOREV (ENABLED)
Put and Get Programs
"Bindings mode" (communicate with queue manager via IPC, works only on the queue manager machine): amqsput, amqsget
"Client mode" (uses TCP and MQ channels, works from remote machines too): amqsputc, amqsgetc
Command-line arguments: queue name and queue manager name; e.g. amqsput ORDERQ QM_1
Need full path (e.g. /opt/mqm/samp/bin/amqsput) on Unix
"Put" programs allow you to type text, sending one message for each line; "get" programs retrieve and display messages
SupportPacs
A few useful SupportPacs:
IH03 (RFHutil): GUI to put and get messages, decode and display message headers, etc
MO04: SSL setup wizard
MQ Health Checker
http://www.ibm.com/support/knowledgecenter/SSFKSJ_7.5.0/com.ibm.mq.mon.doc/q036150_.htm
MQ SupportPacs: http://www-01.ibm.com/support/docview.wss?uid=swg27007205
developerWorks article about SupportPacs: http://www.ibm.com/developerworks/websphere/techjournal/0909_mismes/0909_mismes.html
Message Monitoring
The process of identifying the route a message has taken through a queue manager network
Can be done in two ways:
Setting a flag in any MQ message can cause special "activity report" messages to be generated; or
Special "trace-route" messages can be sent; activity information is accumulated in the message payload
The dspmqrte program uses these techniques to trace message flow through an MQ network
SupportPac MS0P also has trace-route functionality
Set up SOURCE and TARGET queue managers
Right-click Q.ON.TARGET (a remote queue definition on queue manager SOURCE) in MQ Explorer, select Trace Route
Reference: http://www.ibm.com/support/knowledgecenter/SSFKSJ_7.5.0/com.ibm.mq.mon.doc/q036600_.htm
Retry on Server Down
To retry when the MQ server goes down (e.g. reason code 2162), set these custom properties: Application Servers > $SERVER > Message Listener Service > Content > Additional Properties > Custom Properties
MAX.RECOVERY.RETRIES=N
RECOVERY.RETRY.INTERVAL=60