
          This was one of the search terms that led someone to an article here. I hadn’t addressed it directly, but I use Nagios to monitor my company’s server environment, and I specifically implemented that monitoring for IBM WebSphere MQ.

          For MQ, I run Nagios monitoring against queue depth and processes. I installed three plugins to run against WebSphere MQ. Of these, one was developed for my company’s needs (qdepth), one was changed slightly (channels), and the last (message age) was debugged, found not to measure accurately, and left unresolved.

          Here’s the Nagios console for the WebSphere MQ server. The “message age” in the second qdepth check’s service title is deceptive: that service is actually checking qdepth.

          [screenshot: Nagios console]

          This is the commands section from the nrpe.cfg file on the WebSphere MQ server.


          command[check_mq_channel]=/usr/local/nagios/libexec/check_mq_channel.sh $ARG1$ $ARG2$
          command[check_mq_msgage]=/usr/local/nagios/libexec/check_mq_msgage.sh $ARG1$ $ARG2$ $ARG3$ $ARG4$
          command[wmq_check_qdepth]=/usr/local/nagios/libexec/wmq_check_qdepth.pl $ARG1$ $ARG2$ $ARG3$

          Of these, we are only really using the qdepth monitoring. The channels come up triggered, so an inactive state is fine, and the plugin as written only tests for “running”. The message age plugin, as I mentioned, doesn’t actually work.

          When I first looked at setting this messaging up and then monitoring it, I searched for “nagios monitoring MQ webshere” and found several pre-written plugins. I tested each plugin for usability, for accurate results, and for whether it met our monitoring needs.

          The message age plugin, in testing, returned a hard-coded result rather than testing anything and returning a valid answer. I started to fix it, set it aside, and haven’t resolved it. I don’t recall the source for the plugin. Check each piece of code you download from the internet: it may have gone through extensive development and testing, or it could just as easily have been hacked together in an hour. Your mileage may seriously vary, and I would highly recommend you verify any of this before you bet your job on it.
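
One quick way to catch a plugin that returns a hard-coded answer is to run it twice with inputs that ought to produce different states and see whether the exit status changes. Below is a minimal sketch of that idea in shell; `plugin_responds` is a hypothetical helper name, not part of Nagios or of any plugin mentioned here.

```shell
#!/bin/sh
# Hypothetical helper: run two command lines and succeed only if their
# exit statuses differ. A plugin that ignores its input will fail this.
# usage: plugin_responds "COMMAND LINE A" "COMMAND LINE B"
plugin_responds() {
    first=0
    second=0
    sh -c "$1" >/dev/null 2>&1 || first=$?
    sh -c "$2" >/dev/null 2>&1 || second=$?
    [ "$first" -ne "$second" ]
}
```

For example, on a queue known to hold a few messages, `plugin_responds "./wmq_check_qdepth.pl 1000 2000 APP.REQUEST" "./wmq_check_qdepth.pl 1 2 APP.REQUEST"` should succeed. A pass only shows that the plugin reacts to its arguments, not that its numbers are right.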

          Here’s the qdepth plugin - I think I wrote or re-wrote this pretty much from scratch, but the original concept for parsing runmqsc output came from one of the plugins I downloaded, written by Kyle O’Donnell - the channel plugin has his original author credit intact. This plugin has alerted once to an increasing qdepth, which turned out to be an issue with an SSL certificate.



          #! /bin/perl

          ## wmq_check_qdepth.pl
          #
          # nrpe (nagios) script to check websphere qdepth

          # uses runmqsc binary
          #
          # display queue ('APP.REQUEST')
          # 8 : display queue ('APP.REQUEST')
          # AMQ8409: Display Queue details.
          # QUEUE(APP.REQUEST) TYPE(QLOCAL)
          # ACCTQ(QMGR) ALTDATE(2008-01-22)
          # ALTTIME(14.18.23) BOQNAME( )
          # BOTHRESH(0) CLUSNL( )
          # CLUSTER( ) CLWLPRTY(0)
          # CLWLRANK(0) CLWLUSEQ(QMGR)
          # CRDATE(2008-01-22) CRTIME(14.18.23)
          # CURDEPTH(0) DEFBIND(OPEN)
          # DEFPRTY(0) DEFPSIST(NO)
          # DEFSOPT(SHARED) DEFTYPE(PREDEFINED)
          # DESCR( ) DISTL(NO)
          # GET(ENABLED) HARDENBO
          # INITQ( ) IPPROCS(0)
          # MAXDEPTH(5000) MAXMSGL(4194304)
          # MONQ(QMGR) MSGDLVSQ(PRIORITY)
          # NOTRIGGER NPMCLASS(NORMAL)
          # OPPROCS(0) PROCESS( )
          # PUT(ENABLED) QDEPTHHI(80)
          # QDEPTHLO(20) QDPHIEV(DISABLED)
          # QDPLOEV(DISABLED) QDPMAXEV(ENABLED)
          # QSVCIEV(NONE) QSVCINT(999999999)
          # RETINTVL(999999999) SCOPE(QMGR)
          # SHARE STATQ(QMGR)
          # TRIGDATA( ) TRIGDPTH(1)
          # TRIGMPRI(0) TRIGTYPE(FIRST)
          # USAGE(NORMAL)

          ### Variables ###

          # test values set if this flag is true (1)
          ### THIS MUST BE SET TO 0 IN PRODUCTION!!! ###
          my $test = 0;

          # debug flag (adds messages)
          my $debug = 0;
          my $LOG = "/tmp/wmq_check_qdepth.pl.log";

          # runmqsc binary
          my $MQSC = "/opt/mqm/bin/runmqsc";

          ### ARGS ###

          # first argument is warn level
          my $WARN = shift;
          # second arg is critical level
          my $CRIT = shift;

          # third arg is queue name
          my $QUEUE = shift;

          # set for dev purposes
          if ($test) {
          $WARN = 5;
          $CRIT = 10;
          $QUEUE = "1A33.EVG.REQUEST";
          }

          # validate
          # WARN and CRIT must be greater than 0 and CRIT must be greater than WARN
          unless (($WARN > 0) && ($CRIT > 0)) {
              print ("Command Failed: WARN and CRIT levels must be greater than 0\n");
              exit 3;
          }
          unless ($CRIT > $WARN) {
              print ("Command Failed: CRIT must be greater than WARN\n");
              exit 3; # was 4; Nagios plugin return codes only run 0-3, and 3 is UNKNOWN
          }

          ### Subs ###

          ### MAIN ###

          # run query
          my $result = `echo "display queue ('${QUEUE}')" | $MQSC | grep CURDEPTH`;
          print ("result: $result\n") if $debug;
          # parse result
          my @lines = split ("\n", $result); # divide into an array by end of line...
          # each element of the array will contain a single line
          # set variables
          my ($PARAM, $VALUE);

          for my $line (@lines) {
              # each line is one or two elements like "QDPLOEV(DISABLED) QDPMAXEV(ENABLED)"
              # divide those...
              my ($first, $discard) = split (' ', $line);
              print ("\$first: $first \$discard $discard\n") if $debug;
              ($PARAM, $VALUE) = split (/\(/, $first);
              $VALUE =~ s/\)//;
              print ("\$PARAM: $PARAM \$VALUE: $VALUE\n") if $debug;
          }

          # testing value
          $VALUE = 13 if $test;
          # check for $WARN and $CRIT levels, exit 0 as OK, 1 as warn or 2 as critical
          if ($VALUE == 0) {
              print ("OK: found qdepth for $QUEUE at 0\n");
              exit 0;
          } elsif ($VALUE < $WARN) {
              print ("OK: found qdepth for $QUEUE at $VALUE\n");
              exit 0;
          } elsif (($VALUE >= $WARN) && ($VALUE < $CRIT)) {
              print ("WARN: qdepth of $QUEUE is at $VALUE: exceeds WARN thresh of $WARN\n");
              exit 1;
          } elsif ($VALUE >= $CRIT) {
              print ("CRITICAL: qdepth for $QUEUE at $VALUE: exceeds CRITICAL thresh of $CRIT\n");
              exit 2;
          }
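
The two steps the plugin performs, pulling the number out of `CURDEPTH(...)` and comparing it with the thresholds, can also be sketched in plain shell. This is an illustration rather than a replacement for the Perl above: the function names `parse_curdepth` and `classify_depth` are made up for this example, and returning UNKNOWN (3) when no number is found is an added assumption that the Perl plugin does not make.

```shell
#!/bin/sh
# Print the number inside CURDEPTH(...) from runmqsc output read on stdin.
# Prints nothing if no CURDEPTH attribute is present.
parse_curdepth() {
    sed -n 's/.*CURDEPTH(\([0-9][0-9]*\)).*/\1/p' | head -n 1
}

# Map a depth to a Nagios status: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
# usage: classify_depth DEPTH WARN CRIT
classify_depth() {
    depth=$1
    warn=$2
    crit=$3
    case $depth in
        ''|*[!0-9]*)
            # Empty or non-numeric: runmqsc probably failed (assumption).
            echo "UNKNOWN: no CURDEPTH value found"
            return 3
            ;;
    esac
    if [ "$depth" -ge "$crit" ]; then
        echo "CRITICAL: qdepth at $depth: exceeds CRITICAL thresh of $crit"
        return 2
    elif [ "$depth" -ge "$warn" ]; then
        echo "WARN: qdepth at $depth: exceeds WARN thresh of $warn"
        return 1
    fi
    echo "OK: found qdepth at $depth"
    return 0
}
```

In use, the runmqsc output would be piped into `parse_curdepth` and the result handed to `classify_depth` together with the WARN and CRIT levels. Checking for an empty value first keeps a failed runmqsc call from being reported as a depth of 0.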


          This is the channel status plugin - I may have re-written the original data gathering runmqsc string, but the majority of the plugin remained intact…



          #!/bin/ksh
          #
          # check queue manager status
          #
          # Kyle O'Donnell
          #
          #$Id: check_mq_channel,v 1.2 2007/04/04 14:36:02 kodonnel Exp $
          #
          # debug
          DATE=`date`
          LOG="/tmp/nrpe_check_mq_channel.sh.log"
          echo "" >> $LOG
          echo $DATE >> $LOG
          echo "" >> $LOG
          [ $# -ne 2 ] && echo "usage: $0 " && exit 3
          channel=$1
          qmgr=$2
          echo "channel: $channel qmanager: $qmgr" >> $LOG
          RUNMQSC="/opt/mqm/bin/runmqsc"
          chanstatus=`echo "dis chs(${channel}) status" | ${RUNMQSC} ${qmgr} | grep -i "status(running)"`
          echo "channel status result: $chanstatus" >> $LOG
          if echo $chanstatus |grep -i "status(running)" > /dev/null 2>&1; then
              STATE=0
              printf "${channel} on ${qmgr} running"
              echo ""
              echo ""
          else
              STATE=2
              printf "${channel} on ${qmgr} not running"
              echo ""
              echo ""
          fi
          echo "state: $STATE" >> $LOG
          exit $STATE;
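
Because the plugin above only looks for “running”, a triggered channel that is simply idle is reported as critical. A hedged sketch of a more forgiving mapping follows. It assumes that runmqsc answers `dis chs(...)` for an idle channel with message AMQ8420 (channel status not found), which is worth confirming on your own MQ version before relying on it. The function name `channel_state` and the choice of which states count as critical are illustrative assumptions.

```shell
#!/bin/sh
# Map "dis chs(...)" output read on stdin to a Nagios status.
# Assumption: for a triggered channel, AMQ8420 means idle, not broken.
channel_state() {
    output=$(cat)
    case $output in
        *'STATUS(RUNNING)'*)
            echo "OK: channel running"
            return 0
            ;;
        *AMQ8420*)
            echo "OK: channel inactive (no current status)"
            return 0
            ;;
        *'STATUS(RETRYING)'*|*'STATUS(STOPPED)'*)
            echo "CRITICAL: channel retrying or stopped"
            return 2
            ;;
        *)
            echo "UNKNOWN: unrecognised channel status output"
            return 3
            ;;
    esac
}
```

The runmqsc output would be piped straight into `channel_state` in place of the grep for “status(running)”. Anything the function does not recognise comes back as UNKNOWN rather than silently passing.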

          Here’s the server.cfg file for the WebSphere MQ machine on the Nagios server:



          define service {
          use generic-service
          host_name mq1
          service_description Host Alive
          check_period 24x7
          contact_groups unix-administrators
          notification_period 24x7
          check_command check-host-alive
          }

          define service {
          use generic-service
          host_name mq1
          service_description Sonic Bridge java process
          check_period 24x7
          contact_groups esb-administrators
          notification_period 24x7
          check_command check_unix_proc!mqm!1!java
          }

          define service {
          use generic-service
          host_name mq1
          service_description SSB queue depth EVGPQM01.DEAD.QUEUE message age
          check_period 24x7
          contact_groups systems-services,help_desk
          notification_period 24x7
          check_command wmq_check_qdepth!1!3!QMGR01!QMGR01.DEAD.QUEUE
          }

          define service {
          use generic-service
          host_name mq1
          service_description server queue depth APPLICATION.RESPONSE
          check_period 24x7
          contact_groups systems-services,help_desk
          notification_period 24x7
          check_command wmq_check_qdepth!5!10!APPLICATION.RESPONSE
          }

          define service {
          use generic-service
          host_name mq1
          service_description server queue depth OPPOSITE-QMGR
          check_period 24x7
          contact_groups systems-services,help_desk
          notification_period 24x7
          check_command wmq_check_qdepth!5!10!OPPOSITE-QMGR
          }

          define service {
          use generic-service
          host_name mq1
          service_description WMQ command server
          check_period 24x7
          contact_groups systems-services,help_desk
          notification_period 24x7
          check_command check_unix_proc!mqm!1!amqpcsea
          }

          define service {
          use generic-service
          host_name mq1
          service_description WMQ Critical process manager
          check_period 24x7
          contact_groups systems-services,help_desk
          notification_period 24x7
          check_command check_unix_proc!mqm!1!amqzmuc0
          }


          The strategy is to monitor qdepth and processes specific to IBM WebSphere MQ on the WebSphere MQ server, along with the normal UNIX processes and disk space.
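
The check_unix_proc command used in the service definitions is not shown in this post. As a rough sketch of what a process-count check of that kind might look like, the shell below counts exact command-name matches and compares the count with a minimum. The function names and the argument order are assumptions for illustration, not the real plugin's interface.

```shell
#!/bin/sh
# Count lines on stdin that exactly equal the command name in $1.
# Input is expected to be one command name per line.
count_procs() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c -x -F -- "$1" || true
}

# usage: check_procs NAME MINIMUM   (process list read on stdin)
check_procs() {
    found=$(count_procs "$1")
    if [ "$found" -ge "$2" ]; then
        echo "OK: $found $1 process(es) found"
        return 0
    fi
    echo "CRITICAL: $found $1 process(es) found, expected at least $2"
    return 2
}
```

On a system whose `ps` supports it, this could be fed with something like `ps -u mqm -o comm= | check_procs amqpcsea 1`. Some platforms print a full path in the `comm` column, in which case the exact-match test would need adjusting.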

          — dsm




          One Response to “how to monitor ibm mq from nagios”

          1. Nice article! I hadn’t been paying attention and didn’t know there were modules for Nagios to check WMQ.

            How about this for checking queue depth…


            # Isolate the CURDEPTH element to a line, then strip the attribute so Perl gets only the value
            my $curdepth = `echo "display queue ('${QUEUE}')" | $MQSC $QMGRNAME | tr ')' '\n' | tr ' ' '\n' | grep CURDEPTH | tr '(' '\n' | grep -v CURDEPTH`;


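            chomp($curdepth); # drop the trailing newline so the string test below can match
            if ($curdepth eq '0') {
            # Look for literal '0' (string comparison, so an empty result falls through to the else)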
            } elsif ($curdepth > 0) {
            # Check against $WARN and $CRIT
            } else {
            # runmqsc command failed. QMgr down?
            }

            Note that I added the QMgr name to the runmqsc command to handle cases where the QMgr is not set as the default or there is more than one. I also added logic to catch the case where WMQ is down.

            In the case of process monitoring, amqzmuc0 is the log formatter and I don’t think it runs in all cases. A better choice might be amqzxma0 which is the execution controller.

            – T.Rob

          posted on 2009-05-06 15:51 by Blog of JoJo. Categories: Linux, Technology
