Working with check_http and check_bigip_pool to monitor web sites

Happy New Year! Here is a little tip on how we used check_http and check_bigip_pool to eliminate our need for WebTrends. The servers being monitored are in the DMZ, and the Nagios host is inside our firewall. I have deployed the NRPE daemon to our Solaris 9 servers, so we will also need to modify the nrpe.cfg file. check_bigip_pool is run from the Nagios host against our F5 appliance, which we use for load balancing. What follows is a step-by-step account of how I set up these hosts and services.

Using check_http is very straightforward. I would suggest that you check for an expected string or page size in addition to the plain HTTP service check. You should also verify the results by running the command from the command line.
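As a rough sketch of such a manual run: the snippet below assumes the plugins live in /usr/local/nagios/libexec, uses webstore.example.com as a stand-in host, and guesses that the watchdog page contains the text "I am here". The helper name http_check_cmd is hypothetical. It prints the command line first and only runs it when the plugin is actually installed on the machine.

```shell
#!/bin/sh
# Assumed plugin directory; change it to match your install.
PLUGINS=${PLUGINS:-/usr/local/nagios/libexec}

# Hypothetical helper: print a check_http command line for
# host ($1), URL path ($2), port ($3) and expected page string ($4).
http_check_cmd() {
    printf '%s/check_http -H %s -u %s -p %s -s "%s"\n' \
        "$PLUGINS" "$1" "$2" "$3" "$4"
}

# webstore.example.com and the expected string are placeholders.
cmd=$(http_check_cmd webstore.example.com /WatchDog.tem 80 "I am here")
echo "$cmd"

# Run the check only if the plugin exists on this machine.
if [ -x "$PLUGINS/check_http" ]; then
    eval "$cmd"
    echo "plugin exit status: $?"
fi
```

By plugin convention the exit status is 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN, so the last line shows what Nagios would see.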

Using check_bigip_pool is also straightforward. You will need a hostname (the hostname of the F5), a community string (generally public), the software version, and the pool name. check_bigip_pool returns the number of nodes found running on the F5, so you will need to work with the warning and critical percentages. HINT: If check_bigip_pool finds one of two nodes running and you only require one node running, set the warning to 49 and critical to 25. When both nodes are down you will receive a critical alert and notification.
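The arithmetic behind the hint can be sketched as follows. This is not the plugin's own code: pool_state is a hypothetical helper, and it assumes an alert fires when the percentage of nodes up falls below a threshold, which is what the hint implies.

```shell
#!/bin/sh
# Hypothetical helper, not part of check_bigip_pool: classify a pool from
# nodes up ($1), total nodes ($2), warning % ($3) and critical % ($4).
# Assumption: an alert fires when the percentage up is below a threshold.
pool_state() {
    pct=$(( $1 * 100 / $2 ))
    if [ "$pct" -lt "$4" ]; then
        echo "CRITICAL ${pct}%"
    elif [ "$pct" -lt "$3" ]; then
        echo "WARNING ${pct}%"
    else
        echo "OK ${pct}%"
    fi
}

pool_state 2 2 49 25   # both nodes up        -> OK 100%
pool_state 1 2 49 25   # one of two nodes up  -> OK 50%
pool_state 0 2 49 25   # both nodes down      -> CRITICAL 0%
```

With a warning of 49, one node out of two (50%) stays above the threshold, so nothing fires until the pool is empty.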

Here are the configuration steps for this setup:

Define what is to be monitored (NOTE: this is just an outline; my organization's setup monitors 5 clients and 7 ports per client):

Client                        URL                                Port   HTTP Response
webstore.<your domain>.com    http://webstore                    80     OK
webstore.<your domain>.com    http://webstore.com/WatchDog.tem   80     OK – I am here
f5host.<your domain>.com      your pool name                            count of nodes up

*Service definitions are placed within a template file. I prefer to place service definitions within a file whose name describes the function. You will also need to create a separate service definition for each service check, because each one needs a distinct service_description. You can leave blanks in the service description; however, I do not, because doing so removes your ability to acknowledge problems via email. I also used the "hosts" directive instead of hostgroups, because I have 4 webstore servers, but not all of them are looking at the same URLs and ports. Also include a notes directive; we will be using it later for our notification email.

Service Definition (nagios_host:/path-to-nagios/etc/objects/services/srv_check_http.cfg):
define service{
        use                     generic-service
        hosts                   hostname
        notes                   description of the service being monitored.
        service_description    	check_webstore
        check_command        	check_nrpe!check_webstore
        max_check_attempts      3
        normal_check_interval   5
        retry_check_interval    1
        check_period            24x7
        contact_groups          webstore_admins
        }
*Notice the variation in the check_command definition? This is so you can run the command on the Nagios host and have it listed under the defined host.

F5 Service Definition (nagios_host:/path-to-nagios/etc/objects/services/f5_checks.cfg):
define service{
        use                      generic-service
        hosts                    f5host
        service_description      webstore_business_station
        notes                    There are 0 nodes online for http://business-station.coat.com.
        check_command            check_webstore_business_station_pool!
        max_check_attempts       3
        normal_check_interval    5
        retry_check_interval     1
        notification_period      24x7
        check_period             24x7
        contact_groups           webstore_admins
        }

*Host definition is placed within a template. I have created a separate template for each classification of hosts, which makes it easier to find and update hosts. The host definition would be the same for the f5 server.

Host Definition (nagios_host:/path_to_nagios/etc/objects/hosts/webstore_servers.cfg):
define host{
        use                      webstore-servers   ; Name of host template to use
        host_name                your host's name
        address                  ip_address
        alias                    short name for your host
        notes                    What does the host do <br>LOCATION:<CONSOLE: >
        icon_image               sun_logo.gif
        icon_image_alt           Sun Host
        hostgroups               Add your host groups for standard monitoring and specialty
                                 monitoring
        notification_period      24x7
        }

*Here is how I defined the commands. For separation purposes, I prefer to keep the definitions within a distinct configuration file and give each command name a distinct definition. This way, if one port is not accessible, you can address the one-port issue and not assume the entire web application is down.

Command Definition (nagios_host:/path_to_nagios/etc/objects/commands/cmd_http.cfg):
define command{
        command_name    	check_webstore
        command_line    	$USER1$/check_http  -H $ARG1$ -u $ARG2$ -p $ARG3$
        }

F5 Command Definition (nagios_host:/path_to_nagios/etc/objects/commands/cmd_check_bigip_pool.cfg):
define command{
        command_name    	check_webstore_pool
        command_line    	$USER1$/check_bigip_pool -H $HOSTADDRESS$ -C $ARG1$ -S $ARG2$ -P $ARG3$ -w $ARG4$ -c $ARG5$
        }
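
To see how a service's check_command value fills the $ARGn$ macros in these definitions, here is a small sketch. Nagios separates the command name and its arguments with "!". The helper show_args is hypothetical, and the community, version, and pool name in the example are placeholder values, not ones from my setup.

```shell
#!/bin/sh
# Hypothetical helper: split a check_command value on "!" and print
# the command name followed by the $ARGn$ macros it would fill.
show_args() {
    old_ifs=$IFS
    IFS='!'
    set -f                      # keep the shell from glob-expanding fields
    set -- $1                   # split the value into positional parameters
    set +f
    IFS=$old_ifs
    echo "command_name=$1"
    [ "$#" -gt 0 ] && shift     # drop the command name, keep the arguments
    n=1
    for value in "$@"; do
        printf '%s\n' "\$ARG${n}\$=$value"
        n=$((n + 1))
    done
}

# Placeholder arguments: community, software version, pool, warning, critical.
show_args 'check_webstore_pool!public!9!webstore_pool!49!25'
```

The fields map in order, so "public" lands in $ARG1$ (the -C community) and "25" in $ARG5$ (the -c critical percentage).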

*I combine the contactgroup with the contact definition in its own file. You may not have the luxury of email aliases, so this helps me manage the contacts for a specific host or service.

Contact Definition (nagios_host:/path_to_nagios/etc/objects/contacts/webstore_admins.cfg):
define contactgroup{
        contactgroup_name          webstore_admins
        alias                      WEBSTORE ADMINS
        members                    webstore_admins
        }
define contact{
        contact_name               webstore_admins
        alias                      WEBSTORE Admins
        contact_groups             webstore_admins
        host_notifications_enabled      0
        service_notifications_enabled  	1
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options   	w,u,c,r
        host_notification_options      	d,r
        service_notification_commands   notify-webstore-service-by-email
        host_notification_commands     	notify-linux-host-by-email
        email                           webstore_support@your_domain.com
        can_submit_commands             1
        }

*Notifications are fun. I keep separate command files, because different groups need to receive different information. Notice that the service_notification_commands in the contacts matches the notification definition command name. Within the definition, you will find the notes section after the “-s” flag and also the method I use for enabling problem acknowledgments via a link within email.

Notification Definition (nagios_host:/path_to_nagios/etc/objects/commands/webstore_admins_email_notification.cfg):
# Linux 'notify-webstore-service-by-email' command definition
define command{
        command_name    notify-webstore-service-by-email
        command_line    /usr/bin/printf "%b" "If this is a CRITICAL problem, please ensure the
ecommerce on-call person is notified.\n\nHostname:\t\t$HOSTNAME$\nService:\t\t$SERVICEDESC$\n
State:\t\t\t$SERVICESTATE$\nDate/Time:\t\t$LONGDATETIME$\nAdditional Info:\t$SERVICEOUTPUT$
\nAcknowledge:\t\thttp://nagios_host.your_domain.com/nagios/cgi-bin/cmd.cgi?cmd_typ=34&host=
$HOSTNAME$&service=$SERVICEDESC$\nAuthor:\t\t$SERVICEACKAUTHOR$\nAcknowledgement:\t$SERVICEACKCOMMENT$"
| /usr/bin/mailx -s "$HOSTALIAS$:$SERVICENOTES$" $CONTACTEMAIL$ -r core.admin@coat.com
        }

Email Notification Message Received:
Subject: <Your service notes>
Body:
If this is a CRITICAL problem, please ensure the ecommerce on-call person is notified.
Hostname:               webstore1.coat.com
Service:                check_webstore_8150
State:                  CRITICAL
Date/Time:              Wed Jan 6 02:24:39 EST 2010
Additional Info:        CRITICAL - Socket timeout after 10 seconds
Acknowledge:            http://nagios_host.your_domain.com/nagios/cgi-bin/cmd.cgi?cmd_typ=34
                        &host=host_name&service=service_name
Author:
Acknowledgement:
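
The acknowledge link is also the reason I avoid blanks in service descriptions. The sketch below rebuilds the link the way the notification command does; ack_url is a hypothetical helper, and the second call uses a made-up service description with blanks to show the problem. My assumption is that a mail client will end the clickable link at the first blank.

```shell
#!/bin/sh
# Hypothetical helper: build the acknowledge link used in the email body
# from a host name ($1) and a service description ($2).
# cmd_typ=34 is the value used in the notification command above.
ack_url() {
    printf 'http://nagios_host.your_domain.com/nagios/cgi-bin/cmd.cgi?cmd_typ=34&host=%s&service=%s\n' \
        "$1" "$2"
}

# A description without blanks gives one unbroken link.
ack_url webstore1.coat.com check_webstore_8150

# A made-up description with blanks leaves spaces inside the URL.
ack_url webstore1.coat.com "check webstore 8150"
```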

Hopefully this helps with your monitoring of web sites. If you have any questions, please feel free to email me at mikhail@ebusinessjuncture.com. Here is a link to the document: HTTP Monitoring Write-up

If anyone has comments or another method to perform this task, please leave a comment.
