Thank you for pointing me in the right direction. I ran strace on both and the telling item was
    shutdown(3, 1 /* send */)              = -1 ENOTCONN (Transport endpoint is not connected)
on akagi. As it turns out, restarting the client (again) fixed the problem. Thanks for reminding me about strace. Being a sysadmin and not a programmer, I forget about this tool.
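A side note for anyone who hits the same "not defined" error: since a restart fixed it, the NRPE daemon on akagi was presumably still running with a configuration it had loaded before the check_ubc definition was in place. Below is a minimal sketch for confirming that the definition exists on disk before restarting. The function name nrpe_command_defined is made up for illustration, and the pattern assumes the usual command[name]=... syntax in the NRPE config.

```shell
#!/bin/sh
# Hypothetical helper (not part of NRPE): succeed if an NRPE command
# definition of the form "command[NAME]=..." exists in a config file.
#   $1: command name, e.g. check_ubc
#   $2: path to the config file to search
nrpe_command_defined() {
    # Anchor at line start so commented-out definitions do not match.
    grep -q -E "^[[:space:]]*command\[$1\][[:space:]]*=" "$2"
}
```

Usage, with the path from this thread:

    nrpe_command_defined check_ubc /usr/local/nagios/etc/nrpe_local/override.cfg && echo "defined on disk"

If this succeeds but check_nrpe still reports the command as not defined, the running daemon most likely has not re-read its configuration.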
On Fri, Aug 26, 2011 at 10:03 PM, David Parker <firstname.lastname@example.org> wrote:
----- Original Message -----
From: Brad Alexander <email@example.com>
Date: Friday, August 26, 2011 8:13 pm
Subject: [Spam: 5.2] Re: [OT] Nagios question
To: David Parker <firstname.lastname@example.org>
> On Fri, Aug 26, 2011 at 6:41 PM, David Parker <email@example.com> wrote:
> ----- Original Message -----
> From: Brad Alexander <firstname.lastname@example.org>
> Date: Friday, August 26, 2011 6:07 pm
> Subject: [OT] Nagios question
> To: Debian-user List <email@example.com>
> > There is a Nagios plugin called check_ubc, which checks for increasing values in /proc/user_beancounters. An increase is an indication of a spec that needs to be tuned for the VM. In any case, I can run check_ubc from the command line on both machines. On the original server, hornet, it also works when I run it through NRPE from the Nagios server:
> > # /usr/local/nagios/libexec/check_nrpe -H hornet -c check_ubc
> > OK.
> > On the new server, built two days ago, same configuration, I get
> > # /usr/local/nagios/libexec/check_nrpe -H akagi -c check_ubc
> > NRPE: Command 'check_ubc' not defined
> > I have the check_ubc script in /usr/local/nagios/libexec/nrpe_local on both, and the command is defined in /usr/local/nagios/etc/nrpe_local/override.cfg, which is pulled in by an include in /usr/local/nagios/etc/nrpe.cfg. Everything is the same on both servers. Why does the command work on one but not the other?
> Just to clarify, the configs are *exactly* the same? Are these two machines different architectures (32-bit vs. 64-bit, etc.)?
> Nope. Both are Dell PE 1850s with dual 3.2GHz Xeons. The only differing factor is that one has 2GB of RAM and the other has 6GB.
That's really strange. This may seem obvious, but are the permissions on the config file correct, and is it readable by the user who is running this command? Also, is the Nagios version the same on the two boxes?
Does strace show anything? Try:
    strace -o strace.out /usr/local/nagios/libexec/check_nrpe -H akagi -c check_ubc
Then check strace.out and see if it shows anything along the lines of permissions errors, parsing errors, etc.
    - Dave
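For reference, the "check strace.out" step above can be sketched as a small filter. strace prints a failed call as "= -1 ERRNO (description)", which is the form of the ENOTCONN line quoted at the top of this thread. The helper name strace_failures is hypothetical, and the pattern only catches calls that returned -1 with an errno name.

```shell
#!/bin/sh
# Hypothetical helper: list the failed syscalls recorded in an strace log.
#   $1: path to the strace output file (strace.out in this thread)
strace_failures() {
    # -n prefixes each match with its line number in the log.
    grep -n -E '= -1 E[A-Z0-9]+' "$1"
}
```

Usage:

    strace_failures strace.out

Many of the matches are usually harmless, for example ENOENT while the loader searches for libraries, so the lines worth reading are the ones that touch the config files or the network socket.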