While Ricky E. was working on some improved lockfile checks, I
mentioned having done some checks on the puppetdlock file where I work
that allow us to drop some text into that file giving a reason why
puppet is disabled. It also lets nagios read that content and display
it, making it easier to tell when a box has puppet intentionally
disabled (changing what would be a CRIT for not running in N seconds
into a WARN).
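
As a rough illustration of that behaviour (not the actual check_puppet
code), here is a minimal Python sketch. The function signature, the
message strings, and the idea of passing in the last run time are all
assumptions made for the example; only the CRIT-to-WARN downgrade when
the lock file holds a reason comes from the description above.

```python
import os
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_puppet(lockfile, last_run, crit_seconds, now=None):
    """Return (exit_code, message) for a Nagios-style check.

    lockfile, last_run and crit_seconds are hypothetical parameters for
    this sketch; the real check_puppet may gather them differently.
    """
    now = time.time() if now is None else now
    age = int(now - last_run)

    # Read whatever text was dropped into the lock file, if any.
    reason = None
    if os.path.exists(lockfile):
        with open(lockfile) as fh:
            reason = fh.read().strip()

    if reason:
        # Intentionally disabled: show the reason and stay at WARN.
        return WARNING, "PUPPET WARNING: disabled: %s" % reason
    if age > crit_seconds:
        return CRITICAL, "PUPPET CRITICAL: no run in %d seconds" % age
    return OK, "PUPPET OK: last run %d seconds ago" % age
```

An empty lock file (as a plain puppetd --disable would leave) carries no
reason, so in this sketch it falls through to the normal age check.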
Mike asked me to post this code here (again, in case I did it before).
So, the mildly over-engineered check_puppet and puppetstatus tools are
Here's a brief description I wrote on this to the puppet list when
nagios checks came up a little while ago¹:
More than a few times, I have found it necessary to disable puppet for
a short while to work on something without puppet helpfully undoing my
work. While it's easy to use puppetd --disable to prevent puppet
from running, it's also easy to forget to re-enable it. Or worse, in
a place with multiple SAs, it's easy for someone else to come along,
notice puppetd seems to be 'stuck', and 'helpfully' clear out the
lock file.
Using 'sudo puppetstatus -d "Testing some foo"' creates the lock file
as puppetd --disable would, but adds the text given and the username
of the person disabling puppet. That text then shows up in nagios. If
puppet stays disabled for longer than check_puppet would normally
consider critical, the check remains a warning as long as there is a
reason in the lockfile. That also lets other SAs know puppet is down
intentionally so they don't have to bug me or worry about 'fixing' it.
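
The disabling side could look something like the following Python
sketch. This is not the real puppetstatus code: the function name, the
"(disabled by ...)" text format, and the use of the SUDO_USER
environment variable to find the invoking user are assumptions for
illustration only.

```python
import os


def disable_puppet(lockfile, reason, user=None):
    """Write a reason and username into the lock file.

    Returns the text written. The layout of that text is a guess made
    for this sketch, not the format puppetstatus actually uses.
    """
    # Under sudo, SUDO_USER holds the name of the invoking user.
    if user is None:
        user = os.environ.get("SUDO_USER") or os.environ.get("USER", "unknown")
    text = "%s (disabled by %s)\n" % (reason, user)
    with open(lockfile, "w") as fh:
        fh.write(text)
    return text
```

Because the file still exists afterwards, puppet stays disabled just as
with puppetd --disable; the only difference is that the file now says
why, and who did it.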
(The checks in the script to chide folks running it as root are more
of a goof, to gently prod admins in the habit of doing everything as
root to stop that.)
Todd OpenPGP -> KeyID: 0xBEAF0CE3 | URL: www.pobox.com/~tmz/pgp
Genius is 1% inspiration and 99% perspiration, which is why engineers
sometimes smell really bad.
-- Demotivators (www.despair.com)