Old 11-04-2008, 10:49 AM
"Erling Ringen Elvsrud"
 
Default Defining load thresholds for Nagios

Hello list,

I have been reading and thinking about proper thresholds for
the check_load plugin in Nagios.

My current understanding of load in Linux:

The load average over 1, 5, and 15 minutes in Linux is an average of the
number of processes in the running, runnable, and uninterruptible sleep
states (according to the load entry in Wikipedia).
According to the same Wikipedia page, processes in the uninterruptible
state are usually waiting for I/O, so both CPU-bound and I/O-bound
processes can contribute to the load average.
So on a server with many I/O-bound processes, CPU utilization can be low
while the load average is high.
The number of cores or CPUs also determines the impact of a given load:
a load of 8 can mean that all cores in a 2 x 4-core server are fully
utilized.
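
To make the per-core point concrete, here is a minimal Python sketch that divides the load averages by the core count. The function names are illustrative and not part of Nagios or check_load; it assumes a Unix-like system where os.getloadavg() is available.

```python
import os


def per_core_load(load_averages, cpu_count):
    """Divide each load average by the number of cores."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(load / cpu_count for load in load_averages)


def current_per_core_load():
    """Read the 1, 5 and 15 minute load averages and normalize them."""
    # os.getloadavg() returns a 3-tuple; it raises OSError if the
    # load average cannot be obtained on this platform.
    loads = os.getloadavg()
    # os.cpu_count() may return None, so fall back to 1 core.
    cores = os.cpu_count() or 1
    return per_core_load(loads, cores)
```

With this, a load of 8 on the 2 x 4-core server from the example normalizes to 1.0 per core, whereas the same load of 8 on a single-core machine stays at 8.0 per core.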

To determine where to set warning and critical thresholds, the impact
the load has on the running services must also be taken into account.
For instance, on a system running large batch jobs a high load can be
less of a problem than on a system running a web server where users
want a quick response.

So if you had a server where you had little knowledge of the services,
how would you pick thresholds for 1, 5, and 15 min warning and 1, 5,
and 15 min critical?

Thanks,

Erling

--
redhat-list mailing list
unsubscribe mailto:redhat-list-request@redhat.com?subject=unsubscribe
https://www.redhat.com/mailman/listinfo/redhat-list
 
Old 11-04-2008, 12:56 PM
hike
 

while not linux, the rule for solaris/sparc is 5 for all.
we use that for solaris/x86 and linux.
 
Old 11-04-2008, 04:38 PM
"Erling Ringen Elvsrud"
 

On Tue, Nov 4, 2008 at 2:56 PM, hike <mh1272@gmail.com> wrote:
> while not linux, the rule for solaris/sparc is 5 for all.
> we use that for solaris/x86 and linux.

You use 5 for both warning and critical?
How did you determine that number?

Erling

 
Old 11-04-2008, 04:52 PM
hike
 


Oops, that should be 5 per CPU:
1 CPU = 5
4 CPUs = 20

Anything over 5 per CPU is bad; 5 and under is good.
The number comes from Sun classes (sysadmin and hardware) and from Sun
field and system engineers. Sun, being an engineering powerhouse with
tons of experience, set the standard.

Of course, I have seen V120s trudge like a dog at ~20 and come to a dead
stop at ~40.

I haven't seen a properly sized Linux box even hit a load of 5.
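
The "5 per CPU" rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in this post: the function names are hypothetical, and the thread does not say whether the limit is meant as the warning or the critical value.

```python
# Rule of thumb quoted in this thread: a load of 5 per CPU is the limit.
PER_CPU_LIMIT = 5


def load_limit(cpu_count, per_cpu=PER_CPU_LIMIT):
    """Scale the per-CPU limit to the whole machine."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return per_cpu * cpu_count


def is_bad(load, cpu_count, per_cpu=PER_CPU_LIMIT):
    """Anything over the limit is bad; the limit itself and below is good."""
    return load > load_limit(cpu_count, per_cpu)


def threshold_triple(cpu_count, per_cpu=PER_CPU_LIMIT):
    """Format the same limit for the 1, 5 and 15 minute averages."""
    limit = load_limit(cpu_count, per_cpu)
    return ",".join([f"{limit:g}"] * 3)
```

For a 4-CPU box, threshold_triple(4) gives "20,20,20", which has the comma-separated form that check_load takes for its -w and -c options. Which of the two options the value belongs to is left open here, as it is in the thread.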
 

All times are GMT.