Mike McGrath 09-08-2008 01:56 PM

Last week
 
Last week was a strange one. Many of you noticed a bunch of nagios outages,
so I thought I'd send a roundup of what happened.

1) The big one was what seems to be a corrupt database table. For some
reason, running a vacuum on a table (which was only 66M) took a long
time, and even after it finished the disks would thrash, sometimes for
10 minutes. This caused outages in lots of our systems, like the account
system, on which other systems depend. The job ran hourly, which is why
it kept happening.

We were able to reproduce this on another host but never quite figured out
what was going on. A dump, drop, restore fixed the issue. So far we
haven't had time to revisit the cause; all we know is that it hasn't
happened since.
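
For readers unfamiliar with the dump, drop, restore sequence, here is a
minimal sketch of the three steps. It assumes the database is PostgreSQL
(the message does not say) and uses the standard pg_dump and psql tools.
The database, table and file names are hypothetical, and the function only
builds the commands rather than running them.

```python
import shlex


def dump_drop_restore_commands(database, table, dump_file):
    """Build the commands for a dump, drop, restore of a single table.

    Assumption: PostgreSQL with pg_dump and psql on the PATH. All names
    passed in are placeholders, not the real Fedora database or table.
    """
    return [
        # 1. Dump the table's definition and data to a plain SQL file.
        ["pg_dump", "--table", table, "--file", dump_file, database],
        # 2. Drop the suspect table.
        ["psql", "--dbname", database, "--command", f"DROP TABLE {table};"],
        # 3. Recreate the table and reload its rows from the dump.
        ["psql", "--dbname", database, "--file", dump_file],
    ]


if __name__ == "__main__":
    commands = dump_drop_restore_commands(
        "exampledb", "example_table", "/tmp/example_table.sql"
    )
    for command in commands:
        # Print each command in copy-and-paste form for review.
        print(shlex.join(command))
```

Printing the commands instead of executing them leaves room to review them
first. A plain DROP TABLE fails if other objects depend on the table, so a
real run may need extra care around foreign keys and views.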

2) Strange network issues towards the end of the week. It seems our round
trip time to ServerBeach went up, causing nagios to flag some hosts as
dead. I haven't had time to look into this yet either. The network seems
fine and I don't think we're seeing any functional issues from it, but it
was different.

3) pkgdb's home page started taking longer to load, so our balancer
started flagging it dead and throwing 503s. We only recently moved it to
haproxy, so this could be normal behavior that we just didn't see before.
I've raised the allowed response time for the front page from 2 seconds
to 5.
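
To illustrate why raising the limit stops the false alarms, here is a small
sketch of a response-time check. The sample timings are made up for the
example, not measurements from pkgdb, and failed_checks is a hypothetical
helper, not part of haproxy.

```python
def failed_checks(response_times, timeout):
    """Return the response times (in seconds) that exceed the timeout."""
    return [t for t in response_times if t > timeout]


# Hypothetical front page response times in seconds (not real data).
samples = [0.8, 1.5, 2.4, 3.1, 1.9, 4.2]

print(failed_checks(samples, timeout=2))  # [2.4, 3.1, 4.2]
print(failed_checks(samples, timeout=5))  # []
```

This is a simplification: a real balancer usually marks a server down only
after several consecutive failed checks, not on a single slow response.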

-Mike

_______________________________________________
Fedora-infrastructure-list mailing list
Fedora-infrastructure-list@redhat.com
https://www.redhat.com/mailman/listinfo/fedora-infrastructure-list

Jesse Keating 09-08-2008 03:33 PM

On Mon, 2008-09-08 at 08:56 -0500, Mike McGrath wrote:
>
> Strange week last week, many of you noticed a bunch of nagios outages so I
> thought I'd send a roundup of what happened.

Any ideas what has been making releng2 flap?

--
Jesse Keating
Fedora -- Freedom² is a feature!
identi.ca: http://identi.ca/jkeating

Mike McGrath 09-08-2008 04:50 PM

On Mon, 8 Sep 2008, Jesse Keating wrote:

> On Mon, 2008-09-08 at 08:56 -0500, Mike McGrath wrote:
> >
> > Strange week last week, many of you noticed a bunch of nagios outages so I
> > thought I'd send a roundup of what happened.
>
> Any ideas what has been making releng2 flap?
>

I was away this weekend but did see the notice that releng2 rebooted
again. I take it that was not intended? I'll ping you on IRC.

-Mike


All times are GMT.