09-08-2008, 01:56 PM
Mike McGrath

Last week

Strange week last week. Many of you noticed a bunch of Nagios outages, so I
thought I'd send a roundup of what happened.

1) The big one was what seems to be a corrupt database table. For some
reason, running a vacuum on a table (which was only 66M in size) was taking
a long time, and even after it finished the disks would thrash, sometimes
for 10 minutes afterwards. This caused outages in lots of our systems,
such as the account system, on which other systems depend. The job ran
hourly, so that's why it kept happening.

We were able to reproduce this on another host but never quite figured out
what was going on. A dump, drop, restore fixed the issue, and it hasn't
happened since. So far we haven't had time to revisit the root cause.
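
For anyone who has not done a dump, drop, restore before, here is a minimal sketch of the three steps. It assumes the database is PostgreSQL (the thread only says "vacuum", so that is an inference), and the database name, table name, and dump path are placeholders, not the real ones. The function only builds the commands and prints them; it does not run anything.

```python
import shlex
from typing import List


def dump_drop_restore_commands(db: str, table: str, dump_path: str) -> List[List[str]]:
    """Build (but do not run) the three commands for a dump, drop, restore."""
    return [
        # 1. Dump the table's definition and data to a plain SQL file.
        ["pg_dump", "--table", table, "--file", dump_path, db],
        # 2. Drop the suspect table. This fails if other objects depend on it.
        ["psql", "--dbname", db, "--command", f"DROP TABLE {table};"],
        # 3. Recreate and reload the table from the dump file.
        ["psql", "--dbname", db, "--file", dump_path],
    ]


if __name__ == "__main__":
    # "exampledb" and "example_table" are placeholders, not names from this thread.
    for cmd in dump_drop_restore_commands(
        "exampledb", "example_table", "/tmp/example_table.sql"
    ):
        print(shlex.join(cmd))
```

Rewriting the table from scratch this way discards whatever on-disk state the old table had, which may be why it helped here, though the actual cause was never identified.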

2) Strange network issues towards the end of the week. It seems our
round-trip time to ServerBeach went up, causing Nagios to flag some hosts
as dead. I've not yet had time to look into this either. The network seems
fine and I don't think we're seeing any functional issues from it, but it
was different.

3) pkgdb's home page started taking longer to load, which caused our
balancer to flag it as dead and throw 503s. We only recently moved it to
haproxy, so this could be normal behavior that we just hadn't seen before.
I've raised the allowed response time for the front page from 2 seconds
to 5.
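
To illustrate why the threshold matters, here is a small sketch of how a health check classifies a backend by response time. This is not haproxy's code or configuration; the function name and the sample response times are invented for the example. Only the 2 second and 5 second limits come from the change described above.

```python
from typing import List


def classify(response_times_s: List[float], timeout_s: float) -> List[str]:
    """Mark each probe UP if it answered within timeout_s seconds, else DOWN."""
    return ["UP" if t <= timeout_s else "DOWN" for t in response_times_s]


if __name__ == "__main__":
    # Invented sample timings for a page that has become slower to load.
    samples = [1.2, 3.0, 4.5, 6.0]
    print("2s limit:", classify(samples, 2.0))
    print("5s limit:", classify(samples, 5.0))
```

With the lower limit, a page that is slow but still working gets marked dead on most probes. The higher limit only fails probes that are well outside the normal range.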

-Mike

_______________________________________________
Fedora-infrastructure-list mailing list
Fedora-infrastructure-list@redhat.com
https://www.redhat.com/mailman/listinfo/fedora-infrastructure-list
 
09-08-2008, 03:33 PM
Jesse Keating

Last week

On Mon, 2008-09-08 at 08:56 -0500, Mike McGrath wrote:
>
> Strange week last week, many of you noticed a bunch of nagios outages so I
> thought I'd send a roundup of what happened.

Any ideas what has been making releng2 flap?

--
Jesse Keating
Fedora -- Freedom² is a feature!
identi.ca: http://identi.ca/jkeating
 
09-08-2008, 04:50 PM
Mike McGrath

Last week

On Mon, 8 Sep 2008, Jesse Keating wrote:

> On Mon, 2008-09-08 at 08:56 -0500, Mike McGrath wrote:
> >
> > Strange week last week, many of you noticed a bunch of nagios outages so I
> > thought I'd send a roundup of what happened.
>
> Any ideas what has been making releng2 flap?
>

I was away this weekend but did see the notice that releng2 rebooted
again. I take it that was not intended? I'll ping you on irc.

-Mike
