Old 02-09-2010, 04:15 PM
Fernando Gleiser
 
disk I/O problems with LSI Logic RAID controller

We're having a weird disk I/O problem on a CentOS 5.4 server connected to external SAS storage through an LSI Logic MegaRAID SAS 1078 controller.

The server is used as a Samba file server.

Every time we copy a large file to the storage-backed file system, the disk utilization see-saws: it climbs to 100%, drops to zero for several seconds, then climbs back to 100%, and so on.
Here is a snippet from iostat -kx 1:

Device:     rrqm/s     wrqm/s   r/s      w/s   rkB/s      wkB/s   avgrq-sz  avgqu-sz   await  svctm  %util
sdb1          0.00  133811.00  0.00  1889.00    0.00  513660.00     543.84    126.24   65.00   0.47  89.40
sdb1          0.00     138.61  0.00   109.90    0.00   29845.54     543.14      2.54   54.32   0.37   4.06
sdb1          0.00       0.00  0.00     0.00    0.00       0.00       0.00      0.00    0.00   0.00   0.00
sdb1          0.00       0.00  0.00     0.00    0.00       0.00       0.00      0.00    0.00   0.00   0.00
sdb1          0.00       0.00  0.00     0.00    0.00       0.00       0.00      0.00    0.00   0.00   0.00
sdb1          0.00  134680.00  0.00  1920.00    0.00  526524.00     548.46    126.06   64.57   0.47  90.00
sdb1          0.00     142.00  0.00    74.00    0.00   20740.00     560.54      1.25   45.14   0.47   3.50
sdb1          0.00       0.00  0.00     0.00    0.00       0.00       0.00      0.00    0.00   0.00   0.00
sdb1          0.00       0.00  0.00     0.00    0.00       0.00       0.00      0.00    0.00   0.00   0.00
sdb1          0.00       0.00  1.00     0.00    4.00       0.00       8.00      0.01   14.00  14.00   1.40
sdb1          0.00  116129.00  1.00  1576.00    4.00  434816.00     551.45    125.47   75.38   0.57  90.30
sdb1          0.00   17301.98  0.00   412.87    0.00  106506.93     515.93     24.59   75.40   0.48  19.80
sdb1          0.00       0.00  0.00     0.00    0.00       0.00       0.00      0.00    0.00   0.00   0.00
sdb1          0.00       0.00  0.00     0.00    0.00       0.00       0.00      0.00    0.00   0.00   0.00
sdb1          0.00       0.00  0.00     0.00    0.00       0.00       0.00      0.00    0.00   0.00   0.00


It happens whether I copy a file over the network via Samba or copy/create a file locally.

It looks like the disk takes on more data than it can handle, then chokes and stalls for a few seconds until some buffer drains and it can accept a bit more data again.

It happens on two identical servers, so I'd rule out faulty hardware as the cause and look for a misconfiguration instead.

Are there any guidelines/docs for heavy I/O tuning? Are there any known issues with this RAID controller?

Any help will be appreciated.



Fer



 
Old 02-09-2010, 04:34 PM
"nate"
 
disk I/O problems with LSI Logic RAID controller

Fernando Gleiser wrote:
> We're having a weird disk I/O problem on a CentOS 5.4 server connected
> to external SAS storage through an LSI Logic MegaRAID SAS 1078 controller.

Not sure I know what the issue is but telling us how many disks,
what the RPM of the disks are, and what level of RAID would probably
help.

It sounds like perhaps you have a bunch of 7200 RPM disks in a RAID
setup where the data-to-parity ratio may be way out of whack (e.g. a
high number of data disks per parity disk), which will result in very
poor write performance.

nate


 
Old 02-09-2010, 08:33 PM
Fernando Gleiser
 
disk I/O problems with LSI Logic RAID controller

----- Original Message ----

> From: nate <centos@linuxpowered.net>
>
> Not sure I know what the issue is but telling us how many disks,
> what the RPM of the disks are, and what level of RAID would probably
> help.
>
> It sounds like perhaps you have a bunch of 7200 RPM disks in a RAID
> setup where the data-to-parity ratio may be way out of whack (e.g. a
> high number of data disks per parity disk), which will result in very
> poor write performance.


Yes, it's a bunch of 12 7200 RPM disks organized as 1 hot spare, 2 parity disks, and 9 data disks in a RAID 5 configuration. Is 9/2 a "high ratio"?

Thanks for your help

Fer



 
Old 02-09-2010, 08:53 PM
"nate"
 
disk I/O problems with LSI Logic RAID controller

Fernando Gleiser wrote:

> Yes, it's a bunch of 12 7200 RPM disks organized as 1 hot spare, 2 parity
> disks, and 9 data disks in a RAID 5 configuration. Is 9/2 a "high ratio"?

That's probably RAID 6; I've never heard of RAID 5 with two parity
disks (dual parity is RAID 6).

RAID 6 performance can vary dramatically between controllers. If it
were me, unless you get other responses shortly, I would test other
RAID configurations and see how the performance compares:

RAID 1+0
RAID 5+0 (striped RAID 5 arrays; in your case perhaps 3+1 * 4 with no
hot spares, at least for testing)
RAID 5+0 (5+1 * 2)

RAID 1+0 should come first, though: even if you don't end up using it
in the end, it's good to get a baseline with the fastest configuration.

I would expect the RAID card to support RAID 50, but not all do. If it
doesn't, one option may be to perform the striping with LVM at the OS
level.
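
In case it helps, a minimal sketch of the LVM route, assuming the
controller exposes two RAID 5 LUNs as /dev/sdb and /dev/sdc (the
device names and the 64 KB stripe size are illustrative, not taken
from this thread):

pvcreate /dev/sdb /dev/sdc
vgcreate vg_data /dev/sdb /dev/sdc
# -i 2: stripe across both LUNs; -I 64: 64 KB stripe size
lvcreate -i 2 -I 64 -l 100%FREE -n lv_data vg_data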

nate

 
Old 02-09-2010, 08:56 PM
"nate"
 
disk I/O problems with LSI Logic RAID controller

nate wrote:

> Perhaps RAID 6, as I've never heard of RAID 5 with two parity (two
> parity is dual parity which is RAID 6).

Forgot to mention: my own personal preference, on my high-end SAN at
least, is RAID 5 with a 3:1 data-to-parity ratio, or at most 5:1 or
6:1; never higher than that unless activity is very low.

The RAID controllers on my array are the fastest in the industry, and
despite that, in the near future I am migrating to a 2:1 parity ratio
to get (even) better performance; that brings me to within about 3-4%
of RAID 1+0 performance for typical workloads.

nate


 
Old 02-09-2010, 09:26 PM
Ross Walker
 
disk I/O problems with LSI Logic RAID controller

On Tue, Feb 9, 2010 at 4:33 PM, Fernando Gleiser <fergleiser@yahoo.com> wrote:
> ----- Original Message ----
>
>> From: nate <centos@linuxpowered.net>
>>
>> Not sure I know what the issue is but telling us how many disks,
>> what the RPM of the disks are, and what level of RAID would probably
>> help.
>>
>> It sounds like perhaps you have a bunch of 7200 RPM disks in a RAID
>> setup where the data-to-parity ratio may be way out of whack (e.g. a
>> high number of data disks per parity disk), which will result in very
>> poor write performance.
>
>
> Yes, it's a bunch of 12 7200 RPM disks organized as 1 hot spare, 2 parity
> disks, and 9 data disks in a RAID 5 configuration. Is 9/2 a "high ratio"?

A bit. Your RAID array is set up for a read-mostly workload.

Here is a simple rule. Given a hardware RAID controller with
write-back cache, assume each write will span the whole stripe width
(the controller tries to cache full-stripe writes). If that is the
case, then the write IOPS will equal the IOPS of the slowest disk in
the set, as the next write can't go out until the first one has
finished.
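
As a rough worked example of that rule (the chunk size and per-disk
IOPS are assumptions for illustration, not figures from this thread):

~75 random IOPS per 7200 RPM disk  ->  ~75 full-stripe writes/s
9 data disks * 64 KB chunk          =  576 KB per stripe
75 stripes/s * 576 KB              ~=  42 MB/s sustained, once the cache is full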

Of course with RAID5/RAID6 the write performance can be much, much
worse if a write falls short of the whole stripe width, as the
controller then has to read the rest of the stripe (in order to
calculate parity) before writing the whole stripe out. It sounds like
your data is sequential, though, so this shouldn't happen much, maybe
on the first or last stripes, so the simple rule above is a good guide.

For software RAID5/RAID6, which has no write-back cache to hold a full
stripe, make sure the file system knows the stripe width and hope it
does the right thing.
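
A minimal sketch of telling the file system about the array geometry
(the 64 KB chunk and 9 data disks are assumptions carried over from
above; adjust to the real array):

# ext3: stride = chunk size / 4 KB block size = 64 / 4 = 16
mkfs.ext3 -E stride=16 /dev/sdb1
# XFS: su = chunk size, sw = number of data disks
mkfs.xfs -d su=64k,sw=9 /dev/sdb1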

-Ross
 
Old 02-11-2010, 06:46 AM
Andrzej Szymanski
 
disk I/O problems with LSI Logic RAID controller

On 2010-02-09 18:15, Fernando Gleiser wrote:
> Every time we copy a large file to the storage-backed file system, the disk utilization see-saws: it climbs to 100%, drops to zero for several seconds, then climbs back to 100%, and so on.
> Here is a snippet from iostat -kx 1:
>
> Device:     rrqm/s     wrqm/s   r/s      w/s   rkB/s      wkB/s   avgrq-sz  avgqu-sz   await  svctm  %util
> sdb1          0.00  133811.00  0.00  1889.00    0.00  513660.00     543.84    126.24   65.00   0.47  89.40

The iostat output looks good to me for the RAID setup you have.
I'd look for the problem in a different place:

Note the current value:
cat /proc/sys/vm/dirty_background_ratio
then try
echo 1 > /proc/sys/vm/dirty_background_ratio
and see whether it helps.
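
If that does help, one way to make it persistent (standard sysctl
usage, nothing specific to this setup):

# /etc/sysctl.conf
vm.dirty_background_ratio = 1

# apply without a reboot
sysctl -p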

Andrzej
 
Old 02-11-2010, 02:30 PM
Ross Walker
 
disk I/O problems with LSI Logic RAID controller

On Feb 11, 2010, at 2:46 AM, Andrzej Szymanski <szymans@agh.edu.pl>
wrote:

> On 2010-02-09 18:15, Fernando Gleiser wrote:
>> Every time we copy a large file to the storage-backed file system,
>> the disk utilization see-saws: it climbs to 100%, drops to zero for
>> several seconds, then climbs back to 100%, and so on.
>> Here is a snippet from iostat -kx 1:
>>
>> Device:     rrqm/s     wrqm/s   r/s      w/s   rkB/s      wkB/s   avgrq-sz  avgqu-sz   await  svctm  %util
>> sdb1          0.00  133811.00  0.00  1889.00    0.00  513660.00     543.84    126.24   65.00   0.47  89.40
>
> The iostat output looks good to me for the RAID setup you have.
> I'd look for the problem in a different place:
>
> Note the current value:
> cat /proc/sys/vm/dirty_background_ratio
> then try
> echo 1 > /proc/sys/vm/dirty_background_ratio
> and see whether it helps.

Excellent suggestion. On machines with lots of memory the default
dirty background ratio is way too big and needs to be tuned down, both
for data integrity in the event of a system failure and for
performance of the underlying storage configuration.

Take into account the RAID setup, the write-back cache size, and the
time it takes to empty that cache to disk, and pick a dirty background
ratio somewhere in between.
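
As a rough illustration (the RAM and cache sizes here are assumed, not
from this thread): with 16 GB of RAM, the default
dirty_background_ratio of 10 lets ~1.6 GB of dirty pages pile up
before background writeback starts; if the controller's write-back
cache is 256 MB, something like

echo 2 > /proc/sys/vm/dirty_background_ratio

keeps the writeback backlog (~320 MB) in the same ballpark as what the
cache can absorb in one go.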

-Ross

 
Old 02-11-2010, 09:42 PM
Fernando Gleiser
 
disk I/O problems with LSI Logic RAID controller

----- Original Message ----
> From: Ross Walker <rswwalker@gmail.com>
> To: CentOS mailing list <centos@centos.org>
> Cc: CentOS mailing list <centos@centos.org>
> Sent: Thu, February 11, 2010 12:30:43 PM
> Subject: Re: [CentOS] disk I/O problems with LSI Logic RAID controller
>
> On Feb 11, 2010, at 2:46 AM, Andrzej Szymanski
> wrote:
>
> > The iostat output looks good to me for the RAID setup you have.
> > I'd look for the problem in a different place:
> >
> > Note the current value:
> > cat /proc/sys/vm/dirty_background_ratio
> > then try
> > echo 1 > /proc/sys/vm/dirty_background_ratio
> > and see whether it helps.
>
> Excellent suggestion. On machines with lots of memory the default
> dirty background ratio is way too big and needs to be tuned down, both
> for data integrity in the event of a system failure and for
> performance of the underlying storage configuration.
>
> Take into account the RAID setup, the write-back cache size, and the
> time it takes to empty that cache to disk, and pick a dirty background
> ratio somewhere in between.

You nailed it. I tweaked dirty_background_ratio and changed the I/O scheduler to deadline, and now it works way better. It still see-saws a bit, but the utilization never drops to zero.
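
(For anyone who finds this later, a sketch of the scheduler change;
sdb is the storage LUN from the iostat output above, and the
elevator= boot parameter is the usual way to make it permanent.)

echo deadline > /sys/block/sdb/queue/scheduler

# or permanently, on the kernel line in /boot/grub/grub.conf:
#   elevator=deadline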


Thank you for your help.



Fer



 