Old 06-19-2008, 12:05 AM
"Mag Gam"
 
Default stride

I am trying to understand the stride option for ext3.

Suppose I am using a hardware RAID (3ware) with 6 disks and decide to go with RAID 5, with a stripe of 128KB (the default on my controller) and no spare.
From reading the documentation, I should use 128/4 as my stride size when creating the file system. I do not understand how this number works or what exactly stride does. Could someone explain this to me?



TIA


_______________________________________________
Ext3-users mailing list
Ext3-users@redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
 
 
Old 06-19-2008, 05:47 AM
Andreas Dilger
 
Default stride

On Jun 18, 2008 20:14 -0400, Mag Gam wrote:
> If I am using a Hardware RAID (3ware) with 6 disks and I decide to go with
> RAID 5 with stripe of 128KB (default on my controller) and no spare.
> By reading documentation I should do 128/4 as my stride size when creating
> the file system. I am not understanding how this number works and what
> exactly stride does. Can someone care to explain this to me?

The "stride" option changes the location of some of the filesystem metadata
so that it isn't all located on the same disk.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

 
Old 06-19-2008, 10:21 AM
"Mag Gam"
 
Default stride

OK, so in a way it's like a stripe? I thought that when you do a stripe, you put the metadata on a number of disks too. How is that different? Is there a diagram I can refer to?


TIA



 
Old 06-19-2008, 11:42 AM
Theodore Tso
 
Default stride

On Thu, Jun 19, 2008 at 06:21:24AM -0400, Mag Gam wrote:
> ok, in a way it's like a stripe? I thought when you do a stripe you put the
> metadata on a number of disks too. How is that different? Is there a diagram I
> can refer to?

Yes, which is why the mke2fs man page states:

    stride=<stripe-size>
        Configure the filesystem for a RAID array with
        <stripe-size> filesystem blocks per stripe.

So if the size of a stripe on each disk is 64k, and you are using a
4k filesystem blocksize, then 64k/4k == 16, and that would be an
"ideal" stride size, in that for each successive block group, the
inode and block bitmaps would be shifted by a further 16 blocks from
the beginning of the block group.
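
The arithmetic above can be sketched in a few lines of Python. The
function name stride_blocks and its parameters are illustrative only and
are not part of mke2fs; the sketch assumes the per-disk stripe size is an
exact multiple of the filesystem block size.

```python
def stride_blocks(chunk_kib: int, block_kib: int = 4) -> int:
    """Return how many filesystem blocks fit in one disk's share of a stripe."""
    if chunk_kib <= 0 or block_kib <= 0:
        raise ValueError("sizes must be positive")
    if chunk_kib % block_kib != 0:
        raise ValueError("chunk size must be a multiple of the block size")
    return chunk_kib // block_kib


print(stride_blocks(64))   # the 64k / 4k example from this message
print(stride_blocks(128))  # the 128KB stripe from the original question
```

For the 128KB stripe in the original question this gives 32, which would
be passed to mke2fs as something like "mke2fs -j -E stride=32 <device>",
assuming an e2fsprogs version that accepts the -E extended options.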

The reason for doing this is to avoid problems where the block bitmap
ends up on the same disk for every single block group. The classic
case where this would happen is if you have 5 disks in a RAID 5
configuration, which means 4 data disks per stripe, and 8192 blocks in
a block group. Since 8192 blocks is a whole number of stripes for
typical power-of-two stripe sizes, if the block bitmap is always at
the same offset from the beginning of the block group, one disk will
get all of the block bitmaps, and that ends up being a major hot spot
for that hard drive.
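
A rough way to see this hot spot is to compute which data disk holds the
block bitmap of each block group. The sketch below is a simplified model,
not how any RAID implementation actually lays out data: it ignores RAID 5
parity rotation, places the bitmap at a fixed offset of 1 block in each
group, and borrows the 16-block per-disk stripe from the earlier example.
The names bitmap_disk and bitmap_spread are made up for illustration.

```python
from collections import Counter


def bitmap_disk(group: int, data_disks: int, chunk_blocks: int,
                blocks_per_group: int, bitmap_offset: int = 1) -> int:
    """Index of the data disk holding a group's block bitmap (parity ignored)."""
    block = group * blocks_per_group + bitmap_offset
    return (block // chunk_blocks) % data_disks


def bitmap_spread(data_disks: int, chunk_blocks: int = 16,
                  blocks_per_group: int = 8192, groups: int = 240) -> Counter:
    """Count how many block bitmaps land on each data disk."""
    return Counter(
        bitmap_disk(g, data_disks, chunk_blocks, blocks_per_group)
        for g in range(groups)
    )


# 3, 4 and 5 data disks correspond to 4-, 5- and 6-disk RAID 5 arrays.
for n in (3, 4, 5):
    print(n, "data disks:", dict(bitmap_spread(n)))
```

Under these assumptions, 4 data disks put every bitmap on the same disk,
while 3 or 5 data disks spread them evenly.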

As it turns out, if you use 4 disks in a RAID 5 configuration, or 6
disks in a RAID 5 configuration, this problem doesn't arise at all,
and you don't need to use the stride option. And in most cases,
simply using stride=1 is actually enough to make sure that the
block and inode bitmaps get forced onto successively different
disks.
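
The effect of the stride option can be sketched with a toy model in which
the bitmap of block group g is shifted by stride * g blocks from the start
of the group. This is a simplification of what mke2fs does, used here only
for illustration; parity rotation is ignored, and the function name
disks_used and the default parameters (4 data disks, 16-block per-disk
stripe, 8192 blocks per group) are assumptions.

```python
def disks_used(stride: int, data_disks: int = 4, chunk_blocks: int = 16,
               blocks_per_group: int = 8192, groups: int = 256) -> set:
    """Data disks that receive at least one block bitmap in the toy model."""
    used = set()
    for g in range(groups):
        # Toy model: the bitmap moves `stride` more blocks into each group.
        offset = (stride * g) % blocks_per_group
        block = g * blocks_per_group + offset
        used.add((block // chunk_blocks) % data_disks)
    return used


for s in (0, 1, 16):
    print("stride", s, "->", sorted(disks_used(s)))
```

In this model, no stride leaves every bitmap on one disk, stride=16 moves
to the next disk with every block group, and stride=1 also reaches all
disks but only changes disk once every 16 block groups.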

With ext4's flex_bg enhancement, the need to specify the stride option
for RAID arrays will also go away.

- Ted

 
Old 06-20-2008, 01:17 AM
"Mag Gam"
 
Default stride

What happens if you use hardware RAID: should the stride option still be considered? It seems you are referring to software RAID, correct?
TIA


 
Old 06-20-2008, 02:08 AM
Theodore Tso
 
Default stride

On Thu, Jun 19, 2008 at 09:17:45PM -0400, Mag Gam wrote:
> What happens if you use a hardware raid, should the stride option be
> considered? It seems you are referring to software raid, correct?

It doesn't matter whether it is hardware or software raid. What
matters is the *geometry* of the RAID array, i.e., how many
filesystem blocks are in an individual disk's stripe, and how many
disks are in use (minus how many parity disks are in use). This
information may be somewhat more hidden in a hardware raid array, but
it is possible to extract this information, and most hardware raid
arrays will allow you to configure these parameters as well, to
varying degrees of flexibility.
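
The geometry described here can be captured in a small record. The class
RaidGeometry and its field names are hypothetical, and the example values
assume that the 128KB figure reported by the controller in the original
question is the per-disk stripe size and that the filesystem uses 4k
blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaidGeometry:
    total_disks: int
    parity_disks: int
    chunk_kib: int      # size of one disk's share of a stripe
    block_kib: int = 4  # filesystem block size

    @property
    def data_disks(self) -> int:
        """Disks in use minus parity disks."""
        return self.total_disks - self.parity_disks

    @property
    def stride(self) -> int:
        """Filesystem blocks in an individual disk's stripe."""
        return self.chunk_kib // self.block_kib

    @property
    def full_stripe_blocks(self) -> int:
        """Filesystem blocks in one stripe across all data disks."""
        return self.stride * self.data_disks


# 6-disk RAID 5 with no spare: one disk's worth of parity.
array = RaidGeometry(total_disks=6, parity_disks=1, chunk_kib=128)
print(array.data_disks, array.stride, array.full_stripe_blocks)
```

The same record works for a hardware or a software array; only the way
the numbers are obtained differs.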

- Ted

 
Old 06-20-2008, 10:21 AM
"Mag Gam"
 
Default stride

Ted,

This is the type of information I was looking for. No one seems to explain this well.

Also, on the same topic: for a very large filesystem, e.g. 3TB, should I consider anything special, something like -O dir_index? I am looking for peak performance.



TIA


 
Old 06-22-2008, 12:34 AM
Christian Kujau
 
Default stride

On Fri, 20 Jun 2008, Mag Gam wrote:

> consider anything special, something like -O dir_index? I am looking for
> peak performance.


Depends on how many files, directories, small/big files,
reads/writes...etc.


There are various benchmarks and tuning hints for ext3 around, but if you
want peak performance, you're better off testing *your* application with
different mkfs/mount options and see what's best for *you*.


my 2 cents,
C.
--
BOFH excuse #391:

We already sent around a notice about that.
