new "large" fileserver config questions (CentOS mailing list, archived at linux-archive.org)

Keith Keller 10-02-2012 03:39 AM

new "large" fileserver config questions
 
Hi all,

I was recently charged with configuring a new fairly large (24x3TB
disks) fileserver for my group. I think I know mostly what I want to do
with it, but I did have two questions, at least one of which is directly
related to CentOS.

1) The controller node has two 90GB SSDs that I plan to use as a
bootable RAID1 system disk. What is the preferred method for laying
out the RAID array? I found this document on the wiki:

http://wiki.centos.org/HowTos/Install_On_Partitionable_RAID1

But that seems somewhat nonstandard. From what I've read in the RHEL6
docs, the anaconda-supported RAID1 install is the alternative the wiki
mentions: partitioning each disk and adding the partitions to the
appropriate RAID1 arrays.

So, is there a happy medium, where anaconda more directly supports the
partitionable RAID1 install method? And if so, what are the drawbacks
to such a configuration? The wiki talks about the advantages but
doesn't really address any disadvantages.
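
(For concreteness, my understanding of the wiki's approach, with purely
hypothetical device names, boils down to something like:

# build a partitionable RAID1 from the two whole SSDs
mdadm --create /dev/md_d0 --auto=mdp --level=1 --raid-devices=2 /dev/sda /dev/sdb
# then partition the array itself rather than the member disks
fdisk /dev/md_d0    # yields /dev/md_d0p1, /dev/md_d0p2, ...

whereas anaconda partitions each disk first and builds one RAID1 per
partition pair.)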

2) With large arrays you often hear about "aligning the filesystem to
the disk". Is there a fairly standard way (I hope using only CentOS
tools) of going about this? Are the various mkfs tools smart enough to
figure out how an array is aligned on its own, or is sysadmin
intervention required on such large arrays? (If it helps any, the disk
array is backed by a 3ware 9750 controller. I have not yet decided how
many disks I will use in the array, if that influences the alignment.)

--keith


--
kkeller@wombat.san-francisco.ca.us



John R Pierce 10-02-2012 04:57 AM

new "large" fileserver config questions
 
On 10/01/12 8:39 PM, Keith Keller wrote:
> The controller node has two 90GB SSDs that I plan to use as a
> bootable RAID1 system disk. What is the preferred method for laying
> out the RAID array?

a server makes very little use of its system disks after it's booted;
everything it needs ends up in cache pretty quickly, and you typically
don't reboot a server very often. why waste SSD for that?

I'd rather use the SSDs for something like LSI Logic's CacheCade v2 (but
this requires you use an LSI SAS raid card too)

> 2) With large arrays you often hear about "aligning the filesystem to
> the disk". Is there a fairly standard way (I hope using only CentOS
> tools) of going about this? Are the various mkfs tools smart enough to
> figure out how an array is aligned on its own, or is sysadmin
> intervention required on such large arrays? (If it helps any, the disk
> array is backed by a 3ware 9750 controller. I have not yet decided how
> many disks I will use in the array, if that influences the alignment.)

I would suggest not using more than 10-11 disks in a single raid group,
or the rebuild times get hellaciously long (11 x 3TB SAS2 RAID6 took 12
hours to rebuild when I ran tests). if this is for nearline bulk
storage, I'd use 2 disks as hot spares and have 2 separate RAID5 or
RAID6 groups of 11 disks, then stripe those together so it's raid 5+0 or
6+0. if this is for higher-performance storage, I would build mirrors
and stripe them (raid 1+0)

re: alignment, use the whole disks, without partitioning. then there are
no alignment issues. use a raid block size of something like 32k. if
you need multiple file systems, put the whole mess into a single LVM vg,
and create your logical volumes in lvm.
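
as a rough sketch (device names are examples; the 3ware card would
present each RAID group as a single disk):

pvcreate /dev/sdb /dev/sdc           # the two hardware RAID6 units
vgcreate datavg /dev/sdb /dev/sdc
# stripe the LV across both units for the 6+0 effect
lvcreate -i 2 -I 512 -L 10T -n data datavg
mkfs.xfs /dev/datavg/data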




--
john r pierce N 37, W 122
santa cruz ca mid-left coast


Rafa Griman 10-02-2012 07:25 AM

new "large" fileserver config questions
 
Hi :)

On Tue, Oct 2, 2012 at 6:57 AM, John R Pierce <pierce@hogranch.com> wrote:
> On 10/01/12 8:39 PM, Keith Keller wrote:
>> The controller node has two 90GB SSDs that I plan to use as a
>> bootable RAID1 system disk. What is the preferred method for laying
>> out the RAID array?
>
> a server makes very little use of its system disks after it's booted;
> everything it needs ends up in cache pretty quickly, and you typically
> don't reboot a server very often. why waste SSD for that?
>
> I'd rather use the SSDs for something like LSI Logic's CacheCade v2 (but
> this requires you use an LSI SAS raid card too)

Just to add to this comment: you can also use the SSD drives to store
the logs/journals/metadata/whatever you call it.

As an example, with XFS you would use the -l option.
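
Something like this, with placeholder device names:

# journal on a small SSD device, data on the big array
mkfs.xfs -l logdev=/dev/ssd1,size=128m /dev/bigarray
# an external log also has to be named at mount time
mount -o logdev=/dev/ssd1 /dev/bigarray /export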

Rafa

Nux! 10-02-2012 07:59 AM

new "large" fileserver config questions
 
On 02.10.2012 08:25, Rafa Griman wrote:
> Just add to this comment that you can also use the SSD drives to
> store
> the logs/journals/metadata/whatever_you_call_it.
>
> As an example, with XFS you would use the -l option.
>
> Rafa

I'd use the SSDs for bcache/flashcache.

--
Sent from the Delta quadrant using Borg technology!

Nux!
www.nux.ro

Akemi Yagi 10-02-2012 09:10 AM

new "large" fileserver config questions
 
On Tue, Oct 2, 2012 at 12:59 AM, Nux! <nux@li.nux.ro> wrote:

> I'd use the SSDs for bcache/flashcache.

Try kmod-flashcache [1] and flashcache-utils [2] from ELRepo. They are
still in the testing repository but seem to work well. Some testimonials
and an additional package by John Newbigin can be found here [3].
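
Installing them should be something like this (assuming the
elrepo-release package is already set up):

yum --enablerepo=elrepo-testing install kmod-flashcache flashcache-utils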

Akemi

[1] http://elrepo.org/tiki/kmod-flashcache
[2] http://elrepo.org/tiki/flashcache-utils
[3] https://groups.google.com/forum/?fromgroups=#!topic/flashcache-dev/sHnurG502eo

John Doe 10-02-2012 09:30 AM

new "large" fileserver config questions
 
From: Keith Keller <kkeller@wombat.san-francisco.ca.us>

> 1) The controller node has two 90GB SSDs that I plan to use as a
> bootable RAID1 system disk. What is the preferred method for laying
> out the RAID array?

See the "Deployment Considerations" about SSDs and RAID:
https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/newmds-ssdtuning.html

> 2) With large arrays you often hear about "aligning the filesystem to
> the disk". Is there a fairly standard way (I hope using only CentOS
> tools) of going about this? Are the various mkfs tools smart enough to
> figure out how an array is aligned on its own, or is sysadmin
> intervention required on such large arrays? (If it helps any, the disk
> array is backed by a 3ware 9750 controller. I have not yet decided how
> many disks I will use in the array, if that influences the alignment.)

From memory:
For alignment, the first partition starts at sector 2048.
For the filesystem, call mkfs with the appropriate -E stride=xxx,stripe-width=yyy
stride = RAID_stripe_size_KB / FS_block_size_KB
stripe-width = stride * RAID_number_of_data_holding_disks (for example, n-2 for RAID6)
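
A made-up example: a 64KB RAID stripe size, 4KB filesystem blocks, and a
13-disk RAID6 (11 data disks) would give:

# stride = 64/4 = 16 ; stripe-width = 16 * 11 = 176
mkfs.ext4 -E stride=16,stripe-width=176 /dev/sdb1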

JD

Keith Keller 10-02-2012 06:47 PM

new "large" fileserver config questions
 
On 2012-10-02, John R Pierce <pierce@hogranch.com> wrote:
>
> a server makes very little use of its system disks after it's booted;
> everything it needs ends up in cache pretty quickly, and you typically
> don't reboot a server very often. why waste SSD for that?

I think the impetus (which I wasn't totally on top of) was to maximize
the number of drive bays in the controller node. So the bays are 2.5"
instead of 3.5", and finding 2.5" 'enterprise' SATA drives is fairly
nontrivial from what I can tell. I don't actually need 8 2.5" drive
bays, so that was an oversight on my part.

After reading the SSD/RAID docs that John Doe posted, I am a little
concerned, but I think my plan will be to use these disks as I
originally planned, and if they fail too quickly, find some 2.5"
magnetic drives and RAID1 them instead. I may also end up putting /tmp,
/var, and swap on the disk array instead of on the SSD array, and treat
the SSD array as just the write-seldom parts of the OS (e.g., /boot,
/usr, /usr/local). If I do that I should be able to alleviate any
issues with excessive writing of the SSDs.
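
Roughly this sort of split, in fstab terms (all device names here are
hypothetical):

/dev/md0p1       /      ext4  defaults  0 1   # SSD RAID1, write-seldom
/dev/md0p2       /boot  ext4  defaults  0 2   # SSD RAID1
/dev/datavg/var  /var   xfs   defaults  0 2   # big array, write-heavy
/dev/datavg/tmp  /tmp   xfs   defaults  0 2   # big array
/dev/datavg/swap swap   swap  defaults  0 0   # big array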

I am not sure what drives I have, but I have seen claims of "enterprise"
SSDs which are designed to run 24/7 and to tolerate more writes before
wearing out. Has anyone had experience with these drives?

> re: alignment, use the whole disks, without partitioning. then there are
> no alignment issues. use a raid block size of something like 32k. if
> you need multiple file systems, put the whole mess into a single LVM vg,
> and create your logical volumes in lvm.

So, something like mkfs.xfs will be able to determine the proper stride
and stripe settings from whatever the 3ware controller presents? (The
controller of course uses whole disks, not partitions.) From reading
other sites and lists I had the (perhaps mistaken) impression that this
was a delicate operation, and not getting it exactly correct would cause
performance issues, possibly set fire to the entire data center, and
even cause the next big bang.

--keith

--
kkeller@wombat.san-francisco.ca.us



Rafa Griman 10-03-2012 07:29 AM

new "large" fileserver config questions
 
Hi :)

On Tue, Oct 2, 2012 at 8:47 PM, Keith Keller
<kkeller@wombat.san-francisco.ca.us> wrote:
> On 2012-10-02, John R Pierce <pierce@hogranch.com> wrote:
>>
>> a server makes very little use of its system disks after it's booted;
>> everything it needs ends up in cache pretty quickly, and you typically
>> don't reboot a server very often. why waste SSD for that?
>
> I think the impetus (which I wasn't totally on top of) was to maximize
> the number of drive bays in the controller node. So the bays are 2.5"
> instead of 3.5", and finding 2.5" 'enterprise' SATA drives is fairly
> nontrivial from what I can tell. I don't actually need 8 2.5" drive
> bays, so that was an oversight on my part.
>
> After reading the SSD/RAID docs that John Doe posted, I am a little
> concerned, but I think my plan will be to use these disks as I
> originally planned, and if they fail too quickly, find some 2.5"
> magnetic drives and RAID1 them instead. I may also end up putting /tmp,
> /var, and swap on the disk array instead of on the SSD array, and treat
> the SSD array as just the write-seldom parts of the OS (e.g., /boot,
> /usr, /usr/local). If I do that I should be able to alleviate any
> issues with excessive writing of the SSDs.


If it works for you ... I mean, there's no perfect partition scheme
(IMHO); it depends greatly on what you do, your budget, workflow, file
sizes, ... So if you're happy with this, go ahead. Just some advice:
test a couple of different options first, just in case ;)


> I am not sure what drives I have, but I have seen claims of "enterprise"
> SSDs which are designed to run 24/7 and to tolerate more writes before
> wearing out. Has anyone had experience with these drives?
>
>> re: alignment, use the whole disks, without partitioning. then there are
>> no alignment issues. use a raid block size of something like 32k. if
>> you need multiple file systems, put the whole mess into a single LVM vg,
>> and create your logical volumes in lvm.
>
> So, something like mkfs.xfs will be able to determine the proper stride
> and stripe settings from whatever the 3ware controller presents?


Yup, though you've also got the su and sw options in case you want to
play around ... With XFS you shouldn't have to use su and sw; in fact
you shouldn't have to use many options at all, since mkfs.xfs tries to
autodetect and use the best values. Check the XFS FAQ.
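
If you did want to set them by hand, it would look something like this
(hypothetical values, for a 64k chunk across 11 data disks):

mkfs.xfs -d su=64k,sw=11 /dev/datavg/data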


> (The
> controller of course uses whole disks, not partitions.) From reading
> other sites and lists I had the (perhaps mistaken) impression that this
> was a delicate operation, and not getting it exactly correct would cause
> performance issues, possibly set fire to the entire data center, and
> even cause the next big bang.


Nope, just mass extinction of the Human Race. Nothing to worry about.

HTH

Rafa

Keith Keller 10-03-2012 06:01 PM

new "large" fileserver config questions
 
On 2012-10-03, Rafa Griman <rafagriman@gmail.com> wrote:
>
> If it works for you ... I mean, there's no perfect partition scheme
> (IMHO); it depends greatly on what you do, your budget, workflow, file
> sizes, ... So if you're happy with this, go ahead. Just some advice:
> test a couple of different options first, just in case ;)

Well, given the warnings about SSD endurance, I didn't want to do
excessive testing and contribute to faster wear. But I've been reading
around, and perhaps I'm just overreacting. For example:

http://www.storagesearch.com/ssdmyths-endurance.html

This article talks about RAID1 potentially being better for increasing
SSD lifetime, despite the full write that mdadm will want to do.

So, for now, let's just pretend that these disks are not SSDs but
regular magnetic disks. Do people have preferences for either of the
methods for creating a bootable RAID1 I mentioned in my OP? I like the
idea of using a partitionable RAID, but the instructions seem
cumbersome. The anaconda method is straightforward, but simply creates
RAID1 partitions, AFAICT, which is fine until a disk needs to be
replaced, at which point it gets slightly annoying.

> Yup, though you've also got the su and sw options in case you want to
> play around ... With XFS you shouldn't have to use su and sw; in fact
> you shouldn't have to use many options at all, since mkfs.xfs tries to
> autodetect and use the best values. Check the XFS FAQ.

Well, I'm also on the XFS list, and there are varying opinions on this.
From what I can tell most XFS experts suggest just as you do--don't
second-guess mkfs.xfs, and let it do what it thinks is best. That's
certainly what I've done in the past. But there's a vocal group of
posters who think this is incredibly foolish, and strongly suggest
determining these numbers on your own. If there were a straightforward
way to do this with standard CentOS tools (well, plus tw_cli if needed)
then I could try both methods and see which worked better. John Doe
suggested a guideline which I may try out. But my gut instinct is that
I shouldn't try to second-guess mkfs.xfs.
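
(For the 3ware side, I believe something like this reports a unit's
stripe size, with the controller/unit numbers just being examples:

tw_cli /c0/u0 show all

which would at least give me the numbers to plug into John Doe's
formula.)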

> Nope, just mass extinction of the Human Race. Nothing to worry about.

So, it's a win-win? ;-)

--keith

--
kkeller@wombat.san-francisco.ca.us



John R Pierce 10-04-2012 02:25 AM

new "large" fileserver config questions
 
On 10/02/12 2:10 AM, Akemi Yagi wrote:
> On Tue, Oct 2, 2012 at 12:59 AM, Nux! <nux@li.nux.ro> wrote:
>
>> I'd use the SSDs for bcache/flashcache.
>
> Try kmod-flashcache [1] and flashcache-utils [2] from ELRepo. They are
> still in the testing repository but seem to work well. Some testimonials
> and an additional package by John Newbigin can be found here [3].


I'm looking for those, but not seeing them...


# yum list --enablerepo=epel-testing kmod-flashcache
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirrors.ecvps.com
* epel: mirrors.kernel.org
* epel-testing: mirrors.kernel.org
* extras: linux.mirrors.es.net
* updates: mirrors.easynews.com
Error: No matching Packages to list

# cat /etc/redhat-release
CentOS release 6.3 (Final)

# uname -a
Linux xxxxx.xxx.domain.com 2.6.32-279.9.1.el6.x86_64 #1 SMP Tue Sep 25
21:43:11 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux



--
john r pierce N 37, W 122
santa cruz ca mid-left coast


