On Dec 23, 2010, at 11:21 AM, Christopher Chan <christopher.chan@bradbury.edu.hk> wrote:
> On Thursday, December 23, 2010 11:08 PM, Ross Walker wrote:
>> On Dec 23, 2010, at 2:12 AM, cpolish@surewest.net wrote:
>>
>>> Matt wrote:
>>>> Is ext4 stable on CentOS 5.5 64bit? I have an email server with a
>>>> great deal of disk i/o and was wondering if ext4 would be better than
>>>> ext3 for it?
>>>
>>> Before committing to ext4 on a production server, it
>>> would be good to consider the comments made in
>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
>>> which presumably still apply to current CentOS 5.5 64-bit kernels.
>>> As I read it, Ts'o argues that the apparent loss of stability
>>> compared to ext3 is a design issue in the realm of applications
>>> that run atop it. I hope this is not a misreading.
>>
>> Waiting for applications to be properly written, i.e. to use fsync(), is no way to pick a file system. You'd have the same problems on xfs or any other file system that does delayed writes.
>
> Whoa, whoa. 1) Theodore was not pushing fsync, he was pushing fdatasync
> and switching from storing configuration in thousands of small files to
> everything in one sqlite database or a similar single-file solution.
> 2) You bet applications that are data sensitive had better be properly
> written and make proper and efficient use of system calls such as fsync,
> fdatasync and whatever else there is. 3) Write barriers were introduced
> to ensure that fsync/fdatasync do not lie, unlike the previous behaviour
> where they returned before data was safely written to media. In the case
> of email, you bet the entire toolchain had better do fsync. postmark from
> NetApp as a benchmark for mail delivery was completely laughable because
> it does not use a single fsync call, whereas all credible MTA software
> (sendmail, postfix, qmail, exim) uses fsync/fdatasync where needed.
> Unless you want thousands of zero'd files in, say, the mail queue, you
> had better make sure that both the app and the filesystem do what they
> are supposed to do. That means the app uses fsync/fdatasync, and either
> the filesystem supports write barriers if disk write caches are left on,
> or disk write caches are disabled at a big performance cost.
>
> If a filesystem does not support write barriers (like JFS), you bet that
> is a concern to take note of with regard to your hardware (e.g. do you
> have hardware RAID with sufficient BBU cache?). Then there is the case of
> running on top of LVM, which I suspect does not have write barrier
> support backported to RHEL/CentOS 5.5.
>
>
>>
>> It was only a side effect of ext3's data=ordered mode that caused it to flush dirty pages every 5 seconds. If that's what you want, you can use sysctl to tune the VM to flush every 5 seconds, and that will cover all delayed-write file systems.
>>
>
> More precisely, the journal is committed every 5 seconds no matter what
> the mode.
>
> I'd stick with ext3 + data=journal with the journal either on some uber
> fast and large external BBU NVRAM block device (you can get up to 1TB
> with speeds of 750MiB/sec+ if you have a fat enough bus) or on hardware
> RAID with sufficient BBU cache for an email server. Or anything with
> barrier support through the entire chain (read: no LVM).
I believe barrier support is being deprecated in current kernels. I don't remember what is replacing it, maybe straight FUA. Barriers never performed well enough.

If you have a BBU write cache there is no need to worry about barriers.
-Ross
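
As a rough illustration of the sysctl tuning mentioned in this message: the kernel's writeback timers are expressed in centiseconds, so a 5 second interval corresponds to a value of 500. The sketch below assumes the relevant knobs are vm.dirty_writeback_centisecs and vm.dirty_expire_centisecs (the thread does not name them), and the helper name render_sysctl_conf is made up for this example. It only builds sysctl.conf-style lines; it does not change any system setting.

```python
# Sketch only: build sysctl.conf lines that ask the kernel to write back
# dirty pages on a fixed interval. Which knobs were meant in the thread is
# an assumption; these two are the standard Linux VM writeback timers.

def render_sysctl_conf(flush_seconds):
    """Return sysctl.conf lines for a given flush interval in seconds."""
    if flush_seconds <= 0:
        raise ValueError("flush_seconds must be positive")
    centisecs = int(round(flush_seconds * 100))  # kernel unit is 1/100 s
    return [
        # How often the kernel writeback threads wake up.
        "vm.dirty_writeback_centisecs = %d" % centisecs,
        # How old dirty data must be before it is eligible for writeback.
        "vm.dirty_expire_centisecs = %d" % centisecs,
    ]


if __name__ == "__main__":
    for line in render_sysctl_conf(5):
        print(line)
```

The resulting lines could be appended to /etc/sysctl.conf and loaded with `sysctl -p`. This only bounds how long dirty data waits in memory; it is not a substitute for fsync in the application.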
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
12-24-2010, 02:04 AM
Christopher Chan
Ext4 on CentOS 5.5 64bit
On Friday, December 24, 2010 05:25 AM, Ross Walker wrote:
>> I'd stick with ext3 + data=journal with the journal either on some uber
>> fast and large external BBU NVRAM block device (you can get up to 1TB
>> with speeds of 750MiB/sec+ if you have a fat enough bus) or on hardware
>> RAID with sufficient BBU cache for an email server. Or anything with
>> barrier support through the entire chain (read: no LVM).
>
> I believe barrier support is being deprecated in current kernels. Don't remember what they are replacing them with, straight FUA maybe. Barriers never performed well enough.
Yeah, this time I get a heads up. I did not know about barriers until a
year after they were implemented. But I'll still want to use some NVRAM
solution from umem or fusionio for scenarios where a lot of files are
transient, or created and deleted almost immediately, and high
performance is required. Storage usually means getting the hardware to
mitigate data loss from whatever cause, or ensuring barrier (or
equivalent) support.
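
One way to see whether barriers have been explicitly turned off at the filesystem level is to look at the mount options. A minimal sketch follows; it parses text in /proc/mounts format and flags mounts that carry barrier=0 or nobarrier. Whether a given kernel lists a barrier option at all varies by version and filesystem, so an empty result proves nothing, and this says nothing about LVM or the controller underneath. The function name and the sample text are made up for illustration.

```python
def mounts_without_barriers(mounts_text):
    """Return (device, mountpoint, fstype) for mounts that disable barriers."""
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        opts = set(options.split(","))
        if "nobarrier" in opts or "barrier=0" in opts:
            flagged.append((device, mountpoint, fstype))
    return flagged


if __name__ == "__main__":
    # Hypothetical sample; on a real system use open("/proc/mounts").read().
    sample = (
        "/dev/sda1 / ext3 rw,data=ordered 0 0\n"
        "/dev/sdb1 /var/spool ext3 rw,barrier=0,data=journal 0 0\n"
        "/dev/sdc1 /srv xfs rw,nobarrier 0 0\n"
    )
    for entry in mounts_without_barriers(sample):
        print("barriers disabled:", entry)
```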
12-24-2010, 02:17 AM
Christopher Chan
Ext4 on CentOS 5.5 64bit
On Friday, December 24, 2010 01:03 AM, Les Mikesell wrote:
> On 12/23/2010 10:28 AM, Christopher Chan wrote:
>>
>>>>> Matt wrote:
>>>>>> Is ext4 stable on CentOS 5.5 64bit? I have an email server with a
>>>>>> great deal of disk i/o and was wondering if ext4 would be better than
>>>>>> ext3 for it?
>>>>>
>>>>> Before committing to ext4 on a production server, it
>>>>> would be good to consider the comments made in
>>>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
>>>>> which presumably still apply to current CentOS 5.5 64-bit kernels.
>>>>> As I read it, Ts'o argues that the apparent loss of stability
>>>>> compared to ext3 is a design issue in the realm of applications
>>>>> that run atop it. I hope this is not a misreading.
>>>>
>>>> Waiting for applications to be properly written, i.e. to use fsync(), is no way to pick a file system. You'd have the same problems on xfs or any other file system that does delayed writes.
>>>
>>> But note that the reason applications don't use fsync() when they should
>>> is probably that Linux historically did not implement it in a
>>> reasonable way (i.e. it would flush the entire filesystem buffer and
>>> wait for completion instead of just the requested file's outstanding
>>> blocks). Not sure when/if that was fixed, but it is also probably
>>> behind the old impression that MySQL is faster than PostgreSQL.
>>>
>>
>> Can we drop the fsync nonsense?
>
> No, if you don't remember history you are doomed to repeat it.
Well, come to think of it, I guess most open source apps are developed
on Linux, and so its implementation does colour how devs think about
fsync... never mind that it is done properly on the BSDs and UNIXes.
>
>> Apps that are data sensitive should be
>> using fsync/fdatasync (fsync is a POSIX specification, so the history of
>> how Linux implemented fsync has nothing to do with whether applications
>> used it or not); otherwise they should not even be considered for the
>> task. The lying fsync/fdatasync was fixed when write barrier support was
>> introduced and filesystems were updated to use write barriers. As for
>> the flush of the entire buffer... IIRC, that is specific to ext3, and
>> even that should now be gone with the update to write barrier support.
>
> It's one of those 'have you stopped beating your wife' things. Apps
> that correctly used fsync were slow because of the OS implementation, so
> people used other apps. So now you have popular apps that do things
> wrong.
>
Yeah, funny how ext3 managed to become the dominant Linux filesystem
when it was the one with the flush-everything quirk, and at a time when
fsync did not really honour the 'yes it is safely on the platters'
maxim. Let's thank Red Hat for this mess.
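
For readers wondering what "using fsync properly" looks like from the application side, here is a minimal sketch of the usual write-then-rename pattern, roughly what a mail queue needs when storing a message. It is not taken from any of the MTAs named in this thread; durable_write is a made-up helper name, and the code assumes a POSIX system where a directory can be opened and fsynced. Whether fsync really reaches the platters still depends on the barrier and write-cache issues argued about in this thread.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to path so a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to push it to stable storage
        os.rename(tmp_path, path)   # atomic replace on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)     # do not leave a stray temp file behind
        raise
    # fsync the directory so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as spool:
        target = os.path.join(spool, "msg")
        durable_write(target, b"queued message\n")
        print(open(target, "rb").read())
```

On Linux, os.fdatasync(tmp.fileno()) could be used in place of os.fsync for the file contents; it skips metadata, such as timestamps, that is not needed to read the data back.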