01-31-2009, 11:45 AM
Ric Wheeler
 
barrier and commit options?

Theodore Tso wrote:
- If I remember the details correctly, Chris Mason has demonstrated a
50% chance of corrupting directory entries in ext3, for example.



Chris Mason has a script which forces the system to be under a lot of
memory pressure, and in that scenario, it is highly likely that
without barriers, there will be filesystem corruptions if the system
is abruptly turned off while his script is running.

Andrew Morton has been resistant to making barrier=1 the default
for ext3 because (as I understand it) he disbelieves that this is an
adequate real-world example, and there is a real performance hit to
running with barriers.


If you have a battery backed write cache (say, in a high end array)
barriers can be ignored since the storage can effectively make that
write cache non-volatile, but otherwise, this is pretty key for
anyone wanting to maintain data integrity.




That's what I was getting at: array controllers with a battery backed
write cache (BBWC). We disable the write cache on the physical
disks and provide no mechanism to re-enable the cache except in
some SATA configurations.



Well, we still need the barrier on the block I/O elevator side to
make sure that requests don't get reordered in the block layer. But
what you're saying is that once the write is posted to the array, it
is guaranteed that it is on "stable storage" (even if it is BBWC) such
that if someone hits the Big Red Switch at the exit to the data
center, and power is forcibly cut from the entire data center in case
of a fire, the battery will still keep the cache alive, at least until
the sprinklers go off, anyway, right? :-)



Yes, true....

In that case, I suspect the right thing for the cciss array to do is
to ignore the barrier, but not to return an error. If you return an
error, and refuse the write with barrier operation (which is what the
cciss driver seems to be doing starting in 2.6.29-rcX), ext4 will
retry the write without the barrier, at which point we are vulnerable
to the block layer reordering things at the I/O scheduler layer. In
effect, you're claiming that every single write to cciss is implicitly
a "barrier write" in that once it is received by the device, it is
guaranteed not to be lost even if the power to the entire system is
forcibly removed.

- Ted
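
The failure mode described above can be illustrated with a toy Python model. This is only a sketch under stated assumptions, not kernel code: the `Request` class, the sector numbers, the labels, and the sort-by-sector "elevator" are all hypothetical stand-ins. It shows why a driver that rejects barrier writes, causing the filesystem to retry without the barrier, can let the I/O scheduler reorder a commit ahead of the data it depends on, while a driver that accepts the barrier (even if the hardware then ignores it) keeps the order.

```python
from dataclasses import dataclass


@dataclass
class Request:
    sector: int           # hypothetical on-disk location
    label: str            # human-readable name for the write
    barrier: bool = False


def elevator_order(queue):
    """Toy elevator: sort by sector, but never move a request across a barrier."""
    out, run = [], []
    for req in queue:
        if req.barrier:
            # Everything queued before the barrier is dispatched first.
            out.extend(sorted(run, key=lambda r: r.sector))
            run = []
            out.append(req)
        else:
            run.append(req)
    out.extend(sorted(run, key=lambda r: r.sector))
    return out


def submit(queue, req, driver_accepts_barriers):
    """Queue a write, mimicking the retry-without-barrier fallback."""
    if req.barrier and not driver_accepts_barriers:
        # The driver refused the barrier write, so it is retried as a plain write.
        req = Request(req.sector, req.label, barrier=False)
    queue.append(req)


def dispatch_order(driver_accepts_barriers):
    """Return the order in which the toy elevator would dispatch two writes."""
    queue = []
    submit(queue, Request(900, "journal data"), driver_accepts_barriers)
    submit(queue, Request(100, "commit block", barrier=True), driver_accepts_barriers)
    return [r.label for r in elevator_order(queue)]


if __name__ == "__main__":
    print("barrier accepted:", dispatch_order(True))
    print("barrier rejected:", dispatch_order(False))
```

In this model the rejected-barrier case dispatches the commit block first, which is exactly the reordering exposure the paragraph above warns about.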



Aren't barriers still tied to the state of the write cache on the target
drive? In other words, if the write cache is off, we disable barriers
automatically. I think that this happens for SCSI in sd_revalidate_disk().


In this case, it sounds like we have tangled the need to flush a drive's
write cache with the need to not reorder I/O in the elevator code.


Ric
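
The two concerns raised in this message (flushing a volatile write cache versus preventing reordering in the elevator) can be kept apart, as the following minimal Python sketch tries to show. It is an illustration only and does not reproduce what sd_revalidate_disk() does; `BarrierPolicy`, `barrier_policy`, and both parameters are hypothetical names introduced for this example.

```python
from typing import NamedTuple


class BarrierPolicy(NamedTuple):
    preserve_order: bool  # keep the elevator from reordering across the barrier
    flush_cache: bool     # issue a cache flush to the device


def barrier_policy(write_cache_enabled: bool, cache_is_nonvolatile: bool) -> BarrierPolicy:
    """Decide the two halves of a barrier independently (toy model)."""
    # A flush only matters if acknowledged data could be lost on power failure,
    # i.e. the write cache is on and it is not battery backed.
    needs_flush = write_cache_enabled and not cache_is_nonvolatile
    # Ordering in the block layer is needed regardless of the cache state.
    return BarrierPolicy(preserve_order=True, flush_cache=needs_flush)


if __name__ == "__main__":
    print(barrier_policy(write_cache_enabled=True, cache_is_nonvolatile=False))
    print(barrier_policy(write_cache_enabled=True, cache_is_nonvolatile=True))
    print(barrier_policy(write_cache_enabled=False, cache_is_nonvolatile=False))
```

Under these assumptions, turning the write cache off (or making it non-volatile) removes the need for the flush but not the need for ordering.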


_______________________________________________
Ext3-users mailing list
Ext3-users@redhat.com
https://www.redhat.com/mailman/listinfo/ext3-users
 
02-02-2009, 02:55 PM
"Miller, Mike (OS Dev)"
 
barrier and commit options?

Theodore Tso wrote:

>
> Well, we still need the barrier on the block I/O elevator
> side to make sure that requests don't get reordered in the
> block layer. But what you're saying is that once the write
> is posted to the array, it is guaranteed that it is on
> "stable storage" (even if it is BBWC) such that if someone
> hits the Big Red Switch at the exit to the data center, and
> power is forcibly cut from the entire data center in case of
> a fire, the battery will still keep the cache alive, at least
> until the sprinklers go off, anyway, right? :-)

That's an accurate assessment. ;-)

>
> In that case, I suspect the right thing for the cciss array
> to do is to ignore the barrier, but not to return an error.

We agree and will fix the IO error.

> If you return an error, and refuse the write with barrier
> operation (which is what the cciss driver seems to be doing
> starting in 2.6.29-rcX), ext4 will retry the write without
> the barrier, at which point we are vulnerable to the block
> layer reordering things at the I/O scheduler layer. In
> effect, you're claiming that every single write to cciss is
> implicitly a "barrier write" in that once it is received by
> the device, it is guaranteed not to be lost even if the power
> to the entire system is forcibly removed.

Of course, we can't cover all possible scenarios like the data center exploding or something crazy. But under _most_ circumstances the data will remain in cache for up to 72 hours of no power. So if there is a complete power outage the controller will write any cached data (in order) to the disks on the next power up.

-- mikem
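
As a rough picture of the behavior described above, the following toy Python model treats the battery backed write cache as an ordered log that is replayed to disk on the next power up. It is a sketch under assumptions, not a description of the cciss controller firmware: the `ToyBBWC` class, its methods, and the sector/data values are hypothetical.

```python
class ToyBBWC:
    """Toy battery backed write cache: writes are acknowledged once cached."""

    def __init__(self):
        self.cache = []  # (sector, data) pairs in arrival order
        self.disk = {}   # sector -> data actually on the platters

    def write(self, sector, data):
        # The write is considered stable as soon as it sits in the
        # battery backed cache, before it reaches the disk.
        self.cache.append((sector, data))

    def power_up(self):
        """Replay cached writes to disk in the order they were received."""
        replayed = []
        for sector, data in self.cache:
            self.disk[sector] = data
            replayed.append(sector)
        self.cache.clear()
        return replayed


if __name__ == "__main__":
    ctrl = ToyBBWC()
    ctrl.write(7, "a")
    ctrl.write(3, "b")
    ctrl.write(7, "c")
    # Power is lost here; the battery preserves ctrl.cache.
    print(ctrl.power_up())
    print(ctrl.disk)
```

Because the replay preserves arrival order, a later write to the same sector still wins after the outage in this model.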
