Guillem Jover 11-27-2010 09:46 AM

The fsync issue
 
On Sat, 2010-11-27 at 01:41:19 -0600, Jonathan Nieder wrote:
> Guillem Jover wrote:
> > Unfortunately that patch does not seem very appealing, it's Linux only,
> > not even in mainline, and it would need for dpkg to track on which file
> > system each file is located and issue such ioctl once per file system.
> >
> > I'd rather not complicate the dpkg source code even more for something
> > that seems to me to be a bug or misfeature in the file system. More so
> > when there's a clear fix (nodelalloc) that solves both the performance
> > and data safety issues in general.
>
> I don't really understand this point of view: isn't the fsync storm
> going to cause seeky I/O on just about all file systems?

Well sure it might, but then some seem to be able to cope just fine, even
ext4 with nodelalloc. Also seeks might stop being that relevant (in the
mid/long term) once SSDs become more widespread.

> So the POSIX primitives are not rich enough to express what we want to
> happen. Delayed allocation is pretty much essential for the use case
> ubifs targets, so it doesn't make much sense to me to pretend it
> doesn't exist.

As long as delayed allocation is a synonym for zero-length files, then
I personally consider it a misfeature. This is data loss we are talking
about, and while data coming from packages is easily recoverable
although cumbersome, user data might not be. We got fsck, journals and
similar to recover from system crashes, and now we get zero-length
files in the name of performance, it seems clear to me that's a
regression.

Anyway my thinking process goes a bit like this: There's currently a
handful of programs doing the complete write+fsync+rename dance, with
the file systems which need it penalizing them heavily. If more programs
start to get "fixed" to do the fsyncs then the situation overall will just
worsen. And then at that point I think it's completely unreasonable
to expect every userland program to add such complexity and unportable
hack over hack to work around the file system problems.
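
[Editorial illustration] The write+fsync+rename sequence referred to above can be sketched roughly as follows. This is a minimal Python sketch, not dpkg's actual C code; the helper name `atomic_replace` and the `.tmp` suffix are assumptions made for the example, and it does not fsync the containing directory.

```python
import os


def atomic_replace(path, data):
    """Write `data` to a temporary file, flush it to disk, then rename it
    over `path`. Hypothetical helper, for illustration only."""
    tmp_path = path + ".tmp"  # assumed temporary name next to the target
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to push the data to storage
    # rename() replaces the target atomically on POSIX file systems, so
    # readers see either the old contents or the complete new contents.
    os.rename(tmp_path, path)
```

The fsync() before the rename is the step that becomes expensive when it is repeated for every file in a package, which is the cost under discussion in this thread.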

For non-technical users, data safety should be way more important than
performance, having to recover a hosed system might mean they'd just
reinstall it. For technical users I see the options as follows: help
fix the file system to perform reasonably with fsync() or not lose
data w/o fsync(), use another file system, use other better mount
options, use dpkg --force-unsafe-io and cope with data loss.

But then I think I've said most of this elsewhere already.

> I'll look into a (Linux-specific, obviously) patch to add a function
> that takes an array of paths and performs the relevant syncs of
> filesystems where that ioctl exists tomorrow. I would rather see a
> system call that just takes an array of paths, since I imagine
> filesystems like btrfs could do something good with that, but since
> there are no VFS primitives for it I can see why that wasn't proposed.

Tracking fds is going to be easier, at that point dpkg already has
the stat information, so it could queue an fd per unique st_dev for
example.
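
[Editorial illustration] The idea of queueing one fd per unique st_dev might look roughly like this. It is a sketch under stated assumptions: `sync_once_per_filesystem` is a hypothetical name, and `sync_fs` is a caller-supplied stand-in for whatever per-filesystem sync call ends up being available, since the ioctl discussed here is not in mainline.

```python
import os


def sync_once_per_filesystem(paths, sync_fs):
    """Call sync_fs(fd) once per distinct st_dev among `paths`.

    `sync_fs` is a placeholder for a per-filesystem sync primitive.
    Returns the number of distinct file systems seen.
    """
    fd_by_dev = {}
    try:
        for path in paths:
            dev = os.stat(path).st_dev
            if dev not in fd_by_dev:
                # Keep a single open fd as the handle for this file system.
                fd_by_dev[dev] = os.open(path, os.O_RDONLY)
        for fd in fd_by_dev.values():
            sync_fs(fd)
    finally:
        for fd in fd_by_dev.values():
            os.close(fd)
    return len(fd_by_dev)
```

Passing `os.fsync` as `sync_fs` would only flush the one queued file per device, so it serves as a placeholder for testing the bookkeeping, not as a substitute for a real per-filesystem sync.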

regards,
guillem


--
To UNSUBSCRIBE, email to debian-dpkg-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/20101127104636.GB14860@gaara.hadrons.org

Ben Hutchings 11-27-2010 02:01 PM

On Sat, 2010-11-27 at 07:59 +0100, Guillem Jover wrote:
> Hi Ben!
>
> On Fri, 2010-11-26 at 13:31:20 +0000, Ben Hutchings wrote:
> > Just got this from Christoph Hellwig:
> >
> > 13:23 < hch> bwh: if you guys are interested in helping dpkg review and ack the
> > per-fs sync ioctl path that sage weil sent out a couple of weeks
> > ago
> > 13:24 < hch> bwh: and report the ext4 fsync issues to the list, I know ext4
> > fsync isn't stellar, but the numbers sounds so bad that there must
> > be a bug somewhere
>
> > The patch referred to is in
> > <http://thread.gmane.org/gmane.linux.file-systems/44628>.
>
> Unfortunately that patch does not seem very appealing, it's Linux only,
> not even in mainline, and it would need for dpkg to track on which file
> system each file is located and issue such ioctl once per file system.

You don't need to tell me this.

> I'd rather not complicate the dpkg source code even more for something
> that seems to me to be a bug or misfeature in the file system. More so
> when there's a clear fix (nodelalloc) that solves both the performance
> and data safety issues in general.

But that 'clear fix' is bad for performance in general, as delayed
allocation reduces fragmentation. Please talk to upstream about the bad
fsync() performance.

Ben.

--
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.

Guillem Jover 11-27-2010 06:03 PM

On Sat, 2010-11-27 at 15:01:15 +0000, Ben Hutchings wrote:
> On Sat, 2010-11-27 at 07:59 +0100, Guillem Jover wrote:
> > On Fri, 2010-11-26 at 13:31:20 +0000, Ben Hutchings wrote:
> > > Just got this from Christoph Hellwig:
> > >
> > > 13:23 < hch> bwh: if you guys are interested in helping dpkg review and ack the
> > > per-fs sync ioctl path that sage weil sent out a couple of weeks
> > > ago
> > > 13:24 < hch> bwh: and report the ext4 fsync issues to the list, I know ext4
> > > fsync isn't stellar, but the numbers sounds so bad that there must
> > > be a bug somewhere
> >
> > > The patch referred to is in
> > > <http://thread.gmane.org/gmane.linux.file-systems/44628>.
> >
> > Unfortunately that patch does not seem very appealing, it's Linux only,
> > not even in mainline, and it would need for dpkg to track on which file
> > system each file is located and issue such ioctl once per file system.
>
> You don't need to tell me this.

Well, you posted what seemed a proposal for a possible solution (even if
forwarded from someone else), and I gave the reasons why I'd rather not
use it. It was not meant as an insult or implying you don't know those
things. I'd think that was better than just a "No" or no reply? (But then
maybe I'm just misreading your reply?)

> Please talk to upstream about the bad fsync() performance.

Upstream was notified already about that some time ago
<https://bugzilla.kernel.org/show_bug.cgi?id=15910>.

thanks,
guillem


--
Archive: http://lists.debian.org/20101127190320.GA25206@gaara.hadrons.org

Goswin von Brederlow 11-28-2010 10:26 PM

Guillem Jover <guillem@debian.org> writes:

> On Sat, 2010-11-27 at 01:41:19 -0600, Jonathan Nieder wrote:
>> Guillem Jover wrote:
>> > Unfortunately that patch does not seem very appealing, it's Linux only,
>> > not even in mainline, and it would need for dpkg to track on which file
>> > system each file is located and issue such ioctl once per file system.

What if you issue one ioctl per file? Won't the duplicates just return
provided there is nothing else writing fresh data to the FS?

>> > I'd rather not complicate the dpkg source code even more for something
>> > that seems to me to be a bug or missfeature in the file system. More so
>> > when there's a clear fix (nodelalloc) that solves both the performance
>> > and data safety issues in general.
>>
>> I don't really understand this point of view: isn't the fsync storm
>> going to cause seeky I/O on just about all file systems?
>
> Well sure it might, but then some seem to be able to cope just fine, even
> ext4 with nodelalloc. Also seeks might stop being that relevant (in the
> mid/long term) once SSDs become more widespread.
>
>> So the POSIX primitives are not rich enough to express what we want to
>> happen. Delayed allocation is pretty much essential for the use case
>> ubifs targets, so it doesn't make much sense to me to pretend it
>> doesn't exist.
>
> As long as delayed allocation is a synonym for zero-length files, then
> I personally consider it a misfeature. This is data loss we are talking
> about, and while data coming from packages is easily recoverable
> although cumbersome, user data might not be. We got fsck, journals and
> similar to recover from system crashes, and now we get zero-length
> files in the name of performance, it seems clear to me that's a
> regression.

What if you use data journaling? Shouldn't that replay the data after a
crash and thus not suffer from 0 byte files? Or does delalloc prevent
the data from being written to the journal until it allocates a block
for it?

> Anyway my thinking process goes a bit like this: There's currently a
> handful of programs doing the complete write+fsync+rename dance, with
> the file systems which need it penalizing them heavily. If more programs
> start to get "fixed" to do the fsyncs then the situation overall will just
> worsen. And then at that point I think it's completely unreasonable
> to expect every userland program to add such complexity and unportable
> hack over hack to work around the file system problems.

Usually one does this on ONE file and everything is fine.

The problem only arises because dpkg is doing this on a million files
and if I understood the problem correctly in ext4 each one of them
causes a lengthy data + metadata + super sync again and again.

I think one long term solution to this might be to invent an async
fsync() call. A way to tell the FS that the file should be synced
soonest and report back when it is done. This should make the FS collect
multiple files into a single sync. One possible way to implement this
would be to mmap each file and msync() it with MS_ASYNC. But as that
doesn't cover the metadata part I'm not too sure it would completely solve
the bottleneck.

Anyone with ext4 feel up to implementing this in dpkg and measuring it?

> For non-technical users, data safety should be way more important than
> performance, having to recover a hosed system might mean they'd just
> reinstall it. For technical users I see the options as follows: help
> fix the file system to perform reasonably with fsync() or not lose
> data w/o fsync(), use another file system, use other better mount
> options, use dpkg --force-unsafe-io and cope with data loss.
>
> But then I think I've said most of this elsewhere already.
>
>> I'll look into a (Linux-specific, obviously) patch to add a function
>> that takes an array of paths and performs the relevant syncs of
>> filesystems where that ioctl exists tomorrow. I would rather see a
>> system call that just takes an array of paths, since I imagine
>> filesystems like btrfs could do something good with that, but since
>> there are no VFS primitives for it I can see why that wasn't proposed.
>
> Tracking fds is going to be easier, at that point dpkg already has
> the stat information, so it could queue an fd per unique st_dev for
> example.

That sounds like a good plan. How hard would it be to implement this
based on FDs instead of path? Would the ioctl patch need changes to work
on an FD instead of path? (sorry, haven't read the patch)

> regards,
> guillem

MfG
Goswin


--
Archive: http://lists.debian.org/87pqtp58tu.fsf@frosties.localnet

Jonathan Nieder 11-28-2010 10:33 PM

Goswin von Brederlow wrote:
> Guillem Jover <guillem@debian.org> writes:

>> Tracking fds is going to be easier, at that point dpkg already has
>> the stat information, so it could queue an fd per unique st_dev for
>> example.
>
> That sounds like a good plan. How hard would it be to implement this
> based on FDs instead of path? Would the ioctl patch need changes to work
> on an FD instead of path? (sorry, haven't read the patch)

It uses fds.

BTW review of the dpkg and kernel patches would be quite welcome.
Both are simple, or so I hope. I would like to send another version
of the dpkg patch with sync-per-filesystem enabled unconditionally
when it works on Linux (thus no string changes) and
fsync_or_sync_filesystem implemented in a separate file; if there are
other updates to include at the same time, that would be nice.

Regards,
Jonathan


--
Archive: http://lists.debian.org/20101128233340.GA25620@burratino

