11-17-2010, 04:24 AM
Guillem Jover

Pre-approval request for dpkg sync() changes for squeeze

Hi!

On Mon, 2010-11-15 at 19:31:00 +0100, Philipp Kern wrote:
> On Mon, Nov 15, 2010 at 09:58:47AM +0100, Sven Joachim wrote:
> > All this is with a standard squeeze kernel on an otherwise idle system.
> > It should be noted that with lots of other disk activity such as writing
> > to USB disks, the figures in dpkg 1.15.8.5 can become much worse and
> > dpkg might even stall because of the many sync() calls:
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=595927.
> >
> > As far as ext4 is concerned, switching back to fsync() seems to be
> > acceptable only if the filesystem is mounted with the nodelalloc
> > option. Maybe the installer should set this up.
>
> and I don't suppose we could make that the default?

That would be the sanest thing to do IMO; otherwise users might lose
data in general. Failing that, d-i could probably set the flag on
newly created file systems. And as a last resort, an entry in the
release notes warning users of the perils of using ext4 with default
options (in addition to the really bad performance of applications
that use fsync()) would be nice.

> Is there anything else the dpkg developers can try to be portable
> and still not be sacrificing performance?

It's not so much about portability as about the side effects that
using sync() entails, as can be seen in the bug report Sven pointed
to, and which I consider unacceptable for an application to cause.
The portability issues are just a symptom of the wrongness of using
sync() as a substitute for fsync().

regards,
guillem


--
To UNSUBSCRIBE, email to debian-dpkg-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/20101117052408.GB20745@gaara.hadrons.org
 
11-21-2010, 03:11 AM
Ben Hutchings

Pre-approval request for dpkg sync() changes for squeeze

On Mon, 2010-11-15 at 19:31 +0100, Philipp Kern wrote:
> Dear kernel team,
>
> On Mon, Nov 15, 2010 at 09:58:47AM +0100, Sven Joachim wrote:
> > > I'm sorry, I won't have the time to do new benchmarks on this.
> > >
> > > The only benchmarks we have have been made by Sven Joachim:
> > > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=578635#20
> > > (asyncsync is the switch to sync() instead of fsync() so the opposite of
> > > the above patch)
> > >
> > > He mentioned that without the sync() trick it takes 3 to 5 times longer
> > > to unpack a package.
> >
> > Even longer actually, see the figures below.
> >
> > > Sven, would you have time to provide some of the stats asked by the
> > > release team?
> >
> > I can only test ext4, here are some samples of dpkg unpacking a large
> > package (dpkg --unpack --no-triggers emacs23-common_23.2+1-5.1_all.deb),
> > leaving out user and sys times since those do not vary much (~ 0.5
> > seconds in every case):
> >
> > dpkg version   Cache   mount options   unpack time
> > ------------   -----   -------------   -----------
> > 1.15.8.5       cold    defaults          7.803s
> > 1.15.8.5       warm    defaults          5.283s
> > 1.15.8.5       cold    nodelalloc        7.608s
> > 1.15.8.5       warm    nodelalloc        3.783s
> > 1.15.7         cold    defaults         40.429s
> > 1.15.7         warm    defaults         37.848s
> > 1.15.7         cold    nodelalloc        7.945s
> > 1.15.7         warm    nodelalloc        3.524s
> >
> > All this is with a standard squeeze kernel on an otherwise idle system.
> > It should be noted that with lots of other disk activity such as writing
> > to USB disks, the figures in dpkg 1.15.8.5 can become much worse and
> > dpkg might even stall because of the many sync() calls:
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=595927.
> >
> > As far as ext4 is concerned, switching back to fsync() seems to be
> > acceptable only if the filesystem is mounted with the nodelalloc
> > option. Maybe the installer should set this up.
>
> and I don't suppose we could make that the default? Is there anything
> else the dpkg developers can try to be portable and still not be
> sacrificing performance?

I'm coming to this late. It sounds like dpkg has changed its behaviour
several times recently. Please can you summarise dpkg's current and
proposed use of fsync() vs sync(), and the reasons for this.

Also do I understand correctly that fsync() is more expensive when ext4
delayed allocation is in use?
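
For reference, the quoted figures already bear on that question. The
snippet below is only an arithmetic sketch over Sven's table: it divides
each 1.15.7 time by the matching 1.15.8.5 time. Reading 1.15.7 as the
fsync()-based version and 1.15.8.5 as the sync()-based one follows the
rest of this thread; the names "times" and "slowdown" are made up for
illustration.

```python
# Unpack times in seconds, copied from the quoted benchmark table.
times = {
    ("1.15.8.5", "cold", "defaults"): 7.803,
    ("1.15.8.5", "warm", "defaults"): 5.283,
    ("1.15.8.5", "cold", "nodelalloc"): 7.608,
    ("1.15.8.5", "warm", "nodelalloc"): 3.783,
    ("1.15.7", "cold", "defaults"): 40.429,
    ("1.15.7", "warm", "defaults"): 37.848,
    ("1.15.7", "cold", "nodelalloc"): 7.945,
    ("1.15.7", "warm", "nodelalloc"): 3.524,
}


def slowdown(cache, mount):
    """Ratio of the 1.15.7 time to the 1.15.8.5 time for one table row."""
    return times[("1.15.7", cache, mount)] / times[("1.15.8.5", cache, mount)]


for cache in ("cold", "warm"):
    for mount in ("defaults", "nodelalloc"):
        print(f"{cache:4} {mount:10} {slowdown(cache, mount):.2f}x")
```

With default mount options the ratio comes out at roughly five to seven,
while with nodelalloc the two versions are close to each other.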

Ben.

--
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
 
11-21-2010, 04:00 AM
Jonathan Nieder

Pre-approval request for dpkg sync() changes for squeeze

Hi Ben,

Ben Hutchings wrote:
> On Mon, 2010-11-15 at 19:31 +0100, Philipp Kern wrote:

>> and I don't suppose we could make that the default? Is there anything
>> else the dpkg developers can try to be portable and still not be
>> sacrificing performance?
>
> I'm coming to this late. It sounds like dpkg has changed its behaviour
> several times recently. Please can you summarise dpkg's current and
> proposed use of fsync() vs sync(), and the reasons for this.
>
> Also do I understand correctly that fsync() is more expensive when ext4
> delayed allocation is in use?

Here's a try, based on "git log --grep=sync".

Some reports[1] indicated that dpkg was truncating files to zero
length on ext4 (and ubifs) filesystems with delayed allocation
enabled. This happened whenever a system crash occurred during or
shortly after an upgrade, which is not really acceptable, especially
considering that upgrades are a time when a person is likely to be
trying things out that might crash the system.

I. So some patches were written and applied to fsync() each new file
as it is written (before the rename()). These patches are part of
dpkg 1.15.6.

The result was very slow[2], especially on ext4 but also on ext3.
Colin Watson noticed that a sync() is a lot faster. Unfortunately
the synchronous behaviour of sync() is not portable (e.g., on BSD it
returns right away, before files have been committed to disk), so the
attempted fix was

II. Write all .dpkg-tmp files. fsync storm to make sure all
.dpkg-tmp files have well defined content, then rename
storm to put them in place.

This appeared to improve performance quite a bit, but that was a
bug[3]. After fixing that bug, the slowdown was still present[4] (as
Mike Hommey had predicted[5]), at least on ext4. There seems to be a
per-fsync cost.
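
To make the difference between I and II concrete, here is a minimal
sketch in Python rather than dpkg's C. It is not dpkg's actual code: the
function names and the dict-of-paths input are assumptions made for
illustration, and only the ordering of write, fsync() and rename() is
taken from the description above.

```python
import os


def unpack_per_file(files):
    """Approach I: write, fsync() and rename() one file at a time."""
    for path, data in files.items():
        tmp = path + ".dpkg-tmp"
        with open(tmp, "wb") as f:
            f.write(data)
            f.flush()               # push Python's buffer to the kernel
            os.fsync(f.fileno())    # data reaches disk before the rename
        os.rename(tmp, path)


def unpack_batched(files):
    """Approach II: write everything, then an fsync storm, then a rename storm."""
    pending = []
    for path, data in files.items():
        tmp = path + ".dpkg-tmp"
        with open(tmp, "wb") as f:
            f.write(data)
        pending.append((tmp, path))

    # fsync storm: give every temporary file well-defined content on disk.
    for tmp, _ in pending:
        fd = os.open(tmp, os.O_WRONLY)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)

    # rename storm: only now put the files in place.
    for tmp, path in pending:
        os.rename(tmp, path)
```

Both variants still issue one fsync() per file, which matches the
observation that batching alone did not remove the cost.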

So that leaves us with sync():

III. Write all .dpkg-tmp files. sync(). Rename storm to put
the files in place.

which is quite fast, really: its cost roughly cancels out the gain from
the new optimization of using FIEMAP to read /var/lib/dpkg/*, leaving
dpkg at about the same speed as in lenny.

Unfortunately, in addition to not being portably synchronous, sync()
does not have the right semantics. In particular, when building with
pbuilder on tmpfs, sync() syncs _all_ filesystems, including whatever
slow thumb drive happens to be mounted at the same time.
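
A sketch of III in the same spirit, again in Python and not dpkg's real
code (the function name and input shape are assumptions). The single
os.sync() call wraps sync(2), which takes no argument and therefore
cannot be limited to the filesystem being unpacked to.

```python
import os


def unpack_with_sync(files):
    """Approach III: write everything, one global sync(), then a rename storm."""
    pending = []
    for path, data in files.items():
        tmp = path + ".dpkg-tmp"
        with open(tmp, "wb") as f:
            f.write(data)
        pending.append((tmp, path))

    # One call for all files, but it flushes every mounted filesystem,
    # including unrelated slow devices.
    os.sync()

    for tmp, path in pending:
        os.rename(tmp, path)
```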

Hope that helps,
Jonathan

[1] See https://bugzilla.kernel.org/show_bug.cgi?id=15910 for example.
[2] http://lists.debian.org/debian-dpkg/2010/03/threads.html#00029
[3] http://bugs.debian.org/577756
[4] http://bugs.debian.org/578635
[5] http://lists.debian.org/debian-dpkg/2010/03/msg00036.html


--
Archive: http://lists.debian.org/20101121050023.GB11884@burratino
 
11-21-2010, 07:18 AM
Raphael Hertzog

Pre-approval request for dpkg sync() changes for squeeze

On Sun, 21 Nov 2010, Ben Hutchings wrote:
> I'm coming to this late. It sounds like dpkg has changed its behaviour
> several times recently. Please can you summarise dpkg's current and
> proposed use of fsync() vs sync(), and the reasons for this.

Jonathan made a good summary of the history. I should add that dpkg uses
sync() instead of fsync() only on systems where we know that sync() is
synchronous (i.e. Linux only).

Now we want to stop using sync() because of its bad side effects:
- running dpkg on a tmpfs is slower because sync() also flushes changes
  on unrelated filesystems
- there are reports of dpkg blocking because of the sync(), see
  http://bugs.debian.org/595927 and http://bugs.debian.org/600075

> Also do I understand correctly that fsync() is more expensive when ext4
> delayed allocation is in use?

Apparently, at least for dpkg's usage pattern. But the performance is so
much worse that you have been asked whether it would make sense to change
the defaults on ext4 to include "nodelalloc".

Cheers,
--
Raphaël Hertzog ◈ Debian Developer

Follow my Debian News ▶ http://RaphaelHertzog.com (English)
▶ http://RaphaelHertzog.fr (Français)


--
Archive: http://lists.debian.org/20101121081804.GC11156@rivendell.home.ouaza.com
 
11-21-2010, 09:08 AM
Mike Hommey

Pre-approval request for dpkg sync() changes for squeeze

On Sun, Nov 21, 2010 at 09:18:04AM +0100, Raphael Hertzog wrote:
> On Sun, 21 Nov 2010, Ben Hutchings wrote:
> > I'm coming to this late. It sounds like dpkg has changed its behaviour
> > several times recently. Please can you summarise dpkg's current and
> > proposed use of fsync() vs sync(), and the reasons for this.
>
> Jonathan made a good summary of the history. I should add that dpkg uses
> sync() instead of fsync() only on systems where we know that sync() is
> synchronous (i.e. Linux only).
>
> Now we want to stop using sync() because of the bad side-effects:
> - using on a tmpfs is slower because it syncs changes on unrelated
> filesystems
> - there are those reports of dpkg blocked due to the sync
> see http://bugs.debian.org/595927 http://bugs.debian.org/600075
>
> > Also do I understand correctly that fsync() is more expensive when ext4
> > delayed allocation is in use?
>
> Apparently, at least for dpkg's usage pattern. But the performance are so
> much slower that you have been asked whether it would make sense to change
> the defaults on ext4 to include "nodelalloc".

Something that might be worth trying is using fallocate, which /might/
mitigate the delayed allocation effects.
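
Roughly what that could look like, as an untested idea and in Python
rather than dpkg's C: reserve the blocks up front with posix_fallocate()
so the allocation is no longer deferred, then write and fsync() as
before. The helper name is hypothetical, and whether this actually helps
on ext4 is exactly the open question.

```python
import os


def write_preallocated(path, data):
    """Write data to path, reserving its blocks before the first write."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # posix_fallocate() rejects a zero length, and is not available
        # on every platform or supported by every filesystem.
        if data and hasattr(os, "posix_fallocate"):
            try:
                os.posix_fallocate(fd, 0, len(data))
            except OSError:
                pass  # fall back to a plain write without preallocation

        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]

        os.fsync(fd)
    finally:
        os.close(fd)
```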

Mike


--
Archive: http://lists.debian.org/20101121100802.GA3504@glandium.org
 