Joey Hess wrote:
> Hideki Yamane wrote:
> > I tested as well, and sometimes decompression with xz is so sloooow,
> > it takes 6-8 times longer than the default gz.
>
> I was just watching your DebConf presentation "Lets shrink Debian
> package archive" and I think there you said decompression with xz was
> between 2x and 6x slower. Is that the current number?
>
> I'm concerned with the thought that installation of Debian (as well
> as debootstrap) could take twice or more as long if xz were used for
> say, every package on a Gnome desktop CD. In d-i we try to make
> installation faster; slow installs make people less happy. It would
> be useful to have some real-world installation time benchmarks with
> and without xz.
Does unpacking really take a substantial portion of the time used by the
installer? In my experience the installer takes a LOT longer than it
would take to unzip a CD's worth of data.
Most of the time taken by cdebootstrap is wasted by dpkg on doing
useless file syncs:
cdebootstrap --arch=amd64 unstable debian-tree/
from local package cache on ext4: 138 seconds
on tmpfs where dpkg can't waste time on useless syncs: 21 seconds (and a
significant portion of that is used by package scripts with "sleep 1")
So at least in this case the biggest performance problem by far is the
inappropriate use of fsync() or other disk synchronization primitives,
and CPU use for unpacking is pretty much irrelevant.
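To make the cost of forced syncs concrete, here is a minimal Python sketch (not dpkg's code) that writes a batch of small files with and without an fsync() per file. The function name `write_files`, the file count and the file size are assumptions made for this illustration. Absolute timings depend heavily on the filesystem and storage, and on tmpfs the two cases should come out nearly identical.

```python
import os
import tempfile
import time


def write_files(directory, count=100, size=4096, sync=False):
    """Write `count` files of `size` bytes; optionally fsync each one.

    Returns the elapsed wall-clock time in seconds.
    """
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"file{i:04d}")
        with open(path, "wb") as f:
            f.write(payload)
            if sync:
                f.flush()             # push Python's buffer to the kernel
                os.fsync(f.fileno())  # block until the kernel reports the data written
    return time.perf_counter() - start


if __name__ == "__main__":
    # The temporary directory may itself live on tmpfs; point it at a
    # disk-backed path to see the difference discussed here.
    for sync in (False, True):
        with tempfile.TemporaryDirectory() as d:
            print(f"sync={sync}: {write_files(d, sync=sync):.3f} s")
```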
--
To UNSUBSCRIBE, email to debian-devel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/1342895150.3820.11.camel@glyph.nonexistent.invalid
07-21-2012, 07:05 PM
"brian m. carlson"
CD sizes again (and BoF reminder!)
On Sat, Jul 21, 2012 at 09:25:50PM +0300, Uoti Urpala wrote:
> Most of the time taken by cdebootstrap is wasted by dpkg on doing
> useless file syncs:
>
> cdebootstrap --arch=amd64 unstable debian-tree/
>
> from local package cache on ext4: 138 seconds
>
> on tmpfs where dpkg can't waste time on useless syncs: 21 seconds (and a
> significant portion of that is used by package scripts with "sleep 1")
>
> So at least in this case the biggest performance problem by far is the
> inappropriate use of fsync() or other disk synchronization primitives,
> and CPU use for unpacking is pretty much irrelevant.
My understanding is that dpkg uses fsync properly; that is, to guarantee
the data is on the disk before exiting or doing things that require that
data to be present. I don't currently see any bugs on dpkg that
indicate that it is calling fsync needlessly or wastefully. If you see
that behavior, could you please file a bug on dpkg?
--
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
07-21-2012, 10:18 PM
Uoti Urpala
CD sizes again (and BoF reminder!)
brian m. carlson wrote:
> On Sat, Jul 21, 2012 at 09:25:50PM +0300, Uoti Urpala wrote:
> > So at least in this case the biggest performance problem by far is the
> > inappropriate use of fsync() or other disk synchronization primitives,
> > and CPU use for unpacking is pretty much irrelevant.
>
> My understanding is that dpkg uses fsync properly; that is, to guarantee
> the data is on the disk before exiting or doing things that require that
> data to be present. I don't currently see any bugs on dpkg that
There are no things that would "require that data to be present" on disk
in the middle of a dpkg invocation. There may be write ordering
requirements if you want to guarantee some type of consistency after a
crash, and there are AFAIK still no good functions to express such
requirements in Linux (though I haven't checked recently); but it's
important to understand the difference - the "data is on physical disk"
semantics of fsync() are NEVER what you want within a dpkg run.
Whatever the semantics are when using dpkg on a running system, all
attempts to ensure on-disk consistency are wrong for installer and
cdebootstrap use. If the machine crashes in the middle of installation
or debootstrap directory creation, I'm not going to attempt continuing
the operation from where it stopped, so on-disk consistency of the
partial installation is worthless. And as the timings show, you can
speed up installation by more than a factor of 5 by skipping the useless
disk waits (I verified that's still true even if you add moving the
directory to a persistent filesystem and then running sync after the
tmpfs installation).
--
Archive: http://lists.debian.org/1342909103.3820.43.camel@glyph.nonexistent.invalid
07-21-2012, 10:42 PM
Simon Paillard
CD sizes again (and BoF reminder!)
Hi,
On Sat, Jul 21, 2012 at 09:25:50PM +0300, Uoti Urpala wrote:
> Joey Hess wrote:
> > Hideki Yamane wrote:
> > > I tested as well, and sometimes decompression with xz is so sloooow,
> > > it takes 6-8 times longer than the default gz.
> >
> > I was just watching your DebConf presentation "Lets shrink Debian
> > package archive" and I think there you said decompression with xz was
> > between 2x and 6x slower. Is that the current number?
> >
> > I'm concerned with the thought that installation of Debian (as well
> > as debootstrap) could take twice or more as long if xz were used for
> > say, every package on a Gnome desktop CD. In d-i we try to make
> > installation faster; slow installs make people less happy. It would
> > be useful to have some real-world installation time benchmarks with
> > and without xz.
>
> Does unpacking really take a substantial portion of the time used by the
> installer?
[..]
> So at least in this case the biggest performance problem by far is the
> inappropriate use of fsync() or other disk synchronization primitives,
> and CPU use for unpacking is pretty much irrelevant.
Though the kernel will have to sync sooner or later, I understand
debian-installer asks dpkg not to fsync:
http://bugs.debian.org/605384
base-installer (1.121) unstable; urgency=low
.
  [ Colin Watson ]
  * Merge from Ubuntu:
    - Run dpkg with --force-unsafe-io during installation; syncing is
      unnecessary in this context and can slow things down quite a bit
      (closes: #605384).
--
Simon Paillard
--
Archive: http://lists.debian.org/20120721224210.GJ5645@glenfiddich.mraw.org
07-21-2012, 11:13 PM
Joey Hess
CD sizes again (and BoF reminder!)
Mike Hommey wrote:
> Note that slower decompression doesn't necessarily mean longer
> installation time. I/O is still more time consuming than CPU.
Which is why I asked for actual, real-world benchmarks...
--
see shy jo
07-21-2012, 11:58 PM
Adam Borowski
CD sizes again (and BoF reminder!)
On Sat, Jul 21, 2012 at 11:42:03AM -0400, Joey Hess wrote:
> Hideki Yamane wrote:
> > On Sun, 8 Jul 2012 17:58:16 +0200
> > Adam Borowski <kilobyte@angband.pl> wrote:
> > > • xz -6 (the default) is a lot slower when compressing, fast when
> > > decompressing, needs only 10MB memory, 58% size
> > > • xz -9 has very slow compression, takes gobs of memory, 56% size
> > > (Obviously, the "size" numbers are dragged down by uncompressible files
> > > when you look at the whole archive.)
> >
> > I tested as well, and sometimes decompression with xz is so sloooow,
> > it takes 6-8 times longer than the default gz.
>
> I was just watching your DebConf presentation "Lets shrink Debian
> package archive" and I think there you said decompression with xz was
> between 2x and 6x slower. Is that the current number?
Here are the numbers for decompression alone (rather than, say, a
debootstrap run). User times only, best of three tries.
There are two interesting pieces here:
* higher compression settings tend to improve speed (I suspected the
opposite...)
* xz seems to have a special case for incompressible data
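For anyone wanting to reproduce this kind of measurement, a rough sketch using Python's standard zlib and lzma modules follows. It is not the benchmark behind the numbers discussed here: the sample data, the choice of presets and the helper names `best_time` and `compare` are assumptions, and it measures process CPU time (user plus system) rather than user time alone.

```python
import lzma
import os
import time
import zlib


def best_time(func, arg, tries=3):
    """Return the best CPU time over `tries` calls of func(arg)."""
    best = float("inf")
    for _ in range(tries):
        start = time.process_time()
        func(arg)
        best = min(best, time.process_time() - start)
    return best


def compare(data, presets=(1, 6, 9)):
    """Compress `data` with deflate and several xz presets, then time decompression.

    Returns a list of (label, compressed_size, seconds) tuples.
    """
    results = []
    gz = zlib.compress(data, 9)  # deflate level 9 stands in for gzip here
    results.append(("deflate -9", len(gz), best_time(zlib.decompress, gz)))
    for preset in presets:
        xz = lzma.compress(data, preset=preset)
        results.append((f"xz -{preset}", len(xz), best_time(lzma.decompress, xz)))
    return results


if __name__ == "__main__":
    # Made-up sample: a compressible part followed by an incompressible part.
    sample = b"Debian package payload " * 50000 + os.urandom(1 << 20)
    for label, size, seconds in compare(sample):
        print(f"{label:12s} {size:9d} bytes  {seconds:.4f} s")
```

Real package contents will behave differently from this synthetic sample, so the output only indicates how such a comparison can be set up.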
> It would be useful to have some real-world installation time benchmarks
> with and without xz.
It is said that decompression is a small part of install time; I did not
test that.
>
> BTW, when we switched to building udebs with xz, Philipp Kern benchmarked
> it using little or no additional CPU to decompress xz produced with
> -Zxz -z1 -Sextreme http://lists.debian.org/debian-boot/2011/10/msg00247.html
Per the above, you'd want a higher setting than -1. With the default (-6),
you need just 10MB memory to decompress.
--
Copyright and patents were never about promoting culture and innovations;
from the very start they were legalized bribes to give the king some income
and to let businesses get rid of competition. For some history, please read
https://en.wikipedia.org/wiki/Statute_of_Monopolies_1623
07-22-2012, 04:58 PM
Uoti Urpala
CD sizes again (and BoF reminder!)
Simon Paillard wrote:
> On Sat, Jul 21, 2012 at 09:25:50PM +0300, Uoti Urpala wrote:
> > So at least in this case the biggest performance problem by far is the
> > inappropriate use of fsync() or other disk synchronization primitives,
> > and CPU use for unpacking is pretty much irrelevant.
>
> Though the kernel will have to sync sooner or later
The normal background writes to disk don't affect performance all that
much. The problem is sync operations that force disk waits before
continuing with the install. Copying the debootstrap directory from
tmpfs to disk after installation took about 6 seconds, whereas doing the
syncs between installation steps added about two minutes to the install
time.
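The arithmetic behind these figures can be checked with a few lines of Python. This is only a worked example using the timings quoted in this thread; treating the roughly 6-second copy as the whole cost of getting the tree onto disk is an assumption.

```python
# Timings quoted in this thread (seconds).
ext4_seconds = 138    # cdebootstrap onto ext4
tmpfs_seconds = 21    # same run onto tmpfs
copy_seconds = 6      # approximate copy of the finished tree from tmpfs to disk

# Time that disappears when dpkg's syncs cannot block (about two minutes).
sync_overhead = ext4_seconds - tmpfs_seconds

# Speedup when the tmpfs run is followed by the copy to disk.
speedup = ext4_seconds / (tmpfs_seconds + copy_seconds)

print(f"time attributable to syncs: {sync_overhead} s")   # 117 s
print(f"speedup including the copy: {speedup:.2f}x")      # 5.11x
```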
> , I understand debian-installer asks dpkg not to fsync:
> - Run dpkg with --force-unsafe-io during installation; syncing is
This only affects one particular instance of syncing (which I think may
be useless anyway on normal ext4 after write+rename reliability was
improved in kernel commit 7d8f9f7d150dded7b68e61ca6403a1f166fb4edf). It
does not disable ALL disk sync operations in dpkg, like
installer/debootstrap use should.
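As a rough illustration of the write+rename pattern at issue (a sketch, not dpkg's actual code): a file is written under a temporary name, optionally fsync()ed, and then renamed over the target. As I understand dpkg's documentation, --force-unsafe-io skips the sync before the rename. The function name `replace_file`, the `sync` parameter and the temporary suffix are hypothetical names chosen for this example.

```python
import os
import tempfile


def replace_file(path, data, sync=True):
    """Replace `path` with `data` via a temporary file and a rename.

    With sync=True the new contents are fsync()ed before the rename, so the
    final name should never point at a partially written file after a crash.
    With sync=False the wait is skipped and ordering is left to the filesystem.
    """
    tmp = path + ".new-sketch"   # hypothetical suffix, chosen for this example
    with open(tmp, "wb") as f:
        f.write(data)
        if sync:
            f.flush()             # push Python's buffer to the kernel
            os.fsync(f.fileno())  # wait for the data to reach the disk
    os.replace(tmp, path)         # atomic rename on POSIX filesystems


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "example.conf")
        replace_file(target, b"old contents\n", sync=True)
        replace_file(target, b"new contents\n", sync=False)
        with open(target, "rb") as f:
            print(f.read())
```

Both calls leave the same file contents behind; the only difference is whether the process waits for the disk before renaming.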
I tested installing and purging libqt4-dev and some dependencies on ext4
(total 17 packages).
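With just force-unsafe-io in dpkg config:
aptitude install libqt4-dev: 16 seconds
aptitude --purge-unused purge libqt4-dev: 14 seconds
eatmydata aptitude install libqt4-dev: 4-5 seconds
eatmydata aptitude --purge-unused purge libqt4-dev: 4-5 seconds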
So unless this is fixed in dpkg, the installer might want to use
eatmydata...
BTW eatmydata doesn't seem to work with cdebootstrap. I guess it uses
chroot or something in a way which breaks that.
--
Archive: http://lists.debian.org/1342976322.3820.68.camel@glyph.nonexistent.invalid
07-22-2012, 08:29 PM
Philipp Kern
CD sizes again (and BoF reminder!)
On Sun, Jul 22, 2012 at 01:58:59AM +0200, Adam Borowski wrote:
> > BTW, when we switched to building udebs with xz, Philipp Kern benchmarked
> > it using little or no additional CPU to decompress xz produced with
> > -Zxz -z1 -Sextreme http://lists.debian.org/debian-boot/2011/10/msg00247.html
> Per the above, you'd want a higher setting than -1. With the default (-6),
> you need just 10MB memory to decompress.
The main point I looked at was memory usage compared to gzip and I wanted to
pick something for d-i that does not increase the RAM requirements. That's why
I went for 1e. The case for udebs is probably not comparable to plain debs:
with udebs, more intense compression options do not gain significant space
savings.
Obviously higher compression ratios require more RAM to decompress for quick
lookups but need to touch fewer input bytes, so I don't find it that
surprising that decompression gets faster. (If you collapse
larger chunks to a few bits on compression, you'll need fewer dictionary
lookups.) And going from 30M (-0) to 21M (-9) is significant given the
size of the whole file.
Kind regards
Philipp Kern
07-22-2012, 09:51 PM
Adam Borowski
CD sizes again (and BoF reminder!)
On Sun, Jul 22, 2012 at 07:58:42PM +0300, Uoti Urpala wrote:
> Simon Paillard wrote:
> > , I understand debian-installer asks dpkg not to fsync:
>
> > - Run dpkg with --force-unsafe-io during installation; syncing is
>
> This only affects one particular instance of syncing (which I think may
> be useless anyway on normal ext4 after write+rename reliability was
> improved in kernel commit 7d8f9f7d150dded7b68e61ca6403a1f166fb4edf). It
> does not disable ALL disk sync operations in dpkg, like
> installer/debootstrap use should.
>
> I tested installing and purging libqt4-dev and some dependencies on ext4
> (total 17 packages).
>
> With just force-unsafe-io in dpkg config:
> aptitude install libqt4-dev: 16 seconds
> aptitude --purge-unused purge libqt4-dev: 14 seconds
>
> eatmydata aptitude install libqt4-dev: 4-5 seconds
> eatmydata aptitude --purge-unused purge libqt4-dev: 4-5 seconds
You tested ext4. On btrfs, dpkg is around an order of magnitude slower,
making using it without eatmydata a laughable idea.
And that's on a filesystem whose features include:
* transactions (so all dpkg processing could be done without a single fsync)
* writeable snapshots (if you happen to get a power loss right during an
untransacted dpkg run with eatmydata, all you need is a [re]boot with
subvol=my_last_checkpoint)
Thus, having an option to disable fsync in dpkg without unreliable
LD_PRELOAD tricks would be great.
07-23-2012, 12:30 AM
Russell Coker
CD sizes again (and BoF reminder!)
On Mon, 23 Jul 2012, Adam Borowski <kilobyte@angband.pl> wrote:
> You tested ext4. On btrfs, dpkg is around an order of magnitude slower,
> making using it without eatmydata a laughable idea.
>
> And that's on a filesystem whose features include:
> * transactions (so all dpkg processing could be done without a single
> fsync)
How would it do that? Presumably we need some dpkg changes to get that
result.
> * writeable snapshots (if you happen to get a power loss right
> during an untransacted dpkg run with eatmydata, all you need is a [re]boot
> with subvol=my_last_checkpoint)
How would we do that? Make a snapshot and modify the boot loader
configuration before the installation and then fix the boot loader afterwards?
Also if you have a separate filesystem for /usr or if a postinst script does
something to /var on a separate filesystem then using a snapshot might not get
the result you desire. Of course you could snapshot other filesystems (as
long as /var doesn't happen to contain a live database or something else you
don't want to lose).
I don't think we can easily solve these problems automatically unless we can
put some BTRFS specific code in dpkg.
I agree that it would be good to have a configuration option to have dpkg not
call sync; the data integrity of a system is the responsibility of the
sysadmin and we should respect their choices.
As an aside, I haven't had a serious problem with BTRFS root yet. I've got
one system that's been running wheezy for a couple of weeks and there haven't
been any updates that have been big enough to be a problem.
--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/
--
Archive: http://lists.debian.org/201207231030.48469.russell@coker.com.au