Hideki Yamane 08-28-2012 03:10 AM

use xz compression for Debian package by default
 
Hi,

At DebConf12, I talked about xz compression for Debian packages(*).
Now I'd like to talk about the next step: a proposal to use xz, with results
from some experiments.


*) http://penta.debconf.org/dc12_schedule/events/930.en.html

------------------------------------------------------------------------------
test environment (armel)
------------------------------------------------------------------------------

I used a Netgear ReadyNAS Duo v2, an armel machine, for this test.
see http://www.netgear.com/home/products/storage/prosumer/rnd2000.aspx
It is based on Debian Squeeze, so the results should be the same on Debian :)


# uname -a
Linux nas-A0-96-88 2.6.31.8.duov2 #1 Mon May 14 18:35:20 HKT 2012 armv5tel GNU/Linux

# cat /proc/cpuinfo
Processor : Feroceon 88FR131 rev 1 (v5l)
BogoMIPS : 1599.07
Features : swp half thumb fastmult edsp
CPU implementer : 0x56
CPU architecture: 5TE
CPU variant : 0x2
CPU part : 0x131
CPU revision : 1

Hardware : Feroceon-KW
Revision : 0000
Serial : 0000000000000000

# free
             total       used       free     shared    buffers     cached
Mem:        246820     139216     107604          0       2760      57460
-/+ buffers/cache:      78996     167824
Swap:       524268        360     523908


For the test I used the libreoffice-core package (about 35MB), which currently
uses bz2 for package compression, and openclipart-png (old version, about 600MB).


------------------------------------------------------------------------------
results1 (libreoffice-core)
------------------------------------------------------------------------------

Okay? Here we go...

# du -m *
1 control.tar.gz
35 data.tar.bz2
38 data.tar.gz
24 data.tar.xz
1 debian-binary
35 libreoffice-core_3.5.4-7_armel.deb

# time gzip -d data.tar.gz

real 0m7.253s
user 0m4.980s
sys 0m1.070s

# time bzip2 -dfk data.tar.bz2

real 0m45.256s
user 0m42.320s
sys 0m2.000s

# time xz -dfk data.tar.xz

real 0m11.443s
user 0m9.710s
sys 0m1.450s

                                              size    decomp-time
without compression                         : 141MB   -
Default compression (gzip -9)               : 38MB    7.3s
Package option (bzip2 -9)                   : 35MB    45.3s
xz (--arm --check=crc32 --lzma2=dict=64KiB) : 24MB    11.4s
   (--arm --check=crc32 --lzma2=dict=1MiB)  : 22MB    11.0s
   (--arm --lzma2=dict=64KiB)               : 24MB    12.5s
   (--arm --lzma2=dict=1MiB)                : 22MB    12.0s
   (--lzma2=dict=64KiB)                     : 27MB    12.8s
   (--lzma2=dict=1MiB)                      : 25MB    12.3s


------------------------------------------------------------------------------
results2 (openclipart-png, it's arch:all and huge package)
------------------------------------------------------------------------------

# du -m *
(snip)

# time gzip -d data.tar.gz
# time bzip2 -dfk data.tar.bz2
# time xz -dfk data.tar.xz
(snip)

                                        size    decomp-time
without compression                   : 632MB   -
Default compression (gzip -9)         : 607MB   48.7s
bzip2 compression (bzip2 -9)          : 611MB   6m52s
xz (--check=crc32 --lzma2=dict=64KiB) : 604MB   2m09s
   (--check=crc32 --lzma2=dict=1MiB)  : 601MB   2m12s
   (--lzma2=dict=64KiB)               : 604MB   2m12s
   (--lzma2=dict=1MiB)                : 601MB   2m11s


------------------------------------------------------------------------------
results3 (libreoffice-core by amd64 machine)
------------------------------------------------------------------------------

armel vs Intel Core i3 2.90GHz -> decompression is almost 5 times faster than
on armel. The compressed size is about 10% larger.

                               size    decomp-time
xz (--x86 --lzma2=dict=1MiB) : 25MB    2.7s



------------------------------------------------------------------------------
conclusion (half)
------------------------------------------------------------------------------
We should at least use xz compression instead of bzip2. bzip2 is harmful for
compressing Debian packages, so we should drop support for it to make checking
easier.

Using xz is
- smaller than gz and bz2: it can cut about 1/3 of the size
- faster than bz2 and not much slower than gz (on armel, at least):
  about 1.5 times slower than gzip

gzip or xz?
- 1/3 smaller = less download time/traffic and a smaller repository
- 1.5 times slower = more extraction time when the package is installed

-> average download rate = almost 600KB/s
-> download 35MB = 60 sec
            24MB = 40 sec -> diff = 20 sec

   + 4 sec (extraction) - 20 sec (download) = -16 sec (if you use xz)
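
The same back-of-the-envelope estimate can be redone for all three compressors
with a small shell sketch. It uses the sizes and decompression times measured
above and the 600KB/s rate quoted in the post. The function name total_time,
the convention 1MB = 1000KB, and the assumption that download and decompression
do not overlap are illustrative choices, not part of the original message.

```shell
#!/bin/sh
# Rough model: time to fetch a package plus time to decompress its data.tar.
# Assumptions (not from the original post): 1 MB = 1000 KB, and download
# and decompression happen one after the other.

# total_time SIZE_MB DECOMP_SEC [RATE_KBPS] -> seconds, one decimal place
total_time() {
    LC_ALL=C awk -v size="$1" -v decomp="$2" -v rate="${3:-600}" \
        'BEGIN { printf "%.1f\n", size * 1000 / rate + decomp }'
}

# Sizes and times from the libreoffice-core results on armel.
echo "gzip : $(total_time 38 7.3) s"
echo "bzip2: $(total_time 35 45.3) s"
echo "xz   : $(total_time 24 11.4) s"
```

Passing a third argument changes the download rate, which shows how the
trade-off shifts on faster links where decompression time matters more.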


------------------------------------------------------------------------------
conclusion (rest)
------------------------------------------------------------------------------
I recommend using xz ***by default*** (with appropriate options), not only on
i386/amd64 but on ALL architectures. The increase in extraction time is offset
by the decrease in download time, and extraction is only one part of
installation; as Mike Hommey suggested, "I/O is still more time consuming than
CPU", and the worst case is nothing more than higher CPU usage.

We know some packages are better off with gzip, but they are exceptions. xz
is the best choice for the remaining 99.99% of packages. We can handle such
exceptions by specifying gzip for them (e.g. openclipart-png).


*** what's the best compress option for default? ***

low CPU                : --check=crc32               -> -10% time
low memory             : --lzma2=dict=64KiB (or -0)  -> use 100KiB mem
average CPU/memory     : --lzma2=dict=8MiB (= -6 = default)
use arch optimization? : Yes, if we can (*)          -> -10% size


*** how to find appropriate compression rate(1, 6 or 9) for xz? ***

build your package with each option :-)

I've proposed a tiny hack for debhelper: when an environment variable is set,
it builds the package with each compression option - gz, 1, 6, 9, 1e, 6e and 9e.
See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=686048
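
Independently of that debhelper patch, the idea of trying each option can be
sketched directly with xz on an extracted data.tar. This is only an
illustration: compare_presets is a hypothetical helper, it compresses a single
tarball rather than building a package, and it reports sizes only, not timings.

```shell
#!/bin/sh
# Compress one tarball with several xz presets and print the resulting sizes.
# The input file is left untouched; output goes to a temporary file.

# compare_presets FILE -> one line per preset: "<preset> <bytes>"
compare_presets() {
    tmp=$(mktemp) || return 1
    for preset in 1 6 9 1e 6e 9e; do
        xz -c "-$preset" "$1" > "$tmp" || { rm -f "$tmp"; return 1; }
        printf '%s %s\n' "$preset" "$(wc -c < "$tmp" | tr -d ' ')"
    done
    rm -f "$tmp"
}

# Example (run in a directory containing an extracted data.tar):
#   compare_presets data.tar
```

Note that the higher presets need a lot of memory while compressing, so a run
like this is better suited to a build machine than to a small armel box.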


------------------------------------------------------------
*) tiny pseudo code

arch=$(dpkg-architecture -qDEB_HOST_ARCH)

# Pick the xz BCJ filter matching the target architecture, if any.
case "$arch" in
    arm|armel|armhf|aarch64)   on_arch=--arm ;;      # maybe
    powerpc|ppc64|powerpcspe)  on_arch=--powerpc ;;  # maybe
    sparc|sparc64)             on_arch=--sparc ;;    # maybe
    ia64)                      on_arch=--ia64 ;;
    i386|amd64)                on_arch=--x86 ;;
    *)                         on_arch= ;;           # no filter
esac


--
Regards,

Hideki Yamane henrich @ debian.or.jp/org
http://wiki.debian.org/HidekiYamane


--
To UNSUBSCRIBE, email to debian-dpkg-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/20120828121018.9bc106d568e137356e37e8df@debian.org

Bastian Blank 08-28-2012 10:02 AM

use xz compression for Debian package by default
 
On Tue, Aug 28, 2012 at 12:10:18PM +0900, Hideki Yamane wrote:
> We know some packages are better to use gzip, but it's an exception. Using xz
> is best choice for rest 99.99% of packages. We can deal with such exception
> by specifying gzip for that (e.g. openclipart-png).

Or even no compression at all. But this needs to be checked.

> *** what's the best compress option for default? ***
> low CPU : --check=crc32 -> -10% time

Did you test this on a CPU without a 64-bit multiplication unit? Otherwise
the difference should not be visible.

> low memory : --lzma2=dict=64KiB (or -0) -> use 100KiB mem

I think we could go for -2 or -3. You can't run Debian on anything with
less than 32MiB RAM.

> use arch optimization? : Yes, if we can (*) -> -10% size

Not a good idea. The doc clearly states:

| [...] so it generally isn't good to blindly apply a BCJ filter when
| compressing binary packages for distribution.

> *** how to find appropriate compression rate(1, 6 or 9) for xz? ***
> build your package with each option :-)

This is not appropriate. Needing 700MiB to compress is not a good idea.

Bastian

--
Madness has no purpose. Or reason. But it may have a goal.
-- Spock, "The Alternative Factor", stardate 3088.7


Archive: http://lists.debian.org/20120828100259.GB10952@wavehammer.waldi.eu.org

Adam Borowski 08-28-2012 10:05 AM

use xz compression for Debian package by default
 
On Tue, Aug 28, 2012 at 12:10:18PM +0900, Hideki Yamane wrote:
> In DebConf12, I talked about xz compression for Debian packages(*).
> Now I'll talk about next step, suggestion for use xz with with result
> from some experiment.

> ------------------------------------------------------------------------------
> conclusion (rest)
> ------------------------------------------------------------------------------
> I recommend to use xz ***by default*** (with appropriate option) on not only
> i386/amd64 but on ANY architectures. Increasing extract time can be ignore by
> decreasing download time and its only part of installation as Mike Hommey
> suggested "I/O is still more time consuming than CPU", and nothing worse than
> high cpu usage.
>
> We know some packages are better to use gzip, but it's an exception. Using xz
> is best choice for rest 99.99% of packages. We can deal with such exception
> by specifying gzip for that (e.g. openclipart-png).

There's a better compressor here, its name is "cat" (meow!). PNG files are
already deflate-compressed, so gzip can't help (higher settings for an
infinitesimal benefit aside).

XZ is smart enough to detect incompressible files and use "cat" for them,
except for one issue: PNGs are not strictly incompressible, and xz can often
cut another percent or more. This means it will try to compress them, which
wastes time from our point of view. So not using any compression here
can save CPU at negligible cost.
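
To put numbers on "negligible", the saving can be computed from the sizes
reported earlier in the thread. The sketch below is illustrative only:
saving_pct and worth_compressing are hypothetical helpers, and the 10% default
cut-off is an arbitrary value chosen for the example, not something proposed
in this discussion.

```shell
#!/bin/sh
# saving_pct RAW COMPRESSED -> percentage of space saved, one decimal place
saving_pct() {
    LC_ALL=C awk -v raw="$1" -v comp="$2" \
        'BEGIN { printf "%.1f\n", (raw - comp) * 100 / raw }'
}

# worth_compressing RAW COMPRESSED [MIN_PCT]
# Succeeds (exit 0) when the saving reaches MIN_PCT. The 10% default is a
# made-up threshold for illustration, not a value from this thread.
worth_compressing() {
    LC_ALL=C awk -v raw="$1" -v comp="$2" -v min="${3:-10}" \
        'BEGIN { exit !((raw - comp) * 100 / raw >= min) }'
}

saving_pct 632 601   # openclipart-png, xz with dict=1MiB (sizes in MB)
saving_pct 141 22    # libreoffice-core, xz --arm with dict=1MiB
worth_compressing 632 601 || echo "openclipart-png: store uncompressed"
```

With these figures xz saves under 5% on openclipart-png but well over 80% on
libreoffice-core, which is the contrast the "cat" suggestion rests on.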

> *** what's the best compress option for default? ***
> *** how to find appropriate compression rate(1, 6 or 9) for xz? ***

I'd say, let's not go there. The benefits of -9 compression are small and
can break tiny systems (with less than ~100MB of memory) if you're not
careful; -1 produces negligible CPU savings at the cost of often significant
disk space. If the data is incompressible one may want to disable
compression altogether -- but only for packages big enough to bother.

Micromanaging compression levels costs human time, and increases complexity.

> [BCJ filters]
> arch=`dpkg-architecture -qDEB_HOST_ARCH`
>
> if [ arch = arm | armel | armhf | aarch64 ] // maybe
> set on_arch --arm

If this can be applied blindly to non-code files without a noticeable loss,
that could be good if placed in dpkg-dev. If not, we're entering the
micromanaging land again.

--
Copyright and patents were never about promoting culture and innovations;
from the very start they were legalized bribes to give the king some income
and to let businesses get rid of competition. For some history, please read
https://en.wikipedia.org/wiki/Statute_of_Monopolies_1623


Archive: http://lists.debian.org/20120828100525.GA28291@angband.pl

Vincent Lefevre 08-28-2012 10:56 AM

use xz compression for Debian package by default
 
On 2012-08-28 12:05:26 +0200, Adam Borowski wrote:
> On Tue, Aug 28, 2012 at 12:10:18PM +0900, Hideki Yamane wrote:
> > We know some packages are better to use gzip, but it's an
> > exception. Using xz is best choice for rest 99.99% of packages.
> > We can deal with such exception by specifying gzip for that (e.g.
> > openclipart-png).
>
> There's a better compressor here, it's name is "cat" (meow!). PNG files are
> already deflate-compressed, so gzip can't help (higher settings for an
> infinitessimal benefit aside).

Before wondering whether PNG files should have an additional
compression level, is there any reason why a better PNG compression
isn't used in the first place? For instance, "optipng -o9" tries
various parameters and keeps the best one.

--
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Archive: http://lists.debian.org/20120828105647.GE19561@xvii.vinc17.org

Riku Voipio 08-28-2012 01:55 PM

use xz compression for Debian package by default
 
On Tue, Aug 28, 2012 at 12:10:18PM +0900, Hideki Yamane wrote:
> ------------------------------------------------------------------------------
> conclusion (rest)
> ------------------------------------------------------------------------------
> I recommend to use xz ***by default*** (with appropriate option) on not only
> i386/amd64 but on ANY architectures. Increasing extract time can be ignore by
> decreasing download time and its only part of installation as Mike Hommey
> suggested "I/O is still more time consuming than CPU", and nothing worse than
> high cpu usage.

Thanks for your detailed tests.

Wearing my armel buildd maintainer hat, I agree with the conclusion. 1.5x slower
decompression is a small enough hit and, as mentioned, decompression is only a part of
package install time.

It is worth noting that using xz by default will slow down package builds (especially
ones with huge -dbg packages), but we are already working on getting faster armel/armhf
build machines.

Riku


Archive: http://lists.debian.org/20120828135531.GA6393@afflict.kos.to

Adam Borowski 08-28-2012 09:03 PM

use xz compression for Debian package by default
 
On Tue, Aug 28, 2012 at 12:56:47PM +0200, Vincent Lefevre wrote:
> Before wondering whether PNG files should have an additional
> compression level, is there any reason why a better PNG compression
> isn't used in the first place? For instance, "optipng -o9" tries
> various parameters and keeps the best one.

optipng can improve only earlier stages of PNG format (ARGB->paletted, pixel
filters), its deflate implementation is pretty bad. You'd want to use it
together with advpng (package advancecomp) which attacks the deflate stage
better:
optipng -o4 -i0 -fix "$@" && advpng -z4 "$@"

(Optimizations above -o4 affect deflate only, advpng is scared by interlaced
images and files with junk after PNG data.)


As per my tests, other combinations of PNG optimizers give worse results,
and in some cases (PNGOUT, fortunately not in Debian) even destroy images.
It might be possible to compress files even better by tossing away dubious
chunks added by some editors (Adobe stuff is especially notorious), but
that can be argued to be data loss of sorts.

--
Copyright and patents were never about promoting culture and innovations;
from the very start they were legalized bribes to give the king some income
and to let businesses get rid of competition. For some history, please read
https://en.wikipedia.org/wiki/Statute_of_Monopolies_1623


Archive: http://lists.debian.org/20120828210353.GA9830@angband.pl

Guillem Jover 08-29-2012 07:51 AM

use xz compression for Debian package by default
 
Hi!

On Tue, 2012-08-28 at 12:10:18 +0900, Hideki Yamane wrote:
> In DebConf12, I talked about xz compression for Debian packages(*).
> Now I'll talk about next step, suggestion for use xz with with result
> from some experiment.

> ------------------------------------------------------------------------------
> conclusion (half)
> ------------------------------------------------------------------------------
> We should use xz compression instead of bzip2 at least. bzip is harmful for
> compressing debian package, so should drop it from support to check easier.

Removing bzip2 decompression support is not an option; there are packages
in the wild compressed with it that should remain extractable with
dpkg-deb. Removing compression support could be considered, but then I
think right now bzip2 availability is more widespread than xz, so some
might want to use it for just that reason, even if it might be slower
or produce bigger packages.

> ------------------------------------------------------------------------------
> conclusion (rest)
> ------------------------------------------------------------------------------
> I recommend to use xz ***by default*** (with appropriate option) on not only
> i386/amd64 but on ANY architectures. Increasing extract time can be ignore by
> decreasing download time and its only part of installation as Mike Hommey
> suggested "I/O is still more time consuming than CPU", and nothing worse than
> high cpu usage.
>
> We know some packages are better to use gzip, but it's an exception. Using xz
> is best choice for rest 99.99% of packages. We can deal with such exception
> by specifying gzip for that (e.g. openclipart-png).

I thought this was already the consensus, and the only dissenting
opinion was that the base system should still be using gzip so that
foreign non-Debian systems can unpack it w/o requiring to build or
install xz beforehand.

Given the recent flurry of several packaging helpers and packages
switching to use xz, I think for jessie what makes most sense is
to switch all base packages to explicitly compress with gzip and then
switch dpkg-deb to default to xz; which I think would have made more
sense for wheezy too, but it seems too late now, given that non-base
packages have already been switched, because it might imply reverting
stuff and having to modify all base right now.

So if there's still consensus on this by then, I'll be switching
the default dpkg-deb compression to xz, *after* all base has been
switched to gzip. I've already queued a tiny patch for 1.17.x that
allows changing the default dpkg-deb compressor when building dpkg.
I've also some code already which avoids at least two copies of the
data.tar through pipes, which should speed up the unpacking.

> *) tiny pseudo code
>
> arch=`dpkg-architecture -qDEB_HOST_ARCH`
>
> if [ arch = arm | armel | armhf | aarch64 ] // maybe
> set on_arch --arm
> elsif [ arch = powerpc | ppc64 | powerpcspe ] // maybe
> set on_arch --powerpc
> elsif [ arch = sparc | sparc64 ] // maybe
> set on_arch --sparc
> elsif [ arch = ia64 ]
> set on_arch --ia64
> elsif [ arch = i386 | amd64 ]
> set --x86
> fi

I don't think this is a good idea, and I'm not really planning on
making the dpkg-deb compression code conditional on the architecture
of the package being built.

In any case, thanks for the testing and comparisons!

thanks,
guillem


Archive: http://lists.debian.org/20120829075114.GA2876@gaara.hadrons.org

