Mike Frysinger 09-19-2010 11:43 PM

omitting redirecting man pages from compression
 
many man pages exist merely as a redirect to another man page:
$ xzcat /usr/share/man/man1/zcat.1.xz
.so man1/gzip.1

compressing these tiny files (always?) results in a larger file. that means we
aren't saving space, and we're adding overhead at runtime.
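
The size claim can be checked with a few lines of shell. This is only a sketch: gzip stands in for whichever compressor the system is configured to use, and the variable names are illustrative.

```shell
# Compare the size of a one-line redirect page before and after compression.
redirect='.so man1/gzip.1'

# $(( ... )) strips the padding some wc implementations add around the number.
orig_size=$(( $(printf '%s\n' "$redirect" | wc -c) ))
gz_size=$(( $(printf '%s\n' "$redirect" | gzip -c | wc -c) ))

echo "uncompressed: ${orig_size} bytes, gzip: ${gz_size} bytes"
```

The gzip header and trailer alone take 18 bytes, which is already more than the 16 bytes of the redirect line.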

two options which we can do transparently:
- rewrite the .so man pages into symlinks
- omit them from compression

the latter is pretty easy (see below). any preferences on which route to take?
the former shouldn't be too hard either ...

--- a/bin/ebuild-helpers/ecompressdir
+++ b/bin/ebuild-helpers/ecompressdir
@@ -13,6 +13,7 @@ case $1 in
 	--ignore)
 		shift
 		for skip in "$@" ; do
+			skip=${skip#${D}}
 			[[ -d ${D}${skip} || -f ${D}${skip} ]] \
 				&& touch "${D}${skip}.ecompress.skip"
 		done
--- a/bin/ebuild-helpers/prepman
+++ b/bin/ebuild-helpers/prepman
@@ -27,6 +27,10 @@ for subdir in "${mandir}"/man* "${mandir}"/*/man* ; do
 	[[ -d ${subdir} ]] && really_is_mandir=1 && break
 done
 
-[[ ${really_is_mandir} == 1 ]] && exec ecompressdir --queue "${mandir#${D}}"
+if [[ ${really_is_mandir} == 1 ]] ; then
+	ecompressdir --queue "${mandir#${D}}" || exit 1
+	# compressing small files just adds overhead
+	find "${mandir}" -type f '!' -size +100c -print0 | ${XARGS} -0 ecompressdir --ignore
+fi
 
 exit 0

-mike
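
The patch strips the image directory from each path because find prints full paths under ${D}, while --ignore works with paths relative to it. A minimal sketch of that prefix removal follows; the value of D and the helper name strip_image_prefix are made up for illustration (in Portage, D is set by the build environment).

```shell
# Hypothetical image directory, used here only for the demonstration.
D=/var/tmp/portage/image/

strip_image_prefix() {
    # ${1#"$D"} removes $D from the front of $1 if it is there;
    # a path that is already relative is passed through unchanged.
    printf '%s\n' "${1#"$D"}"
}

strip_image_prefix "${D}usr/share/man/man1/zcat.1"
strip_image_prefix "usr/share/man/man1/zcat.1"
```

Both calls print usr/share/man/man1/zcat.1, so the caller may pass either form.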

Zac Medico 09-19-2010 11:50 PM

On 09/19/2010 04:43 PM, Mike Frysinger wrote:
> many man pages exist merely as a redirect to another man page:
> $ xzcat /usr/share/man/man1/zcat.1.xz
> .so man1/gzip.1
>
> compressing these tiny files (always?) results in a larger file. that means we
> aren't saving space, and we're adding overhead at runtime.
>
> two options which we can do transparently:
> - rewrite the .so man pages into symlinks
> - omit them from compression
>
> the latter is pretty easy (see below). any preferences on which route to take?
> the former shouldn't be too hard either ...

It feels like an insignificant optimization to me, but I don't feel
strongly either way.
--
Thanks,
Zac

Mike Frysinger 09-19-2010 11:59 PM

On Sunday, September 19, 2010 19:50:57 Zac Medico wrote:
> On 09/19/2010 04:43 PM, Mike Frysinger wrote:
> > many man pages exist merely as a redirect to another man page:
> > $ xzcat /usr/share/man/man1/zcat.1.xz
> > .so man1/gzip.1
> >
> > compressing these tiny files (always?) results in a larger file. that means we
> > aren't saving space, and we're adding overhead at runtime.
> >
> > two options which we can do transparently:
> > - rewrite the .so man pages into symlinks
> > - omit them from compression
> >
> > the latter is pretty easy (see below). any preferences on which route to
> > take? the former shouldn't be too hard either ...
>
> It feels like an insignificant optimization to me, but I don't feel
> strongly either way.

~19% of the man pages on my system appear to be forwarding files (glorified
symlinks). in my case, that's almost 3000 files. considering things like
`makewhatis` need to decompress & read all of these, i think the difference is
worth addressing.
-mike

Peter Volkov 09-20-2010 05:59 AM

On Sun, 19/09/2010 at 19:43 -0400, Mike Frysinger wrote:
> many man pages exist merely as a redirect to another man page:
> $ xzcat /usr/share/man/man1/zcat.1.xz
> .so man1/gzip.1
>
> compressing these tiny files (always?) results in a larger file. that means we
> aren't saving space, and we're adding overhead at runtime.

Isn't it better to skip compression on all tiny files (not only man
pages)? In such case some other functions will need to be updated too
(e.g. ecompress --suffix)...

--
Peter.

Ulrich Mueller 09-20-2010 06:31 AM

>>>>> On Sun, 19 Sep 2010, Mike Frysinger wrote:

> many man pages exist merely as a redirect to another man page:
> $ xzcat /usr/share/man/man1/zcat.1.xz
> .so man1/gzip.1

> compressing these tiny files (always?) results in a larger file. that means we
> aren't saving space, and we're adding overhead at runtime.

> two options which we can do transparently:
> - rewrite the .so man pages into symlinks
> - omit them from compression

> the latter is pretty easy (see below). any preferences on which
> route to take? the former shouldn't be too hard either ...

With "controllable compression" in EAPI 4, /usr/share/man will no
longer be special in any way. (Currently, the part of prepman that
your patch changes won't even be reached in EAPI 4.)

If we take the second route, then maybe it should be a more general
solution, i.e. exclude all tiny files (man page or not) from
compression?

But I think that rewriting the .so files into symlinks would be
cleaner.

Ulrich

Mike Frysinger 09-20-2010 06:45 AM

On Monday, September 20, 2010 01:59:33 Peter Volkov wrote:
> On Sun, 19/09/2010 at 19:43 -0400, Mike Frysinger wrote:
> > many man pages exist merely as a redirect to another man page:
> > $ xzcat /usr/share/man/man1/zcat.1.xz
> > .so man1/gzip.1
> >
> > compressing these tiny files (always?) results in a larger file. that means we
> > aren't saving space, and we're adding overhead at runtime.
>
> Isn't it better to skip compression on all tiny files (not only man
> pages)? In such case some other functions will need to be updated too
> (e.g. ecompress --suffix)...

perhaps, but i think it should only be done on automatic dirs like
docs/info/man. as for the --suffix thing, where would that be an issue ?
people already know they should never rely on these dirs being compressed with
a predictable format.
-mike

Ulrich Mueller 09-20-2010 06:56 AM

>>>>> On Mon, 20 Sep 2010, Mike Frysinger wrote:

>> Isn't it better to skip compression on all tiny files (not only man
>> pages)? In such case some other functions will need to be updated
>> too (e.g. ecompress --suffix)...

> perhaps, but i think it should only be done on automatic dirs like
> docs/info/man.

It's not really an issue outside of /usr/share/man. On my system, I
count about 5600 tiny (smaller than 100 chars uncompressed) files in
man, 170 in doc, and none in info.

Ulrich
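
Counts like the ones above can be reproduced with find. The following is a rough sketch, not the command Ulrich ran: count_tiny is a made-up helper name, and it looks at on-disk sizes, so on a system whose pages are already compressed it does not give the uncompressed figure he quotes.

```shell
# count_tiny DIR [LIMIT]: count regular files of at most LIMIT bytes (default 100).
count_tiny() {
    dir=$1
    limit=${2:-100}
    # "-size +Nc" matches files larger than N bytes, so negating it keeps
    # files of N bytes or fewer (the same test the prepman patch uses).
    # One dot is printed per match and the dots are counted, which keeps
    # the result correct even for file names containing newlines.
    find "$dir" -type f ! -size +"${limit}"c -exec printf . \; | wc -c | tr -d ' '
}
```

For example, count_tiny /usr/share/man prints the number of files under that tree that are 100 bytes or smaller.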

James Cloos 09-21-2010 09:26 AM

>>>>> "UM" == Ulrich Mueller <ulm@gentoo.org> writes:

UM> If we take the second route, then maybe it should be a more general
UM> solution, i.e. exclude all tiny files (man page or not) from
UM> compression?

First, from a user’s perspective, not compressing small files is a good
thing. Man pages perhaps most of all, given makewhatis, et al.
(Think of all the C₁₂ which won’t be un-sequestered quite so soon. ☺ ;^)

Ideally, there would be some way to configure, per filesystem and/or per
directory, what constitutes a small file. If the fs uses fixed-size
blocks then anything already smaller than one block needn’t be compressed.
OTOH, if the fs supports partial block file packing, then a smaller
threshold may be better.

Even for large files, if compression fails to save any blocks then
it may be better to leave them uncompressed.

That said, some backup strategies may be better served by compressing
all but the smallest files.

Good heuristics for the default compress-or-don’t threshold should cover
most systems, but the ability to easily override the default is desirable.

-JimC
--
James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
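
The block-based heuristic described above could be sketched roughly as follows. The helper names, the 4096-byte default, and the use of gzip are all assumptions made for illustration; a real implementation would query the filesystem and use the configured compressor.

```shell
# blocks_needed SIZE BLOCK: number of BLOCK-byte blocks needed to hold SIZE bytes.
blocks_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# saves_blocks FILE [BLOCK]: succeed only if gzip would lower the block count.
saves_blocks() {
    file=$1
    block=${2:-4096}   # assumed block size; real filesystems vary
    plain=$(( $(wc -c < "$file") ))
    packed=$(( $(gzip -c "$file" | wc -c) ))
    [ "$(blocks_needed "$packed" "$block")" -lt "$(blocks_needed "$plain" "$block")" ]
}
```

Under this rule a file that already fits in one block is never compressed, since no compressor can take it below one block. A smaller BLOCK argument approximates a filesystem that packs partial blocks.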

Mike Frysinger 09-21-2010 10:14 AM

On Tuesday, September 21, 2010 05:26:36 James Cloos wrote:
> Ulrich Mueller writes:
> > If we take the second route, then maybe it should be a more general
> > solution, i.e. exclude all tiny files (man page or not) from
> > compression?
>
> First, from a user’s perspective, not compressing small files is a good
> thing. Man pages perhaps most of all, given makewhatis, et al.
> (Think of all the C₁₂ which won’t be un-sequestered quite so soon. ☺ ;^)
>
> Ideally, there would be some way to configure, per filesystem and/or per
> directory, what constitutes a small file. If the fs uses fixed-size
> blocks then anything already smaller than one block needn’t be compressed.
> OTOH, if the fs supports partial block file packing, then a smaller
> threshold may be better.

probably not a bad idea, but i'm going to attempt the other route and avoid
the whole issue (automatically turn .so into symlinks). feel free to pursue
this in the related EAPI bug ;).
-mike
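
For illustration, the symlink route might look something like the sketch below. It is not the code that ended up in Portage: so_to_symlink is a hypothetical helper, and it assumes pages live in MANROOT/manN/ and that the .so target has the usual manM/name form relative to MANROOT.

```shell
# so_to_symlink MANROOT PAGE: replace a one-line ".so" redirect with a symlink.
# Returns non-zero and leaves PAGE untouched if it is not a plain redirect.
so_to_symlink() {
    manroot=$1
    page=$2
    # A pure redirect consists of a single line.
    [ "$(( $(wc -l < "$page") ))" -le 1 ] || return 1
    # read fails at EOF when the final newline is missing, so its status
    # is ignored and the parsed fields are checked instead.
    directive= target= extra=
    read -r directive target extra < "$page"
    [ "$directive" = ".so" ] && [ -n "$target" ] && [ -z "$extra" ] || return 1
    # Never create a dangling link.
    [ -f "$manroot/$target" ] || return 1
    # Relative link from manN/ to the target: ../manM/name
    ln -sf "../$target" "$page"
}
```

Called as so_to_symlink /usr/share/man /usr/share/man/man1/zcat.1, this would turn a page containing ".so man1/gzip.1" into a link to ../man1/gzip.1.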

Enrico Weigelt 09-21-2010 11:01 AM

* Peter Volkov <pva@gentoo.org> wrote:
> On Sun, 19/09/2010 at 19:43 -0400, Mike Frysinger wrote:
> > many man pages exist merely as a redirect to another man page:
> > $ xzcat /usr/share/man/man1/zcat.1.xz
> > .so man1/gzip.1
> >
> > compressing these tiny files (always?) results in a larger file. that means we
> > aren't saving space, and we're adding overhead at runtime.
>
> Isn't it better to skip compression on all tiny files (not only man
> pages)? In such case some other functions will need to be updated too
> (e.g. ecompress --suffix)...

Maybe it would be even better to mount a compressing filesystem
on /usr/share/man and /usr/share/info (or perhaps even the whole
/usr/share?), drop the explicit compression altogether, and
replace the redirect files with symlinks?


cu
--
Enrico Weigelt, metux IT service -- http://www.metux.de/
