On 09/19/2010 04:43 PM, Mike Frysinger wrote:
> many man pages exist merely as a redirect to another man page:
> $ xzcat /usr/share/man/man1/zcat.1.xz
> .so man1/gzip.1
>
> compressing these tiny files (always?) results in a larger file. that means we
> aren't saving space, and we're adding overhead at runtime.
>
> two options which we can do transparently:
> - rewrite the .so man pages into symlinks
> - omit them from compression
>
> the latter is pretty easy (see below). any preferences on which route to take
> though as the former shouldn't be too hard either ...
It feels like an insignificant optimization to me, but I don't feel
strongly either way.
--
Thanks,
Zac
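The size claim in the quoted message is easy to check by hand. The sketch below pipes the one-line stub from the example through whichever common compressors happen to be installed and prints the resulting byte counts. The helper names are made up for illustration, and the list of tools is an assumption rather than anything taken from the thread.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.
stub='.so man1/gzip.1'

# plain_size: bytes the stub occupies on disk (including the newline).
plain_size() {
    printf '%s\n' "$stub" | wc -c | tr -d ' '
}

# compressed_size TOOL: bytes produced by piping the stub through "TOOL -c".
compressed_size() {
    printf '%s\n' "$stub" | "$1" -c | wc -c | tr -d ' '
}

echo "plain: $(plain_size) bytes"
for tool in gzip bzip2 xz; do
    # Skip compressors that are not installed on this machine.
    command -v "$tool" >/dev/null 2>&1 || continue
    echo "$tool: $(compressed_size "$tool") bytes"
done
```

Each compressor adds its own container header and trailer, so for an input this short the output should come out larger than the input.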
09-19-2010, 11:59 PM
Mike Frysinger
omitting redirecting man pages from compression
On Sunday, September 19, 2010 19:50:57 Zac Medico wrote:
> On 09/19/2010 04:43 PM, Mike Frysinger wrote:
> > many man pages exist merely as a redirect to another man page:
> > $ xzcat /usr/share/man/man1/zcat.1.xz
> > .so man1/gzip.1
> >
> > compressing these tiny files (always?) results in a larger file. that means we
> > aren't saving space, and we're adding overhead at runtime.
> >
> > two options which we can do transparently:
> > - rewrite the .so man pages into symlinks
> > - omit them from compression
> >
> > the latter is pretty easy (see below). any preferences on which route to
> > take though as the former shouldn't be too hard either ...
>
> It feels like an insignificant optimization to me, but I don't feel
> strongly either way.
~19% of the man pages on my system appear to be forwarding files (glorified
symlinks). in my case, that's almost 3000 files. considering things like
`makewhatis` need to decompress & read all of these, i think the difference is
worth addressing.
-mike
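A share like the one quoted above could be estimated with something along these lines. This is only a sketch, not the command used to get the 19% figure: it assumes an uncompressed tree laid out as ROOT/manN/page.N, it counts a file as a redirect only when its single non-blank line is a `.so` request, and both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; assumes uncompressed pages under ROOT/man*/.

# is_so_redirect FILE: succeed if FILE holds exactly one non-blank line
# and that line is a ".so <target>" request.
is_so_redirect() {
    [ "$(grep -c -v '^[[:space:]]*$' "$1")" -eq 1 ] &&
        grep -q '^\.so[[:space:]][[:space:]]*[^[:space:]]' "$1"
}

# count_redirects ROOT: print "<redirects> <total>" for regular files
# found in ROOT/man*/ (symlinks are skipped).
count_redirects() {
    redirects=0
    total=0
    for f in "$1"/man*/*; do
        [ -f "$f" ] && [ ! -L "$f" ] || continue
        total=$((total + 1))
        if is_so_redirect "$f"; then
            redirects=$((redirects + 1))
        fi
    done
    echo "$redirects $total"
}
```

Calling `count_redirects /usr/share/man` would print the two counts for that tree. On a system where the pages are already compressed, each file would need to be decompressed before the check, which this sketch does not attempt.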
09-20-2010, 05:59 AM
Peter Volkov
omitting redirecting man pages from compression
On Sun, 19/09/2010 at 19:43 -0400, Mike Frysinger wrote:
> many man pages exist merely as a redirect to another man page:
> $ xzcat /usr/share/man/man1/zcat.1.xz
> .so man1/gzip.1
>
> compressing these tiny files (always?) results in a larger file. that means we
> aren't saving space, and we're adding overhead at runtime.
Isn't it better to skip compression on all tiny files (not only man
pages)? In such case some other functions will need to be updated too
(e.g. ecompress --suffix)...
--
Peter.
09-20-2010, 06:31 AM
Ulrich Mueller
omitting redirecting man pages from compression
>>>>> On Sun, 19 Sep 2010, Mike Frysinger wrote:
> many man pages exist merely as a redirect to another man page:
> $ xzcat /usr/share/man/man1/zcat.1.xz
> .so man1/gzip.1
> compressing these tiny files (always?) results in a larger file. that means we
> aren't saving space, and we're adding overhead at runtime.
> two options which we can do transparently:
> - rewrite the .so man pages into symlinks
> - omit them from compression
> the latter is pretty easy (see below). any preferences on which
> route to take though as the former shouldn't be too hard either ...
With "controllable compression" in EAPI 4, /usr/share/man will no
longer be special in any way. (Currently, the part of prepman that
your patch changes won't even be reached in EAPI 4.)
If we take the second route, then maybe it should be a more general
solution, i.e. exclude all tiny files (man page or not) from
compression?
But I think that rewriting the .so files into symlinks would be
cleaner.
Ulrich
09-20-2010, 06:45 AM
Mike Frysinger
omitting redirecting man pages from compression
On Monday, September 20, 2010 01:59:33 Peter Volkov wrote:
> On Sun, 19/09/2010 at 19:43 -0400, Mike Frysinger wrote:
> > many man pages exist merely as a redirect to another man page:
> > $ xzcat /usr/share/man/man1/zcat.1.xz
> > .so man1/gzip.1
> >
> > compressing these tiny files (always?) results in a larger file. that means we
> > aren't saving space, and we're adding overhead at runtime.
>
> Isn't it better to skip compression on all tiny files (not only man
> pages)? In such case some other functions will need to be updated too
> (e.g. ecompress --suffix)...
perhaps, but i think it should only be done on automatic dirs like
docs/info/man. as for the --suffix thing, where would that be an issue ?
people already know they should never rely on these dirs being compressed with
a predictable format.
-mike
09-20-2010, 06:56 AM
Ulrich Mueller
omitting redirecting man pages from compression
>>>>> On Mon, 20 Sep 2010, Mike Frysinger wrote:
>> Isn't it better to skip compression on all tiny files (not only man
>> pages)? In such case some other functions will need to be updated
>> too (e.g. ecompress --suffix)...
> perhaps, but i think it should only be done on automatic dirs like
> docs/info/man.
It's not really an issue outside of /usr/share/man. On my system, I
count about 5600 tiny (smaller than 100 chars uncompressed) files in
man, 170 in doc, and none in info.
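A count of this kind can be approximated with `find`. The sketch below is not the command used for the numbers above: the function name is hypothetical, and it measures on-disk size, so it only matches "uncompressed size" when the files have not been compressed yet.

```shell
#!/bin/sh
# count_tiny DIR [LIMIT]: count regular files under DIR that are smaller
# than LIMIT bytes (default 100). Assumes the files are stored uncompressed
# and that no file name contains a newline.
count_tiny() {
    find "$1" -type f -size "-${2:-100}c" | wc -l | tr -d ' '
}
```

For example, `count_tiny /usr/share/man` and `count_tiny /usr/share/doc` would give one number per directory for comparison.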
UM> If we take the second route, then maybe it should be a more general
UM> solution, i.e. exclude all tiny files (man page or not) from
UM> compression?
First, from a user’s perspective, not compressing small files is a good
thing. Man pages perhaps most of all, given makewhatis, et al.
(Think of all the C₁₂ which won’t be un-sequestered quite so soon. ☺ ;^)
Ideally, there would be some way to configure, per filesystem and/or per
directory, what constitutes a small file. If the fs uses fixed-size
blocks then anything already smaller than one block needn’t be compressed.
OTOH, if the fs supports partial block file packing, then a smaller
threshold may be better.
Even for large files, if the compression fails to save any blocks then
it may be better to leave it uncompressed.
That said, some backup strategies may be better served by compressing
all but the smallest files.
Good heuristics for the default compress-or-don’t threshold should cover
most systems, but the ability to easily override the default is desirable.
-JimC
--
James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
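The block-based rule described above can be sketched as follows: compress only when doing so frees at least one filesystem block. Everything here is an assumption made for illustration. The helper names are hypothetical, gzip stands in for whatever compressor is configured, and the block size is passed in by the caller rather than detected.

```shell
#!/bin/sh
# Hypothetical helpers; gzip stands in for the configured compressor.

# blocks_for SIZE BLOCKSIZE: blocks needed to store SIZE bytes (rounded up).
blocks_for() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# saves_blocks FILE BLOCKSIZE: succeed only if the compressed FILE would
# occupy fewer BLOCKSIZE-byte blocks than the uncompressed one.
saves_blocks() {
    plain=$(wc -c < "$1")
    packed=$(gzip -c "$1" | wc -c)
    [ "$(blocks_for "$packed" "$2")" -lt "$(blocks_for "$plain" "$2")" ]
}
```

With a 4096-byte block size, a one-line stub fails the test (one block either way), while a long, repetitive page passes. A filesystem that packs partial blocks would call for a smaller BLOCKSIZE argument, which is the per-filesystem knob the message asks for.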
09-21-2010, 10:14 AM
Mike Frysinger
omitting redirecting man pages from compression
On Tuesday, September 21, 2010 05:26:36 James Cloos wrote:
> Ulrich Mueller writes:
> > If we take the second route, then maybe it should be a more general
> > solution, i.e. exclude all tiny files (man page or not) from
> > compression?
>
> First, from a user’s perspective, not compressing small files is a good
> thing. Man pages perhaps most of all, given makewhatis, et al.
> (Think of all the C₁₂ which won’t be un-sequestered quite so soon. ☺ ;^)
>
> Ideally, there would be some way to configure, per filesystem and/or per
> directory, what constitutes a small file. If the fs uses fixed-size
> blocks then anything already smaller than one block needn’t be compressed.
> OTOH, if the fs supports partial block file packing, then a smaller
> threshold may be better.
probably not a bad idea, but i'm going to attempt the other route and avoid
the whole issue (automatically turn .so into symlinks). feel free to pursue
this in the related EAPI bug .
-mike
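For readers wondering what turning `.so` stubs into symlinks might look like, here is a minimal sketch. It is not Mike's implementation. It assumes the file is already known to be a pure redirect stub, that pages sit directly in MANROOT/<section>/, and that `.so` targets are written relative to MANROOT (as in `man1/gzip.1`). The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper; see the assumptions stated above.

# so_to_symlink MANROOT FILE: replace a ".so" stub with a relative symlink.
# Returns non-zero and leaves FILE untouched if the target cannot be found.
so_to_symlink() {
    manroot=$1
    file=$2
    # Take the first word after ".so" on the first matching line.
    target=$(sed -n 's/^\.so[[:space:]][[:space:]]*\([^[:space:]]*\).*/\1/p' "$file" | head -n 1)
    [ -n "$target" ] || return 1
    [ -f "$manroot/$target" ] || return 1
    # FILE sits one level below MANROOT, so "../" climbs back up to it.
    rm -f "$file" && ln -s "../$target" "$file"
}
```

After `so_to_symlink /usr/share/man /usr/share/man/man1/zcat.1`, the stub would be a link to `../man1/gzip.1`. Stubs whose target is given relative to their own directory, or whose target lives in another package, are not handled here.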
09-21-2010, 11:01 AM
Enrico Weigelt
omitting redirecting man pages from compression
* Peter Volkov <pva@gentoo.org> wrote:
> On Sun, 19/09/2010 at 19:43 -0400, Mike Frysinger wrote:
> > many man pages exist merely as a redirect to another man page:
> > $ xzcat /usr/share/man/man1/zcat.1.xz
> > .so man1/gzip.1
> >
> > compressing these tiny files (always?) results in a larger file. that means we
> > aren't saving space, and we're adding overhead at runtime.
>
> Isn't it better to skip compression on all tiny files (not only man
> pages)? In such case some other functions will need to be updated too
> (e.g. ecompress --suffix)...
Maybe it would be even better to mount a compressing filesystem
on /usr/share/man and /usr/share/info (or perhaps even the whole
/usr/share?), drop the explicit compression altogether, and
replace the link files with symlinks?
cu
--
----------------------------------------------------------------------
Enrico Weigelt, metux IT service -- http://www.metux.de/