Linux Archive > Gentoo > Gentoo Development
 
09-19-2010, 11:43 PM
Mike Frysinger

omitting redirecting man pages from compression

many man pages exist merely as a redirect to another man page:
$ xzcat /usr/share/man/man1/zcat.1.xz
.so man1/gzip.1

compressing these tiny files (always?) results in a larger file. that means we
aren't saving space, and we're adding overhead at runtime.
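
The size claim is easy to check by compressing a redirect page and comparing byte counts. The sketch below is only illustrative: `compare_sizes` is a made-up helper name, and it uses gzip because it is available almost everywhere, whereas the example above uses xz.

```shell
# compare_sizes FILE: print "<raw bytes> <gzip bytes>" for FILE.
# Hypothetical helper for illustration; not part of portage.
compare_sizes() {
    local file="$1" raw packed
    raw=$(wc -c < "${file}")
    packed=$(gzip -9 -c < "${file}" | wc -c)
    # arithmetic expansion strips the padding some wc implementations emit
    echo "$((raw)) $((packed))"
}
```

Running it on a file that contains only `.so man1/gzip.1` should report a compressed size larger than the raw size, since the container format's header and trailer alone outweigh the payload.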

two options which we can do transparently:
- rewrite the .so man pages into symlinks
- omit them from compression

the latter is pretty easy (see below). any preferences on which route to take
though as the former shouldn't be too hard either ...

--- a/bin/ebuild-helpers/ecompressdir
+++ b/bin/ebuild-helpers/ecompressdir
@@ -13,6 +13,7 @@ case $1 in
     --ignore)
         shift
         for skip in "$@" ; do
+            skip=${skip#${D}}
             [[ -d ${D}${skip} || -f ${D}${skip} ]] \
                 && touch "${D}${skip}.ecompress.skip"
         done
--- a/bin/ebuild-helpers/prepman
+++ b/bin/ebuild-helpers/prepman
@@ -27,6 +27,10 @@ for subdir in "${mandir}"/man* "${mandir}"/*/man* ; do
     [[ -d ${subdir} ]] && really_is_mandir=1 && break
 done
 
-[[ ${really_is_mandir} == 1 ]] && exec ecompressdir --queue "${mandir#${D}}"
+if [[ ${really_is_mandir} == 1 ]] ; then
+    ecompressdir --queue "${mandir#${D}}" || exit 1
+    # compressing small files just adds overhead
+    find "${mandir}" -type f '!' -size +100c -print0 | ${XARGS} -0 ecompressdir --ignore
+fi
 
 exit 0
-mike
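
The size test used in the prepman hunk can be tried on its own. This is a minimal sketch, assuming a find that accepts the `c` (bytes) size suffix, as GNU and BSD find do; `list_tiny_files` is a hypothetical name, and the 100-byte default is simply the value from the patch.

```shell
# list_tiny_files DIR [LIMIT]: print regular files under DIR whose size is
# at most LIMIT bytes (default 100, the cutoff used in the patch).
# Hypothetical helper for illustration; not part of portage.
list_tiny_files() {
    local dir="$1" limit="${2:-100}"
    # "-size +Nc" is true for files larger than N bytes, so negating it
    # keeps every file of N bytes or fewer.
    find "${dir}" -type f '!' -size "+${limit}c" -print
}
```

Note that the negated test is inclusive: a file of exactly 100 bytes is still selected, while one of 101 bytes is not.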
 
09-19-2010, 11:50 PM
Zac Medico

On 09/19/2010 04:43 PM, Mike Frysinger wrote:
> many man pages exist merely as a redirect to another man page:
> $ xzcat /usr/share/man/man1/zcat.1.xz
> .so man1/gzip.1
>
> compressing these tiny (always?) results in a larger file. that means we
> arent saving space, and we're adding overhead at runtime.
>
> two options which we can do transparently:
> - rewrite the .so man pages into symlinks
> - omit them from compression
>
> the latter is pretty easy (see below). any preferences on which route to take
> though as the former shouldnt be too hard either ...

It feels like an insignificant optimization to me, but I don't feel
strongly either way.
--
Thanks,
Zac
 
09-19-2010, 11:59 PM
Mike Frysinger

On Sunday, September 19, 2010 19:50:57 Zac Medico wrote:
> On 09/19/2010 04:43 PM, Mike Frysinger wrote:
> > many man pages exist merely as a redirect to another man page:
> > $ xzcat /usr/share/man/man1/zcat.1.xz
> > .so man1/gzip.1
> >
> > compressing these tiny (always?) results in a larger file. that means we
> > arent saving space, and we're adding overhead at runtime.
> >
> > two options which we can do transparently:
> > - rewrite the .so man pages into symlinks
> > - omit them from compression
> >
> > the latter is pretty easy (see below). any preferences on which route to
> > take though as the former shouldnt be too hard either ...
>
> It feels like an insignificant optimization to me, but I don't feel
> strongly either way.

~19% of the man pages on my system appear to be forwarding files (glorified
symlinks). in my case, that's almost 3000 files. considering things like
`makewhatis` need to decompress & read all of these, i think the difference is
worth addressing.
-mike
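
A count like this can be roughly reproduced on an uncompressed man tree. The sketch below assumes a redirect page is any file whose first line starts with `.so `, which is only a heuristic; `count_redirects` is a hypothetical helper, and pages that are already compressed would have to be decompressed before they could be inspected this way.

```shell
# count_redirects DIR: print "<redirects> <total>" for regular files in DIR.
# Hypothetical helper for illustration; assumes no newlines in file names.
count_redirects() {
    find "$1" -type f | {
        total=0 redirects=0
        while IFS= read -r file ; do
            total=$((total + 1))
            first=
            # only the first line matters for a redirect page
            IFS= read -r first < "${file}" || true
            case ${first} in
                ".so "*) redirects=$((redirects + 1)) ;;
            esac
        done
        echo "${redirects} ${total}"
    }
}
```

Dividing the first number by the second gives the share of forwarding files in the tree.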
 
09-20-2010, 05:59 AM
Peter Volkov

On Sun, 19/09/2010 at 19:43 -0400, Mike Frysinger wrote:
> many man pages exist merely as a redirect to another man page:
> $ xzcat /usr/share/man/man1/zcat.1.xz
> .so man1/gzip.1
>
> compressing these tiny (always?) results in a larger file. that means we
> arent saving space, and we're adding overhead at runtime.

Isn't it better to skip compression on all tiny files (not only man
pages)? In such case some other functions will need to be updated too
(e.g. ecompress --suffix)...

--
Peter.
 
09-20-2010, 06:31 AM
Ulrich Mueller

>>>>> On Sun, 19 Sep 2010, Mike Frysinger wrote:

> many man pages exist merely as a redirect to another man page:
> $ xzcat /usr/share/man/man1/zcat.1.xz
> .so man1/gzip.1

> compressing these tiny (always?) results in a larger file. that means we
> arent saving space, and we're adding overhead at runtime.

> two options which we can do transparently:
> - rewrite the .so man pages into symlinks
> - omit them from compression

> the latter is pretty easy (see below). any preferences on which
> route to take though as the former shouldnt be too hard either ...

With "controllable compression" in EAPI 4, /usr/share/man will no
longer be special in any way. (Currently, the part of prepman that
your patch changes won't even be reached in EAPI 4.)

If we take the second route, then maybe it should be a more general
solution, i.e. exclude all tiny files (man page or not) from
compression?

But I think that rewriting the .so files into symlinks would be
cleaner.

Ulrich
 
09-20-2010, 06:45 AM
Mike Frysinger

On Monday, September 20, 2010 01:59:33 Peter Volkov wrote:
> On Sun, 19/09/2010 at 19:43 -0400, Mike Frysinger wrote:
> > many man pages exist merely as a redirect to another man page:
> > $ xzcat /usr/share/man/man1/zcat.1.xz
> > .so man1/gzip.1
> >
> > compressing these tiny (always?) results in a larger file. that means we
> > arent saving space, and we're adding overhead at runtime.
>
> Isn't it better to skip compression on all tiny files (not only man
> pages)? In such case some other functions will need to be updated too
> (e.g. ecompress --suffix)...

perhaps, but i think it should only be done on automatic dirs like
docs/info/man. as for the --suffix thing, where would that be an issue ?
people already know they should never rely on these dirs being compressed with
a predictable format.
-mike
 
09-20-2010, 06:56 AM
Ulrich Mueller

>>>>> On Mon, 20 Sep 2010, Mike Frysinger wrote:

>> Isn't it better to skip compression on all tiny files (not only man
>> pages)? In such case some other functions will need to be updated
>> too (e.g. ecompress --suffix)...

> perhaps, but i think it should only be done on automatic dirs like
> docs/info/man.

It's not really an issue outside of /usr/share/man. On my system, I
count about 5600 tiny (smaller than 100 chars uncompressed) files in
man, 170 in doc, and none in info.

Ulrich
 
09-21-2010, 09:26 AM
James Cloos

>>>>> "UM" == Ulrich Mueller <ulm@gentoo.org> writes:

UM> If we take the second route, then maybe it should be a more general
UM> solution, i.e. exclude all tiny files (man page or not) from
UM> compression?

First, from a user’s perspective, not compressing small files is a good
thing. Man pages perhaps most of all, given makewhatis, et al.
(Think of all the C₁₂ which won’t be un-sequestered quite so soon. ☺ ;^)

Ideally, there would be some way to configure, per filesystem and/or per
directory, what constitutes a small file. If the fs uses fixed-size
blocks then anything already smaller than one block needn’t be compressed.
OTOH, if the fs supports partial block file packing, then a smaller
threshold may be better.

Even for large files, if the compression fails to save any blocks then
it may be better to leave it uncompressed.

That said, some backup strategies may be better served by compressing
all but the smallest files.

Good heuristics for the default compress-or-don’t threshold should cover
most systems, but the ability to easily override the default is desirable.

-JimC
--
James Cloos <cloos@jhcloos.com> OpenPGP: 1024D/ED7DAEA6
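
The block-based idea above can be sketched as a small predicate: compress only when doing so frees at least one filesystem block. Everything here is an assumption made for illustration: the helper names, the 4096-byte default block size, and the use of gzip. It also ignores filesystems that pack partial blocks, which is the case where a smaller threshold may be better.

```shell
# blocks_for BYTES BLOCKSIZE: whole blocks needed for BYTES (ceiling division).
blocks_for() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# worth_compressing FILE [BLOCKSIZE]: succeed only if the gzip'ed copy would
# occupy fewer blocks than the original. BLOCKSIZE defaults to 4096, an
# assumed value; a real tool would query the target filesystem instead.
worth_compressing() {
    local file="$1" bs="${2:-4096}" raw packed
    raw=$(wc -c < "${file}")
    packed=$(gzip -9 -c < "${file}" | wc -c)
    [ "$(blocks_for "$((raw))" "${bs}")" -gt "$(blocks_for "$((packed))" "${bs}")" ]
}
```

With these defaults a one-line redirect page is never worth compressing, because both the raw and the compressed form fit in a single block.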
 
09-21-2010, 10:14 AM
Mike Frysinger

On Tuesday, September 21, 2010 05:26:36 James Cloos wrote:
> Ulrich Mueller writes:
> > If we take the second route, then maybe it should be a more general
> > solution, i.e. exclude all tiny files (man page or not) from
> > compression?
>
> First, from a user’s perspective, not compressing small files is a good
> thing. Man pages perhaps most of all, given makewhatis, et al.
> (Think of all the C₁₂ which won’t be un-sequestered quite so soon. ☺ ;^)
>
> Ideally, there would be some way to configure, per filesystem and/or per
> directory, what constitutes a small file. If the fs uses fixed-size
> blocks then anything already smaller than one block needn’t be compressed.
> OTOH, if the fs supports partial block file packing, then a smaller
> threshold may be better.

probably not a bad idea, but i'm going to attempt the other route and avoid
the whole issue (automatically turn .so into symlinks). feel free to pursue
this in the related EAPI bug.
-mike
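
For reference, the symlink route could look roughly like the sketch below. It is not the change that was eventually made to portage, only an illustration under stated assumptions: `so_to_symlink` is a hypothetical name, `.so` paths are taken to be relative to the man root (as in `.so man1/gzip.1`), only top-level `man*` section directories are handled, and a page is converted only if it consists of exactly that one line and its target exists.

```shell
# so_to_symlink MANDIR: replace one-line ".so <sect>/<page>" redirect pages
# in MANDIR/man*/ with relative symlinks. Hypothetical helper, not portage code.
so_to_symlink() {
    local mandir="$1" file lines line target
    for file in "${mandir}"/man*/* ; do
        # regular files only; leave existing symlinks alone
        [ -f "${file}" ] && [ ! -L "${file}" ] || continue
        # only touch files that consist of exactly one line
        lines=$(wc -l < "${file}")
        [ "$((lines))" -eq 1 ] || continue
        IFS= read -r line < "${file}" || continue
        case ${line} in
            ".so "*) target=${line#.so } ;;
            *) continue ;;
        esac
        # skip redirects whose target is missing, so no dangling link is made
        [ -f "${mandir}/${target}" ] || continue
        # the page lives one level below MANDIR, hence the "../" prefix
        ln -sf "../${target}" "${file}"
    done
}
```

Called as `so_to_symlink "${D}/usr/share/man"`, this would turn `man1/zcat.1` into a link to `../man1/gzip.1` and leave every other page untouched.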
 
09-21-2010, 11:01 AM
Enrico Weigelt

* Peter Volkov <pva@gentoo.org> wrote:
> On Sun, 19/09/2010 at 19:43 -0400, Mike Frysinger wrote:
> > many man pages exist merely as a redirect to another man page:
> > $ xzcat /usr/share/man/man1/zcat.1.xz
> > .so man1/gzip.1
> >
> > compressing these tiny (always?) results in a larger file. that means we
> > arent saving space, and we're adding overhead at runtime.
>
> Isn't it better to skip compression on all tiny files (not only man
> pages)? In such case some other functions will need to be updated too
> (e.g. ecompress --suffix)...

Maybe it would be even better to mount some compressing filesystem
on /usr/share/man and /usr/share/info (or perhaps even the whole
/usr/share ?), leave off the explicit compression at all and
replace the link files by symlinks ?


cu
--
----------------------------------------------------------------------
Enrico Weigelt, metux IT service -- http://www.metux.de/

phone: +49 36207 519931 email: weigelt@metux.de
mobile: +49 151 27565287 icq: 210169427 skype: nekrad666
----------------------------------------------------------------------
Embedded-Linux / Portierung / Opensource-QM / Verteilte Systeme
----------------------------------------------------------------------
 

All times are GMT.