06-05-2011, 11:18 AM
Emil Langrock

Link-time optimization in debian packages

Hi,

I currently have the problem that I have to use large, compute-intensive
applications [1,2]. These are usually implemented in many source files. In the
past I used pseudo C files which include all the other C files [3]. Of course,
this is a hack and doesn't work in many situations due to conflicting local
symbols.

I played around a little bit with GCC's LTO [4]. It is really impressive for
this kind of application. I saw a size reduction and a speed increase with the
tested applications. Of course, it was just a small test set and not really
scientific.

Link-time optimization changes the meaning of flags slightly [5]. It is
currently necessary to also provide the optimization-related flags from
CFLAGS/CXXFLAGS in LDFLAGS. Otherwise LTO will not really do an
optimization step.
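
To make the flag handling concrete, here is a minimal Python sketch of what a build helper would have to do: enable LTO at compile time and repeat the optimization-related flags at link time. The function name lto_build_flags and the rule for what counts as "optimization-related" (anything starting with -O, -f or -m) are assumptions made for illustration, not something GCC or Debian defines.

```python
import shlex


def lto_build_flags(cflags: str, ldflags: str = "", lto_flag: str = "-flto") -> dict:
    """Return CFLAGS/LDFLAGS strings suitable for an LTO build (hypothetical helper)."""
    compile_args = shlex.split(cflags)
    link_args = shlex.split(ldflags)

    # LTO has to be requested when compiling as well as when linking.
    if lto_flag not in compile_args:
        compile_args.append(lto_flag)

    # Assumption: flags starting with -O, -f or -m influence code generation
    # and therefore have to be repeated for the link-time optimization step.
    carried = [arg for arg in compile_args if arg.startswith(("-O", "-f", "-m"))]
    for arg in carried:
        if arg not in link_args:
            link_args.append(arg)

    return {"CFLAGS": " ".join(compile_args), "LDFLAGS": " ".join(link_args)}


if __name__ == "__main__":
    flags = lto_build_flags("-g -O2 -Wall", "-Wl,--as-needed")
    print("CFLAGS  =", flags["CFLAGS"])
    print("LDFLAGS =", flags["LDFLAGS"])
```

With the inputs shown, -O2 and -flto end up in both variables, while -g and -Wall stay in CFLAGS only.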

I already found some smaller problems related to weird asm usage in some PIC
library code [6], but I doubt that this is a big showstopper, and I expect it
will be fixed soon(tm).

My question is now whether there are already plans to use LTO in Debian
packages, or any larger Debian-related studies, policies, release goals, ...? It
could also be interesting for large projects like Iceweasel, LibreOffice, ...
Maybe the KDE Debian package maintainers have a reason why they don't use
KDE4_ENABLE_FINAL, which would also be an argument against LTO.

[1] http://packages.qa.debian.org/p/povray.html
[2] http://packages.qa.debian.org/m/mednafen.html
[3] see KDE4_ENABLE_FINAL in all KDE libraries/applications
[4] http://gcc.gnu.org/wiki/LinkTimeOptimization
[5] http://gcc.gnu.org/wiki/lto/OptionHandling
[6] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49286
--
Emil Langrock


--
To UNSUBSCRIBE, email to debian-devel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/201106051318.43581.emil.langrock@gmx.de
 
06-05-2011, 11:11 PM
Mike Hommey

Link-time optimization in debian packages

On Sun, Jun 05, 2011 at 01:18:43PM +0200, Emil Langrock wrote:
> Hi,
>
> I currently have the problem that I have to use large, compute-intensive
> applications [1,2]. These are usually implemented in many source files. In the
> past I used pseudo C files which include all the other C files [3]. Of course,
> this is a hack and doesn't work in many situations due to conflicting local
> symbols.
>
> I played around a little bit with GCC's LTO [4]. It is really impressive for
> this kind of application. I saw a size reduction and a speed increase with the
> tested applications. Of course, it was just a small test set and not really
> scientific.
>
> Link-time optimization changes the meaning of flags slightly [5]. It is
> currently necessary to also provide the optimization-related flags from
> CFLAGS/CXXFLAGS in LDFLAGS. Otherwise LTO will not really do an
> optimization step.
>
> I already found some smaller problems related to weird asm usage in some PIC
> library code [6], but I doubt that this is a big showstopper, and I expect it
> will be fixed soon(tm).
>
> My question is now whether there are already plans to use LTO in Debian
> packages, or any larger Debian-related studies, policies, release goals, ...? It
> could also be interesting for large projects like Iceweasel, LibreOffice, ...
> Maybe the KDE Debian package maintainers have a reason why they don't use
> KDE4_ENABLE_FINAL, which would also be an argument against LTO.

On the large projects you mention, LTO is simply not possible to use at
all with gcc 4.6. For instance, the linking phase for Firefox requires
more memory than the address space of a 32-bit machine permits, which
means we'd need a) buildds with a lot of RAM and b) to cross-compile
32-bit code with 64-bit toolchains.

gcc 4.7 should be better, though.

Mike


--
Archive: http://lists.debian.org/20110605231154.GA3649@glandium.org
 
06-05-2011, 11:15 PM
Adam Borowski

Link-time optimization in debian packages

On Sun, Jun 05, 2011 at 01:18:43PM +0200, Emil Langrock wrote:
> I played around a little bit with GCC's LTO [4]. It is really impressive for
> this kind of application. I saw a size reduction and a speed increase with the
> tested applications. Of course, it was just a small test set and not really
> scientific.
>
> Link-time optimization changes the meaning of flags slightly [5]. It is
> currently necessary to also provide the optimization-related flags from
> CFLAGS/CXXFLAGS in LDFLAGS. Otherwise LTO will not really do an
> optimization step.

> My question is now whether there are already plans to use LTO in Debian
> packages, or any larger Debian-related studies, policies, release goals, ...?

I'm afraid it's not that simple. Every package has to be changed on
its own. For a systemic solution it might be better to talk to the autotools
folks and their competition.

What needs to be changed:
* as you said, optimization flags need to be added to LDFLAGS as well
* the invocation of gcc in the Makefile (at least the one used for linking)
has to be prefixed with "+", and you need to add "-flto=jobserver" to the
above. Otherwise, that massive link step will be done using only one thread.

There's a cost of greatly increased memory usage, although not above what a
typical parallel compilation would take. You just lose the option of doing
many one-threaded builds in parallel unless your memory is insane. This
affects only Debian buildds rather than actual developers, though.

Even worse, for some strange reason, GCC folks decided to do the compilation
_twice_. The -c invocation will go through the front-end and store the
gimple tree in the .o file, but then it will proceed with useless
compilation and add actual code into .o as well. During the link step, the
code is thrown away and the gimple trees are compiled again.

As far as I know, the only rationale for that is so if you forget to specify
-flto during link you still end up with a slow but working executable.
IMHO, it'd be so much better to throw a fatal error in that case: if the
user asked for LTO, he should be notified that something is wrong. No
backward compatibility is lost since old code won't have -flto.

With that double compilation misfeature, build time is roughly doubled:

make -j6 +gcc -O3
real 1m11.545s
user 5m55.626s
sys 0m18.689s

make -j6 +gcc -O3 -flto=jobserver
real 2m15.786s
user 10m16.195s
sys 0m26.114s
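
As a quick check of the "roughly doubled" figure, the following Python sketch converts the timings above to seconds and computes the ratio between the two runs. The helper name parse_time is made up for this example; the numbers are copied from the two runs above.

```python
import re


def parse_time(text: str) -> float:
    """Convert a `time`-style duration such as '1m11.545s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Timings copied from the two runs quoted above.
without_lto = {"real": "1m11.545s", "user": "5m55.626s", "sys": "0m18.689s"}
with_lto = {"real": "2m15.786s", "user": "10m16.195s", "sys": "0m26.114s"}

# Ratio of the LTO build to the plain -O3 build, per time category.
ratios = {
    kind: parse_time(with_lto[kind]) / parse_time(without_lto[kind])
    for kind in without_lto
}

if __name__ == "__main__":
    for kind, ratio in ratios.items():
        print(f"{kind}: {ratio:.2f}x")
```

Wall-clock time grows by a factor of about 1.9 and user CPU time by about 1.7, which is consistent with compiling everything twice.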

Speed gains for compiled executables are great, though: around 20%[2].

> I already found some smaller problems related to weird asm usage in some PIC
> library code [6], but I doubt that this is a big showstopper, and I expect it
> will be fixed soon(tm).

There are some bugs, too. LTO was utterly useless in gcc-4.5, throwing an
ICE for anything slightly more complex than "hello world"[1]. It works well in
gcc-4.6, good enough for production usage as long as you're prepared to deal
with the occasional bug.

> It could also be interesting for large projects like Iceweasel,
> LibreOffice, ...

If the buildds can handle them. It is sad that architectures that care
about code speed and size the most would be unable to use LTO builds because
of 1-core Pentium3-equivalent speed buildds with 256MB ram when there's a
349823492357-way amd64 machine standing idle next to them. An amd64->armel
build takes as long as an amd64->amd64 one, while armel->armel is a matter
of eight hours on the code above without LTO and would keep swapping until
the heat death of the machine with LTO.

Thus, before enabling LTO on anything large, you'd have to ensure the
buildds have enough ram to handle that...


[1]. As proven by exhaustive research by testing on a random 16 kLOC C
project, a 330 kLOC C++ one and an 8 kLOC C library.

[2]. It depends on the code in question, of course. Something with a good
locality of calls will see no gain, something that jumps between different
source files in an intense area can see more. Not needing to cram
everything into a single file can increase readability...

--
1KB // Microsoft corollary to Hanlon's razor:
// Never attribute to stupidity what can be
// adequately explained by malice.


--
Archive: http://lists.debian.org/20110605231529.GA9387@angband.pl
 
06-06-2011, 02:21 AM
brian m. carlson

Link-time optimization in debian packages

On Mon, Jun 06, 2011 at 01:15:29AM +0200, Adam Borowski wrote:
> Speed gains for compiled executables are great, though: around 20%[2].

It depends. I have code where using -flto causes no significant
improvement (< 2%) in some cases and major performance losses (-7 to
-37%) in others. This is not something that is a boon in every case and
certainly not something that should automatically be used project-wide.

--
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
 
06-06-2011, 07:01 AM
Bernhard R. Link

Link-time optimization in debian packages

* Adam Borowski <kilobyte@angband.pl> [110606 01:15]:
> On Sun, Jun 05, 2011 at 01:18:43PM +0200, Emil Langrock wrote:
> > I played around a little bit with GCC's LTO [4]. It is really impressive for
> > this kind of application. I saw a size reduction and a speed increase with the
> > tested applications. Of course, it was just a small test set and not really
> > scientific.
> >
> > Link-time optimization changes the meaning of flags slightly [5]. It is
> > currently necessary to also provide the optimization-related flags from
> > CFLAGS/CXXFLAGS in LDFLAGS. Otherwise LTO will not really do an
> > optimization step.
>
> > My question is now whether there are already plans to use LTO in Debian
> > packages, or any larger Debian-related studies, policies, release goals, ...?
>
> I'm afraid it's not that simple. Every package has to be changed on
> its own. For a systemic solution it might be better to talk to the autotools
> folks and their competition.

autotools passes CFLAGS/CXXFLAGS to the link step as well, because in the
past linkers often needed -fPIC or -g to produce the right results.
Only preprocessor flags are not passed; that's why -I and -D belong in
CPPFLAGS and not in CFLAGS/CXXFLAGS.
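
The difference can be illustrated with a small Python sketch that assembles a compile command and a link command from the usual variables. It is only loosely modelled on the rules autotools generates; real generated rules involve more variables and may go through libtool, and the function names here are hypothetical.

```python
import shlex


def compile_command(env: dict, source: str) -> list:
    """Compile step: preprocessor flags and compiler flags are both used."""
    return [
        env["CC"],
        *shlex.split(env.get("CPPFLAGS", "")),
        *shlex.split(env.get("CFLAGS", "")),
        "-c",
        source,
    ]


def link_command(env: dict, objects: list, output: str) -> list:
    """Link step: CFLAGS are repeated before LDFLAGS, CPPFLAGS are left out."""
    return [
        env["CC"],
        *shlex.split(env.get("CFLAGS", "")),
        *shlex.split(env.get("LDFLAGS", "")),
        "-o",
        output,
        *objects,
    ]


if __name__ == "__main__":
    env = {
        "CC": "gcc",
        "CPPFLAGS": "-DNDEBUG -Iinclude",
        "CFLAGS": "-O2 -flto",
        "LDFLAGS": "-Wl,-z,relro",
    }
    print(" ".join(compile_command(env, "a.c")))
    print(" ".join(link_command(env, ["a.o", "b.o"], "prog")))
```

In this model, putting -O2 -flto into CFLAGS is enough for both flags to reach the link step, while -D and -I options in CPPFLAGS only appear on the compile line.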

Bernhard R. Link


--
Archive: http://lists.debian.org/20110606070141.GA5818@pcpool00.mathematik.uni-freiburg.de
 
All times are GMT.