Linux Archive > Gentoo > Gentoo Portage Developer

07-26-2008, 06:48 PM
Duncan

portage-2.2-rc3 parallel merges quit being parallel

So I'm running the 2.2-rcs and have been seeing blogs about the new
parallel merge capacities... Having a dual-dual-core Opteron and having
run multiple merges manually for some time, this is VERY welcome news.
=8^)

So after upgrading to -rc3 I set EMERGE_DEFAULT_OPTS to include

"--jobs=10 --keep-going --load-average=15"

and tried a few small merges (the rest of the world update). It worked
nicely.

So then, as I had adjusted LDFLAGS recently and hadn't yet done a full
world remerge, I decided to try the BIG test, emerge --emptytree --world,
with some 674 packages.

For the first 100 or so packages, it worked quite well. However, about
there, maybe package 120 or so, so about 20% of the way thru, it reverted
to doing them one-at-a-time again. I'm now on package #279 and it's
still doing them only one at a time, and this has included stuff like the
xorg-docs (IIRC) package, that really shouldn't be pre-deps for stuff
that has come since, so /something/ else should have been trying to
merge, as long as load-average stays low, as it has much of the time (I
have MAKEOPTS="-j -l20" so it's not going to be low all the time).

Is this a known issue still being worked on, perhaps due to the limits of
the package dependency and scheduling resolution such that it can't
handle a full world remerge and defaults back to one-at-a-time after it
reaches the extent of its mapping, or is this a new bug?

Also, a little monitoring utility that could be run in another terminal
and just list and update all the currently merging packages, and any that
had failed to merge, so I could take a look at them while the system is
still working on the rest, would be quite useful. I tried watch ls -d
$PORTAGE_TMPDIR/*/* and it starts to work, but of course starts to error
in a few seconds when the first package completes since the *s are
resolved initially. With a bit of work I'm sure I can get something simple
working, but it'd be nice if there were a pre-made utility to do it.
Maybe there is and I just don't know about it yet? =8^)

Finally... I was rather confused the first time when, at just one job, an
install took a bit, as that's apparently not counted as "running", so it
appeared nothing was going on for a bit. Maybe an "installing" count as
well would be useful... and prevent that confusion.

Thanks, guys. It's already fun playing with! =8^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
 
07-26-2008, 09:56 PM
Andrew Gaffney

Duncan wrote:
> "--jobs=10 --keep-going --load-average=15"

For a dual-dual-core setup, a load average of 4.0 is "fully loaded". Anything
higher than that and you're just causing jobs to queue up unnecessarily and your
system to "thrash".

> have MAKEOPTS="-j -l20" so it's not going to be low all the time).

Same thing here. Also, why would you specify different --load-average values in
these two places?


--
Andrew Gaffney http://dev.gentoo.org/~agaffney/
Gentoo Linux Developer Catalyst/Installer + x86 release coordinator
 
07-26-2008, 11:31 PM
Duncan

Andrew Gaffney <agaffney@gentoo.org> posted 488B9D84.4050209@gentoo.org,
excerpted below, on Sat, 26 Jul 2008 16:56:20 -0500:

> Duncan wrote:
>> "--jobs=10 --keep-going --load-average=15"
>
> For a dual-dual-core setup, a load average of 4.0 is "fully loaded".
> Anything higher than that and you're just causing jobs to queue up
> unnecessarily and your system to "thrash".

Not really. The shortest-interval load average the system provides is the
one-minute one, right? Because it lags, from my experience there are times
when the system just sits there doing nothing: no I/O, CPU graphs low, but
the load average is still high, so it won't start any more jobs.

I see gains at times up to ~4 jobs per core (tho it's arguably possible
they're counteracted above say, 3/core, by extra shuffling, I won't argue
that and haven't checked that closely, I just don't like to see blank
spots in the CPU utilization that aren't accounted for by I/O). I just
boosted it to five so I could do the below....

>> have MAKEOPTS="-j -l20" so it's not going to be low all the time).
>
> Same thing here. Also, why would you specify different --load-average
> values in these two places?

The idea here is to create a differential, so it doesn't start new builds
when a single build can adequately parallelize. I'm building in tmpfs,
which is (limited quantity, 8 gigs, but...) memory, and if a single
emerge is keeping the system sufficiently busy, no reason to add a new
one to the pile. However, when a single merge isn't keeping the system
busy, /then/ add more. The differential at least in theory should favour
single packages when they can provide sufficient parallelization.
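
To make the differential concrete, here is a minimal Python sketch of the
reasoning. It assumes that make and emerge each decline to start additional
work once the load average reaches their own limit (a simplification: both
only hold back while other jobs are already running). The function name and
the sample loads are made up for illustration.

```python
def may_start(load, make_limit=20.0, emerge_limit=15.0):
    """Report which layer is still allowed to add work at a given load.

    make_limit mirrors MAKEOPTS -l20; emerge_limit mirrors
    emerge --load-average=15. Both tools are assumed to hold back
    new work once the load average is at or above their limit.
    """
    return {
        "make_job": load < make_limit,        # more compile jobs in a running build
        "new_package": load < emerge_limit,   # a further package build
    }


if __name__ == "__main__":
    for load in (5.0, 17.0, 22.0):
        print(load, may_start(load))
```

With these limits, a load between 15 and 20 lets the builds already running
keep fanning out, while no additional package is started.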

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
 
07-27-2008, 12:00 AM
Zac Medico

Duncan wrote:
> For the first 100 or so packages, it worked quite well. However, about
> there, maybe package 120 or so, so about 20% of the way thru, it reverted
> to doing them one-at-a-time again. I'm now on package #279 and it's
> still doing them only one at a time, and this has included stuff like the
> xorg-docs (IIRC) package, that really shouldn't be pre-deps for stuff
> that has come since, so /something/ else should have been trying to
> merge, as long as load-average stays low, as it has much of the time (I
> have MAKEOPTS="-j -l20" so it's not going to be low all the time).
>
> Is this a known issue still being worked on, perhaps due to the limits of
> the package dependency and scheduling resolution such that it can't
> handle a full world remerge and defaults back to one-at-a-time after it
> reaches the extent of its mapping, or is this a new bug?

The current algorithm is intentionally as conservative as possible
in the sense that it will not build a given package in parallel if
there are any packages in its subgraph of deep dependencies
scheduled to be merged. We can add one or more options to control
the criteria for choosing packages. Those options will modify the
behavior of Scheduler._choose_pkg().

There are a couple of reasons for the current conservative
behavior. In many cases the conservative behavior is beneficial
(avoids breakage) in order to ensure that a package's subgraph of
deep dependencies is up to date before building the package itself.
In addition, the conservative behavior lessens the need to be
concerned about "invariance" requirements like those discussed in
bug 232990 [1].
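
The conservative rule can be illustrated with a small Python sketch. This is
not the code in Scheduler._choose_pkg(); the function names, the plain dict
used as a dependency graph, and the package names in the usage note are all
hypothetical. It only shows the criterion: a queued package is eligible when
nothing in its deep dependency subgraph is still waiting to be merged.

```python
def deep_dependencies(pkg, deps):
    """Return every package reachable from pkg in the dependency graph."""
    seen = set()
    stack = list(deps.get(pkg, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(deps.get(node, ()))
    return seen


def choose_pkg(queue, deps, pending):
    """Pick the first queued package with no pending deep dependency.

    queue   -- packages waiting to start, in merge order
    deps    -- dict mapping a package to its direct dependencies
    pending -- set of packages scheduled but not yet merged
    Returns None when every queued package must wait.
    """
    for pkg in queue:
        blockers = deep_dependencies(pkg, deps) & (pending - {pkg})
        if not blockers:
            return pkg
    return None
```

For example, with deps = {"app": ["lib"], "lib": ["libc"]} and "lib" still
pending, choose_pkg(["app"], deps, {"app", "lib"}) returns None, so "app" has
to wait even if a job slot is free. In a full --emptytree run most packages
have some pending deep dependency for a long time, which is consistent with
the one-at-a-time behavior reported above.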

> Also, a little monitoring utility that could be run in another terminal
> and just list and update all the currently merging packages, and any that
> had failed to merge, so I could take a look at them while the system is
> still working on the rest, would be quite useful. I tried watch ls -d
> $PORTAGE_TMPDIR/*/* and it starts to work, but of course starts to error
> in a few seconds when the first package completes since the *s are
> resolved initially. With a bit of work I'm sure I can get something simple
> working, but it'd be nice if there were a pre-made utility to do it.
> Maybe there is and I just don't know about it yet? =8^)

I'm not aware of any tool like that yet. Such a tool should probably
use the hidden lock file located in the parent directory of a build
directory in order to detect an active build. In the future I plan
to have a daemon process to allow cooperation between multiple
emerge invocations, for things like bug 177311 [2]. Once that's
implemented, there will be a way to query the daemon for information
about running builds.
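
Until such a daemon exists, a lock-file based monitor might look roughly like
the Python sketch below. It rests on assumptions: that build directories live
under $PORTAGE_TMPDIR/portage/<category>/<package>, and that the hidden lock
file is named .<package>.portage_lockfile next to the build directory. A
leftover lock file can outlive its build, so treat the result as a hint
rather than proof of an active build. The function name is made up.

```python
import glob
import os

LOCK_SUFFIX = ".portage_lockfile"  # assumed lock file naming


def active_builds(build_root):
    """List category/package names that have a hidden lock file.

    build_root is the directory holding the category directories,
    e.g. $PORTAGE_TMPDIR/portage. The glob is re-evaluated on every
    call, so finished builds drop out and new ones appear.
    """
    pattern = os.path.join(build_root, "*", ".*" + LOCK_SUFFIX)
    builds = []
    for lock_path in glob.glob(pattern):
        category = os.path.basename(os.path.dirname(lock_path))
        # Strip the leading dot and the lock suffix to recover the package name.
        name = os.path.basename(lock_path)[1:-len(LOCK_SUFFIX)]
        builds.append(category + "/" + name)
    return sorted(builds)


if __name__ == "__main__":
    import time

    root = os.path.join(os.environ.get("PORTAGE_TMPDIR", "/var/tmp"), "portage")
    while True:
        print("\n".join(active_builds(root)) or "(no lock files found)")
        print("-" * 40)
        time.sleep(2)
```

Because the directory is rescanned on each pass, this avoids the problem with
watch ls -d $PORTAGE_TMPDIR/*/*, where the shell expands the wildcards once
before watch starts.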

> Finally... I was rather confused the first time at just one job an
> install took a bit, as that's apparently not counted as "running", so it
> appeared nothing was going on for a bit. Maybe an "installing" count as
> well would be useful... and prevent that confusion.

There used to be a "merges" count in the status display but somebody
thought it was confusing (darkside/Jeremy Olexa) and I decided that
it wasn't interesting enough to be worthy of its display space so I
removed it. I guess we can add it back if there's space and demand
for it. Maybe it should only be shown when the job count drops to zero?

Zac

[1] http://bugs.gentoo.org/show_bug.cgi?id=232990
[2] http://bugs.gentoo.org/show_bug.cgi?id=177311
 
07-27-2008, 01:32 AM
Marius Mauch

On Sat, 26 Jul 2008 16:56:20 -0500
Andrew Gaffney <agaffney@gentoo.org> wrote:

> Duncan wrote:
> > "--jobs=10 --keep-going --load-average=15"
>
> For a dual-dual-core setup, a load average of 4.0 is "fully loaded".

Only in ideal cases, when you have long-running processes
hammering the CPU and few or no context switches. This isn't the case
with builds: the actual compile processes that need CPU time usually
terminate very quickly, which increases the load average.
I did some benchmarking a while ago with different combinations of -j
and -l in MAKEOPTS, using the kernel as testcase, and IIRC the fastest
builds were around -l6.0 (on a dual-core system) and high (or
unlimited) values for -j (sidenote here: some ebuilds like openoffice
parse MAKEOPTS to get parameters for their own build systems, but only
recognize/support -j and ignore -l, so one should still be careful with
the value for -j).

Marius
 
07-27-2008, 08:59 AM
Duncan

Marius Mauch <genone@gentoo.org> posted
20080727033210.f3441ab3.genone@gentoo.org, excerpted below, on Sun, 27
Jul 2008 03:32:10 +0200:

> I did some benchmarking a while ago with different combinations of -j
> and -l in MAKEOPTS, using the kernel as testcase, and IIRC the fastest
> builds were around -l6.0 (on a dual-core system) and high (or unlimited)
> values for -j

FWIW... My experience suggests that there's some sort of race condition
with the job count. I was getting occasional very irritating errors to
the effect of "Job count is 17, should be 16" (or possibly the reverse),
that would terminate the build. Rerunning it wouldn't trigger the
problem again. By setting unlimited jobs (-j with no appended number), I
avoided whatever it was, or at least haven't seen it since.

I don't know enough about make's parallel job processing (OK, I simply
know that it normally works, which makes it irritating when it doesn't)
to have the foggiest what that was all about, except to assume it had to
be some sort of race condition on the job counter.

Has anyone else seen similar?

Anyway, that's why I ultimately ended up with -j -lX, using the -lX part
to do the limiting. With the previous simple job-at-a-time emerging, the
few cases where -lX wasn't honored and therefore wasn't limiting weren't a
problem. Of course, there's some adjusting to be done now. =8^) But
it's for a good reason! =8^) (I've already decided that --jobs=10 is
going to need changing. I'm intuiting it needs to go up, but it's possible
I need to lower it. More experimentation is necessary! =8^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
 
07-27-2008, 08:31 PM
René 'Necoro' Neumann

Duncan wrote:
> Also, a little monitoring utility that could be run in another terminal
> and just list and update all the currently merging packages, and any that
> had failed to merge, so I could take a look at them while the system is
> still working on the rest, would be quite useful.
A very basic thingy:
watch qlop -cC

qlop is part of portage-utils. And it seems to only work correctly here,
when run as root
 
07-28-2008, 08:10 AM
Duncan

René 'Necoro' Neumann <lists@necoro.eu> posted 488CDB3B.6050604@necoro.eu,
excerpted below, on Sun, 27 Jul 2008 22:31:55 +0200:

> Duncan wrote:
>> Also, a little monitoring utility that could be run in another terminal
>> and just list and update all the currently merging packages, and any
>> that had failed to merge, so I could take a look at them while the
>> system is still working on the rest, would be quite useful.
> A very basic thingy:
> watch qlop -cC
>
> qlop is part of portage-utils. And it seems to only work correctly here,
> when run as root


Thanks! =8^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
 
04-09-2010, 03:35 PM
Duncan

Followup to a rather old post...

Zac Medico posted on Sat, 26 Jul 2008 17:00:18 -0700 as excerpted:

> Duncan wrote:
>> For the first 100 or so packages, it worked quite well. However, about
>> there, maybe package 120 or so, so about 20% of the way thru, it
>> reverted to doing them one-at-a-time again.

FWIW, that has long since been fixed. =:^)

>> Finally... I was rather confused the first time at just one job an
>> install took a bit, as that's apparently not counted as "running", so
>> it appeared nothing was going on for a bit. Maybe an "installing"
>> count as well would be useful... and prevent that confusion.
>
> There used to be a "merges" count in the status display but somebody
> thought it was confusing (darkside/Jeremy Olexa) and I decided that it
> wasn't interesting enough to be worthy of its display space so I
> removed it. I guess we can add it back if there's space and demand for
> it. Maybe it should only be shown when the job count drops to zero?

From my point of view, the current display (as of portage-2.2_rc67) is
still lacking/confusing in this regard. What I see happening here is that
a number of packages will be building, then finish building one at a time,
but the completed count doesn't go up. They seem to sit there built but
not installing for quite a while, then all install at once. But it's hard
to tell as there's not an "installing" count.

Now part of this may be due to the way I have jobs set up.

MAKEOPTS="-j13 -l10"

My emerge jobs, OTOH, are

"--jobs=4 --load-average=7"

(BTW, it would sure be nice if there was a make.conf variable for those,
similar to MAKEOPTS, and then a simple command-line option to toggle it on
and off. I don't put them in my default options as when I'm
troubleshooting a broken merge or am otherwise just merging a single
package, it's nice to be able to run non-parallel jobs and thus have the
output "live", but when I'm doing full updates, I want the parallel jobs.
The logical way to handle this would be to set the --jobs and
--load-average parameters in a var similar to MAKEOPTS, and then use a
command-line option to toggle parallel jobs mode on or off.)

Meanwhile...

The rationale behind having portage jobs lower than makeopts is that since
I have PORTAGE_TMPDIR pointed at a tmpfs, I want to favor currently
running package emerges over starting new ones, because every new package
started unpacks into that tmpfs, thereby using more memory. Thus, if it's
possible to run more parallel jobs on already started package merges, I
want it to do that instead of starting more packages. The way to do that
is to keep the number of --jobs and --load-average lower than the
corresponding MAKEOPTS parameters, so MAKEOPTS gets used first if possible,
and only if there are no more parallel jobs possible at that level does the
load average drop down far enough for another package to start merging
beside the currently merging ones.

But for whatever reason, portage seems to sit there doing nothing with the
already built and ready to install packages, preferring to start more
package builds rather than finish off the installs of the ones it already
has built. So I can end up with 7 or 8 packages sitting there built, but
apparently not installing, then it goes and installs them all at once!
That's contrary to my strategy of trying to favor already started
packages, and only starting new ones when the existing running ones can't
keep the load average up high enough.

But as I said it's hard to track that, since portage doesn't track the
current number of installings, only the number built. I can however infer
from the difference between the number of the last started job, the number
of completed jobs, and the number of running jobs, plus the per-package
installing notices, that a whole stack of packages are accumulating that
are already built, but are apparently not installing yet. Why isn't
portage going ahead and installing them, instead of starting new package
builds?

So the two issues (plus the one in parentheses above) I see are:

1) the parallel jobs display needs to say how many it's installing.

2) portage needs to follow thru on built packages and finish installing
them as a higher job priority than starting new package builds.
Particularly with --jobs=4, there's no reason portage should be starting a
new package build, when there's a whole stack of packages (often more than
the four the --jobs=4 implies should be the max) apparently sitting there
built and ready to install, but not actively installing, and apparently
not doing anything except sitting there taking up tmpfs memory!

3) Having an EJOBS or similar variable, parallel to the MAKEOPTS variable,
and then a simple parallel-toggle emerge command-line option, would be
quite useful. =:^)
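
As a rough illustration of item 3, a wrapper along the following lines could
approximate the idea today. EJOBS is purely hypothetical (Portage has no such
variable), and the function name and its parallel switch are made up for
this sketch; only the --jobs and --load-average options themselves are real
emerge options.

```python
import shlex


def build_emerge_command(args, environ, parallel=False):
    """Build an emerge argv, adding the EJOBS options only when asked.

    EJOBS is a hypothetical variable; it would hold options such as
    "--jobs=4 --load-average=7". With parallel=False the command is
    left untouched, so single-package runs keep their live output.
    """
    command = ["emerge"]
    if parallel:
        command.extend(shlex.split(environ.get("EJOBS", "")))
    command.extend(args)
    return command


if __name__ == "__main__":
    import os

    print(build_emerge_command(["--update", "--deep", "world"], os.environ, parallel=True))
```

Keeping the options in one variable and switching them with a single flag
means they stay out of EMERGE_DEFAULT_OPTS, which is the behavior asked for
above.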

Let me know if you'd prefer that I file bugs on these, and as always,
thanks, Zac, for being so responsive. You have a gift that's a real
rarity in user/dev relations, and some of us really do appreciate it! =:^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
 
