03-09-2009, 12:07 AM
Samuel Thibault

#518696 ITP: parallel -- build and execute command lines from standard input in parallel

Package: wnpp
Version: N/A; reported 2009-03-08
Severity: wishlist

* Package name    : parallel
  Version         : 20090218
  Upstream Author : Ole Tange
* URL             : https://savannah.nongnu.org/projects/parallel/
* License         : GPLv3
  Description     : build and execute command lines from standard input in parallel
 For each line of input parallel will execute command with the line
 as arguments. If no command is given the line of input is executed.
 parallel can often be used as a substitute for xargs or cat | sh.


--
To UNSUBSCRIBE, email to debian-devel-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
 
03-09-2009, 09:25 AM
Andreas Rottmann

Samuel Thibault <samuel.thibault@ens-lyon.org> writes:

> Package: wnpp
> Version: N/A; reported 2009-03-08
> Severity: wishlist
>
> * Package name    : parallel
>   Version         : 20090218
>   Upstream Author : Ole Tange
> * URL             : https://savannah.nongnu.org/projects/parallel/
> * License         : GPLv3
>   Description     : build and execute command lines from standard input in parallel
>  For each line of input parallel will execute command with the line
>  as arguments. If no command is given the line of input is executed.
>  parallel can often be used as a substitute for xargs or cat | sh.
>
Did you know about the `-P' option of GNU xargs? IIUC, it does much the
same thing -- what does 'parallel' offer beyond that functionality?

From xargs(1):

  --max-procs=max-procs
  -P max-procs
      Run up to max-procs processes at a time; the default is 1.
      If max-procs is 0, xargs will run as many processes as
      possible at a time. Use the -n option with -P; otherwise
      chances are that only one exec will be done.
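
As a minimal sketch of the -P usage described in this excerpt: the function below feeds eight items to xargs, one item per job (-n 1), with at most four jobs at a time (-P 4). It assumes GNU xargs; the function name and the placeholder job are illustrative only and do not come from the thread.

```shell
# Sketch only: run one job per input item, at most 4 jobs at a time.
# Assumes GNU xargs; run_parallel_demo is a hypothetical name.
run_parallel_demo() {
    printf '%s\n' 1 2 3 4 5 6 7 8 |
        xargs -n 1 -P 4 sh -c 'echo "done $1"' sh
}
```

Calling run_parallel_demo prints eight "done N" lines. Their order is not guaranteed, since the jobs run concurrently.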

Regards, Rotty


 
03-09-2009, 09:40 AM
Samuel Thibault

clone 518696 -1
reassign -1 findutils
retitle -1 Add "parallel" somewhere in the description of -P
thanks

Andreas Rottmann, le Mon 09 Mar 2009 11:25:11 +0100, a écrit :
> Did you know about the `-P' option of GNU xargs?

Herm, I would have found it if the manpage didn't lack keywords like
"parallel", "simultaneous", ... Reassigning.

That being said, I guess xargs lacks one parallel feature:

-g  Group output. Output from each job is grouped together and is only
    printed when the command is finished. STDERR first followed by
    STDOUT. -g is the default. Can be reversed with -u.

A lot of applications (including md5sum) would not necessarily print
their output atomically and then you get mixed output. Either we add
the option to findutils, or we package parallel.

Samuel


 
03-09-2009, 04:19 PM
Andreas Rottmann

Samuel Thibault <samuel.thibault@ens-lyon.org> writes:

> Andreas Rottmann, le Mon 09 Mar 2009 11:25:11 +0100, a écrit :
>> Did you know about the `-P' option of GNU xargs?
>
> Herm, I would have found it if the manpage didn't lack keywords like
> "parallel", "simultaneous", ... Reassigning.
>
> That being said, I guess xargs lacks one parallel feature:
>
> -g  Group output. Output from each job is grouped together and is only
>     printed when the command is finished. STDERR first followed by
>     STDOUT. -g is the default. Can be reversed with -u.
>
> A lot of applications (including md5sum) would not necessarily print
> their output atomically and then you get mixed output. Either we add
> the option to findutils, or we package parallel.
>
Indeed, that's a very valuable (if not essential) feature when the
commands produce output. I've attached a script that can be used to
verify that "xargs -P" does not do this; it can be used like this:

xargs -P 5 ./test.sh < /some/text/file
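
The attached script is not preserved in this archive. As a hypothetical stand-in (not the original test.sh), a check of this kind could emit each argument's output in two separate writes with a pause in between, so that concurrent jobs get a chance to interleave. The function name below is an assumption made for illustration.

```shell
# Hypothetical stand-in for the missing attachment, not the original script.
# Each argument is printed in two separate writes with a pause in between,
# which leaves room for output from concurrent jobs to get mixed in.
emit_in_two_writes() {
    for arg in "$@"; do
        printf 'begin %s ... ' "$arg"
        sleep 1
        printf 'end %s\n' "$arg"
    done
}
```

To use it as a script, save the function in an executable file followed by a final line reading emit_in_two_writes "$@". With several jobs running under xargs -n 1 -P 5, a "begin" from one job may then be followed by output from another job before the matching "end" appears.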


Regards, Rotty
 
03-09-2009, 08:53 PM
Samuel Thibault

Chuan-kai Lin, le Mon 09 Mar 2009 12:46:35 -0700, a écrit :
> On Mon, Mar 09, 2009 at 11:40:51AM +0100, Samuel Thibault wrote:
> > A lot of applications (including md5sum) would not necessarily print
> > their output atomically and then you get mixed output. Either we add
> > the option to findutils, or we package parallel.
>
> It appears to me that you can get the same functionality by using xargs
> with an adapted version of annotate-output(1) which is a part of
> devscripts. Are there other reasons to use parallel?

Upstream author would say that xargs also misses

-c  Line is a command. The input line contains more than one
    argument or the input line needs to be evaluated by the shell.
    This is the default if command is not set. Can be reversed
    with -f.

which makes parallel not take a command but execute commands from
stdin instead. That can however be obtained with xargs sh -c. Another
option that xargs misses is

-j +N  Add N to the number of CPUs. Run this many jobs in parallel.
       For compute intensive jobs -j +0 is useful as it will run
       number-of-cpus jobs in parallel.

-j -N  Subtract N from the number of CPUs. Run this many jobs in
       parallel. If the evaluated number is less than 1 then 1 will
       be used.

-j N%  Multiply N% with the number of CPUs. Run this many jobs in
       parallel. If the evaluated number is less than 1 then 1 will
       be used.

In particular, -j +0 is really useful. I would personally be happy if
xargs were just able to treat e.g. -P -1 as "run as many processes as
there are processors".
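
Both points can be approximated with existing tools. The sketch below runs every input line as a shell command via xargs sh -c and uses one job per processor by default. It assumes GNU xargs (for -r, -d and -P) and the coreutils nproc command; the function name is hypothetical.

```shell
# Sketch only: execute each line of stdin as a shell command, in parallel.
# Assumes GNU xargs and coreutils nproc; run_lines_in_parallel is a
# hypothetical name. The optional first argument overrides the job count.
run_lines_in_parallel() {
    # -d '\n' : one item per input line, no quote processing by xargs
    # -n 1    : one line per sh invocation, passed as the -c command string
    # -r      : do nothing on empty input
    xargs -r -d '\n' -n 1 -P "${1:-$(nproc)}" sh -c
}
```

For example, printf '%s\n' 'echo a b' 'echo c | tr c d' | run_lines_in_parallel runs both lines through the shell. This still lacks the output grouping discussed earlier in the thread.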

Samuel


 
03-09-2009, 08:57 PM
Samuel Thibault

Chuan-kai Lin, le Mon 09 Mar 2009 12:46:35 -0700, a écrit :
> On Mon, Mar 09, 2009 at 11:40:51AM +0100, Samuel Thibault wrote:
> > A lot of applications (including md5sum) would not necessarily print
> > their output atomically and then you get mixed output. Either we add
> > the option to findutils, or we package parallel.
>
> It appears to me that you can get the same functionality by using xargs
> with an adapted version of annotate-output(1) which is a part of
> devscripts.

I thought at first "it's not particularly convenient", then "well, so
what". Now I'm thinking "Mmm, but people won't know they should do it
and blame xargs for being broken". Also annotate-output is not enough
when programs e.g. output Packages entries, which not only should be
line-atomic, but also paragraph-atomic...

Samuel


 
03-09-2009, 10:51 PM
Philip Charles

Phil's in hospital. Will reply towards end of week.



On Tuesday 10 March 2009, Chuan-kai Lin wrote:
> On Mon, Mar 09, 2009 at 10:57:57PM +0100, Samuel Thibault wrote:
> > I thought at first "it's not particularly convenient", then "well, so
> > what". Now I'm thinking "Mmm, but people won't know they should do
> > it and blame xargs for being broken". Also annotate-output is not
> > enough when programs e.g. output Packages entries, which not only
> > should be line-atomic, but also paragraph-atomic...
>
> Below is what I had in mind when I mentioned adapting annotate-output
> to a different "atomic-output" script. This script is useful not just
> with "xargs -P", but also with "make -j" and with standard background
> jobs (shell & operator), all of which produce mixed output.
>
> Similarly, about matching the number of parallel jobs with the number
> of processors/cores, we can write a script "ncpus" which returns the
> number of processors/cores/hyper-threads. You can use the ncpus script
> with xargs, with make, or with my new project mdm (mdm.berlios.de)...
>
> I consider separating these concerns (output management, processor
> thread detection) into small, separate, and reusable scripts a cleaner
> solution. Of course, doing it this way requires some user education,
> so a few manpage updates (for example, adding atomic-output and ncpus
> to the SEE ALSO section of xargs) may be in order.
>
> ----------
>
> #! /bin/bash
> # Display stdout and stderr output after program termination
> # Adapted from annotate-output by Chuan-kai Lin
> # Original annotate-output author info and copyright notice as follows
>
> # this script was downloaded from:
> # http://jeroen.a-eskwadraat.nl/sw/annotate
> # and is part of devscripts 2.10.46
>
> # Executes a program annotating the output linewise with time and stream
> # Version 1.2
>
> # Copyright 2003, 2004 Jeroen van Wolffelaar <jeroen@wolffelaar.nl>
>
> # This program is free software; you can redistribute it and/or modify
> # it under the terms of the GNU General Public License as published by
> # the Free Software Foundation; version 2 of the License
> #
> # This program is distributed in the hope that it will be useful,
> # but WITHOUT ANY WARRANTY; without even the implied warranty of
> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> # GNU General Public License for more details.
> #
> # You should have received a copy of the GNU General Public License
> # along with this program; if not, write to the Free Software
> # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
>
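> OUT=$(mktemp /tmp/atomic.XXXXXX) || exit 1
> ERR=$(mktemp /tmp/atomic.XXXXXX) || { rm -f "$OUT"; exit 1; }
>
> # Buffer both streams in temporary files while the command runs
> echo "------ $(date +%H:%M:%S) Started $*" > "$ERR"
> echo "------ STDERR" >> "$ERR"
> echo "------ STDOUT" >> "$OUT"
> "$@" >> "$OUT" 2>> "$ERR" ; EXIT=$?
>
> # Print the buffered output after termination: STDERR first, then STDOUT
> cat "$ERR"
> cat "$OUT"
> echo "------ $(date +%H:%M:%S) Finished with exitcode $EXIT"
> rm -f "$OUT" "$ERR"
>
> exit $EXIT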
>
> --
> Chuan-kai Lin
> http://web.cecs.pdx.edu/~cklin/
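
The "ncpus" helper proposed in the quoted message is not included in the thread. A minimal sketch of what such a helper might look like follows; the fallback order is an assumption, and it relies on nproc (coreutils), getconf, or a Linux-style /proc/cpuinfo being available.

```shell
# Hypothetical ncpus helper, not taken from the thread: print the number
# of online processors using whichever of these sources is available.
ncpus() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    elif getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
        getconf _NPROCESSORS_ONLN
    else
        # Linux-specific fallback: count "processor" entries
        grep -c '^processor' /proc/cpuinfo
    fi
}
```

Assuming the quoted script is installed on the PATH as atomic-output, the two pieces could be combined as xargs -n 1 -P "$(ncpus)" atomic-output md5sum, or the helper could be used on its own as make -j "$(ncpus)".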



--
Philip Charles; 39a Paterson Street, Abbotsford, Dunedin, New Zealand
+64 3 488 2818 Fax +64 3 488 2875 Mobile 027 663 4453
philipc@copyleft.co.nz - personal. info@copyleft.co.nz - business


 
03-11-2009, 03:34 PM
Samuel Thibault

Ole Tange, le Wed 11 Mar 2009 17:05:34 +0100, a écrit :
> One of my friends alerted me to your discussion of 'parallel' and whether
> other tools can replace it.

The question could also be rephrased: can't we just extend xargs to
support what parallel does? Having two separate tools will always
lead to arguments about "A does this, B doesn't" and vice versa, while
xargs could just do everything.

Samuel

