Linux Archive

Linux Archive (http://www.linux-archive.org/)
-   Debian User (http://www.linux-archive.org/debian-user/)
-   -   Sorting elements *and* knowing where each one has been put (http://www.linux-archive.org/debian-user/317256-sorting-elements-knowing-where-each-one-has-been-put.html)

Merciadri Luca 01-29-2010 10:08 PM

Sorting elements *and* knowing where each one has been put
 
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I have some numerical values, i.e. something like

==
value_1
value_2
.
.
.
value_n
==

There are many ways to sort them, but the `sort' command is clearly
appropriate.

The problem is that I need to know where value_i is, before, and
after, the sorting. It can be found easily by keeping in memory where
value_i is before the sorting (for every i, 1<=i<=n), and finding
where value_i is after the sorting, by looking in the sorted numbers
and finding value_i. However, this solution, despite being simple, is
clearly bad on an algorithmic point of view. The best way is
consequently to use the sorting function (which could be, e.g. a
Heapsort, or simply the `sort' command) to know where value_i is
displaced once it has been transformed by the sorting function.

However, I do not know how I can do this with the `sort' function. Is
it even possible? (I considered 1<=i<=n through the whole message.)

Thanks.
- --
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
- --

Don't bite the hand that feeds you.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iEYEARECAAYFAktjalAACgkQM0LLzLt8MhxhNwCgjeBPyoWCZO LSRQoVV9R4cI1y
auMAnjRwcjoQ4c7MLBaCtz12MPnVltHe
=NkLy
-----END PGP SIGNATURE-----


--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org

T o n g 01-29-2010 11:10 PM

Sorting elements *and* knowing where each one has been put
 
On Sat, 30 Jan 2010 00:08:00 +0100, Merciadri Luca wrote:

> The problem is that I need to know where value_i is, before, and after,
> the sorting.

Is this homework?

--
Tong (remove underscore(s) to reply)
http://xpt.sourceforge.net/techdocs/
http://xpt.sourceforge.net/tools/


--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org

Lee Winter 01-30-2010 12:48 AM

Sorting elements *and* knowing where each one has been put
 
On Fri, Jan 29, 2010 at 6:08 PM, Merciadri Luca
<Luca.Merciadri@student.ulg.ac.be> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi,
>
> I have some numerical values, i.e. something like
>
> ==
> value_1
> value_2
> .
> .
> .
> value_n
> ==
>
> There are many ways to sort them, but the `sort' command is clearly
> appropriate.
>
> The problem is that I need to know where value_i is, before, and
> after, the sorting. It can be found easily by keeping in memory where
> value_i is before the sorting (for every i, 1<=i<=n), and finding
> where value_i is after the sorting, by looking in the sorted numbers
> and finding value_i. However, this solution, despite being simple, is
> clearly bad on an algorithmic point of view. The best way is
> consequently to use the sorting function (which could be, e.g. a
> Heapsort, or simply the `sort' command) to know where value_i is
> displaced once it has been transformed by the sorting function.
>
> However, I do not know how I can do this with the `sort' function. Is
> it even possible? (I considered 1<=i<=n through the whole message.)

In the discussion above you persist in mixing the sort function with
the sort command. They are not the same, so please be more careful in
your descriptions of what it is you want to accomplish.

To sort with a function, such as is found in many programming
languages, you should create a data structure containing the value to
be sorted (the "key") and any audit/trance information you need about
the position of each item prior to the sort. After sorting such data
structures with the language's sort() function you will be able to
identify where each value started and ended.

To sort with a command, such as is found in many command-line
envoronments, you should create lines of text, each containing the
value to be sorted (the "key") and followed by any audit/trace
information you need about the position of each item prior to the
sort. After sorting the text lines with the command-line
environment's text sorting utility you will be able to identify where
each value started and ended.

The above descriptions answer the question you asked. But it appears
to me that the answer you need is related to a different question.
What do you need to do with the values after they are sorted? If you
have both the initially ordered list and the sorted list, why do you
care where things started as long as they are where they need to be
when you are done?

-- Lee


--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org

"Boyd Stephen Smith Jr." 01-30-2010 12:53 AM

Sorting elements *and* knowing where each one has been put
 
On Friday 29 January 2010 17:08:00 Merciadri Luca wrote:
> I have some numerical values, i.e. something like
>
> ==
> value_1
> value_2
> .
> .
> .
> value_n
> ==
>
> There are many ways to sort them, but the `sort' command is clearly
> appropriate.
>
> The problem is that I need to know where value_i is, before, and
> after, the sorting. It can be found easily by keeping in memory where
> value_i is before the sorting (for every i, 1<=i<=n), and finding
> where value_i is after the sorting, by looking in the sorted numbers
> and finding value_i. However, this solution, despite being simple, is
> clearly bad on an algorithmic point of view. The best way is
> consequently to use the sorting function (which could be, e.g. a
> Heapsort, or simply the `sort' command) to know where value_i is
> displaced once it has been transformed by the sorting function.
>
> However, I do not know how I can do this with the `sort' function. Is
> it even possible? (I considered 1<=i<=n through the whole message.)

nl values | sort -k2 | nl | grep value_i

The first column is the new location, 1-based. The second column is the old
location, 1-based. If your values are numbers, you might use (awk '$3 ==
value_i') instead of the grep.

nl = O(n)
sort = O(n lg n)
grep = O(n)

The whole pipeline is therefore O(n lg n).
--
Boyd Stephen Smith Jr. ,= ,-_-. =.
bss@iguanasuicide.net ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-'
http://iguanasuicide.net/ \_/

Merciadri Luca 01-30-2010 10:08 AM

Sorting elements *and* knowing where each one has been put
 
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Lee Winter <lee.j.i.winter@gmail.com> writes:

> In the discussion above you persist in mixing the sort function with
> the sort command. They are not the same, so please be more careful in
> your descriptions of what it is you want to accomplish.
>
> To sort with a function, such as is found in many programming
> languages, you should create a data structure containing the value to
> be sorted (the "key") and any audit/trance information you need about
> the position of each item prior to the sort. After sorting such data
> structures with the language's sort() function you will be able to
> identify where each value started and ended.
>
> To sort with a command, such as is found in many command-line
> envoronments, you should create lines of text, each containing the
> value to be sorted (the "key") and followed by any audit/trace
> information you need about the position of each item prior to the
> sort. After sorting the text lines with the command-line
> environment's text sorting utility you will be able to identify where
> each value started and ended.
>
> The above descriptions answer the question you asked. But it appears
> to me that the answer you need is related to a different question.
> What do you need to do with the values after they are sorted? If you
> have both the initially ordered list and the sorted list, why do you
> care where things started as long as they are where they need to be
> when you are done?

Thanks for your answer, Lee.

Sorry if, after having read my text, you think that I am mixing the
sort function and the sort command. It was not my intention, as they
are different things, both to you, and to me.

I am here sorting with a _command_, this one being `sort'. Your idea
of appending to each key, e.g. a number, is great. I had not thought
about it. The question is now: how to append to lines some text? Okay,
generating numbers so that n_1 < n_2 < ... < n_n is not difficult, but
how can I redirect them to each line of a flux using bash?

The problem is not so complex. I have files with names which are only
constituted by numbers. Each file contains exactly one e-mail, but the
`Received' field of each e-mail is not linked to the name of the
file. For example, if I have three files, say `1' , `2' and `3', each
containing an e-mail with headers, displaying only the `Received'
field on each could give, for example:

File 1 -> Received 2 Jan 2009
File 2 -> Received 7 Sep 2005
File 3 -> Received 3 Feb 2009

Once I have sorted the dates, and as the global `cat *' takes account
of the order between numbers in the set of the integers (i.e. `cat *' is
equivalent to `cat 1' and `cat 2' iff the current folder only contains
`.', `..', `1' and `2'), I can then sort the dates. Before the
sorting, the n-th date is related to the n-th e-mail's headers'
`Received' field. After the sort, it may have completely no
link. Consequently, I need to know which e-mail was received first,
and consequently which date was moved at the beginning of the
list.

I hope I have been clear. I now need to be able to append increasing
number to each line of the flux _before_ the sorting. Do you have an
idea about how to implement this in bash?

Thanks.

- --
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
- --

Don't have too many irons in the fire.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iEYEARECAAYFAktkExoACgkQM0LLzLt8MhzhxwCeIIgQnuvbDf VRoLvJgmgkY3qn
6sUAn2Su5tf6iHQfvi43ZJWAebWml6lm
=W08e
-----END PGP SIGNATURE-----


--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org

Merciadri Luca 01-30-2010 10:13 AM

Sorting elements *and* knowing where each one has been put
 
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

"Boyd Stephen Smith Jr." <bss@iguanasuicide.net> writes:

> nl values | sort -k2 | nl | grep value_i
>
> The first column is the new location, 1-based. The second column is the old
> location, 1-based. If your values are numbers, you might use (awk '$3 ==
> value_i') instead of the grep.
>
> nl = O(n)
> sort = O(n lg n)
> grep = O(n)
>
> The whole pipeline is therefore O(n lg n).

Not a bad idea. Thanks for complexity calculus. Would not something like

==
cat file.txt | awk -F" " '{ print $0,$(
for ((i = 1; i < n + 1; i++))
do
echo $i
done
) }'
==
do the trick, file.txt being a text file such as

==
20 Jan 2010 17:39:42
.
.
.
==
(the syntax errors being not comprised)?

Thanks.

- --
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
- --

Don't take life too seriously; you'll never get out of it alive.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iEYEARECAAYFAktkFFUACgkQM0LLzLt8Mhww+QCgiM1ASGf7RR Im5lDJ8yYtEHTt
leUAn2YpIQ+TgWMFCTgDqciLlbWxvPe4
=HaIU
-----END PGP SIGNATURE-----


--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org

Clive Standbridge 01-30-2010 02:21 PM

Sorting elements *and* knowing where each one has been put
 
> > However, I do not know how I can do this with the `sort'
> > function. Is
> > it even possible? (I considered 1<=i<=n through the whole message.)
>
> nl values | sort -k2 | nl | grep value_i

If you want to sort by numeric order instead of alphabetic order,
you should replace
sort -k2
with
sort -k2n


PS I learnt a new command today: "nl" which is shorter than "cat -n" :-)


--
Cheers,
Clive


--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org

"Boyd Stephen Smith Jr." 01-30-2010 03:20 PM

Sorting elements *and* knowing where each one has been put
 
In <87mxzvg816.fsf@merciadriluca-station.MERCIADRILUCA>, Merciadri Luca wrote:
>"Boyd Stephen Smith Jr." <bss@iguanasuicide.net> writes:
>> nl values | sort -k2 | nl | grep value_i
>>
>> The first column is the new location, 1-based. The second column is the
>> old location, 1-based. If your values are numbers, you might use (awk '$3
>> == value_i') instead of the grep.
>>
>> nl = O(n)
>> sort = O(n lg n)
>> grep = O(n)
>>
>> The whole pipeline is therefore O(n lg n).
>
>Would not something like
>
>==
>cat file.txt | awk -F" " '{ print $0,$(
>for ((i = 1; i < n + 1; i++))
>do
>echo $i
>done
>) }'
>==
>do the trick, file.txt being a text file such as
>
>==
> 20 Jan 2010 17:39:42
> .
> .
> .
>==

Why don't you just test your code?

I don't think that awk program does what you want though.

My pipeline will work on any data values without whitespace. If you need to
handle whitespace, add -f "$(printf ' ')" to the sort command-line, which
should handle everything but tabs and newlines. If you also need to handle
tabs, you can probably use -k 2,9999 instead of -k2 on the sort command-line.
--
Boyd Stephen Smith Jr. ,= ,-_-. =.
bss@iguanasuicide.net ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-'
http://iguanasuicide.net/ \_/

Merciadri Luca 01-30-2010 03:40 PM

Sorting elements *and* knowing where each one has been put
 
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

"Boyd Stephen Smith Jr." <bss@iguanasuicide.net> writes:

> In <87mxzvg816.fsf@merciadriluca-station.MERCIADRILUCA>, Merciadri Luca wrote:
>>"Boyd Stephen Smith Jr." <bss@iguanasuicide.net> writes:
>>> nl values | sort -k2 | nl | grep value_i
>>>
>>> The first column is the new location, 1-based. The second column is the
>>> old location, 1-based. If your values are numbers, you might use (awk '$3
>>> == value_i') instead of the grep.
>>>
>>> nl = O(n)
>>> sort = O(n lg n)
>>> grep = O(n)
>>>
>>> The whole pipeline is therefore O(n lg n).
>>
>>Would not something like
>>
>>==
>>cat file.txt | awk -F" " '{ print $0,$(
>>for ((i = 1; i < n + 1; i++))
>>do
>>echo $i
>>done
>>) }'
>>==
>>do the trick, file.txt being a text file such as
>>
>>==
>> 20 Jan 2010 17:39:42
>> .
>> .
>> .
>>==
>
> Why don't you just test your code?
>
> I don't think that awk program does what you want though.
>
> My pipeline will work on any data values without whitespace. If you need to
> handle whitespace, add -f "$(printf ' ')" to the sort command-line, which
> should handle everything but tabs and newlines. If you also need to handle
> tabs, you can probably use -k 2,9999 instead of -k2 on the sort command-line.
You are right. Your solution is nice, effectively. Thanks.

- --
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
- --

Don't try to teach a pig to sing. It doesn't work, and you'll annoy
the pig.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iEYEARECAAYFAktkYOkACgkQM0LLzLt8MhwgAgCglMQJEdaQaE LlXUTomWXZ3+bq
Co0AnA+pUTc2Qp3xX3bp5hzvM4xomRKb
=T8PI
-----END PGP SIGNATURE-----


--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org

Merciadri Luca 01-30-2010 03:40 PM

Sorting elements *and* knowing where each one has been put
 
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Clive Standbridge <Clive.Standbridge@myriadgroup.com> writes:

>> > However, I do not know how I can do this with the `sort'
>> > function. Is
>> > it even possible? (I considered 1<=i<=n through the whole message.)
>>
>> nl values | sort -k2 | nl | grep value_i
>
> If you want to sort by numeric order instead of alphabetic order,
> you should replace
> sort -k2
> with
> sort -k2n
>
> PS I learnt a new command today: "nl" which is shorter than "cat -n" :-)
Thanks!

- --
Merciadri Luca
See http://www.student.montefiore.ulg.ac.be/~merciadri/
- --

The ends justify the means.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.8 <http://mailcrypt.sourceforge.net/>

iEYEARECAAYFAktkYP0ACgkQM0LLzLt8MhzrLACeOxeTAvYhY3 ioqk4o+MBUibmL
85sAmwb6aC6wagtjSr5E6WdU0QX83u2J
=qqci
-----END PGP SIGNATURE-----


--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org


All times are GMT. The time now is 09:42 AM.

VBulletin, Copyright ©2000 - 2014, Jelsoft Enterprises Ltd.
Content Relevant URLs by vBSEO ©2007, Crawlability, Inc.