Linux Archive

Camaleón 11-04-2010 09:55 PM

Locales/sort bug
 
On Thu, 04 Nov 2010 21:23:27 +0100, Rob Gom wrote:

> [cut]
>>
>> I'm also getting that behaviour (locale set to "es_ES.UTF-8") so I
>> understand that my locale setting dictates that "underscore" ("_")
>> sorts before the "comma" (",") symbol.
>>
>> As per "man sort" page:
>>
>> *** WARNING *** The locale specified by the environment affects sort
>> order. Set LC_ALL=C to get the traditional sort order that uses native
>> byte values.
>>
>> Do you think that is a bug? :-?
>
> If so, why do I get order comma, underscore, comma? Even better,
> comma+quote+A, underscore+d,comma+quote+M. I don't get it...

Mmm... you're right, I missed the first line :-?

Heck, it's even weirder with this sequence:

aph3,"z
aph3_devel,"a
aph3,"b

It gets sorted as:

aph3,"b
aph3_devel,"a
aph3,"z

I'm trying to "reverse-engineer" the logic behind the sort but I can't
see it. Maybe it is done randomly? Very curious, indeed.

Greetings,

--
Camaleón


--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/pan.2010.11.04.22.55.53@gmail.com

David Jardine 11-04-2010 10:36 PM

Locales/sort bug
 
On Thu, Nov 04, 2010 at 10:55:53PM +0000, Camaleón wrote:
> On Thu, 04 Nov 2010 21:23:27 +0100, Rob Gom wrote:
>
> > [cut]
> >>
> >> I'm also getting that behaviour (locale set to "es_ES.UTF-8") so I
> >> understand that my locale setting dictates that "underscore" ("_")
> >> sorts before the "comma" (",") symbol.
> >>
> >> As per "man sort" page:
> >>
> >> *** WARNING *** The locale specified by the environment affects sort
> >> order. Set LC_ALL=C to get the traditional sort order that uses native
> >> byte values.
> >>
> >> Do you think that is a bug? :-?
> >
> > If so, why do I get order comma, underscore, comma? Even better,
> > comma+quote+A, underscore+d,comma+quote+M. I don't get it...
>
> Mmm... you're right, I missed the first line :-?
>
> Heck, it's even weirder with this sequence:
>
> aph3,"z
> aph3_devel,"a
> aph3,"b
>
> It gets sorted as:
>
> aph3,"b
> aph3_devel,"a
> aph3,"z
>
> I'm trying to "reverse-engineer" the logic behind the sort but I can't
> see it. Maybe it is done randomly? Very curious, indeed.

It just seems to ignore certain characters. Try filtering the output
through, for example, sed 's/[_",]//g' and then you get it in the right
order.
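
The idea above can be sketched in shell. This is only an emulation of the
observed behaviour, not what the locale's collation rules actually do: it
builds a sort key with the three characters removed, sorts on that key in
byte order, and then drops the key again. The function name
sort_ignoring_punct is hypothetical, chosen for illustration.

```shell
# Hypothetical helper: sort lines as if '_', '"' and ',' were not there.
sort_ignoring_punct() {
    # Decorate each line with a stripped key and a tab, sort on the key
    # in the C locale (plain byte order), then remove the key again.
    awk '{ key = $0; gsub(/[_",]/, "", key); print key "\t" $0 }' |
        LC_ALL=C sort -t "$(printf '\t')" -k1,1 |
        cut -f2-
}

printf '%s\n' 'aph3,"z' 'aph3_devel,"a' 'aph3,"b' | sort_ignoring_punct
```

With the three sample lines, the stripped keys are aph3z, aph3devela and
aph3b, so the result comes out in the same order reported in the thread:
aph3,"b first, then aph3_devel,"a, then aph3,"z.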

David


--
Archive: http://lists.debian.org/20101104233647.GA2630@gennes.augarten

Bob Proulx 11-04-2010 11:12 PM

Locales/sort bug
 
Camaleón wrote:
> I'm trying to "reverse-engineer" the logic behind the sort but I can't
> see it. Maybe it is done randomly? Very curious, indeed.

It is "dictionary" sort ordering as specified by the locale. Case is
folded and punctuation is (mostly) ignored.

Personally I always set the following in my ~/.bashrc file.

export LANG=en_US.UTF-8
export LC_COLLATE=C
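
For a one-off command, the same effect can be had without touching
~/.bashrc. The sketch below assumes only a POSIX shell and sort; the
wrapper name c_sort is hypothetical. Note that a set LC_ALL overrides
LC_COLLATE, so the export above has no effect in a session where LC_ALL is
also set (running "locale" shows the effective values).

```shell
# Hypothetical wrapper: run sort in byte order for this call only,
# whatever locale the session is using.
c_sort() {
    LC_ALL=C sort "$@"
}

printf '%s\n' 'aph3,"z' 'aph3_devel,"a' 'aph3,"b' | c_sort
```

In byte order the comma (0x2C) sorts before the underscore (0x5F), so both
aph3," lines come out ahead of aph3_devel,"a.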

Bob

Camaleón 11-05-2010 06:07 AM

Locales/sort bug
 
On Fri, 05 Nov 2010 00:36:47 +0100, David Jardine wrote:

> On Thu, Nov 04, 2010 at 10:55:53PM +0000, Camaleón wrote:

(...)

>> Heck, it's even weirder with this sequence:
>>
>> aph3,"z
>> aph3_devel,"a
>> aph3,"b
>>
>> It gets sorted as:
>>
>> aph3,"b
>> aph3_devel,"a
>> aph3,"z
>>
>> I'm trying to "reverse-engineer" the logic behind the sort but I
>> can't see it. Maybe it is done randomly? Very curious, indeed.
>
> It just seems to ignore certain characters. Try filtering the output
> through, for example, sed 's/[_",]//g' and then you get it in the right
> order.

Yes, the "sort" documentation and man page warn about that (and advise
avoiding custom locales when using it), but what does it really do (and
how) when locales are in use? Why rank "comma" first and then give
"underscore" a higher priority? :-?

Greetings,

--
Camaleón


--
Archive: http://lists.debian.org/pan.2010.11.05.07.07.53@gmail.com

