03-25-2010, 09:02 AM
Allan McRae

grep-2.6-1

Upstream big update.

Local changelog:
- Removed the multibyte locale speed-up patch (and all the patches to
fix the issues it created...) as it is now included upstream.
- Removed the other patches as it appears they are not being
considered upstream.


Upstream NEWS:
* Noteworthy changes in release 2.6 (2010-03-23) [stable]

** Speed improvements

grep is much faster on multibyte character sets, especially (but not
limited to) UTF-8 character sets. The speed improvement is also very
pronounced with case-insensitive matches.

** Bug fixes

Character classes would malfunction in multi-byte locales when using
grep -i.

Examples which would print nothing for LC_ALL=en_US.UTF-8 include:
- for ranges, echo Z | grep -i '[a-z]'
- for single characters, echo Y | grep -i '[y]'
- for character types, echo Y | grep -i '[[:lower:]]'
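
The three examples above can be bundled into a quick check. This is only a sketch: the function name check_ci_classes is made up here, and it assumes the en_US.UTF-8 locale from the text is installed (if it is not, grep silently falls back to the C locale and the multibyte code path is not exercised).

```shell
# Hypothetical helper: re-run the three case-insensitive examples.
# LC_ALL is set per command so the calling shell's locale is untouched.
check_ci_classes() {
    echo Z | LC_ALL=en_US.UTF-8 grep -i '[a-z]'
    echo Y | LC_ALL=en_US.UTF-8 grep -i '[y]'
    echo Y | LC_ALL=en_US.UTF-8 grep -i '[[:lower:]]'
}

check_ci_classes
```

With a fixed grep each command echoes its input back, so three lines are printed; an affected build would print nothing.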

grep -i -o would fail to report some matches; grep -i --color, while not
missing any line containing a match, would fail to color some matches.

grep would fail to report a match in a multibyte character set other than
UTF-8, if another match occurred earlier in the line but started in the
middle of a multibyte character.

Various bugs in grep -P, caused by expressions such as [^b] or \S
matching newlines, were fixed. grep -P also supports the special
sequences \Z and \z, and can be combined with the command-line option
-z to perform searches on NUL-separated records.

grep would mistakenly exit with status 1 upon error, rather than 2,
as it is documented to do.
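
For reference, the documented statuses are 0 for a match, 1 for no match and 2 for an error. A small sketch to see all three; status_of is a hypothetical helper, and the ".missing" suffix is just a way to get a path that does not exist.

```shell
# Hypothetical helper: run a command quietly and print its exit status.
status_of() {
    "$@" >/dev/null 2>&1 && echo 0 || echo $?
}

sample=$(mktemp)
printf 'hello\n' > "$sample"

echo "match:    $(status_of grep hello "$sample")"
echo "no match: $(status_of grep goodbye "$sample")"
echo "error:    $(status_of grep hello "$sample.missing")"

rm -f "$sample"
```

Scripts that test "$?" against 1 to mean "pattern not found" rely on errors being reported as 2, which is why this fix matters.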

Using options like -1 -2 or -1 -v -2 results in two lines of
context (the last value that appears on the command line) instead
of twelve (the concatenation of all the values). This is consistent
with the behavior of options -A/-B/-C.
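
A rough way to see the new behavior, assuming GNU grep 2.6 or later. The function name context_demo and the nine-line input are made up for illustration.

```shell
# Hypothetical helper: feed the numbers 1..9 to grep, one per line,
# and search for the line "5" with whatever context options are given.
context_demo() {
    printf '%s\n' 1 2 3 4 5 6 7 8 9 | grep "$@" '^5$'
}

# Two separate -NUM options: the last one (2) should win.
context_demo -1 -2

# Same result expected from the explicit form.
context_demo -C 2
```

Both calls should print the lines 3 through 7, that is, the match plus two lines of context on each side.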

Two new command-line options, --group-separator=ARGUMENT and
--no-group-separator, enable further customization of the output
when -A, -B or -C is being used.
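
A short sketch of the two options next to the default output. The helper sample_lines and the separator string "====" are arbitrary choices for the example.

```shell
# Sample input: two matches (m1, m2), each followed by one context line,
# with unrelated lines in between so grep emits two separate groups.
sample_lines() {
    printf '%s\n' m1 c1 x x m2 c2
}

# Default: groups are separated by a line containing "--".
sample_lines | grep -A1 '^m'

# Custom separator text.
sample_lines | grep -A1 --group-separator='====' '^m'

# No separator line at all.
sample_lines | grep -A1 --no-group-separator '^m'
```

The last form is handy when the output is piped into another tool that would otherwise have to filter out the "--" lines.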

** Other changes

egrep accepts the -E option and fgrep accepts the -F option. If egrep
and fgrep are given another of the -E/-F/-G options, they print a more
meaningful error message.


Signoff both,
Allan
 
03-25-2010, 11:18 AM
Ronald van Haren

grep-2.6-1

On Thu, Mar 25, 2010 at 11:02 AM, Allan McRae <allan@archlinux.org> wrote:
> Upstream big update.
>

echo -e "signoff both $(grep --version)
none $(grep --version)" | grep both
signoff both GNU grep 2.6

Ronald
 
03-25-2010, 03:22 PM
Xavier Chantry

grep-2.6-1

On Thu, Mar 25, 2010 at 11:02 AM, Allan McRae <allan@archlinux.org> wrote:
> Upstream big update.
>
> Local changelog:
> - Removed the multibyte locale speed-up patch (and all the patches to fix
> the issues it created...) as it is now included upstream.
> - Removed the other patches as it appears they are not being considered
> upstream.
>
> Upstream NEWS:
> * Noteworthy changes in release 2.6 (2010-03-23) [stable]
>
> ** Speed improvements
>
> grep is much faster on multibyte character sets, especially (but not
> limited to) UTF-8 character sets. The speed improvement is also very
> pronounced with case-insensitive matches.
>

That's awesome. After all these years, I thought this would never happen.

I did a quick benchmark before and after, and I got very similar
results, so we are good.

grep -i is still considerably slower than grep in UTF-8 (0.1s -> 1.5s,
that is 15x slower), but IIRC it was MUCH worse with an unpatched grep
2.5, something like hundreds of times slower.
With LANG=C, grep and grep -i are both at 0.1s.
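
For anyone wanting to repeat this kind of comparison, something along these lines should do. It is only a sketch, not the commands used for the numbers above: grep_count and bench_grep are made-up names, the file and pattern are whatever you pass in, en_US.UTF-8 is assumed to be installed, and "time" here is the bash keyword.

```shell
#!/usr/bin/env bash
# Count matches of PATTERN in FILE under a given locale.
# Usage: grep_count LOCALE FLAG PATTERN FILE   (FLAG is "-i" or "")
grep_count() {
    local loc=$1 flag=$2 pattern=$3 file=$4
    # ${flag:+"$flag"} drops the argument entirely when FLAG is empty.
    LC_ALL=$loc grep ${flag:+"$flag"} -c -e "$pattern" "$file"
}

# Time all four combinations: C vs. UTF-8, with and without -i.
# Labels and timings go to stderr, match counts to stdout.
# Usage: bench_grep FILE PATTERN
bench_grep() {
    local file=$1 pattern=$2 loc flag
    for loc in C en_US.UTF-8; do
        for flag in "" -i; do
            echo "LC_ALL=$loc grep $flag" >&2
            time grep_count "$loc" "$flag" "$pattern" "$file"
        done
    done
}
```

Run it on a reasonably large text file, for example "bench_grep some-big-file.txt foo", and compare the "real" times across the four runs.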
 
