Linux Archive

Allan McRae 03-25-2010 09:02 AM

grep-2.6-1
 
Upstream big update.

Local changelog:
- Removed the multibyte locale speed-up patch (and all the patches to
fix the issues it created...) as it is now included upstream.
- Removed the other patches as it appears they are not being
considered upstream.


Upstream NEWS:
* Noteworthy changes in release 2.6 (2010-03-23) [stable]

** Speed improvements

grep is much faster on multibyte character sets, especially (but not
limited to) UTF-8 character sets. The speed improvement is also very
pronounced with case-insensitive matches.

** Bug fixes

Character classes would malfunction in multi-byte locales when using
grep -i.

Examples which would print nothing for LC_ALL=en_US.UTF-8 include:
- for ranges, echo Z | grep -i '[a-z]'
- for single characters, echo Y | grep -i '[y]'
- for character types, echo Y | grep -i '[[:lower:]]'
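
The three examples above can be re-run as a quick sanity check. This is only a sketch: check_ci is a made-up helper name, and it assumes the en_US.UTF-8 locale is installed (if it is not, grep falls back to the C locale and the check says little about the multibyte code path). On grep 2.6 or later each line should report a match.

```shell
# check_ci INPUT PATTERN: report whether grep -i matches INPUT against PATTERN
# under LC_ALL=en_US.UTF-8 (hypothetical helper, not part of grep).
check_ci() {
    if printf '%s\n' "$1" | LC_ALL=en_US.UTF-8 grep -i -q -- "$2"; then
        echo "match: $1 against $2"
    else
        echo "no match: $1 against $2"
    fi
}

check_ci Z '[a-z]'        # range
check_ci Y '[y]'          # single character
check_ci Y '[[:lower:]]'  # character type
```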

grep -i -o would fail to report some matches; grep -i --color, while not
missing any line containing a match, would fail to color some matches.
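
For illustration, this is what grep -i -o is expected to do with several matches on one line. The NEWS entry does not say which inputs triggered the bug, so the sample line here is made up and may not reproduce the failure on an old grep.

```shell
# Each case variant of "foo" should be printed on its own line
# (made-up sample input).
printf '%s\n' 'Foo foo FOO' | grep -i -o 'foo'
```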

grep would fail to report a match in a multibyte character set other than
UTF-8, if another match occurred earlier in the line but started in the
middle of a multibyte character.

Various bugs in grep -P, caused by expressions such as [^b] or \S
matching newlines, were fixed. grep -P also supports the special
sequences \Z and \z, and can be combined with the command-line option -z
to perform searches on NUL-separated records.
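
A small sketch of combining -P with -z on NUL-separated records. The sample records and the pattern are made up, and the guard is there because grep can be built without PCRE support, in which case -P is unavailable.

```shell
# Made-up sample: three NUL-terminated records.
# -z makes grep read and write NUL-separated records; tr turns the NULs
# back into newlines so the result is readable on a terminal.
if printf 'x\n' | grep -q -P 'x' 2>/dev/null; then
    printf 'one\0two\0three\0' | grep -z -P 't\w+' | tr '\0' '\n'
else
    echo 'this grep was built without -P support'
fi
```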

grep would mistakenly exit with status 1 upon error, rather than 2,
as it is documented to do.
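
The three documented exit statuses can be seen side by side with a sketch like this. grep_status is a made-up helper, and the file path is assumed not to exist.

```shell
# grep_status ARGS...: run grep with all output discarded and print its
# exit status (hypothetical helper, not part of grep).
grep_status() {
    grep "$@" >/dev/null 2>&1 && echo 0 || echo $?
}

echo 'abc' | grep_status 'b'                 # a line matched: prints 0
echo 'abc' | grep_status 'z'                 # no line matched: prints 1
grep_status 'b' /nonexistent/grep-demo-file  # unreadable file: prints 2
```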

Using options like -1 -2 or -1 -v -2 results in two lines of
context (the last value that appears on the command line) instead
of twelve (the concatenation of all the values). This is consistent
with the behavior of options -A/-B/-C.
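
To see the "last value wins" behaviour, a sketch with a made-up seven-line input (sample is a hypothetical helper). With two lines of context the output is lines 2 through 6; with twelve it would be the whole input.

```shell
# Made-up sample: seven lines with the only match in the middle.
sample() { printf '%s\n' 1 2 3 X 5 6 7; }

sample | grep -1 -2 X    # two lines of context on each side
sample | grep -C 2 X     # the same request spelled with -C
```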

Two new command-line options, --group-separator=ARGUMENT and
--no-group-separator, enable further customization of the output
when -A, -B or -C is being used.
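
A sketch of the two new options, using made-up input in which the two matches are far enough apart to form separate context groups (groups_demo is a hypothetical helper).

```shell
# Made-up sample: two matches separated by lines outside the context window.
groups_demo() { printf '%s\n' 'hit 1' 'ctx a' 'skip' 'skip' 'hit 2' 'ctx b'; }

groups_demo | grep -A 1 'hit'                           # default "--" between groups
groups_demo | grep -A 1 --group-separator='====' 'hit'  # custom separator line
groups_demo | grep -A 1 --no-group-separator 'hit'      # no separator line at all
```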

** Other changes

egrep accepts the -E option and fgrep accepts the -F option. If egrep
and fgrep are given another of the -E/-F/-G options, they print a more
meaningful error message.


Signoff both,
Allan

Ronald van Haren 03-25-2010 11:18 AM

grep-2.6-1
 
On Thu, Mar 25, 2010 at 11:02 AM, Allan McRae <allan@archlinux.org> wrote:
> Upstream big update.
>

echo -e "signoff both $(grep --version)
none $(grep --version)" | grep both
signoff both GNU grep 2.6

Ronald

Xavier Chantry 03-25-2010 03:22 PM

grep-2.6-1
 
On Thu, Mar 25, 2010 at 11:02 AM, Allan McRae <allan@archlinux.org> wrote:
> Upstream big update.
>
> Local changelog:
> - Removed the multibyte locale speed-up patch (and all the patches to fix
> the issues it created...) as it is now included upstream.
> - Removed the other patches as it appears they are not being considered
> upstream.
>
> Upstream NEWS:
> * Noteworthy changes in release 2.6 (2010-03-23) [stable]
>
> ** Speed improvements
>
> grep is much faster on multibyte character sets, especially (but not
> limited to) UTF-8 character sets. The speed improvement is also very
> pronounced with case-insensitive matches.
>

That's awesome. After all these years, I thought this would never happen :)

I did a quick benchmark before and after, and I got very similar
results, so we are good.

grep -i is still considerably slower than plain grep in UTF-8 (0.1s -> 1.5s,
that is 15x slower), but IIRC it was MUCH worse with an unpatched grep
2.5, like hundreds of times slower.
With LANG=C , grep and grep -i are both at 0.1s.
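
For anyone wanting to repeat this kind of comparison, a rough sketch follows. The data and pattern used in the benchmark above are not given, so the corpus, the pattern, and the time_grep helper are all made up; it also assumes GNU date (for %N) and an installed en_US.UTF-8 locale. Absolute numbers will differ from the ones quoted.

```shell
# Made-up corpus; removed again when the shell exits.
corpus=$(mktemp)
trap 'rm -f "$corpus"' EXIT
i=0
while [ "$i" -lt 20000 ]; do
    echo "Line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > "$corpus"

# time_grep LOCALE ARGS...: print elapsed milliseconds and the match count
# for one grep -c run (hypothetical helper; date +%s%N is a GNU extension).
time_grep() {
    bench_locale=$1; shift
    start=$(date +%s%N)
    count=$(LC_ALL=$bench_locale grep -c "$@" "$corpus" || true)
    end=$(date +%s%N)
    echo "LC_ALL=$bench_locale grep $*: $(( (end - start) / 1000000 )) ms, $count matches"
}

time_grep C              'FOX'
time_grep C           -i 'FOX'
time_grep en_US.UTF-8    'FOX'
time_grep en_US.UTF-8 -i 'FOX'
```

A single run on a small file is noisy, so repeating each measurement a few times and using a larger corpus gives more trustworthy numbers.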


All times are GMT.