How do I get UTF-8 support with PCRE?
I am having problems building a Lucene index using Zend_Lucene. I get the following error:
PHP Notice: iconv(): Detected an illegal character in input string in /var/www/ZendFramework-1.5.2/library/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php on line 56
Thanks in advance.
Regards,
Amitava Shee
_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
07-04-2008, 09:29 AM
Ralph Angenendt
UTF-8 support in PCRE
Amitava Shee wrote:
> How do I get utf-8 support with PCRE?
>
> I am having problems building lucene index using Zend_Lucene. I get the
> following error
>
>
> PHP Notice: iconv(): Detected an illegal character in input string in
> /var/www/ZendFramework-1.5.2/library/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php
> on line 56
a) What does that have to do with pcre? (which can do UTF-8)
b) What is on line 56 in that file? Looks like iconv is choking on that.
So try to process that file with iconv on the command line.
Ralph
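A minimal sketch of the command-line check suggested above, assuming the suspect input is meant to be UTF-8. The helper name `is_valid_utf8` and the two sample files are hypothetical and exist only to make the sketch self-contained; in practice you would point the function at the file or data you suspect.

```shell
#!/bin/sh
# Report whether a file is valid UTF-8 by round-tripping it through iconv.
# iconv exits non-zero when it meets an illegal input sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Hypothetical sample files, created here only for demonstration.
tmpdir=$(mktemp -d)
printf 'plain ascii text\n' > "$tmpdir/good.txt"
printf 'bad byte: \377\n' > "$tmpdir/bad.txt"   # 0xFF never occurs in UTF-8

for f in "$tmpdir/good.txt" "$tmpdir/bad.txt"; do
    if is_valid_utf8 "$f"; then
        echo "$f: valid UTF-8"
    else
        echo "$f: contains an illegal sequence"
    fi
done
```

Running iconv without the redirection to /dev/null also prints the position of the first offending byte, which can help locate the character.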
07-07-2008, 12:11 PM
"Amitava Shee"
UTF-8 support in PCRE
Please see my reply inline below
On Fri, Jul 4, 2008 at 5:29 AM, Ralph Angenendt <ra+centos@br-online.de> wrote:
Amitava Shee wrote:
> How do I get utf-8 support with PCRE?
>
> I am having problems building lucene index using Zend_Lucene. I get the
> following error
>
>
> PHP Notice: iconv(): Detected an illegal character in input string in
> /var/www/ZendFramework-1.5.2/library/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php
> on line 56
> a) What does that have to do with pcre? (which can do UTF-8)
[Shee] The Zend Lucene search engine uses PCRE and requires PCRE to be compiled with --enable-utf8. Please see http://framework.zend.com/manual/en/zend.search.lucene.charset.html#zend.search.lucene.charset.utf_analyzer
UTF-8 support can either be compiled into PCRE at build time or provided via a shared library, but whether the shared library is included depends on the distro. I believe upstream Red Hat does not include it. I was hoping to find a way to get it in CentOS. I have no idea whether other distros support it; that's a research item for me.
> b) What is on line 56 in that file? Looks like iconv is choking on that.
[Shee] That is framework code; I don't know much about it.
> So try to process that file with iconv on the command line.
> Ralph
07-07-2008, 12:54 PM
Ralph Angenendt
UTF-8 support in PCRE
Amitava Shee wrote:
> On Fri, Jul 4, 2008 at 5:29 AM, Ralph Angenendt <ra+centos@br-online.de> wrote:
> > Amitava Shee wrote:
> > > How do I get utf-8 support with PCRE?
> >
> > a) What does that have to do with pcre? (which can do UTF-8)
>
>
> [Shee] Zend lucene search engine uses pcre and requires pcre to be compiled
> with --enable-utf8. Please see
> http://framework.zend.com/manual/en/zend.search.lucene.charset.html#zend.search.lucene.charset.utf_analyzer
>
> UTF-8 support can either be compiled into PCRE at build time or supported
> via shared library. But shared library support is included/excluded based on
> the distro. I believe, upstream RedHat does not include it. I was hoping to
> find a way in CentOS. I have no idea if other distro's support it. That's a
> research item for me.
As I said: pcre can do UTF-8:
%build
%configure --enable-utf8
That's from the spec file. And again: It's not pcre, it is iconv which
doesn't like a character in one of the framework's files.
Ralph
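One way to confirm what a given PCRE build supports is to inspect the output of `pcretest -C`, which lists the options the library was compiled with. The sketch below is an illustration under assumptions: the function name `report_pcre_features` is hypothetical, and the line wording it matches ("UTF-8 support" versus "No UTF-8 support") is assumed from typical PCRE builds and may differ between versions. Note that PCRE treats UTF-8 mode and Unicode property support (needed for escapes such as \pL) as separate build options, so it can help to check both.

```shell
#!/bin/sh
# Summarise PCRE build options from `pcretest -C` style output read on stdin.
# Assumption: lines look like "  UTF-8 support" or "  No UTF-8 support".
report_pcre_features() {
    awk '
        /^ *UTF-8 support/              { utf8 = "yes" }
        /^ *Unicode properties support/ { ucp = "yes" }
        END {
            print "utf8=" (utf8 ? utf8 : "no")
            print "unicode_properties=" (ucp ? ucp : "no")
        }
    '
}

# Run against the installed library only if pcretest is available.
if command -v pcretest > /dev/null 2>&1; then
    pcretest -C | report_pcre_features
fi
```

A build that reports utf8=yes but unicode_properties=no accepts UTF-8 patterns yet rejects Unicode property escapes.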
07-07-2008, 02:36 PM
"Amitava Shee"
UTF-8 support in PCRE
Yes, building from source will work. I just want to know if there is a package (in some yum repository) somewhere, so that updates, patches, etc. get applied with "yum update". It would be nice to do something like:
yum install pcre-utf8
-Amitava
07-07-2008, 02:45 PM
"Jim Perrin"
UTF-8 support in PCRE
On Mon, Jul 7, 2008 at 10:36 AM, Amitava Shee <amitava.shee@gmail.com> wrote:
> Yes, building from source will work. I just want to know if there is a
> package (in some yum repository) somewhere so that updates, patches etc.
> gets applied with "yum update". It would be nice to do something like
>
> yum install pcre-utf8
Okay, there's a disconnect somewhere that you aren't getting.
The pcre package included in CentOS does UTF-8 just fine. The problem
you are seeing is related to another package. You need to look at the
script to see what iconv (where the problem actually is) is having
problems with.
--
During times of universal deceit, telling the truth becomes a revolutionary act.
George Orwell
07-07-2008, 07:57 PM
"Amitava Shee"
UTF-8 support in PCRE
My error log with iconv is misleading. Please ignore that portion and instead use this little PHP script to check for UTF-8 support in PCRE:
<?php
// \pL matches any Unicode letter; the /u modifier puts PCRE in UTF-8 mode.
if (@preg_match('/\pL/u', 'a') == 1) {
    echo "PCRE unicode support is turned on.\n";
} else {
    echo "PCRE unicode support is turned off.\n";
}
?>
Also, please check out this thread (lack of pcre utf8 support in RHEL).
07-08-2008, 10:44 AM
Ralph Angenendt
UTF-8 support in PCRE
Amitava Shee wrote:
> Yes, building from source will work. I just want to know if there is a
> package (in some yum repository) somewhere so that updates, patches etc.
> gets applied with "yum update". It would be nice to do something like
>
> yum install pcre-utf8
Again - and I'm going to type this very slowly: The supplied pcre which
is *IN* CentOS *IS* built with UTF-8 support.
And: Your problem has *nothing* to do with pcre, your problem lies
*within* the iconv library.
Ralph