Is there any F/LOSS OCR software for Ubuntu that can input JPEG files and output text files?
--
ubuntu-users mailing list
ubuntu-users@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-users
01-11-2011, 11:39 PM
Lucio M Nicolosi
OCR software for Ubuntu
On Tue, Jan 11, 2011 at 10:06 PM, Default User
<hunguponcontent@gmail.com> wrote:
> Hi!
>
> Is there any F/LOSS OCR software for Ubuntu that can input JPEG files and
> output text files?
(Synaptic) Cuneiform can do this.
"Cuneiform is an OCR system. In addition to text recognition it also does layout
analysis and text format recognition. Cuneiform supports several languages."
--
L M Nicolosi, Eng.
Ubuntu AMD64
GNU-Linux Regist. User #481505 - http://counter.li.org/
01-28-2011, 09:05 PM
Default User
OCR software for Ubuntu
On Tue, Jan 11, 2011 at 18:39, Lucio M Nicolosi <lmnicolosi@gmail.com> wrote:
> On Tue, Jan 11, 2011 at 10:06 PM, Default User
> <hunguponcontent@gmail.com> wrote:
> > Hi!
> >
> > Is there any F/LOSS OCR software for Ubuntu that can input JPEG files and
> > output text files?
>
> (Synaptic) Cuneiform can do this.
>
> "Cuneiform is an OCR system. In addition to text recognition it also does layout
> analysis and text format recognition. Cuneiform supports several languages."
Thanks for the suggestion. Unfortunately, on Ubuntu 10.10 64-bit, after installation, I get an error message apparently related to another program added as a dependency. So, it would not work, and I do not have time right now to roll up my sleeves and investigate possible "fixes".
But thanks again for the pointer.
Anyway,
01-28-2011, 09:37 PM
Lucio M Nicolosi
OCR software for Ubuntu
On Fri, Jan 28, 2011 at 8:05 PM, Default User <hunguponcontent@gmail.com> wrote:
>
>
> On Tue, Jan 11, 2011 at 18:39, Lucio M Nicolosi <lmnicolosi@gmail.com>
> wrote:
>>
>> On Tue, Jan 11, 2011 at 10:06 PM, Default User
>> <hunguponcontent@gmail.com> wrote:
>> > Hi!
>> >
>> > Is there any F/LOSS OCR software for Ubuntu that can input JPEG files
>> > and
>> > output text files?
>>
>> (Synaptic) Cuneiform can do this.
>>
>> "Cuneiform is an OCR system. In addition to text recognition it also does
>> layout
>> analysis and text format recognition. Cuneiform supports several
>> languages."
>
> Thanks for the suggestion. Unfortunately, on Ubuntu 10.10 64-bit, after
> installation, I get an error message apparently related to another program
> added as a dependency. So, it would not work, and I do not have time right
> now to roll up my sleeves and investigate possible "fixes".
>
Strange. I used it once (I can't remember whether on 10.04/64 or 10.10/64)
and it worked flawlessly. Perhaps you could try the latest version
(1.0.0-1) from GetDebian (an unofficial repo), which can be
activated through the following link:
--
L M Nicolosi, Eng.
GNU-Linux Regist. User #481505 - http://counter.li.org/
Ubuntu 10.10 AMD64
01-28-2011, 11:56 PM
NoOp
OCR software for Ubuntu
On 01/28/2011 02:05 PM, Default User wrote:
> On Tue, Jan 11, 2011 at 18:39, Lucio M Nicolosi <lmnicolosi@gmail.com> wrote:
...
>> "Cuneiform is an OCR system. In addition to text recognition it also does
>> layout
>> analysis and text format recognition. Cuneiform supports several
>> languages."
...
>
> Thanks for the suggestion. Unfortunately, on Ubuntu 10.10 64-bit, after
> installation, I get an error message apparently related to another program
> added as a dependency. So, it would not work, and I do not have time right
> now to roll up my sleeves and investigate possible "fixes".
>
Installs w/o issue on my 10.10 64bit. What exactly is the error message
that you get?
$ sudo apt-get install cuneiform
[sudo] password for gg:
Reading package lists... Done
Building dependency tree
Reading state information... Done
...
The following extra packages will be installed:
cuneiform-common
The following NEW packages will be installed:
cuneiform cuneiform-common
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 28.3MB of archives.
After this operation, 56.9MB of additional disk space will be used.
Do you want to continue [Y/n]? y
Get:1 http://archive.ubuntu.com/ubuntu/ maverick/multiverse cuneiform-common all 0.7.0+dfsg.1-1 [26.4MB]
Get:2 http://archive.ubuntu.com/ubuntu/ maverick/multiverse cuneiform amd64 0.7.0+dfsg.1-1 [1,916kB]
Fetched 28.3MB in 2min 12s (213kB/s)
Selecting previously deselected package cuneiform-common.
(Reading database ... 315168 files and directories currently installed.)
Unpacking cuneiform-common (from .../cuneiform-common_0.7.0+dfsg.1-1_all.deb) ...
Selecting previously deselected package cuneiform.
Unpacking cuneiform (from .../cuneiform_0.7.0+dfsg.1-1_amd64.deb) ...
Processing triggers for man-db ...
Setting up cuneiform-common (0.7.0+dfsg.1-1) ...
Setting up cuneiform (0.7.0+dfsg.1-1) ...
$ cuneiform
Cuneiform for Linux 0.7.0
Usage: cuneiform[-l languagename -f format --dotmatrix --fax -o result_file] imagefile
That's as far as I go as I haven't taken the time to learn how to use it :-)
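Going by the usage line above, a JPEG-to-text run might look roughly like the sketch below. The -l, -f and -o options are taken from that usage output; the values "eng" and "text" are assumptions, and txt_name and ocr_jpeg are hypothetical helper names, not part of cuneiform.

```shell
#!/bin/sh
# Sketch only: OCR one JPEG into a plain-text file with cuneiform.
# txt_name and ocr_jpeg are hypothetical helpers, not cuneiform commands.

# Derive the output name by swapping the last extension for .txt.
txt_name() {
    printf '%s.txt\n' "${1%.*}"
}

ocr_jpeg() {
    in="$1"
    out="$(txt_name "$in")"
    if ! command -v cuneiform >/dev/null 2>&1; then
        echo "cuneiform not found; try: sudo apt-get install cuneiform" >&2
        return 127
    fi
    # -l, -f and -o appear in the usage line; "eng" and "text" are assumed values.
    cuneiform -l eng -f text -o "$out" "$in"
}
```

With the functions loaded, "ocr_jpeg scan.jpg" would be expected to write the recognized text to scan.txt (scan.jpg being a placeholder file name).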
You might find this useful:
https://help.ubuntu.com/community/OCR
You might try gocr:
http://manpages.ubuntu.com/manpages/maverick/man1/gocr.1.html
which is generally the default used in xsane etc.
<quote>
DESCRIPTION
gocr is an optical character recognition program that can be
used from the command line. It takes input in PNM, PGM, PBM, PPM, or
PCX format, and writes recognized text to stdout. If the pnm file
is a single dash, PNM data is read from stdin. If gzip, bzip2 and
netpbm-progs are installed and your system supports popen(3) also
pnm.gz, pnm.bz2, png, jpg, jpeg, tiff, gif, bmp, ps (only single pages)
and eps are supported as input files (not as input stream), where pnm
can be replaced by one of ppm, pgm and pbm.
</quote>
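Based on the man page text quoted above, one way to sidestep the question of which JPEG helpers gocr can find is to convert to PNM yourself and pipe it in on stdin. This is a sketch under assumptions: jpegtopnm comes from the netpbm package, and gocr_cmd and gocr_jpeg are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: feed a JPEG to gocr as PNM data on stdin.
# gocr_cmd and gocr_jpeg are hypothetical helpers, not part of gocr.

# Print the pipeline that would run, so it can be inspected first.
gocr_cmd() {
    printf 'jpegtopnm %s | gocr - > %s\n' "$1" "${1%.*}.txt"
}

gocr_jpeg() {
    for tool in jpegtopnm gocr; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found" >&2
            return 127
        fi
    done
    # jpegtopnm writes PNM to stdout; the single dash makes gocr read it from stdin.
    jpegtopnm "$1" | gocr - > "${1%.*}.txt"
}
```

Running "gocr_cmd scan.jpg" only prints the pipeline; "gocr_jpeg scan.jpg" runs it and leaves the text in scan.txt (again, scan.jpg is a placeholder).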