On Wed, Jun 30, 2010 at 8:48 PM, brownh
<brownh@historicalmaterialism.info> wrote:
> I received a .docx file appended in an e-mail, and need to extract and
> convert it to a convenient format such as .html, .pdf, or plain .txt.
>
> Apparently .docx can be viewed in Abiword and OpenOffice, but I do not
> wish to install GUI applications, and so need a command-line format
> conversion utility.
>
> The best I find is OdfConverter
> (odf-converter_1.0.0-2-getdeb1_i386.deb). I installed this, but it
> lacks a manual and I have no idea how to use it. I gather that running
> $ OdfConverter without argument will return a list of possible
> arguments, but I don't seem to have any such OdfConverter executable.
>
> Anyone know of a simple command-line conversion utility?
apt-get install unoconv
HTH
--
Mathieu
--
To UNSUBSCRIBE, email to debian-user-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org
Archive: http://lists.debian.org/AANLkTimc1iiZmppr2eJhTcDMb8rikrGT3mFe1dncApLi@mail.gmail.com
06-30-2010, 08:18 PM
"Chris"
OTF conversion without OpenOffice
Looks like odfconverter /i example.docx is what you want
Sent from my BlackBerry®
-----Original Message-----
From: brownh <brownh@historicalMaterialism.info>
Date: Wed, 30 Jun 2010 14:48:49
To: <debian-user@lists.debian.org>
Subject: OTF conversion without OpenOffice
I received a .docx file appended in an e-mail, and need to extract and
convert it to a convenient format such as .html, .pdf, or plain .txt.
Apparently .docx can be viewed in Abiword and OpenOffice, but I do not
wish to install GUI applications, and so need a command-line format
conversion utility.
The best I find is OdfConverter
(odf-converter_1.0.0-2-getdeb1_i386.deb). I installed this, but it
lacks a manual and I have no idea how to use it. I gather that running
$ OdfConverter without argument will return a list of possible
arguments, but I don't seem to have any such OdfConverter executable.
Anyone know of a simple command-line conversion utility?
Haines Brown
--
Archive: http://lists.debian.org/87aaqcwe3y.fsf@teufel.historicalMaterialism.info
06-30-2010, 08:32 PM
Celejar
OTF conversion without OpenOffice
On Wed, 30 Jun 2010 14:48:49 -0400
brownh <brownh@historicalMaterialism.info> wrote:
> I received a .docx file appended in an e-mail, and need to extract and
> convert it to a convenient format such as .html, .pdf, or plain .txt.
>
> Apparently .docx can be viewed in Abiword and OpenOffice, but I do not
> wish to install GUI applications, and so need a command-line format
> conversion utility.
>
> The best I find is OdfConverter
> (odf-converter_1.0.0-2-getdeb1_i386.deb). I installed this, but it
> lacks a manual and I have no idea how to use it. I gather that running
> $ OdfConverter without argument will return a list of possible
> arguments, but I don't seem to have any such OdfConverter executable.
>
> Anyone know of a simple command-line conversion utility?
Celejar
--
foffl.sourceforge.net - Feeds OFFLine, an offline RSS/Atom aggregator
mailmin.sourceforge.net - remote access via secure (OpenPGP) email
ssuds.sourceforge.net - A Simple Sudoku Solver and Generator
--
Archive: http://lists.debian.org/20100630163236.e33ef1fd.celejar@gmail.com
07-01-2010, 11:11 AM
brownh
OTF conversion without OpenOffice
Thank you, Mathieu, and others. I ultimately succeeded and here report
my experiences with the options.
1. I found several on-line free conversion services. For various
reasons such as security and privacy I did not pursue them.
2. Install OpenOffice and OpenOffice.OpenXML
Translator. Because this contradicted my desire for command line
conversion rather than install big GUI apps, I did not pursue.
3. Abiword can be used to convert the document format from .docx to,
say, .pdf. It was my intent to use a command line utility instead, but
here report that Abiword did in fact work and automatically detected
the input format.
4. Antiword-for-Office is a perl script, but when I tried to compile,
found I was missing the perl Archive::Zip module. Not knowing what to
do about that and too little time to find out, I did not pursue.
5. Unoconv script is a debian package and seems what I really
want. However, when I ran it, I found that it depends on JRE, although
"$ aptitude show unoconv" indicates that it depends on python. In any
case, I don't happen to have JRE installed in current box, and so did not
pursue.
6. Odf-converter. This is a perl script. It requires libtiff.so.3, but
by symlinking found that it can use libtiff.so.4 instead. With it I
was able to generate an .odt file, which of course required Abiword to
convert to PDF since I can't use unoconv.
Haines Brown
--
Archive: http://lists.debian.org/87630zwj64.fsf@teufel.historicalMaterialism.info
07-01-2010, 11:19 AM
Camaleón
OTF conversion without OpenOffice
On Wed, 30 Jun 2010 14:48:49 -0400, brownh wrote:
> I received a .docx file appended in an e-mail, and need to extract and
> convert it to a convenient format such as .html, .pdf, or plain .txt.
(...)
If it's a simple file (just plain text) you can extract (unzip) the .docx
into *.xml data for a direct view or convert into another suitable format.
Greetings,
--
Camaleón
--
Archive: http://lists.debian.org/pan.2010.07.01.11.19.29@gmail.com
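[Editor's sketch] The unzip suggestion above can be checked from a script before extracting anything. The following is a minimal sketch in Python using only the standard library; the function name list_docx_parts is made up for this illustration and is not part of any tool mentioned in the thread. It only reports what the archive contains and assumes nothing about the document beyond it being a zip container.

```python
import zipfile


def list_docx_parts(path_or_file):
    """Return the member names of a .docx package, or [] if it is not a zip archive.

    Hypothetical helper for illustration only.
    """
    # A .docx file is a zip container; anything else is reported as empty.
    if not zipfile.is_zipfile(path_or_file):
        return []
    with zipfile.ZipFile(path_or_file) as package:
        # namelist() returns the members in the order they are stored.
        return package.namelist()
```

Calling list_docx_parts("example.docx") on a typical Word document would be expected to show entries under _rels/, docProps/ and word/, the last of which normally holds document.xml.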
07-01-2010, 11:23 AM
Ron Johnson
OTF conversion without OpenOffice
On 07/01/2010 06:11 AM, brownh wrote:
> Thank you, Mathieu, and others. I ultimately succeeded and here report
> my experiences with the options.
> 1. I found several on-line free conversion services. For various
> reasons such as security and privacy I did not pursue them.
> 2. Install OpenOffice and OpenOffice.OpenXML
> Translator. Because this contradicted my desire for command line
> conversion rather than install big GUI apps, I did not pursue.
> 3. Abiword can be used to convert the document format from .docx to,
> say, .pdf. It was my intent to use a command line utility instead, but
> here report that Abiword did in fact work and automatically detected
> the input format.
> 4. Antiword-for-Office is a perl script, but when I tried to compile,
> found I was missing the perl Archive::Zip module. Not knowing what to
> do about that and too little time to find out, I did not pursue.
$ apt-cache search perl archive zip
This indicates that you must install libarchive-zip-perl.
> 5. Unoconv script is a debian package and seems what I really
> want. However, when I ran it, I found that it depends on JRE, although
> "$ aptitude show unoconv" indicates that it depends on python. In any
> case, I don't happen to have JRE installed in current box, and so did not
> pursue.
And you couldn't install it?
> 6. Odf-converter. This is a perl script. It requires libtiff.so.3, but
> by symlinking found that it can use libtiff.so.4 instead. With it I
> was able to generate an .odt file, which of course required Abiword to
> convert to PDF since I can't use unoconv.
> Haines Brown
--
Seek truth from facts.
--
> On 07/01/2010 06:11 AM, brownh wrote:
>> 4. Antiword-for-Office is a perl script, but when I tried to compile,
>> found I was missing the perl Archive::Zip module. Not knowing what to
>> do about that and too little time to find out, I did not pursue.
> This indicates that you must install libarchive-zip-perl.
Thanks. This seemed to get through that hang in the compile, but now
it hangs because it can't find XML/LibXML.pm. I did a search for
LibXML, and the obvious package, libxml-libxml-common-perl, did not
help (I already had libxml2 installed).
>> 5. Unoconv script is a debian package and seems what I really
>> want. However, when I ran it, I found that it depends on JRE, although
>> "$ aptitude show unoconv" indicates that it depends on python. In any
>> case, I don't happen to have JRE installed in current box, and so did not
>> pursue.
>
> And you couldn't install it?
No, the reason is that I'm working with temporary hardware and wanted
to avoid doing that, but now I did install the Sun JRE. When I try to
use unoconv on a .docx file I get:
unoconv: UnoException during conversion: File could not be loaded by
OpenOffice The provided document cannot be converted to the desired
format.
This sounds like it relies on OpenOffice, which I don't have
installed.
Haines
--
Archive: http://lists.debian.org/87y6dvuxn1.fsf@teufel.historicalMaterialism.info
07-01-2010, 01:51 PM
Ron Johnson
OTF conversion without OpenOffice
On 07/01/2010 08:42 AM, brownh wrote:
> Ron Johnson<ron.l.johnson@cox.net> writes:
>> On 07/01/2010 06:11 AM, brownh wrote:
>>> 4. Antiword-for-Office is a perl script, but when I tried to compile,
>>> found I was missing the perl Archive::Zip module. Not knowing what to
>>> do about that and too little time to find out, I did not pursue.
>> This indicates that you must install libarchive-zip-perl.
> Thanks. This seemed to get through that hang in the compile, but now
> it hangs because it can't find XML/LibXML.pm. I did a search for
> LibXML, and the obvious package, libxml-libxml-common-perl, did not
> help (I already had libxml2 installed).
$ apt-file search libXML.pm
Interesting.
$ apt-file search libXML.pm
$
>>> 5. Unoconv script is a debian package and seems what I really
>>> want. However, when I ran it, I found that it depends on JRE, although
>>> "$ aptitude show unoconv" indicates that it depends on python. In any
>>> case, I don't happen to have JRE installed in current box, and so did not
>>> pursue.
>> And you couldn't install it?
> No, the reason is that I'm working with temporary hardware and wanted
> to avoid doing that, but now I did install the Sun JRE. When I try to
> use unoconv on a .docx file I get:
> unoconv: UnoException during conversion: File could not be loaded by
> OpenOffice The provided document cannot be converted to the desired
> format.
> This sounds like it relies on OpenOffice, which I don't have
> installed.
Right. It appears that the unoconv package metadata is in error.
I'd file a bug asking for clarification.
--
Seek truth from facts.
--
> On Wed, 30 Jun 2010 14:48:49 -0400, brownh wrote:
>
>> I received a .docx file appended in an e-mail, and need to extract and
>> convert it to a convenient format such as .html, .pdf, or plain .txt.
>
> (...)
>
> If it's a simple file (just plain text) you can extract (unzip) the .docx
> into *.xml data for a direct view or convert into another suitable format.
Camaleón, I'm afraid you lost me. The file was .docx, which looks
binary. As a result, it's MIME'd in the mail message, which makes it
plain ASCII.
But apparently you mean that I can run unzip on the .docx file to
extract *.xml data. This was news to me, for I had no idea that .docx
was an archive. But I tried it, and a number of things happened. It
created an empty _rels directory; it created a docProps directory in
which are app.xml and core.xml, and it created a word/ directory in
which there are a number of *.xml files. None of these xml files are
understood by abiword.
Haines
--
Archive: http://lists.debian.org/87tyojuwnj.fsf@teufel.historicalMaterialism.info
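[Editor's sketch] On the MIME point raised above: the attachment travels base64-encoded inside the message, and decoding it gives back the binary .docx. The following is a minimal sketch using Python's standard email package, assuming the whole message has been saved to a file as raw bytes. The function name extract_docx_attachments and the file name message.eml are hypothetical and chosen only for this illustration.

```python
import email
from email import policy


def extract_docx_attachments(raw_message):
    """Return a list of (filename, payload_bytes) for each .docx attachment.

    raw_message is the complete message as bytes. Hypothetical helper,
    for illustration only.
    """
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    found = []
    # walk() visits every MIME part, including nested multipart containers.
    for part in msg.walk():
        name = part.get_filename() or ""
        if name.lower().endswith(".docx"):
            # decode=True undoes the base64 (or quoted-printable) transfer encoding.
            found.append((name, part.get_payload(decode=True)))
    return found
```

A possible use is to read the saved message with open("message.eml", "rb").read(), pass the bytes to the function, and write each returned payload to a file in binary mode before running unzip or a converter on it.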
07-01-2010, 02:39 PM
Camaleón
OTF conversion without OpenOffice
On Thu, 01 Jul 2010 10:03:28 -0400, brownh wrote:
> Camaleón writes:
>
>> (...)
>>
>> If it's a simple file (just plain text) you can extract (unzip) the
>> .docx into *.xml data for a direct view or convert into another
>> suitable format.
>
> Camaleón, I'm afraid you lost me. The file was .docx, which looks
> binary. As a result, it's MIME'd in the mail message, which makes it
> plain ASCII.
I was referring to the "content" of the .docx file, not the "nature" of
it :-).
If there are images or tables, it will be difficult to render them from the
xml file (images would be linked and tables would need a parser). But if
the .docx file just contains a bunch of text, it is easily readable
from the resulting xml file.
> But apparently you mean that I can run unzip on the .docx file to
> extract *.xml data. This was news to me, for I had no idea that .docx
> was an archive. But I tried it, and a number of things happened. It
> created an empty _rels directory; it created a docProps directory in
> which are app.xml and core.xml, and it created a word/ directory in
> which there are a number of *.xml files. None of these xml files are
> understood by abiword.
Yep. MS ".docx" format is far from ".odt" flexibility but it shares some
features. One is that the files are compressed and can be easily
extracted for raw reading.
"document.xml" is the main file, the one that contains the text of the
document. Being an XML file, it can be read with any editor (console
or GUI based) or any browser because it is just plain text. Of course, do
not expect to get the same layout you get when the ".docx" file is
opened with a word processor, but at least you can view the content of
the file :-)
Greetings,
--
Camaleón
--
Archive: http://lists.debian.org/pan.2010.07.01.14.39.22@gmail.com
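[Editor's sketch] Reading document.xml by eye works, but the text is buried in markup. The following is a minimal sketch, again in Python with only the standard library, that pulls the plain text out of word/document.xml without OpenOffice or any GUI. The function name docx_to_text is hypothetical. It assumes a simple document: formatting, images and table layout are discarded, and text inside unusual constructs such as text boxes may come out duplicated or out of order.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the WordprocessingML elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path_or_file):
    """Return the text of a .docx file, one line per paragraph.

    Hypothetical helper for illustration only; it ignores all formatting.
    """
    with zipfile.ZipFile(path_or_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    # Each w:p element is a paragraph; its text is split across w:t elements.
    for para in root.iter(W_NS + "p"):
        pieces = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

For example, print(docx_to_text("example.docx")) would write the text to the terminal, and redirecting that output to a file gives the plain .txt asked for at the start of the thread.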