08-10-2011, 04:29 PM
Patton Echols
Scripting / one liner help

I am looking for thoughts on how I might extract image names from an
HTML document.

The document started as a Word document with nothing but images, one per
page, randomly named. It was saved as HTML using LibreOffice, so I now
have the images as separate files. I have a script that will process them
through ImageMagick to clean them up, reduce them from full color to b/w
and make them into a PDF. But the pages are out of order because the
images are randomly named.

What I'd like to do is have something read the HTML file in order and
either feed the names of the JPGs to the script in order or just write
them out to a file that I can feed to the script. The HTML source has
all the images listed sequentially without line breaks. Each tag is the
same except for the image name and looks like this:
<IMG SRC="source_html_m1463afff.jpg" NAME="graphics3" ALIGN=BOTTOM

Thanks for any thoughts.
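
One possible approach, offered as a minimal sketch rather than a definitive
answer: parse the HTML with Python's standard html.parser module and print
the SRC value of each IMG tag in the order the tags appear. Because the
parser does not depend on line breaks, it should cope with all the tags
sitting on one line. The names ImgSrcCollector, image_names and
list_images.py are made up for illustration, and the sketch assumes the
file decodes as UTF-8 (undecodable bytes are replaced).

```python
import sys
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the SRC attribute of every IMG tag, in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names,
        # so <IMG SRC="..."> arrives here as tag "img" with attribute "src".
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_names(html_text):
    """Return the image file names referenced in html_text, in order."""
    parser = ImgSrcCollector()
    parser.feed(html_text)
    parser.close()
    return parser.sources


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 list_images.py source.html > order.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for name in image_names(fh.read()):
            print(name)
```

Each name is printed on its own line, so the output can be redirected to a
file or piped to the existing processing script, for example with xargs,
assuming that script accepts the file names as arguments.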

