08-10-2011, 04:29 PM
Patton Echols

Scripting / one liner help

I am looking for thoughts on how I might extract image names from an
HTML document.


The document started as a Word document with nothing but images, one per
page, randomly named. It was saved as HTML using LibreOffice, so I now
have the images as separate files. I have a script that will process them
through ImageMagick to clean them up, reduce them from full color to b/w
and make them into a PDF. But the pages are out of order because the
images are randomly named.


What I'd like to do is have something read the HTML file in order and
either feed the names of the JPGs to the script in that order or just write
them out to a file that I can feed to the script. The HTML source has
all the images listed sequentially without line breaks. Each tag is the
same except for the image name and looks like this:
<IMG SRC="source_html_m1463afff.jpg" NAME="graphics3" ALIGN=BOTTOM
WIDTH=575 HEIGHT=790 BORDER=0>


Thanks for any thoughts.
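
One possible approach, sketched here only as an illustration: parse the
HTML with Python's standard html.parser module and print the SRC value of
each IMG tag in the order it appears. The class name ImgSrcCollector, the
function image_names, and the file names in the usage comment are made up
for this example. It assumes the SRC values match the file names on disk;
if the export URL-encoded them, they would need decoding first.

```python
import sys
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the SRC attribute of every <IMG> tag in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names,
        # so <IMG SRC="..."> arrives here as tag "img" with key "src".
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_names(html_text):
    """Return the image file names in the order they occur in html_text."""
    parser = ImgSrcCollector()
    parser.feed(html_text)
    parser.close()
    return parser.sources


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 list_images.py source.html > order.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for name in image_names(fh.read()):
            print(name)
```

The resulting list, one name per line, could then be read by the existing
processing script so the pages come out in document order.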

 
