On Wed, 06/08/2008 at 10.08 +0200, Markus Schönhaber wrote:
> John Toliver wrote:
>
> > So my question to start is which language should I use to pull the
> > data out of an html file?
>
> The one that you're familiar with is, IMO, the primary choice.
>
> > Is perl better for this application, or is
> > python better or some other language?
>
> I'm not too familiar with Perl but have done quite some Python
> programming over the years. Therefore I don't have an unbiased view in
> this regard, nevertheless I doubt that one has a massive advantage over
> the other when it comes to text processing.
>
Well, I tend to disagree, but then I'm Perl-biased, so maybe my
advice is to be taken "cum grano salis".
> > I'm probably going to need to brush up on my regular expressions for
> > this but that's ok too.
> >
> > Any thoughts would be appreciated...
>
There is a wonderful book on regular expressions in the O'Reilly series,
"Mastering Regular Expressions" by Jeffrey Friedl, which explains how
to use them in different languages.
If you decide on Perl (not PERL, that is another thing...), you might
find the HTML::Tree module useful
(http://search.cpan.org/~petek/HTML-Tree-3.23/lib/HTML/Tree.pm)
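
For comparison, should you go the Python way instead, here is a minimal
sketch of the same idea (let a parser walk the HTML rather than
regex-matching the raw text), using only the standard library's
html.parser module. The names CellExtractor and extract_rows and the
sample table are made up for illustration; I am assuming your data sits
in plain <tr>/<td> cells, which may well not be the case for your file.

```python
from html.parser import HTMLParser


class CellExtractor(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row.

    Hypothetical helper for illustration; assumes simple, non-nested tables.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text may arrive in several chunks, so gather them and join later.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html):
    """Return the table cells found in an HTML string as a list of rows."""
    parser = CellExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample input, standing in for the real html file.
sample = (
    "<table><tr><td>apple</td><td> 3 </td></tr>"
    "<tr><td>pear</td><td>5</td></tr></table>"
)
print(extract_rows(sample))
```

For a real file you would read its contents into a string and pass that
to extract_rows; messier pages will probably need more tag handling than
this sketch has.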
> ...snip....
> To sum things up: IMO there is not the one best way or the one best
> programming language to get the desired result. What's best for you
> largely depends on what you're familiar with and what matches your
> personal preference best.
>
And this is nothing but the truth.
Enjoy
--
Leo Cacciari