On Sun, Jan 20, 2008 at 08:15:25AM +0000, Wulfy wrote:
> [I sent this to Donn's private e-mail by mistake.. sorry Donn.]
>
> Donn wrote:
> > There is a gui that does this. It has a name so abysmal that I can't recall
> > it...
> >
> > I used this script once a few years ago to fetch a website.
> > It gets two parameters: url level
> > The level is how far down a chain of links it should go.
> > You could just replace the vars and run the command directly.
> > ===
> >
> > #!/bin/bash
> > # Try to make using wget easier than it bloody is.
> > url=$1
> > # Braces, not parentheses: exit inside ( ) would only leave the subshell.
> > if [ -z "$url" ]; then { echo "Bad url" >&2; exit 1; }; fi
> > LEV=$2
> > if [ -z "$LEV" ]; then
> > LEV="2"
> > fi
> >
> > echo "running: wget --convert-links -r -l$LEV $url -o log"
> > wget --convert-links -r -l"$LEV" "$url" -o log
> >
> > ===
> >
> > man wget is the best plan really.
> >
> >
> > d
> >
> >
> <sigh> I don't know what I'm doing wrong, but I can't get wget to get
> more than the top layer of the site. The archive.org site just brings
> in index.html (and robots.txt). I tried it on another site and it
> brought in the two versions of the main page (dialup and high speed) but
> the menu links weren't followed. I tried -l5 and -15 and got the same
> download.
>
> Any idea why the -r isn't recursing?
Have you used the underdocumented option to ignore robots.txt?
put
robots = off
in your .wgetrc, or use
-erobots=off
on the command line.
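
For reference, the two forms described above can be sketched roughly as follows. This is only an illustration: http://example.com/ is a placeholder URL rather than a site from this thread, and build_wget_cmd is a hypothetical helper that merely prints the command line, so nothing is fetched.

```shell
#!/bin/bash
# Sketch: two equivalent ways to tell wget to ignore robots.txt.

# 1) Persistent setting, written as it would appear in ~/.wgetrc:
#      robots = off

# 2) Per-invocation: -e executes a .wgetrc-style command for this run only.
# build_wget_cmd is a hypothetical helper that only prints the command line.
build_wget_cmd() {
    local url=$1 level=${2:-2}
    printf 'wget -e robots=off --convert-links -r -l%s %s\n' "$level" "$url"
}

# example.com is a placeholder URL (assumption), not a site from the thread.
build_wget_cmd "http://example.com/" 5
```

Running the printed command by hand (or dropping the printf and calling wget directly) would be the actual download step.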
>
> --
> Blessings
>
> Wulfmann
>
> Wulf Credo:
> Respect the elders. Teach the young. Co-operate with the pack.
> Play when you can. Hunt when you must. Rest in between.
> Share your affections. Voice your opinion. Leave your Mark.
> Copyright July 17, 1988 by Del Goetz
>
>
>
> --
> kubuntu-users mailing list
> kubuntu-users@lists.ubuntu.com
> Modify settings or unsubscribe at:
https://lists.ubuntu.com/mailman/listinfo/kubuntu-users
--
-=[L]=-
South Hampstead.
01-21-2008, 01:02 AM
Wulfy
wget problem
Lou Katz wrote:
> On Sun, Jan 20, 2008 at 08:15:25AM +0000, Wulfy wrote:
>
>> <sigh> I don't know what I'm doing wrong, but I can't get wget to get
>> more than the top layer of the site. The archive.org site just brings
>> in index.html (and robots.txt). I tried it on another site and it
>> brought in the two versions of the main page (dialup and high speed) but
>> the menu links weren't followed. I tried -l5 and -15 and got the same
>> download.
>>
>> Any idea why the -r isn't recursing?
>>
>
> Have you used the underdocumented option to ignore robots.txt?
>
> put
> robots = off
>
> in your .wgetrc, or use
>
> -erobots=off
>
> on the command line.
It turns out the problem was with robots.txt. I'll try your solution.
Thanks, Lou!
--
Blessings
Wulfmann
01-21-2008, 01:10 AM
Wulfy
wget problem
Wulfy wrote:
> Lou Katz wrote:
>
>> Have you used the underdocumented option to ignore robots.txt?
>> put
>> robots = off
>>
>> in your .wgetrc, or use
>>
>> -erobots=off
>>
>> on the command line.
>>
>
> It turns out the problem was with robots.txt. I'll try your solution.
> Thanks, Lou!
>
>
No go. It just stops wget from downloading the robots.txt file. I still get an error
"Couldn't download this page from the archive" in place of the next
file. :@(
It's looking more and more like the "easy" way to do this is manually...
--
Blessings
Wulfmann
08-20-2008, 11:22 PM
Paul
wget problem
Hi,
I'm trying to grab all of the .zip files from www.urbanfonts.com using
wget.
If I use flashgot, it works for the page I'm on, but the output from
flashgot doesn't tell me which options it used.
All of the files are in the form of xyz.php?fontname.zip.
Any ideas on how to get them more easily than doing every page with flashgot?
TTFN
Paul
--
You can tease me and get me really hot!
--
fedora-list mailing list
fedora-list@redhat.com
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
08-20-2008, 11:31 PM
Patrick Kaiser
wget problem
On Thu, Aug 21, 2008 at 12:22:37AM +0100, Paul wrote:
> Hi,
>
> I'm trying to grab all of the .zip files from www.urbanfonts.com using
> wget.
>
> If I use flashgot, it will work for the page I'm on, but the output from
> flashgot doesn't tell me which options are used?
>
> All of the files are in the form of xyz.php?fontname.zip.
>
> Any ideas on how to get them easier than doing every page with flashgot?
>
> TTFN
>
> Paul
Hi Paul,
Do you have a complete list of the font names? If so, you could fetch
them in a loop.
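
A minimal sketch of such a loop, assuming the font names have already been collected one per line in a file. The file name fontnames.txt and the helpers font_url and urls_from_list are hypothetical, and the URL pattern is only a guess built from the xyz.php?fontname.zip form given in the question; xyz.php is kept as the stand-in used there, since the real script name is not stated.

```shell
#!/bin/bash
# Sketch: build one download URL per font name and hand the list to wget.

# Hypothetical base URL; "xyz.php" is the stand-in used in the question.
base="http://www.urbanfonts.com/xyz.php"

# font_url NAME -> prints the URL for NAME.zip (hypothetical helper).
font_url() {
    printf '%s?%s.zip\n' "$base" "$1"
}

# urls_from_list FILE -> prints one URL per non-empty line of FILE.
urls_from_list() {
    local name
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] && font_url "$name"
    done < "$1"
    return 0
}

# Usage (not run here): fontnames.txt is an assumed one-name-per-line file.
#   urls_from_list fontnames.txt | wget -i - --wait=1
```

Feeding the URLs to a single wget via -i - keeps everything in one process, and --wait=1 spaces the requests out a little.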
08-20-2008, 11:42 PM
Paul
wget problem
Hi,
> do you have a complete list of the fontnames? Then you can try it in a
> loop?
'fraid not - the names are stored somewhere on the site and added as a
$POST (by the look of it) when you click on the link...
TTFN
Paul
08-21-2008, 01:22 AM
Patrick Kaiser
wget problem
On Thu, Aug 21, 2008 at 12:42:33AM +0100, Paul wrote:
> Hi,
>
> > do you have a complete list of the fontnames? Then you can try it in a
> > loop?
>
> 'fraid not - the names are stored somewhere on the site and added as a
> $POST (by the look of it) when you click on the link...
>
> TTFN
>
> Paul
Hey Paul,
Just try this; I don't have another solution:
mkdir /tmp/bla
cd /tmp/bla
wget -r http://www.urbanfonts.com
mkdir /tmp/blubb
cd /tmp/blubb
perl -e '@foo=`grep -r -i zip /tmp/bla/*`; foreach (@foo){ if (/(index\.php.*?)"/) {print "wget \"http://www.urbanfonts.com/scripts/$1\"\n";} };' | sh
Then you will have all the fonts in /tmp/blubb, named
index.php?<fontname.zip>
I hope this solution helps a bit.
You could also try wget's --spider option; I haven't tried it myself.
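
As a possible variation on the approach above, the link extraction can be done with grep alone, writing the URLs to a file instead of piping generated commands into sh. This is an untested-against-the-site sketch: the index.php pattern and the /scripts/ prefix are taken from the message above, while the function name extract_zip_links and the file urls.txt are hypothetical.

```shell
#!/bin/bash
# Sketch: pull index.php?...zip links out of already-mirrored HTML files.

# extract_zip_links DIR -> prints unique absolute URLs, one per line.
# The /scripts/ prefix is the one used in the message above.
extract_zip_links() {
    # -r recurse, -h no file names, -o print only the matched text,
    # -i ignore case. In a basic regex the "?" is a literal character.
    grep -r -h -o -i 'index\.php?[^"]*\.zip' "$1" |
        sort -u |
        sed 's|^|http://www.urbanfonts.com/scripts/|'
}

# Usage (not run here), with the directories from the message above:
#   extract_zip_links /tmp/bla > /tmp/blubb/urls.txt
#   wget -i /tmp/blubb/urls.txt -P /tmp/blubb
```

Keeping the URL list in a file makes it easy to inspect before anything is downloaded.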
08-21-2008, 02:05 AM
"S B"
wget problem
2008/8/20 Paul <paul@all-the-johnsons.co.uk>:
> Hi,
>
> I'm trying to grab all of the .zip files from www.urbanfonts.com using
> wget.
>
> If I use flashgot, it will work for the page I'm on, but the output from
> flashgot doesn't tell me which options are used?
>
> All of the files are in the form of xyz.php?fontname.zip.
>
> Any ideas on how to get them easier than doing every page with flashgot?
>
> TTFN
>
> Paul