Old 03-13-2011, 09:49 PM
Nicholas Bamber
 
Bug#618281: ITP: libwww-robotrules-perl -- database of robots.txt-derived permissions

Package: wnpp
Owner: Nicholas Bamber <nicholas@periapt.co.uk>
Severity: wishlist
X-Debbugs-CC: debian-devel@lists.debian.org,debian-perl@lists.debian.org

* Package name    : libwww-robotrules-perl
  Version         : 6.01
  Upstream Author : Gisle Aas <gisle@activestate.com>
* URL             : http://search.cpan.org/dist/WWW-RobotRules/
* License         : Artistic or GPL-1+
  Programming Lang: Perl
  Description     : database of robots.txt-derived permissions

WWW::RobotRules parses /robots.txt files as specified in "A Standard for
Robot Exclusion", at <http://www.robotstxt.org/wc/norobots.html>. Webmasters
can use the /robots.txt file to forbid conforming robots from accessing
parts of their web site.

The parsed files are kept in a WWW::RobotRules object, and this object
provides methods to check if access to a given URL is prohibited. The same
WWW::RobotRules object can be used for one or more parsed /robots.txt files
on any number of hosts.
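
The parse-then-query pattern described above can be sketched in Python with
the standard library's urllib.robotparser. This is only an analogous
illustration and not the WWW::RobotRules API: the RobotRulesStore class, the
"ExampleBot" agent name, and the example hosts are all hypothetical, and
treating a host with no recorded rules as unrestricted is an assumption of
this sketch.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


class RobotRulesStore:
    """Hypothetical helper: one object holding parsed robots.txt rules for many hosts."""

    def __init__(self, agent):
        self.agent = agent
        self._rules = {}  # maps a URL's netloc (host[:port]) to a RobotFileParser

    def parse(self, robots_url, text):
        """Record the rules from one robots.txt file, keyed by the host it came from."""
        parser = RobotFileParser()
        parser.parse(text.splitlines())
        self._rules[urlsplit(robots_url).netloc] = parser

    def allowed(self, url):
        """Return True if the stored rules let this agent fetch the URL."""
        parser = self._rules.get(urlsplit(url).netloc)
        if parser is None:
            # Assumption of this sketch: no recorded rules means no restriction.
            return True
        return parser.can_fetch(self.agent, url)


store = RobotRulesStore("ExampleBot")
store.parse("http://a.example/robots.txt", "User-agent: *\nDisallow: /private/\n")
store.parse("http://b.example/robots.txt", "User-agent: ExampleBot\nDisallow: /\n")

print(store.allowed("http://a.example/index.html"))  # outside the disallowed prefix
print(store.allowed("http://a.example/private/x"))   # under /private/ on host a
print(store.allowed("http://b.example/anything"))    # host b bars ExampleBot entirely
```

Keying the stored rules by host is what lets a single object answer queries
for URLs on different sites, which mirrors the "any number of hosts" behaviour
described for the Perl module.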



Archive: http://lists.debian.org/4D7D4A16.50505@periapt.co.uk
 
