
Pagefetcher

Description

A small Perl script to fetch web pages. It reads a start file, fetches all URLs listed in that file, and then fetches all links found in the fetched pages. This is similar to wget -r -l 1 starturl.html, except that one can specify which pages must not be fetched.
It was made to fetch news pages from sites that let their pages expire, so that one can build one's own archive (for free).
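
The behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the Perl code from the tarball: the function names, the use of regular expressions as disallow rules, and the injected fetch callable are all assumptions made for the example.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href targets of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return the absolute URLs of all links found in html."""
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.hrefs]


def is_allowed(url, disallow_patterns):
    """Hypothetical rule format: a URL is skipped if any regex matches it."""
    return not any(re.search(pattern, url) for pattern in disallow_patterns)


def crawl_one_level(start_urls, fetch, disallow_patterns):
    """Fetch the start URLs and every allowed link one level below them.

    fetch is any callable mapping a URL to the page text; it is passed in
    so the sketch does not depend on a particular HTTP library.
    """
    pages = {}
    for start in start_urls:
        if start in pages or not is_allowed(start, disallow_patterns):
            continue
        pages[start] = fetch(start)
        for link in extract_links(pages[start], start):
            if link not in pages and is_allowed(link, disallow_patterns):
                pages[link] = fetch(link)
    return pages
```

The start URLs would be read from the start file, one per line. Links found in the second-level pages are not followed, which matches a recursion depth of 1.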

Features

  • Does not refetch pages that are already there
  • Fetches several pages at the same time
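
As an illustration of how these two features can fit together, here is a minimal Python sketch. It is not taken from the script itself: the hashed file names, the thread pool, and the names local_path and fetch_missing are assumptions, and the actual Perl script may store files and parallelise downloads quite differently.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def local_path(url, directory):
    """Hypothetical naming scheme: one file per URL, named by its SHA-1 hash."""
    name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".html"
    return os.path.join(directory, name)


def fetch_missing(urls, directory, fetch, workers=4):
    """Download only the URLs that have no local file yet, several at a time.

    fetch is any callable mapping a URL to the page content as bytes.
    Returns the list of URLs that were actually downloaded.
    """
    os.makedirs(directory, exist_ok=True)
    # dict.fromkeys drops duplicate URLs while keeping their order.
    todo = [url for url in dict.fromkeys(urls)
            if not os.path.exists(local_path(url, directory))]

    def download(url):
        data = fetch(url)
        with open(local_path(url, directory), "wb") as handle:
            handle.write(data)
        return url

    # The pool runs up to `workers` downloads concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(download, todo))
```

Checking for the local file before queuing a download means a second run over the same start file only costs the lookups, which is what makes repeated archiving runs cheap.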

TODO

Documentation is poor.
Consider fetching pictures as well, while still respecting the disallow rules.

Wishlist

-

Download

pagefetcher_1.0.tar.bz2
Last Modified: Sun Nov 13 18:00:13 2005