Webpage Crawler

Description

A Perl script that creates an index over several webpages. A second script creates an index over all words, with a ranking for each page.
A CGI script is also supplied to search the files and return ranked results.
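
The word index and ranked search described above can be illustrated with a small sketch. This is not the code from the tarball: it is written in Python rather than Perl, the function names `build_word_index` and `search` are hypothetical, and the ranking (raw word count per page) is an assumption, since this page does not say how the scripts rank pages.

```python
import re
from collections import Counter, defaultdict

def build_word_index(pages):
    """Map each word to {url: occurrence count} (assumed ranking: raw count)."""
    index = defaultdict(dict)
    for url, text in pages.items():
        counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
        for word, n in counts.items():
            index[word][url] = n
    return index

def search(index, query):
    """Return URLs containing any query word, highest score first."""
    scores = Counter()
    for word in re.findall(r"[a-z0-9]+", query.lower()):
        for url, n in index.get(word, {}).items():
            scores[url] += n
    # Sort by descending score, then by URL so ties have a stable order.
    return sorted(scores, key=lambda u: (-scores[u], u))

# Made-up example pages, for illustration only.
pages = {
    "http://example.org/a": "Perl crawler that indexes pages",
    "http://example.org/b": "Crawler notes: the crawler fetches pages",
}
index = build_word_index(pages)
results = search(index, "crawler")
```

Here `results` lists page `b` before page `a`, because `b` contains the query word twice. A CGI front end would only need to call something like `search` and print the URLs in order.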

Features

  • Fetches several pages at once
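
As a rough illustration of fetching several pages at once, the sketch below uses a thread pool. It is an assumption about the general technique, not a description of how the Perl script does it; `fetch`, `fetch_all`, and the `workers` parameter are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def fetch(url, timeout=10):
    """Download one page and return its body as text, or None on failure."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # OSError covers network and HTTP errors; ValueError covers malformed URLs.
        return None

def fetch_all(urls, fetcher=fetch, workers=4):
    """Fetch all URLs in parallel and return {url: body or None}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in the same order as the input URLs.
        bodies = list(pool.map(fetcher, urls))
    return dict(zip(urls, bodies))
```

Passing the download function as `fetcher` keeps the parallel part separate from the network code, so it can be tried out with a stand-in function and no network access.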

TODO

Documentation is poor.
Implement ONLYFILE similar to pagefetch.
Use a config file instead of editing the .pl file directly.
Consider using a common codebase (foo.pm) for pagefetch and crawl.
See TODO

Wishlist

Download

crawl_1.0.tar.bz2
Last Modified: Sun Nov 13 18:47:04 2005