org.apache.any23.plugin.crawler
Class DefaultWebCrawler

java.lang.Object
  extended by edu.uci.ics.crawler4j.crawler.WebCrawler
      extended by org.apache.any23.plugin.crawler.DefaultWebCrawler
All Implemented Interfaces:
Runnable

public class DefaultWebCrawler
extends edu.uci.ics.crawler4j.crawler.WebCrawler

Default WebCrawler implementation.

Author:
Michele Mostarda (mostarda@fbk.eu)

Constructor Summary
DefaultWebCrawler()
           
 
Method Summary
 boolean shouldVisit(edu.uci.ics.crawler4j.url.WebURL url)
          Override this method to specify whether the given URL should be visited or not.
 void visit(edu.uci.ics.crawler4j.crawler.Page page)
          Override this method to implement the single page processing logic.
 
Methods inherited from class edu.uci.ics.crawler4j.crawler.WebCrawler
getMyController, getMyId, getMyLocalData, getThread, onBeforeExit, onStart, run, setMaximumCrawlDepth, setMyController, setMyId, setThread
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DefaultWebCrawler

public DefaultWebCrawler()

Method Detail

shouldVisit

public boolean shouldVisit(edu.uci.ics.crawler4j.url.WebURL url)
Override this method to specify whether the given URL should be visited or not.

Overrides:
shouldVisit in class edu.uci.ics.crawler4j.crawler.WebCrawler
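
As an illustration of the kind of decision a shouldVisit override makes, the following is a minimal sketch and not the Any23 implementation. It assumes a crawl limited to URLs under a seed prefix and an exclusion pattern for static resources. The class UrlFilterSketch, the EXCLUDED pattern, and the seed parameter are hypothetical, and the method takes plain strings rather than a crawler4j WebURL so that the example runs without the library.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical stand-in: works on plain strings instead of crawler4j's WebURL.
class UrlFilterSketch {

    // Assumed exclusion list: common static and binary file extensions.
    static final Pattern EXCLUDED =
            Pattern.compile(".*\\.(css|js|gif|jpe?g|png|pdf|zip|gz)$");

    // Returns true when the URL lies under the seed prefix and is not excluded.
    static boolean shouldVisit(String url, String seed) {
        if (url == null) {
            return false;
        }
        String href = url.toLowerCase(Locale.ROOT);
        if (!href.startsWith(seed.toLowerCase(Locale.ROOT))) {
            return false; // outside the crawl scope
        }
        return !EXCLUDED.matcher(href).matches();
    }

    public static void main(String[] args) {
        String seed = "http://example.org/";
        System.out.println(shouldVisit("http://example.org/docs/index.html", seed));
        System.out.println(shouldVisit("http://example.org/logo.png", seed));
    }
}
```

In a real subclass the same checks would presumably run against the string form of the WebURL argument.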

visit

public void visit(edu.uci.ics.crawler4j.crawler.Page page)
Override this method to implement the single page processing logic.

Overrides:
visit in class edu.uci.ics.crawler4j.crawler.WebCrawler
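
One common way to structure per-page processing is to forward each visited page to a set of registered listeners. The sketch below shows that pattern under stated assumptions and is not taken from the Any23 source: PageListener and VisitDispatchSketch are hypothetical names, and a page is represented by its URL string rather than a crawler4j Page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener type; not part of crawler4j or Any23.
interface PageListener {
    void visitedPage(String pageUrl);
}

// Hypothetical stand-in for a crawler whose visit step notifies listeners.
class VisitDispatchSketch {

    private final List<PageListener> listeners = new ArrayList<>();

    void addListener(PageListener listener) {
        listeners.add(listener);
    }

    // Stand-in for visit(Page): notifies every listener in registration order.
    void visit(String pageUrl) {
        for (PageListener listener : listeners) {
            listener.visitedPage(pageUrl);
        }
    }

    public static void main(String[] args) {
        VisitDispatchSketch crawler = new VisitDispatchSketch();
        crawler.addListener(url -> System.out.println("visited " + url));
        crawler.visit("http://example.org/");
    }
}
```

Keeping the processing logic in listeners leaves the crawler class itself small and lets callers attach their own handling without subclassing.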


Copyright © 2010-2012 The Apache Software Foundation. All Rights Reserved.