ApacheCon Europe 2012

Rhein-Neckar-Arena, Sinsheim, Germany

5–8 November 2012

Large scale crawling with Apache Nutch

Julien Nioche

Audience level:
Beginner
Track:
Lucene, Solr & Friends

Wednesday 10 a.m.–10:45 a.m. in Level 1 Right

Description

This talk will give an overview of Apache Nutch, its main components, how it fits with other Apache projects and its latest developments.

Abstract

This talk will give an overview of Apache Nutch. I will describe its main components and how it fits with other Apache projects such as Hadoop, Lucene, Solr, Tika, and HBase.

The second part of the presentation will focus on the latest developments in Nutch and the changes introduced by the brand new version 2.0.