ApacheCon Europe 2012

Rhein-Neckar-Arena, Sinsheim, Germany

5–8 November 2012

Handling RDF data with tools from the Hadoop ecosystem

Paolo Castagna

Audience level:
Advanced
Track:
Linked Data

Wednesday 4:45 p.m.–5:30 p.m. in Level 2 Left

Description

As open data and linked data communities grow, so do the number and average size of freely available datasets. Often these datasets are modelled and interlinked using RDF. This talk shares tips and tricks, use cases and practical examples of how to effectively use tools from the Hadoop ecosystem to process large RDF datasets.

Abstract

As open data and linked data communities grow, so do the number and average size of freely available datasets. Often these datasets are modelled and shared using RDF.

RDF offers a graph-based data model to represent, share, link and integrate data on the Web.
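
As a purely illustrative example (not material from the talk), a small RDF graph in the N-Triples serialization is simply a set of subject, predicate, object statements; the example.org resources below are hypothetical, while the properties come from the FOAF vocabulary:

<http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .
<http://example.org/bob> <http://xmlns.com/foaf/0.1/name> "Bob" .

Because each line is a self-contained statement, line-oriented formats such as N-Triples split cleanly across MapReduce input splits.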

This talk shares use cases, tips and tricks, and practical examples of how to use tools from the Hadoop ecosystem effectively to process large RDF datasets with Apache Jena, Hadoop MapReduce, Pig and Giraph.
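
As a minimal sketch of the kind of processing involved (not code from the talk), the mapper below counts predicate usage in an N-Triples dataset, parsing each line with Apache Jena. The Jena package names follow current releases; the 2012-era API lived under com.hp.hpl.jena instead.

import java.io.IOException;
import java.io.StringReader;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Statement;
import org.apache.jena.rdf.model.StmtIterator;

// Illustrative sketch: emits (predicate URI, 1) for every triple in an
// N-Triples input, one line per triple.
public class PredicateCountMapper
        extends Mapper<LongWritable, Text, Text, LongWritable> {

    private static final LongWritable ONE = new LongWritable(1);
    private final Text predicate = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString().trim();
        if (line.isEmpty() || line.startsWith("#")) {
            return; // skip blank lines and comments
        }
        // N-Triples is line-oriented, so each line can be parsed on its own.
        Model model = ModelFactory.createDefaultModel();
        model.read(new StringReader(line), null, "N-TRIPLES");
        StmtIterator it = model.listStatements();
        while (it.hasNext()) {
            Statement stmt = it.nextStatement();
            predicate.set(stmt.getPredicate().getURI());
            context.write(predicate, ONE); // a summing reducer totals these
        }
    }
}

Paired with a summing reducer such as Hadoop's stock LongSumReducer, this yields per-predicate triple counts, a simple way to profile a large RDF dataset before heavier processing.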