ApacheCon Europe 2012

Rhein-Neckar-Arena, Sinsheim, Germany

5–8 November 2012

What are we working on?

Steve Rowe

Audience level:
Advanced
Track:
Lucene, Solr & Friends

Wednesday 11 a.m.–noon in Level 1 Right

Description

This talk will give an overview of some improvements for future versions of Apache Lucene, including major efforts underway in feature branches and work being done during 2012's Google Summer of Code.

Abstract

The talk will summarize each feature, explain why it is important or useful, report its current status, and describe how you can help by contributing or testing.

Major themes:

  • Intblock Compression
    • Likely the new default index format for a future 4.x release
    • Better compression for structured data (e.g. database content)
    • Separate payloads/offsets from the prox stream
    • You don't "pay" for payloads except when you need them.
  • Positions Iterators
    • Fold Span*Query functionality into basic queries.
    • Enable efficient proximity scoring
    • Faster, more relevant highlighting (result snippets)
  • Docstore improvements
    • StorableField API improvements
    • Efficient compressed stored fields
    • New possibilities for term vectors
  • Pipe dreams (future-future)
    • Additional suggester implementations
    • Updatable documents in Lucene
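
To give a feel for the "Intblock Compression" theme above, here is a minimal sketch of block-based integer compression: a block of sorted document IDs is delta-encoded, and every delta is then packed with the same bit width. This is an illustration only and not Lucene's actual index format; the class name IntBlockSketch, its method names, and the layout (bit width stored in the first word) are assumptions made for the example.

```java
final class IntBlockSketch {

    /** Bits needed for the largest value in the block (at least 1). */
    static int bitsRequired(int[] values) {
        int or = 0;
        for (int v : values) {
            or |= v; // the OR has the same highest set bit as the maximum
        }
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(or));
    }

    /**
     * Delta-encodes sorted, non-negative doc IDs and packs each delta with a
     * fixed bit width. Word 0 of the result holds the bit width (hypothetical layout).
     */
    static long[] encode(int[] sortedDocIds) {
        int n = sortedDocIds.length;
        int[] deltas = new int[n];
        int prev = 0;
        for (int i = 0; i < n; i++) {
            deltas[i] = sortedDocIds[i] - prev;
            prev = sortedDocIds[i];
        }
        int bits = bitsRequired(deltas);
        long[] out = new long[1 + (int) (((long) n * bits + 63) / 64)];
        out[0] = bits;
        for (int i = 0; i < n; i++) {
            long bitPos = (long) i * bits;
            int word = 1 + (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long v = deltas[i] & 0xFFFFFFFFL;
            out[word] |= v << shift;
            if (shift + bits > 64) {
                // value straddles a word boundary: spill the high bits
                out[word + 1] |= v >>> (64 - shift);
            }
        }
        return out;
    }

    /** Reverses encode(): unpacks the deltas and rebuilds the doc IDs by prefix sum. */
    static int[] decode(long[] packed, int count) {
        int bits = (int) packed[0];
        long mask = (1L << bits) - 1;
        int[] docIds = new int[count];
        int prev = 0;
        for (int i = 0; i < count; i++) {
            long bitPos = (long) i * bits;
            int word = 1 + (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long v = packed[word] >>> shift;
            if (shift + bits > 64) {
                v |= packed[word + 1] << (64 - shift);
            }
            prev += (int) (v & mask);
            docIds[i] = prev;
        }
        return docIds;
    }
}
```

For the doc IDs 3, 7, 8, 20, 21, 150 the deltas are 3, 4, 1, 12, 1, 129, so each fits in 8 bits and the whole block packs into a single 64-bit word plus the header word, instead of six 32-bit integers.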