Welcome to Lens!

Lens provides a unified analytics interface. It aims to break down data analytics silos by providing a single view of data across multiple tiered data stores and an optimal execution environment for each analytical query. It integrates Hadoop with traditional data warehouses so that they appear as one.

At a high level, the project provides these features:

  • Simple metadata layer that provides an abstract view over tiered data stores.
  • Single shared schema server based on the Hive Metastore. This schema is shared by data pipelines (HCatalog) and analytics applications.
  • OLAP Cube QL, a high-level SQL-like language to query and describe data sets organized in data cubes.
  • A JDBC driver and Java client libraries to issue queries, and a CLI for ad hoc queries.
  • Lens application server: a REST server that lets users query data, make schema changes, schedule queries, and enforce quota limits on queries.
  • Driver-based architecture that allows plugging in reporting systems such as Hive, columnar data warehouses, and Redshift.
  • Cost-based engine selection, which allows optimal use of resources by selecting the best execution engine for a given query based on the query cost.

The following diagram shows the Lens architecture.

Figure: Lens Architecture

Disclaimer

Apache Lens is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.