Lens server monitoring

This section documents the metrics emitted by the Lens server, the admin REST endpoints, and query statistics.


The Lens server emits the following metrics for the query service:

  • Number of queued queries
  • Number of running queries
  • Number of finished queries in the server's memory
  • Total number of accepted queries
  • Total number of successful queries
  • Total number of finished queries
  • Total number of failed queries
  • Total number of cancelled queries
  • Number of result formatting errors
  • Total number of sessions opened since the server start/restart
  • Total number of closed sessions
  • Number of active sessions

The Lens server also emits the following metrics for other services:

  • Number of exceptions
  • Number of HTTP client errors
  • Number of HTTP errors
  • Number of HTTP server errors
  • Number of HTTP unknown errors
  • Number of HTTP requests started
  • Number of HTTP requests finished
  • Number of statistics store errors
  • Number of statistics log partition handler errors
  • Number of statistics log file scanner errors
  • Number of email notification errors

The Lens server can also be configured to emit metrics for resource methods. This is disabled by default and can be enabled with the property lens.server.enable.resource.method.metering. Metrics for resource methods are created lazily (as and when required) and consist of the following:

  • Number of hits
  • Timer for successful executions
  • Timer for failed executions

A timer can provide running averages, statistical values such as mean, median and quartiles, and histograms.
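To make the kind of values a timer reports more concrete, the following minimal sketch computes a mean, median and quartiles from a list of call durations. It is an illustration only and not how the Lens server computes its timers; the function name summarize_durations and the sample durations are made up for this example.

```python
import statistics


def summarize_durations(durations_ms):
    """Summarize call durations (in milliseconds) the way a timer might."""
    ordered = sorted(durations_ms)
    # quantiles(n=4) returns the three cut points: Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "median": median,
        "q1": q1,
        "q3": q3,
        "max": ordered[-1],
    }


# Hypothetical durations: eight fast calls and one slow outlier.
summary = summarize_durations([12, 15, 11, 240, 14, 13, 16, 12, 18])
print(summary)
```

In this sample the single slow call pulls the mean well above the median, which is one reason to look at percentile values alongside the mean.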

The Lens server also emits JVM, GC, memory and thread level metrics.

Supported reporting methods for the metrics emitted are the following:

  • Console reporting. Can be enabled by: lens.server.enable.console.metrics
  • CSV reporting. Can be configured by: lens.server.enable.csv.metrics, lens.server.metrics.csv.directory.path
  • Ganglia reporting. Can be configured by: lens.server.enable.ganglia.metrics, lens.server.metrics.ganglia.host, lens.server.metrics.ganglia.port
  • Graphite reporting. Can be configured by: lens.server.enable.graphite.metrics, lens.server.metrics.graphite.host, lens.server.metrics.graphite.port

Metrics are reported to the enabled reporting methods periodically. The reporting period can be configured by: lens.server.metrics.reporting.period
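As an illustration, the properties named in this section could be collected and rendered as configuration entries as sketched below. This is a sketch under assumptions: it assumes the server reads Hadoop-style property entries from a site configuration file (such as lens-site.xml), the host, port and period values are hypothetical placeholders, and render_properties is a helper written for this example rather than part of Lens.

```python
from xml.sax.saxutils import escape


def render_properties(props):
    """Render a dict as Hadoop-style <property> elements (assumed format)."""
    lines = []
    for name, value in props.items():
        lines.append("<property>")
        lines.append(f"  <name>{escape(name)}</name>")
        lines.append(f"  <value>{escape(str(value))}</value>")
        lines.append("</property>")
    return "\n".join(lines)


metrics_config = {
    "lens.server.enable.resource.method.metering": "true",
    "lens.server.enable.graphite.metrics": "true",
    # Hypothetical Graphite endpoint; replace with your own.
    "lens.server.metrics.graphite.host": "graphite.example.com",
    "lens.server.metrics.graphite.port": 2003,
    # Hypothetical value; the unit of the period is not stated in this section.
    "lens.server.metrics.reporting.period": 10,
}

rendered = render_properties(metrics_config)
print(rendered)
```

The printed entries would be placed inside the configuration element of the site file; check the server's configuration reference for the accepted values and units before using them.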

Critical Metrics

When resource method metering is enabled, up to 1000 different metrics may be emitted, which can make it hard for admins to know which ones to watch.

Along with the JVM, memory and thread count gauges, the following are some critical metrics that an admin can monitor:

  • lens.gauges.org.apache.lens.server.api.query.QueryExecutionService.running-queries.value
  • lens.gauges.org.apache.lens.server.api.query.QueryExecutionService.queued-queries.value
  • lens.gauges.org.apache.lens.server.api.query.QueryExecutionService.finished-queries.value

For all timers, an admin can look at the mean and/or p99 values and the exception.timer count. For example:

  • lens.timers.org.apache.lens.server.metastore.MetastoreResource.getLatestDateOfCube.GET.exception.timer.count
  • lens.timers.org.apache.lens.server.metastore.MetastoreResource.getLatestDateOfCube.GET.timer.mean
  • lens.timers.org.apache.lens.server.metastore.MetastoreResource.getLatestDateOfCube.GET.timer.p99
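One way to narrow a large set of metric names down to the critical ones listed above is to filter by prefix and suffix. The sketch below is a hypothetical helper, not part of Lens; the names ending in timer.max and in a plain timer.count are invented only to show what gets filtered out.

```python
CRITICAL_GAUGE_SUFFIXES = (
    "QueryExecutionService.running-queries.value",
    "QueryExecutionService.queued-queries.value",
    "QueryExecutionService.finished-queries.value",
)
# Timer mean and p99 values, plus the count of the exception timer.
CRITICAL_TIMER_SUFFIXES = (".timer.mean", ".timer.p99", ".exception.timer.count")


def is_critical(metric_name):
    """Return True if the metric name matches one of the critical patterns."""
    if metric_name.startswith("lens.gauges."):
        return metric_name.endswith(CRITICAL_GAUGE_SUFFIXES)
    if metric_name.startswith("lens.timers."):
        return metric_name.endswith(CRITICAL_TIMER_SUFFIXES)
    return False


resource = "lens.timers.org.apache.lens.server.metastore.MetastoreResource.getLatestDateOfCube.GET"
sample_names = [
    "lens.gauges.org.apache.lens.server.api.query.QueryExecutionService.running-queries.value",
    resource + ".exception.timer.count",
    resource + ".timer.mean",
    resource + ".timer.p99",
    resource + ".timer.max",    # hypothetical name, filtered out
    resource + ".timer.count",  # hypothetical name, filtered out
]

critical = [name for name in sample_names if is_critical(name)]
for name in critical:
    print(name)
```

Note that this simple suffix match also keeps the mean and p99 of exception timers, since their names end in the same suffixes.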

REST endpoints

The Lens server provides an admin endpoint at host:port/admin, with endpoints for ping, metrics, threads and healthcheck.

  • ping : admin/ping responds with pong if the server is up
  • metrics : admin/metrics responds with all metrics as a text file in JSON format
  • healthcheck : admin/healthcheck is not implemented yet
  • threads : admin/threads returns a thread dump of the server
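The ping and metrics endpoints can be polled from a small script. The following is a minimal sketch using only the Python standard library; it assumes the server is reachable over plain HTTP, and the base URL in the usage comment is a hypothetical placeholder. The helper names are made up for this example.

```python
import json
import urllib.request

ADMIN_PATHS = {
    "ping": "admin/ping",
    "metrics": "admin/metrics",
    "threads": "admin/threads",
}


def admin_url(base_url, endpoint):
    """Build the URL of an admin endpoint from a base such as http://host:port."""
    return base_url.rstrip("/") + "/" + ADMIN_PATHS[endpoint]


def fetch_text(url, timeout=5.0):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def is_up(base_url, fetch=fetch_text):
    """Return True if admin/ping answers with pong, False on any network error."""
    try:
        return fetch(admin_url(base_url, "ping")).strip().lower() == "pong"
    except OSError:  # urllib's URLError and socket timeouts are OSError subclasses
        return False


def get_metrics(base_url, fetch=fetch_text):
    """Fetch admin/metrics and parse the JSON body into Python objects."""
    return json.loads(fetch(admin_url(base_url, "metrics")))


# Usage (hypothetical host and port):
#   if is_up("http://lens.example.com:8080"):
#       metrics = get_metrics("http://lens.example.com:8080")
```

The fetch parameter lets the network call be swapped for a stub, which keeps the helpers easy to test without a running server.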

Query Statistics

The Lens server can be configured to emit query-related statistics to a Hive table, QueryExecutionStatistics. The statistics service is configured through the following properties:

  • lens.statistics.warehouse.dir : the HDFS location where the query statistics log files will be persisted
  • lens.statistics.db : the database that will contain all statistics-related tables
  • lens.log.rollover.interval : the time interval at which the service checks the log file for rollover

Statistics can be disabled by setting lens.server.statistics.store.class to an empty string. The statistics service works by watching for rollovers of the query-stats.log file and adding an appropriate partition based on each rolled-over file. The statistics can be queried using Hive queries.