Current Limitations and Future Work

This section lists known limitations and plans for future work. The list is almost certainly incomplete.

  • We would like to make it easy to read data from and write data to the Hive metastore via the HCatalog APIs.
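  • The optimizer does not yet merge different groupByKey operations that run over the same input data into a single MapReduce job. Implementing this optimization should provide a major performance benefit for pipelines that group the same input in more than one way, since the input would be scanned and shuffled once instead of once per grouping.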
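
To make the second item more concrete, the following is a minimal, library-free sketch of the general idea behind merging two groupings of the same input into one pass. It is not how the optimizer works today and does not use any Crunch API. The class MergedGroupBySketch, the Visit record type, the "U"/"P" tags, and both helper methods are hypothetical names chosen for illustration. Each emitted key carries a tag naming the grouping it belongs to, so a single scan and a single shuffle can serve both groupings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MergedGroupBySketch {

    /** Hypothetical input record: one page visit by a user. */
    static final class Visit {
        final String user;
        final String url;

        Visit(String user, String url) {
            this.user = user;
            this.url = url;
        }
    }

    /**
     * Scans the input once and feeds two logical groupings.
     * Keys are prefixed with a tag ("U" = by user, "P" = by page)
     * so that both groupings share one in-memory "shuffle".
     */
    static Map<String, List<String>> groupBothInOnePass(List<Visit> visits) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        for (Visit v : visits) {
            shuffle.computeIfAbsent("U:" + v.user, k -> new ArrayList<>()).add(v.url);
            shuffle.computeIfAbsent("P:" + v.url, k -> new ArrayList<>()).add(v.user);
        }
        return shuffle;
    }

    /** Recovers one logical grouping from the shared shuffle by its tag. */
    static Map<String, List<String>> select(Map<String, List<String>> shuffle, String tag) {
        String prefix = tag + ":";
        Map<String, List<String>> out = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : shuffle.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                out.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Visit> visits = Arrays.asList(
            new Visit("ann", "/home"),
            new Visit("bob", "/home"),
            new Visit("ann", "/docs"));

        Map<String, List<String>> shuffle = groupBothInOnePass(visits);
        System.out.println("by user: " + select(shuffle, "U"));
        System.out.println("by page: " + select(shuffle, "P"));
    }
}
```

Without such merging, each grouping would need its own scan of the input and its own shuffle, which in MapReduce terms means a separate job per groupByKey.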