- datafu.hourglass.avro - package datafu.hourglass.avro
-
Input and output formats for using Avro in incremental Hadoop jobs.
- datafu.hourglass.fs - package datafu.hourglass.fs
-
Classes for working with the file system.
- datafu.hourglass.jobs - package datafu.hourglass.jobs
-
Incremental Hadoop jobs and some supporting classes.
- datafu.hourglass.mapreduce - package datafu.hourglass.mapreduce
-
Implementations of mappers, combiners, and reducers used by incremental jobs.
- datafu.hourglass.model - package datafu.hourglass.model
-
Interfaces which define the incremental processing model.
- datafu.hourglass.schemas - package datafu.hourglass.schemas
-
Classes that help manage the Avro schemas used by the jobs.
- datedPathFormat - Static variable in class datafu.hourglass.fs.PathUtils
-
- DatePath - Class in datafu.hourglass.fs
-
Represents a path and the corresponding date that is associated with it.
- DatePath(Date, Path) - Constructor for class datafu.hourglass.fs.DatePath
-
- DateRange - Class in datafu.hourglass.fs
-
A date range, consisting of a start and end date.
- DateRange(Date, Date) - Constructor for class datafu.hourglass.fs.DateRange
-
- DateRangeConfigurable - Interface in datafu.hourglass.jobs
-
An interface for an object with a configurable output date range.
- DateRangePlanner - Class in datafu.hourglass.jobs
-
Determines the date range of inputs which should be processed.
- DateRangePlanner() - Constructor for class datafu.hourglass.jobs.DateRangePlanner
-
- DelegatingCombiner - Class in datafu.hourglass.mapreduce
-
A Hadoop combiner which delegates to an implementation read from the distributed cache.
- DelegatingCombiner() - Constructor for class datafu.hourglass.mapreduce.DelegatingCombiner
-
- DelegatingMapper - Class in datafu.hourglass.mapreduce
-
A Hadoop mapper which delegates to an implementation read from the distributed cache.
- DelegatingMapper() - Constructor for class datafu.hourglass.mapreduce.DelegatingMapper
-
- DelegatingReducer - Class in datafu.hourglass.mapreduce
-
A Hadoop reducer which delegates to an implementation read from the distributed cache.
- DelegatingReducer() - Constructor for class datafu.hourglass.mapreduce.DelegatingReducer
-
- determineAvailableInputDates() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Determines what input data is available.
- determineDateRange() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Determine the date range for inputs to process based on the configuration and available inputs.
- DistributedCacheHelper - Class in datafu.hourglass.mapreduce
-
Methods for working with the Hadoop distributed cache.
- DistributedCacheHelper() - Constructor for class datafu.hourglass.mapreduce.DistributedCacheHelper
-
- getAccumulator() - Method in class datafu.hourglass.mapreduce.CollapsingCombiner
-
Gets the accumulator used to perform aggregation.
- getAccumulator() - Method in class datafu.hourglass.mapreduce.PartitioningCombiner
-
Gets the accumulator used to perform aggregation.
- getAccumulator() - Method in class datafu.hourglass.mapreduce.PartitioningReducer
-
Gets the accumulator used to perform aggregation.
- getAvailableInputsByDate() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Gets a map from date to available input data.
- getBeginDate() - Method in class datafu.hourglass.fs.DateRange
-
- getCombineInputs() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob
-
Gets whether inputs should be combined.
- getCombineProcessor() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob
-
- getCombinerAccumulator() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob
-
Gets the accumulator used for the combiner.
- getCombinerAccumulator() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob
-
Gets the accumulator used for the combiner.
- getCombinerAccumulator() - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
- getCombinerAccumulator() - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
- getCombinerClass() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob
-
Gets the combiner class.
- getConf() - Method in class datafu.hourglass.jobs.TimePartitioner
-
- getContext() - Method in class datafu.hourglass.mapreduce.ObjectProcessor
-
- getCountersParentPath() - Method in class datafu.hourglass.jobs.AbstractJob
-
Gets the path where counters will be stored.
- getCountersParentPath() - Method in class datafu.hourglass.jobs.StagedOutputJob
-
Gets the path where counters will be stored.
- getCountersPath() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob.Report
-
Gets the path to the counters file, if one was written.
- getCountersPath() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob.Report
-
Gets the path to the counters file, if one was written.
- getCountersPath() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob.Report
-
Gets the path to the counters file, if one was written.
- getCountersPath() - Method in class datafu.hourglass.jobs.StagedOutputJob
-
Gets the path to the written counters.
- getCurrentDateRange() - Method in class datafu.hourglass.jobs.PartitionCollapsingExecutionPlanner
-
- getDailyData(Path) - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Get a map from date to path for all paths matching yyyy/MM/dd under the given path.
- getDate() - Method in class datafu.hourglass.fs.DatePath
-
- getDatedData(Path) - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Get a map from date to path for all paths matching yyyyMMdd under the given path.
- getDatedIntermediateValueSchema() - Method in class datafu.hourglass.schemas.PartitionCollapsingSchemas
-
- getDateForDatedPath(Path) - Static method in class datafu.hourglass.fs.PathUtils
-
Gets the date for a path in the "yyyyMMdd" format.
- getDateForNestedDatedPath(Path) - Static method in class datafu.hourglass.fs.PathUtils
-
Gets the date for a path in the "yyyy/MM/dd" format.
- getDateRange(Date, Date, Collection<Date>, Integer, Integer, boolean) - Static method in class datafu.hourglass.jobs.DateRangePlanner
-
Determines the date range of inputs which should be processed.
- getDateRange() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Gets the desired input date range to process based on the configuration and available inputs.
- getDatesToProcess() - Method in class datafu.hourglass.jobs.PartitionPreservingExecutionPlanner
-
Gets the input dates which are to be processed.
- getDaysAgo() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Gets the number of days to subtract off the end date.
- getDaysAgo() - Method in class datafu.hourglass.jobs.TimeBasedJob
-
Gets the number of days to subtract off the end of the consumption window.
- getEndDate() - Method in class datafu.hourglass.fs.DateRange
-
- getEndDate() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Gets the end date.
- getEndDate() - Method in class datafu.hourglass.jobs.TimeBasedJob
-
Gets the end date.
- getFileSystem() - Method in class datafu.hourglass.jobs.AbstractJob
-
Gets the file system.
- getFileSystem() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Gets the file system.
- getFinal() - Method in interface datafu.hourglass.model.Accumulator
-
Get the output value corresponding to all input values accumulated so far.
- getInputFiles() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob.Report
-
Gets input files that were processed.
- getInputFiles() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob.Report
-
Gets new input files that were processed.
- getInputFiles() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob.Report
-
Gets input files that were processed.
- getInputKeySchemaForSplit(Configuration, InputSplit) - Static method in class datafu.hourglass.avro.AvroMultipleInputsUtil
-
Gets the schema for a particular input split.
- getInputPaths() - Method in class datafu.hourglass.jobs.AbstractJob
-
Gets the input paths.
- getInputPaths() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Gets the input paths.
- getInputSchemas() - Method in class datafu.hourglass.jobs.PartitionCollapsingExecutionPlanner
-
Gets the input schemas.
- getInputSchemas() - Method in class datafu.hourglass.jobs.PartitionPreservingExecutionPlanner
-
Gets the input schemas.
- getInputSchemasByPath() - Method in class datafu.hourglass.jobs.PartitionCollapsingExecutionPlanner
-
Gets a map from input path to schema.
- getInputSchemasByPath() - Method in class datafu.hourglass.jobs.PartitionPreservingExecutionPlanner
-
Gets a map from input path to schema.
- getInputsToProcess() - Method in class datafu.hourglass.jobs.PartitionCollapsingExecutionPlanner
-
Gets all inputs that will be processed.
- getInputsToProcess() - Method in class datafu.hourglass.jobs.PartitionPreservingExecutionPlanner
-
Gets the inputs which are to be processed.
- getIntermediateValueSchema() - Method in class datafu.hourglass.jobs.IncrementalJob
-
Gets the Avro schema for the intermediate value.
- getIntermediateValueSchema() - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
- getIntermediateValueSchema() - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
- getIntermediateValueSchema() - Method in class datafu.hourglass.schemas.PartitionCollapsingSchemas
-
- getIntermediateValueSchema() - Method in class datafu.hourglass.schemas.PartitionPreservingSchemas
-
- getIntermediateValueSchema() - Method in class datafu.hourglass.schemas.TaskSchemas
-
- getJobId() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob.Report
-
Gets the job ID.
- getJobId() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob.Report
-
Gets the job ID.
- getJobId() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob.Report
-
Gets the job ID.
- getJobName() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob.Report
-
Gets the job name.
- getJobName() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob.Report
-
Gets the job name.
- getJobName() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob.Report
-
Gets the job name.
- getKeySchema() - Method in class datafu.hourglass.jobs.IncrementalJob
-
Gets the Avro schema for the key.
- getKeySchema() - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
- getKeySchema() - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
- getKeySchema() - Method in class datafu.hourglass.schemas.PartitionCollapsingSchemas
-
- getKeySchema() - Method in class datafu.hourglass.schemas.PartitionPreservingSchemas
-
- getKeySchema() - Method in class datafu.hourglass.schemas.TaskSchemas
-
- getMapInputSchemas() - Method in class datafu.hourglass.schemas.PartitionCollapsingSchemas
-
- getMapInputSchemas() - Method in class datafu.hourglass.schemas.PartitionPreservingSchemas
-
- getMapOutputKeySchema() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob
-
Gets the key schema for the map output.
- getMapOutputKeySchema() - Method in class datafu.hourglass.schemas.PartitionCollapsingSchemas
-
- getMapOutputKeySchema() - Method in class datafu.hourglass.schemas.PartitionPreservingSchemas
-
- getMapOutputSchema() - Method in class datafu.hourglass.schemas.PartitionCollapsingSchemas
-
- getMapOutputSchema() - Method in class datafu.hourglass.schemas.PartitionPreservingSchemas
-
- getMapOutputValueSchema() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob
-
Gets the value schema for the map output.
- getMapOutputValueSchema() - Method in class datafu.hourglass.schemas.PartitionCollapsingSchemas
-
- getMapOutputValueSchema() - Method in class datafu.hourglass.schemas.PartitionPreservingSchemas
-
- getMapper() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob
-
Gets the mapper.
- getMapper() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob
-
Gets the mapper.
- getMapper() - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
- getMapper() - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
- getMapper() - Method in class datafu.hourglass.mapreduce.CollapsingMapper
-
Gets the mapper.
- getMapper() - Method in class datafu.hourglass.mapreduce.PartitioningMapper
-
Gets the mapper.
- getMapperClass() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob
-
Gets the mapper class.
- getMapProcessor() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob
-
- getMaxIterations() - Method in class datafu.hourglass.jobs.IncrementalJob
-
Gets the maximum number of iterations for the job.
- getMaxToProcess() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Gets the maximum number of days to process at a time.
- getMaxToProcess() - Method in class datafu.hourglass.jobs.IncrementalJob
-
Gets the maximum number of days of input data to process in a single run.
- getName() - Method in class datafu.hourglass.jobs.AbstractJob
-
Gets the job name.
- getNeedsAnotherPass() - Method in class datafu.hourglass.jobs.PartitionCollapsingExecutionPlanner
-
Gets whether another pass will be required.
- getNeedsAnotherPass() - Method in class datafu.hourglass.jobs.PartitionPreservingExecutionPlanner
-
Gets whether another pass will be required.
- getNestedPathRoot(Path) - Static method in class datafu.hourglass.fs.PathUtils
-
Gets the root path for a path in the "yyyy/MM/dd" format.
- getNewAccumulator() - Method in class datafu.hourglass.mapreduce.CollapsingReducer
-
- getNewInputsToProcess() - Method in class datafu.hourglass.jobs.PartitionCollapsingExecutionPlanner
-
Gets only the new data that will be processed.
- getNumDays() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Gets the number of days to process.
- getNumDays() - Method in class datafu.hourglass.jobs.TimeBasedJob
-
Gets the number of consecutive days to process.
- getNumReducers() - Method in class datafu.hourglass.jobs.AbstractJob
-
Gets the number of reducers to use.
- getNumReducers() - Method in class datafu.hourglass.jobs.PartitionCollapsingExecutionPlanner
-
Get the number of reducers to use based on the input and previous output data size.
- getNumReducers() - Method in class datafu.hourglass.jobs.PartitionPreservingExecutionPlanner
-
Get the number of reducers to use based on the input data size.
- getNumReducers() - Method in class datafu.hourglass.jobs.ReduceEstimator
-
- getOldAccumulator() - Method in class datafu.hourglass.mapreduce.CollapsingReducer
-
- getOldInputFiles() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob.Report
-
Gets old input files that were processed.
- getOldInputsToProcess() - Method in class datafu.hourglass.jobs.PartitionCollapsingExecutionPlanner
-
Gets only the old data that will be processed.
- getOldRecordMerger() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob
-
Gets the record merger that is capable of unmerging old partial output from the new output.
- getOldRecordMerger() - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
- getOutputFile() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob.Report
-
Gets the output file that was produced by the job.
- getOutputFileDateRange(FileSystem, Path) - Static method in class datafu.hourglass.avro.AvroDateRangeMetadata
-
Reads the date range from the metadata stored in an Avro file.
- getOutputFiles() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob.Report
-
Gets the output files that were produced by the job.
- getOutputPath() - Method in class datafu.hourglass.jobs.AbstractJob
-
Gets the output path.
- getOutputPath() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob.Report
-
Gets the path to the output which was produced by the job.
- getOutputPath() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Gets the output path.
- getOutputSchemaName() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob
-
Get the name for the reduce output schema.
- getOutputSchemaName() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob
-
Get the name for the reduce output schema.
- getOutputSchemaNamespace() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob
-
Get the namespace for the reduce output schema.
- getOutputSchemaNamespace() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob
-
Get the namespace for the reduce output schema.
- getOutputValueSchema() - Method in class datafu.hourglass.jobs.IncrementalJob
-
Gets the Avro schema for the output data.
- getOutputValueSchema() - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
- getOutputValueSchema() - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
- getOutputValueSchema() - Method in class datafu.hourglass.schemas.PartitionCollapsingSchemas
-
- getOutputValueSchema() - Method in class datafu.hourglass.schemas.PartitionPreservingSchemas
-
- getOutputValueSchema() - Method in class datafu.hourglass.schemas.TaskSchemas
-
- getPartition(AvroKey<GenericRecord>, AvroValue<GenericRecord>, int) - Method in class datafu.hourglass.jobs.TimePartitioner
-
- getPath() - Method in class datafu.hourglass.fs.DatePath
-
- getPreviousOutputToProcess() - Method in class datafu.hourglass.jobs.PartitionCollapsingExecutionPlanner
-
Gets the previous output to reuse, or null if no output is being reused.
- getProperties() - Method in class datafu.hourglass.jobs.AbstractJob
-
Gets the configuration properties.
- getProps() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Gets the configuration properties.
- getRecordMerger() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob
-
Gets the record merger that is capable of merging previous output with a new partial output.
- getRecordMerger() - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
- getRecordWriter(TaskAttemptContext) - Method in class datafu.hourglass.avro.AvroKeyValueWithMetadataOutputFormat
-
- getRecordWriter(TaskAttemptContext) - Method in class datafu.hourglass.avro.AvroKeyWithMetadataOutputFormat
-
- getReduceOutputSchema() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob
-
Gets the reduce output schema.
- getReduceOutputSchema() - Method in class datafu.hourglass.schemas.PartitionCollapsingSchemas
-
- getReduceOutputSchema() - Method in class datafu.hourglass.schemas.PartitionPreservingSchemas
-
- getReduceProcessor() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob
-
- getReducerAccumulator() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob
-
Gets the accumulator used for the reducer.
- getReducerAccumulator() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob
-
Gets the accumulator used for the reducer.
- getReducerAccumulator() - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
- getReducerAccumulator() - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
- getReducerClass() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob
-
Gets the reducer class.
- getReport() - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob
-
Gets a report summarizing the run.
- getReports() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob
-
Get reports that summarize each of the job iterations.
- getReports() - Method in class datafu.hourglass.jobs.AbstractPartitionPreservingIncrementalJob
-
Get reports that summarize each of the job iterations.
- getRetentionCount() - Method in class datafu.hourglass.jobs.AbstractJob
-
Gets the number of days of data which will be retained in the output path.
- getReusedOutput() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob.Report
-
Gets the output that was reused, if one was reused.
- getReuseOutput() - Method in class datafu.hourglass.mapreduce.CollapsingCombiner
-
Gets whether previous output is being reused.
- getReuseOutput() - Method in class datafu.hourglass.mapreduce.CollapsingMapper
-
Gets whether previous output is being reused.
- getReuseOutput() - Method in class datafu.hourglass.mapreduce.CollapsingReducer
-
Gets whether previous output is being reused.
- getReusePreviousOutput() - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob
-
Get whether previous output should be reused.
- getReusePreviousOutput() - Method in class datafu.hourglass.jobs.PartitionCollapsingExecutionPlanner
-
Gets whether previous output should be reused, if it exists.
- getSchemaFromFile(FileSystem, Path) - Static method in class datafu.hourglass.fs.PathUtils
-
Gets the schema from a given Avro data file.
- getSchemaFromPath(FileSystem, Path) - Static method in class datafu.hourglass.fs.PathUtils
-
Gets the schema for the first Avro file under the given path.
- getSchemas() - Method in class datafu.hourglass.jobs.IncrementalJob
-
Gets the schemas.
- getSchemas() - Method in class datafu.hourglass.mapreduce.CollapsingCombiner
-
Gets the schemas.
- getSchemas() - Method in class datafu.hourglass.mapreduce.CollapsingMapper
-
Gets the Avro schemas.
- getSchemas() - Method in class datafu.hourglass.mapreduce.PartitioningMapper
-
Gets the Avro schemas.
- getSchemas() - Method in class datafu.hourglass.mapreduce.PartitioningReducer
-
Gets the Avro schemas.
- getStartDate() - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Gets the start date.
- getStartDate() - Method in class datafu.hourglass.jobs.TimeBasedJob
-
Gets the start date.
- getTempPath() - Method in class datafu.hourglass.jobs.AbstractJob
-
Gets the temporary path under which intermediate files will be stored.
- getWriteCounters() - Method in class datafu.hourglass.jobs.StagedOutputJob
-
Get whether counters should be written.
- getWriterSchema() - Method in class datafu.hourglass.avro.AvroKeyValueWithMetadataRecordWriter
-
Gets the writer schema for the key/value pair generic record.
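The entries above mention two dated path layouts, "yyyyMMdd" and "yyyy/MM/dd". The following is a minimal sketch of how a date might be recovered from each layout. It is not the PathUtils implementation: the class DatedPathSketch, its method names, and the use of plain strings and java.time instead of Hadoop Path and java.util.Date are all assumptions made for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; not part of datafu.hourglass.
class DatedPathSketch {
    // BASIC_ISO_DATE parses compact dates such as "20130816" (the yyyyMMdd layout).
    private static final DateTimeFormatter DATED = DateTimeFormatter.BASIC_ISO_DATE;

    // Parses the last path component, e.g. "/data/event/20130816".
    static LocalDate dateForDatedPath(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return LocalDate.parse(name, DATED);
    }

    // Parses the last three path components, e.g. "/data/event/2013/08/16".
    static LocalDate dateForNestedDatedPath(String path) {
        String[] parts = path.split("/");
        int n = parts.length;
        return LocalDate.of(
            Integer.parseInt(parts[n - 3]),
            Integer.parseInt(parts[n - 2]),
            Integer.parseInt(parts[n - 1]));
    }

    // Strips the trailing year/month/day components, leaving the root path.
    static String nestedPathRoot(String path) {
        int idx = path.length();
        for (int i = 0; i < 3; i++) {
            idx = path.lastIndexOf('/', idx - 1);
        }
        return path.substring(0, idx);
    }

    public static void main(String[] args) {
        System.out.println(dateForDatedPath("/data/event/20130816"));         // 2013-08-16
        System.out.println(dateForNestedDatedPath("/data/event/2013/08/16")); // 2013-08-16
        System.out.println(nestedPathRoot("/data/event/2013/08/16"));         // /data/event
    }
}
```

The sketch assumes well-formed paths; a real implementation would need to reject paths that have too few components or non-numeric date parts.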
- setAccumulator(Accumulator<GenericRecord, GenericRecord>) - Method in class datafu.hourglass.mapreduce.CollapsingCombiner
-
Sets the accumulator used to perform aggregation.
- setAccumulator(Accumulator<GenericRecord, GenericRecord>) - Method in class datafu.hourglass.mapreduce.CollapsingReducer
-
- setAccumulator(Accumulator<GenericRecord, GenericRecord>) - Method in class datafu.hourglass.mapreduce.PartitioningCombiner
-
Sets the accumulator used to perform aggregation.
- setAccumulator(Accumulator<GenericRecord, GenericRecord>) - Method in class datafu.hourglass.mapreduce.PartitioningReducer
-
Sets the accumulator used to perform aggregation.
- setCombineInputs(boolean) - Method in class datafu.hourglass.jobs.AbstractNonIncrementalJob
-
Sets whether inputs should be combined.
- setCombinerAccumulator(Accumulator<GenericRecord, GenericRecord>) - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
Set the accumulator for the combiner.
- setCombinerAccumulator(Accumulator<GenericRecord, GenericRecord>) - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
Set the accumulator for the combiner.
- setConf(Configuration) - Method in class datafu.hourglass.jobs.TimePartitioner
-
- setContext(TaskInputOutputContext<Object, Object, Object, Object>) - Method in class datafu.hourglass.mapreduce.CollapsingMapper
-
- setContext(TaskInputOutputContext<Object, Object, Object, Object>) - Method in class datafu.hourglass.mapreduce.ObjectProcessor
-
- setContext(TaskInputOutputContext<Object, Object, Object, Object>) - Method in class datafu.hourglass.mapreduce.PartitioningMapper
-
- setContext(TaskInputOutputContext<Object, Object, Object, Object>) - Method in class datafu.hourglass.mapreduce.PartitioningReducer
-
- setCountersParentPath(Path) - Method in class datafu.hourglass.jobs.AbstractJob
-
Sets the path where counters will be stored.
- setCountersParentPath(Path) - Method in class datafu.hourglass.jobs.StagedOutputJob
-
Sets the path where counters will be stored.
- setDaysAgo(Integer) - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Sets the number of days to subtract off the end date.
- setDaysAgo(Integer) - Method in class datafu.hourglass.jobs.TimeBasedJob
-
Sets the number of days to subtract off the end of the consumption window.
- setEndDate(Date) - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Sets the end date.
- setEndDate(Date) - Method in class datafu.hourglass.jobs.TimeBasedJob
-
Sets the end date.
- setFailOnMissing(boolean) - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Sets whether the job should fail if data is missing within the desired date range.
- setFailOnMissing(boolean) - Method in class datafu.hourglass.jobs.IncrementalJob
-
Sets whether the job should fail if input data within the desired range is missing.
- setInputKeySchemaForPath(Job, Schema, String) - Static method in class datafu.hourglass.avro.AvroMultipleInputsUtil
-
Sets the job input key schema for a path.
- setInputPaths(List<Path>) - Method in class datafu.hourglass.jobs.AbstractJob
-
Sets the input paths.
- setInputPaths(List<Path>) - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Sets the input paths.
- setIntermediateValueSchema(Schema) - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
Sets the Avro schema for the intermediate value.
- setIntermediateValueSchema(Schema) - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
Sets the Avro schema for the intermediate value.
- setIntermediateValueSchema(Schema) - Method in class datafu.hourglass.schemas.TaskSchemas.Builder
-
- setKeySchema(Schema) - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
Sets the Avro schema for the key.
- setKeySchema(Schema) - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
Sets the Avro schema for the key.
- setKeySchema(Schema) - Method in class datafu.hourglass.schemas.TaskSchemas.Builder
-
- setMapper(Mapper<GenericRecord, GenericRecord, GenericRecord>) - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
Set the mapper.
- setMapper(Mapper<GenericRecord, GenericRecord, GenericRecord>) - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
Set the mapper.
- setMapper(Mapper<GenericRecord, GenericRecord, GenericRecord>) - Method in class datafu.hourglass.mapreduce.CollapsingMapper
-
Sets the mapper.
- setMapper(Mapper<GenericRecord, GenericRecord, GenericRecord>) - Method in class datafu.hourglass.mapreduce.PartitioningMapper
-
Sets the mapper.
- setMaxIterations(Integer) - Method in class datafu.hourglass.jobs.IncrementalJob
-
Sets the maximum number of iterations for the job.
- setMaxToProcess(Integer) - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Sets the maximum number of days to process at a time.
- setMaxToProcess(Integer) - Method in class datafu.hourglass.jobs.IncrementalJob
-
Sets the maximum number of days of input data to process in a single run.
- setMerger(Merger<GenericRecord>) - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
Sets the record merger that is capable of merging previous output with a new partial output.
- setName(String) - Method in class datafu.hourglass.jobs.AbstractJob
-
Sets the job name.
- setNumDays(Integer) - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Sets the number of days to process.
- setNumDays(Integer) - Method in class datafu.hourglass.jobs.TimeBasedJob
-
Sets the number of consecutive days to process.
- setNumReducers(Integer) - Method in class datafu.hourglass.jobs.AbstractJob
-
Sets the number of reducers to use.
- setOldMerger(Merger<GenericRecord>) - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
Sets the record merger that is capable of unmerging old partial output from the new output.
- setOldRecordMerger(Merger<GenericRecord>) - Method in class datafu.hourglass.mapreduce.CollapsingReducer
-
- setOnSetup(Setup) - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
Set a callback that provides custom configuration before the job begins execution.
- setOnSetup(Setup) - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
Set a callback that provides custom configuration before the job begins execution.
- setOutputDateRange(DateRange) - Method in interface datafu.hourglass.jobs.DateRangeConfigurable
-
Sets the date range for the output.
- setOutputDateRange(DateRange) - Method in class datafu.hourglass.mapreduce.CollapsingCombiner
-
- setOutputDateRange(DateRange) - Method in class datafu.hourglass.mapreduce.CollapsingReducer
-
- setOutputPath(Path) - Method in class datafu.hourglass.jobs.AbstractJob
-
Sets the output path.
- setOutputPath(Path) - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Sets the output path.
- setOutputValueSchema(Schema) - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
Sets the Avro schema for the output data.
- setOutputValueSchema(Schema) - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
Sets the Avro schema for the output data.
- setOutputValueSchema(Schema) - Method in class datafu.hourglass.schemas.TaskSchemas.Builder
-
- setProperties(Properties) - Method in class datafu.hourglass.jobs.AbstractJob
-
Sets the configuration properties.
- setProperties(Properties) - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob
-
- setProperties(Properties) - Method in class datafu.hourglass.jobs.IncrementalJob
-
- setProperties(Properties) - Method in class datafu.hourglass.jobs.TimeBasedJob
-
- setRecordMerger(Merger<GenericRecord>) - Method in class datafu.hourglass.mapreduce.CollapsingReducer
-
- setReducerAccumulator(Accumulator<GenericRecord, GenericRecord>) - Method in class datafu.hourglass.jobs.PartitionCollapsingIncrementalJob
-
Set the accumulator for the reducer.
- setReducerAccumulator(Accumulator<GenericRecord, GenericRecord>) - Method in class datafu.hourglass.jobs.PartitionPreservingIncrementalJob
-
Set the accumulator for the reducer.
- setRetentionCount(Integer) - Method in class datafu.hourglass.jobs.AbstractJob
-
Sets the number of days of data which will be retained in the output path.
- setReuseOutput(boolean) - Method in class datafu.hourglass.mapreduce.CollapsingCombiner
-
Sets whether previous output is being reused.
- setReuseOutput(boolean) - Method in class datafu.hourglass.mapreduce.CollapsingMapper
-
Sets whether previous output is being reused.
- setReuseOutput(boolean) - Method in class datafu.hourglass.mapreduce.CollapsingReducer
-
Sets whether previous output is being reused.
- setReusePreviousOutput(boolean) - Method in class datafu.hourglass.jobs.AbstractPartitionCollapsingIncrementalJob
-
Set whether previous output should be reused.
- setReusePreviousOutput(boolean) - Method in class datafu.hourglass.jobs.PartitionCollapsingExecutionPlanner
-
Sets whether previous output should be reused, if it exists.
- setSchemas(PartitionCollapsingSchemas) - Method in class datafu.hourglass.mapreduce.CollapsingCombiner
-
Sets the schemas.
- setSchemas(PartitionCollapsingSchemas) - Method in class datafu.hourglass.mapreduce.CollapsingMapper
-
Sets the Avro schemas.
- setSchemas(PartitionCollapsingSchemas) - Method in class datafu.hourglass.mapreduce.CollapsingReducer
-
Sets the Avro schemas.
- setSchemas(PartitionPreservingSchemas) - Method in class datafu.hourglass.mapreduce.PartitioningMapper
-
Sets the Avro schemas.
- setSchemas(PartitionPreservingSchemas) - Method in class datafu.hourglass.mapreduce.PartitioningReducer
-
Sets the Avro schemas.
- setStartDate(Date) - Method in class datafu.hourglass.jobs.ExecutionPlanner
-
Sets the start date.
- setStartDate(Date) - Method in class datafu.hourglass.jobs.TimeBasedJob
-
Sets the start date.
- setTempPath(Path) - Method in class datafu.hourglass.jobs.AbstractJob
-
Sets the temporary path where intermediate files will be stored.
- Setup - Interface in datafu.hourglass.jobs
-
- setup(Configuration) - Method in interface datafu.hourglass.jobs.Setup
-
Set custom configuration.
- setup(Reducer<Object, Object, Object, Object>.Context) - Method in class datafu.hourglass.mapreduce.DelegatingCombiner
-
- setup(Mapper<Object, Object, Object, Object>.Context) - Method in class datafu.hourglass.mapreduce.DelegatingMapper
-
- setup(Reducer<Object, Object, Object, Object>.Context) - Method in class datafu.hourglass.mapreduce.DelegatingReducer
-
- setUseCombiner(boolean) - Method in class datafu.hourglass.jobs.AbstractJob
-
Sets whether the combiner should be used.
- setWriteCounters(boolean) - Method in class datafu.hourglass.jobs.StagedOutputJob
-
Sets whether counters should be written.
- StagedOutputJob - Class in datafu.hourglass.jobs
-
A derivation of Job that stages its output in another location and only moves it to the final destination if the job completes successfully.
- StagedOutputJob(Configuration, String, Logger) - Constructor for class datafu.hourglass.jobs.StagedOutputJob
-
Initializes the job.
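The staged-output idea described for StagedOutputJob (write to a separate location, publish only on success) can be sketched on a local file system. This is not the StagedOutputJob implementation and does not use Hadoop: the class StagedOutputSketch, the Work interface, and the runStaged and demo methods are hypothetical names introduced here, and java.nio.file stands in for the Hadoop FileSystem API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of datafu.hourglass.
class StagedOutputSketch {
    // Stand-in for the work a job performs: it writes files under the staging directory.
    interface Work {
        void writeTo(Path stagingDir) throws IOException;
    }

    // Runs the work against a staging directory and moves it to finalDir only on success.
    // Assumes finalDir does not exist yet and shares a file system with stagingDir.
    static boolean runStaged(Path stagingDir, Path finalDir, Work work) {
        try {
            Files.createDirectories(stagingDir);
            try {
                work.writeTo(stagingDir);
            } catch (IOException | RuntimeException e) {
                // Failure: discard the partial output so finalDir is never touched.
                deleteRecursively(stagingDir);
                return false;
            }
            Files.move(stagingDir, finalDir);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deletes a directory tree, children before parents.
    static void deleteRecursively(Path root) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    // Runs one successful and one failing job and reports what was published.
    static String demo() {
        try {
            Path base = Files.createTempDirectory("staged-output-sketch");
            boolean ok = runStaged(base.resolve("staging-a"), base.resolve("out-a"),
                dir -> Files.write(dir.resolve("part-0"), "done".getBytes(StandardCharsets.UTF_8)));
            boolean failed = runStaged(base.resolve("staging-b"), base.resolve("out-b"),
                dir -> { throw new IOException("simulated failure"); });
            return ok + "," + Files.exists(base.resolve("out-a").resolve("part-0"))
                + "," + failed + "," + Files.exists(base.resolve("out-b"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // true,true,false,false
    }
}
```

The point of the pattern is that readers of the final location never observe partial results: a failed run leaves the destination untouched, and a successful run publishes the output in a single move.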