org.apache.crunch.lib
Class Sample.SamplerFn<S>
java.lang.Object
org.apache.crunch.DoFn<S,S>
org.apache.crunch.lib.Sample.SamplerFn<S>
- All Implemented Interfaces:
- Serializable
- Enclosing class:
- Sample
public static class Sample.SamplerFn<S>
- extends DoFn<S,S>
- See Also:
- Serialized Form
Method Summary
Return type | Method | Description
void | initialize() | Called during the setup of the MapReduce job this DoFn is associated with.
void | process(S input, Emitter<S> emitter) | Processes the records from a PCollection.
Sample.SamplerFn
public Sample.SamplerFn(long seed,
double acceptanceProbability)
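The page does not describe how the two constructor arguments are used. A plausible reading is Bernoulli sampling: each record is emitted independently with probability acceptanceProbability, using a random generator seeded with seed. The sketch below illustrates that reading only; it is not the Crunch source. SketchEmitter, BernoulliSamplerSketch, and the sample helper are hypothetical names introduced here so the example compiles without Crunch on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for org.apache.crunch.Emitter (assumption, not the real interface).
interface SketchEmitter<T> {
    void emit(T value);
}

// Illustrative only: assumes SamplerFn performs seeded Bernoulli sampling.
class BernoulliSamplerSketch<S> {
    private final long seed;
    private final double acceptanceProbability;
    private Random rand; // created in initialize(), mirroring the DoFn lifecycle

    BernoulliSamplerSketch(long seed, double acceptanceProbability) {
        if (!(acceptanceProbability >= 0.0 && acceptanceProbability <= 1.0)) {
            throw new IllegalArgumentException("acceptanceProbability must be in [0, 1]");
        }
        this.seed = seed;
        this.acceptanceProbability = acceptanceProbability;
    }

    // Counterpart of DoFn.initialize(): set up per-task state before any record is seen.
    void initialize() {
        rand = new Random(seed);
    }

    // Counterpart of DoFn.process(): emit the record only if the random draw accepts it.
    void process(S input, SketchEmitter<S> emitter) {
        if (rand.nextDouble() < acceptanceProbability) {
            emitter.emit(input);
        }
    }

    // Hypothetical helper that drives the sketch over an in-memory list.
    static <T> List<T> sample(List<T> records, long seed, double probability) {
        BernoulliSamplerSketch<T> fn = new BernoulliSamplerSketch<>(seed, probability);
        fn.initialize();
        List<T> out = new ArrayList<>();
        for (T record : records) {
            fn.process(record, out::add);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            data.add(i);
        }
        System.out.println("kept " + sample(data, 42L, 0.1).size() + " of " + data.size());
    }
}
```

Under this assumption, a fixed seed makes the selection repeatable for a given input order, and the output size is only approximately acceptanceProbability times the input size, not exact.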
initialize
public void initialize()
- Description copied from class:
DoFn
- Called during the setup of the MapReduce job this DoFn is associated with. Subclasses may override this method to do appropriate initialization.
- Overrides:
initialize in class DoFn<S,S>
process
public void process(S input,
Emitter<S> emitter)
- Description copied from class:
DoFn
- Processes the records from a PCollection.
Note: Crunch can reuse a single input record object whose content changes on each DoFn.process(Object, Emitter) method call. This behavior is imposed by Hadoop's Reducer implementation: the framework reuses the key and value objects that are passed into the reduce, so an application should clone any object it wants to keep a copy of.
- Specified by:
process in class DoFn<S,S>
- Parameters:
input - The input record.
emitter - The emitter to send the output to.
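The object-reuse note in the description of process is easy to overlook, so here is a small illustration of the hazard in plain Java. It does not use Crunch or Hadoop: ReusedRecord is a hypothetical mutable record standing in for a framework-owned object, and the loop overwriting a single instance is an assumption that mimics how a reused input might behave.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable record; stands in for an object the framework may reuse.
class ReusedRecord {
    int value;

    ReusedRecord(int value) {
        this.value = value;
    }

    // Explicit copy, playing the role of "clone the objects you want to keep".
    ReusedRecord copy() {
        return new ReusedRecord(value);
    }
}

class RecordReuseSketch {
    // Buggy pattern: every list entry aliases the one shared instance.
    static List<ReusedRecord> keepReferences(int[] inputs) {
        ReusedRecord shared = new ReusedRecord(0);
        List<ReusedRecord> kept = new ArrayList<>();
        for (int v : inputs) {
            shared.value = v;   // simulates the framework overwriting the same object
            kept.add(shared);   // stores a reference, not a snapshot
        }
        return kept;
    }

    // Safe pattern: copy the record before holding on to it.
    static List<ReusedRecord> keepCopies(int[] inputs) {
        ReusedRecord shared = new ReusedRecord(0);
        List<ReusedRecord> kept = new ArrayList<>();
        for (int v : inputs) {
            shared.value = v;
            kept.add(shared.copy());
        }
        return kept;
    }

    public static void main(String[] args) {
        int[] inputs = {1, 2, 3};
        for (ReusedRecord r : keepReferences(inputs)) {
            System.out.println("aliased: " + r.value);
        }
        for (ReusedRecord r : keepCopies(inputs)) {
            System.out.println("copied: " + r.value);
        }
    }
}
```

In keepReferences every retained entry ends up showing the last value written, because all entries point at the same object. A DoFn that only forwards its input to the emitter immediately, without buffering it across calls, does not hit this problem.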
Copyright © 2012 The Apache Software Foundation. All Rights Reserved.