org.apache.crunch.lib.join
Class JoinFn<K,U,V>
java.lang.Object
org.apache.crunch.DoFn<Pair<Pair<K,Integer>,Iterable<Pair<U,V>>>,Pair<K,Pair<U,V>>>
org.apache.crunch.lib.join.JoinFn<K,U,V>
- Type Parameters:
K - Type of the keys.
U - Type of the first PTable's values.
V - Type of the second PTable's values.
- All Implemented Interfaces:
- Serializable
- Direct Known Subclasses:
- FullOuterJoinFn, InnerJoinFn, LeftOuterJoinFn, RightOuterJoinFn
public abstract class JoinFn<K,U,V>
extends DoFn<Pair<Pair<K,Integer>,Iterable<Pair<U,V>>>,Pair<K,Pair<U,V>>>
Represents a DoFn for performing joins.
- See Also:
- Serialized Form
Constructor Summary
JoinFn(PType<K> keyType, PType<U> leftValueType)
    Instantiate with the PTypes of the join key and of the left side's values (used for creating deep copies of values).
Method Summary
abstract String getJoinType()
    Returns the name of this join type (e.g. innerJoin, leftOuterJoin).
abstract void join(K key, int id, Iterable<Pair<U,V>> pairs, Emitter<Pair<K,Pair<U,V>>> emitter)
    Performs the actual joining.
void process(Pair<Pair<K,Integer>,Iterable<Pair<U,V>>> input, Emitter<Pair<K,Pair<U,V>>> emitter)
    Split up the input record to make coding a bit more manageable.
JoinFn
public JoinFn(PType<K> keyType,
PType<U> leftValueType)
- Instantiate with the PTypes of the join key and of the left side's values (used for creating deep copies of values).
- Parameters:
keyType - The PType of the value used as the key of the join.
leftValueType - The PType of the value type of the left side of the join.
getJoinType
public abstract String getJoinType()
- Returns:
- The name of this join type (e.g. innerJoin, leftOuterJoin).
join
public abstract void join(K key,
int id,
Iterable<Pair<U,V>> pairs,
Emitter<Pair<K,Pair<U,V>>> emitter)
- Performs the actual joining.
- Parameters:
key - The key for this grouping of values.
id - The side that this group of values is from (0 -> left, 1 -> right).
pairs - The group of values associated with this key and id pair.
emitter - The emitter to send the output to.
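As an illustration of the contract described above, the following is a minimal, self-contained sketch of how an inner-join style implementation of join might behave. It is not the library's InnerJoinFn. Pair and Emitter here are simplified stand-ins defined inside the sketch, the class name InnerJoinSketch and the demo method are hypothetical, and the sketch assumes that the left group (id 0) for a key is delivered before the right group (id 1) for the same key. The deep copying of left values that the constructor's PTypes are documented to support is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; Pair and Emitter are stand-ins, not the Crunch classes.
class InnerJoinSketch<K, U, V> {

    static final class Pair<A, B> {
        final A first;
        final B second;
        Pair(A first, B second) { this.first = first; this.second = second; }
        static <A, B> Pair<A, B> of(A a, B b) { return new Pair<A, B>(a, b); }
        @Override public String toString() { return "[" + first + "," + second + "]"; }
    }

    interface Emitter<T> { void emit(T value); }

    private K lastKey;                                  // key of the group seen most recently
    private final List<U> leftValues = new ArrayList<>(); // buffered left-side values for lastKey

    // Mirrors the documented signature: id 0 means left side, id 1 means right side.
    void join(K key, int id, Iterable<Pair<U, V>> pairs, Emitter<Pair<K, Pair<U, V>>> emitter) {
        if (!key.equals(lastKey)) {
            // A new key starts: forget left values buffered for the previous key.
            lastKey = key;
            leftValues.clear();
        }
        if (id == 0) {
            // Left side: buffer the values (assumption: left arrives before right).
            for (Pair<U, V> pair : pairs) {
                if (pair.first != null) {
                    leftValues.add(pair.first);
                }
            }
        } else {
            // Right side: emit one output record per (left value, right value) combination.
            for (Pair<U, V> pair : pairs) {
                for (U left : leftValues) {
                    emitter.emit(Pair.of(key, Pair.of(left, pair.second)));
                }
            }
        }
    }

    // Feeds two groups for key "k1" and a right-only group for key "k2".
    static List<String> demo() {
        List<String> out = new ArrayList<>();
        InnerJoinSketch<String, String, Integer> fn = new InnerJoinSketch<>();
        Emitter<Pair<String, Pair<String, Integer>>> emitter = p -> out.add(p.toString());

        List<Pair<String, Integer>> left = new ArrayList<>();
        left.add(Pair.<String, Integer>of("alice", null));
        left.add(Pair.<String, Integer>of("bob", null));
        List<Pair<String, Integer>> right = new ArrayList<>();
        right.add(Pair.<String, Integer>of(null, 10));
        right.add(Pair.<String, Integer>of(null, 20));
        List<Pair<String, Integer>> rightOnly = new ArrayList<>();
        rightOnly.add(Pair.<String, Integer>of(null, 30));

        fn.join("k1", 0, left, emitter);
        fn.join("k1", 1, right, emitter);
        fn.join("k2", 1, rightOnly, emitter); // no left values for k2, so nothing is emitted
        return out;
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

In this sketch, each group carries values for one side only, with the other half of every pair left null. A right-only key produces no output, which is the behaviour one would expect from an inner join; an outer-join variant would presumably emit such records with a null on the missing side.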
process
public void process(Pair<Pair<K,Integer>,Iterable<Pair<U,V>>> input,
Emitter<Pair<K,Pair<U,V>>> emitter)
- Split up the input record to make coding a bit more manageable.
- Specified by:
- process in class DoFn<Pair<Pair<K,Integer>,Iterable<Pair<U,V>>>,Pair<K,Pair<U,V>>>
- Parameters:
input - The input record.
emitter - The emitter to send the output to.
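The signatures of process and join suggest that "splitting up the input record" means unpacking the nested Pair into a key, a side id, and a group of values. The sketch below shows one way that delegation could look. It is an assumption based on the signatures rather than a copy of the library's code, and Pair, Emitter, SketchJoinFn, and demo are hypothetical stand-ins defined only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; none of these types are the Crunch classes.
class ProcessSketch {

    static final class Pair<A, B> {
        final A first;
        final B second;
        Pair(A first, B second) { this.first = first; this.second = second; }
    }

    interface Emitter<T> { void emit(T value); }

    abstract static class SketchJoinFn<K, U, V> {
        abstract void join(K key, int id, Iterable<Pair<U, V>> pairs,
                           Emitter<Pair<K, Pair<U, V>>> emitter);

        // Unpacks ((key, id), values) and hands the three parts to join.
        void process(Pair<Pair<K, Integer>, Iterable<Pair<U, V>>> input,
                     Emitter<Pair<K, Pair<U, V>>> emitter) {
            Pair<K, Integer> keyAndId = input.first;
            join(keyAndId.first, keyAndId.second, input.second, emitter);
        }
    }

    // Runs process once and records what join received as "key/id/groupSize".
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        SketchJoinFn<String, String, Integer> fn = new SketchJoinFn<String, String, Integer>() {
            @Override
            void join(String key, int id, Iterable<Pair<String, Integer>> pairs,
                      Emitter<Pair<String, Pair<String, Integer>>> emitter) {
                int count = 0;
                for (Pair<String, Integer> ignored : pairs) {
                    count++;
                }
                seen.add(key + "/" + id + "/" + count);
            }
        };

        List<Pair<String, Integer>> group = new ArrayList<>();
        group.add(new Pair<String, Integer>("alice", null));

        Pair<String, Integer> keyAndId = new Pair<>("k1", 0);
        Iterable<Pair<String, Integer>> pairs = group;
        Pair<Pair<String, Integer>, Iterable<Pair<String, Integer>>> input =
                new Pair<>(keyAndId, pairs);

        fn.process(input, value -> { });
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Keeping process as a thin unpacking step means each subclass only has to implement join against plain arguments instead of navigating the nested Pair type itself.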
Copyright © 2012 The Apache Software Foundation. All Rights Reserved.