org.apache.spark
Class ShuffleDependency<K,V,C>
Object
org.apache.spark.Dependency<scala.Product2<K,V>>
org.apache.spark.ShuffleDependency<K,V,C>
- All Implemented Interfaces:
- java.io.Serializable
public class ShuffleDependency<K,V,C>
- extends Dependency<scala.Product2<K,V>>
:: DeveloperApi ::
Represents a dependency on the output of a shuffle stage. Note that in the case of shuffle,
the RDD is transient since we don't need it on the executor side.
param: _rdd the parent RDD
param: partitioner partitioner used to partition the shuffle output
param: serializer Serializer to use. If set to None, the default serializer, as specified by spark.serializer config option, will be used.
param: keyOrdering key ordering for RDD's shuffles
param: aggregator map/reduce-side aggregator for RDD's shuffle
param: mapSideCombine whether to perform partial aggregation (also known as map-side combine)
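How partitioner, aggregator and mapSideCombine interact can be illustrated with a minimal sketch in plain Java. It does not use any Spark classes: MapSideCombineSketch, partitionFor, writeMapOutput, totalRecords and sampleRecords are hypothetical names, the hash-modulo rule is only an assumed stand-in for a hash-style Partitioner, and the merge function stands in for an Aggregator. It is meant to show the idea of partial aggregation before the shuffle, not how Spark implements it.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical stand-in for one map task; none of these names are Spark APIs.
class MapSideCombineSketch {

    // Assumed hash-style partitioning: non-negative key hash modulo the partition count.
    static int partitionFor(Object key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Buckets the records of one map task by target partition.
    // With mapSideCombine, values sharing a key are merged first,
    // so at most one record per key is written.
    static List<List<Map.Entry<String, Integer>>> writeMapOutput(
            List<Map.Entry<String, Integer>> records,
            int numPartitions,
            boolean mapSideCombine,
            BinaryOperator<Integer> mergeValue) {
        List<List<Map.Entry<String, Integer>>> buckets = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            buckets.add(new ArrayList<>());
        }
        if (mapSideCombine) {
            // Partial aggregation: one running value per key.
            Map<String, Integer> combined = new LinkedHashMap<>();
            for (Map.Entry<String, Integer> r : records) {
                combined.merge(r.getKey(), r.getValue(), mergeValue);
            }
            for (Map.Entry<String, Integer> e : combined.entrySet()) {
                int p = partitionFor(e.getKey(), numPartitions);
                buckets.get(p).add(entry(e.getKey(), e.getValue()));
            }
        } else {
            // No combining: every input record is written as-is.
            for (Map.Entry<String, Integer> r : records) {
                buckets.get(partitionFor(r.getKey(), numPartitions)).add(r);
            }
        }
        return buckets;
    }

    // Counts the records across all partition buckets.
    static int totalRecords(List<List<Map.Entry<String, Integer>>> buckets) {
        int n = 0;
        for (List<Map.Entry<String, Integer>> b : buckets) {
            n += b.size();
        }
        return n;
    }

    static Map.Entry<String, Integer> entry(String k, int v) {
        return new AbstractMap.SimpleImmutableEntry<>(k, v);
    }

    // Small illustrative input: three records for key "a", two for key "b".
    static List<Map.Entry<String, Integer>> sampleRecords() {
        return Arrays.asList(
                entry("a", 1), entry("b", 2), entry("a", 3), entry("b", 4), entry("a", 5));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> records = sampleRecords();
        int with = totalRecords(writeMapOutput(records, 2, true, Integer::sum));
        int without = totalRecords(writeMapOutput(records, 2, false, Integer::sum));
        System.out.println("records written with combine: " + with);
        System.out.println("records written without combine: " + without);
    }
}
```

In this sketch, combining reduces the number of records written per map task to the number of distinct keys, at the cost of holding one running value per key in memory. It also shows why map-side combine only makes sense when a merge function (the role played by the aggregator) is available.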
- See Also:
- Serialized Form
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ShuffleDependency
public ShuffleDependency(RDD<? extends scala.Product2<K,V>> _rdd,
Partitioner partitioner,
scala.Option<Serializer> serializer,
scala.Option<scala.math.Ordering<K>> keyOrdering,
scala.Option<Aggregator<K,V,C>> aggregator,
boolean mapSideCombine)
partitioner
public Partitioner partitioner()
serializer
public scala.Option<Serializer> serializer()
keyOrdering
public scala.Option<scala.math.Ordering<K>> keyOrdering()
aggregator
public scala.Option<Aggregator<K,V,C>> aggregator()
mapSideCombine
public boolean mapSideCombine()
rdd
public RDD<scala.Product2<K,V>> rdd()
- Specified by: rdd in class Dependency<scala.Product2<K,V>>
shuffleId
public int shuffleId()
shuffleHandle
public org.apache.spark.shuffle.ShuffleHandle shuffleHandle()