pyspark.RDD.union

RDD.union(other)

Return the union of this RDD and another one. Duplicate elements are kept; the result is not deduplicated.

New in version 0.7.0.

Parameters
other : RDD

another RDD

Returns
RDD

the union of this RDD and another one

Examples

>>> rdd = sc.parallelize([1, 1, 2, 3])
>>> rdd.union(rdd).collect()
[1, 1, 2, 3, 1, 1, 2, 3]
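
The output above contains every element twice, which suggests that union behaves like concatenation rather than a mathematical set union. The following is a minimal plain-Python sketch of that behaviour, not Spark code: `bag_union` and `distinct_items` are hypothetical helper names introduced here for illustration, and ordinary lists stand in for RDDs.

```python
# Plain-Python model of the semantics only; this is not Spark code.
# bag_union and distinct_items are hypothetical names used for illustration.


def bag_union(left, right):
    """Concatenate two collections, keeping every duplicate (models RDD.union)."""
    return list(left) + list(right)


def distinct_items(items):
    """Drop repeated elements, keeping first occurrences (models RDD.distinct).

    Spark's distinct() makes no ordering promise; order is preserved here
    only to keep the printed output easy to read.
    """
    return list(dict.fromkeys(items))


data = [1, 1, 2, 3]
combined = bag_union(data, data)
print(combined)                  # [1, 1, 2, 3, 1, 1, 2, 3]
print(distinct_items(combined))  # [1, 2, 3]
```

In PySpark itself, the corresponding call would be `rdd.union(rdd).distinct().collect()`; the order of the elements it returns is not guaranteed.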