Class SparkContext (subclass of object)
Main entry point for Spark functionality. A SparkContext represents the connection to a Spark cluster, and can be used to create RDDs and broadcast variables on that cluster.
Properties
defaultParallelism: Default level of parallelism to use when not given by user (e.g. for reduce tasks)
Method Details
Create a new SparkContext.
Distribute a local Python collection to form an RDD.

>>> sc.parallelize(range(5), 5).glom().collect()
[[0], [1], [2], [3], [4]]
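The doctest above asks for five partitions and uses glom() to show which elements landed in each. As a rough illustration only, the following pure-Python sketch models splitting a collection into contiguous, near-equal slices. The function name `split_into_slices` is hypothetical and the code is not Spark's implementation; the boundaries PySpark actually picks may differ, for example because of serializer batching.

```python
def split_into_slices(items, num_slices):
    """Split items into num_slices contiguous slices of near-equal size.

    Hypothetical helper for illustration; not part of PySpark.
    """
    items = list(items)
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    n = len(items)
    # Slice i covers the index range [i*n // num_slices, (i+1)*n // num_slices).
    return [
        items[(i * n) // num_slices:((i + 1) * n) // num_slices]
        for i in range(num_slices)
    ]


# Five elements over five slices gives one element per slice,
# the same shape the glom() doctest displays.
print(split_into_slices(range(5), 5))
```

When the element count is not a multiple of the slice count, the slices in this model differ in length by at most one.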
Broadcast a read-only variable to the cluster, returning a
Create an Accumulator with the given initial value. If an AccumulatorParam helper object is provided, it defines how values of the accumulator's data type are added. If none is provided, default AccumulatorParams are used for integers and floating-point numbers. For other types, supply a custom AccumulatorParam.
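To make the role of a custom AccumulatorParam more concrete, here is a minimal sketch of the two methods such a helper defines, `zero` and `addInPlace`, applied to list-valued accumulators. The class `VectorAccumulatorParam` is hypothetical. It is written as a plain class so the example runs without a cluster; in real use it would presumably subclass pyspark's AccumulatorParam and be passed as the second argument to sc.accumulator. The loop only simulates, locally, how per-task partial values might be merged.

```python
from functools import reduce


class VectorAccumulatorParam:
    """Hypothetical helper following the zero/addInPlace contract."""

    def zero(self, value):
        # A "zero" with the same dimensions as the initial value.
        return [0.0] * len(value)

    def addInPlace(self, v1, v2):
        # Add v2 into v1 element by element and return v1.
        for i, x in enumerate(v2):
            v1[i] += x
        return v1


param = VectorAccumulatorParam()
initial = [1.0, 2.0, 3.0]

# Simulate two tasks, each starting from zero and adding its own updates.
task_updates = (
    [[1.0, 1.0, 1.0]],
    [[0.5, 0.5, 0.5], [2.0, 0.0, 0.0]],
)
partials = []
for updates in task_updates:
    local = param.zero(initial)
    for update in updates:
        local = param.addInPlace(local, update)
    partials.append(local)

# Merge the per-task partial sums into a copy of the initial value.
total = reduce(param.addInPlace, partials, list(initial))
print(total)
```

Because the merge relies only on `addInPlace`, the addition it defines should be associative and commutative so that the order in which partial values arrive does not change the result.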
Add a file to be downloaded with this Spark job on every node. The
To access the file in Spark jobs, use SparkFiles.get(filename) to find its download location.

>>> import os, tempfile
>>> from pyspark import SparkFiles
>>> tempdir = tempfile.mkdtemp()  # scratch directory for the example
>>> path = os.path.join(tempdir, "test.txt")
>>> with open(path, "w") as testFile:
...     _ = testFile.write("100")
>>> sc.addFile(path)
>>> def func(iterator):
...     # Read the multiplier from the file shipped with the job.
...     with open(SparkFiles.get("test.txt")) as testFile:
...         fileVal = int(testFile.readline())
...         return [x * fileVal for x in iterator]
>>> sc.parallelize([1, 2, 3, 4]).mapPartitions(func).collect()
[100, 200, 300, 400]
Add a .py or .zip dependency for all tasks to be executed on this SparkContext in the future. The
Set the directory under which RDDs are going to be checkpointed. The directory must be an HDFS path if running on a cluster. If the directory does not exist, it will be created. If the directory exists and
Property Details
defaultParallelism
Default level of parallelism to use when not given by user (e.g. for reduce tasks)
Generated by Epydoc 3.0.1 on Tue Sep 24 17:57:23 2013 | http://epydoc.sourceforge.net |