Settings
The following list describes the settings used to configure Spark Streaming applications.
Caution: FIXME Describe how to set them in streaming applications.
- `spark.streaming.kafka.maxRetries` (default: `1`) sets the number of connection attempts to Kafka brokers.
- `spark.streaming.receiver.writeAheadLog.enable` (default: `false`) controls which `ReceivedBlockHandler` to use: `WriteAheadLogBasedBlockHandler` or `BlockManagerBasedBlockHandler`.
- `spark.streaming.receiver.blockStoreTimeout` (default: `30`) is the time in seconds to wait until both writes to a write-ahead log and `BlockManager` complete successfully.
- `spark.streaming.clock` (default: `org.apache.spark.util.SystemClock`) specifies a fully-qualified class name that extends `org.apache.spark.util.Clock` to represent time. It is used in `JobGenerator`.
- `spark.streaming.ui.retainedBatches` (default: `1000`) controls the number of `BatchUIData` elements about completed batches kept in a first-in-first-out (FIFO) queue that are used to display statistics on the Streaming page in web UI.
- `spark.streaming.receiverRestartDelay` (default: `2000`) is the time interval between a receiver being stopped and started again.
- `spark.streaming.concurrentJobs` (default: `1`) is the number of concurrent jobs, i.e. threads in the streaming-job-executor thread pool.
- `spark.streaming.stopSparkContextByDefault` (default: `true`) controls whether (`true`) or not (`false`) to stop the underlying `SparkContext` (regardless of whether this `StreamingContext` has been started).
- `spark.streaming.kafka.maxRatePerPartition` (default: `0`) if non-`0` sets the maximum number of messages per partition.
- `spark.streaming.manualClock.jump` (default: `0`) offsets (aka jumps) the system time, i.e. adds its value to checkpoint time, when used with a clock that is a subclass of `org.apache.spark.util.ManualClock`. It is used when `JobGenerator` is restarted from checkpoint.
- `spark.streaming.unpersist` (default: `true`) is a flag to control whether output streams should unpersist old RDDs.
- `spark.streaming.gracefulStopTimeout` (default: 10 * batch interval)
- `spark.streaming.stopGracefullyOnShutdown` (default: `false`) controls whether to stop `StreamingContext` gracefully or not and is used by the `stopOnShutdown` shutdown hook.
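The settings above are regular Spark properties, so one way to set them is on a `SparkConf` before the `StreamingContext` is created — a minimal sketch (the application name, master, and chosen values are just examples):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Streaming settings are plain Spark properties, so they can be
// set on SparkConf before the StreamingContext is created.
val conf = new SparkConf()
  .setAppName("SettingsDemo")   // example name
  .setMaster("local[2]")        // example master
  .set("spark.streaming.concurrentJobs", "1")
  .set("spark.streaming.receiver.writeAheadLog.enable", "true")
  .set("spark.streaming.stopGracefullyOnShutdown", "true")

val ssc = new StreamingContext(conf, Seconds(5))
```

The same properties can also be passed on the command line, e.g. `spark-submit --conf spark.streaming.stopGracefullyOnShutdown=true ...`.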
Checkpointing
- `spark.streaming.checkpoint.directory` - when set, its value is passed on to the `StreamingContext.checkpoint` method when the `StreamingContext` is created.
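Calling `StreamingContext.checkpoint` directly has the same effect as the setting — a sketch (the directory name below is just an example):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("CheckpointDemo").setMaster("local[2]")
val ssc = new StreamingContext(conf, Seconds(5))

// Equivalent to setting spark.streaming.checkpoint.directory:
// the directory is handed to the context explicitly.
ssc.checkpoint("_checkpoint")  // example path; a reliable store like HDFS is typical in production
```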
Back Pressure
- `spark.streaming.backpressure.enabled` (default: `false`) enables (`true`) or disables (`false`) back pressure in input streams with receivers or `DirectKafkaInputDStream`.
- `spark.streaming.backpressure.rateEstimator` (default: `pid`) is the `RateEstimator` to use.
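A sketch of enabling back pressure so ingestion rates adapt to how fast batches are actually processed (the rate cap value is just an example):

```scala
import org.apache.spark.SparkConf

// Enable back pressure; input streams then throttle themselves
// based on observed batch processing times.
val conf = new SparkConf()
  .set("spark.streaming.backpressure.enabled", "true")
  // optional per-partition cap for the direct Kafka API (example value)
  .set("spark.streaming.kafka.maxRatePerPartition", "1000")
```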