allow_non_restored_state |
Flag indicating whether non restored state is allowed if the savepoint contains state for an operator that is no longer part of the pipeline. |
Default: false |
auto_balance_write_files_sharding_enabled |
Flag indicating whether auto-balance sharding for WriteFiles transform should be enabled. This might prove useful in streaming use-case, where pipeline needs to write quite many events into files, typically divided into N shards. Default behavior on Flink would be, that some workers will receive more shards to take care of than others. This cause workers to go out of balance in terms of processing backlog and memory usage. Enabling this feature will make shards to be spread evenly among available workers in improve throughput and memory usage stability. |
Default: false |
auto_watermark_interval |
The interval in milliseconds for automatic watermark emission. |
|
checkpoint_timeout_millis |
The maximum time in milliseconds that a checkpoint may take before being discarded. |
Default: -1 |
checkpointing_interval |
The interval in milliseconds at which to trigger checkpoints of the running pipeline. Default: No checkpointing. |
Default: -1 |
checkpointing_mode |
The checkpointing mode that defines consistency guarantee. |
Default: EXACTLY_ONCE |
disable_metrics |
Disable Beam metrics in Flink Runner |
Default: false |
execution_mode_for_batch |
Flink mode for data exchange of batch pipelines. Reference {@link org.apache.flink.api.common.ExecutionMode}. Set this to BATCH_FORCED if pipelines get blocked, see https://issues.apache.org/jira/browse/FLINK-10672 |
Default: PIPELINED |
execution_retry_delay |
Sets the delay in milliseconds between executions. A value of {@code -1} indicates that the default value should be used. |
Default: -1 |
externalized_checkpoints_enabled |
Enables or disables externalized checkpoints. Works in conjunction with CheckpointingInterval |
Default: false |
fail_on_checkpointing_errors |
Sets the expected behaviour for tasks in case that they encounter an error in their checkpointing procedure. If this is set to true, the task will fail on checkpointing error. If this is set to false, the task will only decline a the checkpoint and continue running. |
Default: true |
faster_copy |
Remove unneeded deep copy between operators. See https://issues.apache.org/jira/browse/BEAM-11146 |
Default: false |
files_to_stage |
Jar-Files to send to all workers and put on the classpath. The default value is all files from the classpath. |
|
finish_bundle_before_checkpointing |
If set, finishes the current bundle and flushes all output before checkpointing the state of the operators. By default, starts checkpointing immediately and buffers any remaining bundle output as part of the checkpoint. The setting may affect the checkpoint alignment. |
Default: false |
flink_master |
Address of the Flink Master where the Pipeline should be executed. Can either be of the form "host:port" or one of the special values [local], [collection] or [auto]. |
Default: [auto] |
latency_tracking_interval |
Interval in milliseconds for sending latency tracking marks from the sources to the sinks. Interval value <= 0 disables the feature. |
Default: 0 |
max_bundle_size |
The maximum number of elements in a bundle. |
Default: 1000 |
max_bundle_time_mills |
The maximum time to wait before finalising a bundle (in milliseconds). |
Default: 1000 |
max_parallelism |
The pipeline wide maximum degree of parallelism to be used. The maximum parallelism specifies the upper limit for dynamic scaling and the number of key groups used for partitioned state. |
Default: -1 |
min_pause_between_checkpoints |
The minimal pause in milliseconds before the next checkpoint is triggered. |
Default: -1 |
num_concurrent_checkpoints |
The maximum number of concurrent checkpoints. Defaults to 1 (=no concurrent checkpoints). |
Default: 1 |
number_of_execution_retries |
Sets the number of times that failed tasks are re-executed. A value of zero effectively disables fault tolerance. A value of -1 indicates that the system default value (as defined in the configuration) should be used. |
Default: -1 |
object_reuse |
Sets the behavior of reusing objects. |
Default: false |
parallelism |
The degree of parallelism to be used when distributing operations onto workers. If the parallelism is not set, the configured Flink default is used, or 1 if none can be found. |
Default: -1 |
re_iterable_group_by_key_result |
Flag indicating whether result of GBK needs to be re-iterable. Re-iterable result implies that all values for a single key must fit in memory as we currently do not support spilling to disk. |
Default: false |
report_checkpoint_duration |
If not null, reports the checkpoint duration of each ParDo stage in the provided metric namespace. |
|
retain_externalized_checkpoints_on_cancellation |
Sets the behavior of externalized checkpoints on cancellation. |
Default: false |
savepoint_path |
Savepoint restore path. If specified, restores the streaming pipeline from the provided path. |
|
shutdown_sources_after_idle_ms |
Shuts down sources which have been idle for the configured time of milliseconds. Once a source has been shut down, checkpointing is not possible anymore. Shutting down the sources eventually leads to pipeline shutdown (=Flink job finishes) once all input has been processed. Unless explicitly set, this will default to Long.MAX_VALUE when checkpointing is enabled and to 0 when checkpointing is disabled. See https://issues.apache.org/jira/browse/FLINK-2491 for progress on this issue. |
Default: -1 |
state_backend |
State backend to store Beam's state. Use 'rocksdb' or 'filesystem'. |
|
state_backend_factory |
Sets the state backend factory to use in streaming mode. Defaults to the flink cluster's state.backend configuration. |
|
state_backend_storage_path |
State backend path to persist state backend data. Used to initialize state backend. |
|