Lease

Leases a Dataflow WorkItem to run

10 input variables
59 output variables

Authorization

To use this building block, you must grant access to at least one of the following scopes:

  • View and manage your data across Google Cloud Platform services
  • View and manage your Google Compute Engine resources
  • View your Google Compute Engine resources
  • View your email address
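
These descriptions correspond to standard Google OAuth scopes; the exact scope URLs below are an assumption based on the wording above. A minimal sketch of obtaining suitably scoped credentials with google-auth and building a Dataflow client with google-api-python-client:

```python
from google.auth import default
from googleapiclient.discovery import build

# Assumed scope URLs matching the descriptions above; verify against your
# project's OAuth configuration before relying on them.
SCOPES = [
    "https://www.googleapis.com/auth/cloud-platform",    # view and manage data across GCP services
    "https://www.googleapis.com/auth/compute",           # view and manage Compute Engine resources
    "https://www.googleapis.com/auth/compute.readonly",  # view Compute Engine resources
    "https://www.googleapis.com/auth/userinfo.email",    # view your email address
]

# Application Default Credentials restricted to the broadest accepted scope.
credentials, _ = default(scopes=[SCOPES[0]])

# Discovery-based client for the Dataflow v1b3 API that backs this building block.
dataflow = build("dataflow", "v1b3", credentials=credentials)
```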

Input

This building block consumes 10 input parameters

Each entry lists the parameter name, its format, and whether it is required, followed by a description.

projectId STRING Required

Identifies the project this worker belongs to

jobId STRING Required

Identifies the workflow job this worker belongs to

requestedLeaseDuration ANY

The initial lease period

currentWorkerTime ANY

The current timestamp at the worker

workItemTypes[] STRING

location STRING

The [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that contains the WorkItem's job

unifiedWorkerRequest OBJECT

Untranslated bag-of-bytes WorkRequest from UnifiedWorker

unifiedWorkerRequest.customKey.value ANY Required

Untranslated bag-of-bytes WorkRequest from UnifiedWorker

workerCapabilities[] STRING

workerId STRING

Identifies the worker leasing work -- typically the ID of the virtual machine running the worker
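
Taken together, these inputs form the request body of the underlying Dataflow v1b3 projects.jobs.workItems.lease method. A self-contained sketch with the Python client library; all identifiers are placeholders and the work item type strings are assumptions:

```python
from google.auth import default
from googleapiclient.discovery import build

credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
dataflow = build("dataflow", "v1b3", credentials=credentials)

# Placeholder values. requestedLeaseDuration uses the JSON Duration encoding ("300s");
# currentWorkerTime uses an RFC 3339 timestamp.
lease_request = {
    "workerId": "dataflow-worker-vm-1",             # typically the worker VM's ID
    "workItemTypes": ["map_task", "seq_map_task"],  # assumed type names
    "workerCapabilities": [],
    "requestedLeaseDuration": "300s",
    "currentWorkerTime": "2024-01-01T00:00:00Z",
    "location": "us-central1",
}

response = (
    dataflow.projects()
    .jobs()
    .workItems()
    .lease(projectId="my-project", jobId="my-job-id", body=lease_request)
    .execute()
)

leased = response.get("workItems", [])
print(f"Leased {len(leased)} work item(s)")
```

The response dict mirrors the output parameters documented in the next section.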

Output

This building block provides 59 output parameters

Each entry lists the parameter name and format, followed by a description.

unifiedWorkerResponse OBJECT

Untranslated bag-of-bytes WorkResponse for UnifiedWorker

unifiedWorkerResponse.customKey.value ANY

Untranslated bag-of-bytes WorkResponse for UnifiedWorker

workItems[] OBJECT

WorkItem represents basic information about a WorkItem to be executed in the cloud

workItems[].shellTask OBJECT

A task which consists of a shell command for the worker to execute

workItems[].shellTask.command STRING

The shell command to run

workItems[].shellTask.exitCode INTEGER

Exit code for the task

workItems[].streamingComputationTask OBJECT

A task which describes what action should be performed for the specified streaming computation ranges

workItems[].streamingComputationTask.computationRanges[] OBJECT

Describes full or partial data disk assignment information of the computation ranges

workItems[].streamingComputationTask.dataDisks[] OBJECT

Describes a mounted data disk

workItems[].streamingComputationTask.taskType ENUMERATION

A type of streaming computation task

workItems[].jobId STRING

Identifies the workflow job this WorkItem belongs to

workItems[].id INTEGER

Identifies this WorkItem

workItems[].configuration STRING

Work item-specific configuration as an opaque blob

workItems[].mapTask OBJECT

MapTask consists of an ordered set of instructions, each of which describes one particular low-level operation for the worker to perform in order to accomplish the MapTask's WorkItem.

Each instruction must appear in the list before any instruction that depends on its output

workItems[].mapTask.systemName STRING

System-defined name of this MapTask. Unique across the workflow

workItems[].mapTask.stageName STRING

System-defined name of the stage containing this MapTask. Unique across the workflow

workItems[].mapTask.instructions[] OBJECT

Describes a particular operation comprising a MapTask

workItems[].mapTask.counterPrefix STRING

Counter prefix that can be used to prefix counters. Not currently used in Dataflow

workItems[].seqMapTask OBJECT

Describes a particular function to invoke

workItems[].seqMapTask.name STRING

The user-provided name of the SeqDo operation

workItems[].seqMapTask.outputInfos[] OBJECT

Information about an output of a SeqMapTask

workItems[].seqMapTask.inputs[] OBJECT

Information about a side input of a DoFn or an input of a SeqDoFn

workItems[].seqMapTask.stageName STRING

System-defined name of the stage containing the SeqDo operation. Unique across the workflow

workItems[].seqMapTask.systemName STRING

System-defined name of the SeqDo operation. Unique across the workflow

workItems[].seqMapTask.userFn OBJECT

The user function to invoke

workItems[].seqMapTask.userFn.customKey.value ANY

The user function to invoke

workItems[].packages[] OBJECT

The packages that must be installed in order for a worker to run the steps of the Cloud Dataflow job that will be assigned to its worker pool.

This is the mechanism by which the Cloud Dataflow SDK causes code to be loaded onto the workers. For example, the Cloud Dataflow Java SDK might use this to install jars containing the user's code and all of the various dependencies (libraries, data files, etc.) required in order for that code to run

workItems[].packages[].location STRING

The resource to read the package from. The supported resource type is:

Google Cloud Storage:

  • storage.googleapis.com/{bucket}
  • bucket.storage.googleapis.com/

workItems[].packages[].name STRING

The name of the package

workItems[].projectId STRING

Identifies the cloud project this WorkItem belongs to

workItems[].streamingSetupTask OBJECT

A task which initializes part of a streaming Dataflow job

workItems[].streamingSetupTask.streamingComputationTopology OBJECT

Global topology of the streaming Dataflow job, including all computations and their sharded locations

workItems[].streamingSetupTask.streamingComputationTopology.userStageToComputationNameMap OBJECT

Maps user stage names to stable computation names

workItems[].streamingSetupTask.streamingComputationTopology.userStageToComputationNameMap.customKey.value STRING

Maps user stage names to stable computation names

workItems[].streamingSetupTask.streamingComputationTopology.persistentStateVersion INTEGER

Version number for persistent state

workItems[].streamingSetupTask.streamingComputationTopology.forwardingKeyBits INTEGER

The size (in bits) of keys that will be assigned to source messages

workItems[].streamingSetupTask.snapshotConfig OBJECT

Streaming appliance snapshot configuration

workItems[].streamingSetupTask.snapshotConfig.importStateEndpoint STRING

Indicates which endpoint is used to import appliance state

workItems[].streamingSetupTask.snapshotConfig.snapshotId STRING

If set, indicates the snapshot ID for the snapshot being performed

workItems[].streamingSetupTask.workerHarnessPort INTEGER

The TCP port used by the worker to communicate with the Dataflow worker harness

workItems[].streamingSetupTask.drain BOOLEAN

The user has requested drain

workItems[].streamingSetupTask.receiveWorkPort INTEGER

The TCP port on which the worker should listen for messages from other streaming computation workers

workItems[].reportStatusInterval ANY

Recommended reporting interval

workItems[].sourceOperationTask OBJECT

A work item that represents the different operations that can be performed on a user-defined Source specification

workItems[].sourceOperationTask.split OBJECT

Represents the operation to split a high-level Source specification into bundles (parts for parallel processing).

At a high level, splitting of a source into bundles happens as follows: SourceSplitRequest is applied to the source. If it returns SOURCE_SPLIT_OUTCOME_USE_CURRENT, no further splitting happens and the source is used "as is". Otherwise, splitting is applied recursively to each produced DerivedSource.

As an optimization, for any Source, if its does_not_need_splitting is true, the framework assumes that splitting this source would return SOURCE_SPLIT_OUTCOME_USE_CURRENT, and doesn't initiate a SourceSplitRequest. This applies both to the initial source being split and to bundles produced from it

workItems[].sourceOperationTask.name STRING

User-provided name of the Read instruction for this source

workItems[].sourceOperationTask.originalName STRING

System-defined name for the Read instruction for this source in the original workflow graph

workItems[].sourceOperationTask.systemName STRING

System-defined name of the Read instruction for this source. Unique across the workflow

workItems[].sourceOperationTask.stageName STRING

System-defined name of the stage containing the source operation. Unique across the workflow

workItems[].sourceOperationTask.getMetadata OBJECT

A request to compute the SourceMetadata of a Source

workItems[].leaseExpireTime ANY

Time when the lease on this Work will expire

workItems[].streamingConfigTask OBJECT

A task that carries configuration information for streaming computations

workItems[].streamingConfigTask.windmillServiceEndpoint STRING

If present, the worker must use this endpoint to communicate with Windmill Service dispatchers; otherwise, the worker must continue to use whatever endpoint it had been using

workItems[].streamingConfigTask.maxWorkItemCommitBytes INTEGER

Maximum work item commit size supported by the Windmill storage layer

workItems[].streamingConfigTask.userStepToStateFamilyNameMap OBJECT

Map from user step names to state families

workItems[].streamingConfigTask.userStepToStateFamilyNameMap.customKey.value STRING

Map from user step names to state families

workItems[].streamingConfigTask.windmillServicePort INTEGER

If present, the worker must use this port to communicate with Windmill Service dispatchers. Only applicable when windmill_service_endpoint is specified

workItems[].streamingConfigTask.streamingComputationConfigs[] OBJECT

Configuration information for a single streaming computation

workItems[].initialReportIndex INTEGER

The initial index to use when reporting the status of the WorkItem
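
For orientation, a hedged illustration of the response shape with a single shell-task work item. Field names follow the output parameters documented above; the values are invented, and int64 fields appear as strings, as in the usual JSON encoding:

```python
# Illustrative only; not captured from a real API response.
example_response = {
    "workItems": [
        {
            "id": "12345",                          # int64 values are serialized as strings
            "projectId": "my-project",
            "jobId": "my-job-id",
            "leaseExpireTime": "2024-01-01T00:05:00Z",
            "reportStatusInterval": "10s",
            "initialReportIndex": "1",
            "shellTask": {"command": "echo hello"},
        }
    ]
}

# A worker typically dispatches on which task field is present in each work item.
for item in example_response["workItems"]:
    if "shellTask" in item:
        print("shell task", item["id"], "->", item["shellTask"]["command"])
```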