Lease
Leases a Dataflow WorkItem to run.
Authorization
To use this building block, you must grant access to at least one of the following scopes:
- View and manage your data across Google Cloud Platform services
- View and manage your Google Compute Engine resources
- View your Google Compute Engine resources
- View your email address
Input
This building block consumes 11 input parameters:
Name | Format | Description |
---|---|---|
projectId Required | STRING | Identifies the project this worker belongs to |
location Required | STRING | The [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that contains the WorkItem's job |
jobId Required | STRING | Identifies the workflow job this worker belongs to |
requestedLeaseDuration | ANY | The initial lease period |
currentWorkerTime | ANY | The current timestamp at the worker |
workItemTypes[] | STRING | |
location | STRING | The [regional endpoint](https://cloud.google.com/dataflow/docs/concepts/regional-endpoints) that contains the WorkItem's job |
unifiedWorkerRequest | OBJECT | Untranslated bag-of-bytes WorkRequest from UnifiedWorker |
unifiedWorkerRequest.customKey.value Required | ANY | Untranslated bag-of-bytes WorkRequest from UnifiedWorker |
workerCapabilities[] | STRING | |
workerId | STRING | Identifies the worker leasing work -- typically the ID of the virtual machine running the worker |
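As a sketch, the input parameters above can be assembled into a lease request body. This assumes the Dataflow v1b3 REST path for the lease method; the project, location, job, and worker identifiers are placeholders, and the actual POST (with an OAuth bearer token carrying one of the scopes listed under Authorization) is left out:

```python
import json
from datetime import datetime, timezone

# Hypothetical identifiers -- substitute your own project, region, and job.
PROJECT_ID = "my-project"
LOCATION = "us-central1"
JOB_ID = "2023-01-01_00_00_00-1234567890"

# Lease endpoint following the Dataflow v1b3 REST pattern (an assumption here).
LEASE_URL = (
    "https://dataflow.googleapis.com/v1b3/"
    f"projects/{PROJECT_ID}/locations/{LOCATION}/jobs/{JOB_ID}/workItems:lease"
)

def build_lease_body(worker_id, lease_duration="300s", work_item_types=None):
    """Assemble the JSON body from the input parameters in the table above."""
    return {
        "workerId": worker_id,
        "requestedLeaseDuration": lease_duration,
        "currentWorkerTime": datetime.now(timezone.utc)
            .isoformat().replace("+00:00", "Z"),
        # Example work item type names -- an assumption, not an exhaustive list.
        "workItemTypes": work_item_types or ["map_task", "seq_map_task"],
    }

body = build_lease_body("vm-worker-1")
payload = json.dumps(body)
```

The request itself would then be a POST of `payload` to `LEASE_URL` with a `Content-Type: application/json` header and a bearer token.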
Output
This building block provides 59 output parameters:
Name | Format | Description |
---|---|---|
unifiedWorkerResponse | OBJECT | Untranslated bag-of-bytes WorkResponse for UnifiedWorker |
unifiedWorkerResponse.customKey.value | ANY | Untranslated bag-of-bytes WorkResponse for UnifiedWorker |
workItems[] | OBJECT | WorkItem represents basic information about a WorkItem to be executed in the cloud |
workItems[].shellTask | OBJECT | A task which consists of a shell command for the worker to execute |
workItems[].shellTask.command | STRING | The shell command to run |
workItems[].shellTask.exitCode | INTEGER | Exit code for the task |
workItems[].streamingComputationTask | OBJECT | A task which describes what action should be performed for the specified streaming computation ranges |
workItems[].streamingComputationTask.computationRanges[] | OBJECT | Describes full or partial data disk assignment information of the computation ranges |
workItems[].streamingComputationTask.dataDisks[] | OBJECT | Describes a mounted data disk |
workItems[].streamingComputationTask.taskType | ENUMERATION | A type of streaming computation task |
workItems[].jobId | STRING | Identifies the workflow job this WorkItem belongs to |
workItems[].id | INTEGER | Identifies this WorkItem |
workItems[].configuration | STRING | Work item-specific configuration as an opaque blob |
workItems[].mapTask | OBJECT | MapTask consists of an ordered set of instructions, each of which describes one particular low-level operation for the worker to perform in order to accomplish the MapTask's WorkItem. Each instruction must appear in the list before any instructions which depend on its output |
workItems[].mapTask.systemName | STRING | System-defined name of this MapTask. Unique across the workflow |
workItems[].mapTask.stageName | STRING | System-defined name of the stage containing this MapTask. Unique across the workflow |
workItems[].mapTask.instructions[] | OBJECT | Describes a particular operation comprising a MapTask |
workItems[].mapTask.counterPrefix | STRING | Counter prefix that can be used to prefix counters. Not currently used in Dataflow |
workItems[].seqMapTask | OBJECT | Describes a particular function to invoke |
workItems[].seqMapTask.name | STRING | The user-provided name of the SeqDo operation |
workItems[].seqMapTask.outputInfos[] | OBJECT | Information about an output of a SeqMapTask |
workItems[].seqMapTask.inputs[] | OBJECT | Information about a side input of a DoFn or an input of a SeqDoFn |
workItems[].seqMapTask.stageName | STRING | System-defined name of the stage containing the SeqDo operation. Unique across the workflow |
workItems[].seqMapTask.systemName | STRING | System-defined name of the SeqDo operation. Unique across the workflow |
workItems[].seqMapTask.userFn | OBJECT | The user function to invoke |
workItems[].seqMapTask.userFn.customKey.value | ANY | The user function to invoke |
workItems[].packages[] | OBJECT | The packages that must be installed in order for a worker to run the steps of the Cloud Dataflow job that will be assigned to its worker pool. This is the mechanism by which the Cloud Dataflow SDK causes code to be loaded onto the workers. For example, the Cloud Dataflow Java SDK might use this to install jars containing the user's code and all of the various dependencies (libraries, data files, etc.) required in order for that code to run |
workItems[].packages[].location | STRING | The resource to read the package from. The supported resource type is: Google Cloud Storage: storage.googleapis.com/{bucket} bucket.storage.googleapis.com/ |
workItems[].packages[].name | STRING | The name of the package |
workItems[].projectId | STRING | Identifies the cloud project this WorkItem belongs to |
workItems[].streamingSetupTask | OBJECT | A task which initializes part of a streaming Dataflow job |
workItems[].streamingSetupTask.streamingComputationTopology | OBJECT | Global topology of the streaming Dataflow job, including all computations and their sharded locations |
workItems[].streamingSetupTask.streamingComputationTopology.userStageToComputationNameMap | OBJECT | Maps user stage names to stable computation names |
workItems[].streamingSetupTask.streamingComputationTopology.userStageToComputationNameMap.customKey.value | STRING | Maps user stage names to stable computation names |
workItems[].streamingSetupTask.streamingComputationTopology.persistentStateVersion | INTEGER | Version number for persistent state |
workItems[].streamingSetupTask.streamingComputationTopology.forwardingKeyBits | INTEGER | The size (in bits) of keys that will be assigned to source messages |
workItems[].streamingSetupTask.snapshotConfig | OBJECT | Streaming appliance snapshot configuration |
workItems[].streamingSetupTask.snapshotConfig.importStateEndpoint | STRING | Indicates which endpoint is used to import appliance state |
workItems[].streamingSetupTask.snapshotConfig.snapshotId | STRING | If set, indicates the snapshot ID for the snapshot being performed |
workItems[].streamingSetupTask.workerHarnessPort | INTEGER | The TCP port used by the worker to communicate with the Dataflow worker harness |
workItems[].streamingSetupTask.drain | BOOLEAN | The user has requested drain |
workItems[].streamingSetupTask.receiveWorkPort | INTEGER | The TCP port on which the worker should listen for messages from other streaming computation workers |
workItems[].reportStatusInterval | ANY | Recommended reporting interval |
workItems[].sourceOperationTask | OBJECT | A work item that represents the different operations that can be performed on a user-defined Source specification |
workItems[].sourceOperationTask.split | OBJECT | Represents the operation to split a high-level Source specification into bundles (parts for parallel processing). At a high level, splitting of a source into bundles happens as follows: SourceSplitRequest is applied to the source. If it returns SOURCE_SPLIT_OUTCOME_USE_CURRENT, no further splitting happens and the source is used "as is". Otherwise, splitting is applied recursively to each produced DerivedSource. As an optimization, for any Source, if its does_not_need_splitting is true, the framework assumes that splitting this source would return SOURCE_SPLIT_OUTCOME_USE_CURRENT, and doesn't initiate a SourceSplitRequest. This applies both to the initial source being split and to bundles produced from it |
workItems[].sourceOperationTask.name | STRING | User-provided name of the Read instruction for this source |
workItems[].sourceOperationTask.originalName | STRING | System-defined name for the Read instruction for this source in the original workflow graph |
workItems[].sourceOperationTask.systemName | STRING | System-defined name of the Read instruction for this source. Unique across the workflow |
workItems[].sourceOperationTask.stageName | STRING | System-defined name of the stage containing the source operation. Unique across the workflow |
workItems[].sourceOperationTask.getMetadata | OBJECT | A request to compute the SourceMetadata of a Source |
workItems[].leaseExpireTime | ANY | Time when the lease on this Work will expire |
workItems[].streamingConfigTask | OBJECT | A task that carries configuration information for streaming computations |
workItems[].streamingConfigTask.windmillServiceEndpoint | STRING | If present, the worker must use this endpoint to communicate with Windmill Service dispatchers; otherwise the worker must continue to use whatever endpoint it had been using |
workItems[].streamingConfigTask.maxWorkItemCommitBytes | INTEGER | Maximum size for a work item commit supported by the Windmill storage layer |
workItems[].streamingConfigTask.userStepToStateFamilyNameMap | OBJECT | Map from user step names to state families |
workItems[].streamingConfigTask.userStepToStateFamilyNameMap.customKey.value | STRING | Map from user step names to state families |
workItems[].streamingConfigTask.windmillServicePort | INTEGER | If present, the worker must use this port to communicate with Windmill Service dispatchers. Only applicable when windmill_service_endpoint is specified |
workItems[].streamingConfigTask.streamingComputationConfigs[] | OBJECT | Configuration information for a single streaming computation |
workItems[].initialReportIndex | INTEGER | The initial index to use when reporting the status of the WorkItem |
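To illustrate how a worker might consume the response, here is a minimal sketch that dispatches on which task field a leased WorkItem carries. The field names follow the output table above; the sample response itself is fabricated for illustration:

```python
# Fabricated response, shaped like the output schema documented above.
sample_response = {
    "workItems": [
        {
            "id": 1,
            "jobId": "job-1",
            "leaseExpireTime": "2023-01-01T00:05:00Z",
            "shellTask": {"command": "echo hello"},
        },
        {
            "id": 2,
            "jobId": "job-1",
            "leaseExpireTime": "2023-01-01T00:05:00Z",
            "mapTask": {"systemName": "s1", "stageName": "S01", "instructions": []},
        },
    ]
}

# The task fields a WorkItem can carry, per the output table; a WorkItem is
# expected to set one of them.
TASK_FIELDS = (
    "shellTask", "mapTask", "seqMapTask", "sourceOperationTask",
    "streamingSetupTask", "streamingComputationTask", "streamingConfigTask",
)

def classify(work_item):
    """Return the name of the task field present on a leased WorkItem, or None."""
    for field in TASK_FIELDS:
        if field in work_item:
            return field
    return None

kinds = [classify(wi) for wi in sample_response["workItems"]]
```

For the fabricated response above, `kinds` comes out as `["shellTask", "mapTask"]`; a real worker would also track `leaseExpireTime` and report status at the recommended `reportStatusInterval`.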