Instantiate Inline

Instantiates a template and begins execution


Instantiates a template and begins execution. This method is equivalent to executing the sequence CreateWorkflowTemplate, InstantiateWorkflowTemplate, DeleteWorkflowTemplate. The returned Operation can be used to track execution of the workflow by polling operations.get. The Operation will complete when the entire workflow is finished. The running workflow can be aborted via operations.cancel, which causes any inflight jobs to be cancelled and workflow-owned clusters to be deleted. Operation.metadata will be WorkflowMetadata. On successful completion, Operation.response will be Empty
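The shape of the inline request can be sketched as a JSON body built in Python. This is a minimal illustration, not a complete template: the project, region, bucket, and cluster names are placeholders, and the field names mirror the parameter list below.

```python
import json

# Placeholder resource name of the target region.
parent = "projects/my-project/regions/us-central1"

# Inline template: a managed cluster plus a single Hadoop step.
# The template is supplied in the request body and is never saved.
body = {
    "id": "word-count",
    "placement": {
        "managedCluster": {
            "clusterName": "wc-cluster",  # prefix; a random suffix is appended
            "config": {"gceClusterConfig": {"zoneUri": "us-central1-a"}},
        }
    },
    "jobs": [
        {
            "stepId": "count",
            "hadoopJob": {
                "mainJarFileUri": "file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar",
                "args": ["wordcount", "gs://my-bucket/in", "gs://my-bucket/out"],
            },
        }
    ],
}

# POST https://dataproc.googleapis.com/v1/{parent}/workflowTemplates:instantiateInline
payload = json.dumps(body)
```

The POST URL in the comment follows the standard Dataproc v1 REST mapping; an actual call would also need an OAuth bearer token for one of the scopes listed under Authorization.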

Authorization

To use this building block you will have to grant access to at least one of the following scopes:

  • View and manage your data across Google Cloud Platform services

Input

This building block consumes 114 input parameters


parent STRING Required

Required. The "resource name" of the workflow template region, as described in https://cloud.google.com/apis/design/resource_names, of the form projects/{project_id}/regions/{region}

requestId STRING

Optional. A tag that prevents multiple concurrent workflow instances with the same tag from running. This mitigates the risk of concurrent instances started due to retries. It is recommended to always set this value to a UUID (https://en.wikipedia.org/wiki/Universally_unique_identifier). The tag must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), and hyphens (-). The maximum length is 40 characters
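Following the recommendation above, a request tag can be generated from a UUID and checked against the stated character and length constraints:

```python
import re
import uuid

# A UUID string is 36 characters of hex digits and hyphens, which fits
# the documented alphabet (letters, digits, underscores, hyphens) and
# the 40-character limit.
request_id = str(uuid.uuid4())
TAG_RE = re.compile(r"^[A-Za-z0-9_-]{1,40}$")
assert TAG_RE.fullmatch(request_id)
```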

placement OBJECT

Specifies the workflow execution target. Either managed_cluster or cluster_selector is required
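The either/or constraint on placement can be made explicit with a small guard. Treating the two fields as mutually exclusive (exactly one present) is an assumption drawn from the "either ... or" wording; the field names match the parameters below.

```python
# Accept a placement dict only if exactly one of the two target
# fields is present -- assumed from the either/or requirement.
def check_placement(placement: dict) -> bool:
    keys = {"managedCluster", "clusterSelector"}
    return len(keys & placement.keys()) == 1

# Two valid shapes: select an existing cluster by label, or describe
# a workflow-managed cluster to be created and torn down.
selector = {"clusterSelector": {"clusterLabels": {"env": "prod"}}}
managed = {"managedCluster": {"clusterName": "wf-cluster"}}
```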

placement.clusterSelector OBJECT

A selector that chooses target cluster for jobs based on metadata

placement.clusterSelector.clusterLabels OBJECT

Required. The cluster labels. Cluster must have all labels to match

placement.clusterSelector.clusterLabels.customKey.value STRING Required

Required. The cluster labels. Cluster must have all labels to match

placement.clusterSelector.zone STRING

Optional. The zone where the workflow process executes. This parameter does not affect the selection of the cluster. If unspecified, the zone of the first cluster matching the selector is used

placement.managedCluster OBJECT

Cluster that is managed by the workflow

placement.managedCluster.labels OBJECT

Optional. The labels to associate with this cluster. Label keys must be between 1 and 63 characters long, and must conform to the following PCRE regular expression: \p{Ll}\p{Lo}{0,62}. Label values must be between 1 and 63 characters long, and must conform to the following PCRE regular expression: [\p{Ll}\p{Lo}\p{N}_-]{0,63}. No more than 32 labels can be associated with a given cluster
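A label check can be sketched in Python, with one caveat: the stdlib `re` module has no `\p{...}` Unicode property classes, so this approximates the PCRE patterns with their ASCII subset (lowercase letters for `\p{Ll}`/`\p{Lo}`, digits for `\p{N}`). A faithful Unicode check would need the third-party `regex` module.

```python
import re

# ASCII approximations of the documented PCRE patterns.
KEY_RE = re.compile(r"^[a-z][a-z]{0,62}$")     # ~ \p{Ll}\p{Lo}{0,62}
VALUE_RE = re.compile(r"^[a-z0-9_-]{0,63}$")   # ~ [\p{Ll}\p{Lo}\p{N}_-]{0,63}

def valid_label(key: str, value: str) -> bool:
    # Keys need at least one character; values may be empty.
    return bool(KEY_RE.fullmatch(key) and VALUE_RE.fullmatch(value))
```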

placement.managedCluster.labels.customKey.value STRING Required

Optional. The labels to associate with this cluster. Label keys must be between 1 and 63 characters long, and must conform to the following PCRE regular expression: \p{Ll}\p{Lo}{0,62}. Label values must be between 1 and 63 characters long, and must conform to the following PCRE regular expression: [\p{Ll}\p{Lo}\p{N}_-]{0,63}. No more than 32 labels can be associated with a given cluster

placement.managedCluster.config OBJECT

The cluster config

placement.managedCluster.config.workerConfig OBJECT

Optional. The config settings for Compute Engine resources in an instance group, such as a master or worker group

placement.managedCluster.config.gceClusterConfig OBJECT

Common config settings for resources of Compute Engine cluster instances, applicable to all instances in the cluster

placement.managedCluster.config.softwareConfig OBJECT

Specifies the selection and config of software inside the cluster

placement.managedCluster.config.masterConfig OBJECT

Optional. The config settings for Compute Engine resources in an instance group, such as a master or worker group

placement.managedCluster.config.secondaryWorkerConfig OBJECT

Optional. The config settings for Compute Engine resources in an instance group, such as a master or worker group

placement.managedCluster.config.encryptionConfig OBJECT

Encryption settings for the cluster

placement.managedCluster.config.securityConfig OBJECT

Security related configuration, including Kerberos

placement.managedCluster.config.configBucket STRING

Optional. A Google Cloud Storage bucket used to stage job dependencies, config files, and job driver console output. If you do not specify a staging bucket, Cloud Dataproc will determine a Cloud Storage location (US, ASIA, or EU) for your cluster's staging bucket according to the Google Compute Engine zone where your cluster is deployed, and then create and manage this project-level, per-location bucket (see Cloud Dataproc staging bucket)

placement.managedCluster.clusterName STRING

Required. The cluster name prefix. A unique cluster name will be formed by appending a random suffix. The name must contain only lower-case letters (a-z), numbers (0-9), and hyphens (-). Must begin with a letter. Cannot begin or end with a hyphen. Must consist of between 2 and 35 characters

updateTime ANY

Output only. The time the template was last updated

parameters[] OBJECT

A configurable parameter that replaces one or more fields in the template. Parameterizable fields: - Labels - File uris - Job properties - Job arguments - Script variables - Main class (in HadoopJob and SparkJob) - Zone (in ClusterSelector)
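A parameter entry pairs a name with the template field paths it replaces, optionally guarded by validation. The example below is hypothetical: the field-path string and the bucket are placeholders meant to show the shape, and the regex validation is applied here with Python's `re` purely for illustration.

```python
import re

# Hypothetical parameter exposing a job argument as OUTPUT_PATH,
# constrained to Cloud Storage URIs via regex validation.
parameter = {
    "name": "OUTPUT_PATH",
    "description": "GCS directory for job output",
    "fields": ["jobs['count'].hadoopJob.args[2]"],  # illustrative field path
    "validation": {"regex": {"regexes": [r"^gs://.+$"]}},
}

# A candidate value passes if it matches every listed regex.
candidate = "gs://my-bucket/out"
ok = all(re.match(p, candidate) for p in parameter["validation"]["regex"]["regexes"])
```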

parameters[].validation OBJECT

Configuration for parameter validation

parameters[].validation.values OBJECT

Validation based on a list of allowed values

parameters[].validation.values.values[] STRING

parameters[].validation.regex OBJECT

Validation based on regular expressions

parameters[].validation.regex.regexes[] STRING

parameters[].fields[] STRING

parameters[].name STRING

Required. Parameter name. The parameter name is used as the key, and paired with the parameter value, which are passed to the template when the template is instantiated. The name must contain only capital letters (A-Z), numbers (0-9), and underscores (_), and must not start with a number. The maximum length is 40 characters

parameters[].description STRING

Optional. Brief description of the parameter. Must not exceed 1024 characters

name STRING

Output only. The "resource name" of the template, as described in https://cloud.google.com/apis/design/resource_names, of the form projects/{project_id}/regions/{region}/workflowTemplates/{template_id}

version INTEGER

Optional. Used to perform a consistent read-modify-write. This field should be left blank for a CreateWorkflowTemplate request. It is required for an UpdateWorkflowTemplate request, and must match the current server version. A typical update template flow would fetch the current template with a GetWorkflowTemplate request, which will return the current template with the version field filled in with the current server version. The user updates other fields in the template, then returns it as part of the UpdateWorkflowTemplate request

id STRING

Required. The template id. The id must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), and hyphens (-). Cannot begin or end with an underscore or hyphen. Must consist of between 3 and 50 characters

jobs[] OBJECT

A job executed by the workflow

jobs[].hadoopJob OBJECT

A Cloud Dataproc job for running Apache Hadoop MapReduce (https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html) jobs on Apache Hadoop YARN (https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/YARN.html)
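A hadoopJob entry can be sketched as follows; the bucket names are placeholders. Exactly one of mainJarFileUri or mainClass identifies the driver, and this example uses the stock examples jar, whose main class dispatches on the first argument.

```python
# Word-count via the Hadoop examples jar, with a Hadoop property
# override and per-package driver log levels.
hadoop_job = {
    "mainJarFileUri": "file:///usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar",
    "args": ["wordcount", "gs://my-bucket/input", "gs://my-bucket/output"],
    "properties": {"mapreduce.job.reduces": "2"},
    "loggingConfig": {"driverLogLevels": {"root": "INFO", "org.apache": "DEBUG"}},
}
```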

jobs[].hadoopJob.args[] STRING

jobs[].hadoopJob.fileUris[] STRING

jobs[].hadoopJob.mainClass STRING

The name of the driver's main class. The jar file containing the class must be in the default CLASSPATH or specified in jar_file_uris

jobs[].hadoopJob.archiveUris[] STRING

jobs[].hadoopJob.mainJarFileUri STRING

The HCFS URI of the jar file containing the main class. Examples: 'gs://foo-bucket/analytics-binaries/extract-useful-metrics-mr.jar' 'hdfs:/tmp/test-samples/custom-wordcount.jar' 'file:///home/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar'

jobs[].hadoopJob.jarFileUris[] STRING

jobs[].hadoopJob.loggingConfig OBJECT

The runtime logging config of the job

jobs[].hadoopJob.loggingConfig.driverLogLevels OBJECT

The per-package log levels for the driver. This may include "root" package name to configure rootLogger. Examples: 'com.google = FATAL', 'root = INFO', 'org.apache = DEBUG'

jobs[].hadoopJob.loggingConfig.driverLogLevels.customKey.value ENUMERATION Required

The per-package log levels for the driver. This may include "root" package name to configure rootLogger. Examples: 'com.google = FATAL', 'root = INFO', 'org.apache = DEBUG'

jobs[].hadoopJob.properties OBJECT

Optional. A mapping of property names to values, used to configure Hadoop. Properties that conflict with values set by the Cloud Dataproc API may be overwritten. Can include properties set in /etc/hadoop/conf/*-site and classes in user code

jobs[].hadoopJob.properties.customKey.value STRING Required

Optional. A mapping of property names to values, used to configure Hadoop. Properties that conflict with values set by the Cloud Dataproc API may be overwritten. Can include properties set in /etc/hadoop/conf/*-site and classes in user code

jobs[].hiveJob OBJECT

A Cloud Dataproc job for running Apache Hive (https://hive.apache.org/) queries on YARN
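A hiveJob can carry its queries inline through queryList rather than a queryFileUri. The table name below is a placeholder, and the scriptVariables comment restates the documented SET equivalence.

```python
# Inline Hive queries with a script variable and fault-tolerant
# execution across independent queries.
hive_job = {
    "queryList": {"queries": ["SHOW TABLES;", "SELECT COUNT(*) FROM logs;"]},
    "scriptVariables": {"env": "prod"},  # equivalent to: SET env="prod";
    "continueOnFailure": True,           # keep going if one query fails
}
```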

jobs[].hiveJob.continueOnFailure BOOLEAN

Optional. Whether to continue executing queries if a query fails. The default value is false. Setting to true can be useful when executing independent parallel queries

jobs[].hiveJob.queryFileUri STRING

The HCFS URI of the script that contains Hive queries

jobs[].hiveJob.queryList OBJECT

A list of queries to run on a cluster

jobs[].hiveJob.queryList.queries[] STRING

jobs[].hiveJob.jarFileUris[] STRING

jobs[].hiveJob.scriptVariables OBJECT

Optional. Mapping of query variable names to values (equivalent to the Hive command: SET name="value";)

jobs[].hiveJob.scriptVariables.customKey.value STRING Required

Optional. Mapping of query variable names to values (equivalent to the Hive command: SET name="value";)

jobs[].hiveJob.properties OBJECT

Optional. A mapping of property names and values, used to configure Hive. Properties that conflict with values set by the Cloud Dataproc API may be overwritten. Can include properties set in /etc/hadoop/conf/*-site.xml, /etc/hive/conf/hive-site.xml, and classes in user code

jobs[].hiveJob.properties.customKey.value STRING Required

Optional. A mapping of property names and values, used to configure Hive. Properties that conflict with values set by the Cloud Dataproc API may be overwritten. Can include properties set in /etc/hadoop/conf/*-site.xml, /etc/hive/conf/hive-site.xml, and classes in user code

jobs[].prerequisiteStepIds[] STRING

jobs[].labels OBJECT

Optional. The labels to associate with this job. Label keys must be between 1 and 63 characters long, and must conform to the following regular expression: \p{Ll}\p{Lo}{0,62}. Label values must be between 1 and 63 characters long, and must conform to the following regular expression: [\p{Ll}\p{Lo}\p{N}_-]{0,63}. No more than 32 labels can be associated with a given job

jobs[].labels.customKey.value STRING Required

Optional. The labels to associate with this job. Label keys must be between 1 and 63 characters long, and must conform to the following regular expression: \p{Ll}\p{Lo}{0,62}. Label values must be between 1 and 63 characters long, and must conform to the following regular expression: [\p{Ll}\p{Lo}\p{N}_-]{0,63}. No more than 32 labels can be associated with a given job

jobs[].sparkJob OBJECT

A Cloud Dataproc job for running Apache Spark (http://spark.apache.org/) applications on YARN
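A sparkJob using mainClass looks like the sketch below; the class, jar, and argument values are placeholders. When mainClass is used instead of mainJarFileUri, the class must be on the default CLASSPATH or inside one of jarFileUris.

```python
# Spark driver identified by class name, with its jar supplied
# explicitly and a Spark property override.
spark_job = {
    "mainClass": "com.example.Aggregate",
    "jarFileUris": ["gs://my-bucket/jars/aggregate.jar"],
    "args": ["--date", "2020-01-01"],
    "properties": {"spark.executor.memory": "4g"},
}
```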

jobs[].sparkJob.args[] STRING

jobs[].sparkJob.fileUris[] STRING

jobs[].sparkJob.mainClass STRING

The name of the driver's main class. The jar file that contains the class must be in the default CLASSPATH or specified in jar_file_uris

jobs[].sparkJob.archiveUris[] STRING

jobs[].sparkJob.mainJarFileUri STRING

The HCFS URI of the jar file that contains the main class

jobs[].sparkJob.jarFileUris[] STRING

jobs[].sparkJob.loggingConfig OBJECT

The runtime logging config of the job

jobs[].sparkJob.loggingConfig.driverLogLevels OBJECT

The per-package log levels for the driver. This may include "root" package name to configure rootLogger. Examples: 'com.google = FATAL', 'root = INFO', 'org.apache = DEBUG'

jobs[].sparkJob.loggingConfig.driverLogLevels.customKey.value ENUMERATION Required

The per-package log levels for the driver. This may include "root" package name to configure rootLogger. Examples: 'com.google = FATAL', 'root = INFO', 'org.apache = DEBUG'

jobs[].sparkJob.properties OBJECT

Optional. A mapping of property names to values, used to configure Spark. Properties that conflict with values set by the Cloud Dataproc API may be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code

jobs[].sparkJob.properties.customKey.value STRING Required

Optional. A mapping of property names to values, used to configure Spark. Properties that conflict with values set by the Cloud Dataproc API may be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code

jobs[].sparkSqlJob OBJECT

A Cloud Dataproc job for running Apache Spark SQL (http://spark.apache.org/sql/) queries

jobs[].sparkSqlJob.queryList OBJECT

A list of queries to run on a cluster

jobs[].sparkSqlJob.queryList.queries[] STRING

jobs[].sparkSqlJob.queryFileUri STRING

The HCFS URI of the script that contains SQL queries

jobs[].sparkSqlJob.scriptVariables OBJECT

Optional. Mapping of query variable names to values (equivalent to the Spark SQL command: SET name="value";)

jobs[].sparkSqlJob.scriptVariables.customKey.value STRING Required

Optional. Mapping of query variable names to values (equivalent to the Spark SQL command: SET name="value";)

jobs[].sparkSqlJob.jarFileUris[] STRING

jobs[].sparkSqlJob.loggingConfig OBJECT

The runtime logging config of the job

jobs[].sparkSqlJob.loggingConfig.driverLogLevels OBJECT

The per-package log levels for the driver. This may include "root" package name to configure rootLogger. Examples: 'com.google = FATAL', 'root = INFO', 'org.apache = DEBUG'

jobs[].sparkSqlJob.loggingConfig.driverLogLevels.customKey.value ENUMERATION Required

The per-package log levels for the driver. This may include "root" package name to configure rootLogger. Examples: 'com.google = FATAL', 'root = INFO', 'org.apache = DEBUG'

jobs[].sparkSqlJob.properties OBJECT

Optional. A mapping of property names to values, used to configure Spark SQL's SparkConf. Properties that conflict with values set by the Cloud Dataproc API may be overwritten

jobs[].sparkSqlJob.properties.customKey.value STRING Required

Optional. A mapping of property names to values, used to configure Spark SQL's SparkConf. Properties that conflict with values set by the Cloud Dataproc API may be overwritten

jobs[].pysparkJob OBJECT

A Cloud Dataproc job for running Apache PySpark (https://spark.apache.org/docs/0.9.0/python-programming-guide.html) applications on YARN
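A pysparkJob needs a mainPythonFileUri ending in .py, with any supporting modules listed in pythonFileUris. The file and bucket names here are placeholders.

```python
# PySpark driver plus a helper module shipped alongside it.
py_job = {
    "mainPythonFileUri": "gs://my-bucket/jobs/etl.py",  # must be a .py file
    "pythonFileUris": ["gs://my-bucket/jobs/helpers.py"],
    "args": ["--table", "events"],
}
assert py_job["mainPythonFileUri"].endswith(".py")
```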

jobs[].pysparkJob.jarFileUris[] STRING

jobs[].pysparkJob.loggingConfig OBJECT

The runtime logging config of the job

jobs[].pysparkJob.loggingConfig.driverLogLevels OBJECT

The per-package log levels for the driver. This may include "root" package name to configure rootLogger. Examples: 'com.google = FATAL', 'root = INFO', 'org.apache = DEBUG'

jobs[].pysparkJob.loggingConfig.driverLogLevels.customKey.value ENUMERATION Required

The per-package log levels for the driver. This may include "root" package name to configure rootLogger. Examples: 'com.google = FATAL', 'root = INFO', 'org.apache = DEBUG'

jobs[].pysparkJob.properties OBJECT

Optional. A mapping of property names to values, used to configure PySpark. Properties that conflict with values set by the Cloud Dataproc API may be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code

jobs[].pysparkJob.properties.customKey.value STRING Required

Optional. A mapping of property names to values, used to configure PySpark. Properties that conflict with values set by the Cloud Dataproc API may be overwritten. Can include properties set in /etc/spark/conf/spark-defaults.conf and classes in user code

jobs[].pysparkJob.args[] STRING

jobs[].pysparkJob.fileUris[] STRING

jobs[].pysparkJob.pythonFileUris[] STRING

jobs[].pysparkJob.mainPythonFileUri STRING

Required. The HCFS URI of the main Python file to use as the driver. Must be a .py file

jobs[].pysparkJob.archiveUris[] STRING

jobs[].scheduling OBJECT

Job scheduling options

jobs[].scheduling.maxFailuresPerHour INTEGER

Optional. Maximum number of times per hour a driver may be restarted as a result of the driver terminating with a non-zero code, before the job is reported failed. A job may be reported as thrashing if the driver exits with a non-zero code 4 times within a 10-minute window. Maximum value is 10

jobs[].pigJob OBJECT

A Cloud Dataproc job for running Apache Pig (https://pig.apache.org/) queries on YARN

jobs[].pigJob.loggingConfig OBJECT

The runtime logging config of the job

jobs[].pigJob.loggingConfig.driverLogLevels OBJECT

The per-package log levels for the driver. This may include "root" package name to configure rootLogger. Examples: 'com.google = FATAL', 'root = INFO', 'org.apache = DEBUG'

jobs[].pigJob.loggingConfig.driverLogLevels.customKey.value ENUMERATION Required

The per-package log levels for the driver. This may include "root" package name to configure rootLogger. Examples: 'com.google = FATAL', 'root = INFO', 'org.apache = DEBUG'

jobs[].pigJob.properties OBJECT

Optional. A mapping of property names to values, used to configure Pig. Properties that conflict with values set by the Cloud Dataproc API may be overwritten. Can include properties set in /etc/hadoop/conf/*-site.xml, /etc/pig/conf/pig.properties, and classes in user code

jobs[].pigJob.properties.customKey.value STRING Required

Optional. A mapping of property names to values, used to configure Pig. Properties that conflict with values set by the Cloud Dataproc API may be overwritten. Can include properties set in /etc/hadoop/conf/*-site.xml, /etc/pig/conf/pig.properties, and classes in user code

jobs[].pigJob.continueOnFailure BOOLEAN

Optional. Whether to continue executing queries if a query fails. The default value is false. Setting to true can be useful when executing independent parallel queries

jobs[].pigJob.queryFileUri STRING

The HCFS URI of the script that contains the Pig queries

jobs[].pigJob.queryList OBJECT

A list of queries to run on a cluster

jobs[].pigJob.queryList.queries[] STRING

jobs[].pigJob.jarFileUris[] STRING

jobs[].pigJob.scriptVariables OBJECT

Optional. Mapping of query variable names to values (equivalent to the Pig command: name=[value])

jobs[].pigJob.scriptVariables.customKey.value STRING Required

Optional. Mapping of query variable names to values (equivalent to the Pig command: name=[value])

jobs[].stepId STRING

Required. The step id. The id must be unique among all jobs within the template. The step id is used as a prefix for the job id, as the job goog-dataproc-workflow-step-id label, and in the prerequisiteStepIds field of other steps. The id must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), and hyphens (-). Cannot begin or end with an underscore or hyphen. Must consist of between 3 and 50 characters
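Step ids and prerequisiteStepIds together define the workflow's dependency graph. The two-step sketch below (step ids, class, and jar names are placeholders) runs "transform" only after "ingest" succeeds:

```python
# A two-step sequence: "transform" waits on "ingest" via
# prerequisiteStepIds; steps with no prerequisites start immediately.
jobs = [
    {
        "stepId": "ingest",
        "hadoopJob": {"mainJarFileUri": "gs://my-bucket/ingest.jar"},
    },
    {
        "stepId": "transform",
        "prerequisiteStepIds": ["ingest"],
        "sparkJob": {
            "mainClass": "com.example.Transform",
            "jarFileUris": ["gs://my-bucket/transform.jar"],
        },
    },
]
```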

createTime ANY

Output only. The time the template was created

labels OBJECT

Optional. The labels to associate with this template. These labels will be propagated to all jobs and clusters created by the workflow instance. Label keys must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). Label values may be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be associated with a template

labels.customKey.value STRING Required

Optional. The labels to associate with this template. These labels will be propagated to all jobs and clusters created by the workflow instance. Label keys must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). Label values may be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be associated with a template

Output

This building block provides 11 output parameters


done BOOLEAN

If the value is false, it means the operation is still in progress. If true, the operation is completed, and either error or response is available
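A client polls the Operation until done is true, then branches on error versus response. The sketch below uses hand-built stand-in dicts rather than real API responses; the operation name and metadata type URL are illustrative.

```python
# Resolve a long-running Operation dict: None while running,
# the response on success, an exception on error.
def resolve(operation: dict):
    if not operation.get("done"):
        return None  # still running; poll operations.get again later
    if "error" in operation:
        raise RuntimeError(operation["error"]["message"])
    return operation.get("response", {})  # Empty on success for this method

running = {"name": "operations/abc123", "done": False,
           "metadata": {"@type": "type.googleapis.com/google.cloud.dataproc.v1.WorkflowMetadata"}}
finished = {"name": "operations/abc123", "done": True, "response": {}}
failed = {"name": "operations/abc123", "done": True,
          "error": {"code": 3, "message": "invalid template"}}
```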

response OBJECT

The normal response of the operation in case of success. If the original method returns no data on success, such as Delete, the response is google.protobuf.Empty. If the original method is standard Get/Create/Update, the response should be the resource. For other methods, the response should have the type XxxResponse, where Xxx is the original method name. For example, if the original method name is TakeSnapshot(), the inferred response type is TakeSnapshotResponse

response.customKey.value ANY

The normal response of the operation in case of success. If the original method returns no data on success, such as Delete, the response is google.protobuf.Empty. If the original method is standard Get/Create/Update, the response should be the resource. For other methods, the response should have the type XxxResponse, where Xxx is the original method name. For example, if the original method name is TakeSnapshot(), the inferred response type is TakeSnapshotResponse

name STRING

The server-assigned name, which is only unique within the same service that originally returns it. If you use the default HTTP mapping, the name should be a resource name ending with operations/{unique_id}

error OBJECT

The Status type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by gRPC (https://github.com/grpc). Each Status message contains three pieces of data: error code, error message, and error details.You can find out more about this error model and how to work with it in the API Design Guide (https://cloud.google.com/apis/design/errors)

error.message STRING

A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client

error.details[] OBJECT

error.details[].customKey.value ANY

error.code INTEGER

The status code, which should be an enum value of google.rpc.Code

metadata OBJECT

Service-specific metadata associated with the operation. It typically contains progress information and common metadata such as create time. Some services might not provide such metadata. Any method that returns a long-running operation should document the metadata type, if any

metadata.customKey.value ANY

Service-specific metadata associated with the operation. It typically contains progress information and common metadata such as create time. Some services might not provide such metadata. Any method that returns a long-running operation should document the metadata type, if any