Patch

Updates a cluster in a project

Authorization

To use this building block you will have to grant access to at least one of the following scopes:

  • View and manage your data across Google Cloud Platform services

Input

This building block consumes 112 input parameters

projectId STRING Required

Required. The ID of the Google Cloud Platform project the cluster belongs to

region STRING Required

Required. The Cloud Dataproc region in which to handle the request

clusterName STRING Required

Required. The cluster name

updateMask ANY

Required. Specifies the path, relative to Cluster, of the field to update. For example, to change the number of workers in a cluster to 5, the update_mask parameter would be specified as config.worker_config.num_instances, and the PATCH request body would specify the new value, as follows:

  { "config": { "workerConfig": { "numInstances": "5" } } }

Similarly, to change the number of preemptible workers in a cluster to 5, the update_mask parameter would be config.secondary_worker_config.num_instances, and the PATCH request body would be set as follows:

  { "config": { "secondaryWorkerConfig": { "numInstances": "5" } } }

Note: Currently, only the following fields can be updated:

Mask                                            Purpose
labels                                          Update labels
config.worker_config.num_instances              Resize primary worker group
config.secondary_worker_config.num_instances    Resize secondary worker group
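
To illustrate how the mask and the request body fit together, the Python sketch below assembles the URL and JSON body for a primary worker resize without sending anything. The endpoint layout under https://dataproc.googleapis.com/v1 and the helper name build_patch_request are assumptions that do not come from this reference, and authentication is left out entirely.

```python
import json
from urllib.parse import quote, urlencode

# Assumed REST endpoint layout; this reference does not state it.
BASE_URL = "https://dataproc.googleapis.com/v1"


def build_patch_request(project_id, region, cluster_name, num_workers):
    """Return (url, body) for resizing the primary worker group.

    Hypothetical helper: it only builds strings and performs no network call.
    """
    path = "/projects/{}/regions/{}/clusters/{}".format(
        quote(project_id, safe=""),
        quote(region, safe=""),
        quote(cluster_name, safe=""),
    )
    # The mask names the single field that the body is allowed to change.
    query = urlencode({"updateMask": "config.worker_config.num_instances"})
    body = json.dumps({"config": {"workerConfig": {"numInstances": num_workers}}})
    return BASE_URL + path + "?" + query, body


# Placeholder project, region, and cluster names.
url, body = build_patch_request("my-project", "us-east1", "my-cluster", 5)
print(url)
print(body)
```

The body carries only the field named by the mask. Resizing the secondary worker group would follow the same shape with config.secondary_worker_config.num_instances and secondaryWorkerConfig.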

gracefulDecommissionTimeout ANY

Optional. Timeout for graceful YARN decommissioning. Graceful decommissioning allows removing nodes from the cluster without interrupting jobs in progress. Timeout specifies how long to wait for jobs in progress to finish before forcefully removing nodes (and potentially interrupting jobs). Default timeout is 0 (for forceful decommission), and the maximum allowed timeout is 1 day. Only supported on Dataproc image versions 1.2 and higher
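
The limits above can be checked before a request is sent. The sketch below is a minimal example that assumes the timeout is serialized as a count of seconds followed by "s" (the usual JSON form of a duration); this reference only types the parameter as ANY, so treat that encoding and the helper name graceful_timeout as assumptions.

```python
MAX_TIMEOUT_SECONDS = 24 * 60 * 60  # the documented maximum of 1 day


def graceful_timeout(seconds):
    """Format a timeout as a duration string such as "3600s".

    Hypothetical helper; 0 means forceful decommissioning.
    """
    if not 0 <= seconds <= MAX_TIMEOUT_SECONDS:
        raise ValueError("timeout must be between 0 and 86400 seconds")
    return "{}s".format(int(seconds))


print(graceful_timeout(3600))
```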

requestId STRING

Optional. A unique id used to identify the request. If the server receives two UpdateClusterRequest requests with the same id, then the second request will be ignored and the first google.longrunning.Operation created and stored in the backend is returned. It is recommended to always set this value to a UUID (https://en.wikipedia.org/wiki/Universally_unique_identifier). The id must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), and hyphens (-). The maximum length is 40 characters
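
A request id that satisfies these rules can be generated and checked with the Python standard library. This is a small sketch: the helper names are hypothetical, and the minimum length of 1 is an assumption, since only the maximum is documented.

```python
import re
import uuid

# Letters, digits, underscores, and hyphens; at most 40 characters.
REQUEST_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,40}")


def new_request_id():
    """Return a random UUID string (36 characters, so within the limit)."""
    return str(uuid.uuid4())


def is_valid_request_id(request_id):
    """Check a candidate id against the documented character set and length."""
    return REQUEST_ID_PATTERN.fullmatch(request_id) is not None


request_id = new_request_id()
print(request_id, is_valid_request_id(request_id))
```

Reusing the same id when retrying a timed-out call is what lets the server return the original operation instead of starting a second update.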

labels OBJECT

Optional. The labels to associate with this cluster. Label keys must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). Label values may be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be associated with a cluster

labels.customKey.value STRING Required

Optional. The labels to associate with this cluster. Label keys must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). Label values may be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be associated with a cluster
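
The label constraints can be checked client-side before a request is built. The sketch below assumes that "conform to RFC 1035" means a lowercase label that starts with a letter, ends with a letter or digit, and otherwise contains letters, digits, or hyphens. That regular expression is one interpretation rather than something this reference spells out, and label_problems is a hypothetical helper.

```python
import re

MAX_LABELS = 32
MAX_LENGTH = 63
# Assumed interpretation of an RFC 1035 label (lowercase only).
RFC1035_LABEL = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")


def label_problems(labels):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if len(labels) > MAX_LABELS:
        problems.append("more than 32 labels")
    for key, value in labels.items():
        if not (1 <= len(key) <= MAX_LENGTH and RFC1035_LABEL.fullmatch(key)):
            problems.append("bad key: " + repr(key))
        # Values may be empty; non-empty values follow the same rules as keys.
        if value and not (len(value) <= MAX_LENGTH and RFC1035_LABEL.fullmatch(value)):
            problems.append("bad value for " + repr(key))
    return problems


print(label_problems({"env": "prod", "team": ""}))
print(label_problems({"Env": "prod"}))
```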

metrics OBJECT

Contains cluster daemon metrics, such as HDFS and YARN stats. Beta Feature: This report is available for testing purposes only. It may be changed before final release

metrics.hdfsMetrics OBJECT

The HDFS metrics

metrics.hdfsMetrics.customKey.value INTEGER Required

The HDFS metrics

metrics.yarnMetrics OBJECT

The YARN metrics

metrics.yarnMetrics.customKey.value INTEGER Required

The YARN metrics

status OBJECT

The status of a cluster and its instances

status.detail STRING

Output only. Optional details of cluster's state

status.state ENUMERATION

Output only. The cluster's state

status.stateStartTime ANY

Output only. Time when this state was entered

status.substate ENUMERATION

Output only. Additional state information that includes status reported by the agent

statusHistory[] OBJECT

The status of a cluster and its instances

statusHistory[].detail STRING

Output only. Optional details of cluster's state

statusHistory[].state ENUMERATION

Output only. The cluster's state

statusHistory[].stateStartTime ANY

Output only. Time when this state was entered

statusHistory[].substate ENUMERATION

Output only. Additional state information that includes status reported by the agent

config OBJECT

The cluster config

config.workerConfig OBJECT

Optional. The config settings for Compute Engine resources in an instance group, such as a master or worker group

config.workerConfig.instanceNames[] STRING

config.workerConfig.accelerators[] OBJECT

Specifies the type and number of accelerator cards attached to the instances of an instance group. See GPUs on Compute Engine

config.workerConfig.accelerators[].acceleratorCount INTEGER

The number of the accelerator cards of this type exposed to this instance

config.workerConfig.accelerators[].acceleratorTypeUri STRING

Full URL, partial URI, or short name of the accelerator type resource to expose to this instance. See Compute Engine AcceleratorTypes. Examples:

  • https://www.googleapis.com/compute/beta/projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80
  • projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80
  • nvidia-tesla-k80

Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the accelerator type resource, for example, nvidia-tesla-k80

config.workerConfig.numInstances INTEGER

Optional. The number of VM instances in the instance group. For master instance groups, must be set to 1

config.workerConfig.diskConfig OBJECT

Specifies the config of disk options for a group of VM instances

config.workerConfig.diskConfig.bootDiskType STRING

Optional. Type of the boot disk (default is "pd-standard"). Valid values: "pd-ssd" (Persistent Disk Solid State Drive) or "pd-standard" (Persistent Disk Hard Disk Drive)

config.workerConfig.diskConfig.numLocalSsds INTEGER

Optional. Number of attached SSDs, from 0 to 4 (default is 0). If SSDs are not attached, the boot disk is used to store runtime logs and HDFS (https://hadoop.apache.org/docs/r1.2.1/hdfs_user_guide.html) data. If one or more SSDs are attached, this runtime bulk data is spread across them, and the boot disk contains only basic config and installed binaries

config.workerConfig.diskConfig.bootDiskSizeGb INTEGER

Optional. Size in GB of the boot disk (default is 500GB)

config.workerConfig.managedGroupConfig OBJECT

Specifies the resources used to actively manage an instance group

config.workerConfig.managedGroupConfig.instanceGroupManagerName STRING

Output only. The name of the Instance Group Manager for this group

config.workerConfig.managedGroupConfig.instanceTemplateName STRING

Output only. The name of the Instance Template used for the Managed Instance Group

config.workerConfig.isPreemptible BOOLEAN

Optional. Specifies that this instance group contains preemptible instances

config.workerConfig.imageUri STRING

Optional. The Compute Engine image resource used for cluster instances. It can be specified or may be inferred from SoftwareConfig.image_version

config.workerConfig.machineTypeUri STRING

Optional. The Compute Engine machine type used for cluster instances. A full URL, partial URI, or short name are valid. Examples:

  • https://www.googleapis.com/compute/v1/projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2
  • projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2
  • n1-standard-2

Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the machine type resource, for example, n1-standard-2
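
Because Auto Zone Placement accepts only the short name, it can be handy to reduce any of the three accepted forms to that name. The helper below is a hypothetical sketch that keeps the last path segment. It relies on the short name always being the final segment, as it is in the examples in this description.

```python
def short_name(resource):
    """Return the last path segment of a full URL, partial URI, or short name."""
    return resource.rstrip("/").rsplit("/", 1)[-1]


full_url = (
    "https://www.googleapis.com/compute/v1/projects/my-project"
    "/zones/us-east1-a/machineTypes/n1-standard-2"
)
print(short_name(full_url))
print(short_name("n1-standard-2"))
```

The same reduction applies to accelerator type resources, which use the same three forms.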

config.gceClusterConfig OBJECT

Common config settings for resources of Compute Engine cluster instances, applicable to all instances in the cluster

config.gceClusterConfig.tags[] STRING

config.gceClusterConfig.serviceAccount STRING

Optional. The service account of the instances. Defaults to the default Compute Engine service account. Custom service accounts need permissions equivalent to the following IAM roles:

  • roles/logging.logWriter
  • roles/storage.objectAdmin

(see https://cloud.google.com/compute/docs/access/service-accounts#custom_service_accounts for more information). Example: [account_id]@[project_id].iam.gserviceaccount.com

config.gceClusterConfig.subnetworkUri STRING

Optional. The Compute Engine subnetwork to be used for machine communications. Cannot be specified with network_uri. A full URL, partial URI, or short name are valid. Examples:

  • https://www.googleapis.com/compute/v1/projects/[project_id]/regions/us-east1/subnetworks/sub0
  • projects/[project_id]/regions/us-east1/subnetworks/sub0
  • sub0

config.gceClusterConfig.networkUri STRING

Optional. The Compute Engine network to be used for machine communications. Cannot be specified with subnetwork_uri. If neither network_uri nor subnetwork_uri is specified, the "default" network of the project is used, if it exists. Cannot be a "Custom Subnet Network" (see Using Subnetworks for more information). A full URL, partial URI, or short name are valid. Examples:

  • https://www.googleapis.com/compute/v1/projects/[project_id]/regions/global/default
  • projects/[project_id]/regions/global/default
  • default

config.gceClusterConfig.zoneUri STRING

Optional. The zone where the Compute Engine cluster will be located. On a create request, it is required in the "global" region. If omitted in a non-global Cloud Dataproc region, the service will pick a zone in the corresponding Compute Engine region. On a get request, zone will always be present. A full URL, partial URI, or short name are valid. Examples:

  • https://www.googleapis.com/compute/v1/projects/[project_id]/zones/[zone]
  • projects/[project_id]/zones/[zone]
  • us-central1-f

config.gceClusterConfig.internalIpOnly BOOLEAN

Optional. If true, all instances in the cluster will only have internal IP addresses. By default, clusters are not restricted to internal IP addresses, and will have ephemeral external IP addresses assigned to each instance. This internal_ip_only restriction can only be enabled for subnetwork enabled networks, and all off-cluster dependencies must be configured to be accessible without external IP addresses

config.gceClusterConfig.metadata OBJECT

The Compute Engine metadata entries to add to all instances (see Project and instance metadata (https://cloud.google.com/compute/docs/storing-retrieving-metadata#project_and_instance_metadata))

config.gceClusterConfig.metadata.customKey.value STRING Required

The Compute Engine metadata entries to add to all instances (see Project and instance metadata (https://cloud.google.com/compute/docs/storing-retrieving-metadata#project_and_instance_metadata))

config.gceClusterConfig.serviceAccountScopes[] STRING

config.softwareConfig OBJECT

Specifies the selection and config of software inside the cluster

config.softwareConfig.imageVersion STRING

Optional. The version of software inside the cluster. It must be one of the supported Cloud Dataproc Versions, such as "1.2" (including a subminor version, such as "1.2.29"), or the "preview" version. If unspecified, it defaults to the latest Debian version

config.softwareConfig.properties OBJECT

Optional. The properties to set on daemon config files. Property keys are specified in prefix:property format, for example core:hadoop.tmp.dir. The following are supported prefixes and their mappings:

  • capacity-scheduler: capacity-scheduler.xml
  • core: core-site.xml
  • distcp: distcp-default.xml
  • hdfs: hdfs-site.xml
  • hive: hive-site.xml
  • mapred: mapred-site.xml
  • pig: pig.properties
  • spark: spark-defaults.conf
  • yarn: yarn-site.xml

For more information, see Cluster properties

config.softwareConfig.properties.customKey.value STRING Required

Optional. The properties to set on daemon config files. Property keys are specified in prefix:property format, for example core:hadoop.tmp.dir. The following are supported prefixes and their mappings:

  • capacity-scheduler: capacity-scheduler.xml
  • core: core-site.xml
  • distcp: distcp-default.xml
  • hdfs: hdfs-site.xml
  • hive: hive-site.xml
  • mapred: mapred-site.xml
  • pig: pig.properties
  • spark: spark-defaults.conf
  • yarn: yarn-site.xml

For more information, see Cluster properties
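
The prefix:property convention can be made concrete with a small lookup. The sketch below splits a key at the first colon and resolves the prefix to its config file using the mapping listed in this description; the function name target_file is hypothetical.

```python
# Prefix-to-file mapping as listed in the properties description.
PREFIX_TO_FILE = {
    "capacity-scheduler": "capacity-scheduler.xml",
    "core": "core-site.xml",
    "distcp": "distcp-default.xml",
    "hdfs": "hdfs-site.xml",
    "hive": "hive-site.xml",
    "mapred": "mapred-site.xml",
    "pig": "pig.properties",
    "spark": "spark-defaults.conf",
    "yarn": "yarn-site.xml",
}


def target_file(property_key):
    """Split "prefix:property" and return (config_file, property_name)."""
    prefix, sep, name = property_key.partition(":")
    if not sep or not name or prefix not in PREFIX_TO_FILE:
        raise ValueError("unsupported property key: " + repr(property_key))
    return PREFIX_TO_FILE[prefix], name


print(target_file("core:hadoop.tmp.dir"))
```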

config.softwareConfig.optionalComponents[] ENUMERATION

config.masterConfig OBJECT

Optional. The config settings for Compute Engine resources in an instance group, such as a master or worker group

config.masterConfig.instanceNames[] STRING

config.masterConfig.accelerators[] OBJECT

Specifies the type and number of accelerator cards attached to the instances of an instance group. See GPUs on Compute Engine

config.masterConfig.accelerators[].acceleratorCount INTEGER

The number of the accelerator cards of this type exposed to this instance

config.masterConfig.accelerators[].acceleratorTypeUri STRING

Full URL, partial URI, or short name of the accelerator type resource to expose to this instance. See Compute Engine AcceleratorTypes. Examples:

  • https://www.googleapis.com/compute/beta/projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80
  • projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80
  • nvidia-tesla-k80

Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the accelerator type resource, for example, nvidia-tesla-k80

config.masterConfig.numInstances INTEGER

Optional. The number of VM instances in the instance group. For master instance groups, must be set to 1

config.masterConfig.diskConfig OBJECT

Specifies the config of disk options for a group of VM instances

config.masterConfig.diskConfig.bootDiskType STRING

Optional. Type of the boot disk (default is "pd-standard"). Valid values: "pd-ssd" (Persistent Disk Solid State Drive) or "pd-standard" (Persistent Disk Hard Disk Drive)

config.masterConfig.diskConfig.numLocalSsds INTEGER

Optional. Number of attached SSDs, from 0 to 4 (default is 0). If SSDs are not attached, the boot disk is used to store runtime logs and HDFS (https://hadoop.apache.org/docs/r1.2.1/hdfs_user_guide.html) data. If one or more SSDs are attached, this runtime bulk data is spread across them, and the boot disk contains only basic config and installed binaries

config.masterConfig.diskConfig.bootDiskSizeGb INTEGER

Optional. Size in GB of the boot disk (default is 500GB)

config.masterConfig.managedGroupConfig OBJECT

Specifies the resources used to actively manage an instance group

config.masterConfig.managedGroupConfig.instanceGroupManagerName STRING

Output only. The name of the Instance Group Manager for this group

config.masterConfig.managedGroupConfig.instanceTemplateName STRING

Output only. The name of the Instance Template used for the Managed Instance Group

config.masterConfig.isPreemptible BOOLEAN

Optional. Specifies that this instance group contains preemptible instances

config.masterConfig.imageUri STRING

Optional. The Compute Engine image resource used for cluster instances. It can be specified or may be inferred from SoftwareConfig.image_version

config.masterConfig.machineTypeUri STRING

Optional. The Compute Engine machine type used for cluster instances. A full URL, partial URI, or short name are valid. Examples:

  • https://www.googleapis.com/compute/v1/projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2
  • projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2
  • n1-standard-2

Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the machine type resource, for example, n1-standard-2

config.secondaryWorkerConfig OBJECT

Optional. The config settings for Compute Engine resources in an instance group, such as a master or worker group

config.secondaryWorkerConfig.instanceNames[] STRING

config.secondaryWorkerConfig.accelerators[] OBJECT

Specifies the type and number of accelerator cards attached to the instances of an instance group. See GPUs on Compute Engine

config.secondaryWorkerConfig.accelerators[].acceleratorCount INTEGER

The number of the accelerator cards of this type exposed to this instance

config.secondaryWorkerConfig.accelerators[].acceleratorTypeUri STRING

Full URL, partial URI, or short name of the accelerator type resource to expose to this instance. See Compute Engine AcceleratorTypes. Examples:

  • https://www.googleapis.com/compute/beta/projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80
  • projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80
  • nvidia-tesla-k80

Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the accelerator type resource, for example, nvidia-tesla-k80

config.secondaryWorkerConfig.numInstances INTEGER

Optional. The number of VM instances in the instance group. For master instance groups, must be set to 1

config.secondaryWorkerConfig.diskConfig OBJECT

Specifies the config of disk options for a group of VM instances

config.secondaryWorkerConfig.diskConfig.bootDiskType STRING

Optional. Type of the boot disk (default is "pd-standard"). Valid values: "pd-ssd" (Persistent Disk Solid State Drive) or "pd-standard" (Persistent Disk Hard Disk Drive)

config.secondaryWorkerConfig.diskConfig.numLocalSsds INTEGER

Optional. Number of attached SSDs, from 0 to 4 (default is 0). If SSDs are not attached, the boot disk is used to store runtime logs and HDFS (https://hadoop.apache.org/docs/r1.2.1/hdfs_user_guide.html) data. If one or more SSDs are attached, this runtime bulk data is spread across them, and the boot disk contains only basic config and installed binaries

config.secondaryWorkerConfig.diskConfig.bootDiskSizeGb INTEGER

Optional. Size in GB of the boot disk (default is 500GB)

config.secondaryWorkerConfig.managedGroupConfig OBJECT

Specifies the resources used to actively manage an instance group

config.secondaryWorkerConfig.managedGroupConfig.instanceGroupManagerName STRING

Output only. The name of the Instance Group Manager for this group

config.secondaryWorkerConfig.managedGroupConfig.instanceTemplateName STRING

Output only. The name of the Instance Template used for the Managed Instance Group

config.secondaryWorkerConfig.isPreemptible BOOLEAN

Optional. Specifies that this instance group contains preemptible instances

config.secondaryWorkerConfig.imageUri STRING

Optional. The Compute Engine image resource used for cluster instances. It can be specified or may be inferred from SoftwareConfig.image_version

config.secondaryWorkerConfig.machineTypeUri STRING

Optional. The Compute Engine machine type used for cluster instances. A full URL, partial URI, or short name are valid. Examples:

  • https://www.googleapis.com/compute/v1/projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2
  • projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2
  • n1-standard-2

Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the machine type resource, for example, n1-standard-2

config.encryptionConfig OBJECT

Encryption settings for the cluster

config.encryptionConfig.gcePdKmsKeyName STRING

Optional. The Cloud KMS key name to use for PD disk encryption for all instances in the cluster

config.securityConfig OBJECT

Security related configuration, including Kerberos

config.securityConfig.kerberosConfig OBJECT

Specifies Kerberos related configuration

config.securityConfig.kerberosConfig.keystoreUri STRING

Optional. The Cloud Storage URI of the keystore file used for SSL encryption. If not provided, Dataproc will provide a self-signed certificate

config.securityConfig.kerberosConfig.keyPasswordUri STRING

Optional. The Cloud Storage URI of a KMS encrypted file containing the password to the user provided key. For the self-signed certificate, this password is generated by Dataproc

config.securityConfig.kerberosConfig.keystorePasswordUri STRING

Optional. The Cloud Storage URI of a KMS encrypted file containing the password to the user provided keystore. For the self-signed certificate, this password is generated by Dataproc

config.securityConfig.kerberosConfig.crossRealmTrustAdminServer STRING

Optional. The admin server (IP or hostname) for the remote trusted realm in a cross realm trust relationship

config.securityConfig.kerberosConfig.kdcDbKeyUri STRING

Optional. The Cloud Storage URI of a KMS encrypted file containing the master key of the KDC database

config.securityConfig.kerberosConfig.truststorePasswordUri STRING

Optional. The Cloud Storage URI of a KMS encrypted file containing the password to the user provided truststore. For the self-signed certificate, this password is generated by Dataproc

config.securityConfig.kerberosConfig.enableKerberos BOOLEAN

Optional. Flag to indicate whether to Kerberize the cluster

config.securityConfig.kerberosConfig.truststoreUri STRING

Optional. The Cloud Storage URI of the truststore file used for SSL encryption. If not provided, Dataproc will provide a self-signed certificate

config.securityConfig.kerberosConfig.crossRealmTrustRealm STRING

Optional. The remote realm the Dataproc on-cluster KDC will trust, should the user enable cross realm trust

config.securityConfig.kerberosConfig.rootPrincipalPasswordUri STRING

Required. The Cloud Storage URI of a KMS encrypted file containing the root principal password

config.securityConfig.kerberosConfig.kmsKeyUri STRING

Required. The URI of the KMS key used to encrypt various sensitive files

config.securityConfig.kerberosConfig.crossRealmTrustKdc STRING

Optional. The KDC (IP or hostname) for the remote trusted realm in a cross realm trust relationship

config.securityConfig.kerberosConfig.crossRealmTrustSharedPasswordUri STRING

Optional. The Cloud Storage URI of a KMS encrypted file containing the shared password between the on-cluster Kerberos realm and the remote trusted realm, in a cross realm trust relationship

config.securityConfig.kerberosConfig.tgtLifetimeHours INTEGER

Optional. The lifetime of the ticket granting ticket, in hours. If not specified, or if set to 0, the default value of 10 is used

config.initializationActions[] OBJECT

Specifies an executable to run on a fully configured node and a timeout period for executable completion

config.initializationActions[].executableFile STRING

Required. Cloud Storage URI of executable file

config.initializationActions[].executionTimeout ANY

Optional. Amount of time the executable has to complete. Default is 10 minutes. Cluster creation fails with an explanatory error message (the name of the executable that caused the error and the exceeded timeout period) if the executable has not completed by the end of the timeout period

config.configBucket STRING

Optional. A Google Cloud Storage bucket used to stage job dependencies, config files, and job driver console output. If you do not specify a staging bucket, Cloud Dataproc will determine a Cloud Storage location (US, ASIA, or EU) for your cluster's staging bucket according to the Google Compute Engine zone where your cluster is deployed, and then create and manage this project-level, per-location bucket (see Cloud Dataproc staging bucket)

clusterName STRING

Required. The cluster name. Cluster names within a project must be unique. Names of deleted clusters can be reused

clusterUuid STRING

Output only. A cluster UUID (Universally Unique Identifier). Cloud Dataproc generates this value when it creates the cluster

projectId STRING

Required. The Google Cloud Platform project ID that the cluster belongs to

Output

This building block provides 11 output parameters

done BOOLEAN

If the value is false, it means the operation is still in progress. If true, the operation is completed, and either error or response is available

response OBJECT

The normal response of the operation in case of success. If the original method returns no data on success, such as Delete, the response is google.protobuf.Empty. If the original method is standard Get/Create/Update, the response should be the resource. For other methods, the response should have the type XxxResponse, where Xxx is the original method name. For example, if the original method name is TakeSnapshot(), the inferred response type is TakeSnapshotResponse

response.customKey.value ANY

The normal response of the operation in case of success. If the original method returns no data on success, such as Delete, the response is google.protobuf.Empty. If the original method is standard Get/Create/Update, the response should be the resource. For other methods, the response should have the type XxxResponse, where Xxx is the original method name. For example, if the original method name is TakeSnapshot(), the inferred response type is TakeSnapshotResponse

name STRING

The server-assigned name, which is only unique within the same service that originally returns it. If you use the default HTTP mapping, the name should be a resource name ending with operations/{unique_id}

error OBJECT

The Status type defines a logical error model that is suitable for different programming environments, including REST APIs and RPC APIs. It is used by gRPC (https://github.com/grpc). Each Status message contains three pieces of data: error code, error message, and error details. You can find out more about this error model and how to work with it in the API Design Guide (https://cloud.google.com/apis/design/errors)

error.message STRING

A developer-facing error message, which should be in English. Any user-facing error message should be localized and sent in the google.rpc.Status.details field, or localized by the client

error.details[] OBJECT

error.details[].customKey.value ANY

error.code INTEGER

The status code, which should be an enum value of google.rpc.Code

metadata OBJECT

Service-specific metadata associated with the operation. It typically contains progress information and common metadata such as create time. Some services might not provide such metadata. Any method that returns a long-running operation should document the metadata type, if any

metadata.customKey.value ANY

Service-specific metadata associated with the operation. It typically contains progress information and common metadata such as create time. Some services might not provide such metadata. Any method that returns a long-running operation should document the metadata type, if any
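
The output parameters above describe a long-running operation, so a caller typically branches on done first and then on error versus response. The sketch below shows that branching on a plain dictionary shaped like the documented output. The function name summarize_operation and the sample values are hypothetical, and no polling or network access is shown.

```python
def summarize_operation(op):
    """Describe an operation dict using the done, error, and response fields."""
    if not op.get("done", False):
        # Still in progress: neither error nor response is available yet.
        return "pending: " + op.get("name", "")
    if "error" in op:
        err = op["error"]
        return "failed ({}): {}".format(err.get("code"), err.get("message"))
    return "succeeded"


# Made-up sample values for illustration only.
print(summarize_operation({"name": "operations/abc", "done": False}))
print(summarize_operation({"done": True, "error": {"code": 3, "message": "bad mask"}}))
print(summarize_operation({"done": True, "response": {}}))
```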