Get

Gets the resource representation for a cluster in a project

Authorization

To use this building block you will have to grant access to at least one of the following scopes:

View and manage your data across Google Cloud Platform services

Input

This building block consumes 3 input parameters

Name	Format	Description
`projectId` Required	`STRING`	Required. The ID of the Google Cloud Platform project that the cluster belongs to
`region` Required	`STRING`	Required. The Cloud Dataproc region in which to handle the request
`clusterName` Required	`STRING`	Required. The cluster name

projectId STRING Required

Required. The ID of the Google Cloud Platform project that the cluster belongs to

region STRING Required

Required. The Cloud Dataproc region in which to handle the request

clusterName STRING Required

Required. The cluster name
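The three inputs above identify exactly one cluster. As a minimal sketch, assuming this building block wraps the Cloud Dataproc v1 REST endpoint (the base URL and the helper name `build_get_cluster_url` are assumptions, not taken from this page), the request URL could be assembled like this:

```python
from urllib.parse import quote

# Assumed base URL of the Cloud Dataproc v1 REST API; this page does not state it.
DATAPROC_BASE_URL = "https://dataproc.googleapis.com/v1"


def build_get_cluster_url(project_id: str, region: str, cluster_name: str) -> str:
    """Return the URL for fetching one cluster from the three required inputs."""
    params = {"projectId": project_id, "region": region, "clusterName": cluster_name}
    for name, value in params.items():
        if not isinstance(value, str) or not value:
            raise ValueError(f"{name} is required and must be a non-empty string")
    # Percent-encode each path segment so unexpected characters cannot alter the path.
    return (
        f"{DATAPROC_BASE_URL}/projects/{quote(project_id, safe='')}"
        f"/regions/{quote(region, safe='')}"
        f"/clusters/{quote(cluster_name, safe='')}"
    )


print(build_get_cluster_url("my-project", "us-east1", "my-cluster"))
```

The request would still need an OAuth 2.0 access token for one of the scopes listed under Authorization; obtaining it is outside this sketch.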

Output

This building block provides 106 output parameters

Name	Format	Description
`labels`	`OBJECT`	Optional. The labels to associate with this cluster. Label keys must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). Label values may be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be associated with a cluster
`labels.customKey.value`	`STRING`	Optional. The labels to associate with this cluster. Label keys must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). Label values may be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be associated with a cluster
`metrics`	`OBJECT`	Contains cluster daemon metrics, such as HDFS and YARN stats. Beta Feature: This report is available for testing purposes only. It may be changed before final release
`metrics.hdfsMetrics`	`OBJECT`	The HDFS metrics
`metrics.hdfsMetrics.customKey.value`	`INTEGER`	The HDFS metrics
`metrics.yarnMetrics`	`OBJECT`	The YARN metrics
`metrics.yarnMetrics.customKey.value`	`INTEGER`	The YARN metrics
`status`	`OBJECT`	The status of a cluster and its instances
`status.detail`	`STRING`	Output only. Optional details of cluster's state
`status.state`	`ENUMERATION`	Output only. The cluster's state
`status.stateStartTime`	`ANY`	Output only. Time when this state was entered
`status.substate`	`ENUMERATION`	Output only. Additional state information that includes status reported by the agent
`statusHistory[]`	`OBJECT`	The status of a cluster and its instances
`statusHistory[].detail`	`STRING`	Output only. Optional details of cluster's state
`statusHistory[].state`	`ENUMERATION`	Output only. The cluster's state
`statusHistory[].stateStartTime`	`ANY`	Output only. Time when this state was entered
`statusHistory[].substate`	`ENUMERATION`	Output only. Additional state information that includes status reported by the agent
`config`	`OBJECT`	The cluster config
`config.workerConfig`	`OBJECT`	Optional. The config settings for Compute Engine resources in an instance group, such as a master or worker group
`config.workerConfig.instanceNames[]`	`STRING`
`config.workerConfig.accelerators[]`	`OBJECT`	Specifies the type and number of accelerator cards attached to the instances of an instance group. See GPUs on Compute Engine
`config.workerConfig.accelerators[].acceleratorCount`	`INTEGER`	The number of the accelerator cards of this type exposed to this instance
`config.workerConfig.accelerators[].acceleratorTypeUri`	`STRING`	Full URL, partial URI, or short name of the accelerator type resource to expose to this instance. See Compute Engine AcceleratorTypes. Examples: https://www.googleapis.com/compute/beta/projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80, projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80, nvidia-tesla-k80. Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the accelerator type resource, for example, nvidia-tesla-k80
`config.workerConfig.numInstances`	`INTEGER`	Optional. The number of VM instances in the instance group. For master instance groups, must be set to 1
`config.workerConfig.diskConfig`	`OBJECT`	Specifies the config of disk options for a group of VM instances
`config.workerConfig.diskConfig.bootDiskType`	`STRING`	Optional. Type of the boot disk (default is "pd-standard"). Valid values: "pd-ssd" (Persistent Disk Solid State Drive) or "pd-standard" (Persistent Disk Hard Disk Drive)
`config.workerConfig.diskConfig.numLocalSsds`	`INTEGER`	Optional. Number of attached SSDs, from 0 to 4 (default is 0). If SSDs are not attached, the boot disk is used to store runtime logs and HDFS (https://hadoop.apache.org/docs/r1.2.1/hdfs_user_guide.html) data. If one or more SSDs are attached, this runtime bulk data is spread across them, and the boot disk contains only basic config and installed binaries
`config.workerConfig.diskConfig.bootDiskSizeGb`	`INTEGER`	Optional. Size in GB of the boot disk (default is 500GB)
`config.workerConfig.managedGroupConfig`	`OBJECT`	Specifies the resources used to actively manage an instance group
`config.workerConfig.managedGroupConfig.instanceGroupManagerName`	`STRING`	Output only. The name of the Instance Group Manager for this group
`config.workerConfig.managedGroupConfig.instanceTemplateName`	`STRING`	Output only. The name of the Instance Template used for the Managed Instance Group
`config.workerConfig.isPreemptible`	`BOOLEAN`	Optional. Specifies that this instance group contains preemptible instances
`config.workerConfig.imageUri`	`STRING`	Optional. The Compute Engine image resource used for cluster instances. It can be specified or may be inferred from SoftwareConfig.image_version
`config.workerConfig.machineTypeUri`	`STRING`	Optional. The Compute Engine machine type used for cluster instances. A full URL, partial URI, or short name are valid. Examples: https://www.googleapis.com/compute/v1/projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2, projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2, n1-standard-2. Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the machine type resource, for example, n1-standard-2
`config.gceClusterConfig`	`OBJECT`	Common config settings for resources of Compute Engine cluster instances, applicable to all instances in the cluster
`config.gceClusterConfig.tags[]`	`STRING`
`config.gceClusterConfig.serviceAccount`	`STRING`	Optional. The service account of the instances. Defaults to the default Compute Engine service account. Custom service accounts need permissions equivalent to the following IAM roles: roles/logging.logWriter and roles/storage.objectAdmin (see https://cloud.google.com/compute/docs/access/service-accounts#custom_service_accounts for more information). Example: [account_id]@[project_id].iam.gserviceaccount.com
`config.gceClusterConfig.subnetworkUri`	`STRING`	Optional. The Compute Engine subnetwork to be used for machine communications. Cannot be specified with network_uri. A full URL, partial URI, or short name are valid. Examples: https://www.googleapis.com/compute/v1/projects/[project_id]/regions/us-east1/subnetworks/sub0, projects/[project_id]/regions/us-east1/subnetworks/sub0, sub0
`config.gceClusterConfig.networkUri`	`STRING`	Optional. The Compute Engine network to be used for machine communications. Cannot be specified with subnetwork_uri. If neither network_uri nor subnetwork_uri is specified, the "default" network of the project is used, if it exists. Cannot be a "Custom Subnet Network" (see Using Subnetworks for more information). A full URL, partial URI, or short name are valid. Examples: https://www.googleapis.com/compute/v1/projects/[project_id]/regions/global/default, projects/[project_id]/regions/global/default, default
`config.gceClusterConfig.zoneUri`	`STRING`	Optional. The zone where the Compute Engine cluster will be located. On a create request, it is required in the "global" region. If omitted in a non-global Cloud Dataproc region, the service will pick a zone in the corresponding Compute Engine region. On a get request, zone will always be present. A full URL, partial URI, or short name are valid. Examples: https://www.googleapis.com/compute/v1/projects/[project_id]/zones/[zone], projects/[project_id]/zones/[zone], us-central1-f
`config.gceClusterConfig.internalIpOnly`	`BOOLEAN`	Optional. If true, all instances in the cluster will only have internal IP addresses. By default, clusters are not restricted to internal IP addresses, and will have ephemeral external IP addresses assigned to each instance. This internal_ip_only restriction can only be enabled for subnetwork enabled networks, and all off-cluster dependencies must be configured to be accessible without external IP addresses
`config.gceClusterConfig.metadata`	`OBJECT`	The Compute Engine metadata entries to add to all instances (see Project and instance metadata (https://cloud.google.com/compute/docs/storing-retrieving-metadata#project_and_instance_metadata))
`config.gceClusterConfig.metadata.customKey.value`	`STRING`	The Compute Engine metadata entries to add to all instances (see Project and instance metadata (https://cloud.google.com/compute/docs/storing-retrieving-metadata#project_and_instance_metadata))
`config.gceClusterConfig.serviceAccountScopes[]`	`STRING`
`config.softwareConfig`	`OBJECT`	Specifies the selection and config of software inside the cluster
`config.softwareConfig.imageVersion`	`STRING`	Optional. The version of software inside the cluster. It must be one of the supported Cloud Dataproc Versions, such as "1.2" (including a subminor version, such as "1.2.29"), or the "preview" version. If unspecified, it defaults to the latest Debian version
`config.softwareConfig.properties`	`OBJECT`	Optional. The properties to set on daemon config files. Property keys are specified in prefix:property format, for example core:hadoop.tmp.dir. The following are supported prefixes and their mappings: capacity-scheduler: capacity-scheduler.xml; core: core-site.xml; distcp: distcp-default.xml; hdfs: hdfs-site.xml; hive: hive-site.xml; mapred: mapred-site.xml; pig: pig.properties; spark: spark-defaults.conf; yarn: yarn-site.xml. For more information, see Cluster properties
`config.softwareConfig.properties.customKey.value`	`STRING`	Optional. The properties to set on daemon config files. Property keys are specified in prefix:property format, for example core:hadoop.tmp.dir. The following are supported prefixes and their mappings: capacity-scheduler: capacity-scheduler.xml; core: core-site.xml; distcp: distcp-default.xml; hdfs: hdfs-site.xml; hive: hive-site.xml; mapred: mapred-site.xml; pig: pig.properties; spark: spark-defaults.conf; yarn: yarn-site.xml. For more information, see Cluster properties
`config.softwareConfig.optionalComponents[]`	`ENUMERATION`
`config.masterConfig`	`OBJECT`	Optional. The config settings for Compute Engine resources in an instance group, such as a master or worker group
`config.masterConfig.instanceNames[]`	`STRING`
`config.masterConfig.accelerators[]`	`OBJECT`	Specifies the type and number of accelerator cards attached to the instances of an instance group. See GPUs on Compute Engine
`config.masterConfig.accelerators[].acceleratorCount`	`INTEGER`	The number of the accelerator cards of this type exposed to this instance
`config.masterConfig.accelerators[].acceleratorTypeUri`	`STRING`	Full URL, partial URI, or short name of the accelerator type resource to expose to this instance. See Compute Engine AcceleratorTypes. Examples: https://www.googleapis.com/compute/beta/projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80, projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80, nvidia-tesla-k80. Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the accelerator type resource, for example, nvidia-tesla-k80
`config.masterConfig.numInstances`	`INTEGER`	Optional. The number of VM instances in the instance group. For master instance groups, must be set to 1
`config.masterConfig.diskConfig`	`OBJECT`	Specifies the config of disk options for a group of VM instances
`config.masterConfig.diskConfig.bootDiskType`	`STRING`	Optional. Type of the boot disk (default is "pd-standard"). Valid values: "pd-ssd" (Persistent Disk Solid State Drive) or "pd-standard" (Persistent Disk Hard Disk Drive)
`config.masterConfig.diskConfig.numLocalSsds`	`INTEGER`	Optional. Number of attached SSDs, from 0 to 4 (default is 0). If SSDs are not attached, the boot disk is used to store runtime logs and HDFS (https://hadoop.apache.org/docs/r1.2.1/hdfs_user_guide.html) data. If one or more SSDs are attached, this runtime bulk data is spread across them, and the boot disk contains only basic config and installed binaries
`config.masterConfig.diskConfig.bootDiskSizeGb`	`INTEGER`	Optional. Size in GB of the boot disk (default is 500GB)
`config.masterConfig.managedGroupConfig`	`OBJECT`	Specifies the resources used to actively manage an instance group
`config.masterConfig.managedGroupConfig.instanceGroupManagerName`	`STRING`	Output only. The name of the Instance Group Manager for this group
`config.masterConfig.managedGroupConfig.instanceTemplateName`	`STRING`	Output only. The name of the Instance Template used for the Managed Instance Group
`config.masterConfig.isPreemptible`	`BOOLEAN`	Optional. Specifies that this instance group contains preemptible instances
`config.masterConfig.imageUri`	`STRING`	Optional. The Compute Engine image resource used for cluster instances. It can be specified or may be inferred from SoftwareConfig.image_version
`config.masterConfig.machineTypeUri`	`STRING`	Optional. The Compute Engine machine type used for cluster instances. A full URL, partial URI, or short name are valid. Examples: https://www.googleapis.com/compute/v1/projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2, projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2, n1-standard-2. Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the machine type resource, for example, n1-standard-2
`config.secondaryWorkerConfig`	`OBJECT`	Optional. The config settings for Compute Engine resources in an instance group, such as a master or worker group
`config.secondaryWorkerConfig.instanceNames[]`	`STRING`
`config.secondaryWorkerConfig.accelerators[]`	`OBJECT`	Specifies the type and number of accelerator cards attached to the instances of an instance group. See GPUs on Compute Engine
`config.secondaryWorkerConfig.accelerators[].acceleratorCount`	`INTEGER`	The number of the accelerator cards of this type exposed to this instance
`config.secondaryWorkerConfig.accelerators[].acceleratorTypeUri`	`STRING`	Full URL, partial URI, or short name of the accelerator type resource to expose to this instance. See Compute Engine AcceleratorTypes. Examples: https://www.googleapis.com/compute/beta/projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80, projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80, nvidia-tesla-k80. Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the accelerator type resource, for example, nvidia-tesla-k80
`config.secondaryWorkerConfig.numInstances`	`INTEGER`	Optional. The number of VM instances in the instance group. For master instance groups, must be set to 1
`config.secondaryWorkerConfig.diskConfig`	`OBJECT`	Specifies the config of disk options for a group of VM instances
`config.secondaryWorkerConfig.diskConfig.bootDiskType`	`STRING`	Optional. Type of the boot disk (default is "pd-standard"). Valid values: "pd-ssd" (Persistent Disk Solid State Drive) or "pd-standard" (Persistent Disk Hard Disk Drive)
`config.secondaryWorkerConfig.diskConfig.numLocalSsds`	`INTEGER`	Optional. Number of attached SSDs, from 0 to 4 (default is 0). If SSDs are not attached, the boot disk is used to store runtime logs and HDFS (https://hadoop.apache.org/docs/r1.2.1/hdfs_user_guide.html) data. If one or more SSDs are attached, this runtime bulk data is spread across them, and the boot disk contains only basic config and installed binaries
`config.secondaryWorkerConfig.diskConfig.bootDiskSizeGb`	`INTEGER`	Optional. Size in GB of the boot disk (default is 500GB)
`config.secondaryWorkerConfig.managedGroupConfig`	`OBJECT`	Specifies the resources used to actively manage an instance group
`config.secondaryWorkerConfig.managedGroupConfig.instanceGroupManagerName`	`STRING`	Output only. The name of the Instance Group Manager for this group
`config.secondaryWorkerConfig.managedGroupConfig.instanceTemplateName`	`STRING`	Output only. The name of the Instance Template used for the Managed Instance Group
`config.secondaryWorkerConfig.isPreemptible`	`BOOLEAN`	Optional. Specifies that this instance group contains preemptible instances
`config.secondaryWorkerConfig.imageUri`	`STRING`	Optional. The Compute Engine image resource used for cluster instances. It can be specified or may be inferred from SoftwareConfig.image_version
`config.secondaryWorkerConfig.machineTypeUri`	`STRING`	Optional. The Compute Engine machine type used for cluster instances. A full URL, partial URI, or short name are valid. Examples: https://www.googleapis.com/compute/v1/projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2, projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2, n1-standard-2. Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the machine type resource, for example, n1-standard-2
`config.encryptionConfig`	`OBJECT`	Encryption settings for the cluster
`config.encryptionConfig.gcePdKmsKeyName`	`STRING`	Optional. The Cloud KMS key name to use for PD disk encryption for all instances in the cluster
`config.securityConfig`	`OBJECT`	Security related configuration, including Kerberos
`config.securityConfig.kerberosConfig`	`OBJECT`	Specifies Kerberos related configuration
`config.securityConfig.kerberosConfig.keystoreUri`	`STRING`	Optional. The Cloud Storage URI of the keystore file used for SSL encryption. If not provided, Dataproc will provide a self-signed certificate
`config.securityConfig.kerberosConfig.keyPasswordUri`	`STRING`	Optional. The Cloud Storage URI of a KMS encrypted file containing the password to the user provided key. For the self-signed certificate, this password is generated by Dataproc
`config.securityConfig.kerberosConfig.keystorePasswordUri`	`STRING`	Optional. The Cloud Storage URI of a KMS encrypted file containing the password to the user provided keystore. For the self-signed certificate, this password is generated by Dataproc
`config.securityConfig.kerberosConfig.crossRealmTrustAdminServer`	`STRING`	Optional. The admin server (IP or hostname) for the remote trusted realm in a cross realm trust relationship
`config.securityConfig.kerberosConfig.kdcDbKeyUri`	`STRING`	Optional. The Cloud Storage URI of a KMS encrypted file containing the master key of the KDC database
`config.securityConfig.kerberosConfig.truststorePasswordUri`	`STRING`	Optional. The Cloud Storage URI of a KMS encrypted file containing the password to the user provided truststore. For the self-signed certificate, this password is generated by Dataproc
`config.securityConfig.kerberosConfig.enableKerberos`	`BOOLEAN`	Optional. Flag to indicate whether to Kerberize the cluster
`config.securityConfig.kerberosConfig.truststoreUri`	`STRING`	Optional. The Cloud Storage URI of the truststore file used for SSL encryption. If not provided, Dataproc will provide a self-signed certificate
`config.securityConfig.kerberosConfig.crossRealmTrustRealm`	`STRING`	Optional. The remote realm the Dataproc on-cluster KDC will trust, should the user enable cross realm trust
`config.securityConfig.kerberosConfig.rootPrincipalPasswordUri`	`STRING`	Required. The Cloud Storage URI of a KMS encrypted file containing the root principal password
`config.securityConfig.kerberosConfig.kmsKeyUri`	`STRING`	Required. The uri of the KMS key used to encrypt various sensitive files
`config.securityConfig.kerberosConfig.crossRealmTrustKdc`	`STRING`	Optional. The KDC (IP or hostname) for the remote trusted realm in a cross realm trust relationship
`config.securityConfig.kerberosConfig.crossRealmTrustSharedPasswordUri`	`STRING`	Optional. The Cloud Storage URI of a KMS encrypted file containing the shared password between the on-cluster Kerberos realm and the remote trusted realm, in a cross realm trust relationship
`config.securityConfig.kerberosConfig.tgtLifetimeHours`	`INTEGER`	Optional. The lifetime of the ticket granting ticket, in hours. If not specified, or user specifies 0, then default value 10 will be used
`config.initializationActions[]`	`OBJECT`	Specifies an executable to run on a fully configured node and a timeout period for executable completion
`config.initializationActions[].executableFile`	`STRING`	Required. Cloud Storage URI of executable file
`config.initializationActions[].executionTimeout`	`ANY`	Optional. Amount of time executable has to complete. Default is 10 minutes. Cluster creation fails with an explanatory error message (the name of the executable that caused the error and the exceeded timeout period) if the executable is not completed at end of the timeout period
`config.configBucket`	`STRING`	Optional. A Google Cloud Storage bucket used to stage job dependencies, config files, and job driver console output. If you do not specify a staging bucket, Cloud Dataproc will determine a Cloud Storage location (US, ASIA, or EU) for your cluster's staging bucket according to the Google Compute Engine zone where your cluster is deployed, and then create and manage this project-level, per-location bucket (see Cloud Dataproc staging bucket)
`clusterName`	`STRING`	Required. The cluster name. Cluster names within a project must be unique. Names of deleted clusters can be reused
`clusterUuid`	`STRING`	Output only. A cluster UUID (Universally Unique Identifier). Cloud Dataproc generates this value when it creates the cluster
`projectId`	`STRING`	Required. The Google Cloud Platform project ID that the cluster belongs to
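
The names in the table are dotted paths into the returned cluster object, with `[]` marking a list. A minimal sketch of how such a path could be resolved against a decoded JSON response follows; the helper `resolve` and the sample data (including the state strings) are hypothetical, and the `customKey` rows, which appear to stand for map entries with user-defined keys, are not handled here.

```python
from typing import Any, List


def resolve(response: Any, path: str) -> List[Any]:
    """Return every value in `response` matched by a dotted path from the table.

    A segment ending in `[]` fans out over a list; a missing key yields no values.
    """
    current = [response]
    for segment in path.split("."):
        is_list = segment.endswith("[]")
        key = segment[:-2] if is_list else segment
        next_values = []
        for node in current:
            if not isinstance(node, dict) or key not in node:
                continue
            value = node[key]
            if is_list:
                if isinstance(value, list):
                    next_values.extend(value)
            else:
                next_values.append(value)
        current = next_values
    return current


# Hypothetical, heavily trimmed response used only for illustration.
cluster = {
    "clusterName": "my-cluster",
    "status": {"state": "RUNNING"},
    "statusHistory": [{"state": "CREATING"}],
    "config": {"workerConfig": {"numInstances": 2, "instanceNames": ["w-0", "w-1"]}},
}

print(resolve(cluster, "config.workerConfig.numInstances"))     # [2]
print(resolve(cluster, "statusHistory[].state"))                # ['CREATING']
print(resolve(cluster, "config.workerConfig.instanceNames[]"))  # ['w-0', 'w-1']
print(resolve(cluster, "config.masterConfig.imageUri"))         # []
```

Returning a list in every case keeps the caller's code the same whether or not the path crosses a `[]` segment, and an empty list signals an absent optional field.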

labels OBJECT

Optional. The labels to associate with this cluster. Label keys must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). Label values may be empty, but, if present, must contain 1 to 63 characters, and must conform to RFC 1035 (https://www.ietf.org/rfc/rfc1035.txt). No more than 32 labels can be associated with a cluster
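
The label rules above can be checked mechanically. The sketch below is one strict reading of them, using the RFC 1035 label grammar (a letter first, then letters, digits, or hyphens, ending in a letter or digit); the function name `validate_labels` is hypothetical, and the service itself may apply additional or different checks.

```python
import re
from typing import Dict, List

# RFC 1035 label grammar: a letter, then letters/digits/hyphens, ending in a letter or digit.
# The {0,61} bound keeps the total length within 63 characters.
RFC1035_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")
MAX_LABELS = 32


def validate_labels(labels: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the labels pass."""
    problems = []
    if len(labels) > MAX_LABELS:
        problems.append(f"{len(labels)} labels given, at most {MAX_LABELS} allowed")
    for key, value in labels.items():
        if not RFC1035_LABEL.fullmatch(key):
            problems.append(f"invalid key: {key!r}")
        # Values may be empty; only non-empty values are checked against the grammar.
        if value != "" and not RFC1035_LABEL.fullmatch(value):
            problems.append(f"invalid value for {key!r}: {value!r}")
    return problems


print(validate_labels({"env": "prod", "team": ""}))  # []
print(validate_labels({"1env": "prod"}))             # ["invalid key: '1env'"]
```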

labels.customKey.value STRING

metrics OBJECT

Contains cluster daemon metrics, such as HDFS and YARN stats. Beta Feature: This report is available for testing purposes only. It may be changed before final release

metrics.hdfsMetrics OBJECT

The HDFS metrics

metrics.hdfsMetrics.customKey.value INTEGER

The HDFS metrics

metrics.yarnMetrics OBJECT

The YARN metrics

metrics.yarnMetrics.customKey.value INTEGER

The YARN metrics

status OBJECT

The status of a cluster and its instances

status.detail STRING

Output only. Optional details of cluster's state

status.state ENUMERATION

Output only. The cluster's state

status.stateStartTime ANY

Output only. Time when this state was entered

status.substate ENUMERATION

Output only. Additional state information that includes status reported by the agent

statusHistory[] OBJECT

The status of a cluster and its instances

statusHistory[].detail STRING

Output only. Optional details of cluster's state

statusHistory[].state ENUMERATION

Output only. The cluster's state

statusHistory[].stateStartTime ANY

Output only. Time when this state was entered

statusHistory[].substate ENUMERATION

Output only. Additional state information that includes status reported by the agent

config OBJECT

The cluster config

config.workerConfig OBJECT

Optional. The config settings for Compute Engine resources in an instance group, such as a master or worker group

config.workerConfig.instanceNames[] STRING

config.workerConfig.accelerators[] OBJECT

Specifies the type and number of accelerator cards attached to the instances of an instance group. See GPUs on Compute Engine

config.workerConfig.accelerators[].acceleratorCount INTEGER

The number of the accelerator cards of this type exposed to this instance

config.workerConfig.accelerators[].acceleratorTypeUri STRING

Full URL, partial URI, or short name of the accelerator type resource to expose to this instance. See Compute Engine AcceleratorTypes. Examples: https://www.googleapis.com/compute/beta/projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80, projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80, nvidia-tesla-k80. Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the accelerator type resource, for example, nvidia-tesla-k80
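
Because the same resource can appear as a full URL, a partial URI, or a short name, code that compares these values may want to normalize them first. A minimal sketch, assuming the short name is always the last path segment (as it is in the three examples above); `short_name` is a hypothetical helper:

```python
def short_name(resource: str) -> str:
    """Return the final path segment of a full URL, partial URI, or short name."""
    return resource.rstrip("/").rsplit("/", 1)[-1]


examples = [
    "https://www.googleapis.com/compute/beta/projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80",
    "projects/[project_id]/zones/us-east1-a/acceleratorTypes/nvidia-tesla-k80",
    "nvidia-tesla-k80",
]
for example in examples:
    print(short_name(example))  # nvidia-tesla-k80 each time
```

The same normalization would apply to other fields documented with the "full URL, partial URI, or short name" wording, such as the machine type.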

config.workerConfig.numInstances INTEGER

Optional. The number of VM instances in the instance group. For master instance groups, must be set to 1

config.workerConfig.diskConfig OBJECT

Specifies the config of disk options for a group of VM instances

config.workerConfig.diskConfig.bootDiskType STRING

Optional. Type of the boot disk (default is "pd-standard"). Valid values: "pd-ssd" (Persistent Disk Solid State Drive) or "pd-standard" (Persistent Disk Hard Disk Drive)

config.workerConfig.diskConfig.numLocalSsds INTEGER

Optional. Number of attached SSDs, from 0 to 4 (default is 0). If SSDs are not attached, the boot disk is used to store runtime logs and HDFS (https://hadoop.apache.org/docs/r1.2.1/hdfs_user_guide.html) data. If one or more SSDs are attached, this runtime bulk data is spread across them, and the boot disk contains only basic config and installed binaries

config.workerConfig.diskConfig.bootDiskSizeGb INTEGER

Optional. Size in GB of the boot disk (default is 500GB)
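
The three `diskConfig` fields and their documented defaults can be gathered into one small type. This is only a sketch: the class `DiskConfig` is hypothetical, and treating a key that is absent from the response as "use the documented default" is an assumption.

```python
from dataclasses import dataclass

VALID_BOOT_DISK_TYPES = ("pd-standard", "pd-ssd")


@dataclass
class DiskConfig:
    # Defaults taken from the field descriptions above.
    boot_disk_type: str = "pd-standard"
    num_local_ssds: int = 0
    boot_disk_size_gb: int = 500

    def __post_init__(self) -> None:
        if self.boot_disk_type not in VALID_BOOT_DISK_TYPES:
            raise ValueError(f"bootDiskType must be one of {VALID_BOOT_DISK_TYPES}")
        if not 0 <= self.num_local_ssds <= 4:
            raise ValueError("numLocalSsds must be between 0 and 4")

    @classmethod
    def from_response(cls, disk_config: dict) -> "DiskConfig":
        """Build from a `diskConfig` object, using the defaults for absent keys (assumed)."""
        return cls(
            boot_disk_type=disk_config.get("bootDiskType", "pd-standard"),
            num_local_ssds=disk_config.get("numLocalSsds", 0),
            boot_disk_size_gb=disk_config.get("bootDiskSizeGb", 500),
        )

    @property
    def bulk_data_on_boot_disk(self) -> bool:
        """True when no SSDs are attached, so runtime logs and HDFS data use the boot disk."""
        return self.num_local_ssds == 0


config = DiskConfig.from_response({"bootDiskType": "pd-ssd", "numLocalSsds": 2})
print(config.boot_disk_size_gb)       # 500
print(config.bulk_data_on_boot_disk)  # False
```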

config.workerConfig.managedGroupConfig OBJECT

Specifies the resources used to actively manage an instance group

config.workerConfig.managedGroupConfig.instanceGroupManagerName STRING

Output only. The name of the Instance Group Manager for this group

config.workerConfig.managedGroupConfig.instanceTemplateName STRING

Output only. The name of the Instance Template used for the Managed Instance Group

config.workerConfig.isPreemptible BOOLEAN

Optional. Specifies that this instance group contains preemptible instances

config.workerConfig.imageUri STRING

Optional. The Compute Engine image resource used for cluster instances. It can be specified or may be inferred from SoftwareConfig.image_version

config.workerConfig.machineTypeUri STRING

Optional. The Compute Engine machine type used for cluster instances. A full URL, partial URI, or short name are valid. Examples: https://www.googleapis.com/compute/v1/projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2, projects/[project_id]/zones/us-east1-a/machineTypes/n1-standard-2, n1-standard-2. Auto Zone Exception: If you are using the Cloud Dataproc Auto Zone Placement feature, you must use the short name of the machine type resource, for example, n1-standard-2

config.gceClusterConfig OBJECT

Common config settings for resources of Compute Engine cluster instances, applicable to all instances in the cluster

config.gceClusterConfig.tags[] STRING

config.gceClusterConfig.serviceAccount STRING

Optional. The service account of the instances. Defaults to the default Compute Engine service account. Custom service accounts need permissions equivalent to the following IAM roles: roles/logging.logWriter and roles/storage.objectAdmin (see https://cloud.google.com/compute/docs/access/service-accounts#custom_service_accounts for more information). Example: [account_id]@[project_id].iam.gserviceaccount.com

config.gceClusterConfig.subnetworkUri STRING

Optional. The Compute Engine subnetwork to be used for machine communications. Cannot be specified with network_uri. A full URL, partial URI, or short name are valid. Examples: https://www.googleapis.com/compute/v1/projects/[project_id]/regions/us-east1/subnetworks/sub0, projects/[project_id]/regions/us-east1/subnetworks/sub0, sub0

config.gceClusterConfig.networkUri STRING

Optional. The Compute Engine network to be used for machine communications. Cannot be specified with subnetwork_uri. If neither network_uri nor subnetwork_uri is specified, the "default" network of the project is used, if it exists. Cannot be a "Custom Subnet Network" (see Using Subnetworks for more information). A full URL, partial URI, or short name are valid. Examples: https://www.googleapis.com/compute/v1/projects/[project_id]/regions/global/default, projects/[project_id]/regions/global/default, default
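
The relationship between `networkUri` and `subnetworkUri` described above (mutually exclusive, with the project's "default" network as the fallback) can be written as a small decision function. This is a sketch only; `effective_network` and its `(kind, name)` return shape are hypothetical.

```python
from typing import Optional, Tuple


def effective_network(
    network_uri: Optional[str], subnetwork_uri: Optional[str]
) -> Tuple[str, str]:
    """Return ("network" | "subnetwork", name) following the documented rules."""
    if network_uri and subnetwork_uri:
        raise ValueError("networkUri and subnetworkUri cannot both be specified")
    if subnetwork_uri:
        return ("subnetwork", subnetwork_uri)
    if network_uri:
        return ("network", network_uri)
    # Neither given: the project's "default" network is used, if it exists.
    return ("network", "default")


print(effective_network(None, "sub0"))  # ('subnetwork', 'sub0')
print(effective_network(None, None))    # ('network', 'default')
```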

config.gceClusterConfig.zoneUri STRING

Optional. The zone where the Compute Engine cluster will be located. On a create request, it is required in the "global" region. If omitted in a non-global Cloud Dataproc region, the service will pick a zone in the corresponding Compute Engine region. On a get request, zone will always be present. A full URL, partial URI, or short name are valid. Examples: https://www.googleapis.com/compute/v1/projects/[project_id]/zones/[zone], projects/[project_id]/zones/[zone], us-central1-f

config.gceClusterConfig.internalIpOnly BOOLEAN

Optional. If true, all instances in the cluster will only have internal IP addresses. By default, clusters are not restricted to internal IP addresses, and will have ephemeral external IP addresses assigned to each instance. This internal_ip_only restriction can only be enabled for subnetwork enabled networks, and all off-cluster dependencies must be configured to be accessible without external IP addresses

config.gceClusterConfig.metadata OBJECT

The Compute Engine metadata entries to add to all instances (see Project and instance metadata (https://cloud.google.com/compute/docs/storing-retrieving-metadata#project_and_instance_metadata))

config.gceClusterConfig.metadata.customKey.value STRING

The Compute Engine metadata entries to add to all instances (see Project and instance metadata (https://cloud.google.com/compute/docs/storing-retrieving-metadata#project_and_instance_metadata))

config.gceClusterConfig.serviceAccountScopes[] STRING

config.softwareConfig OBJECT

Specifies the selection and config of software inside the cluster

config.softwareConfig.imageVersion STRING

Optional. The version of software inside the cluster. It must be one of the supported Cloud Dataproc Versions, such as "1.2" (including a subminor version, such as "1.2.29"), or the "preview" version. If unspecified, it defaults to the latest Debian version

config.softwareConfig.properties OBJECT

Optional. The properties to set on daemon config files. Property keys are specified in prefix:property format, for example core:hadoop.tmp.dir. The following are supported prefixes and their mappings: capacity-scheduler: capacity-scheduler.xml; core: core-site.xml; distcp: distcp-default.xml; hdfs: hdfs-site.xml; hive: hive-site.xml; mapred: mapred-site.xml; pig: pig.properties; spark: spark-defaults.conf; yarn: yarn-site.xml. For more information, see Cluster properties
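
The prefix:property convention above can be illustrated by splitting a key into the config file it targets and the property name within that file. The mapping below is copied from the description; the helper `split_property_key` is hypothetical.

```python
from typing import Tuple

# Prefix-to-file mapping as listed in the field description above.
PREFIX_TO_FILE = {
    "capacity-scheduler": "capacity-scheduler.xml",
    "core": "core-site.xml",
    "distcp": "distcp-default.xml",
    "hdfs": "hdfs-site.xml",
    "hive": "hive-site.xml",
    "mapred": "mapred-site.xml",
    "pig": "pig.properties",
    "spark": "spark-defaults.conf",
    "yarn": "yarn-site.xml",
}


def split_property_key(key: str) -> Tuple[str, str]:
    """Split 'prefix:property' into (config file name, property name)."""
    prefix, separator, name = key.partition(":")
    if not separator or not name:
        raise ValueError(f"expected prefix:property, got {key!r}")
    if prefix not in PREFIX_TO_FILE:
        raise ValueError(f"unsupported prefix: {prefix!r}")
    return PREFIX_TO_FILE[prefix], name


print(split_property_key("core:hadoop.tmp.dir"))  # ('core-site.xml', 'hadoop.tmp.dir')
```

Using `partition` rather than `split` keeps any further colons inside the property name intact.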

config.softwareConfig.properties.customKey.value STRING

config.softwareConfig.optionalComponents[] ENUMERATION

config.masterConfig OBJECT

Optional. The config settings for Compute Engine resources in an instance group, such as a master or worker group

config.masterConfig.instanceNames[] STRING

config.masterConfig.accelerators[] OBJECT

Specifies the type and number of accelerator cards attached to the instances of an instance group. See GPUs on Compute Engine

config.masterConfig.accelerators[].acceleratorCount INTEGER

The number of the accelerator cards of this type exposed to this instance

config.masterConfig.accelerators[].acceleratorTypeUri STRING

config.masterConfig.numInstances INTEGER

Optional. The number of VM instances in the instance group. For master instance groups, must be set to 1

config.masterConfig.diskConfig OBJECT

Specifies the config of disk options for a group of VM instances

config.masterConfig.diskConfig.bootDiskType STRING

Optional. Type of the boot disk (default is "pd-standard"). Valid values: "pd-ssd" (Persistent Disk Solid State Drive) or "pd-standard" (Persistent Disk Hard Disk Drive)

config.masterConfig.diskConfig.numLocalSsds INTEGER

config.masterConfig.diskConfig.bootDiskSizeGb INTEGER

Optional. Size in GB of the boot disk (default is 500GB)
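As a minimal sketch of how the documented defaults might be applied when a returned diskConfig object omits its optional fields, the following Python function fills in "pd-standard" and 500 GB. The function name effective_disk_config is hypothetical, and treating an absent key as "use the default" is an assumption about how the response is shaped.

```python
# Defaults taken from the bootDiskType and bootDiskSizeGb descriptions.
DEFAULT_BOOT_DISK_TYPE = "pd-standard"
DEFAULT_BOOT_DISK_SIZE_GB = 500
VALID_BOOT_DISK_TYPES = ("pd-ssd", "pd-standard")


def effective_disk_config(disk_config=None):
    """Return a copy of disk_config with the documented defaults filled in.

    Hypothetical helper: an absent key is assumed to mean "use the default".
    """
    cfg = dict(disk_config or {})
    cfg.setdefault("bootDiskType", DEFAULT_BOOT_DISK_TYPE)
    cfg.setdefault("bootDiskSizeGb", DEFAULT_BOOT_DISK_SIZE_GB)
    if cfg["bootDiskType"] not in VALID_BOOT_DISK_TYPES:
        raise ValueError(f"unexpected bootDiskType {cfg['bootDiskType']!r}")
    return cfg
```

Calling effective_disk_config({"bootDiskType": "pd-ssd"}) keeps the explicit disk type and adds only the default size.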

config.masterConfig.managedGroupConfig OBJECT

Specifies the resources used to actively manage an instance group

config.masterConfig.managedGroupConfig.instanceGroupManagerName STRING

Output only. The name of the Instance Group Manager for this group

config.masterConfig.managedGroupConfig.instanceTemplateName STRING

Output only. The name of the Instance Template used for the Managed Instance Group

config.masterConfig.isPreemptible BOOLEAN

Optional. Specifies that this instance group contains preemptible instances

config.masterConfig.imageUri STRING

Optional. The Compute Engine image resource used for cluster instances. It can be specified or may be inferred from SoftwareConfig.image_version

config.masterConfig.machineTypeUri STRING

config.secondaryWorkerConfig OBJECT

Optional. The config settings for Compute Engine resources in an instance group, such as a master or worker group

config.secondaryWorkerConfig.instanceNames[] STRING

config.secondaryWorkerConfig.accelerators[] OBJECT

Specifies the type and number of accelerator cards attached to the instances of an instance group. See GPUs on Compute Engine

config.secondaryWorkerConfig.accelerators[].acceleratorCount INTEGER

The number of the accelerator cards of this type exposed to this instance

config.secondaryWorkerConfig.accelerators[].acceleratorTypeUri STRING

config.secondaryWorkerConfig.numInstances INTEGER

Optional. The number of VM instances in the instance group. For master instance groups, must be set to 1

config.secondaryWorkerConfig.diskConfig OBJECT

Specifies the config of disk options for a group of VM instances

config.secondaryWorkerConfig.diskConfig.bootDiskType STRING

Optional. Type of the boot disk (default is "pd-standard"). Valid values: "pd-ssd" (Persistent Disk Solid State Drive) or "pd-standard" (Persistent Disk Hard Disk Drive)

config.secondaryWorkerConfig.diskConfig.numLocalSsds INTEGER

config.secondaryWorkerConfig.diskConfig.bootDiskSizeGb INTEGER

Optional. Size in GB of the boot disk (default is 500GB)

config.secondaryWorkerConfig.managedGroupConfig OBJECT

Specifies the resources used to actively manage an instance group

config.secondaryWorkerConfig.managedGroupConfig.instanceGroupManagerName STRING

Output only. The name of the Instance Group Manager for this group

config.secondaryWorkerConfig.managedGroupConfig.instanceTemplateName STRING

Output only. The name of the Instance Template used for the Managed Instance Group

config.secondaryWorkerConfig.isPreemptible BOOLEAN

Optional. Specifies that this instance group contains preemptible instances

config.secondaryWorkerConfig.imageUri STRING

Optional. The Compute Engine image resource used for cluster instances. It can be specified or may be inferred from SoftwareConfig.image_version

config.secondaryWorkerConfig.machineTypeUri STRING

config.encryptionConfig OBJECT

Encryption settings for the cluster

config.encryptionConfig.gcePdKmsKeyName STRING

Optional. The Cloud KMS key name to use for PD disk encryption for all instances in the cluster

config.securityConfig OBJECT

Security-related configuration, including Kerberos

config.securityConfig.kerberosConfig OBJECT

Specifies Kerberos-related configuration

config.securityConfig.kerberosConfig.keystoreUri STRING

Optional. The Cloud Storage URI of the keystore file used for SSL encryption. If not provided, Dataproc will provide a self-signed certificate

config.securityConfig.kerberosConfig.keyPasswordUri STRING

Optional. The Cloud Storage URI of a KMS encrypted file containing the password to the user provided key. For the self-signed certificate, this password is generated by Dataproc

config.securityConfig.kerberosConfig.keystorePasswordUri STRING

Optional. The Cloud Storage URI of a KMS encrypted file containing the password to the user provided keystore. For the self-signed certificate, this password is generated by Dataproc

config.securityConfig.kerberosConfig.crossRealmTrustAdminServer STRING

Optional. The admin server (IP or hostname) for the remote trusted realm in a cross realm trust relationship

config.securityConfig.kerberosConfig.kdcDbKeyUri STRING

Optional. The Cloud Storage URI of a KMS encrypted file containing the master key of the KDC database

config.securityConfig.kerberosConfig.truststorePasswordUri STRING

Optional. The Cloud Storage URI of a KMS encrypted file containing the password to the user provided truststore. For the self-signed certificate, this password is generated by Dataproc

config.securityConfig.kerberosConfig.enableKerberos BOOLEAN

Optional. Flag to indicate whether to Kerberize the cluster

config.securityConfig.kerberosConfig.truststoreUri STRING

Optional. The Cloud Storage URI of the truststore file used for SSL encryption. If not provided, Dataproc will provide a self-signed certificate

config.securityConfig.kerberosConfig.crossRealmTrustRealm STRING

Optional. The remote realm the Dataproc on-cluster KDC will trust, should the user enable cross realm trust

config.securityConfig.kerberosConfig.rootPrincipalPasswordUri STRING

Required. The Cloud Storage URI of a KMS encrypted file containing the root principal password

config.securityConfig.kerberosConfig.kmsKeyUri STRING

Required. The URI of the KMS key used to encrypt various sensitive files

config.securityConfig.kerberosConfig.crossRealmTrustKdc STRING

Optional. The KDC (IP or hostname) for the remote trusted realm in a cross realm trust relationship

config.securityConfig.kerberosConfig.crossRealmTrustSharedPasswordUri STRING

Optional. The Cloud Storage URI of a KMS encrypted file containing the shared password between the on-cluster Kerberos realm and the remote trusted realm, in a cross realm trust relationship

config.securityConfig.kerberosConfig.tgtLifetimeHours INTEGER

Optional. The lifetime of the ticket granting ticket, in hours. If not specified, or if the user specifies 0, the default value of 10 is used
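The defaulting rule for tgtLifetimeHours (unset or 0 means 10 hours) can be sketched in Python as follows. The function name is hypothetical, and the input is assumed to be the kerberosConfig object from the output, represented as a plain dictionary.

```python
# Default taken from the tgtLifetimeHours description.
DEFAULT_TGT_LIFETIME_HOURS = 10


def effective_tgt_lifetime_hours(kerberos_config):
    """Return the ticket granting ticket lifetime in hours.

    Hypothetical helper: a missing value and an explicit 0 both fall back
    to the documented default of 10.
    """
    hours = kerberos_config.get("tgtLifetimeHours")
    if not hours:  # covers both None (unset) and 0
        return DEFAULT_TGT_LIFETIME_HOURS
    return hours
```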

config.initializationActions[] OBJECT

Specifies an executable to run on a fully configured node and a timeout period for executable completion

config.initializationActions[].executableFile STRING

Required. Cloud Storage URI of executable file

config.initializationActions[].executionTimeout ANY

Optional. The amount of time the executable has to complete. Default is 10 minutes. Cluster creation fails with an explanatory error message (the name of the executable that caused the error and the exceeded timeout period) if the executable has not completed by the end of the timeout period
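The following Python sketch shows one way to read executionTimeout with the documented 10-minute default. It assumes the value arrives as a string of seconds with an "s" suffix (for example "300s"), which is how Duration values are commonly encoded in JSON; this page lists the format only as ANY, so that encoding is an assumption, as is the function name.

```python
from datetime import timedelta

# Default taken from the executionTimeout description.
DEFAULT_EXECUTION_TIMEOUT = timedelta(minutes=10)


def execution_timeout(action):
    """Return the timeout of one initializationActions[] entry as a timedelta.

    Hypothetical helper: assumes a seconds string such as "300s" and falls
    back to the documented 10-minute default when the field is absent.
    """
    raw = action.get("executionTimeout")
    if raw is None:
        return DEFAULT_EXECUTION_TIMEOUT
    if not isinstance(raw, str) or not raw.endswith("s"):
        raise ValueError(f"unexpected executionTimeout value {raw!r}")
    return timedelta(seconds=float(raw[:-1]))
```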

config.configBucket STRING

Optional. A Google Cloud Storage bucket used to stage job dependencies, config files, and job driver console output. If you do not specify a staging bucket, Cloud Dataproc will determine a Cloud Storage location (US, ASIA, or EU) for your cluster's staging bucket according to the Google Compute Engine zone where your cluster is deployed, and then create and manage this project-level, per-location bucket (see Cloud Dataproc staging bucket)

clusterName STRING

Required. The cluster name. Cluster names within a project must be unique. Names of deleted clusters can be reused

clusterUuid STRING

Output only. A cluster UUID (Universally Unique Identifier). Cloud Dataproc generates this value when it creates the cluster

projectId STRING

Required. The Google Cloud Platform project ID that the cluster belongs to
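Because the output parameters are listed as dotted paths, a small lookup helper can be convenient when working with the returned cluster representation. The Python sketch below assumes the output is available as nested dictionaries; the helper name get_path is hypothetical, and it handles plain dotted paths only, not the [] list segments.

```python
def get_path(resource, dotted_path, default=None):
    """Look up a dotted output path such as 'config.masterConfig.numInstances'.

    Hypothetical helper: returns default when any segment is missing.
    List segments (names ending in []) are not handled by this sketch.
    """
    current = resource
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


# Illustrative data only; the values are made up for this example.
sample_cluster = {
    "projectId": "example-project",
    "clusterName": "example-cluster",
    "config": {"masterConfig": {"numInstances": 1}},
}
```

For the sample above, get_path(sample_cluster, "config.masterConfig.numInstances") returns 1, and a missing path such as config.encryptionConfig.gcePdKmsKeyName returns the supplied default.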
