Create
Creates a training or a batch prediction job
Authorization
To use this building block you will have to grant access to at least one of the following scopes:
- View and manage your data across Google Cloud Platform services
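For orientation, this building block maps onto the `projects.jobs.create` method of the Cloud ML Engine v1 REST API. Below is a minimal sketch of the `parent` path the call requires; the client-library invocation in the comments is the standard `google-api-python-client` pattern for this API, and `my-project` is a placeholder.

```python
def job_parent(project_id: str) -> str:
    """Build the `parent` path the create call requires: "projects/PROJECT_ID"."""
    return f"projects/{project_id}"

# With the google-api-python-client library (and credentials carrying the
# cloud-platform scope listed above), the call itself looks like:
#
#   from googleapiclient import discovery
#   ml = discovery.build("ml", "v1")
#   ml.projects().jobs().create(parent=job_parent("my-project"),
#                               body=job_body).execute()

print(job_parent("my-project"))  # → projects/my-project
```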
Input
This building block consumes 103 input parameters
Name | Format | Description |
---|---|---|
parent *Required* | STRING | Required. The project name |
trainingOutput | OBJECT | Represents results of a training job. Output only |
trainingOutput.isBuiltInAlgorithmJob | BOOLEAN | Whether this job is a built-in algorithm job |
trainingOutput.builtInAlgorithmOutput | OBJECT | Represents output related to a built-in algorithm job |
trainingOutput.builtInAlgorithmOutput.pythonVersion | STRING | Python version on which the built-in algorithm was trained |
trainingOutput.builtInAlgorithmOutput.runtimeVersion | STRING | AI Platform runtime version on which the built-in algorithm was trained |
trainingOutput.builtInAlgorithmOutput.framework | STRING | Framework on which the built-in algorithm was trained |
trainingOutput.builtInAlgorithmOutput.modelPath | STRING | The Cloud Storage path to the trained model |
trainingOutput.trials[] | OBJECT | Represents the result of a single hyperparameter tuning trial from a training job. The TrainingOutput object that is returned on successful completion of a training job with hyperparameter tuning includes a list of HyperparameterOutput objects, one for each successful trial |
trainingOutput.trials[].hyperparameters | OBJECT | The hyperparameters given to this trial |
trainingOutput.trials[].hyperparameters.customKey.value *Required* | STRING | The hyperparameters given to this trial |
trainingOutput.trials[].trialId | STRING | The trial id for these results |
trainingOutput.trials[].endTime | ANY | Output only. End time for the trial |
trainingOutput.trials[].isTrialStoppedEarly | BOOLEAN | True if the trial is stopped early |
trainingOutput.trials[].startTime | ANY | Output only. Start time for the trial |
trainingOutput.trials[].finalMetric | OBJECT | An observed value of a metric |
trainingOutput.trials[].finalMetric.trainingStep | INTEGER | The global training step for this metric |
trainingOutput.trials[].finalMetric.objectiveValue | NUMBER | The objective value at this training step |
trainingOutput.trials[].builtInAlgorithmOutput | OBJECT | Represents output related to a built-in algorithm job |
trainingOutput.trials[].builtInAlgorithmOutput.pythonVersion | STRING | Python version on which the built-in algorithm was trained |
trainingOutput.trials[].builtInAlgorithmOutput.runtimeVersion | STRING | AI Platform runtime version on which the built-in algorithm was trained |
trainingOutput.trials[].builtInAlgorithmOutput.framework | STRING | Framework on which the built-in algorithm was trained |
trainingOutput.trials[].builtInAlgorithmOutput.modelPath | STRING | The Cloud Storage path to the trained model |
trainingOutput.trials[].state | ENUMERATION | Output only. The detailed state of the trial |
trainingOutput.trials[].allMetrics[] | OBJECT | An observed value of a metric |
trainingOutput.hyperparameterMetricTag | STRING | The TensorFlow summary tag name used for optimizing hyperparameter tuning trials. See trainingInput.hyperparameters.hyperparameterMetricTag for more information. Only set for hyperparameter tuning jobs |
trainingOutput.completedTrialCount | INTEGER | The number of hyperparameter tuning trials that completed successfully. Only set for hyperparameter tuning jobs |
trainingOutput.isHyperparameterTuningJob | BOOLEAN | Whether this job is a hyperparameter tuning job |
trainingOutput.consumedMLUnits | NUMBER | The amount of ML units consumed by the job |
createTime | ANY | Output only. When the job was created |
labels | OBJECT | Optional. One or more labels that you can add to organize your jobs. Each label is a key-value pair, where both the key and the value are arbitrary strings that you supply. For more information, see the documentation on using labels |
labels.customKey.value *Required* | STRING | Optional. One or more labels that you can add to organize your jobs. Each label is a key-value pair, where both the key and the value are arbitrary strings that you supply. For more information, see the documentation on using labels |
predictionInput | OBJECT | Represents input parameters for a prediction job |
predictionInput.outputPath | STRING | Required. The output Google Cloud Storage location |
predictionInput.outputDataFormat | ENUMERATION | Optional. Format of the output data files, defaults to JSON |
predictionInput.dataFormat | ENUMERATION | Required. The format of the input data files |
predictionInput.batchSize | INTEGER | Optional. Number of records per batch, defaults to 64. The service will buffer batch_size records in memory before invoking one TensorFlow prediction call internally, so take the record size and available memory into consideration when setting this parameter |
predictionInput.runtimeVersion | STRING | Optional. The AI Platform runtime version to use for this batch prediction. If not set, AI Platform will pick the runtime version used during the CreateVersion request for this model version, or choose the latest stable version when model version information is not available, such as when the model is specified by uri |
predictionInput.inputPaths[] | STRING | Required. The Google Cloud Storage location of the input data files. May contain wildcards |
predictionInput.region | STRING | Required. The Google Compute Engine region to run the prediction job in. See the available regions for AI Platform services |
predictionInput.versionName | STRING | Use this field if you want to specify a version of the model to use. The string is formatted the same way as modelName, with the addition of the version information: "projects/YOUR_PROJECT/models/YOUR_MODEL/versions/YOUR_VERSION" |
predictionInput.modelName | STRING | Use this field if you want to use the default version for the specified model. The string must use the following format: "projects/YOUR_PROJECT/models/YOUR_MODEL" |
predictionInput.uri | STRING | Use this field if you want to specify a Google Cloud Storage path for the model to use |
predictionInput.maxWorkerCount | INTEGER | Optional. The maximum number of workers to be used for parallel processing. Defaults to 10 if not specified |
predictionInput.signatureName | STRING | Optional. The name of the signature defined in the SavedModel to use for this job. Refer to SavedModel for information about how to use signatures. Defaults to DEFAULT_SERVING_SIGNATURE_DEF_KEY, which is "serving_default" |
errorMessage | STRING | Output only. The details of a failure or a cancellation |
etag | BINARY | etag is used for optimistic concurrency control as a way to help prevent simultaneous updates of a job from overwriting each other |
trainingInput | OBJECT | Represents input parameters for a training job. When using the gcloud command to submit your training job, you can specify the input parameters as command-line arguments and/or in a YAML configuration file referenced from the --config command-line argument. For details, see the guide to submitting a training job |
trainingInput.workerCount | INTEGER | Optional. The number of worker replicas to use for the training job. Each replica in the cluster will be of the type specified in workerType. This value can only be used when scaleTier is set to CUSTOM. The default value is zero |
trainingInput.masterType | STRING | Optional. Specifies the type of virtual machine to use for your training job's master worker. The following types are supported: <dl> <dt>standard</dt> <dd> A basic machine configuration suitable for training simple models with small to moderate datasets. </dd> <dt>large_model</dt> <dd> A machine with a lot of memory, specially suited for parameter servers when your model is large (having many hidden layers or layers with very large numbers of nodes). </dd> <dt>complex_model_s</dt> <dd> A machine suitable for the master and workers of the cluster when your model requires more computation than the standard machine can handle satisfactorily. </dd> <dt>complex_model_m</dt> <dd> A machine with roughly twice the number of cores and roughly double the memory of <i>complex_model_s</i>. </dd> <dt>complex_model_l</dt> <dd> A machine with roughly twice the number of cores and roughly double the memory of <i>complex_model_m</i>. </dd> <dt>standard_gpu</dt> <dd> A machine equivalent to <i>standard</i> that also includes a single NVIDIA Tesla K80 GPU. See more about <a href="/ml-engine/docs/tensorflow/using-gpus">using GPUs to train your model</a>. </dd> <dt>complex_model_m_gpu</dt> <dd> A machine equivalent to <i>complex_model_m</i> that also includes four NVIDIA Tesla K80 GPUs. </dd> <dt>complex_model_l_gpu</dt> <dd> A machine equivalent to <i>complex_model_l</i> that also includes eight NVIDIA Tesla K80 GPUs. </dd> <dt>standard_p100</dt> <dd> A machine equivalent to <i>standard</i> that also includes a single NVIDIA Tesla P100 GPU. </dd> <dt>complex_model_m_p100</dt> <dd> A machine equivalent to <i>complex_model_m</i> that also includes four NVIDIA Tesla P100 GPUs. </dd> <dt>standard_v100</dt> <dd> A machine equivalent to <i>standard</i> that also includes a single NVIDIA Tesla V100 GPU. </dd> <dt>large_model_v100</dt> <dd> A machine equivalent to <i>large_model</i> that also includes a single NVIDIA Tesla V100 GPU. </dd> <dt>complex_model_m_v100</dt> <dd> A machine equivalent to <i>complex_model_m</i> that also includes four NVIDIA Tesla V100 GPUs. </dd> <dt>complex_model_l_v100</dt> <dd> A machine equivalent to <i>complex_model_l</i> that also includes eight NVIDIA Tesla V100 GPUs. </dd> <dt>cloud_tpu</dt> <dd> A TPU VM including one Cloud TPU. See more about <a href="/ml-engine/docs/tensorflow/using-tpus">using TPUs to train your model</a>. </dd> </dl> You may also use certain Compute Engine machine types directly in this field; see more about using Compute Engine machine types. You must set this value when scaleTier is set to CUSTOM |
trainingInput.maxRunningTime | ANY | Optional. The maximum job running time. The default is 7 days |
trainingInput.runtimeVersion | STRING | Optional. The AI Platform runtime version to use for training. If not set, AI Platform uses the default stable version, 1.0. For more information, see the runtime version list and how to manage runtime versions |
trainingInput.pythonModule | STRING | Required. The Python module name to run after installing the packages |
trainingInput.args[] | STRING | Optional. Command-line arguments to pass to the program |
trainingInput.region | STRING | Required. The Google Compute Engine region to run the training job in. See the available regions for AI Platform services |
trainingInput.workerType | STRING | Optional. Specifies the type of virtual machine to use for your training job's worker nodes. The supported values are the same as those described in the entry for masterType, and this value must be consistent with the category of machine type that masterType uses. This value must be present when scaleTier is set to CUSTOM and workerCount is greater than zero |
trainingInput.parameterServerType | STRING | Optional. Specifies the type of virtual machine to use for your training job's parameter server. The supported values are the same as those described in the entry for masterType, and this value must be consistent with the category of machine type that masterType uses. This value must be present when scaleTier is set to CUSTOM and parameterServerCount is greater than zero |
trainingInput.parameterServerConfig | OBJECT | Represents the configuration for a replica in a cluster |
trainingInput.parameterServerConfig.acceleratorConfig | OBJECT | Represents a hardware accelerator request config |
trainingInput.parameterServerConfig.acceleratorConfig.type | ENUMERATION | The type of accelerator to use |
trainingInput.parameterServerConfig.acceleratorConfig.count | INTEGER | The number of accelerators to attach to each machine running the job |
trainingInput.parameterServerConfig.imageUri | STRING | The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers |
trainingInput.masterConfig | OBJECT | Represents the configuration for a replica in a cluster |
trainingInput.masterConfig.acceleratorConfig | OBJECT | Represents a hardware accelerator request config |
trainingInput.masterConfig.acceleratorConfig.type | ENUMERATION | The type of accelerator to use |
trainingInput.masterConfig.acceleratorConfig.count | INTEGER | The number of accelerators to attach to each machine running the job |
trainingInput.masterConfig.imageUri | STRING | The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers |
trainingInput.scaleTier | ENUMERATION | Required. Specifies the machine types and the number of replicas for workers and parameter servers |
trainingInput.jobDir | STRING | Optional. A Google Cloud Storage path in which to store training outputs and other data needed for training. This path is passed to your TensorFlow program as the '--job-dir' command-line argument. The benefit of specifying this field is that Cloud ML validates the path for use in training |
trainingInput.hyperparameters | OBJECT | Represents a set of hyperparameters to optimize |
trainingInput.hyperparameters.algorithm | ENUMERATION | Optional. The search algorithm specified for the hyperparameter tuning job. Uses the default AI Platform hyperparameter tuning algorithm if unspecified |
trainingInput.hyperparameters.hyperparameterMetricTag | STRING | Optional. The TensorFlow summary tag name to use for optimizing trials. For current versions of TensorFlow, this tag name should exactly match what is shown in TensorBoard, including all scopes. For versions of TensorFlow prior to 0.12, this should be only the tag passed to tf.Summary. By default, "training/hptuning/metric" will be used |
trainingInput.hyperparameters.params[] | OBJECT | Represents a single hyperparameter to optimize |
trainingInput.hyperparameters.params[].minValue | NUMBER | Required if type is DOUBLE or INTEGER. This field should be unset if type is CATEGORICAL |
trainingInput.hyperparameters.params[].discreteValues[] | NUMBER | Required if type is DISCRETE. A list of feasible points, in strictly increasing order |
trainingInput.hyperparameters.params[].scaleType | ENUMERATION | Optional. How the parameter should be scaled to the hypercube. Leave unset for categorical parameters. Some kind of scaling is strongly recommended for real or integral parameters (e.g., UNIT_LINEAR_SCALE) |
trainingInput.hyperparameters.params[].maxValue | NUMBER | Required if type is DOUBLE or INTEGER. This field should be unset if type is CATEGORICAL |
trainingInput.hyperparameters.params[].type | ENUMERATION | Required. The type of the parameter |
trainingInput.hyperparameters.params[].categoricalValues[] | STRING | Required if type is CATEGORICAL. The list of possible categories |
trainingInput.hyperparameters.params[].parameterName | STRING | Required. The parameter name must be unique amongst all ParameterConfigs in a HyperparameterSpec message. E.g., "learning_rate" |
trainingInput.hyperparameters.enableTrialEarlyStopping | BOOLEAN | Optional. Indicates if the hyperparameter tuning job enables auto trial early stopping |
trainingInput.hyperparameters.resumePreviousJobId | STRING | Optional. The prior hyperparameter tuning job id that users hope to continue with. The job id will be used to find the corresponding Vizier study guid and resume the study |
trainingInput.hyperparameters.maxParallelTrials | INTEGER | Optional. The number of training trials to run concurrently. You can reduce the time it takes to perform hyperparameter tuning by adding trials in parallel. However, each trial only benefits from the information gained in completed trials. That means that a trial does not get access to the results of trials running at the same time, which could reduce the quality of the overall optimization. Each trial will use the same scale tier and machine types. Defaults to one |
trainingInput.hyperparameters.maxFailedTrials | INTEGER | Optional. The number of failed trials that need to be seen before failing the hyperparameter tuning job. You can specify this field to override the default failing criteria for AI Platform hyperparameter tuning jobs. Defaults to zero, which means the service decides when a hyperparameter job should fail |
trainingInput.hyperparameters.goal | ENUMERATION | Required. The type of goal to use for tuning. Available types are MAXIMIZE and MINIMIZE. Defaults to MAXIMIZE |
trainingInput.hyperparameters.maxTrials | INTEGER | Optional. How many training trials should be attempted to optimize the specified hyperparameters. Defaults to one |
trainingInput.pythonVersion | STRING | Optional. The version of Python used in training. If not set, the default version is '2.7'. Python '3.5' is available when runtimeVersion is set to '1.4' and above |
trainingInput.workerConfig | OBJECT | Represents the configuration for a replica in a cluster |
trainingInput.workerConfig.acceleratorConfig | OBJECT | Represents a hardware accelerator request config |
trainingInput.workerConfig.acceleratorConfig.type | ENUMERATION | The type of accelerator to use |
trainingInput.workerConfig.acceleratorConfig.count | INTEGER | The number of accelerators to attach to each machine running the job |
trainingInput.workerConfig.imageUri | STRING | The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers |
trainingInput.parameterServerCount | INTEGER | Optional. The number of parameter server replicas to use for the training job. Each replica in the cluster will be of the type specified in parameterServerType. This value can only be used when scaleTier is set to CUSTOM. The default value is zero |
trainingInput.packageUris[] | STRING | Required. The Google Cloud Storage location of the packages with the training program and any additional dependencies. The maximum number of package URIs is 100 |
state | ENUMERATION | Output only. The detailed state of a job |
jobId | STRING | Required. The user-specified id of the job |
endTime | ANY | Output only. When the job processing was completed |
startTime | ANY | Output only. When the job processing was started |
predictionOutput | OBJECT | Represents results of a prediction job |
predictionOutput.errorCount | INTEGER | The number of data instances which resulted in errors |
predictionOutput.outputPath | STRING | The output Google Cloud Storage location provided at the job creation time |
predictionOutput.nodeHours | NUMBER | Node hours used by the batch prediction job |
predictionOutput.predictionCount | INTEGER | The number of generated predictions |
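To make the parameter table concrete, here is a hedged sketch of a training-job request body that uses a subset of the fields above, including an optional hyperparameter tuning spec. The bucket, module, and job names are hypothetical; the field names follow the table.

```python
def training_job_body(job_id: str) -> dict:
    """Assemble a minimal training-job request body (hypothetical names)."""
    return {
        "jobId": job_id,                    # required, user-specified id
        "labels": {"team": "demo"},         # optional, arbitrary key-value pairs
        "trainingInput": {
            "scaleTier": "BASIC",           # required; CUSTOM unlocks masterType etc.
            "region": "us-central1",        # required Compute Engine region
            "pythonModule": "trainer.task", # required module to run
            "packageUris": ["gs://my-bucket/trainer-0.1.tar.gz"],
            "args": ["--epochs", "10"],     # optional command-line arguments
            "hyperparameters": {            # optional tuning spec
                "goal": "MAXIMIZE",
                "maxTrials": 4,
                "maxParallelTrials": 2,
                "params": [{
                    "parameterName": "learning_rate",
                    "type": "DOUBLE",
                    "minValue": 0.0001,
                    "maxValue": 0.1,
                    "scaleType": "UNIT_LOG_SCALE",
                }],
            },
        },
    }

body = training_job_body("train_demo_1")
print(body["trainingInput"]["scaleTier"])  # → BASIC
```

This dict is what you would pass as the request body alongside the required `parent` parameter.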
= Parameter name
= Format
parent STRING Required Required. The project name |
trainingOutput OBJECT Represents results of a training job. Output only |
trainingOutput.isBuiltInAlgorithmJob BOOLEAN Whether this job is a built-in Algorithm job |
trainingOutput.builtInAlgorithmOutput OBJECT Represents output related to a built-in algorithm Job |
trainingOutput.builtInAlgorithmOutput.pythonVersion STRING Python version on which the built-in algorithm was trained |
trainingOutput.builtInAlgorithmOutput.runtimeVersion STRING AI Platform runtime version on which the built-in algorithm was trained |
trainingOutput.builtInAlgorithmOutput.framework STRING Framework on which the built-in algorithm was trained |
trainingOutput.builtInAlgorithmOutput.modelPath STRING The Cloud Storage path to the |
trainingOutput.trials[] OBJECT Represents the result of a single hyperparameter tuning trial from a training job. The TrainingOutput object that is returned on successful completion of a training job with hyperparameter tuning includes a list of HyperparameterOutput objects, one for each successful trial |
trainingOutput.trials[].hyperparameters OBJECT The hyperparameters given to this trial |
trainingOutput.trials[].hyperparameters.customKey.value STRING Required The hyperparameters given to this trial |
trainingOutput.trials[].trialId STRING The trial id for these results |
trainingOutput.trials[].endTime ANY Output only. End time for the trial |
trainingOutput.trials[].isTrialStoppedEarly BOOLEAN True if the trial is stopped early |
trainingOutput.trials[].startTime ANY Output only. Start time for the trial |
trainingOutput.trials[].finalMetric OBJECT An observed value of a metric |
trainingOutput.trials[].finalMetric.trainingStep INTEGER The global training step for this metric |
trainingOutput.trials[].finalMetric.objectiveValue NUMBER The objective value at this training step |
trainingOutput.trials[].builtInAlgorithmOutput OBJECT Represents output related to a built-in algorithm Job |
trainingOutput.trials[].builtInAlgorithmOutput.pythonVersion STRING Python version on which the built-in algorithm was trained |
trainingOutput.trials[].builtInAlgorithmOutput.runtimeVersion STRING AI Platform runtime version on which the built-in algorithm was trained |
trainingOutput.trials[].builtInAlgorithmOutput.framework STRING Framework on which the built-in algorithm was trained |
trainingOutput.trials[].builtInAlgorithmOutput.modelPath STRING The Cloud Storage path to the |
trainingOutput.trials[].state ENUMERATION Output only. The detailed state of the trial |
trainingOutput.trials[].allMetrics[] OBJECT An observed value of a metric |
trainingOutput.hyperparameterMetricTag STRING The TensorFlow summary tag name used for optimizing hyperparameter tuning
trials. See
|
trainingOutput.completedTrialCount INTEGER The number of hyperparameter tuning trials that completed successfully. Only set for hyperparameter tuning jobs |
trainingOutput.isHyperparameterTuningJob BOOLEAN Whether this job is a hyperparameter tuning job |
trainingOutput.consumedMLUnits NUMBER The amount of ML units consumed by the job |
createTime ANY Output only. When the job was created |
labels OBJECT Optional. One or more labels that you can add, to organize your jobs. Each label is a key-value pair, where both the key and the value are arbitrary strings that you supply. For more information, see the documentation on using labels |
labels.customKey.value STRING Required Optional. One or more labels that you can add, to organize your jobs. Each label is a key-value pair, where both the key and the value are arbitrary strings that you supply. For more information, see the documentation on using labels |
predictionInput OBJECT Represents input parameters for a prediction job |
predictionInput.outputPath STRING Required. The output Google Cloud Storage location |
predictionInput.outputDataFormat ENUMERATION Optional. Format of the output data files, defaults to JSON |
predictionInput.dataFormat ENUMERATION Required. The format of the input data files |
predictionInput.batchSize INTEGER Optional. Number of records per batch, defaults to 64. The service will buffer batch_size number of records in memory before invoking one Tensorflow prediction call internally. So take the record size and memory available into consideration when setting this parameter |
predictionInput.runtimeVersion STRING Optional. The AI Platform runtime version to use for this batch prediction. If not set, AI Platform will pick the runtime version used during the CreateVersion request for this model version, or choose the latest stable version when model version information is not available such as when the model is specified by uri |
predictionInput.inputPaths[] STRING |
predictionInput.region STRING Required. The Google Compute Engine region to run the prediction job in. See the available regions for AI Platform services |
predictionInput.versionName STRING Use this field if you want to specify a version of the model to use. The
string is formatted the same way as
|
predictionInput.modelName STRING Use this field if you want to use the default version for the specified model. The string must use the following format:
|
predictionInput.uri STRING Use this field if you want to specify a Google Cloud Storage path for the model to use |
predictionInput.maxWorkerCount INTEGER Optional. The maximum number of workers to be used for parallel processing. Defaults to 10 if not specified |
predictionInput.signatureName STRING Optional. The name of the signature defined in the SavedModel to use for this job. Please refer to SavedModel for information about how to use signatures. Defaults to DEFAULT_SERVING_SIGNATURE_DEF_KEY , which is "serving_default" |
errorMessage STRING Output only. The details of a failure or a cancellation |
etag BINARY
|
trainingInput OBJECT Represents input parameters for a training job. When using the gcloud command to submit your training job, you can specify the input parameters as command-line arguments and/or in a YAML configuration file referenced from the --config command-line argument. For details, see the guide to submitting a training job |
trainingInput.workerCount INTEGER Optional. The number of worker replicas to use for the training job. Each
replica in the cluster will be of the type specified in This value can only be used when The default value is zero |
trainingInput.masterType STRING Optional. Specifies the type of virtual machine to use for your training job's master worker. The following types are supported: <dl> <dt>standard</dt> <dd> A basic machine configuration suitable for training simple models with small to moderate datasets. </dd> <dt>large_model</dt> <dd> A machine with a lot of memory, specially suited for parameter servers when your model is large (having many hidden layers or layers with very large numbers of nodes). </dd> <dt>complex_model_s</dt> <dd> A machine suitable for the master and workers of the cluster when your model requires more computation than the standard machine can handle satisfactorily. </dd> <dt>complex_model_m</dt> <dd> A machine with roughly twice the number of cores and roughly double the memory of <i>complex_model_s</i>. </dd> <dt>complex_model_l</dt> <dd> A machine with roughly twice the number of cores and roughly double the memory of <i>complex_model_m</i>. </dd> <dt>standard_gpu</dt> <dd> A machine equivalent to <i>standard</i> that also includes a single NVIDIA Tesla K80 GPU. See more about <a href="/ml-engine/docs/tensorflow/using-gpus">using GPUs to train your model</a>. </dd> <dt>complex_model_m_gpu</dt> <dd> A machine equivalent to <i>complex_model_m</i> that also includes four NVIDIA Tesla K80 GPUs. </dd> <dt>complex_model_l_gpu</dt> <dd> A machine equivalent to <i>complex_model_l</i> that also includes eight NVIDIA Tesla K80 GPUs. </dd> <dt>standard_p100</dt> <dd> A machine equivalent to <i>standard</i> that also includes a single NVIDIA Tesla P100 GPU. </dd> <dt>complex_model_m_p100</dt> <dd> A machine equivalent to <i>complex_model_m</i> that also includes four NVIDIA Tesla P100 GPUs. </dd> <dt>standard_v100</dt> <dd> A machine equivalent to <i>standard</i> that also includes a single NVIDIA Tesla V100 GPU. </dd> <dt>large_model_v100</dt> <dd> A machine equivalent to <i>large_model</i> that also includes a single NVIDIA Tesla V100 GPU. 
</dd> <dt>complex_model_m_v100</dt> <dd> A machine equivalent to <i>complex_model_m</i> that also includes four NVIDIA Tesla V100 GPUs. </dd> <dt>complex_model_l_v100</dt> <dd> A machine equivalent to <i>complex_model_l</i> that also includes eight NVIDIA Tesla V100 GPUs. </dd> <dt>cloud_tpu</dt> <dd> A TPU VM including one Cloud TPU. See more about <a href="/ml-engine/docs/tensorflow/using-tpus">using TPUs to train your model</a>. </dd> </dl>You may also use certain Compute Engine machine types directly in this field. The following types are supported:
See more about using Compute Engine machine types. You must set this value when |
trainingInput.maxRunningTime ANY Optional. The maximum job running time. The default is 7 days |
trainingInput.runtimeVersion STRING Optional. The AI Platform runtime version to use for training. If not set, AI Platform uses the default stable version, 1.0. For more information, see the runtime version list and how to manage runtime versions |
trainingInput.pythonModule STRING Required. The Python module name to run after installing the packages |
trainingInput.args[] STRING |
trainingInput.region STRING Required. The Google Compute Engine region to run the training job in. See the available regions for AI Platform services |
trainingInput.workerType STRING Optional. Specifies the type of virtual machine to use for your training job's worker nodes. The supported values are the same as those described in the entry for
This value must be consistent with the category of machine type that
If you use This value must be present when |
trainingInput.parameterServerType STRING Optional. Specifies the type of virtual machine to use for your training job's parameter server. The supported values are the same as those described in the entry for
This value must be consistent with the category of machine type that
This value must be present when |
trainingInput.parameterServerConfig OBJECT Represents the configuration for a replica in a cluster |
trainingInput.parameterServerConfig.acceleratorConfig OBJECT Represents a hardware accelerator request config |
trainingInput.parameterServerConfig.acceleratorConfig.type ENUMERATION The type of accelerator to use |
trainingInput.parameterServerConfig.acceleratorConfig.count INTEGER The number of accelerators to attach to each machine running the job |
trainingInput.parameterServerConfig.imageUri STRING The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers |
trainingInput.masterConfig OBJECT Represents the configuration for a replica in a cluster |
trainingInput.masterConfig.acceleratorConfig OBJECT Represents a hardware accelerator request config |
trainingInput.masterConfig.acceleratorConfig.type ENUMERATION The type of accelerator to use |
trainingInput.masterConfig.acceleratorConfig.count INTEGER The number of accelerators to attach to each machine running the job |
trainingInput.masterConfig.imageUri STRING The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers |
trainingInput.scaleTier ENUMERATION Required. Specifies the machine types, the number of replicas for workers and parameter servers |
trainingInput.jobDir STRING Optional. A Google Cloud Storage path in which to store training outputs and other data needed for training. This path is passed to your TensorFlow program as the '--job-dir' command-line argument. The benefit of specifying this field is that Cloud ML validates the path for use in training |
trainingInput.hyperparameters OBJECT Represents a set of hyperparameters to optimize |
trainingInput.hyperparameters.algorithm ENUMERATION Optional. The search algorithm specified for the hyperparameter tuning job. Uses the default AI Platform hyperparameter tuning algorithm if unspecified |
trainingInput.hyperparameters.hyperparameterMetricTag | STRING | Optional. The TensorFlow summary tag name to use for optimizing trials. For current versions of TensorFlow, this tag name should exactly match what is shown in TensorBoard, including all scopes. For versions of TensorFlow prior to 0.12, this should be only the tag passed to tf.Summary. By default, "training/hptuning/metric" will be used |
trainingInput.hyperparameters.params[] | OBJECT | Represents a single hyperparameter to optimize |
trainingInput.hyperparameters.params[].minValue | NUMBER | Required if type is |
trainingInput.hyperparameters.params[].discreteValues[] | NUMBER | |
trainingInput.hyperparameters.params[].scaleType | ENUMERATION | Optional. How the parameter should be scaled to the hypercube. Leave unset for categorical parameters. Some kind of scaling is strongly recommended for real or integral parameters (e.g., |
trainingInput.hyperparameters.params[].maxValue | NUMBER | Required if type is |
trainingInput.hyperparameters.params[].type | ENUMERATION | Required. The type of the parameter |
trainingInput.hyperparameters.params[].categoricalValues[] | STRING | |
trainingInput.hyperparameters.params[].parameterName | STRING | Required. The parameter name must be unique amongst all ParameterConfigs in a HyperparameterSpec message. E.g., "learning_rate" |
trainingInput.hyperparameters.enableTrialEarlyStopping | BOOLEAN | Optional. Indicates if the hyperparameter tuning job enables auto trial early stopping |
trainingInput.hyperparameters.resumePreviousJobId | STRING | Optional. The prior hyperparameter tuning job id that users hope to continue with. The job id will be used to find the corresponding Vizier study guid and resume the study |
trainingInput.hyperparameters.maxParallelTrials | INTEGER | Optional. The number of training trials to run concurrently. You can reduce the time it takes to perform hyperparameter tuning by adding trials in parallel. However, each trial only benefits from the information gained in completed trials. That means that a trial does not get access to the results of trials running at the same time, which could reduce the quality of the overall optimization. Each trial will use the same scale tier and machine types. Defaults to one |
trainingInput.hyperparameters.maxFailedTrials | INTEGER | Optional. The number of failed trials that need to be seen before failing the hyperparameter tuning job. You can specify this field to override the default failing criteria for AI Platform hyperparameter tuning jobs. Defaults to zero, which means the service decides when a hyperparameter job should fail |
trainingInput.hyperparameters.goal | ENUMERATION | Required. The type of goal to use for tuning. Available types are Defaults to |
trainingInput.hyperparameters.maxTrials | INTEGER | Optional. How many training trials should be attempted to optimize the specified hyperparameters. Defaults to one |
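The hyperparameter fields above combine into a single `hyperparameters` object inside `trainingInput`. A minimal sketch in Python follows; the parameter name `learning_rate`, its bounds, and the trial counts are illustrative placeholders, not values from this reference:

```python
# Sketch of a trainingInput.hyperparameters (HyperparameterSpec) body.
# The parameter name, bounds, and trial counts are illustrative examples.
hyperparameters = {
    "goal": "MAXIMIZE",
    "hyperparameterMetricTag": "training/hptuning/metric",  # the documented default tag
    "maxTrials": 10,
    "maxParallelTrials": 2,          # parallel trials cannot learn from each other
    "enableTrialEarlyStopping": True,
    "params": [
        {
            "parameterName": "learning_rate",  # must be unique within the spec
            "type": "DOUBLE",
            "minValue": 0.0001,
            "maxValue": 0.1,
            "scaleType": "UNIT_LOG_SCALE",     # log scaling suits learning rates
        }
    ],
}
```

Keeping `maxParallelTrials` well below `maxTrials` lets later trials benefit from completed ones, as the `maxParallelTrials` description notes.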
trainingInput.pythonVersion | STRING | Optional. The version of Python used in training. If not set, the default version is '2.7'. Python '3.5' is available when |
trainingInput.workerConfig | OBJECT | Represents the configuration for a replica in a cluster |
trainingInput.workerConfig.acceleratorConfig | OBJECT | Represents a hardware accelerator request config |
trainingInput.workerConfig.acceleratorConfig.type | ENUMERATION | The type of accelerator to use |
trainingInput.workerConfig.acceleratorConfig.count | INTEGER | The number of accelerators to attach to each machine running the job |
trainingInput.workerConfig.imageUri | STRING | The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers |
trainingInput.parameterServerCount | INTEGER | Optional. The number of parameter server replicas to use for the training job. Each replica in the cluster will be of the type specified in This value can only be used when The default value is zero |
trainingInput.packageUris[] | STRING | |
state | ENUMERATION | Output only. The detailed state of a job |
jobId | STRING | Required. The user-specified id of the job |
endTime | ANY | Output only. When the job processing was completed |
startTime | ANY | Output only. When the job processing was started |
predictionOutput | OBJECT | Represents results of a prediction job |
predictionOutput.errorCount | INTEGER | The number of data instances which resulted in errors |
predictionOutput.outputPath | STRING | The output Google Cloud Storage location provided at the job creation time |
predictionOutput.nodeHours | NUMBER | Node hours used by the batch prediction job |
predictionOutput.predictionCount | INTEGER | The number of generated predictions |
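Putting the input parameters together, a training job request body might look like the following sketch. The project, bucket, module, and version values are placeholders, and the API call itself is shown only as a comment:

```python
# Sketch of a jobs.create request body for a training job.
# Bucket names, the Python module, and versions below are illustrative.
job_body = {
    "jobId": "my_training_job_001",          # the user-specified job id
    "trainingInput": {
        "scaleTier": "STANDARD_1",           # predefined tier; CUSTOM needs masterType etc.
        "region": "us-central1",
        "pythonModule": "trainer.task",      # module run after installing the packages
        "packageUris": ["gs://my-bucket/trainer-0.1.tar.gz"],
        "jobDir": "gs://my-bucket/output",   # passed to the program as --job-dir
        "runtimeVersion": "1.14",
        "pythonVersion": "3.5",
        "args": ["--epochs", "10"],
    },
}

# The request would then be sent with the Google API Python client, e.g.:
#   ml = googleapiclient.discovery.build("ml", "v1")
#   ml.projects().jobs().create(parent="projects/my-project",
#                               body=job_body).execute()
```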
Output
This building block provides 102 output parameters
Name | Format | Description |
---|---|---|
trainingOutput | OBJECT | Represents results of a training job. Output only |
trainingOutput.isBuiltInAlgorithmJob | BOOLEAN | Whether this job is a built-in Algorithm job |
trainingOutput.builtInAlgorithmOutput | OBJECT | Represents output related to a built-in algorithm Job |
trainingOutput.builtInAlgorithmOutput.pythonVersion | STRING | Python version on which the built-in algorithm was trained |
trainingOutput.builtInAlgorithmOutput.runtimeVersion | STRING | AI Platform runtime version on which the built-in algorithm was trained |
trainingOutput.builtInAlgorithmOutput.framework | STRING | Framework on which the built-in algorithm was trained |
trainingOutput.builtInAlgorithmOutput.modelPath | STRING | The Cloud Storage path to the |
trainingOutput.trials[] | OBJECT | Represents the result of a single hyperparameter tuning trial from a training job. The TrainingOutput object that is returned on successful completion of a training job with hyperparameter tuning includes a list of HyperparameterOutput objects, one for each successful trial |
trainingOutput.trials[].hyperparameters | OBJECT | The hyperparameters given to this trial |
trainingOutput.trials[].hyperparameters.customKey.value | STRING | The hyperparameters given to this trial |
trainingOutput.trials[].trialId | STRING | The trial id for these results |
trainingOutput.trials[].endTime | ANY | Output only. End time for the trial |
trainingOutput.trials[].isTrialStoppedEarly | BOOLEAN | True if the trial is stopped early |
trainingOutput.trials[].startTime | ANY | Output only. Start time for the trial |
trainingOutput.trials[].finalMetric | OBJECT | An observed value of a metric |
trainingOutput.trials[].finalMetric.trainingStep | INTEGER | The global training step for this metric |
trainingOutput.trials[].finalMetric.objectiveValue | NUMBER | The objective value at this training step |
trainingOutput.trials[].builtInAlgorithmOutput | OBJECT | Represents output related to a built-in algorithm Job |
trainingOutput.trials[].builtInAlgorithmOutput.pythonVersion | STRING | Python version on which the built-in algorithm was trained |
trainingOutput.trials[].builtInAlgorithmOutput.runtimeVersion | STRING | AI Platform runtime version on which the built-in algorithm was trained |
trainingOutput.trials[].builtInAlgorithmOutput.framework | STRING | Framework on which the built-in algorithm was trained |
trainingOutput.trials[].builtInAlgorithmOutput.modelPath | STRING | The Cloud Storage path to the |
trainingOutput.trials[].state | ENUMERATION | Output only. The detailed state of the trial |
trainingOutput.trials[].allMetrics[] | OBJECT | An observed value of a metric |
trainingOutput.hyperparameterMetricTag | STRING | The TensorFlow summary tag name used for optimizing hyperparameter tuning trials. See |
trainingOutput.completedTrialCount | INTEGER | The number of hyperparameter tuning trials that completed successfully. Only set for hyperparameter tuning jobs |
trainingOutput.isHyperparameterTuningJob | BOOLEAN | Whether this job is a hyperparameter tuning job |
trainingOutput.consumedMLUnits | NUMBER | The amount of ML units consumed by the job |
createTime | ANY | Output only. When the job was created |
labels | OBJECT | Optional. One or more labels that you can add to organize your jobs. Each label is a key-value pair, where both the key and the value are arbitrary strings that you supply. For more information, see the documentation on using labels |
labels.customKey.value | STRING | Optional. One or more labels that you can add to organize your jobs. Each label is a key-value pair, where both the key and the value are arbitrary strings that you supply. For more information, see the documentation on using labels |
predictionInput | OBJECT | Represents input parameters for a prediction job |
predictionInput.outputPath | STRING | Required. The output Google Cloud Storage location |
predictionInput.outputDataFormat | ENUMERATION | Optional. Format of the output data files, defaults to JSON |
predictionInput.dataFormat | ENUMERATION | Required. The format of the input data files |
predictionInput.batchSize | INTEGER | Optional. Number of records per batch, defaults to 64. The service will buffer batch_size number of records in memory before invoking one TensorFlow prediction call internally. So take the record size and memory available into consideration when setting this parameter |
predictionInput.runtimeVersion | STRING | Optional. The AI Platform runtime version to use for this batch prediction. If not set, AI Platform will pick the runtime version used during the CreateVersion request for this model version, or choose the latest stable version when model version information is not available such as when the model is specified by uri |
predictionInput.inputPaths[] | STRING | |
predictionInput.region | STRING | Required. The Google Compute Engine region to run the prediction job in. See the available regions for AI Platform services |
predictionInput.versionName | STRING | Use this field if you want to specify a version of the model to use. The string is formatted the same way as |
predictionInput.modelName | STRING | Use this field if you want to use the default version for the specified model. The string must use the following format: |
predictionInput.uri | STRING | Use this field if you want to specify a Google Cloud Storage path for the model to use |
predictionInput.maxWorkerCount | INTEGER | Optional. The maximum number of workers to be used for parallel processing. Defaults to 10 if not specified |
predictionInput.signatureName | STRING | Optional. The name of the signature defined in the SavedModel to use for this job. Please refer to SavedModel for information about how to use signatures. Defaults to DEFAULT_SERVING_SIGNATURE_DEF_KEY, which is "serving_default" |
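The `predictionInput` fields combine into a batch prediction request body. A minimal sketch follows; the model path, bucket names, and region are illustrative placeholders:

```python
# Sketch of a jobs.create request body for a batch prediction job.
# Model, bucket, and region values are illustrative placeholders.
prediction_job = {
    "jobId": "my_batch_prediction_001",
    "predictionInput": {
        # modelName uses the default version of the model; alternatively
        # set versionName or a Cloud Storage uri instead.
        "modelName": "projects/my-project/models/my_model",
        "dataFormat": "JSON",                       # format of the input files
        "inputPaths": ["gs://my-bucket/instances/*"],
        "outputPath": "gs://my-bucket/predictions", # required output location
        "region": "us-central1",
        "maxWorkerCount": 10,                       # the documented default
        "batchSize": 64,                            # the documented default
    },
}
```

Note that `modelName`, `versionName`, and `uri` are alternatives for selecting the model; only one of them should be set.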
errorMessage | STRING | Output only. The details of a failure or a cancellation |
etag | BINARY | |
trainingInput | OBJECT | Represents input parameters for a training job. When using the gcloud command to submit your training job, you can specify the input parameters as command-line arguments and/or in a YAML configuration file referenced from the --config command-line argument. For details, see the guide to submitting a training job |
trainingInput.workerCount | INTEGER | Optional. The number of worker replicas to use for the training job. Each replica in the cluster will be of the type specified in This value can only be used when The default value is zero |
trainingInput.masterType | STRING | Optional. Specifies the type of virtual machine to use for your training job's master worker. The following types are supported: <dl> <dt>standard</dt> <dd> A basic machine configuration suitable for training simple models with small to moderate datasets. </dd> <dt>large_model</dt> <dd> A machine with a lot of memory, specially suited for parameter servers when your model is large (having many hidden layers or layers with very large numbers of nodes). </dd> <dt>complex_model_s</dt> <dd> A machine suitable for the master and workers of the cluster when your model requires more computation than the standard machine can handle satisfactorily. </dd> <dt>complex_model_m</dt> <dd> A machine with roughly twice the number of cores and roughly double the memory of <i>complex_model_s</i>. </dd> <dt>complex_model_l</dt> <dd> A machine with roughly twice the number of cores and roughly double the memory of <i>complex_model_m</i>. </dd> <dt>standard_gpu</dt> <dd> A machine equivalent to <i>standard</i> that also includes a single NVIDIA Tesla K80 GPU. See more about <a href="/ml-engine/docs/tensorflow/using-gpus">using GPUs to train your model</a>. </dd> <dt>complex_model_m_gpu</dt> <dd> A machine equivalent to <i>complex_model_m</i> that also includes four NVIDIA Tesla K80 GPUs. </dd> <dt>complex_model_l_gpu</dt> <dd> A machine equivalent to <i>complex_model_l</i> that also includes eight NVIDIA Tesla K80 GPUs. </dd> <dt>standard_p100</dt> <dd> A machine equivalent to <i>standard</i> that also includes a single NVIDIA Tesla P100 GPU. </dd> <dt>complex_model_m_p100</dt> <dd> A machine equivalent to <i>complex_model_m</i> that also includes four NVIDIA Tesla P100 GPUs. </dd> <dt>standard_v100</dt> <dd> A machine equivalent to <i>standard</i> that also includes a single NVIDIA Tesla V100 GPU. </dd> <dt>large_model_v100</dt> <dd> A machine equivalent to <i>large_model</i> that also includes a single NVIDIA Tesla V100 GPU. </dd> <dt>complex_model_m_v100</dt> <dd> A machine equivalent to <i>complex_model_m</i> that also includes four NVIDIA Tesla V100 GPUs. </dd> <dt>complex_model_l_v100</dt> <dd> A machine equivalent to <i>complex_model_l</i> that also includes eight NVIDIA Tesla V100 GPUs. </dd> <dt>cloud_tpu</dt> <dd> A TPU VM including one Cloud TPU. See more about <a href="/ml-engine/docs/tensorflow/using-tpus">using TPUs to train your model</a>. </dd> </dl> You may also use certain Compute Engine machine types directly in this field. See more about using Compute Engine machine types. You must set this value when |
trainingInput.maxRunningTime | ANY | Optional. The maximum job running time. The default is 7 days |
trainingInput.runtimeVersion | STRING | Optional. The AI Platform runtime version to use for training. If not set, AI Platform uses the default stable version, 1.0. For more information, see the runtime version list and how to manage runtime versions |
trainingInput.pythonModule | STRING | Required. The Python module name to run after installing the packages |
trainingInput.args[] | STRING | |
trainingInput.region | STRING | Required. The Google Compute Engine region to run the training job in. See the available regions for AI Platform services |
trainingInput.workerType | STRING | Optional. Specifies the type of virtual machine to use for your training job's worker nodes. The supported values are the same as those described in the entry for This value must be consistent with the category of machine type that If you use This value must be present when |
trainingInput.parameterServerType | STRING | Optional. Specifies the type of virtual machine to use for your training job's parameter server. The supported values are the same as those described in the entry for This value must be consistent with the category of machine type that This value must be present when |
trainingInput.parameterServerConfig | OBJECT | Represents the configuration for a replica in a cluster |
trainingInput.parameterServerConfig.acceleratorConfig | OBJECT | Represents a hardware accelerator request config |
trainingInput.parameterServerConfig.acceleratorConfig.type | ENUMERATION | The type of accelerator to use |
trainingInput.parameterServerConfig.acceleratorConfig.count | INTEGER | The number of accelerators to attach to each machine running the job |
trainingInput.parameterServerConfig.imageUri | STRING | The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers |
trainingInput.masterConfig | OBJECT | Represents the configuration for a replica in a cluster |
trainingInput.masterConfig.acceleratorConfig | OBJECT | Represents a hardware accelerator request config |
trainingInput.masterConfig.acceleratorConfig.type | ENUMERATION | The type of accelerator to use |
trainingInput.masterConfig.acceleratorConfig.count | INTEGER | The number of accelerators to attach to each machine running the job |
trainingInput.masterConfig.imageUri | STRING | The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers |
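The `masterConfig`, `workerConfig`, and `parameterServerConfig` objects all share the same replica-config shape. A sketch of one such object follows; the accelerator type and image path are illustrative placeholders:

```python
# Sketch of a ReplicaConfig, as used for trainingInput.masterConfig,
# workerConfig, or parameterServerConfig. Values are illustrative.
master_config = {
    "acceleratorConfig": {
        "type": "NVIDIA_TESLA_K80",  # the accelerator type to attach
        "count": 1,                  # accelerators per machine in this role
    },
    # A custom container image hosted in Container Registry.
    "imageUri": "gcr.io/my-project/my-trainer-image:latest",
}
```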
trainingInput.scaleTier | ENUMERATION | Required. Specifies the machine types, the number of replicas for workers and parameter servers |
trainingInput.jobDir | STRING | Optional. A Google Cloud Storage path in which to store training outputs and other data needed for training. This path is passed to your TensorFlow program as the '--job-dir' command-line argument. The benefit of specifying this field is that Cloud ML validates the path for use in training |
trainingInput.hyperparameters | OBJECT | Represents a set of hyperparameters to optimize |
trainingInput.hyperparameters.algorithm | ENUMERATION | Optional. The search algorithm specified for the hyperparameter tuning job. Uses the default AI Platform hyperparameter tuning algorithm if unspecified |
trainingInput.hyperparameters.hyperparameterMetricTag | STRING | Optional. The TensorFlow summary tag name to use for optimizing trials. For current versions of TensorFlow, this tag name should exactly match what is shown in TensorBoard, including all scopes. For versions of TensorFlow prior to 0.12, this should be only the tag passed to tf.Summary. By default, "training/hptuning/metric" will be used |
trainingInput.hyperparameters.params[] | OBJECT | Represents a single hyperparameter to optimize |
trainingInput.hyperparameters.params[].minValue | NUMBER | Required if type is |
trainingInput.hyperparameters.params[].discreteValues[] | NUMBER | |
trainingInput.hyperparameters.params[].scaleType | ENUMERATION | Optional. How the parameter should be scaled to the hypercube. Leave unset for categorical parameters. Some kind of scaling is strongly recommended for real or integral parameters (e.g., |
trainingInput.hyperparameters.params[].maxValue | NUMBER | Required if type is |
trainingInput.hyperparameters.params[].type | ENUMERATION | Required. The type of the parameter |
trainingInput.hyperparameters.params[].categoricalValues[] | STRING | |
trainingInput.hyperparameters.params[].parameterName | STRING | Required. The parameter name must be unique amongst all ParameterConfigs in a HyperparameterSpec message. E.g., "learning_rate" |
trainingInput.hyperparameters.enableTrialEarlyStopping | BOOLEAN | Optional. Indicates if the hyperparameter tuning job enables auto trial early stopping |
trainingInput.hyperparameters.resumePreviousJobId | STRING | Optional. The prior hyperparameter tuning job id that users hope to continue with. The job id will be used to find the corresponding Vizier study guid and resume the study |
trainingInput.hyperparameters.maxParallelTrials | INTEGER | Optional. The number of training trials to run concurrently. You can reduce the time it takes to perform hyperparameter tuning by adding trials in parallel. However, each trial only benefits from the information gained in completed trials. That means that a trial does not get access to the results of trials running at the same time, which could reduce the quality of the overall optimization. Each trial will use the same scale tier and machine types. Defaults to one |
trainingInput.hyperparameters.maxFailedTrials | INTEGER | Optional. The number of failed trials that need to be seen before failing the hyperparameter tuning job. You can specify this field to override the default failing criteria for AI Platform hyperparameter tuning jobs. Defaults to zero, which means the service decides when a hyperparameter job should fail |
trainingInput.hyperparameters.goal | ENUMERATION | Required. The type of goal to use for tuning. Available types are Defaults to |
trainingInput.hyperparameters.maxTrials | INTEGER | Optional. How many training trials should be attempted to optimize the specified hyperparameters. Defaults to one |
trainingInput.pythonVersion | STRING | Optional. The version of Python used in training. If not set, the default version is '2.7'. Python '3.5' is available when |
trainingInput.workerConfig | OBJECT | Represents the configuration for a replica in a cluster |
trainingInput.workerConfig.acceleratorConfig | OBJECT | Represents a hardware accelerator request config |
trainingInput.workerConfig.acceleratorConfig.type | ENUMERATION | The type of accelerator to use |
trainingInput.workerConfig.acceleratorConfig.count | INTEGER | The number of accelerators to attach to each machine running the job |
trainingInput.workerConfig.imageUri | STRING | The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers |
trainingInput.parameterServerCount | INTEGER | Optional. The number of parameter server replicas to use for the training job. Each replica in the cluster will be of the type specified in This value can only be used when The default value is zero |
trainingInput.packageUris[] | STRING | |
state | ENUMERATION | Output only. The detailed state of a job |
jobId | STRING | Required. The user-specified id of the job |
endTime | ANY | Output only. When the job processing was completed |
startTime | ANY | Output only. When the job processing was started |
predictionOutput | OBJECT | Represents results of a prediction job |
predictionOutput.errorCount | INTEGER | The number of data instances which resulted in errors |
predictionOutput.outputPath | STRING | The output Google Cloud Storage location provided at the job creation time |
predictionOutput.nodeHours | NUMBER | Node hours used by the batch prediction job |
predictionOutput.predictionCount | INTEGER | The number of generated predictions |
= Parameter name
= Format
trainingOutput OBJECT Represents results of a training job. Output only |
trainingOutput.isBuiltInAlgorithmJob BOOLEAN Whether this job is a built-in Algorithm job |
trainingOutput.builtInAlgorithmOutput OBJECT Represents output related to a built-in algorithm Job |
trainingOutput.builtInAlgorithmOutput.pythonVersion STRING Python version on which the built-in algorithm was trained |
trainingOutput.builtInAlgorithmOutput.runtimeVersion STRING AI Platform runtime version on which the built-in algorithm was trained |
trainingOutput.builtInAlgorithmOutput.framework STRING Framework on which the built-in algorithm was trained |
trainingOutput.builtInAlgorithmOutput.modelPath STRING The Cloud Storage path to the |
trainingOutput.trials[] OBJECT Represents the result of a single hyperparameter tuning trial from a training job. The TrainingOutput object that is returned on successful completion of a training job with hyperparameter tuning includes a list of HyperparameterOutput objects, one for each successful trial |
trainingOutput.trials[].hyperparameters OBJECT The hyperparameters given to this trial |
trainingOutput.trials[].hyperparameters.customKey.value STRING The hyperparameters given to this trial |
trainingOutput.trials[].trialId STRING The trial id for these results |
trainingOutput.trials[].endTime ANY Output only. End time for the trial |
trainingOutput.trials[].isTrialStoppedEarly BOOLEAN True if the trial is stopped early |
trainingOutput.trials[].startTime ANY Output only. Start time for the trial |
trainingOutput.trials[].finalMetric OBJECT An observed value of a metric |
trainingOutput.trials[].finalMetric.trainingStep INTEGER The global training step for this metric |
trainingOutput.trials[].finalMetric.objectiveValue NUMBER The objective value at this training step |
trainingOutput.trials[].builtInAlgorithmOutput OBJECT Represents output related to a built-in algorithm Job |
trainingOutput.trials[].builtInAlgorithmOutput.pythonVersion STRING Python version on which the built-in algorithm was trained |
trainingOutput.trials[].builtInAlgorithmOutput.runtimeVersion STRING AI Platform runtime version on which the built-in algorithm was trained |
trainingOutput.trials[].builtInAlgorithmOutput.framework STRING Framework on which the built-in algorithm was trained |
trainingOutput.trials[].builtInAlgorithmOutput.modelPath STRING The Cloud Storage path to the |
trainingOutput.trials[].state ENUMERATION Output only. The detailed state of the trial |
trainingOutput.trials[].allMetrics[] OBJECT An observed value of a metric |
trainingOutput.hyperparameterMetricTag STRING The TensorFlow summary tag name used for optimizing hyperparameter tuning
trials. See
|
trainingOutput.completedTrialCount INTEGER The number of hyperparameter tuning trials that completed successfully. Only set for hyperparameter tuning jobs |
trainingOutput.isHyperparameterTuningJob BOOLEAN Whether this job is a hyperparameter tuning job |
trainingOutput.consumedMLUnits NUMBER The amount of ML units consumed by the job |
createTime ANY Output only. When the job was created |
labels OBJECT Optional. One or more labels that you can add, to organize your jobs. Each label is a key-value pair, where both the key and the value are arbitrary strings that you supply. For more information, see the documentation on using labels |
labels.customKey.value STRING Optional. One or more labels that you can add, to organize your jobs. Each label is a key-value pair, where both the key and the value are arbitrary strings that you supply. For more information, see the documentation on using labels |
predictionInput OBJECT Represents input parameters for a prediction job |
predictionInput.outputPath STRING Required. The output Google Cloud Storage location |
predictionInput.outputDataFormat ENUMERATION Optional. Format of the output data files, defaults to JSON |
predictionInput.dataFormat ENUMERATION Required. The format of the input data files |
predictionInput.batchSize INTEGER Optional. Number of records per batch, defaults to 64. The service will buffer batch_size number of records in memory before invoking one Tensorflow prediction call internally. So take the record size and memory available into consideration when setting this parameter |
predictionInput.runtimeVersion STRING Optional. The AI Platform runtime version to use for this batch prediction. If not set, AI Platform will pick the runtime version used during the CreateVersion request for this model version, or choose the latest stable version when model version information is not available such as when the model is specified by uri |
predictionInput.inputPaths[] STRING |
predictionInput.region STRING Required. The Google Compute Engine region to run the prediction job in. See the available regions for AI Platform services |
predictionInput.versionName STRING Use this field if you want to specify a version of the model to use. The
string is formatted the same way as
|
predictionInput.modelName STRING Use this field if you want to use the default version for the specified model. The string must use the following format:
|
predictionInput.uri STRING Use this field if you want to specify a Google Cloud Storage path for the model to use |
predictionInput.maxWorkerCount INTEGER Optional. The maximum number of workers to be used for parallel processing. Defaults to 10 if not specified |
predictionInput.signatureName STRING Optional. The name of the signature defined in the SavedModel to use for this job. Please refer to SavedModel for information about how to use signatures. Defaults to DEFAULT_SERVING_SIGNATURE_DEF_KEY , which is "serving_default" |
errorMessage STRING Output only. The details of a failure or a cancellation |
etag BINARY
|
trainingInput OBJECT Represents input parameters for a training job. When using the gcloud command to submit your training job, you can specify the input parameters as command-line arguments and/or in a YAML configuration file referenced from the --config command-line argument. For details, see the guide to submitting a training job |
trainingInput.workerCount INTEGER Optional. The number of worker replicas to use for the training job. Each
replica in the cluster will be of the type specified in This value can only be used when The default value is zero |
trainingInput.masterType STRING Optional. Specifies the type of virtual machine to use for your training job's master worker. The following types are supported: <dl> <dt>standard</dt> <dd> A basic machine configuration suitable for training simple models with small to moderate datasets. </dd> <dt>large_model</dt> <dd> A machine with a lot of memory, specially suited for parameter servers when your model is large (having many hidden layers or layers with very large numbers of nodes). </dd> <dt>complex_model_s</dt> <dd> A machine suitable for the master and workers of the cluster when your model requires more computation than the standard machine can handle satisfactorily. </dd> <dt>complex_model_m</dt> <dd> A machine with roughly twice the number of cores and roughly double the memory of <i>complex_model_s</i>. </dd> <dt>complex_model_l</dt> <dd> A machine with roughly twice the number of cores and roughly double the memory of <i>complex_model_m</i>. </dd> <dt>standard_gpu</dt> <dd> A machine equivalent to <i>standard</i> that also includes a single NVIDIA Tesla K80 GPU. See more about <a href="/ml-engine/docs/tensorflow/using-gpus">using GPUs to train your model</a>. </dd> <dt>complex_model_m_gpu</dt> <dd> A machine equivalent to <i>complex_model_m</i> that also includes four NVIDIA Tesla K80 GPUs. </dd> <dt>complex_model_l_gpu</dt> <dd> A machine equivalent to <i>complex_model_l</i> that also includes eight NVIDIA Tesla K80 GPUs. </dd> <dt>standard_p100</dt> <dd> A machine equivalent to <i>standard</i> that also includes a single NVIDIA Tesla P100 GPU. </dd> <dt>complex_model_m_p100</dt> <dd> A machine equivalent to <i>complex_model_m</i> that also includes four NVIDIA Tesla P100 GPUs. </dd> <dt>standard_v100</dt> <dd> A machine equivalent to <i>standard</i> that also includes a single NVIDIA Tesla V100 GPU. </dd> <dt>large_model_v100</dt> <dd> A machine equivalent to <i>large_model</i> that also includes a single NVIDIA Tesla V100 GPU. 
</dd> <dt>complex_model_m_v100</dt> <dd> A machine equivalent to <i>complex_model_m</i> that also includes four NVIDIA Tesla V100 GPUs. </dd> <dt>complex_model_l_v100</dt> <dd> A machine equivalent to <i>complex_model_l</i> that also includes eight NVIDIA Tesla V100 GPUs. </dd> <dt>cloud_tpu</dt> <dd> A TPU VM including one Cloud TPU. See more about <a href="/ml-engine/docs/tensorflow/using-tpus">using TPUs to train your model</a>. </dd> </dl>You may also use certain Compute Engine machine types directly in this field. The following types are supported:
See more about using Compute Engine machine types. You must set this value when |
trainingInput.maxRunningTime ANY Optional. The maximum job running time. The default is 7 days |
trainingInput.runtimeVersion STRING Optional. The AI Platform runtime version to use for training. If not set, AI Platform uses the default stable version, 1.0. For more information, see the runtime version list and how to manage runtime versions |
trainingInput.pythonModule STRING Required. The Python module name to run after installing the packages |
trainingInput.args[] STRING |
trainingInput.region STRING Required. The Google Compute Engine region to run the training job in. See the available regions for AI Platform services |
trainingInput.workerType STRING Optional. Specifies the type of virtual machine to use for your training job's worker nodes. The supported values are the same as those described in the entry for
This value must be consistent with the category of machine type that
If you use This value must be present when |
trainingInput.parameterServerType STRING Optional. Specifies the type of virtual machine to use for your training job's parameter server. The supported values are the same as those described in the entry for
This value must be consistent with the category of machine type that
This value must be present when |
trainingInput.parameterServerConfig OBJECT Represents the configuration for a replica in a cluster |
trainingInput.parameterServerConfig.acceleratorConfig OBJECT Represents a hardware accelerator request config |
trainingInput.parameterServerConfig.acceleratorConfig.type ENUMERATION The type of accelerator to use |
trainingInput.parameterServerConfig.acceleratorConfig.count INTEGER The number of accelerators to attach to each machine running the job |
trainingInput.parameterServerConfig.imageUri STRING The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers |
trainingInput.masterConfig OBJECT Represents the configuration for a replica in a cluster |
trainingInput.masterConfig.acceleratorConfig OBJECT Represents a hardware accelerator request config |
trainingInput.masterConfig.acceleratorConfig.type ENUMERATION The type of accelerator to use |
trainingInput.masterConfig.acceleratorConfig.count INTEGER The number of accelerators to attach to each machine running the job |
trainingInput.masterConfig.imageUri STRING The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers |
trainingInput.scaleTier ENUMERATION Required. Specifies the machine types, the number of replicas for workers and parameter servers |
trainingInput.jobDir STRING Optional. A Google Cloud Storage path in which to store training outputs and other data needed for training. This path is passed to your TensorFlow program as the '--job-dir' command-line argument. The benefit of specifying this field is that Cloud ML validates the path for use in training |
trainingInput.hyperparameters OBJECT Represents a set of hyperparameters to optimize |
trainingInput.hyperparameters.algorithm ENUMERATION Optional. The search algorithm specified for the hyperparameter tuning job. Uses the default AI Platform hyperparameter tuning algorithm if unspecified |
trainingInput.hyperparameters.hyperparameterMetricTag STRING Optional. The TensorFlow summary tag name to use for optimizing trials. For current versions of TensorFlow, this tag name should exactly match what is shown in TensorBoard, including all scopes. For versions of TensorFlow prior to 0.12, this should be only the tag passed to tf.Summary. By default, "training/hptuning/metric" will be used |
trainingInput.hyperparameters.params[] OBJECT Represents a single hyperparameter to optimize |
trainingInput.hyperparameters.params[].minValue NUMBER Required if type is DOUBLE or INTEGER. This field should be unset if type is CATEGORICAL |
trainingInput.hyperparameters.params[].discreteValues[] NUMBER Required if type is DISCRETE. A list of feasible points. The list should be in strictly increasing order. For instance, this parameter might have possible settings of 1.5, 2.5, and 4.0. This list should not contain more than 1,000 values |
trainingInput.hyperparameters.params[].scaleType ENUMERATION Optional. How the parameter should be scaled to the hypercube. Leave unset for categorical parameters. Some kind of scaling is strongly recommended for real or integral parameters (e.g., UNIT_LINEAR_SCALE) |
trainingInput.hyperparameters.params[].maxValue NUMBER Required if type is DOUBLE or INTEGER. This field should be unset if type is CATEGORICAL |
trainingInput.hyperparameters.params[].type ENUMERATION Required. The type of the parameter |
trainingInput.hyperparameters.params[].categoricalValues[] STRING Required if type is CATEGORICAL. The list of possible categories |
trainingInput.hyperparameters.params[].parameterName STRING Required. The parameter name must be unique amongst all ParameterConfigs in a HyperparameterSpec message. E.g., "learning_rate" |
trainingInput.hyperparameters.enableTrialEarlyStopping BOOLEAN Optional. Indicates if the hyperparameter tuning job enables auto trial early stopping |
trainingInput.hyperparameters.resumePreviousJobId STRING Optional. The prior hyperparameter tuning job id that users hope to continue with. The job id will be used to find the corresponding vizier study guid and resume the study |
trainingInput.hyperparameters.maxParallelTrials INTEGER Optional. The number of training trials to run concurrently. You can reduce the time it takes to perform hyperparameter tuning by adding trials in parallel. However, each trial only benefits from the information gained in completed trials. That means that a trial does not get access to the results of trials running at the same time, which could reduce the quality of the overall optimization. Each trial will use the same scale tier and machine types. Defaults to one |
trainingInput.hyperparameters.maxFailedTrials INTEGER Optional. The number of failed trials that need to be seen before failing the hyperparameter tuning job. You can specify this field to override the default failing criteria for AI Platform hyperparameter tuning jobs. Defaults to zero, which means the service decides when a hyperparameter job should fail |
trainingInput.hyperparameters.goal ENUMERATION Required. The type of goal to use for tuning. Available types are MAXIMIZE and MINIMIZE. Defaults to MAXIMIZE |
trainingInput.hyperparameters.maxTrials INTEGER Optional. How many training trials should be attempted to optimize the specified hyperparameters. Defaults to one |
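The hyperparameter fields above combine into a single trainingInput.hyperparameters object. Below is a minimal sketch tuning one DOUBLE parameter; the parameter name, bounds, and trial counts are illustrative choices, not API requirements.

```python
# Sketch of a trainingInput.hyperparameters block tuning a single
# DOUBLE parameter on a log scale. Names and bounds are placeholders.
hyperparameters = {
    "goal": "MAXIMIZE",
    "hyperparameterMetricTag": "training/hptuning/metric",  # the documented default tag
    "maxTrials": 20,
    "maxParallelTrials": 2,   # parallel trials cannot see each other's results
    "enableTrialEarlyStopping": True,
    "params": [
        {
            "parameterName": "learning_rate",
            "type": "DOUBLE",
            "minValue": 0.0001,
            "maxValue": 0.1,
            "scaleType": "UNIT_LOG_SCALE",  # scaling is recommended for DOUBLE/INTEGER params
        }
    ],
}
```

Keeping maxParallelTrials well below maxTrials preserves most of the benefit of sequential tuning, since each new trial can draw on more completed results.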
trainingInput.pythonVersion STRING Optional. The version of Python used in training. If not set, the default version is '2.7'. Python '3.5' is available when runtimeVersion is set to '1.4' and above |
trainingInput.workerConfig OBJECT Represents the configuration for a replica in a cluster |
trainingInput.workerConfig.acceleratorConfig OBJECT Represents a hardware accelerator request config |
trainingInput.workerConfig.acceleratorConfig.type ENUMERATION The type of accelerator to use |
trainingInput.workerConfig.acceleratorConfig.count INTEGER The number of accelerators to attach to each machine running the job |
trainingInput.workerConfig.imageUri STRING The Docker image to run on the replica. This image must be in Container Registry. Learn more about configuring custom containers |
trainingInput.parameterServerCount INTEGER Optional. The number of parameter server replicas to use for the training job. Each replica in the cluster will be of the type specified in parameterServerType. This value can only be used when scaleTier is set to CUSTOM. The default value is zero |
trainingInput.packageUris[] STRING Required. The Google Cloud Storage location of the packages with the training program and any additional dependencies. The maximum number of package URIs is 100 |
state ENUMERATION Output only. The detailed state of a job |
jobId STRING Required. The user-specified id of the job |
endTime ANY Output only. When the job processing was completed |
startTime ANY Output only. When the job processing was started |
predictionOutput OBJECT Represents results of a prediction job |
predictionOutput.errorCount INTEGER The number of data instances which resulted in errors |
predictionOutput.outputPath STRING The output Google Cloud Storage location provided at the job creation time |
predictionOutput.nodeHours NUMBER Node hours used by the batch prediction job |
predictionOutput.predictionCount INTEGER The number of generated predictions |
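The state, endTime, and predictionOutput fields above are output-only, so a client typically polls the job until its state is terminal and then reads the results. Below is a hedged sketch: the terminal-state set follows the documented job lifecycle, the client construction assumes the google-api-python-client library, and the project and job ids are placeholders.

```python
import time

# States after which the job's detailed state will no longer change.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def is_terminal(state):
    """Return True once a job has finished, failed, or been cancelled."""
    return state in TERMINAL_STATES


def wait_for_job(project, job_id, poll_seconds=30):
    """Poll a job until it reaches a terminal state, then return it.

    Assumes the google-api-python-client library is installed and
    credentials are available in the environment.
    """
    from googleapiclient import discovery  # assumed dependency
    ml = discovery.build("ml", "v1")
    name = "projects/{}/jobs/{}".format(project, job_id)
    while True:
        job = ml.projects().jobs().get(name=name).execute()
        if is_terminal(job["state"]):
            return job
        time.sleep(poll_seconds)

# For a finished batch prediction job, the summary lives under
# job["predictionOutput"]: errorCount, predictionCount, nodeHours, outputPath.
```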