Deidentify
De-identifies potentially sensitive info from a ContentItem. This method has limits on input size and output size. See https://cloud.google.com/dlp/docs/deidentify-sensitive-data to learn more.
When no InfoTypes or CustomInfoTypes are specified in this request, the system automatically chooses which detectors to run. By default this may be all types, but the set may change over time as detectors are updated.
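As a rough illustration of the behaviour described above, the following Python sketch assembles the input parameters for a call and leaves out inspectConfig.infoTypes when the caller names none, so that detector selection is left to the service. The helper name build_deidentify_request is hypothetical, and the transformation body (primitiveTransformation with replaceWithInfoTypeConfig) is an assumption taken from the Cloud DLP API, not from this page. The name check uses the pattern [a-zA-Z0-9_]{1,64} given in the parameter reference.

```python
import json
import re

# InfoType name pattern as stated in the parameter reference.
INFO_TYPE_NAME = re.compile(r"[a-zA-Z0-9_]{1,64}")


def build_deidentify_request(parent, text, info_type_names=None):
    """Assemble input parameters for a de-identify call (hypothetical helper)."""
    request = {
        "parent": parent,
        "item": {"value": text},
        "deidentifyConfig": {
            "infoTypeTransformations": {
                "transformations": [
                    # Assumption: replace each finding with its infoType name.
                    {"primitiveTransformation": {"replaceWithInfoTypeConfig": {}}}
                ]
            }
        },
    }
    if info_type_names:
        for name in info_type_names:
            if not INFO_TYPE_NAME.fullmatch(name):
                raise ValueError(f"invalid infoType name: {name!r}")
        request["inspectConfig"] = {
            "infoTypes": [{"name": name} for name in info_type_names]
        }
    # With no names given, inspectConfig.infoTypes is omitted entirely and
    # the service chooses which detectors to run.
    return request


if __name__ == "__main__":
    example = build_deidentify_request(
        "projects/my-project-id", "Call me at 555-0100", ["PHONE_NUMBER"]
    )
    print(json.dumps(example, indent=2))
```

Pinning the infoTypes explicitly keeps results stable when the default detector set changes; omitting them trades that stability for broader coverage.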
Authorization
To use this building block you will have to grant access to at least one of the following scopes:
- View and manage your data across Google Cloud Platform services
Input
This building block consumes 47 input parameters
Name | Format | Description |
---|---|---|
parent (Required) | STRING | The parent resource name, for example projects/my-project-id |
deidentifyTemplateName | STRING | Optional template to use. Any configuration directly specified in deidentify_config will override those set in the template. Singular fields that are set in this request will replace their corresponding fields in the template. Repeated fields are appended. Singular sub-messages and groups are recursively merged |
deidentifyConfig | OBJECT | The configuration that controls how the data will change |
deidentifyConfig.recordTransformations | OBJECT | A type of transformation that is applied over structured data such as a table |
deidentifyConfig.recordTransformations.recordSuppressions[] | OBJECT | Configuration to suppress records whose suppression conditions evaluate to true |
deidentifyConfig.recordTransformations.fieldTransformations[] | OBJECT | The transformation to apply to the field |
deidentifyConfig.infoTypeTransformations | OBJECT | A type of transformation that will scan unstructured text and apply various |
deidentifyConfig.infoTypeTransformations.transformations[] | OBJECT | A transformation to apply to text that is identified as a specific info_type |
inspectConfig | OBJECT | Configuration description of the scanning process. When used with redactContent, only info_types and min_likelihood are currently used |
inspectConfig.contentOptions[] | ENUMERATION | |
inspectConfig.limits | OBJECT | |
inspectConfig.limits.maxFindingsPerInfoType[] | OBJECT | Max findings configuration per infoType, per content item or long running DlpJob |
inspectConfig.limits.maxFindingsPerInfoType[].maxFindings | INTEGER | Max findings limit for the given infoType |
inspectConfig.limits.maxFindingsPerItem | INTEGER | Max number of findings that will be returned for each item scanned. When set within |
inspectConfig.limits.maxFindingsPerRequest | INTEGER | Max number of findings that will be returned per request/job. When set within |
inspectConfig.excludeInfoTypes | BOOLEAN | When true, excludes type information of the findings |
inspectConfig.minLikelihood | ENUMERATION | Only returns findings equal to or above this threshold. The default is POSSIBLE. See https://cloud.google.com/dlp/docs/likelihood to learn more |
inspectConfig.ruleSet[] | OBJECT | Rule set for modifying a set of infoTypes to alter behavior under certain circumstances, depending on the specific details of the rules within the set |
inspectConfig.ruleSet[].rules[] | OBJECT | A single inspection rule to be applied to infoTypes, specified in |
inspectConfig.ruleSet[].infoTypes[] | OBJECT | Type of information detected by the API |
inspectConfig.infoTypes[] | OBJECT | Type of information detected by the API |
inspectConfig.infoTypes[].name | STRING | Name of the information type. Either a name of your choosing when creating a CustomInfoType, or one of the names listed at https://cloud.google.com/dlp/docs/infotypes-reference when specifying a built-in type. InfoType names should conform to the pattern [a-zA-Z0-9_]{1,64} |
inspectConfig.includeQuote | BOOLEAN | When true, a contextual quote from the data that triggered a finding is included in the response; see Finding.quote |
inspectConfig.customInfoTypes[] | OBJECT | Custom information type provided by the user. Used to find domain-specific sensitive information configurable to the data in question |
inspectConfig.customInfoTypes[].storedType | OBJECT | A reference to a StoredInfoType to use with scanning |
inspectConfig.customInfoTypes[].storedType.name | STRING | Resource name of the requested |
inspectConfig.customInfoTypes[].storedType.createTime | ANY | Timestamp indicating when the version of the |
inspectConfig.customInfoTypes[].infoType | OBJECT | Type of information detected by the API |
inspectConfig.customInfoTypes[].infoType.name | STRING | Name of the information type. Either a name of your choosing when creating a CustomInfoType, or one of the names listed at https://cloud.google.com/dlp/docs/infotypes-reference when specifying a built-in type. InfoType names should conform to the pattern [a-zA-Z0-9_]{1,64} |
inspectConfig.customInfoTypes[].dictionary | OBJECT | Custom information type based on a dictionary of words or phrases. This can be used to match sensitive information specific to the data, such as a list of employee IDs or job titles. Dictionary words are case-insensitive, and all characters other than letters and digits in the Unicode Basic Multilingual Plane will be replaced with whitespace when scanning for matches, so the dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters surrounding any match must be of a different type than the adjacent characters within the word, so letters must be next to non-letters and digits next to non-digits. For example, the dictionary word "jen" will match the first three letters of the text "jen123" but will return no matches for "jennifer". Dictionary words containing a large number of characters that are not letters or digits may result in unexpected findings because such characters are treated as whitespace. The limits page contains details about the size limits of dictionaries. For dictionaries that do not fit within these constraints, consider using |
inspectConfig.customInfoTypes[].regex | OBJECT | Message defining a custom regular expression |
inspectConfig.customInfoTypes[].regex.groupIndexes[] | INTEGER | |
inspectConfig.customInfoTypes[].regex.pattern | STRING | Pattern defining the regular expression. Its syntax (https://github.com/google/re2/wiki/Syntax) can be found under the google/re2 repository on GitHub |
inspectConfig.customInfoTypes[].surrogateType | OBJECT | Message for detecting output from deidentification transformations such as |
inspectConfig.customInfoTypes[].likelihood | ENUMERATION | Likelihood to return for this CustomInfoType. This base value can be altered by a detection rule if the finding meets the criteria specified by the rule. Defaults to |
inspectConfig.customInfoTypes[].exclusionType | ENUMERATION | If set to EXCLUSION_TYPE_EXCLUDE, this infoType will not cause a finding to be returned. It can still be used for rules matching |
inspectConfig.customInfoTypes[].detectionRules[] | OBJECT | Deprecated; use |
inspectTemplateName | STRING | Optional template to use. Any configuration directly specified in inspect_config will override those set in the template. Singular fields that are set in this request will replace their corresponding fields in the template. Repeated fields are appended. Singular sub-messages and groups are recursively merged |
item | OBJECT | Container structure for the content to inspect |
item.value | STRING | String data to inspect or redact |
item.byteItem | OBJECT | Container for bytes to inspect or redact |
item.byteItem.data | BINARY | Content data to inspect or redact |
item.byteItem.type | ENUMERATION | The type of data stored in the bytes string. Default will be TEXT_UTF8 |
item.table | OBJECT | Structured content to inspect. Up to 50,000 |
item.table.rows[] | OBJECT | |
item.table.headers[] | OBJECT | General identifier of a data field in a storage service |
item.table.headers[].name | STRING | Name describing the field |
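The matching rules given for inspectConfig.customInfoTypes[].dictionary can be approximated locally to check whether a phrase is likely to produce a finding. The Python sketch below is only an illustration of the documented rules, not the service's implementation: the function names are hypothetical, it does not restrict itself to the Basic Multilingual Plane, and it collapses runs of whitespace, which is an assumption made so that "Sam, Johnson" normalises to the same text as "sam johnson".

```python
def _kind(ch):
    """Classify a character as a letter, a digit, or anything else."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


def _normalize(text):
    """Lower-case, turn non-letter/non-digit characters into spaces,
    and collapse runs of whitespace (the collapsing is an assumption)."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return " ".join(cleaned.split())


def dictionary_matches(phrase, text):
    """Return True if the dictionary phrase would match somewhere in text
    under the rules described in the parameter reference."""
    needle = _normalize(phrase)
    haystack = _normalize(text)
    if not needle:
        return False
    start = haystack.find(needle)
    while start != -1:
        end = start + len(needle)
        # Neighbours must differ in type from the adjacent characters
        # inside the match: letters next to non-letters, digits next to
        # non-digits.
        before_ok = start == 0 or _kind(haystack[start - 1]) != _kind(needle[0])
        after_ok = end == len(haystack) or _kind(haystack[end]) != _kind(needle[-1])
        if before_ok and after_ok:
            return True
        start = haystack.find(needle, start + 1)
    return False


if __name__ == "__main__":
    for sample in ("sam johnson", "Sam, Johnson", "Sam (Johnson)"):
        print(sample, dictionary_matches("Sam Johnson", sample))
    print("jen123", dictionary_matches("jen", "jen123"))
    print("jennifer", dictionary_matches("jen", "jennifer"))
```

A check like this can help spot dictionary entries with many punctuation characters, which the reference warns may lead to unexpected findings.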
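The merge rules stated for deidentifyTemplateName and inspectTemplateName (singular fields replace, repeated fields are appended, singular sub-messages are merged recursively) can be sketched on plain Python dictionaries. The service performs the actual merge; the function merge_config and the sample values below are hypothetical and serve only to make the three rules concrete.

```python
import copy


def merge_config(template, request):
    """Merge a request-level config over a template config.

    - dict values (singular sub-messages) are merged recursively
    - list values (repeated fields) are appended to the template's list
    - anything else (singular fields) replaces the template's value
    """
    merged = copy.deepcopy(template)
    for key, value in request.items():
        existing = merged.get(key)
        if isinstance(value, dict) and isinstance(existing, dict):
            merged[key] = merge_config(existing, value)
        elif isinstance(value, list) and isinstance(existing, list):
            merged[key] = existing + copy.deepcopy(value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    # Hypothetical template and request-level inspect configurations.
    template = {
        "minLikelihood": "POSSIBLE",
        "infoTypes": [{"name": "EMAIL_ADDRESS"}],
        "limits": {"maxFindingsPerItem": 10},
    }
    request = {
        "minLikelihood": "LIKELY",
        "infoTypes": [{"name": "PHONE_NUMBER"}],
        "limits": {"maxFindingsPerRequest": 100},
    }
    print(merge_config(template, request))
```

Note that, under these rules, a request cannot remove an infoType that the template already lists; it can only add to the list.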
Output
This building block provides 21 output parameters
Name | Format | Description |
---|---|---|
item | OBJECT | Container structure for the content to inspect |
item.value | STRING | String data to inspect or redact |
item.byteItem | OBJECT | Container for bytes to inspect or redact |
item.byteItem.data | BINARY | Content data to inspect or redact |
item.byteItem.type | ENUMERATION | The type of data stored in the bytes string. Default will be TEXT_UTF8 |
item.table | OBJECT | Structured content to inspect. Up to 50,000 |
item.table.rows[] | OBJECT | |
item.table.headers[] | OBJECT | General identifier of a data field in a storage service |
item.table.headers[].name | STRING | Name describing the field |
overview | OBJECT | Overview of the modifications that occurred |
overview.transformedBytes | INTEGER | Total size in bytes that were transformed in some way |
overview.transformationSummaries[] | OBJECT | Summary of a single transformation. Only one of 'transformation', 'field_transformation', or 'record_suppress' will be set |
overview.transformationSummaries[].results[] | OBJECT | A collection that informs the user the number of times a particular |
overview.transformationSummaries[].field | OBJECT | General identifier of a data field in a storage service |
overview.transformationSummaries[].field.name | STRING | Name describing the field |
overview.transformationSummaries[].fieldTransformations[] | OBJECT | The transformation to apply to the field |
overview.transformationSummaries[].transformedBytes | INTEGER | Total size in bytes that were transformed in some way |
overview.transformationSummaries[].recordSuppress | OBJECT | Configuration to suppress records whose suppression conditions evaluate to true |
overview.transformationSummaries[].infoType | OBJECT | Type of information detected by the API |
overview.transformationSummaries[].infoType.name | STRING | Name of the information type. Either a name of your choosing when creating a CustomInfoType, or one of the names listed at https://cloud.google.com/dlp/docs/infotypes-reference when specifying a built-in type. InfoType names should conform to the pattern [a-zA-Z0-9_]{1,64} |
overview.transformationSummaries[].transformation | OBJECT | A rule for transforming a value |
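To show how the overview output might be consumed, the Python sketch below totals overview.transformationSummaries[].transformedBytes per infoType name. The function name and the sample response are hypothetical. The int() conversion is an assumption: it guards against 64-bit integers arriving as JSON strings.

```python
from collections import defaultdict


def summarize_overview(response):
    """Return (total transformed bytes, bytes transformed per infoType)."""
    overview = response.get("overview", {})
    per_type = defaultdict(int)
    for summary in overview.get("transformationSummaries", []):
        # Summaries for record or field transformations may carry no infoType.
        name = summary.get("infoType", {}).get("name", "(no infoType)")
        per_type[name] += int(summary.get("transformedBytes", 0))
    return int(overview.get("transformedBytes", 0)), dict(per_type)


if __name__ == "__main__":
    # Hypothetical response, shaped after the output parameters listed above.
    sample = {
        "item": {"value": "Call me at [PHONE_NUMBER]"},
        "overview": {
            "transformedBytes": "8",
            "transformationSummaries": [
                {"infoType": {"name": "PHONE_NUMBER"}, "transformedBytes": "8"}
            ],
        },
    }
    print(summarize_overview(sample))
```

A total of zero transformed bytes is a quick signal that nothing in the item matched the configured detectors.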