Inspect

Finds potentially sensitive info in content


This method has limits on input size, processing time, and output size.

When no InfoTypes or CustomInfoTypes are specified in this request, the system automatically chooses which detectors to run. By default this may be all types, but the selection may change over time as detectors are updated.

For how-to guides, see https://cloud.google.com/dlp/docs/inspecting-images and https://cloud.google.com/dlp/docs/inspecting-text.
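
As an illustration of how the input parameters listed below fit together, here is a minimal sketch that assembles them into a plain dictionary. The helper name build_inspect_request, the sample text, and the project ID are hypothetical; EMAIL_ADDRESS is assumed to be one of the built-in names from the infoTypes reference. The sketch only builds the parameter set and does not call the service.

```python
import json

def build_inspect_request(parent, text, info_type_names, min_likelihood="POSSIBLE"):
    """Assemble the documented input parameters into one dict (hypothetical helper)."""
    return {
        "parent": parent,                       # e.g. projects/my-project-id
        "item": {"value": text},                # string data to inspect
        "inspectConfig": {
            "infoTypes": [{"name": name} for name in info_type_names],
            "minLikelihood": min_likelihood,    # documented default is POSSIBLE
            "includeQuote": True,               # ask for Finding.quote in the output
        },
    }

request = build_inspect_request(
    "projects/my-project-id",
    "Contact: jane@example.com",
    ["EMAIL_ADDRESS"],
)
print(json.dumps(request, indent=2))
```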

Authorization

To use this building block you will have to grant access to at least one of the following scopes:

  • View and manage your data across Google Cloud Platform services

Input

This building block consumes 40 input parameters

Each parameter is listed by name, followed by its format.

parent STRING Required

The parent resource name, for example projects/my-project-id

item OBJECT

Container structure for the content to inspect

item.value STRING

String data to inspect or redact

item.byteItem OBJECT

Container for bytes to inspect or redact

item.byteItem.data BINARY

Content data to inspect or redact

item.byteItem.type ENUMERATION

The type of data stored in the bytes string. Defaults to TEXT_UTF8

item.table OBJECT

Structured content to inspect. Up to 50,000 Values per request allowed. See https://cloud.google.com/dlp/docs/inspecting-text#inspecting_a_table to learn more

item.table.rows[] OBJECT

item.table.headers[] OBJECT

General identifier of a data field in a storage service

item.table.headers[].name STRING

Name describing the field
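
A minimal sketch of building a table item and checking it against the 50,000-value limit before sending. The helper name build_table_item is hypothetical. This page does not describe the shape of each entry in item.table.rows[]; the values/stringValue layout used here is an assumption based on the DLP Table message and should be checked against the API reference.

```python
MAX_VALUES_PER_REQUEST = 50_000  # limit stated in the item.table description

def build_table_item(header_names, rows):
    """Build an `item` holding a table; every cell is sent as a string value."""
    width = len(header_names)
    if any(len(row) != width for row in rows):
        raise ValueError("every row needs exactly one value per header")
    if width * len(rows) > MAX_VALUES_PER_REQUEST:
        raise ValueError("table exceeds 50,000 values per request")
    return {
        "table": {
            "headers": [{"name": name} for name in header_names],
            # Assumption: each row carries a `values` list of `stringValue` objects.
            "rows": [
                {"values": [{"stringValue": str(cell)} for cell in row]}
                for row in rows
            ],
        }
    }

item = build_table_item(["name", "email"], [["Ann", "ann@example.com"]])
```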

inspectConfig OBJECT

Configuration description of the scanning process. When used with redactContent, only info_types and min_likelihood are currently used

inspectConfig.contentOptions[] ENUMERATION

inspectConfig.limits OBJECT

inspectConfig.limits.maxFindingsPerInfoType[] OBJECT

Max findings configuration per infoType, per content item or long running DlpJob

inspectConfig.limits.maxFindingsPerInfoType[].maxFindings INTEGER

Max findings limit for the given infoType

inspectConfig.limits.maxFindingsPerItem INTEGER

Max number of findings that will be returned for each item scanned. When set within InspectDataSourceRequest, the maximum returned is 2000 even if this field is set higher. When set within InspectContentRequest, this field is ignored

inspectConfig.limits.maxFindingsPerRequest INTEGER

Max number of findings that will be returned per request/job. When set within InspectContentRequest, the maximum returned is 2000 even if this field is set higher
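
The cap described for maxFindingsPerRequest can be sketched as a small client-side helper that predicts how many findings a content request can return at most. The function name is hypothetical, and treating an unset or zero value as "no explicit limit" is an assumption, since this page does not say how an unset field behaves.

```python
CONTENT_REQUEST_CAP = 2000  # maximum stated for InspectContentRequest

def effective_request_cap(max_findings_per_request=None):
    """Return the largest number of findings a content request can return."""
    if not max_findings_per_request:
        # Assumption: unset or 0 means no explicit limit, so only the cap applies.
        return CONTENT_REQUEST_CAP
    return min(max_findings_per_request, CONTENT_REQUEST_CAP)

limits = {"maxFindingsPerRequest": 5000}
print(effective_request_cap(limits["maxFindingsPerRequest"]))
```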

inspectConfig.excludeInfoTypes BOOLEAN

When true, excludes type information of the findings

inspectConfig.minLikelihood ENUMERATION

Only returns findings equal to or above this threshold. The default is POSSIBLE. See https://cloud.google.com/dlp/docs/likelihood to learn more
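
To make the threshold concrete, the sketch below applies the same "equal to or above" comparison on the client side. The ordering of the likelihood values is taken from the likelihood documentation linked above rather than from this page, and the function name is hypothetical.

```python
# Likelihood values ordered from least to most likely (per the linked likelihood docs).
LIKELIHOOD_ORDER = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]

def meets_threshold(likelihood, min_likelihood="POSSIBLE"):
    """True if `likelihood` is equal to or above `min_likelihood`."""
    return LIKELIHOOD_ORDER.index(likelihood) >= LIKELIHOOD_ORDER.index(min_likelihood)

findings = [
    {"infoType": {"name": "EMAIL_ADDRESS"}, "likelihood": "LIKELY"},
    {"infoType": {"name": "PHONE_NUMBER"}, "likelihood": "UNLIKELY"},
]
kept = [f for f in findings if meets_threshold(f["likelihood"])]
```

With the default threshold, only the first sample finding is kept.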

inspectConfig.ruleSet[] OBJECT

Rule set for modifying a set of infoTypes to alter behavior under certain circumstances, depending on the specific details of the rules within the set

inspectConfig.ruleSet[].rules[] OBJECT

A single inspection rule to be applied to infoTypes, specified in InspectionRuleSet

inspectConfig.ruleSet[].infoTypes[] OBJECT

Type of information detected by the API

inspectConfig.infoTypes[] OBJECT

Type of information detected by the API

inspectConfig.infoTypes[].name STRING

Name of the information type. Either a name of your choosing when creating a CustomInfoType, or one of the names listed at https://cloud.google.com/dlp/docs/infotypes-reference when specifying a built-in type. InfoType names should conform to the pattern [a-zA-Z0-9_]{1,64}
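
The naming pattern can be checked before a request is sent. This is a small sketch using the pattern exactly as stated above; the function name is hypothetical.

```python
import re

# Pattern quoted from the infoType name description.
INFO_TYPE_NAME = re.compile(r"[a-zA-Z0-9_]{1,64}")

def is_valid_info_type_name(name):
    """True if the whole name conforms to [a-zA-Z0-9_]{1,64}."""
    return INFO_TYPE_NAME.fullmatch(name) is not None

print(is_valid_info_type_name("EMPLOYEE_ID"))
print(is_valid_info_type_name("employee-id"))
```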

inspectConfig.includeQuote BOOLEAN

When true, a contextual quote from the data that triggered a finding is included in the response; see Finding.quote

inspectConfig.customInfoTypes[] OBJECT

Custom information type provided by the user. Used to find domain-specific sensitive information configurable to the data in question

inspectConfig.customInfoTypes[].storedType OBJECT

A reference to a StoredInfoType to use with scanning

inspectConfig.customInfoTypes[].storedType.name STRING

Resource name of the requested StoredInfoType, for example organizations/433245324/storedInfoTypes/432452342 or projects/project-id/storedInfoTypes/432452342

inspectConfig.customInfoTypes[].storedType.createTime ANY

Timestamp indicating when the version of the StoredInfoType used for inspection was created. Output-only field, populated by the system

inspectConfig.customInfoTypes[].infoType OBJECT

Type of information detected by the API

inspectConfig.customInfoTypes[].infoType.name STRING

Name of the information type. Either a name of your choosing when creating a CustomInfoType, or one of the names listed at https://cloud.google.com/dlp/docs/infotypes-reference when specifying a built-in type. InfoType names should conform to the pattern [a-zA-Z0-9_]{1,64}

inspectConfig.customInfoTypes[].dictionary OBJECT

Custom information type based on a dictionary of words or phrases. This can be used to match sensitive information specific to the data, such as a list of employee IDs or job titles.

Dictionary words are case-insensitive, and all characters other than letters and digits in the Unicode Basic Multilingual Plane are replaced with whitespace when scanning for matches, so the dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters surrounding any match must be of a different type than the adjacent characters within the word, so letters must be next to non-letters and digits next to non-digits. For example, the dictionary word "jen" will match the first three letters of the text "jen123" but will return no matches for "jennifer".

Dictionary words containing a large number of characters that are not letters or digits may result in unexpected findings because such characters are treated as whitespace. The limits page contains details about the size limits of dictionaries. For dictionaries that do not fit within these constraints, consider using LargeCustomDictionaryConfig in the StoredInfoType API
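
The matching rules above can be approximated locally to predict what a dictionary entry will and will not match. This is a rough emulation written for illustration, not the service's implementation: it uses Python's own notion of letters and digits, collapses runs of whitespace, and all function names are hypothetical.

```python
def normalize(s):
    """Lowercase, turn every non-letter, non-digit into whitespace, collapse runs."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in s.lower())
    return " ".join(cleaned.split())

def same_kind(a, b):
    """True if both characters are letters or both are digits."""
    return (a.isalpha() and b.isalpha()) or (a.isdigit() and b.isdigit())

def dictionary_matches(phrase, text):
    """Approximate the documented dictionary matching rules."""
    p, t = normalize(phrase), normalize(text)
    if not p:
        return False
    start = t.find(p)
    while start != -1:
        end = start + len(p)
        # Characters surrounding the match must differ in kind from the
        # adjacent characters inside the match.
        before_ok = start == 0 or not same_kind(t[start - 1], p[0])
        after_ok = end == len(t) or not same_kind(t[end], p[-1])
        if before_ok and after_ok:
            return True
        start = t.find(p, start + 1)
    return False

print(dictionary_matches("Sam Johnson", "Sam (Johnson)"))
print(dictionary_matches("jen", "jennifer"))
```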

inspectConfig.customInfoTypes[].regex OBJECT

Message defining a custom regular expression

inspectConfig.customInfoTypes[].regex.groupIndexes[] INTEGER

inspectConfig.customInfoTypes[].regex.pattern STRING

Pattern defining the regular expression. Its syntax (https://github.com/google/re2/wiki/Syntax) can be found under the google/re2 repository on GitHub
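
A minimal sketch of a regex-based customInfoTypes entry. The infoType name EMPLOYEE_ID, the pattern, and the helper name are hypothetical. The local re.compile call is only a rough syntax check: Python's re module is not RE2, so a pattern that compiles here is not guaranteed to be valid RE2, and the reverse also holds.

```python
import re

def regex_info_type(name, pattern):
    """Build a customInfoTypes[] entry that detects `pattern` (hypothetical helper)."""
    re.compile(pattern)  # rough local check only; the service uses RE2 syntax
    return {
        "infoType": {"name": name},
        "regex": {"pattern": pattern},
    }

employee_id = regex_info_type("EMPLOYEE_ID", r"E-[0-9]{6}")
inspect_config = {"customInfoTypes": [employee_id]}
```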

inspectConfig.customInfoTypes[].surrogateType OBJECT

Message for detecting output from deidentification transformations such as CryptoReplaceFfxFpeConfig. These types of transformations are those that perform pseudonymization, thereby producing a "surrogate" as output. This should be used in conjunction with a field on the transformation such as surrogate_info_type. This CustomInfoType does not support the use of detection_rules

inspectConfig.customInfoTypes[].likelihood ENUMERATION

Likelihood to return for this CustomInfoType. This base value can be altered by a detection rule if the finding meets the criteria specified by the rule. Defaults to VERY_LIKELY if not specified

inspectConfig.customInfoTypes[].exclusionType ENUMERATION

If set to EXCLUSION_TYPE_EXCLUDE, this infoType will not cause a finding to be returned. It can still be used for rule matching

inspectConfig.customInfoTypes[].detectionRules[] OBJECT

Deprecated; use InspectionRuleSet instead. Rule for modifying a CustomInfoType to alter behavior under certain circumstances, depending on the specific details of the rule. Not supported for the surrogate_type custom infoType

inspectTemplateName STRING

Optional template to use. Any configuration specified directly in inspect_config overrides the corresponding configuration in the template. Singular fields that are set in this request replace their corresponding fields in the template. Repeated fields are appended. Singular sub-messages and groups are recursively merged
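
The merge rules can be sketched on plain dictionaries to preview the configuration that results from combining a template with a request. This is an illustration of the stated rules, not the service's code; the function name is hypothetical, and placing the template's repeated elements before the request's is an assumption.

```python
import copy

def merge_config(template, override):
    """Merge a request's inspect config over a template's, per the stated rules."""
    merged = copy.deepcopy(template)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Singular sub-messages are merged recursively.
            merged[key] = merge_config(merged[key], value)
        elif isinstance(value, list) and isinstance(merged.get(key), list):
            # Repeated fields are appended (template entries first: an assumption).
            merged[key] = merged[key] + copy.deepcopy(value)
        else:
            # Singular fields set in the request replace the template's.
            merged[key] = copy.deepcopy(value)
    return merged

template = {
    "minLikelihood": "LIKELY",
    "infoTypes": [{"name": "EMAIL_ADDRESS"}],
    "limits": {"maxFindingsPerRequest": 100},
}
override = {
    "minLikelihood": "POSSIBLE",
    "infoTypes": [{"name": "PHONE_NUMBER"}],
    "limits": {"maxFindingsPerItem": 10},
}
merged = merge_config(template, override)
```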

Output

This building block provides 10 output parameters

Each parameter is listed by name, followed by its format.

result OBJECT

All the findings for a single scanned item

result.findingsTruncated BOOLEAN

If true, then this item might have more findings than were returned, and the findings returned are an arbitrary subset of all findings. The findings list might be truncated because the input items were too large, or because the server reached the maximum amount of resources allowed for a single API call. For best results, divide the input into smaller batches
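
One way to divide text input into smaller batches is to split on line boundaries so that each batch stays under a byte budget. The sketch below assumes line-oriented text; the function name and the max_bytes value are hypothetical, since this page does not state the actual size limit. A sensitive value that spans a line break could be missed when split this way.

```python
def split_into_batches(text, max_bytes):
    """Split text on line boundaries into batches of at most max_bytes UTF-8 bytes.

    A single line longer than max_bytes is kept whole as its own batch.
    """
    batches, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        line_size = len(line.encode("utf-8"))
        if current and size + line_size > max_bytes:
            batches.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += line_size
    if current:
        batches.append("".join(current))
    return batches

batches = split_into_batches("a\nbb\nccc\n", max_bytes=5)
```

Each batch can then be sent as its own item.value, and a batch can be split again if its result still reports findingsTruncated.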

result.findings[] OBJECT

Represents a piece of potentially sensitive content

result.findings[].likelihood ENUMERATION

Confidence of how likely it is that the info_type is correct

result.findings[].quoteInfo OBJECT

Message for infoType-dependent details parsed from quote

result.findings[].quote STRING

The content that was found. Even if the content is not textual, it may be converted to a textual representation here. Provided if include_quote is true and the finding is less than or equal to 4096 bytes long. If the finding exceeds 4096 bytes in length, the quote may be omitted

result.findings[].location OBJECT

Specifies the location of the finding

result.findings[].infoType OBJECT

Type of information detected by the API

result.findings[].infoType.name STRING

Name of the information type. Either a name of your choosing when creating a CustomInfoType, or one of the names listed at https://cloud.google.com/dlp/docs/infotypes-reference when specifying a built-in type. InfoType names should conform to the pattern [a-zA-Z0-9_]{1,64}

result.findings[].createTime ANY

Timestamp when finding was detected
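
A minimal sketch of reading the output parameters listed above. The sample response is made-up illustration data shaped after those parameters, and the function name is hypothetical. The quote is read defensively because it is only provided when include_quote is true and may be omitted for long findings.

```python
def summarize_findings(response):
    """Return (rows, truncated), where rows are (infoType name, likelihood, quote)."""
    result = response.get("result", {})
    rows = []
    for finding in result.get("findings", []):
        rows.append((
            finding.get("infoType", {}).get("name"),
            finding.get("likelihood"),
            finding.get("quote"),  # None when the quote was not returned
        ))
    return rows, bool(result.get("findingsTruncated", False))

# Made-up sample shaped after the documented output parameters.
sample_response = {
    "result": {
        "findingsTruncated": False,
        "findings": [
            {
                "infoType": {"name": "EMAIL_ADDRESS"},
                "likelihood": "LIKELY",
                "quote": "jane@example.com",
            },
            {"infoType": {"name": "PHONE_NUMBER"}, "likelihood": "POSSIBLE"},
        ],
    }
}
rows, truncated = summarize_findings(sample_response)
```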