Granica Screen Configuration

Once you've installed the Granica platform with Granica Screen enabled, it's time to start monitoring and protecting your data. Configuration is managed through the Granica CLI.

1. Identify data to be protected by Granica Screen

Granica Screen supports scanning existing data stored in data lakes such as Amazon S3 and Google Cloud Storage (GCS). The first step is to identify the buckets or data of interest, which might be every bucket in your organization. The data of interest can then be configured for scanning in the Granica policy.

Currently, the following file types are supported. Unsupported files are skipped and do not affect the scanning process.

File Type	Extensions	Scan Method	Available Now
Big Data	.parquet, .snappy.parquet	Structured Parsing	Yes
Comma/tab separated	.csv, .tsv	Structured Parsing	Yes
Text	.json, .txt, .html, etc.	Intelligent Parsing	Yes
Email	.eml	Intelligent Parsing	Yes
Archived/Compressed	.gz, .zip	Decompress and Parse	Yes
Image	.jpeg, .png, .tiff	OCR	In progress, contact us
Document	.pdf, .doc, .xlsx, .pptx	Intelligent Parsing	In progress, contact us
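
As a rough illustration of the table above, the sketch below maps an object key to its scan method by file extension. The function `scan_method_for` and the `SCAN_METHODS` dictionary are hypothetical names used only for this example and are not part of the Granica CLI. Only extensions explicitly listed as available are included; the "etc." text extensions are not enumerated in the table and are left out.

```python
from typing import Optional

# Extension -> scan method, copied from the rows marked "Yes" in the table.
# ".snappy.parquet" is covered by the ".parquet" suffix.
SCAN_METHODS = {
    ".parquet": "Structured Parsing",
    ".csv": "Structured Parsing",
    ".tsv": "Structured Parsing",
    ".json": "Intelligent Parsing",
    ".txt": "Intelligent Parsing",
    ".html": "Intelligent Parsing",
    ".eml": "Intelligent Parsing",
    ".gz": "Decompress and Parse",
    ".zip": "Decompress and Parse",
}


def scan_method_for(obj_key: str) -> Optional[str]:
    """Return the scan method for an object key, or None if it would be skipped."""
    name = obj_key.lower()
    for ext, method in SCAN_METHODS.items():
        if name.endswith(ext):
            return method
    return None  # unsupported (or not yet available) file type


for key in ["logs/events.snappy.parquet", "exports/users.csv.gz", "scans/id.png"]:
    print(key, "->", scan_method_for(key))
```

A compressed file such as `users.csv.gz` matches `.gz` here; how the decompressed contents are then parsed is not modeled in this sketch.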

2. Specify types of sensitive data to identify

The set of sensitive data to identify is configured within the Granica policy.

Currently, the following types of sensitive data are supported by standard classifiers. Custom classifiers can be specified in addition to these, and Granica is continuously adding support for more types of sensitive data. Note: If data can be interpreted as multiple PII types, Granica Screen reports the most likely type.

3. Specify report format and location

After the data is scanned, Granica Screen generates a report that lists each instance of sensitive data identified. The format and location of this report can be customized within the Granica policy as follows.

Configuration	Options
Output format	json, csv, Parquet
Output compression	none, gzip, snappy (Parquet only)
Output location	An AWS S3 or GCS location. If unspecified, a bucket will automatically be created.
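
The constraint in the table (snappy compression applies to Parquet output only) can be checked before a policy is submitted. The following is a minimal sketch of such a check; `validate_report_options` is a hypothetical helper, and the lowercase option strings are assumed from the table rather than taken from the actual policy syntax.

```python
# Option values as listed in the table above (lowercased for comparison).
ALLOWED_FORMATS = {"json", "csv", "parquet"}
ALLOWED_COMPRESSION = {"none", "gzip", "snappy"}


def validate_report_options(output_format: str, compression: str = "none") -> None:
    """Raise ValueError if the format/compression pair is not allowed by the table."""
    fmt = output_format.lower()
    comp = compression.lower()
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    if comp not in ALLOWED_COMPRESSION:
        raise ValueError(f"unsupported output compression: {compression}")
    if comp == "snappy" and fmt != "parquet":
        raise ValueError("snappy compression is only available for Parquet output")


validate_report_options("Parquet", "snappy")  # accepted
validate_report_options("csv", "gzip")        # accepted
```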

The generated report includes the following information for each instance of sensitive data:

Column	Type	Description
n	bigint	Index of result within result file
obj_key	string	The cloud object containing this instance of sensitive data
classification_type	string	The type of sensitive data identified
offset	bigint	The offset location within an unstructured file
classified_size	bigint	The length of the result within an unstructured file
row	bigint	The row number of a result within a tabular file
col	bigint	The column number of a result within a tabular file
column_name	string	The column name of a result within a tabular file, when available
data	string	The sensitive data identified (optional via policy)
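
To show how a report with these columns might be consumed, the sketch below counts findings per `classification_type`. It assumes a csv report with a header row that uses the column names from the table; that layout, the helper `count_by_type`, and the sample rows (bucket name, classification labels) are illustrative assumptions, not real Granica output.

```python
import csv
import io
from collections import Counter


def count_by_type(report_csv: str) -> Counter:
    """Count report rows per classification_type."""
    reader = csv.DictReader(io.StringIO(report_csv))
    return Counter(row["classification_type"] for row in reader)


# Made-up sample rows for illustration only.
sample = (
    "n,obj_key,classification_type,offset,classified_size,row,col,column_name\n"
    "0,s3://example-bucket/users.csv,EMAIL,,,3,1,email\n"
    "1,s3://example-bucket/users.csv,EMAIL,,,4,1,email\n"
    "2,s3://example-bucket/notes.txt,NAME,120,10,,,\n"
)

for kind, count in count_by_type(sample).items():
    print(kind, count)
```

In this sample, tabular results fill `row`, `col`, and `column_name`, while results from unstructured files fill `offset` and `classified_size`, matching the column descriptions above.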

4. Specify the redacted output format

In addition to generating a detection report, Granica Screen can directly redact sensitive data from a file and create a sanitized copy of the data at a separately configured cloud location. Appropriately redacted data can then be used in broader contexts to enable additional use cases while managing privacy risk.

A variety of redaction formats are supported, along with additional customization options.

Transformation Type	Description
Redaction	Removal of sensitive data without replacement, e.g. "My name is John Smith" to "My name is"
Replacement	Replacement of sensitive data with a fixed value, e.g. [REDACTED]
Size-preserving replacement	Replacement of sensitive data with a value of equal length, e.g. XXXXX
Named replacement	Replacement of sensitive data with a label identifying the type of sensitive data, e.g. [EMAIL]
Numbered replacement	Replacement of sensitive data with a label identifying each unique instance of sensitive data, e.g. [EMAIL_1] and [EMAIL_2]
Encrypted	Replacement of sensitive data with an encrypted value, e.g. [EMAIL_encryptedemailaddress]
Format-preserving encrypted	Replacement of sensitive data with an encrypted value, preserving the original format, e.g. john@granica.ai to siek@jtiwoei.qb
Synthetic data replacement	Replacement of sensitive data with a similar synthetic value of the same type, e.g. replacing John with Evan
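
The simpler transformation types in the table can be sketched in a few lines. The example below is an illustration of the concepts only, not Granica Screen's implementation: the function `transform`, the mode names, and the `(offset, size, type)` finding tuples are hypothetical, with the tuple fields modeled on the `offset` and `classified_size` report columns. Encrypted and synthetic replacements are omitted.

```python
from typing import Dict, List, Tuple

Finding = Tuple[int, int, str]  # (offset, classified_size, classification type)


def transform(text: str, findings: List[Finding], mode: str) -> str:
    """Apply one transformation type to every finding in text."""
    out: List[str] = []
    pos = 0
    labels: Dict[Tuple[str, str], int] = {}  # (type, value) -> label number
    counters: Dict[str, int] = {}            # type -> highest number used
    for offset, size, kind in sorted(findings):
        value = text[offset:offset + size]
        out.append(text[pos:offset])  # keep the text before the finding
        if mode == "redaction":
            repl = ""
        elif mode == "replacement":
            repl = "[REDACTED]"
        elif mode == "size_preserving":
            repl = "X" * size
        elif mode == "named":
            repl = f"[{kind}]"
        elif mode == "numbered":
            key = (kind, value)
            if key not in labels:  # a repeated value reuses its number
                counters[kind] = counters.get(kind, 0) + 1
                labels[key] = counters[kind]
            repl = f"[{kind}_{labels[key]}]"
        else:
            raise ValueError(f"unsupported mode: {mode}")
        out.append(repl)
        pos = offset + size
    out.append(text[pos:])  # keep the text after the last finding
    return "".join(out)


text = "Contact john@granica.ai or evan@granica.ai"
emails = ["john@granica.ai", "evan@granica.ai"]
findings = [(text.index(e), len(e), "EMAIL") for e in emails]
for mode in ["replacement", "size_preserving", "named", "numbered"]:
    print(mode, "->", transform(text, findings, mode))
```

Numbering by unique value means the same email appearing twice receives the same label, which keeps references consistent across the sanitized copy.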

If you need further assistance with redaction formats, contact us for details.
