Use Case: Sensitive Data Discovery

Learn how Screen enables sensitive data discovery

The problem

A massive volume and variety of data is stored in cloud data lakes, but organizations have poor visibility into where sensitive data, PII, or otherwise regulated data may be located. This makes it impossible to determine whether data is being stored securely and in compliance with data privacy regulations. As a result, organizations either tolerate significant security and regulatory risk, or delete/lock down their data, sacrificing a key lever for innovation and differentiation.

How we help

Granica Screen automatically discovers sensitive data in your cloud storage and generates reports on it, with superior accuracy, cost-effectiveness, and scalability compared to off-the-shelf approaches. These reports can then be integrated into your workflows for managing sensitive data, or Granica Screen can be configured to automatically take action on detected sensitive data. This gives you maximum visibility into the sensitive data being stored, enabling you to make informed decisions about how to manage and use your data.

Screen operates through the Granica Platform, which enables you to screen incoming data in the background immediately after objects land in your buckets. For more details on the Granica architecture and approach for background processing, see the reference for how Screening works. Screen-specific settings can be configured to target relevant objects and types of sensitive data.

Why we're the best solution

Best-in-class accuracy

We believe that accuracy is a fundamental requirement for effective sensitive data discovery. Accuracy is difficult both to achieve and to measure, since every dataset is unique and benefits from a different approach. Granica Screen is powered by an adaptive classification system that learns from a dataset and maximizes the accuracy of results.

Granica Screen provides demonstrably superior classification accuracy across datasets and use cases. We benchmark our performance on a variety of synthetic data, such as data generated by the Presidio Research library, as well as real datasets across a range of filetypes and industries.

Cost-efficient processing

Solutions like Google Cloud DLP and Amazon Macie start their pricing at $1.00/GB for sensitive data detection, with additional costs to act on detected data through redaction or encryption. At that rate, scanning 1 PB of data would cost up to $1M for detection alone, which is untenable for many use cases. In addition, this pricing is based on the uncompressed size of the data. For example, 1 PB of stored data compressed by 75% (typical for big data file formats such as Parquet) expands to 4 PB when uncompressed, so detection would actually cost up to $4M at this price.
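The arithmetic behind these figures can be checked with a short sketch. It assumes a flat $1.00/GB rate with no volume tiers and decimal units (1 PB = 1,000,000 GB); the function name and parameters are hypothetical and are not part of any vendor's API or of Granica Screen.

```python
# Assumption: flat list price of $1.00 per GB, billed on uncompressed size.
PRICE_PER_GB_USD = 1.00
# Assumption: decimal units, 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000


def detection_cost_usd(stored_pb: float,
                       compression_ratio: float = 0.0,
                       price_per_gb: float = PRICE_PER_GB_USD) -> float:
    """Estimate the detection cost for data occupying `stored_pb` petabytes.

    `compression_ratio` is the fraction of size removed by compression
    (0.75 means the stored data is 25% of its uncompressed size).
    """
    if not 0.0 <= compression_ratio < 1.0:
        raise ValueError("compression_ratio must be in [0, 1)")
    # Billing is based on the uncompressed size, so expand the stored size first.
    uncompressed_pb = stored_pb / (1.0 - compression_ratio)
    return uncompressed_pb * GB_PER_PB * price_per_gb


if __name__ == "__main__":
    print(f"1 PB uncompressed:      ${detection_cost_usd(1.0):,.0f}")
    print(f"1 PB stored, 75% comp.: ${detection_cost_usd(1.0, 0.75):,.0f}")
```

Real pricing typically varies with volume tiers and region, so treat the result as an upper-bound estimate rather than a quote.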

Some vendors attempt to mitigate this problem by scanning less data through sampling approaches. However, this just means you pay less and get less, and doesn't actually unlock the bulk of your data for your business.

Granica Screen adapts to patterns in data to efficiently process data and avoid wasted computation, enabling sensitive data detection at 1/5th to 1/10th of the cost of other vendors or off-the-shelf solutions.

Scalable to petabytes of data and billions of objects

Granica Screen is built on top of Granica's data processing platform, which serves customers storing petabytes of data and billions of objects. We optimize the process of classifying, transforming, and storing this data in order to deliver secure, reliable, and performant results.

See also