
How policies work

Learn how to manage Granica using policies.

Granica policies give you additional controls around crunching and managing data in your cloud object stores. They enable you to centrally manage Granica behind the scenes without impacting developer workflows or applications. Granica policies support a range of use cases, enabling you to:

  1. Automatically crunch new buckets
  2. Control crunching while you Granica-enable your custom applications
  3. Crunch cold data without making any changes to your custom applications
  4. Use Crunch with 1st and 3rd party applications that are not Granica-enabled
  5. Restrict modifications or deletions of objects for retention and compliance
  6. Maintain your existing object deletion policies for GDPR and compliance
  7. Maintain your existing lifecycle policies to tier crunched data to archival storage classes (e.g. S3 Glacier, GCS Coldline)
  8. Protect against accidental deletion of objects
  9. Control which objects within a bucket are crunch-eligible
  10. Automate object removal

Unlike S3/GCS policies, which are managed at the individual bucket level, Granica policies are applied and managed globally, making them simple to administer.


1. Policy Management

How to manage policies

Use the granica policy edit command to view, change, apply, and delete Granica policies. Granica policies do not take effect for a given bucket until that bucket has been discovered and crunched. This occurs when you manually run granica crunch <bucket>, granica execute-policy, and/or granica crunch universe, or when you enable auto-crunch for the universe. If the bucket matches the policy filters, these actions place the bucket under ongoing management by Crunch. In other words, unmanaged buckets are not affected by policy edits, but once a bucket is managed, any subsequent policy updates take effect immediately on save, with no need to re-crunch.

Executing granica policy edit opens your existing policy in the editor defined by the VISUAL or EDITOR environment variable on your Granica Admin Server. If neither variable is set, the command opens the policy using vi (vim).

The first time you execute the granica policy edit command, your editor opens the default Granica policy. If you modify and commit (write) the policy and exit the editor, the modified policy is automatically applied and takes effect immediately. If you commit (write) an empty file and exit the editor, the existing policy is deleted and replaced with the default policy. If you exit the editor without committing (writing) any changes, the existing policy remains in place.

2. Bucket Discovery (the "universe")

Introduction

The universe section operates at the account level, not the bucket level, and specifies how and when Crunch discovers buckets and projects in your account. Discovered buckets are then filtered, crunched and managed using the relevant policies.

`auto-crunch`

If enabled, Crunch automatically and periodically runs granica crunch universe to discover and crunch newly created buckets in your account. In addition, every time you save the policy with auto-crunch enabled, Crunch runs granica crunch universe.

If auto-crunch is not defined or not enabled, bucket discovery (and crunching) occurs only when you manually run granica crunch universe.


Use cases


  • Automatically crunch new buckets. The auto-crunch policy makes it easy to capture savings from newly created buckets in your account.
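
As a minimal sketch, turning on automatic discovery might look like the fragment below. The indentation and the value enable are assumptions: the sample policy on this page shows auto-crunch: disable, and enable mirrors the enable/disable values used for other fields.

```yaml
# Assumption: "enable" is the counterpart of the "disable" value
# shown for this field in the sample policy.
universe:
  auto-crunch: enable
```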

3. Filters

Introduction

Crunch policy consists of three main filters:

  • standard
  • include
  • exclude

`standard`

Specifies the parameters that apply to all crunch-eligible buckets (more on eligibility below). Standard settings can be overridden by the settings listed in the include section for a particular bucket, or for buckets matching a glob pattern. See Customizing Crunch Policy.


Use cases


  • Create default policies. The standard filter makes it easy to define default crunch behavior.
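
A default policy could be sketched as follows. The field names come from this page; the values are illustrative and the indentation is an assumption based on the field descriptions.

```yaml
# Defaults applied to every crunch-eligible bucket
# unless an include entry overrides them.
standard:
  crunch-enable: true
  crunch-after: 30d        # illustrative value
  recycle-bin:
    delete-after: 30d
```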

`crunch-enable`

Specifies whether to enable Crunch. When set to “true”, Crunch is enabled according to the policy settings. When set to “false”, Crunch is disabled.



`exclude`

Specifies which buckets are not crunch-eligible. The exclude pattern overrides the include pattern. If a bucket is listed here or matches a glob pattern listed here, it will NOT be crunched, regardless of whether it is listed in the include section or an explicit crunch command (granica crunch <bucket>) is issued.


Use cases


  • Prevent crunching while you Granica-enable your custom applications. If you have a small number of buckets to exclude (vs. include), you can easily exclude them with the exclude policy. If you have buckets that are accessed by multiple applications, then all those applications must be Granica-enabled before you remove the exclusion and queue crunching via the CLI.
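
The precedence of exclude over include can be illustrated with a small fragment. The bucket names are hypothetical and the indentation is an assumption based on the field descriptions on this page.

```yaml
include:
  - bucket: bucket-a-*
exclude:
  # Hypothetical pattern covering a subset of the included buckets.
  # Buckets matching it are not crunched, even though they also
  # match bucket-a-* above.
  - bucket: bucket-a-scratch-*
```

Once every application that reads those buckets is Granica-enabled, removing the exclude entry makes them crunch-eligible again.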

`include`

Specifies a list of global bucket glob patterns that define which buckets are crunch-eligible. Bucket customization parameters are also defined here. If there are no buckets listed in this section, no buckets will be crunched when crunching the universe (granica crunch universe). If the include section is populated, only buckets listed or buckets matching a glob pattern will be eligible for crunching. Buckets not explicitly listed in this section can still be crunched with the granica crunch <bucket> command. For bucket customization parameters, see Customization.


Use cases


  • Control the scope of buckets which Crunch will crunch. The include filter makes it easy to expand or reduce the scope of the granica crunch universe command.
  • Create exceptions for specific buckets and applications. The include filter makes it easy to add exceptions to the standard policies.

4. Customization

Introduction

Crunch policy allows you to easily customize crunching for individual buckets or a group of buckets. To customize a specific bucket, list it in the include section along with the desired settings. If you have a glob pattern defined there, the settings you add to it apply to all matching buckets. The custom policy settings for a bucket or buckets override the settings defined in the standard section. If a parameter is left unspecified in the custom policy settings, the value from the standard section applies.
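
The override and fallback behavior described above might look like this in practice. The values are illustrative and the indentation is an assumption based on the field descriptions on this page.

```yaml
standard:
  crunch-after: 7d
  recycle-bin:
    delete-after: 30d
include:
  - bucket: bucket-a-*
    # Overrides the standard 7d for buckets matching bucket-a-*.
    crunch-after: 30d
    # recycle-bin is not specified here, so the standard
    # delete-after of 30d applies to these buckets.
```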

`freeze-for`

Specifies how long objects in a bucket must be retained before they can be updated, moved, or deleted. The duration can be specified in seconds (s), minutes (m), hours (h), or days (d). This works like the bucket retention policies you may be familiar with. Crunch will crunch both your incoming and existing data according to the policies set, but will not allow the crunched objects to be updated, moved, or deleted until the freeze-for period expires.

Configure the freeze-for period to align with your retention requirements.


Use cases


  • Restrict modifications or deletions of objects for retention and compliance. U.S.-based financial service institutions such as banks, broker-dealers and record keepers are required to comply with a number of regulations specifying requirements for electronic records retention, including Securities and Exchange Commission (SEC) Rule 17a-4(f), Commodity Futures Trading Commission (CFTC) Rule 1.31(c)-(d), and Financial Industry Regulatory Authority (FINRA) Rule 4511(c).
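
A short sketch of freeze-for with two of the supported duration units. The durations are illustrative, not recommendations for any particular regulation, and the indentation is an assumption based on the field descriptions on this page.

```yaml
include:
  - bucket: bucket-a-*
    freeze-for: 30d    # 30 days
  - bucket: bucket-b-*
    freeze-for: 12h    # 12 hours
```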

`crunch-after`

Specifies how long Crunch waits to crunch an object after it has been created. After you queue crunching via the CLI, Crunch continuously monitors your source buckets for new objects. However, instead of immediately crunching existing objects and/or new objects when they land, Crunch waits to crunch the objects until the crunch-after duration has passed.

Set the crunch-after duration to be greater than or equal to your data access window for your specified buckets.


Use cases


  • Crunch data without making any changes to your own applications. This use case requires that your applications not read data after a known period of time (say 30 days), i.e. that the data is cold after this timeframe. The benefit is that you do not need to make any changes whatsoever to your applications and so you can start seeing storage savings immediately; however, there are trade-offs. First, your savings are reduced as you will pay full storage costs for all data inside the crunch-after window. Second, it requires close coordination between appdev teams to ensure existing (or more likely new) applications either (a) do not attempt to read the crunched data outside the crunch-after window or (b) are Granica-enabled before attempting to access the data. You can also use the Granica CLI plugin to access crunched data.

  • Use Crunch with 1st and 3rd party applications that are not Granica-enabled. This use case requires that you wait to crunch your data until those 3rd party applications have stopped accessing it. For example, you could have a SaaS application like Snowflake or Redshift ingest your data (and thus make their own copy), and once the ingestion is complete you can crunch your data and continue to use it with your own Granica-enabled applications or the Granica CLI plugin.
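
For the cold-data use case with a 30-day access window, a sketch could look like the following. The indentation is an assumption based on the field descriptions on this page.

```yaml
include:
  - bucket: bucket-a-*
    # Applications read these objects for up to 30 days after
    # creation, so crunching waits at least that long.
    crunch-after: 30d
```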

`tier`

If defined, specifies how long Crunch waits after objects have been modified (or previously tiered) before automatically moving them from their current storage class to a lower-cost storage class. tier takes a list of entries, each with a class field and an after field. Crunched objects move from their current storage class into the specified class once the after period expires. The after duration can be specified in seconds (s), minutes (m), hours (h), or days (d). Valid class options are:

  • Amazon S3: glacier, deep-archive
  • Google GCS: coldline, archive

Set your desired storage tier using the class field, and your desired tiering timeframe using the after field.


Use cases


  • Maintain your existing lifecycle policies. Easily tier crunched data to archival storage classes (e.g. S3 Glacier, GCS Coldline).
  • Further reduce storage costs for cold objects. Use tier for truly cold data such as old backups.
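
A tiering sketch for an S3 bucket is shown below. The first entry matches the sample policy on this page. The second entry is an assumption that successive tiers can be listed, which the "list" wording and the "previously tiered" note suggest but do not confirm.

```yaml
include:
  - bucket: bucket-a-*
    tier:
      - class: glacier
        after: 60d
      # Assumed second step; whether this period is counted from
      # modification or from the previous tiering is not confirmed here.
      - class: deep-archive
        after: 180d
```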

`expire-after`

If defined, specifies how long Crunch keeps objects after they have been modified before automatically deleting them. When expire-after triggers Crunch to delete an object, all copies are deleted immediately. This is the case regardless of whether the Recycle Bin is enabled, i.e. the Recycle Bin does not protect objects that are automatically deleted via expire-after. Crunch deletes the expired object whether it is currently in your S3/GCS store or in a tiered storage class. Set your desired expiration timeframe using the expire-after field.


Use cases


  • Maintain your existing object deletion policies for GDPR and compliance. Crunch supports your existing deletion/expiration policies to ensure that when a crunched object expires, it and all its copies are deleted.
  • Automatically clean up any temp data used for test/dev. Crunch helps you delete temporary data to reduce your storage costs. You can also use expire-after in combination with instant, free copies (Coming Soon) to further lower your costs as well as increase the speed of your development workflows.
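
The interaction between expire-after and the Recycle Bin can be sketched as follows. The values are illustrative and the indentation is an assumption based on the field descriptions on this page.

```yaml
include:
  - bucket: bucket-b-*
    # Objects are deleted, with all copies, 365 days after
    # they were last modified.
    expire-after: 365d
    recycle-bin:
      # Covers deletions made via the Granica API or CLI only;
      # it does not protect objects removed by expire-after.
      delete-after: 30d
```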

`uncrunch-expire-after`

Specifies how long Crunch keeps objects after they have been uncrunched before automatically deleting them from the source bucket. When uncrunch-expire-after triggers Crunch to delete an uncrunched object, the object is deleted immediately from the source bucket. Set your desired expiration timeframe using the uncrunch-expire-after field. If not defined, the uncrunch-expire-after field defaults to 1 day (1d).

`recycle-bin`

If defined, enables recoverability of objects deleted via the Granica API or CLI. The Recycle Bin does not apply to objects that are automatically deleted via expire-after. The delete-after field specifies how long objects are retained for recoverability before they are permanently deleted. The duration can be specified in seconds (s), minutes (m), hours (h), or days (d). See the Recycle Bin documentation for more information.

  1. Set the delete-after field to your desired duration (e.g. "30d" for 30 days).
  2. Restore objects in the event you accidentally delete them.

Use cases


  • Protect against accidental deletion of objects. The Crunch Recycle Bin persists data even after it has been explicitly deleted by your application, ensuring that you can quickly and easily recover objects that were accidentally deleted.
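
A sketch of a default Recycle Bin window with a per-bucket exception. The disable value appears in the sample policy on this page; that it turns the Recycle Bin off for the matching buckets is an assumption, as is the indentation.

```yaml
standard:
  recycle-bin:
    delete-after: 30d        # keep deleted objects recoverable for 30 days
include:
  - bucket: bucket-b-*
    recycle-bin:
      # Assumed to switch off recoverability for these buckets.
      delete-after: disable
```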

`object-include`

If defined, specifies an object regex pattern that defines which objects are crunch-eligible. Crunch applies the regex pattern on the entire key and only objects that match this pattern will be crunched via granica crunch <bucket> and/or granica crunch universe.


Use cases


  • Control which objects within a bucket are crunch-eligible. Allows you to have a wide range of objects in a single bucket yet have full control over which objects can be crunched.

`object-exclude`

If defined, specifies an object regex pattern that defines which objects are not crunch-eligible. Crunch applies the regex pattern on the entire key. The object-exclude pattern overrides the object-include pattern, i.e. objects that match this pattern will NOT be crunched via granica crunch <bucket> and/or granica crunch universe, regardless of whether they match the object-include pattern.


Use cases


  • Control which objects within a bucket are crunch-eligible. Allows you to have a wide range of objects in a single bucket yet have full control over which objects can be crunched.
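
The two object filters can be combined, as in the sketch below. The key prefixes are hypothetical, the regex flavor Crunch uses is not stated on this page, and the indentation is an assumption; the patterns use only basic constructs and are anchored so they read the same whether the match is a search or a full match on the key.

```yaml
include:
  - bucket: bucket-a-*
    # Only keys under logs/ that end in .json are crunch-eligible...
    object-include: '^logs/.*\.json$'
    # ...except anything under logs/tmp/, because object-exclude
    # overrides object-include.
    object-exclude: '^logs/tmp/.*$'
```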

`cleaner`

Crunched objects are cleaned from the source bucket on a scheduled basis when certain conditions are met. The process responsible for this cleaning, called the cleaner, can be disabled. When the cleaner is disabled, the crunched objects will not be removed from the source bucket.


Use cases


  • Efficiently integrate Crunch into your environment. The cleaner policy can be used to prevent the deletion of crunched objects from source buckets, thus allowing the re-use of the same test objects without potentially lengthy and costly object movement such as archiving and restoration.
  • Initiate storage savings. With cleaner enabled, Crunch removes any original full-size objects and their associated costs from the environment.
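
For the test-environment use case, the cleaner might be left on by default and switched off for specific buckets. The indentation is an assumption based on the field descriptions on this page.

```yaml
standard:
  cleaner: enable
include:
  - bucket: bucket-b-*
    # Original objects stay in the source bucket after crunching,
    # so the same test objects can be reused.
    cleaner: disable
```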

5. Screen Filters

`screen`

Nested mapping which specifies screen-specific behaviors. It includes the following fields:


`enable`

Specifies whether to enable Screen. When set to “true”, Screen is enabled according to the policy settings. When set to “false” (default), Screen is disabled.


`classification-types`


An array of classification type specifications, specifying which classification types to enable, as well as the desired likelihood level. Each specification is a mapping with the following fields:

`type`


The name of the classification type. See [link](granica-screen-configuration) for a list of supported classification types.

`likelihood`


The minimum likelihood of a match. A lower likelihood threshold gives better recall, while a higher likelihood threshold gives better precision. Accepted values: LOW, MEDIUM (default), HIGH.

`transformation-params`


A mapping of configurations specifying what transforms, if any, to apply to the data. This mapping includes the following fields:

`transformation-type`


The type of transformation to apply to sensitive data identified. See [link](granica-screen-configuration) for documentation of behaviors

  • Accepted values: "NONE", "NAMED", "REDACTED", "SIZE_PRESERVING"
  • Default: "NONE"

`redaction-char`


Character to use for redaction, if transformation type is SIZE_PRESERVING. Ignored otherwise.


  • Default: '#'


`redaction-string`

String to use for redaction, if transformation type is REDACTED. Ignored otherwise.

  • Default: REDACTED

`transformation-output-path`


A cloud bucket/prefix to write transformed objects to. Cross-cloud writes are not supported, so the specification should be of the form “” or “/

Must be set if transformation-type is not NONE.


`max-objs-per-day`


Specify maximum number of objects per day in a bucket to scan, for sampling purposes.


  • Default: no limit.

`obj-sampling-percent`


Specify target percent of objects to be sampled in a bucket for scanning. This percentage of objects will be scanned, up to max-objs-per-day (if set).


  • Default: 100%
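
Putting the Screen fields together, a configuration that scans for two classification types and redacts matches might be sketched as follows. The type names and TRANSFORMATION-STORAGE-PATH placeholder are taken from the sample policy on this page. Placing screen under standard and transformation-output-path under transformation-params are assumptions about the nesting.

```yaml
standard:
  screen:
    enable: true
    classification-types:
      - type: EMAIL
        likelihood: HIGH       # favors precision
      - type: SSN
        likelihood: LOW        # favors recall
    transformation-params:
      transformation-type: REDACTED
      redaction-string: "[REDACTED]"
      # Required because transformation-type is not NONE.
      # Placeholder value; nesting level assumed.
      transformation-output-path: TRANSFORMATION-STORAGE-PATH
```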

policy.yaml
############################################################
# For an explanation of policy fields and options, please #
# refer to our documentation website at: #
# #
# https://www.docs.granica.ai/how-policies-work #
# #
# Any unsaved changes and comments will be discarded #
############################################################
universe:
  auto-crunch: disable
  screen:
    enable: true
    classification-types:
standard:
  crunch-enable: true
  freeze-for: disable
  crunch-after: disable
  tier:
    - class: disable
      after: disable
  expire-after: disable
  recycle-bin:
    delete-after: 30d
  object-include: disable
  object-exclude: disable
  cleaner: enable
  uncrunch-expire-after: disable
  screen:
    enable: "false"
    classification-types:
      - type: ABA_ROUTING_MICR
        likelihood: MEDIUM
      - type: US_DRIVERS_LICENSE
        likelihood: MEDIUM
      - type: US_ADDRESS
        likelihood: MEDIUM
      - type: SSN
        likelihood: MEDIUM
      - type: PASSWORD
        likelihood: MEDIUM
      - type: PERSON_NAME
        likelihood: MEDIUM
      - type: PHONE_NUMBER
        likelihood: MEDIUM
      - type: CREDIT_CARD_NUMBER
        likelihood: MEDIUM
      - type: EMAIL
        likelihood: MEDIUM
      - type: VEHICLE_IDENTIFICATION_NUMBER
        likelihood: MEDIUM
      - type: PASSPORT
        likelihood: MEDIUM
      - type: IP
        likelihood: MEDIUM
    transformation-params:
      transformation-type: SIZE_PRESERVING
      redaction-char: "#"
      redaction-string: "[REDACTED]"
      transformation-output-path: TRANSFORMATION-STORAGE-PATH
    max-objs-per-day: ""
    obj-sampling-percent: ""
exclude:
  - bucket: bucket-z-*
include:
  - bucket: bucket-a-*
    freeze-for: 30d
    crunch-after: 30d
    tier:
      - class: glacier
        after: 60d
    recycle-bin:
      delete-after: 45d
    object-include: \.*.json$
    object-exclude: \.*.pdf$
  - bucket: bucket-b-*
    crunch-after: 1d
    expire-after: 365d
    recycle-bin:
      delete-after: disable
    cleaner: disable
    uncrunch-expire-after: 2d