Crunch FAQ
Frequently asked questions about Granica Crunch.
What is Granica Crunch?
Granica Crunch is a lakehouse-native compression optimizer that reduces the physical size of columnar files (primarily Apache Parquet) by 15-60% with no data loss. It runs entirely within your VPC and does not sit in the read path, so queries read the optimized files directly from your storage.
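To get a rough feel for what a 15-60% reduction means for a given dataset, the range can be worked out with simple arithmetic. The sketch below is illustrative only: the function name crunched_size_range and the 100 TB input are hypothetical, and the actual reduction depends on your data.

```python
def crunched_size_range(
    original_bytes: int,
    min_reduction_pct: int = 15,   # lower bound quoted in this FAQ
    max_reduction_pct: int = 60,   # upper bound quoted in this FAQ
) -> tuple[int, int]:
    """Return (smallest, largest) expected size in bytes after reduction.

    Integer percentages are used to avoid floating-point rounding surprises.
    """
    if original_bytes < 0:
        raise ValueError("original_bytes must be non-negative")
    if not 0 <= min_reduction_pct <= max_reduction_pct <= 100:
        raise ValueError("percentages must satisfy 0 <= min <= max <= 100")
    smallest = original_bytes * (100 - max_reduction_pct) // 100
    largest = original_bytes * (100 - min_reduction_pct) // 100
    return smallest, largest


TB = 10**12
low, high = crunched_size_range(100 * TB)  # hypothetical 100 TB dataset
print(low // TB, high // TB)  # prints "40 85"
```

Under these assumptions, a 100 TB dataset would occupy between 40 TB and 85 TB after processing.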
Is it lossless?
Yes. Crunch uses lossless compression optimization. Every byte of your original data is preserved and recoverable. Pre- and post-crunch checksums are validated automatically.
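The idea of validating checksums before and after a lossless transformation can be sketched in a few lines. This is a minimal illustration, not Crunch's implementation: it uses zlib from the Python standard library as a stand-in for the compression step, and the function names are hypothetical.

```python
import hashlib
import zlib


def sha256_hex(data: bytes) -> str:
    """Checksum used to compare content before and after processing."""
    return hashlib.sha256(data).hexdigest()


def recompress_losslessly(original: bytes, level: int = 9) -> bytes:
    """Compress `original` and verify it can be recovered byte for byte.

    zlib is only a stand-in here; the real optimizer is assumed to differ.
    """
    checksum_before = sha256_hex(original)
    compressed = zlib.compress(original, level)
    # Decompress and re-hash to prove nothing was lost.
    checksum_after = sha256_hex(zlib.decompress(compressed))
    if checksum_before != checksum_after:
        raise ValueError("checksum mismatch: output is not lossless")
    return compressed


# Hypothetical sample payload with a lot of repetition.
payload = b"order_id,amount\n" + b"1001,19.99\n" * 1000
packed = recompress_losslessly(payload)
assert zlib.decompress(packed) == payload   # every byte is recoverable
assert len(packed) < len(payload)           # and the stored form is smaller
```

The key property is that the checksum is computed on the recovered data rather than on the compressed bytes, so a match shows the original content is intact.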
Does it work with my existing tools?
Yes. Crunch produces standard, format-compliant files. Any tool that reads Parquet today (Spark, Trino, Presto, Athena, BigQuery, Databricks, etc.) can read Crunched files without modification.
How long does it take to see savings?
You can start seeing storage cost reductions within hours of running the granica crunch command on your first bucket. The exact timeline depends on the volume of data being crunched.
Does Crunch affect query performance?
Generally for the better. Smaller files mean less data has to be read from storage, and benchmarks show up to a 56% improvement on TPC-DS queries.
What happens if Crunch encounters a corrupted file?
Crunch validates every file before and after processing. If any integrity check fails, Crunch stops processing that bucket and alerts the operations team. No corrupted data is ever written.
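The validate, process, validate flow described above can be sketched as follows. This is a simplified model under stated assumptions, not Crunch's actual code: a bucket is represented as an in-memory dict, zlib stands in for the optimizer, and process_bucket, expected_sha256, and alert are hypothetical names.

```python
import hashlib
import zlib
from typing import Callable, Dict


def process_bucket(
    files: Dict[str, bytes],
    expected_sha256: Dict[str, str],
    alert: Callable[[str], None],
) -> Dict[str, bytes]:
    """Process files in order; stop the whole bucket on the first failed check.

    Returns only the outputs that passed both checks, so a file that fails
    validation never produces written output.
    """
    written: Dict[str, bytes] = {}
    for name, data in files.items():
        # Pre-check: the input must match its known checksum.
        if hashlib.sha256(data).hexdigest() != expected_sha256[name]:
            alert(f"integrity check failed before processing: {name}")
            break
        optimized = zlib.compress(data, 9)  # stand-in for the optimizer
        # Post-check: the output must decode back to the exact input.
        if zlib.decompress(optimized) != data:
            alert(f"integrity check failed after processing: {name}")
            break
        written[name] = optimized
    return written


good = b"row" * 100
files = {"a.parquet": good, "b.parquet": b"corrupted", "c.parquet": good}
expected = {name: hashlib.sha256(good).hexdigest() for name in files}
alerts = []
result = process_bucket(files, expected, alerts.append)
print(sorted(result), len(alerts))  # prints "['a.parquet'] 1"
```

In this sketch the corrupted second file triggers one alert, nothing is written for it, and the third file is left untouched because processing of the bucket has stopped.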
Which clouds are supported?
Currently AWS (Amazon S3) and GCP (Google Cloud Storage). Azure support is on the roadmap.
How is Crunch priced?
Crunch pricing is outcome-based: you pay a percentage of the savings Crunch generates. There are no upfront costs. Contact sales@granica.ai for details.