Caching as a Semantic Problem: Redis, Pre-Aggregation, and Mixed Granularity Data

Caching is usually treated as a performance hack.

Add Redis.
Cache query results.
Invalidate aggressively.
Hope nothing breaks.

That approach works, until you introduce semantics.

Once users can ask the same question in multiple ways, simple query caching starts to fall apart.


Why Query Caching Breaks Down

Consider these two questions:

“Sales by category by month”

“Monthly sales for product categories”

Syntactically different.
Semantically identical.

A traditional cache sees:

  • Two queries

  • Two cache keys

  • Two entries

A semantic system should see:

  • One intent

  • One reusable result
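Concretely, here is the mismatch in miniature (a minimal Python sketch; the SQL strings and hashing scheme are illustrative assumptions, not any particular engine's):

import hashlib

def query_cache_key(sql: str) -> str:
    # Traditional caching: the key is just a hash of the query text.
    return "cache:query:" + hashlib.sha256(sql.encode()).hexdigest()[:12]

# Two phrasings of the same intent compile to slightly different SQL...
q1 = "SELECT category, month, SUM(sales) FROM sales GROUP BY category, month"
q2 = "SELECT month, category, SUM(sales) FROM sales GROUP BY month, category"

# ...so the cache stores two entries for one question.
print(query_cache_key(q1) == query_cache_key(q2))  # False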

This is where caching stops being technical and starts being conceptual.


The Shift: Cache Meaning, Not Queries

Instead of caching:

  • SQL strings

  • Serialized result sets

The idea is to cache:

  • Semantic tuples

  • At defined grains

  • With known aggregation rules

For example:

  • (Time:Month, Product:Category, Measure:Sales)

Once that tuple exists, many questions can reuse it.
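A minimal sketch of that idea, assuming the tuple shape above (the class and key format are illustrative, not a specific library's API):

from dataclasses import dataclass

@dataclass(frozen=True)
class SemanticTuple:
    time_grain: str     # e.g. "Month"
    product_grain: str  # e.g. "Category"
    measure: str        # e.g. "Sales"

    def cache_key(self) -> str:
        # Canonical key: every phrasing that resolves to this intent
        # lands on the same cache entry.
        return (f"cache:tuple:Time:{self.time_grain}"
                f"|Product:{self.product_grain}"
                f"|Measure:{self.measure}")

# Both questions from earlier resolve to one tuple, hence one entry.
print(SemanticTuple("Month", "Category", "Sales").cache_key())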


Mixed Granularity Is the Hard Part

Real data is messy.

Some dimensions arrive at SKU level.
Others at category level.
Or at brand level.

If you insist on leaf-level purity, you either:

  • Explode storage

  • Or recompute constantly

The alternative is semantic awareness of granularity.
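At minimum, that means the system knows the ordering of grains and refuses unsafe rollups. A sketch, assuming a simple linear product hierarchy (real dimensions may branch; brand often sits on a parallel path):

# Hypothetical product hierarchy, finest grain first.
PRODUCT_GRAINS = ["SKU", "Category"]

def can_serve(cached: str, requested: str) -> bool:
    # Data cached at a finer (or equal) grain can be rolled up
    # to answer a coarser request; the reverse is impossible.
    return PRODUCT_GRAINS.index(cached) <= PRODUCT_GRAINS.index(requested)

print(can_serve("SKU", "Category"))  # True: SKUs aggregate up to categories
print(can_serve("Category", "SKU"))  # False: you can't disaggregate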


Redis as a Granularity-Aware Cache

Redis works well here because:

  • Keys are cheap

  • Structures are flexible

  • Access is fast enough to experiment

Instead of:

cache:query:{hash}

You start thinking in:

cache:tuple:Time:2025-03|Product:Category:Bike|Measure:Sales

The cache knows:

  • What level the data represents

  • What it can roll up to

  • What it can safely combine with
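With redis-py, one way to carry that knowledge is a hash per tuple, storing the value alongside its grain metadata (a sketch, assuming a local Redis, additive measures, and made-up field names):

import json
import redis

r = redis.Redis(decode_responses=True)  # assumes Redis on localhost:6379

key = "cache:tuple:Time:2025-03|Product:Category:Bike|Measure:Sales"

# Store the value together with what the cache needs to know about it.
r.hset(key, mapping={
    "value": "12450.00",
    "time_grain": "Month",
    "product_grain": "Category",
    "rolls_up_to": json.dumps(["Quarter", "Year"]),  # safe for additive measures
})
r.expire(key, 3600)  # aggregates expire; the warehouse stays the source of truth

entry = r.hgetall(key)
print(entry["product_grain"], json.loads(entry["rolls_up_to"]))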


Pre-Aggregation Without Over-Commitment

The goal isn’t to pre-aggregate everything.

It’s to pre-aggregate:

  • Common levels

  • Stable dimensions

  • High-fan-out queries

Then allow:

  • On-the-fly composition

  • Partial reuse

  • Fallback to source data

Redis becomes a semantic accelerator, not a source of truth.
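A sketch of that lookup order, using an in-memory dict as a stand-in for the Redis tuple cache (the SKU-to-category mapping and values are invented):

# Stand-in for the tuple cache, keyed by (time grain, product grain).
cache = {("Month", "SKU"): {"A1": 100.0, "B2": 50.0}}  # only SKU level is pre-aggregated

SKU_TO_CATEGORY = {"A1": "Bike", "B2": "Bike"}  # hypothetical mapping

def monthly_sales(product_grain: str):
    # 1. Exact-grain hit: reuse the pre-aggregated tuple directly.
    if ("Month", product_grain) in cache:
        return cache[("Month", product_grain)], "cache (exact grain)"
    # 2. Partial reuse: compose a coarser answer from a finer cached grain.
    if product_grain == "Category" and ("Month", "SKU") in cache:
        rolled = {}
        for sku, value in cache[("Month", "SKU")].items():
            cat = SKU_TO_CATEGORY[sku]
            rolled[cat] = rolled.get(cat, 0.0) + value
        return rolled, "cache (rolled up from SKU)"
    # 3. Fallback: go back to the source of truth (stubbed out here).
    return {}, "source query"

print(monthly_sales("SKU"))       # exact hit
print(monthly_sales("Category"))  # composed on the fly from finer grain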


Why This Feels Different

This approach:

  • Reduces duplicate computation

  • Improves cache hit rates organically

  • Aligns performance with meaning

Most importantly:

It allows the system to explain why a result was fast or slow.

That’s rare and valuable.


A Quiet Benefit: Explainability

Because cached data is semantically labelled, you can say:

  • “This result used cached monthly category aggregates”

  • “This part was computed at runtime due to missing grain”

Performance stops being magical.

It becomes understandable.
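Continuing the sketch above: the provenance label returned alongside each result is that explanation, more or less for free (the wording is illustrative):

result, provenance = monthly_sales("Category")
print(f"Served via {provenance}")  # Served via cache (rolled up from SKU)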


Closing Thought

Caching isn’t about speed.

It’s about reusing understanding.

Once you treat it that way, tools like Redis start to look very different.
