A pre-aggregation lets the semantic layer trade freshness for query latency: the matched query reads a pre-built rollup instead of scanning the fact table.

Pre-aggregations — When Sub-Second Beats Real-Time — Semantic Web Academy

Overview

A second is a long time

An interactive dashboard at 5-second latency feels broken; at 500 ms it feels alive. Hitting 500 ms by scanning a 10 TB fact table on every request is rarely practical, which is where pre-aggregation comes in: the engine writes a small, dimensionally keyed rollup table on a schedule, then transparently substitutes it when a query matches.
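As a minimal sketch of that idea, the snippet below builds a daily rollup over a toy fact table in an in-memory SQLite database and answers the same question from both tables. The table names (`orders`, `orders_by_day_region`), the columns, and the `refresh_rollup` helper are hypothetical and chosen only for illustration; a real semantic layer would generate and schedule the equivalent SQL against its own storage target.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Toy fact table: one row per order (hypothetical schema).
conn.execute("CREATE TABLE orders (ts TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2024-01-01T09:15:00", "EU", 10.0),
        ("2024-01-01T17:40:00", "EU", 5.0),
        ("2024-01-01T11:05:00", "US", 7.5),
        ("2024-01-02T08:30:00", "EU", 2.5),
    ],
)


def refresh_rollup(conn):
    """Rebuild the rollup; a scheduler would call this on the refresh interval."""
    conn.execute("DROP TABLE IF EXISTS orders_by_day_region")
    conn.execute(
        """
        CREATE TABLE orders_by_day_region AS
        SELECT date(ts) AS day,
               region,
               SUM(amount) AS total_amount,
               COUNT(*)    AS order_count
        FROM orders
        GROUP BY date(ts), region
        """
    )


refresh_rollup(conn)

# The same question answered by scanning the fact table...
from_fact = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()

# ...and by re-aggregating the much smaller rollup.
from_rollup = conn.execute(
    "SELECT region, SUM(total_amount) FROM orders_by_day_region "
    "GROUP BY region ORDER BY region"
).fetchall()

print(from_fact == from_rollup)
```

The two results agree because `SUM` is additive: daily subtotals can be summed again to any coarser grouping. Rows inserted into `orders` after the last `refresh_rollup` call would be missing from the rollup, which is the freshness being traded away.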

The contract a pre-aggregation makes

Match condition: the live query's measures, dimensions, and filter columns are a subset of the rollup's, and its time grain is the same as or coarser than the rollup's.
Refresh policy: every 30 min, every hour, hourly plus a real-time delta, etc.
Storage target: Snowflake table, Cube Store, ClickHouse, or S3 + Iceberg.
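The match condition can be sketched as a small predicate. This is an illustrative simplification, not any particular engine's matching algorithm: the `Shape` class, the `can_serve` function, and the `GRAINS` ordering are all hypothetical, and the sketch assumes every measure is additive (sums and counts) and that the rollup itself is unfiltered.

```python
from dataclasses import dataclass

# Time grains from finest to coarsest; each one nests cleanly inside the next.
GRAINS = ["minute", "hour", "day", "month", "year"]


@dataclass(frozen=True)
class Shape:
    """The shape of a query or of a rollup (hypothetical structure)."""
    measures: frozenset
    dimensions: frozenset
    grain: str
    filter_dimensions: frozenset = frozenset()


def can_serve(query: Shape, rollup: Shape) -> bool:
    """True if the rollup holds everything needed to answer the query."""
    return (
        query.measures <= rollup.measures
        and query.dimensions <= rollup.dimensions
        # A filter can only be applied if its column survived into the rollup.
        and query.filter_dimensions <= rollup.dimensions
        # The query grain must be the same as, or coarser than, the rollup grain.
        and GRAINS.index(query.grain) >= GRAINS.index(rollup.grain)
    )


rollup = Shape(
    measures=frozenset({"revenue", "order_count"}),
    dimensions=frozenset({"region", "product"}),
    grain="day",
)

monthly_by_region = Shape(
    measures=frozenset({"revenue"}),
    dimensions=frozenset({"region"}),
    grain="month",
    filter_dimensions=frozenset({"product"}),
)
hourly_by_region = Shape(
    measures=frozenset({"revenue"}),
    dimensions=frozenset({"region"}),
    grain="hour",
)

print(can_serve(monthly_by_region, rollup))  # True
print(can_serve(hourly_by_region, rollup))   # False
```

The hourly query fails because a daily rollup has already discarded the hour of each event. Non-additive measures such as distinct counts or medians would need extra handling that this sketch leaves out.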

Why this is a semantic-layer concern, not a warehouse concern

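Materialised views in the warehouse are a flat list: the warehouse does not know how they relate to the metrics users actually query. The semantic layer knows the metric tree and sees the query traffic, so it can pick which rollups to maintain so that a small set covers most real queries, for example 90% of queries with 5% of the storage. It is an instrumentation-driven optimisation, not a guess.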
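One way to picture an instrumentation-driven choice is a greedy pass over the query log: repeatedly keep the candidate rollup that serves the most not-yet-covered query volume per unit of storage, until the storage budget runs out. This is a generic heuristic written for illustration, not the selection method of any specific product, and the function name `pick_rollups`, the candidate names, the costs, and the query counts below are all made up.

```python
def pick_rollups(candidates, query_counts, storage_budget):
    """Greedy selection of rollups under a storage budget.

    candidates:   name -> (storage_cost, set of query shapes it can serve)
    query_counts: query shape -> number of times it appears in the log
    """
    chosen, covered, used = [], set(), 0.0
    remaining = dict(candidates)
    while remaining:
        # Score affordable candidates by newly covered volume per unit of storage.
        scored = []
        for name, (cost, shapes) in remaining.items():
            new_hits = sum(query_counts.get(s, 0) for s in shapes - covered)
            if new_hits > 0 and used + cost <= storage_budget:
                scored.append((new_hits / cost, name))
        if not scored:
            break
        _, best = max(scored)
        cost, shapes = remaining.pop(best)
        chosen.append(best)
        covered |= shapes
        used += cost
    total = sum(query_counts.values())
    hits = sum(query_counts.get(s, 0) for s in covered)
    return chosen, (hits / total if total else 0.0)


# Hypothetical query log, aggregated by query shape.
query_counts = {
    "revenue_by_region_day": 600,
    "revenue_by_product_day": 300,
    "orders_by_customer_hour": 100,
}

# Hypothetical candidates with storage cost in arbitrary units.
candidates = {
    "by_region_day": (1.0, {"revenue_by_region_day"}),
    "by_region_product_day": (4.0, {"revenue_by_region_day", "revenue_by_product_day"}),
    "by_customer_hour": (50.0, {"orders_by_customer_hour"}),
}

chosen, coverage = pick_rollups(candidates, query_counts, storage_budget=5.0)
print(chosen, coverage)
```

With these invented numbers the two cheap rollups fit the budget and cover 900 of the 1,000 logged queries, while the expensive per-customer rollup is left for the warehouse to answer directly.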

The daily-specials board

Pre-aggregations are the printed daily-specials board outside a café: the menu doesn't get re-cooked for every customer; the popular answers are pre-prepared and served instantly, while bespoke orders still hit the kitchen. The trade-off is that the board is right as of this morning, not as of right now. For the 99 customers out of 100 who order the special, that's the right deal.

Reflect

Open your slowest dashboard. What percentage of its panels actually need second-by-second freshness? For the rest, a 30-min rollup is invisible to users and can be 100× cheaper. The anti-pattern is asking the warehouse to re-aggregate the fact table on every page load.

›Which dashboard in your stack would benefit most from a single pre-aggregation — and what is the freshness SLA users actually need?
›Have you ever seen a team scale up warehouse credits to 'fix' dashboard latency that was really a missing pre-aggregation?
