Reverse ETL & Text-to-Metric (LLM-Friendly Semantic Layers)

Activating governed metrics in SaaS tools and grounding LLM agents on the metric catalogue, not raw SQL.


Overview

The semantic layer's two newest customers

BI dashboards used to be the only consumer that mattered. Today there are two more, both of which only work if the semantic layer is a real API:

1. Reverse ETL — operational activation

Tools like Hightouch, Census and Polytomic read from the warehouse or the semantic layer and push rows into operational SaaS tools such as Salesforce, HubSpot, Braze, Zendesk and Intercom. The promise: the same customer_health_score that drives the executive dashboard appears as a custom field on every account record a salesperson sees in Salesforce, and as a Braze audience for the lifecycle-marketing email.

Without a semantic layer, reverse ETL produces a fork: the warehouse number and the Salesforce number diverge within a month. With a semantic layer, the same definition drives both — and changes propagate uniformly.
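
The single-definition idea can be sketched in a few lines of Python. This is illustrative only: MetricDefinition, the input fields, the weights, the Customer_Health_Score__c field name and the payload shapes are assumptions made for the example, not the API of Hightouch, Census, Polytomic, Salesforce or Braze. What matters is that every surface calls the same definition rather than re-implementing the formula.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

Row = Mapping[str, float]


@dataclass(frozen=True)
class MetricDefinition:
    """A governed metric: one name, one owner, one formula (hypothetical shape)."""
    name: str
    owner: str
    compute: Callable[[Row], float]


def _health_score(row: Row) -> float:
    # Illustrative weights only; the real formula belongs to the metric owner.
    raw = 0.6 * row["weekly_active_ratio"] + 0.4 * (1.0 - row["open_ticket_ratio"])
    return round(100 * raw, 1)


CUSTOMER_HEALTH_SCORE = MetricDefinition(
    name="customer_health_score",
    owner="analytics",
    compute=_health_score,
)


def dashboard_row(account_id: str, row: Row) -> dict:
    """Value as it would appear on the executive dashboard."""
    return {
        "account_id": account_id,
        CUSTOMER_HEALTH_SCORE.name: CUSTOMER_HEALTH_SCORE.compute(row),
    }


def crm_field_update(account_id: str, row: Row) -> dict:
    """Hypothetical payload a reverse ETL sync might send to the CRM."""
    return {
        "object": "Account",
        "id": account_id,
        "fields": {"Customer_Health_Score__c": CUSTOMER_HEALTH_SCORE.compute(row)},
    }


def in_at_risk_audience(row: Row, threshold: float = 50.0) -> bool:
    """Hypothetical audience rule for a lifecycle-marketing tool."""
    return CUSTOMER_HEALTH_SCORE.compute(row) < threshold
```

Because all three functions delegate to CUSTOMER_HEALTH_SCORE.compute, changing the formula in one place changes the dashboard, the CRM field and the audience together. A fork appears as soon as one of them carries its own copy of the formula.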

2. Text-to-Metric — LLM agents that don't hallucinate

Vanilla 'text-to-SQL' is a hallucination machine: faced with an open-ended schema, the model confidently invents column names. Text-to-metric constrains the LLM to a small typed catalogue (metrics, dimensions, time_grains, filters), so the agent emits structured requests against the semantic layer instead of raw SQL.

// Agent input: 'How did revenue per region trend last quarter?'
// Agent output (validated against the catalogue):
{
  "metrics":   ["revenue"],
  "groupBy":   ["region", "week"],
  "timeGrain": "week",
  "filter":    "last_quarter"
}

Hallucinated identifiers become structurally impossible: the agent can only ask for metrics that exist, on dimensions that exist, at grains that exist. It can still pick the wrong metric for a question, but it cannot invent one. The semantic layer is the LLM's schema.
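
That guarantee depends on a validation step sitting between the agent and the semantic layer. A minimal sketch of such a validator follows. The catalogue contents, the Catalogue class and validate_request are all hypothetical names chosen for illustration. The sketch also assumes that groupBy may name either a dimension or a time grain, mirroring the request above, where week appears in both groupBy and timeGrain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Catalogue:
    """The finite vocabulary the agent is allowed to use."""
    metrics: frozenset
    dimensions: frozenset
    time_grains: frozenset
    filters: frozenset


# Hypothetical catalogue contents, for illustration only.
CATALOGUE = Catalogue(
    metrics=frozenset({"revenue", "orders"}),
    dimensions=frozenset({"region", "product_line"}),
    time_grains=frozenset({"day", "week", "month"}),
    filters=frozenset({"last_quarter", "last_30_days"}),
)

ALLOWED_KEYS = {"metrics", "groupBy", "timeGrain", "filter"}


def validate_request(request: dict, catalogue: Catalogue = CATALOGUE) -> list:
    """Return a list of problems; an empty list means the request is valid."""
    errors = []

    # Reject keys outside the request shape (for example a raw "sql" field).
    for key in sorted(set(request) - ALLOWED_KEYS):
        errors.append(f"unknown key: {key}")

    metrics = request.get("metrics", [])
    if not metrics:
        errors.append("at least one metric is required")
    for metric in metrics:
        if metric not in catalogue.metrics:
            errors.append(f"unknown metric: {metric}")

    # Assumption: groupBy accepts dimensions and time grains alike.
    groupable = catalogue.dimensions | catalogue.time_grains
    for name in request.get("groupBy", []):
        if name not in groupable:
            errors.append(f"unknown dimension: {name}")

    grain = request.get("timeGrain")
    if grain is not None and grain not in catalogue.time_grains:
        errors.append(f"unknown time grain: {grain}")

    flt = request.get("filter")
    if flt is not None and flt not in catalogue.filters:
        errors.append(f"unknown filter: {flt}")

    return errors
```

With this catalogue, the request shown earlier validates cleanly, while a request naming an invented metric such as net_revenue_v2 is rejected before any query runs. One plausible design is to return the error list to the agent so it can retry with names that exist.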

Map vs improvised directions

Text-to-metric is a taxi driver working from a printed local map: the driver (the LLM) follows listed streets exactly, instead of inventing roads from a half-remembered description of the city. Text-to-SQL is the same driver with no map, asked to get to '47 Maple Avenue near the old church': they'll confidently arrive somewhere, just not necessarily at the address you wanted.

Reflect

The semantic layer becomes existential the moment any LLM-powered surface is in production. Without it, you cannot give an agent a finite, governed action space — and an agent without a finite action space is indistinguishable from a random number generator with good prose.

  • Where in your roadmap does an LLM-driven analytics surface appear — and is the team that owns it talking to the team that owns the metric layer?
  • Which 5 metrics would you expose first to an internal Slack-bot agent? That's also the priority list for the contract / ownership / SLA work.
