Key-Value, Wide-Column, Time-Series

When 'just look up by key' beats SQL — and the cost of giving up joins.

0/2 done

Overview

Key-Value, Wide-Column, Time-Series

When 'just look up by key' beats SQL — and the cost of giving up joins.

Why it matters

KV stores (Redis, DynamoDB) win on raw read latency and write throughput. Wide-column (Cassandra) wins on linear scalability for known access patterns. Time-series (Influx, Timescale) wins on retention + downsampling out of the box.

Going deeper

Workload → store cheat-sheet:

WorkloadBest fitWhy
Session cache, rate-limit counterKey-Value (Redis)Sub-ms by-key reads, TTL built in
'Show me Alice's last 50 messages'Wide-columnPartition by user_id, cluster by created_at DESC, single seek
Server CPU metrics, IoT sensor streamTime-seriesNative downsampling, retention, gap-fill
Ad-hoc analytical join across 6 tablesWarehouse / OLAPNone of the three above will help — wrong tool

Cassandra-shaped antipattern: 'we need flexibility, let's add secondary indexes later.' Cassandra's secondary indexes scatter-gather across every node and degrade rapidly. Model per query means: list the queries first, then design one table per query, and accept the write-amplification.

Analogy

These three stores are purpose-built workshop tools, not Swiss Army knives.

  • Key-Value (Redis, DynamoDB) is the coat-check ticket: hand over a number, get your coat back in microseconds. Beautifully fast at one job, hopeless if you ask 'which coats are red?'.
  • Wide-column (Cassandra, ScyllaDB) is the seat number on a stadium ticket: sector + row + seat tells the usher exactly which partition + cluster column to walk to. Brilliant for known access patterns, painful when the question is 'find me all the people wearing hats'.
  • Time-series (InfluxDB, TimescaleDB, Prometheus) is the security-camera archive: append-only by timestamp, automatic downsampling of old footage, retention policies built in. Anything queried by 'what happened between T1 and T2?' belongs here.

The pricing is the same in all three: huge throughput / scale wins, paid for in giving up ad-hoc join freedom.

Make it stick

Use the prompts below to anchor key-value, wide-column, time-series to something you actually own.

  • Where in your stack is a relational DB being used as a key-value cache? That hot path probably belongs in Redis.
  • Pick a Cassandra (or DynamoDB) table you know. Which access pattern was bolted on *after* the schema, and what did it cost in writes?
  • Which time-bucketed query in your warehouse would be 10× cheaper in a real time-series DB?

Reading in progress · 0 of 2 activities done