Iceberg, Delta, Hudi — Table Formats for the Lakehouse

ACID on object storage, time travel, schema evolution — the big idea.

Theory

A 'table format' is a metadata layer above Parquet

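Plain Parquet on S3 has no concept of a transaction. Two concurrent writers can overwrite each other's files; a job that fails partway leaves orphaned files that readers may still pick up; and you cannot UPDATE or DELETE a single row without rewriting the files that contain it yourself.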

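Iceberg, Delta Lake and Hudi solve this by keeping a metadata log that records exactly which Parquet files make up each table snapshot. Iceberg uses JSON metadata files plus Avro manifests, Delta Lake uses a JSON transaction log with Parquet checkpoints, and Hudi uses a timeline of commit files. A write becomes visible only when a new snapshot is committed atomically. They give you: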

  • ACID transactions on object storage (S3, GCS, ADLS).
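  • Time travel: query the table as of an earlier snapshot or timestamp for debugging, audit, or rollback.
  • Schema evolution: add, drop, or rename columns as metadata-only changes, without rewriting data files (Iceberg tracks columns by ID; Delta Lake needs column mapping enabled for rename and drop).
  • Hidden partitioning (Iceberg): partition values are derived from column transforms such as day(ts), so a query filters on the original column and never has to reference a separate partition column.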
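
The snapshot idea behind the ACID and time-travel bullets can be sketched in a few lines of Python. This is a toy in-memory model, not any format's real API: ToyTable, Snapshot and CommitConflict are hypothetical names, and the file names are made up. It assumes a commit is an atomic swap of the "current snapshot" pointer, which is roughly what the real formats arrange through their catalog or log.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    files: frozenset  # data files that make up the table at this version


class CommitConflict(Exception):
    """Raised when another writer committed first."""


@dataclass
class ToyTable:
    # Append-only history; the last entry is the current snapshot.
    snapshots: list = field(default_factory=lambda: [Snapshot(0, frozenset())])

    def current(self) -> Snapshot:
        return self.snapshots[-1]

    def commit(self, base_id: int, added=(), removed=()) -> Snapshot:
        # Optimistic concurrency: succeed only if the writer planned its
        # change against the snapshot that is still current.
        head = self.current()
        if base_id != head.snapshot_id:
            raise CommitConflict(f"based on {base_id}, head is {head.snapshot_id}")
        files = (head.files - frozenset(removed)) | frozenset(added)
        new = Snapshot(head.snapshot_id + 1, files)
        self.snapshots.append(new)
        return new

    def files_as_of(self, snapshot_id: int) -> frozenset:
        # Time travel: old snapshots are never modified, only superseded.
        for snap in self.snapshots:
            if snap.snapshot_id == snapshot_id:
                return snap.files
        raise KeyError(snapshot_id)


table = ToyTable()
s1 = table.commit(0, added=["a.parquet", "b.parquet"])
# A row-level delete rewrites the affected file and swaps it in one commit.
s2 = table.commit(s1.snapshot_id, added=["b2.parquet"], removed=["b.parquet"])

try:
    # A second writer that still thinks snapshot 1 is current is rejected.
    table.commit(s1.snapshot_id, added=["c.parquet"])
except CommitConflict as err:
    print("retry needed:", err)

print(sorted(table.files_as_of(1)))  # the table as it was before the delete
```

Readers always resolve one complete snapshot, so they never see a half-finished write, and a writer that loses the race re-plans against the new head instead of silently overwriting it.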

This is the foundation of the lakehouse: warehouse-class guarantees on cheap object storage, queryable by engines such as Spark, Trino, Snowflake, Flink and DuckDB, though read and write support varies by engine and format.
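
Hidden partitioning, mentioned in the list above, can be illustrated with a small Python sketch. It is an assumption-laden toy rather than Iceberg's implementation: day_transform, manifest and plan_scan are hypothetical names, and the manifest is a hard-coded list. The point it shows is that the partition value is derived from the timestamp column at write time, so the reader filters on the timestamp alone and the planner prunes files on its behalf.

```python
from datetime import date, datetime


def day_transform(ts: datetime) -> date:
    # The partition transform: derive the partition value from the column.
    return ts.date()


# Toy manifest: (file path, partition value recorded at write time).
manifest = [
    ("f1.parquet", date(2024, 1, 1)),
    ("f2.parquet", date(2024, 1, 2)),
    ("f3.parquet", date(2024, 1, 3)),
]


def plan_scan(ts_from: datetime, ts_to: datetime) -> list:
    # The query supplies a timestamp range only; the planner applies the
    # same transform to the bounds and keeps files whose day can overlap.
    lo, hi = day_transform(ts_from), day_transform(ts_to)
    return [path for path, day in manifest if lo <= day <= hi]


files = plan_scan(datetime(2024, 1, 2, 9, 0), datetime(2024, 1, 2, 17, 30))
print(files)
```

Because the transform lives in table metadata rather than in the query, the partitioning scheme can change later without rewriting the queries that read the table.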

Analogy

Plain Parquet files on S3 are loose pages scattered on a desk — the words are all there, but there's no table of contents, no edition number, and if two people scribble at once you get chaos. A table format (Iceberg/Delta/Hudi) is the version-controlled book built on top of those pages: a manifest acts as the table of contents listing exactly which pages make up 'edition 8273', so you can flip back to any past edition (time travel), safely co-author (ACID), and insert a chapter without reprinting the whole book (schema evolution). Same paper underneath — a spine and an index turn it into a library.

Reflect

Picking Iceberg vs Delta vs Hudi is increasingly a vendor-affinity decision (Snowflake Iceberg, Databricks Delta, AWS EMR Hudi). The good news: the concept of a table format is portable. Learn one and you can read the others in an afternoon.

  • Which table format does your warehouse most naturally read AND write to?
  • What's the cost of migration the day you change warehouses — and how would Iceberg's open catalogues change that calculus?
