GraphRAG adds failure modes naive RAG doesn't have — traversal loops, structural hallucination, alias-merge regressions — so it needs graph-specific release gates beyond top-k retrieval and latency.
graphrag_eval:
retrieval: { path_recall_at_5: 0.81, entity_grounding_accuracy: 0.89 }
generation: { claim_support_rate: 0.93, unsupported_claim_rate: 0.04 }
safety: { pii_leak_rate: 0.0 }
release_gate:
fail_if: ['path_recall_at_5 < 0.78', 'unsupported_claim_rate > 0.05']
Gate on path-level grounding, not just chunk recall. A GraphRAG answer can retrieve the right chunks yet assemble them along a wrong path (the celebrity-hub problem from hybrid retrieval). path_recall and entity_grounding_accuracy catch failures that chunk-level recall is blind to.
Block on grounding even when latency is green. A release that meets its p95 SLO but regresses path-support is a quality failure — shipping it trades correctness for speed. Latency being green is never a reason to wave through a grounding regression.
Citation completeness is a gate too. Track the share of claims that carry valid per-claim evidence (from the provenance lesson). A drop there means answers are becoming less auditable even if they still 'sound' right — exactly the slow erosion gates exist to stop.