GraphRAG query planning

Acting as an air traffic controller to route intent and manage traversal depth.

Why it matters

Blindly crawling an entire knowledge graph for every user prompt results in latency explosion and prompt bloat. Query planning is the orchestration layer: it evaluates incoming intent, targets specific seed entry nodes, and locks down an explicit hop budget to extract clean, minimal evidence paths without getting lost in graph noise.

How it actually works

Query planning is the air-traffic-control layer that runs before retrieval. Crawling the whole graph for every prompt makes latency unworkable, so the plan fixes four things up front: the intent, the seed entities, the hop budget, and the retrieval blend.

query_plan:
  question: 'Which exceptions were approved by leaders connected to Acme?'
  classify: { intent: path+policy, requires_hops: 2 }
  retrievers:
    - { kind: vector, top_k: 8 }
    - { kind: graph_traversal, start_entities: [Acme, 'policy exception'], max_hops: 2 }
  re_rank: { features: [path_support, text_relevance, recency] }
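
To make the graph_traversal retriever in the plan above concrete, here is a minimal Python sketch of a breadth-first walk that stops at max_hops. The toy GRAPH, its node names, and the traverse function are hypothetical illustrations, not part of any particular GraphRAG library; a real system would query a graph store instead of a dict.

```python
from collections import deque

# Hypothetical toy graph as an adjacency list (assumption for illustration).
GRAPH = {
    "Acme": ["Dana (VP)", "Acme EU"],
    "Dana (VP)": ["Exception 17"],
    "Acme EU": ["Lee (Director)"],
    "Lee (Director)": ["Exception 42"],
}


def traverse(graph, start_entities, max_hops):
    """Breadth-first walk returning every path of 1..max_hops edges."""
    paths = []
    # Each queue item is the path walked so far, starting at a seed.
    queue = deque([seed] for seed in start_entities)
    while queue:
        path = queue.popleft()
        hops = len(path) - 1
        if hops > 0:
            paths.append(path)
        if hops >= max_hops:
            continue  # hop budget spent: do not expand this path further
        for neighbor in graph.get(path[-1], []):
            if neighbor not in path:  # avoid walking in cycles
                queue.append(path + [neighbor])
    return paths


evidence = traverse(GRAPH, ["Acme"], max_hops=2)
for path in evidence:
    print(" -> ".join(path))
```

With max_hops=2 the walk never reaches "Exception 42", which sits three hops from "Acme". That is the budget doing its job: anything past the limit is neither fetched nor added to the prompt.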

Lock the hop budget before execution. requires_hops (the classifier's estimate of how many hops the question needs) and max_hops (the limit the traversal enforces) are the variables that most control latency and noise. Setting them per query, from the classified intent, rather than using one global default is what keeps p95 latency stable across easy and hard questions.
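
A per-intent hop budget can be as small as a lookup table with a hard ceiling. The sketch below is one way to do it; the intent names other than path+policy, the budget values, and GLOBAL_HOP_CAP are assumptions chosen for illustration and should be tuned against your own latency data.

```python
# Hypothetical intent classes and budgets (assumptions, not measured values).
HOP_BUDGET_BY_INTENT = {
    "lookup": 0,        # single-entity fact: no traversal needed
    "relationship": 1,  # direct neighbours only
    "path+policy": 2,   # multi-step questions like the example plan
}
DEFAULT_HOP_BUDGET = 1  # used when the classifier returns an unknown intent
GLOBAL_HOP_CAP = 3      # hard ceiling so no single query can run away


def hop_budget(intent, requires_hops=None):
    """Pick max_hops for one query from its classified intent."""
    budget = HOP_BUDGET_BY_INTENT.get(intent, DEFAULT_HOP_BUDGET)
    if requires_hops is not None:
        # Trust the classifier's estimate when it asks for more hops...
        budget = max(budget, requires_hops)
    # ...but never exceed the global ceiling.
    return min(budget, GLOBAL_HOP_CAP)


print(hop_budget("path+policy", requires_hops=2))
```

The ceiling matters as much as the table: a misclassified query that asks for ten hops is clamped to the cap instead of taking down p95.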

Plan the failure path too. Seed-entity extraction can fail, for example when the question names an entity that isn't in the graph. A robust plan has a fallback (drop to pure vector search, or ask a clarifying question) instead of returning an empty traversal. Planning is not just the happy path; it's deciding what to do when the graph can't help.
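
The fallback can be decided at planning time, before any traversal runs. This is a rough sketch under stated assumptions: build_plan, the on_missing_seeds switch, and the shape of the returned dict are hypothetical names invented here, and entity resolution is reduced to a set membership check.

```python
def build_plan(candidate_entities, known_entities, max_hops,
               on_missing_seeds="vector_only"):
    """Return a retrieval plan, falling back when no seed entity resolves."""
    # Keep only the candidates that actually exist in the graph.
    seeds = [e for e in candidate_entities if e in known_entities]
    vector = {"kind": "vector", "top_k": 8}

    if seeds:
        traversal = {
            "kind": "graph_traversal",
            "start_entities": seeds,
            "max_hops": max_hops,
        }
        return {"action": "retrieve", "retrievers": [vector, traversal]}

    if on_missing_seeds == "clarify":
        missing = ", ".join(candidate_entities) or "any known entity"
        return {
            "action": "clarify",
            "message": f"I could not find {missing} in the graph. "
                       "Which entity do you mean?",
        }

    # Default fallback: pure vector search, no traversal at all.
    return {"action": "retrieve", "retrievers": [vector]}


known = {"Acme", "policy exception"}
print(build_plan(["Acme"], known, max_hops=2))
print(build_plan(["Globex"], known, max_hops=2))
print(build_plan(["Globex"], known, max_hops=2, on_missing_seeds="clarify"))
```

Either fallback beats an empty traversal: the caller always gets back a plan it can execute or a question it can ask.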

Done well, planning turns GraphRAG from 'traverse everything and hope' into 'retrieve the minimal evidence set this specific question needs'.

Analogy

Query planning is a flight plan filed before take-off. You don't improvise altitude and fuel mid-air — you set the route, the limits and the diversion airport on the ground. The hop budget is your fuel limit; the fallback is your diversion airport.

Pitfalls & how to avoid them

  • One global hop budget. Symptom: easy queries slow, hard queries truncated. Fix: per-intent hop budget.
  • No fallback when seeds fail. Symptom: empty answers. Fix: drop to vector search or clarify.
  • Planning after retrieval. Symptom: wasted traversal. Fix: classify intent first.
  • Ignoring re-rank features. Symptom: results ordered by text similarity alone. Fix: declare path_support + relevance + recency in the plan.
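
For the last pitfall, declaring re-rank features can be as simple as a weighted sum over the three features named in the plan. The weights, the candidate records, and the assumption that every feature is already scaled to the 0..1 range are illustrative choices made here, not values from a real system.

```python
# Hypothetical weights (assumption); each feature is expected in the 0..1 range.
WEIGHTS = {"path_support": 0.5, "text_relevance": 0.3, "recency": 0.2}


def rerank(candidates, weights=WEIGHTS):
    """Sort candidates by a weighted sum of their declared features."""
    def score(candidate):
        # A missing feature counts as 0.0 rather than raising an error.
        return sum(w * candidate.get(name, 0.0) for name, w in weights.items())
    return sorted(candidates, key=score, reverse=True)


candidates = [
    {"id": "text-only", "path_support": 0.1, "text_relevance": 0.9, "recency": 0.5},
    {"id": "graph-backed", "path_support": 0.9, "text_relevance": 0.4, "recency": 0.5},
]
for c in rerank(candidates):
    print(c["id"])
```

Here the graph-backed candidate outranks the one that merely reads as similar, because path_support carries the most weight. Shifting weight toward text_relevance would reverse the order, which is why the features belong in the plan rather than in a hidden default.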

Apply it to your system

Take one hard question your system gets.

  • What intent class is it, and how many hops does it really need?
  • Which seed entities would you start the traversal from?
  • What happens today when those seed entities aren't found in the graph?
