Which of these is the safest way to include a user-supplied name in a SPARQL query?

Binding the value as an xsd:string literal via a parameter API

Wrapping the value in angle brackets (<...>) to make it an IRI

f-string interpolation into the query text

Spotting SPARQL Injection

Why concatenating user input into a SPARQL query is dangerous.

Theory

Imagine an app that searches authors by name:

query = f'SELECT ?b WHERE {{ ?b :author "{user_input}" }}'

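If user_input is Alice" } UNION { ?b ?p ?o, the quote inside the input closes the string literal early, and everything after it is parsed as SPARQL syntax rather than data. This exact fragment leaves a stray quote behind, but an attacker tunes the payload to the template (for example, ending it with # to comment out the leftover " }) until the store accepts it and returns data the search was never meant to expose. If the same string is sent as a SPARQL Update request, it gets worse: the attacker can run INSERT DATA or DROP GRAPH.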
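To make the problem visible, here is a minimal sketch that only builds the query text and prints it; nothing is sent to a store. The build_query helper is a hypothetical name wrapping the f-string from above.

```python
def build_query(user_input: str) -> str:
    # Vulnerable on purpose: the value is pasted straight into the query text.
    return f'SELECT ?b WHERE {{ ?b :author "{user_input}" }}'


benign = build_query("Alice")
hostile = build_query('Alice" } UNION { ?b ?p ?o')

print(benign)   # SELECT ?b WHERE { ?b :author "Alice" }
print(hostile)  # SELECT ?b WHERE { ?b :author "Alice" } UNION { ?b ?p ?o" }

# The benign query has one string literal (two quotes). In the hostile one the
# literal ends right after Alice, so the rest of the input sits outside it.
print(benign.count('"'), hostile.count('"'))  # 2 3
```

The point is where the literal ends, not whether this particular payload parses: once the input can close the quote, the attacker is writing query syntax.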

Rule: never splice untrusted text into a SPARQL string. Bind it as a typed term.

Theory

Going deeper — why escaping is not enough

Beginners reach for 'just escape the quotes'. That fails for the same reason it fails in SQL: you are trying to out-parse the parser. There is always another encoding (Unicode escapes, alternate delimiters, language tags, datatype IRIs) that your hand-rolled filter missed. The only robust fix is to never build query syntax from data at all — keep the query template constant and inject values through the engine's own term-binding API, which treats them as opaque RDF terms that cannot become syntax.

Defense in depth, in priority order:

1. Parameterize: initBindings (rdflib), ParameterizedSparqlString (Jena), or pre-bound variables. This is the actual fix.
2. Least privilege: give the app's endpoint credentials read-only access to a single graph, so even a successful injection can't DROP GRAPH or INSERT DATA.
3. Separate read and write endpoints: never expose SPARQL UPDATE on the same endpoint that serves user-facing search.
4. Validate at the boundary: reject absurd inputs early, but treat this as a usability guard, never as your security boundary.
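The last item can be sketched as a small boundary check. This is an illustration only: check_author_name and the 200-character limit are assumptions, not part of any library.

```python
import unicodedata

MAX_NAME_LENGTH = 200  # hypothetical limit, pick one that suits your data


def check_author_name(raw: str) -> str:
    """Reject obviously absurd input early. This is NOT an injection defense."""
    name = raw.strip()
    if not name:
        raise ValueError("author name is empty")
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError("author name is too long")
    # Unicode category "Cc" covers control characters such as NUL or newline.
    if any(unicodedata.category(ch) == "Cc" for ch in name):
        raise ValueError("author name contains control characters")
    return name


# Quotes and braces are legitimate in names (think O'Brien), so the injection
# payload passes this check unchanged. Parameterization still has to do the work.
print(check_author_name('Alice" } UNION { ?b ?p ?o'))
```

Because the payload passes the check untouched, the guard improves error messages and nothing more; the value still has to be bound as a term.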

Analogy

It's the SemWeb cousin of SQL injection. The store doesn't care that you thought you were writing a query — it just executes whatever syntactically valid SPARQL it receives.

And SPARQL's blast radius is often worse than SQL's: graph stores are frequently deployed with SPARQL Update operations (INSERT DATA, DELETE DATA, DROP GRAPH) reachable through the same service and credentials as queries, so a single unescaped quote can let an attacker rewrite or wipe whole tenant graphs, not just exfiltrate a few rows.

Worked example — BIND to defuse injection

Instead of splicing user text into the query string, declare a variable and BIND a typed value to it before the triple pattern uses it:

PREFIX : <http://example.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?book WHERE {
  BIND("Alice"^^xsd:string AS ?name)
  ?book :author ?name .
}

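BIND(value AS ?var) assigns an RDF term to ?var, and the triple pattern then matches against that term. The protection depends on how the value gets there: it has to arrive as an already-built term. Pasting user text between the quotes of the BIND is just as injectable as the original query.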

In real apps you don't write the BIND yourself. Your SPARQL library supplies the value as a term when you call initBindings={'name': Literal(user_input)} (rdflib) or pss.setLiteral("name", user_input) (Jena). Same outcome: the value reaches the query as a literal, not as raw text you concatenated.

Reflect

Where in your stack does untrusted text ever reach a SPARQL string? Web forms, REST query parameters, chat messages routed to an LLM agent…

Map every entry point on a napkin: each one is a candidate for the prepareQuery / ParameterizedSparqlString treatment. The LLM-agent path is the easiest to miss — generated SPARQL is also untrusted text from the store's point of view.

›Name one place in a typical web app where user text hits a SPARQL query.
›What's the worst thing an attacker could do via SPARQL UPDATE if injection is possible?
