Permission-aware retrieval, in one page

A terse explanation of how Kaldros keeps every answer scoped to what the asker is actually allowed to see — without doing it at prompt time.

Author
Kaldros

The single hardest thing about enterprise search is not ranking. It is getting the permissions right. If the retrieval layer surfaces a document the asker cannot legitimately open, no amount of grounding in the language model will save the answer. A leaked sentence is a leak regardless of citation.

We take a strict approach. Permissions are mirrored from the source system into a per-tenant access graph at ingest time, along with the document itself. Every edge — a user belonging to a Slack channel, a file inherited from a Google Drive folder, a Linear project restricted to a team — is written as a signed event into the same change log that carries content updates. The graph is kept tight rather than clever: resolution is a bounded traversal, not an open-ended reasoning task.
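To make the idea of signed edge events and bounded resolution concrete, here is a minimal sketch. It is an illustration only, not Kaldros's implementation: the event shape, the `AccessGraph` class, the HMAC signing scheme, the demo key, and the depth limit of 4 are all assumptions chosen for the example.

```python
import hashlib
import hmac
import json
from collections import defaultdict, deque

# Hypothetical per-tenant signing key; a real system would load it from a key store.
TENANT_KEY = b"demo-tenant-key"


def sign_event(event, key=TENANT_KEY):
    """Return an HMAC-SHA256 signature over a canonical JSON encoding of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


class AccessGraph:
    """Edges point from a member (user, file, group) to the container it belongs to."""

    def __init__(self):
        self.parents = defaultdict(set)

    def apply(self, event, signature):
        """Apply one change-log event; reject it if the signature does not verify."""
        if not hmac.compare_digest(sign_event(event), signature):
            return False
        if event["op"] == "add_edge":
            self.parents[event["member"]].add(event["container"])
        elif event["op"] == "remove_edge":
            self.parents[event["member"]].discard(event["container"])
        return True

    def principals(self, subject, max_depth=4):
        """Bounded breadth-first traversal: principals reachable within max_depth hops."""
        seen = {subject}
        queue = deque([(subject, 0)])
        while queue:
            node, depth = queue.popleft()
            if depth == max_depth:
                continue  # the bound: never expand past max_depth
            for parent in self.parents.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append((parent, depth + 1))
        return seen


graph = AccessGraph()
events = [
    {"op": "add_edge", "member": "user:ana", "container": "group:eng"},
    {"op": "add_edge", "member": "group:eng", "container": "slack:platform"},
]
for event in events:
    graph.apply(event, sign_event(event))

print(sorted(graph.principals("user:ana")))
```

The depth limit is what keeps resolution a fixed-cost lookup: a cycle or a pathologically deep group hierarchy cannot turn one query into an unbounded walk.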

At query time the retrieval layer runs two filters in sequence. The first is a shallow visibility check: is the asker in any principal set that the document is attached to? That narrows the candidate pool without touching the vector index. The second filter runs inside the index itself — we store principal masks alongside each embedding so that the nearest-neighbour search cannot return chunks the asker is not entitled to, even before re-ranking.
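The two filters can be sketched in a few lines. This is a toy under stated assumptions, not the production index: the bit assignment in `PRINCIPAL_BITS`, the in-memory `INDEX` list, and the brute-force cosine scan stand in for whatever mask encoding and vector index are actually used.

```python
import math

# Hypothetical bit assignment: each principal in the tenant gets one bit.
PRINCIPAL_BITS = {"user:ana": 0, "group:eng": 1, "group:finance": 2}


def mask_of(principals):
    """Pack a collection of principal names into an integer bitmask."""
    mask = 0
    for name in principals:
        mask |= 1 << PRINCIPAL_BITS[name]
    return mask


# Document-level principal sets, used by the shallow visibility check.
DOC_PRINCIPALS = {
    "doc:roadmap": {"group:eng"},
    "doc:payroll": {"group:finance"},
}

# Toy index rows: (chunk id, parent document, embedding, principal mask).
INDEX = [
    ("roadmap#0", "doc:roadmap", (1.0, 0.0), mask_of({"group:eng"})),
    ("roadmap#1", "doc:roadmap", (0.6, 0.8), mask_of({"group:eng"})),
    ("payroll#0", "doc:payroll", (0.9, 0.1), mask_of({"group:finance"})),
]


def visible_docs(asker_principals):
    """Filter 1: keep documents attached to at least one of the asker's principals."""
    return {doc for doc, allowed in DOC_PRINCIPALS.items() if allowed & asker_principals}


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def search(query, asker_principals, k=2):
    """Filter 2: nearest-neighbour scan that skips chunks whose mask excludes the asker."""
    candidates = visible_docs(asker_principals)
    asker_mask = mask_of(asker_principals)
    scored = []
    for chunk_id, doc, embedding, chunk_mask in INDEX:
        if doc not in candidates:
            continue  # narrowed out before any vector arithmetic
        if (chunk_mask & asker_mask) == 0:
            continue  # mask check happens inside the scan, before scoring
        scored.append((cosine(query, embedding), chunk_id))
    scored.sort(reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]


print(search((1.0, 0.0), {"user:ana", "group:eng"}))
```

Note that `payroll#0` is the second-closest vector to the query, yet it never enters the scored list for an asker outside `group:finance`. Filtering before scoring, rather than after a top-k cut, also means restricted chunks cannot crowd permitted ones out of the result set.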

The prompt never sees a redaction step, because there is nothing to redact. Chunks that did not pass the filter were never retrieved. The language model is asked to ground an answer in the chunks it was given, and those chunks are by construction within the asker's reach.

Two practical consequences follow. The first is that our latency budget for permission resolution is fixed and measured; a slow permission system degrades every query, so permission-resolution latency sits on our hot-path dashboards. The second is that permission changes propagate in seconds, not minutes. A revocation in Google Workspace removes the relevant edges from the graph, and from the next query onward the now-unreachable chunks no longer appear in the result set.
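Both consequences can be illustrated with a small sketch. Everything named here is hypothetical: the `TenantPermissions` class, the grant/revoke event shape, and the 5 ms budget are assumptions for the example, since the post does not state the actual figures or data model.

```python
import time

# Hypothetical budget; the post does not give a number.
PERMISSION_BUDGET_MS = 5.0


class TenantPermissions:
    def __init__(self):
        self.edges = set()         # (principal, document) pairs currently granted
        self.slow_resolutions = 0  # counter a dashboard could scrape

    def apply(self, event):
        """Apply one change-log event in arrival order."""
        edge = (event["principal"], event["document"])
        if event["op"] == "grant":
            self.edges.add(edge)
        elif event["op"] == "revoke":
            self.edges.discard(edge)

    def reachable(self, principal):
        """Resolve reachable documents, timing the call against the budget."""
        started = time.perf_counter()
        docs = {doc for who, doc in self.edges if who == principal}
        elapsed_ms = (time.perf_counter() - started) * 1000.0
        if elapsed_ms > PERMISSION_BUDGET_MS:
            self.slow_resolutions += 1
        return docs


perms = TenantPermissions()
perms.apply({"op": "grant", "principal": "user:ana", "document": "drive:q3-plan"})
before = perms.reachable("user:ana")

# The revocation arrives through the same change log as the grant did.
perms.apply({"op": "revoke", "principal": "user:ana", "document": "drive:q3-plan"})
after = perms.reachable("user:ana")

print(before, after)
```

Because the revocation mutates the same structure the query path reads, there is no separate cache to invalidate in this sketch: the first resolution after the event already reflects it.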

There is more to say about group-of-groups resolution, about how we handle tenants that enforce field-level redaction inside single documents, and about what happens when a source system lies about its own ACL. We will get to those in later posts. For now, the headline: retrieval happens inside the permission boundary, not next to it.