Reference architecture for KoraSafe deployments.

What runs where, what crosses which boundary, and how a regulatory delta becomes an enforced control. This page is the technical map of the platform; the sibling pages go deeper on individual subsystems.

Three planes

Control, data, and catalog

KoraSafe is organized as three planes with distinct trust boundaries, data ownership, and deployment shape. The control plane is the SaaS surface. The data plane is where customer traffic is evaluated. The catalog plane is the authoritative regulatory output that both other planes consume.

Control plane: where policy is authored and evidence is read

KoraSafe-hosted SaaS surface. Policy authoring, dashboards, evidence retrieval, regulatory feed, admin. Holds tenant configuration and audit records. No customer inference traffic transits this plane.

  • Web app and API (serverless)
  • Multi-tenant Postgres (RLS)
  • Workers for async evaluation
  • Object store for evidence packages

Data plane: where customer traffic is evaluated

Runs inside the customer environment or as a hybrid edge agent. Inline policy enforcement, finding emission, telemetry capture. Returns governance signals to the control plane; raw content does not leave the customer environment unless the customer's policy says it does.

  • Gateway, sidecar, or embedded SDK
  • Hybrid edge agent (optional, for residency)
  • Browser and IDE extensions
  • Connectors to existing detection stacks

Catalog plane: the authoritative regulatory output

Shared, read-only across tenants. The output of the regulatory intelligence pipeline: approved obligations, mapped controls, sector packs, vector embeddings. Both other planes resolve against it.

  • Obligation catalog (queryable via API)
  • Sector packs (HIPAA, NAIC, EU AI Act, etc.)
  • Vector embeddings for cross-framework mapping
  • Audit log of every editorial decision
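
To illustrate how vector embeddings could support cross-framework mapping, the following is a minimal sketch of a cosine-similarity lookup. The toy vectors, the obligation identifiers, and the `nearest_obligations` helper are all hypothetical; this page does not describe the embedding model or the vector store's query interface.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_obligations(query_vec, index, top_k=2):
    """Rank indexed obligations by similarity to the query vector (hypothetical helper)."""
    scored = [(cosine_similarity(query_vec, vec), oid) for oid, vec in index.items()]
    scored.sort(reverse=True)
    return [(oid, score) for score, oid in scored[:top_k]]

# Toy three-dimensional embeddings; the identifiers are made up for illustration.
INDEX = {
    "hipaa:example-a": [0.9, 0.1, 0.0],
    "eu-ai-act:example-b": [0.8, 0.2, 0.1],
    "naic:example-c": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for oid, score in nearest_obligations([1.0, 0.0, 0.0], INDEX):
        print(oid, round(score, 3))
```

Obligations from different frameworks that land close together in the embedding space become candidates for a shared control.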

Deployment shapes

Managed, hybrid, and air-gap

Most customers run in the managed-cloud shape: KoraSafe hosts everything, and you wire in connectors and SDKs. Teams with data-residency, regulatory, or sovereignty requirements deploy the hybrid edge agent in their own environment, so raw content stays there unless their policy allows otherwise. Air-gap uses the same edge agent with outbound telemetry disabled.

Managed cloud: SaaS end to end

Default deployment. KoraSafe hosts the control plane and a hosted data-plane gateway. Customer traffic transits the gateway over TLS. Findings, evidence, and audit-chain entries persist in the tenant-scoped Postgres.

  • Fastest path to production
  • No customer-side infrastructure
  • EU and US regions available

Hybrid: control in SaaS, data in your environment

The edge agent runs as a container or VM inside the customer environment. Policy enforcement, PII detection, and redaction happen locally. Only governance signals (decisions, redacted findings, audit events) leave the environment. Telemetry is sent over mTLS. The customer controls what crosses the boundary.

  • Container image and Helm chart published
  • Sidecar, gateway, or SDK embedding
  • Designed for regulated verticals
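
The local redaction step described above could look roughly like the sketch below. The two regex patterns, the `redact` function, and the shape of the governance signal are assumptions made for illustration; real PII detection needs much broader coverage, and this page does not specify the edge agent's detectors or its telemetry schema.

```python
import re

# Hypothetical patterns for illustration only.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text):
    """Return (redacted_text, findings).

    Findings record the type and the span in the original text,
    never the raw matched value.
    """
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append({"type": label, "start": match.start(), "end": match.end()})
    redacted = text
    for label, pattern in PATTERNS.items():
        redacted = pattern.sub(f"[{label}]", redacted)
    return redacted, findings

def governance_signal(decision, text):
    """Build the payload that would leave the environment (assumed shape)."""
    redacted, findings = redact(text)
    return {"decision": decision, "findings": findings, "redacted_preview": redacted}

if __name__ == "__main__":
    print(governance_signal("redacted", "Contact jane@example.com, SSN 123-45-6789"))
```

The point of the design is that the raw string never appears in the outgoing signal; only the decision, the finding types, and the redacted text do.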

Air-gap: edge agent with telemetry disabled

Same edge-agent artifact as hybrid, configured with outbound telemetry off. Catalog updates ship as signed bundles the customer pulls into the environment on a schedule. Evidence stays local; the customer's auditor verifies signatures against the bundle.

  • Offline catalog refresh
  • Locally signed audit chain
  • Available for government and defense use cases

Component map

What runs where

The platform's moving parts and where each one lives. Components in the customer environment are marked accordingly; everything else runs in the KoraSafe-hosted control plane.

| Component | Responsibility | Location |
| --- | --- | --- |
| Web app | Dashboards, policy authoring, evidence retrieval, admin. React frontend served from the edge. | Control plane |
| API | Serverless handlers behind a common error, auth, rate-limit, and org-scope middleware. Synchronous endpoints for reads and policy decisions. | Control plane |
| Workers | Asynchronous evaluation, document extraction, classifier runs, evidence pack assembly, scheduled catalog refresh. Long-running paths that should not block the request handler. | Control plane |
| Postgres | System of record. Tenant configuration, findings, policy decisions, audit-chain entries, obligation snapshots. Row-level security on every table in the public schema. | Control plane |
| Vector store | Embeddings for obligations, controls, and citations. Powers cross-framework similarity search and obligation-to-control auto-mapping. | Control plane |
| Object store | Source-document binaries, generated evidence packages, retained probe transcripts. Tenant-scoped, signed-URL retrieval only. | Control plane |
| Edge agent | Optional. Inline policy enforcement, redaction, and telemetry collection inside the customer's environment for teams that cannot send raw content to a hosted plane. | Customer environment |
| Connectors | Bidirectional adapters to detection providers, observability tools, and identity sources. Findings federate into KoraSafe; policy decisions can fan out to existing tooling. | Customer + Control plane |
| Browser and IDE extensions | Chrome, VS Code, and JetBrains surfaces that emit governance signals (shadow AI access, workspace AI activity) and display context in-product. | Customer environment |

Regulatory intelligence pipeline

From regulator publication to enforced control

The catalog plane is built by a named eight-stage pipeline. Each stage is independently observable, individually retryable, and produces evidence the next stage can audit. The pipeline is the spine of every Guardian finding, every audit package, and every governance index movement.

1. Watcher

Polls regulator and provider sources on a per-source cadence. Detects new publications, amendments, and revocations.

2. Adapter

Normalizes raw source documents (HTML, PDF, JSON, RSS) into a canonical document model with stable identifiers, preserved citations, and source provenance.

3. Extractor

RAG-driven extraction of obligations, controls, definitions, and effective dates. Each extracted fact carries a span back to the source document for audit.

4. Approver

Editorial review queue. Extracted obligations require human sign-off before they enter the catalog. Every approval, edit, and rejection is logged to the audit trail.

5. Embed

Generates vector embeddings for every approved obligation, control, and citation. Embeddings power cross-framework similarity search and obligation-to-control auto-mapping.

6. Curator

Builds sector packs and overlay manifests. Curators bundle obligations into shippable packs with autonomy defaults and policy templates.

7. Control

Wires obligations to enforceable controls in the customer org. Subscribed packs install controls into the policy plane and the Guardian detection rule set.

8. Catalog

The published catalog, queryable through the API and surfaced in the regulatory intelligence feed. The single source of truth for every downstream consumer.
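
The properties named above (independently observable, individually retryable, evidence for the next stage) can be sketched as a small runner. The stage names come from this page; `StageRecord`, `run_pipeline`, and the retry count are hypothetical. The sketch also treats the Approver as a plain callable, whereas the real stage waits on human sign-off.

```python
from dataclasses import dataclass

STAGE_NAMES = ["Watcher", "Adapter", "Extractor", "Approver",
               "Embed", "Curator", "Control", "Catalog"]

@dataclass
class StageRecord:
    """Evidence left behind by one stage run (hypothetical shape)."""
    stage: str
    attempts: int
    ok: bool

def run_pipeline(stages, payload, max_attempts=3):
    """Run (name, function) stages in order, retrying each one on its own.

    Returns the final payload and one StageRecord per stage that ran.
    Stops at the first stage that exhausts its attempts.
    """
    evidence = []
    for name, fn in stages:
        for attempt in range(1, max_attempts + 1):
            try:
                payload = fn(payload)
                evidence.append(StageRecord(name, attempt, True))
                break
            except Exception:
                if attempt == max_attempts:
                    evidence.append(StageRecord(name, attempt, False))
                    return payload, evidence
    return payload, evidence

if __name__ == "__main__":
    # Each toy stage just appends its own name to the payload.
    stages = [(n, (lambda p, n=n: p + [n])) for n in STAGE_NAMES]
    result, evidence = run_pipeline(stages, [])
    print(result)
    print(evidence)
```

Because each stage records its own outcome, a failure in one stage can be retried without rerunning the stages before it.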

Reference flow A

Regulatory delta to enforced policy

How a new regulator publication ends up gating a runtime action in the customer environment.

Source publication: The Watcher detects a new amendment from a tracked regulator or provider. The raw document lands in the staging store with the source URL and the fetch timestamp.

Normalize and extract: Adapter converts the document to the canonical model. Extractor identifies obligations, controls, definitions, and effective dates. Each extracted fact carries a span back to the source.

Editorial approval: Approver routes the extracted obligations to the editorial queue. A human reviewer accepts, edits, or rejects each one. Every decision lands in the audit log.

Catalog publish: Approved obligations enter the catalog. Embeddings are generated and indexed. The obligation becomes addressable via GET /api/v1/obligations/:id.

Pack curation: Curator bundles the obligation into the affected sector pack (for example nyc-ll144 or eu-ai-act). Subscribed customers receive a notification that a pack has a pending upgrade with a diff and preserved overrides.

Control installation: On accept, the pack's controls install into the customer's policy plane. Guardian detection rules update. Affected systems are tagged for re-validation.

Runtime enforcement: The next request that touches an affected system is evaluated against the new control. A finding is emitted if the request fails; a policy decision is logged either way. Both feed the audit chain.
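
The pack-upgrade step (a diff plus preserved overrides) can be illustrated with plain dictionaries. The control identifiers, the settings keys, and both helper functions are hypothetical; the actual pack and override formats are not described on this page.

```python
def diff_pack(installed, incoming):
    """Compare two pack versions, each a dict of control id -> settings dict."""
    added = sorted(set(incoming) - set(installed))
    removed = sorted(set(installed) - set(incoming))
    changed = sorted(
        cid for cid in set(installed) & set(incoming)
        if installed[cid] != incoming[cid]
    )
    return {"added": added, "removed": removed, "changed": changed}

def apply_upgrade(incoming, overrides):
    """Overlay customer overrides on the incoming pack.

    Overrides win key by key. Overrides for controls that no longer
    exist in the pack are reported rather than silently dropped.
    """
    merged = {
        cid: {**settings, **overrides.get(cid, {})}
        for cid, settings in incoming.items()
    }
    orphaned = sorted(set(overrides) - set(incoming))
    return merged, orphaned

if __name__ == "__main__":
    installed = {"c1": {"action": "block"}, "c2": {"action": "review"}}
    incoming = {"c1": {"action": "block", "threshold": 0.8}, "c3": {"action": "redact"}}
    overrides = {"c1": {"action": "review"}}
    print(diff_pack(installed, incoming))
    print(apply_upgrade(incoming, overrides))
```

Showing the diff before install lets the customer see what will change, while the overlay keeps their local choices across pack versions.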

Reference flow B

Runtime decision to evidence pack

How a single inference request becomes auditor-grade evidence.

Request enters the data plane: An AI agent (the customer's, not KoraSafe's) makes a model call. The call hits the gateway, the sidecar, the embedded SDK, or the edge agent depending on the deployment shape.

Policy evaluation: The enforcer resolves the request against the installed control set. PII detection, bias guards, autonomy limits, and per-obligation rules run inline. Latency budget is bounded per call.

Finding emission: If a control fails, a finding is emitted with the matched obligation citation, the deciding policy version, and the evidence span. The action is allowed, redacted, blocked, or routed for human review per the policy decision.

Audit-chain entry: Every policy decision (pass or fail) appends to the tenant's audit chain. Entries are content-hashed and chained to the previous entry, so any tamper attempt breaks the chain at the modified record.

Evidence package: An evidence package query (manual or scheduled) walks the audit chain for the requested time window, attaches the matched obligations and policy versions, and produces a signed bundle the external auditor can verify independently.
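
The content-hashed, chained audit entries described above can be sketched with SHA-256 from the standard library. The entry fields, the genesis value, and the function names are assumptions. The sketch covers only the hash chain and leaves out the signing of entries and bundles.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the first entry's predecessor

def entry_hash(prev_hash, body):
    """Hash the previous entry's hash together with a canonical JSON body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()

def append_entry(chain, body):
    """Append a policy decision to the chain and return the new entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    entry = {"body": body, "prev_hash": prev, "hash": entry_hash(prev, body)}
    chain.append(entry)
    return entry

def verify_chain(chain):
    """Return the index of the first broken entry, or None if the chain is intact."""
    prev = GENESIS
    for i, entry in enumerate(chain):
        if entry["prev_hash"] != prev or entry["hash"] != entry_hash(prev, entry["body"]):
            return i
        prev = entry["hash"]
    return None

if __name__ == "__main__":
    chain = []
    append_entry(chain, {"decision": "allow", "policy_version": "v1"})
    append_entry(chain, {"decision": "block", "policy_version": "v1"})
    append_entry(chain, {"decision": "allow", "policy_version": "v2"})
    print(verify_chain(chain))            # intact chain
    chain[1]["body"]["decision"] = "allow"
    print(verify_chain(chain))            # index of the modified record
```

Because each hash covers the previous hash, editing a past body invalidates that entry, and the verifier reports the position of the modified record.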

Integrations reference

The federation surface, by function

KoraSafe is a federation point, not a black box. Customer-side detection, observability, identity, and evidence sources connect through documented contracts. Each integration belongs to one of five categories. The live roster, OAuth flows, and webhook contracts live on the integrations page.

| Category | Role in the architecture | Examples |
| --- | --- | --- |
| Detection providers | External classifiers and guardrails federate into the same finding stream as native KoraSafe classifiers. Each provider's recognizer is preserved as the finding's provenance. | Presidio, Lakera Guard, Bedrock Guardrails, Portkey, LangSmith, Fiddler, watsonx.governance |
| Observability and telemetry | AI-tracing tools forward call records, latency, and tool-use telemetry. The data plane joins these signals to the governance evaluation that ran inline. | Datadog AI Observability, OpenTelemetry receivers |
| Identity and access | SSO and directory sync. JWT claims carry tenant org_id for every authenticated path; service accounts get scoped API keys. | SAML 2.0 SSO, Okta, Azure AD, Google Workspace |
| Shadow AI signals | Finance, identity, and code-workspace signals feed the discovery layer so unsanctioned AI use surfaces from multiple independent vantage points. | AWS CUR, Okta sign-in logs, Azure AD usage, GitHub workspace activity, VS Code, JetBrains, Chrome |
| Agent protocols | Programmatic surfaces for agent-to-agent and agent-to-platform integration. Agents query findings, retrieve control status, and submit audit evidence over standardized contracts. | MCP gateway, A2A protocol, REST API, HMAC-signed webhooks |
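
For the HMAC-signed webhooks listed above, a receiver would typically recompute the signature over the raw request body and compare it in constant time. The sketch below assumes a hex-encoded HMAC-SHA256 digest and a shared secret; the actual header name, encoding, and any timestamp scheme are defined by the webhook contract, not by this page.

```python
import hashlib
import hmac

def sign_payload(secret: bytes, payload: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw payload (encoding is an assumption)."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()

def verify_signature(secret: bytes, payload: bytes, signature_hex: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_payload(secret, payload)
    return hmac.compare_digest(expected, signature_hex)

if __name__ == "__main__":
    secret = b"example-shared-secret"          # hypothetical value
    body = b'{"event":"finding.created"}'      # hypothetical event payload
    signature = sign_payload(secret, body)
    print(verify_signature(secret, body, signature))          # matches
    print(verify_signature(secret, body + b" ", signature))   # tampered body
```

Using `hmac.compare_digest` instead of `==` avoids leaking how many leading characters of the signature matched.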

Security architecture

Defense in depth across every layer

The architecture treats security as layered enforcement, not a single gate. Each layer below is a defense a request must pass through, so a bug at any one layer is bounded by the layer below it. Posture detail, the certification path, and the threat model live on the security page.

Network and transport: TLS 1.2+ on all in-transit traffic. mTLS available between the edge agent and the control plane. Webhook payloads carry HMAC-SHA256 signatures. No plaintext path exists in the production environment.

Identity and session: SAML 2.0 SSO for enterprise tenants. JWT session tokens carry org_id, role, and scope. MFA configurable per org. Session expiry and password policy per tenant.

API authorization: Every serverless handler wraps a common middleware: authentication check, rate limiting per token, org-scope validation. Service accounts get scoped API keys with explicit permission grants.

Database isolation: Row-level security on every table in the public schema. Service-role bypasses do not appear in customer-facing API paths. Even if the API layer fails to filter by org_id, the database refuses to return rows from the wrong tenant.

Data-at-rest encryption: AES-256 disk encryption for the database. Backup snapshots inherit the same baseline. Object-store payloads carry per-tenant encryption keys; BYOK is planned.

Tamper-evident audit chain: Every policy decision appends to a content-hashed, JWS-signed audit chain per tenant. Any modification to a past entry breaks the chain at that record. External auditors verify signatures independently.

Customer-environment redaction: In the hybrid and air-gap shapes, the edge agent redacts PII and sensitive payloads before telemetry leaves the customer environment. The customer's policy decides what crosses the boundary.
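
The API authorization layer (authentication check, per-token rate limiting, org-scope validation) could be sketched as follows. The fixed-window limiter, the status codes, and the `authorize` function are assumptions for illustration. The claims dictionary stands in for an already verified JWT; token verification itself is out of scope here.

```python
import time

class RateLimiter:
    """Fixed-window request counter per token (one of several possible schemes)."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.windows = {}  # token -> (window_start, count)

    def allow(self, token):
        now = self.clock()
        start, count = self.windows.get(token, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # window expired, start a new one
        if count >= self.limit:
            return False
        self.windows[token] = (start, count + 1)
        return True

def authorize(claims, requested_org_id, limiter, token):
    """Run the three checks in order and return (status_code, reason)."""
    if claims is None:
        return 401, "unauthenticated"
    if not limiter.allow(token):
        return 429, "rate limit exceeded"
    if claims.get("org_id") != requested_org_id:
        return 403, "org scope mismatch"
    return 200, "ok"

if __name__ == "__main__":
    limiter = RateLimiter(limit=2, window_seconds=60)
    claims = {"org_id": "org-a", "role": "admin"}  # hypothetical verified claims
    print(authorize(claims, "org-a", limiter, "token-1"))
    print(authorize(claims, "org-b", limiter, "token-1"))
    print(authorize(claims, "org-a", limiter, "token-1"))
```

In the layered model, this check is one defense among several: a request that slips past it would still meet row-level security at the database.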

Tenant isolation and residency

Tenant data never crosses org boundaries

Row-level security on every table in the public schema means tenant isolation does not depend on an application-layer filter that a developer could forget to add. Even if a bug bypasses org_id filtering in the API, the database enforces the boundary. There are no shared tables with cross-tenant queries and no caching layers that leak across tenants.

  • RLS: enforced at the database layer
  • JWT org_id: in every authenticated request
  • TLS 1.2+: all traffic in transit

For teams that need data residency or air-gap operation, the edge agent runs inside the customer environment and only sends governance signals back to the control plane. The customer's policy decides what telemetry leaves the environment and what stays.

Continue reading

Deeper detail per subsystem

This page is the map. Each subsystem has its own page with the details, contracts, and operational notes.