
Trust & privacy
Review agent updates without exposing what's private.
Verifiable Labs is designed for security review — redacted evidence, approval-gated exports, and private evaluation boundaries by default.
- 0 customer data in results
- 100% of runs redacted by default
- Approval required before any export
- Every run produces an evidence record
The principle
Private by default. Evidence when needed.
Security review shouldn't mean handing over your data. Verifiable Labs reviews agent updates without exposing customer data, private evals, hidden cases, gold answers, raw traces, or secrets.
Redacted evidence
A reviewable record that reveals scores, not secrets
Every run produces a Generalization Card built for review — the decision, the machine reasons, and the per-suite deltas. The sensitive material that produced them never appears.
- Decision, reasons, and per-suite score deltas
- Engine verdicts and the policy that was applied
- Redacted by default on every run
Generalization Card
- decision: BLOCK
- reasons: ood_regressed · hidden_regressed
- public: 0.740 → 0.910
- hidden: 0.732 → 0.611
- ood: 0.701 → 0.488
- record: redacted · reviewable
✗ candidate not promoted
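To make the card above concrete, here is a minimal sketch of how a gate could turn per-suite scores into a decision and reason codes. It is an illustration under stated assumptions, not Verifiable Labs' actual engine: the class names, the `build_card` function, the `tolerance` parameter, and the rule "any regression on a non-public suite blocks promotion" are all hypothetical. The "limit" outcome is left out because the page does not say when it applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteDelta:
    """Baseline and candidate scores for one evaluation suite."""
    suite: str
    baseline: float
    candidate: float

    @property
    def delta(self) -> float:
        # Rounded so that float noise does not leak into the card.
        return round(self.candidate - self.baseline, 3)


@dataclass
class GeneralizationCard:
    """Redacted record: decision, reason codes, and scores only."""
    decision: str
    reasons: list[str]
    deltas: list[SuiteDelta]
    record: str = "redacted · reviewable"


def build_card(deltas: list[SuiteDelta], tolerance: float = 0.0) -> GeneralizationCard:
    # Assumed rule (hypothetical): a drop on any non-public suite blocks promotion.
    reasons = [
        f"{d.suite}_regressed"
        for d in deltas
        if d.suite != "public" and d.delta < -tolerance
    ]
    decision = "BLOCK" if reasons else "SHIP"
    return GeneralizationCard(decision, reasons, deltas)


# Scores taken from the example card on this page.
card = build_card([
    SuiteDelta("public", 0.740, 0.910),
    SuiteDelta("hidden", 0.732, 0.611),
    SuiteDelta("ood", 0.701, 0.488),
])
print(card.decision, card.reasons)
```

With the example scores, the public suite improves while the hidden and out-of-distribution suites regress, so this sketch blocks the candidate for the same two reasons shown on the card.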
Private boundaries
Hidden cases stay sealed
The baseline, hidden cases, gold answers, and raw traces stay inside the evaluation boundary. Candidates run against them, but only scores and verdicts cross back out; the underlying cases, answers, and traces never appear in a result.
- No customer data in results or demos
- Hidden cases and gold answers never exposed
- Managed or bring-your-own-key / self-hosted routing

Approval-gated exports
Nothing leaves the workspace without sign-off
An evidence record can leave the workspace only after an approver signs off, and every export keeps a full trail of who requested, approved, and downloaded it.
- Request → approve → export, with an audit trail
- Role-based approval controls
- Audit-ready records for security and compliance
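
The request, approve, export flow can be pictured as a small state machine that appends to an audit trail at every step. The sketch below is only an illustration of that idea: `ExportRequest`, its method names, the `"approver"` role string, and the returned file path are assumptions made for the example, not the product's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ExportRequest:
    """Tracks one evidence record through request -> approve -> export."""
    record_id: str
    requested_by: str
    state: str = "requested"
    audit_trail: list[tuple[str, str, str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        self._log("requested", self.requested_by)

    def _log(self, action: str, actor: str) -> None:
        # Each entry records when it happened, what happened, and who did it.
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_trail.append((stamp, action, actor))

    def approve(self, approver: str, roles: set[str]) -> None:
        # Hypothetical role check: only holders of "approver" may sign off.
        if "approver" not in roles:
            raise PermissionError(f"{approver} lacks the approver role")
        if self.state != "requested":
            raise ValueError(f"cannot approve from state {self.state!r}")
        self.state = "approved"
        self._log("approved", approver)

    def download(self, user: str) -> str:
        # Nothing leaves until the request has been approved.
        if self.state != "approved":
            raise PermissionError("export has not been approved")
        self._log("downloaded", user)
        return f"evidence/{self.record_id}.json"  # placeholder path


req = ExportRequest("run-001", requested_by="alice")
req.approve("bob", roles={"approver"})
path = req.download("alice")
print([action for _, action, _ in req.audit_trail])
```

Calling `download` before `approve` raises `PermissionError`, which is the property the section describes: the record cannot leave without sign-off, and each step leaves a trace of who did it.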

What a Generalization Card discloses
- The gate decision — ship, block, or limit
- Machine reasons behind the decision
- Per-suite score deltas (scores, not answers)
- Contamination & anti-hack engine verdicts
- The gate policy that was applied
What it never contains
- Customer data
- Hidden evaluation cases
- Gold answers
- Raw model traces
- Secrets or credentials
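
One way to read the two lists above is as an allowlist: a field reaches the card only if it is explicitly permitted, so anything unlisted is dropped by default. The following sketch shows that pattern under assumptions; the field names, the `redact` function, and the sample values (apart from the scores and reasons shown on this page) are hypothetical.

```python
# Hypothetical field names for what a card may disclose.
ALLOWED_FIELDS = {"decision", "reasons", "suite_deltas", "engine_verdicts", "policy"}


def redact(run_output: dict) -> dict:
    """Keep only allowlisted fields; everything else is dropped by default."""
    return {key: value for key, value in run_output.items() if key in ALLOWED_FIELDS}


raw = {
    "decision": "BLOCK",
    "reasons": ["ood_regressed", "hidden_regressed"],
    "suite_deltas": {
        "public": (0.740, 0.910),
        "hidden": (0.732, 0.611),
        "ood": (0.701, 0.488),
    },
    "policy": "example-policy",   # placeholder policy name
    "gold_answers": ["..."],      # stays inside the boundary
    "raw_traces": ["..."],        # stays inside the boundary
    "api_key": "placeholder",     # stays inside the boundary
}

card = redact(raw)
print(sorted(card))
```

An allowlist fails closed: if a new sensitive field is added to the run output later, it stays out of the card unless someone deliberately permits it.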

Improve what fails. Ship what holds.
Bring a baseline and a candidate agent workflow. Verifiable Labs will show which updates should ship, which should be blocked, and which need a limited rollout.
