Architecture
Veto is composed of four main components that work together to provide centralized permission control for AI agents.
System overview
┌─────────────────────────────────────────────────────┐
│ Developer Machine │
│ │
│ Claude Code ──→ Veto Plugin (hooks) │
│ │ │ │
│ │ POST /api/v1/hooks/evaluate │
│ ▼ ▼ │
│ LiteLLM Proxy ←──────── Veto Server ──→ PostgreSQL │
│ │ POST /api/v1/proxy/evaluate │ │
│ │ │ │ │
│ ▼ ▼ │ │
│ Anthropic API AI Scorer (Claude) Redis │
│ │ │
│ ▼ │
│ Dashboard │
└─────────────────────────────────────────────────────┘
Components
Veto Server
FastAPI + SQLAlchemy (async) + PostgreSQL
The core backend. Handles:
- Rule storage and evaluation
- AI risk scoring (calls the configured AI model to evaluate tool calls)
- Audit event logging
- User authentication and team management
- API key management
- Billing and subscription management
Key endpoints:
- `POST /api/v1/hooks/evaluate` — called by the Claude Code plugin
- `POST /api/v1/proxy/evaluate` — called by the LiteLLM guardrail
- `GET /api/v1/rules` — CRUD for rules
- `GET /api/v1/audit` — query the audit log
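As a rough sketch, the plugin's request to `POST /api/v1/hooks/evaluate` might carry a body like the following. The field names here are illustrative assumptions, not the documented schema:

```python
import json

def build_evaluate_payload(tool_name: str, tool_input: dict, session_id: str) -> dict:
    """Assemble a plausible JSON body for a tool-call evaluation request.
    Field names (tool_name, tool_input, session_id) are assumptions."""
    return {
        "tool_name": tool_name,    # e.g. "Bash"
        "tool_input": tool_input,  # raw arguments of the intercepted tool call
        "session_id": session_id,  # correlates audit events to a session
    }

payload = build_evaluate_payload("Bash", {"command": "rm -rf /tmp/scratch"}, "sess-123")
body = json.dumps(payload)
```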
Dashboard
Next.js (App Router) + Tailwind + shadcn/ui
The management UI. Provides:
- Rule editor (create, edit, delete, reorder, import/export)
- Real-time audit log with filtering and search
- Analytics (decisions over time, top tools, response times)
- Team management (invite members, assign roles)
- API key management
- Onboarding wizard
- Billing and plan management
LLM Proxy
Built-in LLM proxy managed from the dashboard
The proxy sits between Claude Code and the Anthropic API, and is enabled and configured from the dashboard (Settings → LLM Proxy). Internally, `external_hook_evaluate.py` implements a LiteLLM `post_call` guardrail that:
- Intercepts the LLM response stream
- Detects tool calls in the response
- Sends each tool call to the Veto server for evaluation
- Blocks denied tool calls by replacing them with a denial message
- Passes allowed tool calls through to the client
The guardrail is streaming-aware with three phases:
- PEEK — buffers the preamble until the first content block reveals the response type
- STREAM — text-first responses are passed through immediately (zero latency)
- BUFFER — tool call responses are buffered, evaluated, then forwarded or blocked
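The three phases above can be sketched as a small state machine. This is a toy model, not the guardrail's actual code: chunk shapes and the denial message are illustrative, and the real implementation operates on LiteLLM's streaming chunk objects:

```python
def process_stream(chunks, evaluate):
    """Toy three-phase stream handler.

    chunks:   iterable of ("text" | "tool_call", payload) tuples (illustrative).
    evaluate: callable returning True (allow) or False (deny) for a tool call.
    """
    phase = "PEEK"
    buffered = []
    for kind, payload in chunks:
        if phase == "PEEK":
            # First content block reveals the response type.
            phase = "STREAM" if kind == "text" else "BUFFER"
        if phase == "STREAM":
            yield payload  # text-first: pass through immediately, zero latency
        else:
            buffered.append(payload)  # hold tool-call chunks until decided
    if phase == "BUFFER":
        if all(evaluate(p) for p in buffered):
            yield from buffered  # allowed: forward the buffered tool call
        else:
            yield "[Veto] Tool call denied."  # denied: replace with a message
```

For example, a text-only response streams straight through, while a denied tool call is swallowed and replaced:

```python
list(process_stream([("text", "Hello")], lambda p: True))
list(process_stream([("tool_call", "rm -rf /")], lambda p: False))
```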
Claude Code Plugin
Python hook scripts
A lightweight integration that hooks into Claude Code's permission system:
- `evaluate.py` — intercepts `PermissionRequest` events, sends them to the Veto server, returns allow/deny/ask
- `session.py` — tracks session start/end for audit correlation
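A minimal sketch of the `evaluate.py` shape, assuming the server runs locally and returns its decision in a `"decision"` field (both the URL and the field name are assumptions, not the documented interface):

```python
import json
import sys
import urllib.request

VETO_URL = "http://localhost:8000/api/v1/hooks/evaluate"  # assumed local server

def map_decision(server_response: dict) -> str:
    """Map a Veto server response to the hook's allow/deny/ask vocabulary.
    Unknown or missing decisions fall back to "ask" as the safe default."""
    decision = server_response.get("decision")
    return decision if decision in ("allow", "deny", "ask") else "ask"

def decide(event: dict) -> str:
    """Forward a PermissionRequest event to the Veto server for evaluation."""
    req = urllib.request.Request(
        VETO_URL,
        data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return map_decision(json.load(resp))

if __name__ == "__main__":
    # Claude Code passes the event on stdin; the hook prints its decision.
    print(decide(json.load(sys.stdin)))
```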
Evaluation flow
When a tool call arrives (via plugin or proxy):
1. Load org rules (sorted by priority DESC)
2. For each enabled rule:
a. Match tool_pattern against tool name (fullmatch)
b. If content_pattern set, match against tool input (search)
c. First match wins → return decision
3. If no rule matches and AI scoring is enabled:
a. Call the configured AI model with tool context
b. Get risk score (0-100) and reasoning
c. Score ≥ threshold → DENY, score < threshold × 0.5 → ALLOW, else → ASK
4. If AI scoring fails or is disabled:
a. Apply fail policy (allow or deny)
5. Log the decision as an audit event
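The flow above can be condensed into a short sketch. Rule fields (`priority`, `enabled`, `tool_pattern`, `content_pattern`, `decision`) mirror the steps described, but the exact names and the dict-based representation are illustrative assumptions:

```python
import re

def evaluate(tool_name, tool_input, rules, score=None, threshold=80,
             fail_policy="deny"):
    """Sketch of the evaluation flow: rules first, then AI score, then
    the fail policy. `score` is the AI risk score (0-100), or None when
    scoring is disabled or failed."""
    # 1-2. First matching enabled rule wins, highest priority first.
    for rule in sorted(rules, key=lambda r: r["priority"], reverse=True):
        if not rule["enabled"]:
            continue
        if not re.fullmatch(rule["tool_pattern"], tool_name):
            continue
        if rule.get("content_pattern") and not re.search(
                rule["content_pattern"], tool_input):
            continue
        return rule["decision"]
    # 3. No rule matched: fall back to the AI risk score if available.
    if score is not None:
        if score >= threshold:
            return "deny"
        if score < threshold * 0.5:
            return "allow"
        return "ask"
    # 4. Scoring unavailable: apply the configured fail policy.
    return fail_policy
```

For example, with a single blacklist rule denying `rm -rf` in Bash, a matching call is denied regardless of score, a low-scoring call is allowed, and a mid-range score falls through to ask:

```python
rules = [{"priority": 10, "enabled": True, "tool_pattern": "Bash",
          "content_pattern": r"rm\s+-rf", "decision": "deny"}]
evaluate("Bash", "rm -rf /", rules)                      # deny (rule match)
evaluate("Bash", "ls", rules, score=30, threshold=80)    # allow (30 < 40)
evaluate("Bash", "ls", rules, score=60, threshold=80)    # ask (40 <= 60 < 80)
```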
Data model
Core entities
| Entity | Description |
|---|---|
| Organization | Top-level tenant. All rules, users, and audit events belong to an org |
| User | Member of an org. Roles: admin, member, viewer |
| Rule | A whitelist or blacklist pattern with priority |
| Audit Event | A logged tool call with decision, reasoning, latency, and metadata |
| Session | Groups audit events from a single Claude Code session |
| API Key | Authentication token for the plugin or external API access |
| Scoring Config | Per-org AI scoring settings (model, threshold, enabled) |
Plans
| Plan | Members | Rules | Evals/mo | AI Models | Audit Retention |
|---|---|---|---|---|---|
| Free | 1 | 20 | 3,000 | Haiku | 7 days |
| Team | Unlimited | Unlimited | 5,000/user | Haiku + Sonnet | 30 days |
| Business | Unlimited | Unlimited | 20,000/user | All | 90 days |
Notes:
- Eval limits are enforced as a hard block — once reached, all tool calls are denied until the next billing cycle
- Team and Business eval limits scale with seat count (limit × number of users)
- AI scoring model selection is restricted by plan tier
- Custom scoring prompts are Business-only
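The seat-scaled limits and hard block reduce to a small calculation. Constants below mirror the plan table; the function names are illustrative, not the billing module's API:

```python
# Per-user monthly eval limits for seat-scaled plans (from the table above).
PER_USER_LIMITS = {"team": 5_000, "business": 20_000}

def eval_limit(plan: str, num_users: int) -> int:
    """Effective monthly eval limit: flat for Free, seat-scaled otherwise."""
    if plan == "free":
        return 3_000  # single-seat plan, flat limit
    return PER_USER_LIMITS[plan] * num_users

def check_quota(used: int, plan: str, num_users: int) -> str:
    """Hard block: once the limit is reached, every tool call is denied
    until the next billing cycle."""
    return "deny" if used >= eval_limit(plan, num_users) else "evaluate"
```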
Deployment
Docker Compose (development)
```shell
docker-compose up -d
```
Starts PostgreSQL, Redis, Server, Dashboard, and LiteLLM Proxy locally.
Kubernetes (production)
The `deploy/k8s/` directory contains Kustomize manifests for:
- PostgreSQL StatefulSet (10Gi PVC)
- Redis Deployment
- Veto Server Deployment + Ingress (api.vetoapp.io)
- Dashboard Deployment + Ingress (dashboard.vetoapp.io)
- Website Deployment + Ingress (www.vetoapp.io)
- LiteLLM Deployment + Ingress (proxy.vetoapp.io)
All services use TLS via cert-manager with Let's Encrypt. Secrets are managed with Bitnami SealedSecrets.