Architecture

Veto is composed of four main components that work together to provide centralized permission control for AI agents.

System overview

┌─────────────────────────────────────────────────────┐
│                    Developer Machine                 │
│                                                      │
│  Claude Code ──→ Veto Plugin (hooks)                │
│       │              │                               │
│       │         POST /api/v1/hooks/evaluate          │
│       ▼              ▼                               │
│  LiteLLM Proxy ←──────── Veto Server ──→ PostgreSQL │
│       │          POST /api/v1/proxy/evaluate    │    │
│       │                      │                  │    │
│       ▼                      ▼                  │    │
│  Anthropic API          AI Scorer (Claude)   Redis  │
│                              │                       │
│                              ▼                       │
│                         Dashboard                    │
└─────────────────────────────────────────────────────┘

Components

Veto Server

FastAPI + SQLAlchemy (async) + PostgreSQL

The core backend. Handles:

  • Rule storage and evaluation
  • AI risk scoring (calls the configured AI model to evaluate tool calls)
  • Audit event logging
  • User authentication and team management
  • API key management
  • Billing and subscription management

Key endpoints:

  • POST /api/v1/hooks/evaluate — called by the Claude Code plugin
  • POST /api/v1/proxy/evaluate — called by the LiteLLM guardrail
  • GET /api/v1/rules — CRUD for rules
  • GET /api/v1/audit — query the audit log
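
The evaluate endpoints accept a JSON description of the pending tool call. As a rough sketch of what a client might send (the field names below are illustrative assumptions, not the documented Veto schema):

```python
import json

# Hypothetical request body for POST /api/v1/hooks/evaluate.
# Field names are assumptions for illustration, not the documented schema.
payload = {
    "tool_name": "Bash",
    "tool_input": {"command": "rm -rf /tmp/scratch"},
    "session_id": "sess-123",
}

body = json.dumps(payload)
print(body)  # sent with an Authorization header carrying the org's API key
```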

Dashboard

Next.js (App Router) + Tailwind + shadcn/ui

The management UI. Provides:

  • Rule editor (create, edit, delete, reorder, import/export)
  • Real-time audit log with filtering and search
  • Analytics (decisions over time, top tools, response times)
  • Team management (invite members, assign roles)
  • API key management
  • Onboarding wizard
  • Billing and plan management

LLM Proxy

Built-in LLM proxy managed from the dashboard

The proxy sits between Claude Code and the Anthropic API. It is enabled and configured from the dashboard (Settings → LLM Proxy). Internally, external_hook_evaluate.py implements a LiteLLM post_call guardrail that:

  1. Intercepts the LLM response stream
  2. Detects tool calls in the response
  3. Sends each tool call to the Veto server for evaluation
  4. Blocks denied tool calls by replacing them with a denial message
  5. Passes allowed tool calls through to the client

The guardrail is streaming-aware with three phases:

  • PEEK — buffers the preamble until the first content block reveals the response type
  • STREAM — text-first responses are passed through immediately (no added latency)
  • BUFFER — tool call responses are buffered, evaluated, then forwarded or blocked
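
The three phases above can be sketched as a small state machine. This is a simplified stand-in, not the actual guardrail code; the chunk shapes and the `evaluate` callback are assumptions:

```python
from enum import Enum

class Phase(Enum):
    PEEK = "peek"      # buffering the preamble, response type unknown
    STREAM = "stream"  # text-first: pass chunks through immediately
    BUFFER = "buffer"  # tool call: hold chunks until evaluated

def route_chunks(chunks, evaluate):
    """Sketch of the three-phase guardrail. `chunks` is an iterable of
    dicts like {"type": "text" | "tool_use", ...}; `evaluate` is a callback
    returning True when a tool call is allowed. Names are illustrative."""
    phase = Phase.PEEK
    buffered, out = [], []
    for chunk in chunks:
        if phase is Phase.PEEK:
            # The first content block reveals the response type.
            phase = Phase.BUFFER if chunk["type"] == "tool_use" else Phase.STREAM
        if phase is Phase.STREAM:
            out.append(chunk)       # text-first: forward immediately
        else:
            buffered.append(chunk)  # tool call: hold until evaluated
    if phase is Phase.BUFFER:
        if all(evaluate(c) for c in buffered if c["type"] == "tool_use"):
            out.extend(buffered)    # allowed: forward the buffered chunks
        else:
            # denied: replace the tool call with a denial message
            out.append({"type": "text", "text": "Tool call denied by Veto."})
    return out
```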

Claude Code Plugin

Python hook scripts

A lightweight integration that hooks into Claude Code's permission system:

  • evaluate.py — intercepts PermissionRequest events, sends them to the Veto server, returns allow/deny/ask
  • session.py — tracks session start/end for audit correlation
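
A minimal sketch of the decision step in a hook like evaluate.py. The event and response field names are assumptions, and the hard-coded deny stands in for the real POST to the Veto server:

```python
import json
import sys

def decide(event: dict) -> str:
    """Return allow/deny for a PermissionRequest event.

    The real plugin POSTs the event to /api/v1/hooks/evaluate and can also
    return "ask"; here a hard-coded deny for Bash stands in for that call.
    """
    tool = event.get("tool_name", "")
    return "deny" if tool == "Bash" else "allow"

def main() -> None:
    # Hooks read the event as JSON on stdin and write a JSON decision to
    # stdout (the exact wire schema is an assumption here).
    event = json.load(sys.stdin)
    json.dump({"decision": decide(event)}, sys.stdout)
```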

Evaluation flow

When a tool call arrives (via plugin or proxy):

1. Load org rules (sorted by priority DESC)
2. For each enabled rule:
   a. Match tool_pattern against tool name (fullmatch)
   b. If content_pattern set, match against tool input (search)
   c. First match wins → return decision
3. If no rule matches and AI scoring is enabled:
   a. Call the configured AI model with tool context
   b. Get risk score (0-100) and reasoning
   c. Score ≥ threshold → DENY, score < threshold × 0.5 → ALLOW, else → ASK
4. If AI scoring fails or is disabled:
   a. Apply fail policy (allow or deny)
5. Log the decision as an audit event
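
The flow above can be sketched in a few lines of Python. The rule keys and the `score` parameter (standing in for the AI scorer; None means scoring is disabled or failed) are illustrative names, not the actual Veto schema:

```python
import re

def evaluate(rules, tool_name, tool_input, score=None, threshold=70,
             fail_policy="deny"):
    """Sketch of the evaluation flow. `rules` are dicts with keys priority,
    enabled, tool_pattern, content_pattern, decision; `tool_input` is
    flattened to a string here for simplicity."""
    # Step 1-2: walk enabled rules by priority DESC; first match wins.
    for rule in sorted(rules, key=lambda r: r["priority"], reverse=True):
        if not rule["enabled"]:
            continue
        if not re.fullmatch(rule["tool_pattern"], tool_name):
            continue
        pat = rule.get("content_pattern")
        if pat and not re.search(pat, tool_input):
            continue
        return rule["decision"]
    # Step 3: no rule matched; fall back to the AI risk score if available.
    if score is not None:
        if score >= threshold:
            return "deny"
        if score < threshold * 0.5:
            return "allow"
        return "ask"
    # Step 4: scoring disabled or failed; apply the fail policy.
    return fail_policy
```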

Data model

Core entities

Entity           Description
Organization     Top-level tenant. All rules, users, and audit events belong to an org
User             Member of an org. Roles: admin, member, viewer
Rule             A whitelist or blacklist pattern with priority
Audit Event      A logged tool call with decision, reasoning, latency, and metadata
Session          Groups audit events from a single Claude Code session
API Key          Authentication token for the plugin or external API access
Scoring Config   Per-org AI scoring settings (model, threshold, enabled)
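
As a rough shape of two of these entities (the real models are SQLAlchemy tables with more columns; the field names here mirror the table above but are otherwise assumptions):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    # A whitelist or blacklist pattern with priority.
    tool_pattern: str
    decision: str                          # "allow" or "deny"
    priority: int = 0
    content_pattern: Optional[str] = None  # optional match on tool input
    enabled: bool = True

@dataclass
class AuditEvent:
    # A logged tool call with decision, reasoning, latency, and metadata.
    tool_name: str
    decision: str
    reasoning: str = ""
    latency_ms: int = 0
```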

Plans

Plan       Members     Rules       Evals/mo      AI Models        Audit Retention
Free       1           20          3,000         Haiku            7 days
Team       Unlimited   Unlimited   5,000/user    Haiku + Sonnet   30 days
Business   Unlimited   Unlimited   20,000/user   All              90 days

Notes:

  • Eval limits are enforced as a hard block — once reached, all tool calls are denied until the next billing cycle
  • Team and Business eval limits scale with seat count (limit × number of users)
  • AI scoring model selection is restricted by plan tier
  • Custom scoring prompts are Business-only
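
The seat-scaling rule above is a simple multiplication; a tiny sketch (the function name is illustrative):

```python
def effective_eval_limit(per_user_limit: int, seats: int) -> int:
    """Team/Business monthly eval limits scale with seat count."""
    return per_user_limit * seats

# A 4-seat Team org: 5,000 evals/user x 4 users = 20,000 evals/month.
```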

Deployment

Docker Compose (development)

docker-compose up -d

Starts PostgreSQL, Redis, Server, Dashboard, and LiteLLM Proxy locally.

Kubernetes (production)

The deploy/k8s/ directory contains Kustomize manifests for:

  • PostgreSQL StatefulSet (10Gi PVC)
  • Redis Deployment
  • Veto Server Deployment + Ingress (api.vetoapp.io)
  • Dashboard Deployment + Ingress (dashboard.vetoapp.io)
  • Website Deployment + Ingress (www.vetoapp.io)
  • LiteLLM Deployment + Ingress (proxy.vetoapp.io)

All services use TLS via cert-manager with Let's Encrypt. Secrets are managed with Bitnami SealedSecrets.