Verifier Authoring Guide
Directory Structure
Every verifier lives under registry/verifiers/:
my_org.domain.verifier_name/
├── VERIFIER.json # Metadata, scorecard, and tier
├── verify.py # Verification logic
├── positive.json # Fixtures that should pass (≥3)
├── negative.json # Fixtures that should fail (≥3)
└── adversarial.json # Edge cases and attack vectors (≥3)
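The layout above can be checked before running the real validator. The following is a minimal sketch, not the logic of scripts/validate_registry.py: the helper name `check_layout` is hypothetical, and it assumes each fixture file is a top-level JSON array of cases, which this guide does not specify.

```python
import json
from pathlib import Path

# File names taken from the directory listing above.
REQUIRED_FILES = ["VERIFIER.json", "verify.py", "positive.json",
                  "negative.json", "adversarial.json"]
FIXTURE_FILES = ["positive.json", "negative.json", "adversarial.json"]
MIN_FIXTURES = 3


def check_layout(verifier_dir: Path) -> list[str]:
    """Return a list of problems; an empty list means the layout looks complete."""
    problems = []
    for name in REQUIRED_FILES:
        if not (verifier_dir / name).is_file():
            problems.append(f"missing file: {name}")

    for name in FIXTURE_FILES:
        path = verifier_dir / name
        if not path.is_file():
            continue  # already reported as missing above
        try:
            cases = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"{name}: invalid JSON ({exc.msg})")
            continue
        # Assumption: a fixture file is a JSON array with one entry per case.
        if not isinstance(cases, list):
            problems.append(f"{name}: expected a JSON array of cases")
        elif len(cases) < MIN_FIXTURES:
            problems.append(
                f"{name}: {len(cases)} case(s), need at least {MIN_FIXTURES}"
            )
    return problems
```

Calling `check_layout(Path("registry/verifiers/my_org.domain.verifier_name"))` returns an empty list when all five files exist and each fixture file holds at least three cases.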
VERIFIER.json
{
  "id": "vr/my_org.domain.verifier_name",
  "version": "0.1.0",
  "tier": "HARD",
  "domain": "code_quality",
  "task_type": "code_verification",
  "description": "Verifies that...",
  "scorecard": {
    "determinism": "deterministic",
    "evidence_quality": "high",
    "intended_use": "CI gating",
    "gating_required": false,
    "recommended_gates": [],
    "attack_surface": {
      "prompt_injection": "low",
      "evidence_tampering": "low"
    }
  },
  "permissions_required": ["filesystem"],
  "contributor": "your-github-handle"
}
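A quick sanity check of the metadata can catch missing keys before CI does. This is a sketch under stated assumptions: the guide does not say which keys are mandatory, so the code treats every top-level key in the example above as required, and `check_metadata` is a hypothetical helper rather than part of the vr tooling.

```python
# Assumption: every top-level key shown in the example is required.
REQUIRED_KEYS = {
    "id", "version", "tier", "domain", "task_type", "description",
    "scorecard", "permissions_required", "contributor",
}
# Tier names as listed under Tier Requirements.
KNOWN_TIERS = {"HARD", "SOFT", "AGENTIC"}


def check_metadata(meta: dict) -> list[str]:
    """Return a list of problems found in a parsed VERIFIER.json."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(meta))]

    tier = meta.get("tier")
    if tier is not None and tier not in KNOWN_TIERS:
        problems.append(f"unknown tier: {tier!r}")

    # Every tier must declare its attack surface in some form.
    scorecard = meta.get("scorecard")
    if isinstance(scorecard, dict) and "attack_surface" not in scorecard:
        problems.append("scorecard is missing attack_surface")
    return problems
```

Load the file with `json.load` and pass the resulting dict to `check_metadata`; an empty list means none of these checks failed.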
Tier Requirements
| Tier | Determinism | Fixture Requirement | Attack Surface |
|------|-------------|---------------------|----------------|
| HARD | Deterministic, binary | 3+ positive, 3+ negative, 3+ adversarial | Must document and mitigate |
| SOFT | Probabilistic, 0-1 score | 3+ each, plus inter-rater calibration data | Must declare prompt injection risk |
| AGENTIC | Agent-dependent | 3+ each, plus timeout/retry tests | Must declare interaction surface |
Adversarial Fixtures
Every verifier must include adversarial test cases that attempt to fool the verifier into passing a bad result. Examples by tier:
- HARD: DB record exists but with wrong status; API returns 200 with error in body
- SOFT: Prompt-injected text that asks the LLM judge to always return high scores
- AGENTIC: DOM elements that look correct visually but have wrong underlying data
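To make the HARD example concrete, here is a sketch of a check that does not trust the HTTP status alone, together with adversarial inputs it should reject. The function name `response_ok`, the `"error"` key convention, and the shape of the case dicts are all hypothetical; the real fixture schema and verify.py interface are not defined in this guide.

```python
def response_ok(status_code: int, body: dict) -> bool:
    """Pass only if the status is 2xx AND the body carries no error marker."""
    if not 200 <= status_code < 300:
        return False
    # Hypothetical payload convention: failures are reported under "error".
    return not body.get("error")


# Adversarial inputs: each looks successful at the status-code level,
# but the body says the operation failed.
ADVERSARIAL_CASES = [
    {"status_code": 200, "body": {"error": "refund rejected"}, "expected": False},
    {"status_code": 200, "body": {"error": {"code": 42}}, "expected": False},
    {"status_code": 200, "body": {"data": None, "error": "timeout"}, "expected": False},
]

for case in ADVERSARIAL_CASES:
    assert response_ok(case["status_code"], case["body"]) is case["expected"]
```

A check that only compared `status_code == 200` would pass all three cases, which is exactly the failure an adversarial fixture is meant to expose.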
Testing Locally
# Validate structure and scorecard
python scripts/validate_registry.py
# Run all fixture types
vr test --verifier my_org.domain.verifier_name
# Run only adversarial fixtures
vr test --verifier my_org.domain.verifier_name --type adversarial
Publishing
- Fork the vr-dev repository
- Add your verifier directory under registry/verifiers/
- Each fixture file must have ≥ 3 test cases
- Run python scripts/validate_registry.py
- Open a pull request. CI will run all fixtures automatically.