Integration Guide

TRL (Transformer Reinforcement Learning)

Export verified rewards directly to TRL format:

from vrdev import get_verifier, compose, VerifierInput, export_to_trl
from vrdev.core.types import PolicyMode

pipeline = compose(
    [get_verifier("vr/tau2.retail.order_cancelled"),
     get_verifier("vr/rubric.email.tone_professional")],
    require_hard=True,
    policy_mode=PolicyMode.FAIL_CLOSED,
)

# Run verification across your eval set
results = [pipeline.verify(episode) for episode in eval_set]

# Export TRL-compatible reward dataset
export_to_trl(results, output="rewards.jsonl")

The output JSONL contains one record per verification with the reward signal, evidence hash, and trajectory reference.

VERL

For VERL-based training:

from vrdev import export_to_verl

export_to_verl(results, output="verl_rewards/")

This generates the directory structure VERL expects, including reward shaping metadata.

OpenClaw Adapter

The openclaw.cancel_and_email skill in the registry shows how to integrate vr.dev verifiers as the reward function for an OpenClaw skill:

from openclaw import Skill
from vrdev import compose

reward_fn = compose([
    "tau2.retail.order_cancelled",
    "aiv.email.sent_folder_confirmed",
    "rubric.email.tone_professional",
], policy_mode="fail_closed")

skill = Skill(
    name="cancel_and_email",
    reward=reward_fn,
    max_steps=15,
)

Custom Training Loops

For arbitrary training frameworks, use the raw result objects:

from vrdev import get_verifier, VerifierInput

v = get_verifier("vr/code.python.tests_pass")
result = v.verify(VerifierInput(
    completions=["agent output here"],
    ground_truth={"repo": ".", "test_cmd": "pytest"},
))

reward = result[0].score          # float 0.0-1.0
passed = result[0].passed         # bool
evidence = result[0].evidence     # dict
hash = result[0].evidence_hash    # str, sha256:...

Map these to your framework's reward format as needed.

CI/CD Integration

Add verification to your CI pipeline:

# .github/workflows/verify.yml
- name: Verify agent outputs
  run: |
    pip install vrdev
    vr compose \
      --verifiers code.python.lint_ruff,code.python.tests_pass \
      --ground-truth '{"repo": "."}' \
      --policy fail_closed \
      --output json
Integration Guide | vr.dev Docs