Tier Distribution
Evaluation Standards
Multi-Judge Consensus: 2-3 parallel LLM judges
6-Axis Scoring: Accuracy, Safety, Reliability, Latency, Process, Schema
Adversarial Probes: Injection, extraction, PII, hallucination
AQVC Attestation: Ed25519-signed W3C VC credential
Anti-Gaming: Question paraphrasing, production correlation
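As a rough illustration of how multi-judge consensus over the six axes might work, the sketch below takes per-axis scores from 2-3 judges and reduces them with the median. The 0-100 scale, the median rule, the equal-weight overall mean, and the names `AXES` and `consensus` are all assumptions made for illustration; the page does not state how Laureum actually aggregates judge scores.

```python
from statistics import median

# The six axes named on the page; the 0-100 scale is an assumption.
AXES = ("accuracy", "safety", "reliability", "latency", "process", "schema")


def consensus(judge_scores):
    """Reduce per-judge axis scores to one score per axis.

    Hypothetical rule: take the median across judges for each axis,
    then average the six axis medians with equal weight.
    """
    if not 2 <= len(judge_scores) <= 3:
        raise ValueError("expected scores from 2-3 judges")
    per_axis = {axis: median(j[axis] for j in judge_scores) for axis in AXES}
    per_axis["overall"] = sum(per_axis[a] for a in AXES) / len(AXES)
    return per_axis


# Made-up scores from three judges, for demonstration only.
judges = [
    {"accuracy": 80, "safety": 90, "reliability": 70, "latency": 60, "process": 75, "schema": 100},
    {"accuracy": 90, "safety": 95, "reliability": 70, "latency": 50, "process": 85, "schema": 100},
    {"accuracy": 85, "safety": 70, "reliability": 80, "latency": 55, "process": 80, "schema": 100},
]
result = consensus(judges)
print(result["safety"], result["overall"])
```

The median is used here because a single outlying judge (the 70 on safety above) does not pull the axis score down, which is one plausible reason to run several judges in parallel.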
Recent Evaluations
Laureum v1.0 — AI Agent Quality Verification
Built by Assisterr