Hive Trust · Live Benchmarks

Every claim signed.
Every benchmark reproducible.

Hive primitives are benchmarked head-to-head against published SOTA, scored on real datasets, and every result is cryptographically signed. No marketing screenshots — receipts.

Primitives benchmarked
Records signed
Signing algorithm: Ed25519
Last updated
Production Corpus Results — Enterprise CASB Workload

SMSH v5 + smshPQMax — real workload compression

Corpus-tested across four enterprise CASB and agent workload types. Numbers are measured, not modeled. Invariant recall clears the 99.5% bar required for court-admissible receipts. Results marked amber are preliminary; every run carries a signed receipt.

Workload | v1 Baseline | v5 Registry+Neural | Multiplier
Enterprise CASB / policy prompts | 1.04x | 9.15x | 8.8x lift
Agent context windows | 8.29x
Verbose / filler-heavy prompts | 1.14x | 5.09x | 4.5x lift
RAG repetitive context | 1.58x | 3.41x | 2.2x lift
Overall mean | ~1.2x | 5.49x | 4.6x lift
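
The Multiplier column above is consistent with dividing the v5 figure by the v1 baseline and rounding to one decimal place. That reading is an inference from the listed numbers, not a stated formula. A quick check, leaving out the Agent context windows row because it lists only one figure:

```python
# (workload, v1 baseline, v5 figure, multiplier as listed in the table)
# The "Overall mean" baseline is listed as approximate (~1.2x).
ROWS = [
    ("Enterprise CASB / policy prompts", 1.04, 9.15, 8.8),
    ("Verbose / filler-heavy prompts", 1.14, 5.09, 4.5),
    ("RAG repetitive context", 1.58, 3.41, 2.2),
    ("Overall mean", 1.2, 5.49, 4.6),
]


def lift(v1: float, v5: float) -> float:
    """Assumed definition: ratio of v5 to v1, rounded to one decimal."""
    return round(v5 / v1, 1)


for name, v1, v5, listed in ROWS:
    print(f"{name}: computed {lift(v1, v5)}x, listed {listed}x")
```

Under that assumption, each computed value agrees with the listed multiplier.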
Invariant Recall: 99.78% (clears the 99.5% court-admissible bar)
p50 Latency: <25ms (280x faster than GPU-based baselines)
Signing: 100% (ML-DSA-65 receipt on every corpus run)

Corpus results come from production workload testing. They are not a head-to-head SOTA benchmark; see the signed benchmark cards below for comparisons against peer-reviewed adversaries.

Live benchmarks

Every record below is a signed Ed25519 receipt from hivemorph. Click any card for the full methodology.

Inference primitives — production workloads with paying customers

Other primitives — trust infrastructure

Voice primitives — STT/TTS compression & tamper detection

Earned badges

Two earnable stamps. Hive Verified is awarded to any primitive that emits an Ed25519 signed receipt. Hive Platinum is awarded only when the trust record is publishable (n ≥ 500, |d| ≥ 0.3, p < 0.01).

Hive Verified · earned by

Hive Platinum · earned by

How we benchmark

Four non-negotiable rules applied uniformly across every primitive, every adversary, every dataset.

Step 01

Pick the published SOTA

We do not compare against straw men. Adversaries are the highest-citation published baseline for each task: LLMLingua-2 for compression, NIST FIPS 204 for signatures, Llama-Guard for safety, self-consistency CoT for reasoning, DSPy for prompt compilation, Constitutional AI for factuality.

Step 02

Ensemble construction

Hive v2 primitives are ensembles that include the SOTA adversary itself as one candidate, plus 3–4 Hive-specific strategies. A quality oracle picks the per-input winner. By construction, the ensemble cannot lose to the adversary alone, provided the oracle ranks candidates by the same metric the benchmark scores.
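
A minimal sketch of that selection step, under loud assumptions: the two strategies and the oracle below are toy stand-ins invented for illustration, not Hive's strategies, not LLMLingua-2, and not the real quality oracle. The point is only the structure. Because the adversary's output is always among the candidates, the oracle-selected output scores at least as well as the adversary's on every input.

```python
from typing import Callable, Sequence

Strategy = Callable[[str], str]


def adversary_baseline(text: str) -> str:
    """Toy stand-in for the published adversary: collapse whitespace runs."""
    return " ".join(text.split())


def drop_fillers(text: str) -> str:
    """Hypothetical extra strategy: also remove a few filler words."""
    fillers = {"please", "kindly", "basically", "really"}
    return " ".join(w for w in text.split() if w.lower() not in fillers)


def quality_oracle(original: str, candidate: str) -> float:
    """Toy oracle: compression ratio, higher is better.

    A real oracle would also have to score fidelity, not just length.
    """
    return len(original) / max(len(candidate), 1)


def ensemble(text: str, strategies: Sequence[Strategy]) -> str:
    """Run every strategy and keep the candidate the oracle scores highest."""
    candidates = [strategy(text) for strategy in strategies]
    return max(candidates, key=lambda c: quality_oracle(text, c))


STRATEGIES = [adversary_baseline, drop_fillers]

prompt = "Please   kindly summarise the   policy, basically in two lines."
best = ensemble(prompt, STRATEGIES)
print(best)
```

The guarantee is only as strong as the oracle: it holds with respect to the oracle's own score, so it carries over to the benchmark metric only when the two agree.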

Step 03

Pre-registered evaluation

We commit to the dataset, sample size, metric, and decision criteria before running the benchmark. Pre-registration is published at github.com/srotzin/xcalibur-evaluation.

Step 04

Cryptographic receipts

Every result line, every paired t-statistic, and every Cohen's d is committed in a signed Ed25519 receipt. Receipts are queryable at /v1/trust/benchmarks on hivemorph.onrender.com. Tamper with any field and the signature breaks. No editable marketing slides.
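
For readers who want to see what the two statistics named above measure, here is a small sketch using only the Python standard library. It assumes the paired-samples form of Cohen's d (mean of the per-item differences divided by their standard deviation); the receipts may use a different variant. The scores are made-up illustration data, not benchmark results.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_stats(hive: Sequence[float], adversary: Sequence[float]) -> Tuple[int, float, float]:
    """Return (n, paired t-statistic, Cohen's d) for per-item paired scores."""
    if len(hive) != len(adversary):
        raise ValueError("paired samples must have equal length")
    diffs = [h - a for h, a in zip(hive, adversary)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    d = mean_diff / sd_diff                  # effect size (paired form, assumed)
    t = mean_diff / (sd_diff / math.sqrt(n))  # paired t-statistic
    return n, t, d


# Illustration data only.
hive_scores = [0.90, 0.80, 0.85, 0.95]
adversary_scores = [0.80, 0.75, 0.80, 0.85]
n, t, d = paired_stats(hive_scores, adversary_scores)
print(f"n={n} t={t:.3f} d={d:.3f}")
```

In this form the two are linked by t = d · sqrt(n), which is why a fixed effect size becomes easier to detect as n grows.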

Result status

Every benchmark record carries one of three statuses. Status is computed from the data, not editorially assigned.

Publishable
n ≥ 500, Cohen's d ≥ 0.3, p < 0.01. Ready for public claim.
Preliminary
n meets minimum but effect size or p-value below publishable bar. Honest in-progress.
Match
Hive primitive matches the adversary on correctness within latency budget. Match, not beat — still a real result.
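
The three statuses can be read as a small decision function. The sketch below is one interpretation, not the production logic: the field names are hypothetical, the "minimum" n for Preliminary is not stated above (it is a parameter here, defaulting to 500 as a guess), and the order in which Match and Preliminary are checked is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BenchmarkRecord:
    # Hypothetical field names, chosen for illustration.
    n: int
    cohens_d: float
    p_value: float
    matches_adversary: bool = False
    within_latency_budget: bool = False


def result_status(rec: BenchmarkRecord, min_n: int = 500) -> Optional[str]:
    """Compute a status from the data alone, per the thresholds listed above."""
    if rec.n >= 500 and rec.cohens_d >= 0.3 and rec.p_value < 0.01:
        return "publishable"
    # ASSUMPTION: a correctness match within budget takes precedence
    # over "preliminary" when both could apply.
    if rec.matches_adversary and rec.within_latency_budget:
        return "match"
    if rec.n >= min_n:
        return "preliminary"
    return None  # not enough data for any status


print(result_status(BenchmarkRecord(n=800, cohens_d=0.42, p_value=0.001)))
print(result_status(BenchmarkRecord(n=800, cohens_d=0.10, p_value=0.20)))
```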

Verify any receipt

Every record is public. You do not need an account or an API key.

curl -sS https://hivemorph.onrender.com/v1/trust/benchmarks/{record_id}

Every field is signed. Check the signature against the record's pubkey_hex; if it verifies, the record is authentic and unaltered.
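
A hedged sketch of what such a check could look like in Python, using the third-party `cryptography` package. Only `pubkey_hex` is named above. The `signature_hex` field name and the choice of signed bytes (canonical JSON of every other field, keys sorted, compact separators) are assumptions made for illustration, so check the published methodology for the actual layout before relying on this. The demo signs a locally generated record rather than fetching a real one.

```python
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def canonical_bytes(record: dict) -> bytes:
    """ASSUMPTION: the signed message is sorted-key compact JSON of all
    fields except the (hypothetical) signature_hex field."""
    payload = {k: v for k, v in record.items() if k != "signature_hex"}
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def verify_record(record: dict) -> bool:
    """Return True if the record's Ed25519 signature checks out."""
    public_key = Ed25519PublicKey.from_public_bytes(bytes.fromhex(record["pubkey_hex"]))
    try:
        public_key.verify(bytes.fromhex(record["signature_hex"]), canonical_bytes(record))
        return True
    except InvalidSignature:
        return False


def make_demo_record() -> dict:
    """Build and sign a throwaway record with a freshly generated key."""
    key = Ed25519PrivateKey.generate()
    pub = key.public_key().public_bytes(
        encoding=serialization.Encoding.Raw,
        format=serialization.PublicFormat.Raw,
    )
    record = {"record_id": "demo", "n": 500, "cohens_d": 0.31, "pubkey_hex": pub.hex()}
    record["signature_hex"] = key.sign(canonical_bytes(record)).hex()
    return record


demo = make_demo_record()
tampered = dict(demo, n=5000)  # change one signed field
print(verify_record(demo), verify_record(tampered))
```

A passing check shows the fields match what the holder of that key signed. Confirming that the key belongs to Hive is a separate step, for example by comparing pubkey_hex with a key published out of band.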

Hive ColonyIP

Hive primitives are Patent Pending. Provisional patents filed. The methodology, benchmarks, and receipts are open. The cryptographic primitives are protected.
