The Standard for
Software Factories


We define how AI agent factories operate at scale. How they consume context, follow rules, and ship production code.

~/standra axiombench --report
════════════════════════════════════
AXIOM COMPLIANCE BENCHMARK v2
(using Karpathy's autoresearch)
────────────────────────────────────
MODEL          COMPLIANCE      TOKENS
────────────────────────────────────
Opus 4.6       █████████▏ 91.6%   239
Sonnet 4.6     █████████  90.7%   239
qwen3-coder    ████████▊  87.5%   239
────────────────────────────────────
Markdown ref.  █████████  90.5%  1195
────────────────────────────────────
SAVINGS: 87%   RUNS: 1247   RULES: 10
════════════════════════════════════
~/standra
METRICS
1,247 Validated Research Runs
7 Published Standards
5 Models Benchmarked
87% Token Cost Reduction
THE PROBLEM
-- Status Quo --

AI coding tools are everywhere. Standards are nowhere.

Every AI coding agent — Claude, Cursor, Codex — consumes project rules differently. There is no shared format for how agents read context, follow instructions, or report results. Teams waste tokens, lose compliance, and can’t switch tools without rewriting everything.

PRINCIPLES

Open Standards

We publish the protocols AI agents speak. Axiom for rules. CACP for communication. Adopted, not imposed.

Protocol-First

Empirical Research

Every standard is validated with real data. 1,200+ runs across proprietary and open-source models. Published, reproducible, honest.

Data-Driven

Production-Proven

Our standards come from building real products, not committees. If it doesn’t work in production, we don’t publish it.

Ready for Scale
ECOSYSTEM

Standards & Benchmarks

Axiom

Problem: Rules are verbose prose. Agents waste 87% of tokens reading them.

Compiles rules into a compact tabular format. Same compliance, fraction of the cost.
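
A hypothetical before/after sketch, written only to illustrate the idea; the actual Axiom syntax is defined in the spec on GitHub:

  Prose rule (roughly 30 tokens):
    "Every new function must ship with a unit test. Tests live in
    tests/ and must use the shared fixtures, never ad-hoc mocks."

  Compiled tabular rule (roughly 10 tokens):
    ID  | SCOPE  | REQUIRE
    R01 | new fn | unit test in tests/, shared fixtures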

GitHub →

CACP

Problem: Agents return free-form prose. Parsing is fragile and expensive.

Structured I/O protocol. Typed fields replace 2,000-token prose with 200 tokens.
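
As a sketch, a structured agent result might look like the record below; the field names are hypothetical, not the published CACP schema:

  status: ok
  files_changed: [src/auth.py, tests/test_auth.py]
  tests: {passed: 14, failed: 0}
  tokens_used: 212

A consumer reads typed fields directly instead of regexing free-form prose.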

GitHub →

PawBench

Problem: No standard way to benchmark LLM inference with tool-calling.

4-dimensional benchmark: multi-turn, multi-agent, parallel, with tools.
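
A hypothetical scenario descriptor (illustrative names, not PawBench's actual config format) shows how the four dimensions compose:

  scenario: booking-flow
  turns: 6                                  # multi-turn
  agents: [planner, executor]               # multi-agent
  concurrency: 8                            # parallel load
  tools: [search_flights, create_booking]   # tool-calling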

GitHub →

AxiomBench

Problem: Nobody measures if agents actually follow project rules.

Compliance benchmark. 10 rules × 8 tasks × 5 models. 1,247 validated runs.

GitHub →

ServingCard

Problem: Model serving configs are tribal knowledge, not portable metadata.

Open spec for quantization, serving params, and deployment config.
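
A hypothetical card (field names are illustrative; the real schema lives at servingcard.dev):

  model: qwen3-coder
  quantization: awq-int4
  context_window: 32768
  serving:
    engine: vllm
    tensor_parallel: 2
    max_batch_tokens: 8192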

servingcard.dev →
RESEARCH
“Compressed rule formats achieve the same compliance as verbose instructions at 87% lower cost — validated across Claude Opus, Claude Sonnet, and open‑source models.”
91.6% Opus · 90.7% Sonnet · 87.5% qwen3-coder
Read the paper →
ABOUT

The Consultancy

Zen Process

AI engineering consultancy. We help enterprises adopt AI-assisted development with the right standards, architecture, and implementation.

zen-process.com →

The Factory

Zen Labs

Product factory. We build real products using AI-powered engineering. Our standards come from production, not theory.

zp.digital →
AXIOM v0.6.0 ░░ CACP v1.0 ░░ PAWBENCH v1.0 ░░ SERVINGCARD v1.0 ░░ AXIOMBENCH 1247 RUNS ░░ 87% TOKEN SAVINGS ░░ 5 MODELS BENCHMARKED ░░ RESEARCH-FIRST ░░ HONEST CLAIMS ░░ OPEN STANDARDS ░░ standra.ai