A correct agent that fails honestly is better than one that “usually works.”
agent.rs is a host-agnostic, correctness-first agent architecture written in Rust, designed to run unchanged across native, browser, and edge environments using WebAssembly.
It separates agent intelligence from host capabilities, enforces semantic guardrails, and refuses to return plausible-looking but incorrect results.
Why agent.rs?
Most agent frameworks optimize for convenience: they auto-invoke tools, retry silently, and hallucinate "final answers" when things go wrong.
agent.rs optimizes for correctness.
It makes failure explicit, observable, and actionable, while keeping the agent logic:
- Portable
- Auditable
- Deterministic
Core Design Principles
1. The Host Provides Capabilities, Not Intelligence
- Hosts provide LLMs, tools, and I/O
- The agent provides decision logic
- No host-specific branching inside the agent
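A minimal sketch of this split, assuming a hypothetical `Host` trait and a `run_once` function (neither name is taken from agent-core). The agent logic is generic over the host and never checks which host it is running on. The `tool:<name>:<input>` reply convention and `MockHost` are illustrative only.

```rust
/// Capabilities a host supplies. Hypothetical trait, not the agent-core API.
pub trait Host {
    /// Ask the host's LLM for a completion.
    fn complete(&mut self, prompt: &str) -> Result<String, String>;
    /// Run a host-provided tool by name.
    fn call_tool(&mut self, name: &str, input: &str) -> Result<String, String>;
}

/// Agent logic: generic over any host, with no host-specific branching.
pub fn run_once<H: Host>(host: &mut H, query: &str) -> Result<String, String> {
    let plan = host.complete(query)?;
    // Assumed convention: a reply shaped like "tool:<name>:<input>" requests a tool.
    if let Some((name, input)) = plan.strip_prefix("tool:").and_then(|rest| rest.split_once(':')) {
        return host.call_tool(name, input);
    }
    Ok(plan)
}

/// Stand-in host for experiments. A real host would wrap llama.cpp, WebLLM, or an HTTP API.
pub struct MockHost {
    pub reply: String,
}

impl Host for MockHost {
    fn complete(&mut self, _prompt: &str) -> Result<String, String> {
        Ok(self.reply.clone())
    }

    fn call_tool(&mut self, name: &str, input: &str) -> Result<String, String> {
        match name {
            "echo" => Ok(input.to_string()),
            // Unknown tools are an explicit error, never a silent fallback.
            other => Err(format!("unknown tool: {other}")),
        }
    }
}
```

Swapping `MockHost` for a native, browser, or edge implementation would leave `run_once` untouched, which is the property this principle describes.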
2. What's Identical Across All Three Hosts
In all three environments:
- ✅ The agent logic is identical
- ✅ The tool invocation protocol is identical
- ✅ The guardrails and failure semantics are identical
3. Explicit Failure, Not Silent Fallback
- No silent success
- No hallucinated correctness
- Guardrails enforce correctness by design
4. Pure State Transition Engine
- agent-core has zero platform dependencies
- Native runs Rust directly; Browser and Edge use WebAssembly
- Deterministic state → decision transitions
- Side effects handled exclusively by the host
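One way to picture a pure transition engine is a single function from (state, event) to (state, decision). The sketch below is an assumption about shape only: `State`, `Event`, `Decision`, and `step` are hypothetical names, not the types agent-core actually exports.

```rust
/// Hypothetical agent state. The real agent-core types may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum State {
    AwaitingModel,
    AwaitingTool { name: String },
    Done,
}

/// Inputs the host feeds back into the core.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    ModelRequestedTool { name: String },
    ModelAnswered { text: String },
    ToolReturned { output: String },
}

/// What the core asks the host to do next. The core itself performs no I/O.
#[derive(Debug, Clone, PartialEq)]
pub enum Decision {
    InvokeTool { name: String },
    CallModel { context: String },
    Finish { answer: String },
    Fail { reason: String },
}

/// Pure transition: the same (state, event) pair always yields the same result.
pub fn step(state: State, event: Event) -> (State, Decision) {
    match (state, event) {
        (State::AwaitingModel, Event::ModelRequestedTool { name }) => (
            State::AwaitingTool { name: name.clone() },
            Decision::InvokeTool { name },
        ),
        (State::AwaitingModel, Event::ModelAnswered { text }) => {
            (State::Done, Decision::Finish { answer: text })
        }
        (State::AwaitingTool { .. }, Event::ToolReturned { output }) => {
            (State::AwaitingModel, Decision::CallModel { context: output })
        }
        // Any other combination is an explicit failure, not a guess.
        (state, event) => (
            State::Done,
            Decision::Fail {
                reason: format!("unexpected {event:?} in state {state:?}"),
            },
        ),
    }
}
```

Because `step` touches no clock, network, or filesystem, the host is the only place where side effects can occur, and a recorded event sequence can be fed back through it to reproduce the same decisions.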
One Agent, Multiple Hosts
The same agent-core logic runs unchanged across native, browser,
and edge environments. Hosts provide capabilities — the agent provides decisions.
- Native / CLI: Local inference, shell tools, persistent state
- Browser: WebLLM, DOM & fetch tools, session state
- Edge: HTTP LLMs, fetch-only tools, stateless execution
Skills: Contract-Based Operations
Skills are different from tools. They're contract-based operations with built-in guardrails, following Anthropic's Agent Skills specification.
| Aspect | Tools | Skills |
|---|---|---|
| Definition | Host-provided capabilities | Contract-based operations |
| Validation | PlausibilityGuard | Schema + Semantic guardrails |
| Execution | Host executes directly | Host executes, core validates |
| Examples | shell, fetch_url, read_dom | extract |
Extraction Skill
The first built-in skill extracts structured data from unstructured text.
Supported Targets
- email: Email addresses
- url: URLs and links
- date: Dates (ISO format)
- entity: Named entities (people, organizations, locations)
- name: Person names
Example
# CLI
agent extract --text "Contact Dr. Sarah Johnson at sarah@example.com" --target name
# Output: {"name": ["Dr. Sarah Johnson"]}
# Browser / Edge
{"skill": "extract", "text": "Contact Dr. Sarah Johnson", "target": "name"}
Built-in Guardrails
- Schema Validation — Output must match expected JSON structure
- Anti-Hallucination — Extracted values must exist in source text
- Type Correctness — Values must match target type format
Skills make failure explicit. If the LLM hallucinates a value not in the source text, the guardrail rejects it.
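The anti-hallucination rule can be approximated with a plain substring check. This is a simplified sketch, not the shipped guardrail: `check_grounded` is a hypothetical name, and the real check may normalize whitespace or casing differently.

```rust
/// Returns Ok(()) if every extracted value occurs verbatim in `source`.
/// Otherwise returns Err with the values that could not be found.
/// Hypothetical helper, written to illustrate the rule only.
pub fn check_grounded(source: &str, extracted: &[&str]) -> Result<(), Vec<String>> {
    let missing: Vec<String> = extracted
        .iter()
        // An empty value is never considered grounded.
        .filter(|v| v.trim().is_empty() || !source.contains(v.trim()))
        .map(|v| (*v).to_string())
        .collect();

    if missing.is_empty() {
        Ok(())
    } else {
        Err(missing)
    }
}
```

With the source text from the example above, `check_grounded(text, &["Dr. Sarah Johnson"])` passes, while a value such as `"john@example.com"` is returned in the `Err` list so the caller can report exactly what was rejected.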
Three Hosts, One Agent
| Host | LLM | Tools | State | Use Case |
|---|---|---|---|---|
| Native | llama.cpp (local) | shell (with approval) | Persistent | CLI automation |
| Browser | WebLLM (local) | DOM, fetch | Session | Interactive UI |
| Edge | HTTP LLM API | fetch_url | Stateless | Serverless APIs |
Repository Structure
agent.rs/
├─ crates/
│ ├─ agent-core/ # Pure agent logic (WASM-compatible)
│ ├─ agent-native/ # CLI runtime (llama.cpp, shell tools)
│ └─ agent-wasm/ # WASM bindings for agent-core
│
├─ skills/
│ └─ extraction/ # First built-in skill
│ ├─ SKILL.md # Skill contract (Agent Skills spec)
│ ├─ schema.json # JSON schema for input/output
│ └─ README.md # Quick start guide
│
├─ examples/
│ ├─ browser/ # WebLLM + WASM browser demo
│ ├─ edge/ # Deno-based edge runtime
│ └─ with-extraction-skill/ # Skill demo walkthrough
│
├─ Makefile
└─ README.md
Getting Started
Native (CLI)
# Setup
make setup
# Download a model (example: Granite 4.0 Micro)
wget https://huggingface.co/ibm-granite/granite-4.0-micro-GGUF/resolve/main/granite-4.0-micro-Q8_0.gguf
# Run demo
make demo-shell MODEL_PATH=./granite-4.0-micro-Q8_0.gguf
# Or configure .env and run
cp .env.example .env
# Edit .env: MODEL_PATH=./granite-4.0-micro-Q8_0.gguf
make demo-shell
Browser (WebLLM + WASM)
make demo-browser
# Opens http://localhost:8080
# Features:
# - Runs fully locally
# - No API keys needed
# - WebGPU required
# - First run downloads model (~1.8GB)
Edge (Deno)
# Configure environment
cp .env.example .env
# Edit .env:
# LLM_ENDPOINT=https://api.openai.com/v1/chat/completions
# LLM_API_KEY=sk-...
# LLM_MODEL=gpt-4o-mini
# Run
make demo-edge
# Test in another terminal:
curl -X POST http://localhost:8000 \
-H "Content-Type: application/json" \
-d '{"query":"Fetch data from https://httpbin.org/json"}'
Guardrails: Correctness as a First-Class Concern
agent.rs uses semantic guardrails inspired by Mozilla.ai's any-guardrail pattern.
What Guardrails Do
- Reject empty outputs
- Reject metadata-only outputs (e.g. "total 123")
- Reject outputs lacking substance
- Prevent silent success
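The checks listed above could look roughly like the following. The heuristics are assumptions made for illustration: the `total <n>` pattern mirrors the `ls -l` summary line mentioned above, and the three-character substance threshold is arbitrary. `Rejection` and `check_plausible` are hypothetical names.

```rust
/// Why an output was rejected. Hypothetical type for this sketch.
#[derive(Debug, PartialEq)]
pub enum Rejection {
    Empty,
    MetadataOnly,
    LacksSubstance,
}

/// Rejects outputs that are empty, metadata-only, or too thin to be an answer.
pub fn check_plausible(output: &str) -> Result<(), Rejection> {
    let trimmed = output.trim();
    if trimmed.is_empty() {
        return Err(Rejection::Empty);
    }

    // Assumed heuristic: a single line of the form "total <number>".
    let mut lines = trimmed.lines().filter(|l| !l.trim().is_empty());
    if let (Some(first), None) = (lines.next(), lines.next()) {
        let mut words = first.split_whitespace();
        if let (Some("total"), Some(n), None) = (words.next(), words.next(), words.next()) {
            if n.parse::<u64>().is_ok() {
                return Err(Rejection::MetadataOnly);
            }
        }
    }

    // Assumed threshold: fewer than three alphanumeric characters is not an answer.
    if trimmed.chars().filter(|c| c.is_alphanumeric()).count() < 3 {
        return Err(Rejection::LacksSubstance);
    }

    Ok(())
}
```

Returning a typed `Rejection` instead of a boolean keeps the reason available to whatever reports the failure to the user.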
What Happens on Failure
- Clear explanation
- Explicit rejection reason
- Actionable suggestions
- Non-zero exit / visible UI failure
Failures are signals, not bugs.
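As a rough illustration of how a native host might surface such a failure, the sketch below bundles the explanation, rejection reason, and suggestions into one report and maps the outcome to a process exit code. `FailureReport` and `exit_code_for` are hypothetical names, and the message layout is an assumption.

```rust
use std::fmt;

/// Everything a user needs to act on a rejected result. Hypothetical type.
#[derive(Debug)]
pub struct FailureReport {
    pub explanation: String,
    pub reason: String,
    pub suggestions: Vec<String>,
}

impl fmt::Display for FailureReport {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        writeln!(f, "FAILED: {}", self.explanation)?;
        writeln!(f, "Reason: {}", self.reason)?;
        for s in &self.suggestions {
            writeln!(f, "  - {s}")?;
        }
        Ok(())
    }
}

/// Maps an agent outcome to a process exit code: 0 on success, 1 on failure.
pub fn exit_code_for(outcome: &Result<String, FailureReport>) -> i32 {
    match outcome {
        Ok(_) => 0,
        Err(_) => 1,
    }
}
```

A CLI host could print the report to stderr and pass the code to `std::process::exit`, while a browser host would render the same fields in the UI.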
Known Failure Modes (By Design)
Some tasks cannot be completed correctly with:
- Small models (< 7B parameters)
- Insufficient tool reasoning capability
- Ambiguous instructions
agent.rs will:
- Fail loudly
- Explain why
- Refuse to hallucinate
This is intentional.
Roadmap
Near-Term
- Tool postconditions (semantic contracts)
- Executable validation (tests as guardrails)
- Model capability negotiation
Medium-Term
- Multi-agent orchestration
- Streaming decisions
- Policy-based guardrail chains
Long-Term
- Formal agent contracts
- Deterministic replay
- Standardized WASM agent ABI
Why Rust?
Rust enables:
- Memory safety
- Deterministic execution
- Zero-cost abstractions
- First-class WASM support
For agent systems that must be trusted, Rust is not optional.
Inspiration
- Mozilla.ai — agent.cpp, llamafile, any-guardrail
- WASM-first system design
- Correctness-over-convenience philosophy