A correct agent that fails honestly is better than one that “usually works.”

agent.rs is a host-agnostic, correctness-first agent architecture written in Rust, designed to run unchanged across native, browser, and edge environments using WebAssembly.

It separates agent intelligence from host capabilities, enforces semantic guardrails, and refuses to return plausible-looking but incorrect results.

Why agent.rs?

Most agent frameworks optimize for convenience: they auto-invoke tools, retry silently, and hallucinate "final answers" when things go wrong.

agent.rs optimizes for correctness.

It makes failure explicit, observable, and actionable, while keeping the agent logic:

  • Portable
  • Auditable
  • Deterministic

Core Design Principles

1. The Host Provides Capabilities, Not Intelligence

  • Hosts provide LLMs, tools, and I/O
  • The agent provides decision logic
  • No host-specific branching inside the agent
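
The split above can be sketched as a capability trait that every host implements, with agent logic that is generic over it. This is a minimal illustration, not the actual agent-core API: the `Host` trait, its method names, `run_once`, `MockHost`, and the `tool:<name>:<input>` plan convention are all assumptions made for this example.

```rust
/// Hypothetical capability interface; the real agent-core API may differ.
pub trait Host {
    /// Ask the host's LLM for a completion.
    fn complete(&mut self, prompt: &str) -> Result<String, String>;
    /// Run a host-provided tool by name.
    fn call_tool(&mut self, name: &str, input: &str) -> Result<String, String>;
}

/// Agent logic is generic over the host and never branches on the host type.
pub fn run_once<H: Host>(host: &mut H, query: &str) -> Result<String, String> {
    let plan = host.complete(query)?;
    // Illustrative convention: a plan shaped like "tool:<name>:<input>" requests a tool.
    if let Some((name, input)) = plan
        .strip_prefix("tool:")
        .and_then(|rest| rest.split_once(':'))
    {
        return host.call_tool(name, input);
    }
    Ok(plan)
}

/// Stand-in host for experiments; a native, browser, or edge host
/// would implement the same trait with real capabilities.
pub struct MockHost {
    pub reply: String,
}

impl Host for MockHost {
    fn complete(&mut self, _prompt: &str) -> Result<String, String> {
        Ok(self.reply.clone())
    }

    fn call_tool(&mut self, name: &str, input: &str) -> Result<String, String> {
        match name {
            "echo" => Ok(input.to_string()),
            other => Err(format!("unknown tool: {other}")),
        }
    }
}
```

Because `run_once` only sees the trait, swapping `MockHost` for another implementation changes the capabilities but not the decision logic.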

2. What's Identical Across All Three Hosts

In all three environments:

  • ✅ The agent logic is identical
  • ✅ The tool invocation protocol is identical
  • ✅ The guardrails and failure semantics are identical

3. Explicit Failure, Not Silent Fallback

  • No silent success
  • No hallucinated correctness
  • Guardrails enforce correctness by design

4. Pure State Transition Engine

  • agent-core has zero platform dependencies
  • Native runs Rust directly; Browser and Edge use WebAssembly
  • Deterministic state → decision transitions
  • Side effects handled exclusively by the host
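
A pure state transition engine can be pictured as a single function from (state, event) to (next state, decision). The sketch below is a toy under stated assumptions: `Event`, `Decision`, `AgentState`, and `step` are hypothetical names, and the hard-coded policy stands in for whatever agent-core actually decides.

```rust
/// Illustrative types; the names are assumptions, not the real agent-core API.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    UserQuery(String),
    ToolOutput(String),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Decision {
    /// Ask the host to run a tool; the core never runs it itself.
    InvokeTool { name: String, input: String },
    Answer(String),
    Fail { reason: String },
}

#[derive(Debug, Clone, PartialEq, Default)]
pub struct AgentState {
    pub steps: u32,
}

/// Pure transition: the same state and event always yield the same result.
/// No I/O, clocks, or randomness are used here.
pub fn step(state: &AgentState, event: &Event) -> (AgentState, Decision) {
    let next = AgentState { steps: state.steps + 1 };
    let decision = match event {
        Event::UserQuery(q) if q.trim().is_empty() => Decision::Fail {
            reason: "empty query".to_string(),
        },
        // Toy policy: every non-empty query becomes a fetch request.
        Event::UserQuery(q) => Decision::InvokeTool {
            name: "fetch_url".to_string(),
            input: q.clone(),
        },
        Event::ToolOutput(out) if out.trim().is_empty() => Decision::Fail {
            reason: "tool returned empty output".to_string(),
        },
        Event::ToolOutput(out) => Decision::Answer(out.clone()),
    };
    (next, decision)
}
```

The host would loop over `step`, perform whatever side effect a `Decision` asks for, and feed the result back in as the next `Event`.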

One Agent, Multiple Hosts

The same agent-core logic runs unchanged across native, browser, and edge environments. Hosts provide capabilities — the agent provides decisions.

[Figure: agent.rs architecture diagram showing CLI, Browser, and Edge hosts sharing the same agent-core]

  • Native / CLI: Local inference, shell tools, persistent state
  • Browser: WebLLM, DOM & fetch tools, session state
  • Edge: HTTP LLMs, fetch-only tools, stateless execution

Skills: Contract-Based Operations

Skills are different from tools. They're contract-based operations with built-in guardrails, following Anthropic's Agent Skills specification.

Aspect      | Tools                       | Skills
Definition  | Host-provided capabilities  | Contract-based operations
Validation  | PlausibilityGuard           | Schema + semantic guardrails
Execution   | Host executes directly      | Host executes, core validates
Examples    | shell, fetch_url, read_dom  | extract

Extraction Skill

The first built-in skill extracts structured data from unstructured text.

Supported Targets

  • email — Email addresses
  • url — URLs and links
  • date — Dates (ISO format)
  • entity — Named entities (people, organizations, locations)
  • name — Person names

Example

# CLI
agent extract --text "Contact Dr. Sarah Johnson at sarah@example.com" --target name
# Output: {"name": ["Dr. Sarah Johnson"]}

# Browser / Edge
{"skill": "extract", "text": "Contact Dr. Sarah Johnson", "target": "name"}
            

Built-in Guardrails

  • Schema Validation — Output must match expected JSON structure
  • Anti-Hallucination — Extracted values must exist in source text
  • Type Correctness — Values must match target type format

Skills make failure explicit. If the LLM hallucinates a value not in the source text, the guardrail rejects it.
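
As a rough illustration of the anti-hallucination and type-correctness guardrails, the sketch below checks that each extracted value appears verbatim in the source text and, for the `email` target, that it has a plausible email shape. The names `Rejection`, `check_grounded`, `looks_like_email`, and `validate_emails` are hypothetical, and the email check is deliberately loose; the project's real guardrails may work differently.

```rust
/// Why a candidate value was rejected. Names here are illustrative only.
#[derive(Debug, PartialEq)]
pub enum Rejection {
    NotInSource(String),
    WrongFormat(String),
}

/// Anti-hallucination check: every extracted value must appear verbatim in the source.
pub fn check_grounded(source: &str, values: &[&str]) -> Result<(), Rejection> {
    for v in values {
        if v.is_empty() || !source.contains(*v) {
            return Err(Rejection::NotInSource((*v).to_string()));
        }
    }
    Ok(())
}

/// Loose email shape check: exactly one '@', a non-empty local part, no whitespace,
/// and a dotted domain with no empty labels. Not a full RFC 5322 validator.
pub fn looks_like_email(value: &str) -> bool {
    match value.split_once('@') {
        Some((local, domain)) => {
            !local.is_empty()
                && !domain.contains('@')
                && !value.contains(char::is_whitespace)
                && domain.split('.').count() >= 2
                && domain.split('.').all(|part| !part.is_empty())
        }
        None => false,
    }
}

/// Run both guardrails for the `email` target, reporting the first failure.
pub fn validate_emails(source: &str, values: &[&str]) -> Result<(), Rejection> {
    check_grounded(source, values)?;
    match values.iter().find(|v| !looks_like_email(v)) {
        Some(bad) => Err(Rejection::WrongFormat((*bad).to_string())),
        None => Ok(()),
    }
}
```

With the source text from the example above, `sarah@example.com` passes both checks, while an address the model invented is rejected as `NotInSource` before its format is even considered.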

Three Hosts, One Agent

Host     | LLM                | Tools                  | State       | Use Case
Native   | llama.cpp (local)  | shell (with approval)  | Persistent  | CLI automation
Browser  | WebLLM (local)     | DOM, fetch             | Session     | Interactive UI
Edge     | HTTP LLM API       | fetch_url              | Stateless   | Serverless APIs

The host provides capabilities. The agent provides decisions.

  • Same agent-core logic across all environments
  • Identical guardrails and failure semantics
  • Identical tool invocation protocol

Repository Structure

agent.rs/
├─ crates/
│  ├─ agent-core/     # Pure agent logic (WASM-compatible)
│  ├─ agent-native/   # CLI runtime (llama.cpp, shell tools)
│  └─ agent-wasm/     # WASM bindings for agent-core
│
├─ skills/
│  └─ extraction/     # First built-in skill
│     ├─ SKILL.md     # Skill contract (Agent Skills spec)
│     ├─ schema.json  # JSON schema for input/output
│     └─ README.md    # Quick start guide
│
├─ examples/
│  ├─ browser/        # WebLLM + WASM browser demo
│  ├─ edge/           # Deno-based edge runtime
│  └─ with-extraction-skill/  # Skill demo walkthrough
│
├─ Makefile
└─ README.md
        

Getting Started

Native (CLI)

# Setup
make setup

# Download a model (example: Granite 4.0 Micro)
wget https://huggingface.co/ibm-granite/granite-4.0-micro-GGUF/resolve/main/granite-4.0-micro-Q8_0.gguf

# Run demo
make demo-shell MODEL_PATH=./granite-4.0-micro-Q8_0.gguf

# Or configure .env and run
cp .env.example .env
# Edit .env: MODEL_PATH=./granite-4.0-micro-Q8_0.gguf
make demo-shell
          

Browser (WebLLM + WASM)

make demo-browser
# Opens http://localhost:8080

# Features:
# - Runs fully locally
# - No API keys needed
# - WebGPU required
# - First run downloads model (~1.8GB)
          

Edge (Deno)

# Configure environment
cp .env.example .env
# Edit .env:
# LLM_ENDPOINT=https://api.openai.com/v1/chat/completions
# LLM_API_KEY=sk-...
# LLM_MODEL=gpt-4o-mini

# Run
make demo-edge

# Test in another terminal:
curl -X POST http://localhost:8000 \
  -H "Content-Type: application/json" \
  -d '{"query":"Fetch data from https://httpbin.org/json"}'
          

Guardrails: Correctness as a First-Class Concern

agent.rs uses semantic guardrails inspired by Mozilla.ai's any-guardrail pattern.

What Guardrails Do

  • Reject empty outputs
  • Reject metadata-only outputs (e.g. "total 123")
  • Reject outputs lacking substance
  • Prevent silent success

What Happens on Failure

  • Clear explanation
  • Explicit rejection reason
  • Actionable suggestions
  • Non-zero exit / visible UI failure
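
One way to carry those four pieces through to the user is a structured failure value that renders itself and maps to an exit status. The `Failure` struct, its field names, the output format, and `exit_status` are assumptions for illustration; the actual CLI output may look different.

```rust
use std::fmt;

/// Illustrative failure report; the field names are assumptions.
#[derive(Debug)]
pub struct Failure {
    pub explanation: String,
    pub reason: String,
    pub suggestions: Vec<String>,
}

impl fmt::Display for Failure {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        writeln!(f, "FAILED: {}", self.explanation)?;
        writeln!(f, "Reason: {}", self.reason)?;
        for s in &self.suggestions {
            writeln!(f, "  try: {s}")?;
        }
        Ok(())
    }
}

/// Map an outcome to a process exit status: 0 on success, 1 on failure.
/// Failures are written to stderr so they are never mistaken for results.
pub fn exit_status<T>(outcome: &Result<T, Failure>) -> u8 {
    match outcome {
        Ok(_) => 0,
        Err(failure) => {
            eprint!("{failure}");
            1
        }
    }
}
```

In a native binary, the returned value can be passed to `std::process::ExitCode::from` from `main`; a browser host would render the same `Failure` in the UI instead.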

Failures are signals, not bugs.

Known Failure Modes (By Design)

Some tasks cannot be completed correctly with:

  • Small models (< 7B parameters)
  • Insufficient tool reasoning capability
  • Ambiguous instructions

agent.rs will:

  • Fail loudly
  • Explain why
  • Refuse to hallucinate

This is intentional.

Roadmap

Near-Term

  • Tool postconditions (semantic contracts)
  • Executable validation (tests as guardrails)
  • Model capability negotiation

Medium-Term

  • Multi-agent orchestration
  • Streaming decisions
  • Policy-based guardrail chains

Long-Term

  • Formal agent contracts
  • Deterministic replay
  • Standardized WASM agent ABI

Why Rust?

Rust enables:

  • Memory safety
  • Deterministic execution
  • Zero-cost abstractions
  • First-class WASM support

For agent systems that must be trusted, Rust is not optional.

Inspiration

  • Mozilla.ai — agent.cpp, llamafile, any-guardrail
  • WASM-first system design
  • Correctness-over-convenience philosophy