Edge-Cloud Collaborative AI Agent
EdgeClaw: Keep sensitive data off the cloud, let cheap models handle 80% of requests
[中文 | English]
What's New
- [2026.03.13] EdgeClaw adds Cost-Aware Collaboration: it automatically judges task complexity and matches each request to the most economical cloud model
- [2026.02.12] EdgeClaw is officially open-sourced: an Edge-Cloud Collaborative AI Agent
EdgeClaw is an Edge-Cloud Collaborative AI Agent jointly developed by THUNLP (Tsinghua University), Renmin University of China, AI9Stars, ModelBest, and OpenBMB, built on top of OpenClaw.
In current AI Agent architectures, the edge has long been overlooked: all data and tasks are funneled to the cloud, leading to privacy leaks and wasted compute. EdgeClaw reactivates the value of edge computing with a customizable three-tier security system (S1 Passthrough / S2 Desensitization / S3 Local). Using a dual detection engine on the edge (rule-based detection, ~0ms, plus local LLM semantic detection, ~1–2s), it classifies the sensitivity and complexity of every request in real time, then routes each request through a unified composable pipeline to the most privacy-safe and cost-effective processing path. With intelligent edge-cloud forwarding, developers get seamless privacy protection ("public data to the cloud, sensitive data desensitized, private data stays local") without modifying any business logic.
- **Edge-Cloud Division of Labor**: The edge perceives data attributes (sensitivity, complexity); the cloud handles reasoning and generation. The edge covers the cloud's blind spots (sensitive data never leaves the device), while the cloud compensates for the edge's limitations (complex tasks are offloaded to the cloud).
- **Three-Tier Security Collaboration**: Safe data (S1) is sent directly to the cloud; sensitive data (S2) is desensitized on-device before forwarding to the cloud; private data (S3) is processed entirely on-device, with the cloud only maintaining context continuity.
- **Cost-Aware Collaboration**: A local LLM semantically judges task complexity, routing simple tasks to cheap models and reserving expensive models for complex tasks only. In typical workflows, 60–80% of requests are forwarded to low-cost models, drastically cutting cloud token expenses.
- **Plug-and-Play, Zero Code Changes**: EdgeClaw automatically intercepts and routes via its Hook mechanism, with no modifications to any business logic required. It serves as a seamless drop-in replacement for OpenClaw.
Every user message, tool call, and tool result is inspected in real time and automatically classified into one of three levels:
| Level | Meaning | Routing Strategy | Example |
|---|---|---|---|
| S1 | Safe | Send directly to cloud model | "Write a poem about spring" |
| S2 | Sensitive | Desensitize then forward to cloud | Addresses, phone numbers, emails |
| S3 | Private | Process locally only | Pay slips, passwords, SSH keys |
| Engine | Mechanism | Latency | Coverage |
|---|---|---|---|
| Rule Detector | Keywords + Regex matching | ~0ms | Known patterns: API keys, DB connection strings, PEM key headers |
| Local LLM Detector | Semantic understanding (runs on a local small model) | ~1–2s | Contextual reasoning: "Analyze this pay slip for me", addresses in various languages |
The two engines can be stacked and combined, flexibly enabled per scenario via the checkpoints configuration.
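To make the rule engine concrete, here is a minimal sketch of a keyword + regex detector of this kind. The names (`RULES`, `detectLevel`) and the specific patterns are illustrative assumptions, not EdgeClaw's actual API:

```typescript
// Minimal sketch of a keyword + regex rule detector (hypothetical names,
// not EdgeClaw's actual API). Rules are ordered strictest-first, so the
// highest sensitivity level that matches wins.
type Level = "S1" | "S2" | "S3";

const RULES: { level: Level; patterns: RegExp[] }[] = [
  {
    level: "S3", // private: never leaves the device
    patterns: [/-----BEGIN (?:RSA |EC )?PRIVATE KEY-----/, /\bid_rsa\b/],
  },
  {
    level: "S2", // sensitive: desensitize before forwarding
    patterns: [/\bapi[_-]?key\b/i, /(?:mysql|postgres|mongodb):\/\/\S+/],
  },
];

function detectLevel(text: string): Level {
  for (const rule of RULES) {
    if (rule.patterns.some((p) => p.test(text))) return rule.level;
  }
  return "S1"; // default: safe to send to the cloud
}
```

Because it is pure string matching, a detector like this runs in effectively zero time, which is why the rule engine can sit on the hot path of every message.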
```
User Message (containing PII)
        │
        ▼
Local LLM Detection → S2
        │
        ▼
Local LLM Extracts PII → JSON Array
        │
        ▼
Programmatic PII Replacement → [REDACTED:PHONE], [REDACTED:ADDRESS]
        │
        ▼
Privacy Proxy (localhost:8403)
  ├── Strips PII markers
  ├── Forwards to cloud model
  └── Passes through response (supports SSE streaming)
```
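The "Programmatic PII Replacement" step can be sketched as follows. The `redactPII` helper and its entity shape are hypothetical; in EdgeClaw the entity list comes from the local LLM's JSON extraction:

```typescript
// Sketch of programmatic PII replacement: entity (type, value) pairs
// extracted by the local LLM are substituted with typed markers before the
// message is forwarded to the cloud. Names here are illustrative.
interface PIIEntity {
  type: string;  // e.g. "PHONE", "ADDRESS"
  value: string; // exact substring found in the message
}

function redactPII(message: string, entities: PIIEntity[]): string {
  let out = message;
  for (const e of entities) {
    // split/join replaces every occurrence without regex-escaping issues
    out = out.split(e.value).join(`[REDACTED:${e.type}]`);
  }
  return out;
}
```

Doing the substitution programmatically (rather than asking the LLM to rewrite the message) guarantees that every extracted value is removed verbatim, with no risk of the model paraphrasing the PII back in.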
```
User Message (containing private data)
        │
        ▼
Detection → S3
        │
        ▼
Forward to Local Guard Agent
  ├── Uses local LLM (Ollama / vLLM)
  ├── Full data visible, entirely local inference
  └── Cloud-side history only receives a placeholder
```
```
~/.openclaw/workspace/
├── MEMORY.md        ← What the cloud model sees (auto-desensitized)
└── MEMORY-FULL.md   ← What the local model sees (complete data)

agents/{id}/sessions/
├── full/   ← Complete history (including Guard Agent interactions)
└── clean/  ← Clean history (for cloud model consumption)
```
The cloud model never sees `MEMORY-FULL.md` or `sessions/full/`: the Hook system intercepts at the file-access layer.
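In the spirit of that file-access interception, a guard of this kind might look like the sketch below. The function name, tool name, and return shape are assumptions for illustration, not EdgeClaw's actual hook signature:

```typescript
// Sketch of a file-access guard: block tool calls that would read
// cloud-invisible paths. The names and path list are illustrative,
// not EdgeClaw's real hook API.
const CLOUD_INVISIBLE = ["MEMORY-FULL.md", "/sessions/full/"];

function guardFileAccess(
  toolName: string,
  path: string
): { allow: boolean; reason?: string } {
  if (
    toolName === "read_file" &&
    CLOUD_INVISIBLE.some((p) => path.includes(p))
  ) {
    return { allow: false, reason: `"${path}" is local-only` };
  }
  return { allow: true };
}
```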
Theorem 1 (Cloud-Side Invisibility): For any S3-level data x, its original content is completely invisible to the cloud:

∀x: Detect(x) = S₃ ⟹ x ⊄ Cloud(x)

Theorem 2 (Desensitization Completeness): For any S2-level data x, the cloud-visible form contains none of the original privacy entity values:

∀x: Detect(x) = S₂ ⟹ ∀(tᵢ, vᵢ) ∈ Extract(x): vᵢ ⊄ Cloud(x)
In a typical AI coding assistant workflow, most requests involve browsing files, reading code, and simple Q&A β using the most expensive model for these tasks is pure waste. Cost-Aware Collaboration uses a local small model as an LLM-as-Judge, classifying requests by complexity and routing them to cloud models at different price tiers.
| Complexity | Task Examples | Default Target Model |
|---|---|---|
| SIMPLE | Queries, translation, formatting, greetings | gpt-4o-mini |
| MEDIUM | Code generation, single-file editing, email drafting | gpt-4o |
| COMPLEX | System design, multi-file refactoring, cross-document analysis | claude-sonnet-4.6 |
| REASONING | Mathematical proofs, formal logic, experiment design | o4-mini |
| Approach | Pros | Cons |
|---|---|---|
| Keyword Rules | Fast | No semantic understanding, high false-positive rate |
| LLM-as-Judge | Semantic understanding, multilingual | One additional local model call (~1–2s) |
The Judge runs on a local small model (e.g., MiniCPM-4.1 / Qwen3.5), with a latency of roughly 1–2 seconds.
Prompt-hash caching (SHA-256, 5-minute TTL) ensures identical requests are not re-judged, further reducing latency overhead.
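The prompt-hash cache described above can be sketched like this. It is a minimal sketch assuming a synchronous judge callback; the names are illustrative, not the plugin's real API:

```typescript
import { createHash } from "node:crypto";

// Sketch of SHA-256 prompt-hash caching with a 5-minute TTL: identical
// prompts within the TTL reuse the cached verdict instead of re-running
// the LLM judge. Names are illustrative.
const TTL_MS = 5 * 60 * 1000;
const cache = new Map<string, { verdict: string; at: number }>();

function cachedJudge(prompt: string, judge: (p: string) => string): string {
  const key = createHash("sha256").update(prompt).digest("hex");
  const hit = cache.get(key);
  if (hit && Date.now() - hit.at < TTL_MS) return hit.verdict; // cache hit
  const verdict = judge(prompt); // slow path: one local model call
  cache.set(key, { verdict, at: Date.now() });
  return verdict;
}
```

Hashing the prompt rather than storing it keeps the cache key fixed-size and avoids retaining potentially sensitive prompt text in memory longer than needed.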
In a typical coding assistant workflow, Cost-Aware Collaboration can route 60–80% of requests to cheaper models.
Security collaboration and cost-aware collaboration run in the same pipeline, working together via weights and a two-phase short-circuit strategy:
```
User Message
    │
    ▼
RouterPipeline.run()
    │
    ├── Phase 1: Fast routers (weight ≥ 50) run in parallel
    │     └── security router → three-tier sensitivity detection
    │
    ├── Short-circuit: if Phase 1 detects sensitive data → skip Phase 2
    │
    └── Phase 2: Slow routers (weight < 50) run on demand
          └── cost-aware router → LLM Judge task complexity classification
```
Design Philosophy: security first. The security router runs first with high weight; if sensitive data is found, it short-circuits immediately without wasting time on complexity judgment. Cost-aware collaboration kicks in only after the security check passes (S1).
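The two-phase, weight-ordered short-circuit can be sketched as below. This is a simplified illustration with hypothetical types, not the actual `RouterPipeline` implementation:

```typescript
// Sketch of the two-phase pipeline with short-circuit: fast routers
// (weight >= 50) run first; if any reports sensitive data, slow routers
// are skipped entirely. Types and names are illustrative.
interface Router {
  id: string;
  weight: number;
  detect(msg: string): { level: "S1" | "S2" | "S3" };
}

function runPipeline(
  routers: Router[],
  msg: string
): { level: string; ranSlow: boolean } {
  const fast = routers.filter((r) => r.weight >= 50);
  const slow = routers.filter((r) => r.weight < 50);

  // Phase 1: fast (security) routers
  for (const r of fast) {
    const d = r.detect(msg);
    if (d.level !== "S1") return { level: d.level, ranSlow: false }; // short-circuit
  }

  // Phase 2: slow (cost-aware) routers, only when Phase 1 found nothing
  for (const r of slow) r.detect(msg);
  return { level: "S1", ranSlow: slow.length > 0 };
}
```

The weight threshold doubles as a priority ordering: anything security-critical gets weight ≥ 50 so it can veto the request before any expensive judgment runs.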
```
                                             ⎧ θ_cloud(m)       if a = passthrough
m →[c_msg] Detect(m) → l →[c_route] R(l) → a ⎨ θ_cloud(De(m))   if a = desensitize
                                             ⎩ θ_local(m)       if a = redirect

  →[c_persist] W(m, l) →[c_end] Sync
```
| Hook | Trigger Point | Core Responsibility |
|---|---|---|
| `before_model_resolve` | Before model selection | Run pipeline → routing decision |
| `before_prompt_build` | Before prompt construction | Inject Guard Prompt / S2 markers |
| `before_tool_call` | Before tool invocation | File access guard + sub-agent guard |
| `after_tool_call` | After tool invocation | Tool result detection |
| `tool_result_persist` | Result persistence | Dual-track session write |
| `before_message_write` | Before message write | S3 → placeholder, S2 → desensitized version |
| `session_end` | Session ends | Memory synchronization |
| `message_sending` | Outbound message | Detect and desensitize/cancel |
| `before_agent_start` | Before sub-agent starts | Task content guard |
| `message_received` | Message received | Observability logging |
Setup has two parts: installing from source (recommended) and preparing the local LLM environment.
```shell
git clone https://github.com/openbmb/edgeclaw.git
cd edgeclaw
pnpm install
pnpm build
pnpm ui:build
pnpm openclaw onboard --install-daemon
```

EdgeClaw requires a local inference backend for privacy detection and the Guard Agent. We recommend Ollama:
```shell
# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.ai/install.sh | sh

# Pull the recommended model
ollama pull openbmb/minicpm4.1

# Start the service
ollama serve
```

All OpenAI-compatible APIs are also supported, including vLLM, LMStudio, SGLang, TGI, etc. See `config.example.json` for backend configuration examples.
```shell
pnpm openclaw gateway run
```

If you see the GuardClaw plugin loading logs, the installation was successful.
Add the following to `openclaw.json`:
```json
{
  "plugins": {
    "entries": {
      "GuardClaw": {
        "enabled": true,
        "config": {
          "privacy": {
            "enabled": true,
            "localModel": {
              "enabled": true,
              "provider": "ollama",
              "model": "openbmb/minicpm4.1",
              "endpoint": "http://localhost:11434"
            },
            "guardAgent": {
              "id": "guard",
              "workspace": "~/.openclaw/workspace-guard",
              "model": "ollama/openbmb/minicpm4.1"
            }
          }
        }
      }
    }
  }
}
```

Register the Guard Agent in the `agents` section of `openclaw.json`:
```json
{
  "agents": {
    "list": [
      {
        "id": "main",
        "workspace": "~/.openclaw/workspace-main",
        "subagents": { "allowAgents": ["guard"] }
      },
      {
        "id": "guard",
        "workspace": "~/.openclaw/workspace-guard",
        "model": "ollama/openbmb/minicpm4.1"
      }
    ]
  }
}
```

```shell
pnpm openclaw gateway run
```

EdgeClaw automatically intercepts and routes; no modifications to any business logic are required.
Enable it in `privacy.routers`:
```json
{
  "privacy": {
    "routers": {
      "token-saver": {
        "enabled": true,
        "weight": 30,
        "options": {
          "tiers": {
            "SIMPLE": { "provider": "openai", "model": "gpt-4o-mini" },
            "MEDIUM": { "provider": "openai", "model": "gpt-4o" },
            "COMPLEX": { "provider": "anthropic", "model": "claude-sonnet-4.6" },
            "REASONING": { "provider": "openai", "model": "o4-mini" }
          }
        }
      }
    }
  }
}
```

Custom detection rules (keywords, regex patterns, and tool/path constraints per level) are configured under `privacy.rules`:

```json
{
  "privacy": {
    "rules": {
      "keywords": {
        "S2": ["password", "api_key", "token"],
        "S3": ["ssh", "id_rsa", "private_key", ".pem"]
      },
      "patterns": {
        "S2": ["(?:mysql|postgres|mongodb)://[^\\s]+"],
        "S3": ["-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"]
      },
      "tools": {
        "S2": { "tools": ["exec", "shell"], "paths": ["~/secrets"] },
        "S3": { "tools": ["sudo"], "paths": ["~/.ssh", "~/.aws"] }
      }
    }
  }
}
```

The detection engines enabled at each checkpoint are configured under `privacy.checkpoints`:

```json
{
  "privacy": {
    "checkpoints": {
      "onUserMessage": ["ruleDetector", "localModelDetector"],
      "onToolCallProposed": ["ruleDetector"],
      "onToolCallExecuted": ["ruleDetector"]
    }
  }
}
```

The EdgeClaw collaboration pipeline is fully extensible; implement the `GuardClawRouter` interface to inject custom collaboration logic:
```typescript
const myRouter: GuardClawRouter = {
  id: "content-filter",
  async detect(context, pluginConfig): Promise<RouterDecision> {
    if (context.message && context.message.length > 10000) {
      return {
        level: "S1",
        action: "redirect",
        target: { provider: "anthropic", model: "claude-sonnet-4.6" },
        reason: "Message too long, using larger context model",
      };
    }
    return { level: "S1", action: "passthrough" };
  },
};
```

Then register the custom router in the configuration:

```json
{
  "privacy": {
    "routers": {
      "content-filter": {
        "enabled": true,
        "type": "custom",
        "module": "./my-routers/content-filter.js",
        "weight": 40
      }
    },
    "pipeline": {
      "onUserMessage": ["privacy", "token-saver", "content-filter"]
    }
  }
}
```

Edit the Markdown files under `extensions/guardclaw/prompts/` to adjust behavior; no code changes are needed:
| File | Purpose |
|---|---|
| `detection-system.md` | S1/S2/S3 classification rules |
| `guard-agent-system.md` | Guard Agent behavior |
| `token-saver-judge.md` | Task complexity classification |
Built-in presets allow one-click switching between local-model + cloud-model combinations:

| Preset | Local Model | Cloud Model | Use Case |
|---|---|---|---|
| `vllm-qwen35` | vLLM / Qwen 3.5-35B | Same (fully local) | Full local deployment, maximum privacy |
| `minimax-cloud` | vLLM / Qwen 3.5-35B | MiniMax M2.5 | Local privacy detection + cloud primary model |

Custom presets for Ollama, LMStudio, SGLang, and other backends are also supported.
```
extensions/guardclaw/
├── index.ts                  # Plugin entry point
├── openclaw.plugin.json      # Plugin metadata
├── config.example.json       # Configuration example
│
├── src/
│   ├── detector.ts           # Detection engine (coordinates dual detectors)
│   ├── rules.ts              # Rule detector (keywords + regex)
│   ├── local-model.ts        # Local LLM detector + desensitization engine
│   ├── router-pipeline.ts    # Router pipeline (two-phase + weighted merge)
│   ├── hooks.ts              # 10 Hooks
│   ├── privacy-proxy.ts      # HTTP privacy proxy
│   ├── guard-agent.ts        # Guard Agent management
│   ├── session-state.ts      # Session privacy state
│   ├── session-manager.ts    # Dual-track session history
│   ├── memory-isolation.ts   # Dual-track memory management
│   └── routers/
│       ├── privacy.ts        # Privacy router (security)
│       └── token-saver.ts    # Cost-Aware router (cost savings)
│
├── prompts/                  # Customizable prompt templates
│   ├── detection-system.md
│   ├── guard-agent-system.md
│   └── token-saver-judge.md
│
└── test/                     # Test suite
    ├── rules.test.ts
    ├── detector.test.ts
    ├── router-pipeline.test.ts
    ├── token-saver.test.ts
    ├── privacy-proxy.test.ts
    └── integration.test.ts
```
Thanks to all contributors for their code submissions and testing. We welcome new members to join us in building the edge-cloud collaborative Agent ecosystem!
Contributing workflow: fork this repo → submit Issues → create Pull Requests (PRs)
If this project is helpful to your research or work, please give us a star!
- For technical questions and feature requests, please use GitHub Issues
- OpenClaw – Base AI assistant framework
- MiniCPM – Recommended local detection model
- Ollama – Recommended local inference backend
MIT


