What is IncryptRouter?
IncryptRouter (Incrypt Smart Router) is a single, production-ready proxy that sits between your OpenClaw agents — or any OpenAI-compatible client — and multiple LLM providers. Instead of sending every request to one expensive model, it classifies each request by complexity, compresses conversation history to cut tokens, and routes to the cheapest capable model. You get one OpenAI-compatible HTTP endpoint, your own API keys (free-tier first), optional response caching, and automatic failover. No wallet or x402 required.
Why routing matters
If you use OpenClaw or any single-model API with a premium model as default, most traffic is overkill: autocomplete, short Q&A, and syntax fixes get sent to a $15–25/M-token model. IncryptRouter fixes that by sending each request to the cheapest model that can handle it: simple tasks go to free or low-cost providers (Groq, Cerebras, HuggingFace), standard tasks to mid-tier models, and only hard reasoning hits premium models. You add API keys as you need them and scale by editing .env.
Core capabilities
- Smart routing — Classifies each request as simple, standard, or complex and sends it to the cheapest capable provider. Free tiers are used first when keys are present.
- Context compression — Compacts conversation history before calling the model (rule-based cleanup, dictionary coding, RLE-style shortening), reducing token usage and cost. Inspired by claw-compactor.
- Caching — Caches responses by request fingerprint; repeated or near-identical requests are served from cache with configurable TTL and size.
- Fallback — If the chosen provider times out or errors, the router tries the next provider in the same tier automatically.
- Observability — Per-request metadata (tier, model, provider, latency, cache hit) and a `/stats` endpoint for dashboards or debugging.
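The caching behavior can be pictured as a TTL'd, size-bounded map keyed by a request fingerprint. The sketch below is a reader's illustration, not the router's actual implementation; the class and method names are invented, and SHA-256 over the JSON body is an assumed fingerprint scheme:

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch of fingerprint-based response caching with a TTL
// and a max-size bound; names do not come from the actual codebase.
interface CacheEntry {
  value: string;
  expiresAt: number;
}

class ResponseCache {
  private store = new Map<string, CacheEntry>();

  constructor(private ttlMs = 3600_000, private maxSize = 1000) {}

  // Fingerprint: a stable hash of the request body (messages + options).
  key(body: unknown): string {
    return createHash("sha256").update(JSON.stringify(body)).digest("hex");
  }

  get(k: string): string | undefined {
    const entry = this.store.get(k);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(k); // expired: drop and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(k: string, value: string): void {
    if (this.store.size >= this.maxSize) {
      // Evict the oldest insertion (Map iterates in insertion order).
      const oldest = this.store.keys().next().value;
      if (oldest !== undefined) this.store.delete(oldest);
    }
    this.store.set(k, { value, expiresAt: Date.now() + this.ttlMs });
  }
}
```

Keying on the full body means any change to messages or options is a cache miss, which matches the "repeated or near-identical requests" behavior only after compaction normalizes the history.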
How the pipeline works
Every request to /v1/chat/completions goes through a deterministic pipeline before hitting any provider.
- Cache gate — The request (messages + options) is hashed. On a cache hit, the stored response is returned with `meta.cacheHit: true`; no compaction or provider call occurs.
- Compaction — If compression is enabled, the message list is run through rule-based cleanup (dedupe, collapse newlines/spaces), optional dictionary coding (frequent phrases → short codes), and RLE-style shortening to reduce token count.
- Classifier — A heuristic (length, keywords like "analyze", "report", thresholds) maps the request to simple, standard, or complex. No external API is used.
- Routing policy — For the chosen tier, the router picks from the configured providers that have an API key in `.env`. Free-first: the simple tier uses free providers; standard and complex tiers use paid providers when keys are present.
- Provider mesh + failover — The router calls the first provider in the tier; on timeout or error it tries the next in order. The response is normalized to an OpenAI-shaped structure.
- Response and telemetry — The response is cached, and metadata (tier, model, provider, latency, cache hit, token counts) is appended and stored for `/stats`.
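The classifier step can be sketched as a pure function. The keyword list and length threshold below are illustrative assumptions, not the router's actual values:

```typescript
type Tier = "simple" | "standard" | "complex";

// Illustrative heuristic: keyword match first, then total message length.
// The real router's keywords and thresholds may differ.
const COMPLEX_KEYWORDS = ["analyze", "report", "architecture", "refactor"];

function classifyTier(messages: { role: string; content: string }[]): Tier {
  const text = messages.map((m) => m.content).join(" ").toLowerCase();
  if (COMPLEX_KEYWORDS.some((kw) => text.includes(kw))) return "complex";
  if (text.length > 800) return "standard"; // hypothetical length threshold
  return "simple";
}
```

Because classification is purely local (string checks, no external API), it adds effectively zero latency before routing.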
Request flow
Client request → cache gate → compaction → classifier → routing policy → provider call (with failover) → response + telemetry.
Routing tiers
Classification is heuristic (message length and keywords). You can override it with `forceTier` in the request body.
| Tier | Typical tasks | Providers (when key in .env) |
|---|---|---|
| Simple | Short Q&A, formatting, lookups, "what is X" | Groq, Cerebras, HuggingFace, Together (free-tier) |
| Standard | Email drafting, summaries, medium-length reasoning | Together, OpenAI gpt-4o-mini, Anthropic Haiku, Gemini Flash |
| Complex | Analysis, reports, coding, deep reasoning | OpenAI gpt-4o, Anthropic Sonnet |
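For example, a request body pinning the complex tier might look like this (the `model` value is illustrative; any OpenAI-compatible body works):

```json
{
  "model": "incrypt/auto",
  "messages": [
    { "role": "user", "content": "Draft a migration plan for our database." }
  ],
  "forceTier": "complex"
}
```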
vs ClawRouter, claw-compactor, TokenWatch
IncryptRouter unifies routing, compression, and caching in one service you host yourself, with your API keys. ClawRouter uses x402/USDC and BlockRun's model list; claw-compactor is offline scripts; TokenWatch is a proxy with rate limits. Use IncryptRouter when you want one process, your own keys, free-first routing, built-in compaction and caching, and a single OpenAI-compatible endpoint.
API reference
Base URL: http://localhost:3140 (or your host). All responses JSON.
- POST /v1/chat/completions — OpenAI-compatible chat completion. Body: `messages` (required); optional `model`, `max_tokens`, `temperature`, `forceTier` (simple | standard | complex). Streaming is not supported. The response includes `meta`: tier, model, provider, cacheHit, compressedFromTokens?, latencyMs?, fallbackUsed?.
- GET /health — Liveness. Returns `{ status: "ok", service: "incrypt-smart-router" }`.
- GET /stats — Cache size and the last 50 requests (tier, model, provider, latency, tokens, cache hit) for debugging or dashboards.
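A client can read the `meta` block from each response. The interface below is a reader's sketch assembled from the field list above, not an exported type from the project; the helper function is likewise hypothetical:

```typescript
// Shape of the router's per-response meta block, as described in the
// API reference above. Optional fields are marked "?" per the docs.
interface RouterMeta {
  tier: "simple" | "standard" | "complex";
  model: string;
  provider: string;
  cacheHit: boolean;
  compressedFromTokens?: number;
  latencyMs?: number;
  fallbackUsed?: boolean;
}

// Hypothetical helper: render a one-line summary for logs or dashboards.
function describeMeta(meta: RouterMeta): string {
  const cache = meta.cacheHit ? "cache hit" : "cache miss";
  return `${meta.tier} -> ${meta.provider}/${meta.model} (${cache})`;
}
```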
Configuration
Set in `.env`. At least one free-tier key (e.g. `GROQ_API_KEY`) is recommended.
- Server: `PORT` (default 3140), `COMPRESSION_ENABLED` (default true), `CACHE_TTL_SECONDS` (default 3600), `CACHE_MAX_SIZE` (default 1000), `LOG_LEVEL` (default info)
- Provider keys: `GROQ_API_KEY`, `CEREBRAS_API_KEY`, `HF_TOKEN`, `TOGETHER_API_KEY`, `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GOOGLE_AI_API_KEY` (adapter not yet implemented)
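A minimal `.env` using the defaults above and one free-tier key might look like this (the key value is a placeholder):

```
PORT=3140
COMPRESSION_ENABLED=true
CACHE_TTL_SECONDS=3600
CACHE_MAX_SIZE=1000
LOG_LEVEL=info
GROQ_API_KEY=your_groq_key_here
```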
Install and quick start
The one-command install clones the repo (default `~/incrypt_smart_router`), runs `npm install` and `npm run build`, and creates `.env` from `.env.example`. Add at least one API key (e.g. a Groq free-tier key from console.groq.com), then run `npm start`. The server listens on http://localhost:3140.
OpenClaw setup
After the router is running, add a custom model provider with `baseUrl: "http://localhost:3140/v1"`. In `~/.openclaw/openclaw.json` under `models.providers`, add `"incrypt-router"` with that baseUrl and an OpenAI-compatible model entry (e.g. `id: "incrypt/auto"`). Set the agent's primary model to `incrypt-router/incrypt/auto`, then restart the gateway: `openclaw gateway restart`. You can also tell your OpenClaw agent in natural language: "Install and use https://github.com/GHX5T-SOL/incrypt_smart_router" or "Set up Incrypt Smart Router and use it as my LLM backend".
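Based on the description above, the relevant fragment of `~/.openclaw/openclaw.json` might look like the following. The exact schema depends on your OpenClaw version, so treat this as a sketch rather than a copy-paste config:

```json
{
  "models": {
    "providers": {
      "incrypt-router": {
        "baseUrl": "http://localhost:3140/v1",
        "models": [{ "id": "incrypt/auto" }]
      }
    }
  }
}
```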
Development
Source lives in `src/` (router, compaction, classifier, cache, failover, providers, server). Tests are in `src/**/*.test.ts`. Commands: `npm run build`, `npm run test`, `npm run dev`, `npm run typecheck`. See the repo docs for OPENCLAW_INTEGRATION and ARCHITECTURE.