Lightweight agent framework for local LLMs. OpenClaw patterns, none of the weight.
We run Ziggy, an autonomous AI agent, on an NVIDIA DGX Spark: Qwen 2.5 32B, local inference, no cloud. When we looked at OpenClaw for tool calling and automation, the architecture was sound, but the infrastructure was built for cloud-scale deployments.
We needed the patterns without the weight.
Lane-based serial execution. Context window guards. Skill files. Provider fallback. These are the things that make an agent reliable with a small quantised model. The gateway server, channel adapters, web dashboard, and browser automation are not.
ClawLite is what we extracted. ~500 lines of actual logic. Everything OpenClaw got right about agent reliability, nothing it got heavy about.
If you are running a local model and want an agent that can actually use tools without fumbling, this is for you.
Serial by default, parallel only when you say so. No race conditions. No interleaved garbage.
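A minimal sketch of what lane-based serial execution means in practice, using a per-lane `asyncio.Lock` (the class and method names here are illustrative, not ClawLite's actual API): calls in the same lane queue behind one another, so two tool calls can never interleave unless you deliberately put them in different lanes.

```python
import asyncio

class LaneExecutor:
    """Runs tool calls serially within a named lane. Concurrency only
    happens when the caller explicitly targets different lanes."""

    def __init__(self):
        self._locks = {}

    def _lock(self, lane):
        if lane not in self._locks:
            self._locks[lane] = asyncio.Lock()
        return self._locks[lane]

    async def run(self, fn, *args, lane="main"):
        # Serial by default: every call in the same lane waits its turn.
        async with self._lock(lane):
            return await fn(*args)

async def demo():
    ex = LaneExecutor()
    order = []

    async def tool(name):
        order.append(f"start:{name}")
        await asyncio.sleep(0.01)
        order.append(f"end:{name}")

    # Both calls target the default lane: strictly serial, no interleaving.
    await asyncio.gather(ex.run(tool, "a"), ex.run(tool, "b"))
    return order

print(asyncio.run(demo()))  # → ['start:a', 'end:a', 'start:b', 'end:b']
```

Even under `asyncio.gather`, the second call cannot start until the first finishes, which is exactly the guarantee small models need.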
Drop markdown in skills/ to shape behaviour. Trigger keywords for dynamic loading. Configure with text, not code.
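The trigger-keyword matching can be sketched like this (the skill dict shape and function name are assumptions for illustration; the real files are markdown on disk): skills with no triggers are always loaded, skills with triggers load only when a keyword appears in the message.

```python
def active_skills(message, skills):
    """Return skill bodies to inject into the prompt. Skills without
    triggers are always-on; triggered skills load only on a keyword hit."""
    text = message.lower()
    out = []
    for skill in skills:
        triggers = skill.get("triggers", [])
        if not triggers or any(t.lower() in text for t in triggers):
            out.append(skill["body"])
    return out

skills = [
    {"body": "Always answer concisely."},                        # no triggers: always loaded
    {"triggers": ["git", "commit"], "body": "Use small commits."},
    {"triggers": ["deploy"], "body": "Ask before deploying."},
]

print(active_skills("please git commit this", skills))
# → ['Always answer concisely.', 'Use small commits.']
```

Dynamic loading keeps the prompt small: only the skills a message actually touches spend context tokens.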
Approve a shell command once; it remembers. No re-prompting every session for the same git push.
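One way to get that memory is a small approval store persisted to disk; this is an illustrative sketch (class name, JSON format, and file path are assumptions, not ClawLite's actual storage):

```python
import json
import os
import tempfile

class ApprovalStore:
    """Remembers approved shell commands across sessions by persisting
    the approved set to a JSON file."""

    def __init__(self, path):
        self.path = path
        try:
            with open(path) as f:
                self.approved = set(json.load(f))
        except FileNotFoundError:
            self.approved = set()

    def is_approved(self, command):
        return command in self.approved

    def approve(self, command):
        self.approved.add(command)
        with open(self.path, "w") as f:
            json.dump(sorted(self.approved), f)

# Session 1: user approves `git push` once.
path = os.path.join(tempfile.mkdtemp(), "approvals.json")
ApprovalStore(path).approve("git push")

# Session 2: a fresh store reloads the file — no re-prompt needed.
print(ApprovalStore(path).is_approved("git push"))  # → True
```

Anything not in the store still prompts, so the safety gate stays intact for new commands.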
Automatic compaction at 80%. Summarises older turns before you hit the wall. Critical at 16K context.
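The compaction logic reduces to a simple guard; this sketch assumes a chars/4 token estimate and takes the summariser as a callable (stubbed below), both illustrative choices rather than ClawLite's implementation:

```python
def compact(history, summarize, max_tokens=16000, threshold=0.8, keep_recent=4):
    """When estimated tokens exceed threshold * max_tokens, replace all but
    the most recent turns with a single summary turn. `summarize` would be
    a model call in practice; token estimate here is chars / 4."""
    def est(turns):
        return sum(len(t["content"]) for t in turns) // 4

    if est(history) <= threshold * max_tokens:
        return history  # under the 80% guard: leave untouched
    old, recent = history[:-keep_recent], history[-keep_recent:]
    summary = {"role": "system", "content": summarize(old)}
    return [summary] + recent

# Ten turns of ~1500 tokens each (~15K total) trips the 12.8K guard.
history = [{"role": "user", "content": "x" * 6000} for _ in range(10)]
compacted = compact(history, lambda old: f"[summary of {len(old)} turns]")
print(len(compacted), compacted[0]["content"])
# → 5 [summary of 6 turns]
```

Summarising *before* the limit is the point: at 16K context there is no headroom to recover once a request overflows.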
Small quants fumble parallel calls. Serial tool execution is more reliable with Qwen, Llama, Mistral.
Ollama goes down, the Groq API catches it. Automatically. No broken pipelines at 3am.
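The fallback pattern is just an ordered list of providers tried in turn; the provider functions below simulate an Ollama outage and a Groq catch (names and signatures are illustrative, not real client APIs):

```python
def with_fallback(providers):
    """Build a chat function that tries each (name, callable) provider in
    order and returns the first success. Any callable that raises on
    failure works."""
    def call(prompt):
        errors = []
        for name, fn in providers:
            try:
                return name, fn(prompt)
            except Exception as e:
                errors.append(f"{name}: {e}")
        raise RuntimeError("all providers failed: " + "; ".join(errors))
    return call

def ollama(prompt):
    raise ConnectionError("ollama is down")  # simulated local outage

def groq(prompt):
    return f"echo: {prompt}"                 # simulated cloud fallback

chat = with_fallback([("ollama", ollama), ("groq", groq)])
print(chat("hello"))  # → ('groq', 'echo: hello')
```

Returning the provider name alongside the response makes it easy to log when the fallback actually fired.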