Manage the full nSelf AI stack: install and benchmark local Ollama models, configure a Gemini API pool for cloud inference, and route requests between local and cloud backends.
# Install Ollama and pull a local model
nself ai local install
nself ai local pull llama3.2
# Chat with the active model
nself ai chat "Summarize the nSelf architecture in 3 bullets"
# Check the full AI stack status
nself ai local status

nself ai <SUBCOMMAND> [FLAGS] [ARGS]

nself ai orchestrates the AI plugin that backs ɳClaw and other AI-powered nSelf features. It splits into two planes: local (Ollama-managed LLMs running on your machine or VPS) and pool (a rotating set of Gemini API keys for cloud inference). The routing subcommand controls which plane handles each request type.
The AI plugin is part of the ɳClaw bundle ($0.99/mo). A valid license key must be set with nself license set before installing the plugin via nself plugin install ai.
Local inference keeps data on-premises. The pool fallback lets you use Gemini for tasks that exceed local hardware. nself ai swap lets you change the active model without restarting the stack.
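
The split between the local and pool planes can be captured as a small routing plan. The sketch below is a hypothetical helper (the `routing_plan` function is not part of nSelf): it takes `role=backend` pairs, checks that each backend is one of the two planes named above, and prints the matching `nself ai routing set` commands without running them.

```shell
# Hypothetical helper, not part of the nSelf CLI.
# Turns "role=backend" pairs into `nself ai routing set` commands (dry run).
routing_plan() {
  for pair in "$@"; do
    # Reject arguments that are not of the form role=backend.
    case "$pair" in
      *=*) ;;
      *) echo "expected role=backend, got '$pair'" >&2; return 2 ;;
    esac
    role=${pair%%=*}
    backend=${pair#*=}
    # Only the two planes described in this page are accepted.
    case "$backend" in
      local|pool) echo "nself ai routing set $role $backend" ;;
      *) echo "unknown backend '$backend' (expected local or pool)" >&2; return 2 ;;
    esac
  done
}

# Print the commands for a local-chat, pool-embeddings setup.
routing_plan chat=local embed=pool
```

Because the helper only prints commands, the output can be reviewed first and then piped to `sh` to apply it.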
Manage the local Ollama inference backend.
| Sub-action | Description |
|---|---|
| local install | Install Ollama on the current machine |
| local status | Show Ollama status, loaded models, and VRAM usage |
| local models | List available (pulled) models |
| local pull <model> | Pull a model from the Ollama registry |
| local remove <model> | Remove a pulled model from disk |
| local benchmark | Run a latency + throughput benchmark across loaded models |
| local assign <role> <model> | Assign a model to a role (chat, embed, vision) |
| local restart | Restart the Ollama daemon |
nself ai local pull llama3.2:8b
nself ai local assign chat llama3.2:8b
nself ai local benchmark
nself ai local status

Manage the Gemini API key pool for cloud inference fallback.
| Sub-action | Description |
|---|---|
| pool init | Initialize the pool with the first API key |
| pool status | Show key count, current quota usage per key, and rotation schedule |
| pool provision | Auto-provision additional keys (requires a Google Cloud project) |
| pool add <key> | Add a Gemini API key manually |
| pool remove <key-id> | Remove a key from the pool |
| pool rotate | Force-rotate to the next key in the pool |
| pool test | Run a test completion through every key in the pool |
| pool daily-reset | Reset quota counters (normally runs on cron) |
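
To make "rotate to the next key in the pool" concrete, here is a minimal round-robin sketch. It is an illustration of the general idea only; how nSelf actually orders and rotates keys is not documented here, and the `next_key` function and the `key-a`/`key-b`/`key-c` IDs are hypothetical.

```shell
# Illustrative only: round-robin over a list of key IDs.
# Usage: next_key CURRENT_INDEX KEY_ID...
# Prints "<new index> <key id>" for the key that follows CURRENT_INDEX.
next_key() {
  current=$1
  shift
  count=$#
  [ "$count" -gt 0 ] || { echo "pool is empty" >&2; return 1; }
  # Advance by one and wrap around at the end of the pool.
  next=$(( (current + 1) % count ))
  # Drop the first $next entries so the selected key becomes $1.
  shift "$next"
  echo "$next $1"
}

next_key 0 key-a key-b key-c   # prints "1 key-b"
next_key 2 key-a key-b key-c   # prints "0 key-a" (wraps around)
```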
nself ai pool init --key AIzaSyABC...
nself ai pool add AIzaSyDEF...
nself ai pool status
nself ai pool test

Control how inference requests are routed between local and pool backends.
nself ai routing list # show current routing rules
nself ai routing set chat local # use local Ollama for chat
nself ai routing set embed pool # use Gemini pool for embeddings

Send a one-shot completion request through the active AI stack.
nself ai chat "What is the nSelf architecture?"
nself ai chat --model llama3.2:8b "Explain RLS in Postgres"

Hot-swap the active model without restarting the AI plugin service.
nself ai swap mistral:7b

| Flag | Type | Default | Description |
|---|---|---|---|
| --model | string | (active model) | Model to use for chat or benchmark |
| --json | bool | false | Output as JSON (status, routing list) |
| --timeout | int | 60 | Seconds to wait for inference response (chat) |
nself license set nself_pro_xxxxx...
nself plugin install ai
nself ai local install
nself ai local pull llama3.2:8b
nself ai local assign chat llama3.2:8b
nself ai local assign embed nomic-embed-text
nself ai routing set chat local
nself ai routing set embed local
nself ai chat "Hello, world"

nself ai pool init --key AIzaSyABC...
nself ai routing set chat local
nself ai routing set embed pool

nself ai local benchmark --json | jq '.models | sort_by(.p50_ms)'

Environment variables:
- NSELF_AI_BACKEND: default routing target, local or pool
- NSELF_OLLAMA_HOST: override the Ollama host (default: http://127.0.0.1:11434)
- NSELF_GEMINI_API_KEY: single Gemini key (pool not needed for solo usage)

Exit codes:
- 0: success
- 1: plugin not installed or Ollama not running
- 2: invalid arguments
- 3: license not set or invalid
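
In scripts, the exit codes and the NSELF_OLLAMA_HOST default listed above can be turned into readable failure messages. The following is a sketch under assumptions: the function names (`explain_nself_ai_exit`, `effective_ollama_host`, `run_nself_ai`) are hypothetical, and treating an empty NSELF_OLLAMA_HOST the same as an unset one is a choice made here, not documented nSelf behavior.

```shell
# Map the documented `nself ai` exit codes to their meanings.
explain_nself_ai_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "plugin not installed or Ollama not running" ;;
    2) echo "invalid arguments" ;;
    3) echo "license not set or invalid" ;;
    *) echo "undocumented exit code $1" ;;
  esac
}

# Effective Ollama host: the override if set, otherwise the documented default.
# Assumption: an empty value is treated like an unset one.
effective_ollama_host() {
  echo "${NSELF_OLLAMA_HOST:-http://127.0.0.1:11434}"
}

# Hypothetical wrapper: run any `nself ai` subcommand and explain a failure.
run_nself_ai() {
  nself ai "$@"
  status=$?
  if [ "$status" -ne 0 ]; then
    echo "nself ai $1 failed: $(explain_nself_ai_exit "$status")" >&2
    echo "Ollama host in use: $(effective_ollama_host)" >&2
  fi
  return "$status"
}

# Example call (requires nself to be installed):
#   run_nself_ai local status
```

The wrapper passes the original exit status through, so callers can still branch on it.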