Run local LLMs via Ollama inside your nSelf stack. Pull any model Ollama supports — Llama, Mistral, Gemma, Phi — and expose it to your apps via the nSelf AI plugin interface. No API keys, no usage costs.
Free Forever — no license key required. For premium hosted models (Claude, GPT-4, Gemini), see the paid AI plugin in the ɳClaw bundle.
nself plugin run ollama
/v1/chat/completions
nself plugin install ollama
# Pull a model
nself plugin run ollama pull llama3
nself plugin status ollama
# Ollama API port (default: 11434)
OLLAMA_PORT=11434
# Default model for /api/generate calls
OLLAMA_DEFAULT_MODEL=llama3
# GPU device (default: auto-detect)
# Set to "cpu" to force CPU-only
OLLAMA_DEVICE=auto
# Chat completion (OpenAI-compatible)
curl http://your-nself-host:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
# List available models
nself plugin run ollama list
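The chat completion call above can be wrapped in a small shell helper so scripts do not repeat the request body. This is a minimal sketch, not part of the plugin: the function names `build_chat_payload` and `chat` and the `NSELF_HOST` variable are hypothetical, and reusing `OLLAMA_PORT` and `OLLAMA_DEFAULT_MODEL` on the client side is an assumption (the plugin may only read them on the server). The payload builder does no JSON escaping, so it assumes prompts without double quotes or backslashes.

```shell
#!/bin/sh

# Hypothetical helper: build_chat_payload MODEL PROMPT
# Prints an OpenAI-style chat completion request body on one line.
# Assumption: PROMPT contains no double quotes or backslashes (no JSON escaping is done).
build_chat_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}\n' "$1" "$2"
}

# Hypothetical helper: chat PROMPT
# Sends PROMPT to the OpenAI-compatible endpoint and prints the raw JSON response.
# NSELF_HOST is an assumed variable; the port and model fall back to the documented defaults.
chat() {
  curl -s "http://${NSELF_HOST:-localhost}:${OLLAMA_PORT:-11434}/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(build_chat_payload "${OLLAMA_DEFAULT_MODEL:-llama3}" "$1")"
}
```

After sourcing the file, `NSELF_HOST=your-nself-host chat "Hello!"` would send the same request as the curl example. The response is left as raw JSON so it can be piped into whatever JSON tool the script already uses.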