ɳClaw supports voice input in the web client and voice message processing via Telegram. Speak naturally, and your words are transcribed before being sent to the AI.
| Platform | Input | Output | Notes |
|---|---|---|---|
| Web (Chrome) | Web Speech API | Text only | Hold-to-speak button in the input bar |
| Web (Safari) | Web Speech API | Text only | Same hold-to-speak UI |
| Telegram bot | Voice messages | Text reply | Whisper transcription on the server |
| iOS app | Native speech | TTS (nself-voice) | Requires Pro tier + nself-voice plugin |
| macOS app | Native speech | TTS (nself-voice) | Requires Pro tier + nself-voice plugin |
The web client uses the Web Speech API for voice input. ɳClaw never uploads your audio to its own server or to a third-party transcription API; transcription is handled entirely by the browser's built-in speech engine.
How it works:

1. Press and hold the mic button in the input bar and speak.
2. The browser's Web Speech API transcribes your speech as you talk.
3. When you release the button, the transcript appears in the input bar.
4. The message is sent to the AI as ordinary text.
If the browser does not support the Web Speech API (Firefox, some mobile browsers), the mic button is hidden and the text input works normally. There is no error state; the feature degrades silently.
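The silent fallback comes down to a simple feature-detection check. A minimal sketch in TypeScript (the function name is ours, not from the ɳClaw codebase; `webkitSpeechRecognition` is the vendor-prefixed form older Chrome and Safari builds expose):

```typescript
// Returns true when the Web Speech API is available on the given
// window-like object. Taking the object as a parameter keeps the
// function unit-testable outside a browser.
function supportsSpeechRecognition(win: Record<string, unknown>): boolean {
  return "SpeechRecognition" in win || "webkitSpeechRecognition" in win;
}

// In the browser the client would call it with the real window, e.g.:
//   if (!supportsSpeechRecognition(window)) micButton.hidden = true;
```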
When the ɳClaw Telegram bot is connected, voice messages are transcribed using OpenAI Whisper running on your server. The transcript is sent to the AI as a normal text message. No audio is stored after transcription.
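The Telegram flow can be sketched end-to-end. Everything named here is illustrative: `fetchVoiceFile`, `whisperTranscribe`, and `sendToAI` stand in for the Telegram file download, the nself-ai Whisper call, and the normal text path, none of whose internals are documented here:

```typescript
// Dependencies are injected so the pipeline can be exercised without
// Telegram or Whisper; the real implementations are hypothetical.
interface VoiceDeps {
  fetchVoiceFile: (fileId: string) => Uint8Array; // Telegram download
  whisperTranscribe: (audio: Uint8Array) => string; // server-side Whisper
  sendToAI: (text: string) => string; // same path as a typed message
}

// Transcribe a Telegram voice message and hand the transcript to the AI
// exactly as if the user had typed it. The audio buffer goes out of
// scope after transcription; nothing is persisted.
function handleVoiceMessage(fileId: string, deps: VoiceDeps): string {
  const audio = deps.fetchVoiceFile(fileId);
  const transcript = deps.whisperTranscribe(audio);
  return deps.sendToAI(transcript);
}
```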
Whisper runs as part of the nself-ai plugin. Make sure it is installed:
```bash
nself plugin install ai
nself build && nself restart
```

The Whisper model size is configurable. Larger models are more accurate but slower and need more RAM.
```bash
# In your backend .env
NSELF_AI_WHISPER_MODEL=base     # default, fast, ~1 GB RAM
# NSELF_AI_WHISPER_MODEL=small  # better accuracy, ~2 GB RAM
# NSELF_AI_WHISPER_MODEL=medium # near-human accuracy, ~5 GB RAM
```

The nself-voice plugin adds text-to-speech output to the iOS and macOS native apps. When enabled, ɳClaw reads its replies aloud using your preferred voice.
```bash
nself plugin install voice
nself build && nself restart
```

```bash
# In your backend .env
NSELF_VOICE_PROVIDER=system       # default, uses OS TTS (no extra cost)
# NSELF_VOICE_PROVIDER=openai     # OpenAI TTS (requires OPENAI_API_KEY)
# NSELF_VOICE_PROVIDER=elevenlabs # ElevenLabs (requires ELEVENLABS_API_KEY)
NSELF_VOICE_DEFAULT_ON=false      # set to true to enable TTS by default for new users
```

Users can toggle TTS on and off per-session inside the app; the server setting only controls the default.
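The settings above are plain environment strings, so any script or client that reads them should validate and default them. A minimal sketch (the helper name is ours, not part of nself; the variable names and defaults come from the config blocks above):

```typescript
// Resolve voice-related settings from an env map, falling back to the
// documented defaults (base / system / false) when a value is unset
// or not one of the documented options.
function resolveVoiceConfig(env: Record<string, string | undefined>) {
  const model = env["NSELF_AI_WHISPER_MODEL"] ?? "";
  const provider = env["NSELF_VOICE_PROVIDER"] ?? "";
  return {
    whisperModel: ["base", "small", "medium"].includes(model) ? model : "base",
    ttsProvider: ["system", "openai", "elevenlabs"].includes(provider) ? provider : "system",
    ttsDefaultOn: env["NSELF_VOICE_DEFAULT_ON"] === "true",
  };
}
```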
```bash
# Check voice configuration
nself claw voice status

# Set Whisper model size
nself claw voice whisper-model --set base

# Test Whisper transcription with a local audio file
nself claw voice test --file recording.mp3

# Enable or disable TTS output
nself claw voice tts --enable
nself claw voice tts --disable

# Set TTS provider
nself claw voice tts --provider openai
```

Required plugins:

- `nself plugin install ai`: required for Whisper transcription
- `nself plugin install voice`: required for TTS output (iOS/macOS)