Monitoring & Metrics

Updated for nself v0.4.8

nself provides monitoring built on Prometheus, Grafana, Loki, and related tools. The nself metrics command configures the observability stack, and the nself monitor command opens its dashboards and real-time views.

Metrics Command

Configure the monitoring stack with nself metrics:

Command                              Description
nself metrics                        Show help
nself metrics enable [profile]       Enable monitoring
nself metrics disable                Disable monitoring
nself metrics status                 Show monitoring status
nself metrics profile [name]         View/change profile
nself metrics config [key] [value]   Configure settings
nself metrics dashboard              Open Grafana

Monitoring Profiles

Choose the right monitoring profile for your environment:

Profile    Services      Memory    Use Case
minimal    3 services    ~1GB      Development, resource-constrained
standard   5 services    ~2GB      Staging environments, debugging
full       10 services   ~3-4GB    Production, full observability
auto       Varies        Varies    Auto-detect based on ENV

Auto Profile Selection

ENV       Profile
dev       minimal
staging   standard
prod      full
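
The mapping above can be sketched as a small shell function. This is only an illustration of the table, not nself's actual implementation; the function name profile_for_env is hypothetical, and treating any other ENV value as an error is an assumption, since the source does not say what happens in that case.

```shell
# Hypothetical helper that mirrors the ENV-to-profile table.
# Not part of nself; shown only to make the auto-selection rule concrete.
profile_for_env() {
  case "$1" in
    dev)     echo "minimal" ;;
    staging) echo "standard" ;;
    prod)    echo "full" ;;
    *)
      # Assumption: unknown ENV values are rejected rather than defaulted.
      echo "no profile mapping for ENV=$1" >&2
      return 1
      ;;
  esac
}
```

For example, profile_for_env staging prints standard.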

Profile Services

Minimal (3)

  • Prometheus (metrics database)
  • Grafana (visualization)
  • cAdvisor (container metrics)

Standard (+2)

  • All from minimal, plus:
  • Loki (log aggregation)
  • Promtail (log collection)

Full (+5)

  • All from standard, plus:
  • Tempo (distributed tracing)
  • Alertmanager (alerting)
  • Node Exporter (host metrics)
  • PostgreSQL Exporter
  • Redis Exporter (if enabled)
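
Because each profile extends the previous one, the service lists can be expressed cumulatively. The sketch below assumes short lowercase service names (for example node-exporter), which may not match the container names nself generates, and it always lists the Redis exporter even though that one is only added when Redis is enabled.

```shell
# Hypothetical helper: print the services in a profile, space-separated.
# Service names are illustrative assumptions, not nself's container names.
services_for_profile() {
  case "$1" in
    minimal)
      echo "prometheus grafana cadvisor"
      ;;
    standard)
      # standard = minimal + log aggregation and collection
      echo "$(services_for_profile minimal) loki promtail"
      ;;
    full)
      # full = standard + tracing, alerting, and exporters
      echo "$(services_for_profile standard) tempo alertmanager node-exporter postgres-exporter redis-exporter"
      ;;
    *)
      echo "unknown profile: $1" >&2
      return 1
      ;;
  esac
}
```

Counting the words in each result (services_for_profile full | wc -w) gives the 3, 5, and 10 services listed in the profile table.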

Monitor Command

Access dashboards and real-time views with nself monitor:

Command                        Description
nself monitor                  Open Grafana (default)
nself monitor grafana          Open Grafana dashboard
nself monitor prometheus       Open Prometheus UI
nself monitor loki             Open Loki in Grafana Explore
nself monitor alerts           Open Alertmanager UI
nself monitor services         CLI service status view
nself monitor resources        CLI resource usage view
nself monitor logs [service]   Tail service logs

Quick Start

# Enable monitoring with standard profile
nself metrics enable standard

# Rebuild to add monitoring containers
nself build && nself start

# Open Grafana dashboard
nself monitor

# View real-time service status in CLI
nself monitor services

# View container resource usage
nself monitor resources

# Tail logs for a specific service
nself monitor logs hasura

Configuration

Environment variables for monitoring:

# Enable monitoring
MONITORING_ENABLED=true

# Monitoring profile (minimal, standard, full, auto)
MONITORING_PROFILE=auto

# Grafana admin credentials
GRAFANA_ADMIN_USER=admin
GRAFANA_ADMIN_PASSWORD=your-secure-password

# Alertmanager configuration
ALERTMANAGER_SLACK_WEBHOOK=https://hooks.slack.com/...
ALERTMANAGER_PAGERDUTY_KEY=your-pagerduty-key
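
A quick sanity check of these variables before rebuilding might look like the following. The function check_monitoring_env is a hypothetical helper, not an nself command; it only verifies that the profile is one of the four documented values and that the Grafana password is not left empty or at the placeholder shown above.

```shell
# Hypothetical pre-flight check for the monitoring variables.
# Reads the variables from the current shell environment.
check_monitoring_env() {
  case "${MONITORING_PROFILE:-}" in
    minimal|standard|full|auto) ;;
    *)
      echo "MONITORING_PROFILE must be minimal, standard, full, or auto" >&2
      return 1
      ;;
  esac

  # Reject an empty password or the documentation placeholder.
  if [ -z "${GRAFANA_ADMIN_PASSWORD:-}" ] ||
     [ "$GRAFANA_ADMIN_PASSWORD" = "your-secure-password" ]; then
    echo "GRAFANA_ADMIN_PASSWORD must be set to a real secret" >&2
    return 1
  fi
}
```

One way to use it is to source the environment file first and then run check_monitoring_env && nself build.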

Built-in Dashboards

Grafana comes pre-configured with dashboards for:

  • Container Overview - CPU, memory, network for all containers
  • PostgreSQL - Query performance, connections, replication
  • Hasura - GraphQL query metrics, subscriptions
  • Node Metrics - Host system resources
  • Log Explorer - Loki log aggregation and search

Alerting

Alertmanager supports notifications via:

  • Slack
  • PagerDuty
  • Email
  • Webhooks
  • OpsGenie

Configure alerts in monitoring/alertmanager/alertmanager.yml:

route:
  receiver: 'slack-notifications'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: 'slack-notifications'
    slack_configs:
      - api_url: '${ALERTMANAGER_SLACK_WEBHOOK}'
        channel: '#alerts'
        send_resolved: true
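
The source does not say how the ${ALERTMANAGER_SLACK_WEBHOOK} placeholder is filled in. As a sketch, assuming the file is treated as a template and rendered before Alertmanager reads it, a plain sed substitution would be enough. The function name render_alertmanager_config is hypothetical, and the webhook URL is assumed not to contain | or & characters.

```shell
# Hypothetical helper: print the config at path $1 with the Slack webhook
# placeholder replaced by the value of ALERTMANAGER_SLACK_WEBHOOK.
render_alertmanager_config() {
  if [ -z "${ALERTMANAGER_SLACK_WEBHOOK:-}" ]; then
    echo "ALERTMANAGER_SLACK_WEBHOOK is not set" >&2
    return 1
  fi
  # '|' is the sed delimiter because webhook URLs contain '/'.
  sed "s|\${ALERTMANAGER_SLACK_WEBHOOK}|${ALERTMANAGER_SLACK_WEBHOOK}|g" "$1"
}
```

The rendered output can be redirected to a separate file and, if amtool is installed, validated with amtool check-config before use.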

Troubleshooting

Grafana Not Loading

Confirm that monitoring is enabled with nself metrics status, then check that the Grafana container is running with nself status.

High Memory Usage

Switch to a lower profile, for example nself metrics enable minimal. The full profile requires ~3-4GB of RAM, compared with ~2GB for standard and ~1GB for minimal.
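
As a rough guide, the memory figures from the Monitoring Profiles table can be turned into a selection rule. The helper below is hypothetical, and its thresholds (4096 MB for full, 2048 MB for standard) are assumptions read off that table; it takes the memory you are willing to give the monitoring stack, in MB.

```shell
# Hypothetical helper: suggest a profile for a memory budget given in MB.
# Thresholds are assumptions derived from the profile table, not nself rules.
profile_for_memory_mb() {
  if [ "$1" -ge 4096 ]; then
    echo "full"
  elif [ "$1" -ge 2048 ]; then
    echo "standard"
  else
    echo "minimal"
  fi
}
```

The result can be passed straight to the enable command, for example nself metrics enable "$(profile_for_memory_mb 2048)".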

No Data in Dashboards

Wait 1-2 minutes for the initial metrics collection. If dashboards are still empty, check the scrape targets at the /targets endpoint of the Prometheus UI.
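
The same target information is available from the Prometheus HTTP API at /api/v1/targets, which makes it easy to check from a terminal. The sketch below counts targets reported as down; count_down_targets is a hypothetical helper, and the localhost:9090 address in the usage comment is an assumption (9090 is the Prometheus default port, but your nself setup may expose it elsewhere).

```shell
# Hypothetical helper: read /api/v1/targets JSON on stdin and print how many
# targets report "health":"down". Uses grep so that jq is not required.
count_down_targets() {
  # grep exits non-zero when nothing matches; '|| true' keeps that from
  # being treated as a failure, so the function prints 0 in that case.
  { grep -o '"health":"down"' || true; } | wc -l | tr -d ' '
}

# Usage (address is an assumption, adjust to your setup):
#   curl -s http://localhost:9090/api/v1/targets | count_down_targets
```

A result of 0 with still-empty dashboards suggests the problem lies in Grafana's data source rather than in scraping.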