NAAS v2.1 Release Notes

NAAS v2.1 adds OpenTelemetry tracing, an MCP server for AI-assisted operations, SSE streaming for real-time job updates, structured error codes, and rate limiting.

Highlights

  • OpenTelemetry distributed tracing — End-to-end traces from API request through RQ queue to SSH device operation
  • MCP server — AI assistants can manage network devices through the Model Context Protocol (mcp-server-naas on PyPI)
  • SSE streaming — Real-time job status updates via Server-Sent Events
  • Structured error codes — Machine-parseable error classification (CONNECTION_TIMEOUT, AUTH_FAILURE, CONFIG_REJECTED, etc.)
  • Rate limiting — Per-caller and per-caller-per-device sliding window limits on submission endpoints

New Features

OpenTelemetry Distributed Tracing

Traces follow the full request lifecycle: API → RQ queue → worker → SSH device. Trace context propagates through the job queue via W3C traceparent, linking API and worker spans into a single trace.

# Enable on both API and worker
OTEL_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317

Install the optional dependency with pip install naas[otel]. Tracing is disabled by default and adds zero overhead while disabled.

See Observability and ADR 0008.
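To make the propagation step concrete, here is a minimal standard-library sketch of carrying a W3C traceparent value through job metadata. It is not the NAAS implementation, which presumably relies on the OpenTelemetry SDK; the job_meta dictionary and both helper functions are hypothetical names used only for illustration.

```python
import re
import secrets

# W3C Trace Context "traceparent": version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def make_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Format a version-00 traceparent value."""
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def parse_traceparent(value: str) -> dict:
    """Split a traceparent value into its fields; raise ValueError if malformed."""
    match = _TRACEPARENT_RE.match(value.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    version, trace_id, parent_id, flags = match.groups()
    # The spec treats all-zero identifiers as invalid.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        raise ValueError("trace-id and parent-id must not be all zeros")
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


# API side: the current span's IDs are serialized into the job's metadata.
# (job_meta is a hypothetical stand-in for whatever the queue stores.)
api_trace_id = secrets.token_hex(16)  # 32 hex characters
api_span_id = secrets.token_hex(8)    # 16 hex characters
job_meta = {"traceparent": make_traceparent(api_trace_id, api_span_id)}

# Worker side: the value is parsed and a child span joins the same trace.
parent = parse_traceparent(job_meta["traceparent"])
worker_span = {
    "trace_id": parent["trace_id"],         # same trace as the API request
    "span_id": secrets.token_hex(8),        # new ID for the worker span
    "parent_span_id": parent["parent_id"],  # links back to the API span
}
```

Because the worker span reuses the trace ID and records the API span as its parent, a tracing backend can render both sides as one trace.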

MCP Server

The mcp-server-naas package exposes NAAS operations to AI assistants (Claude, Copilot, etc.) via the Model Context Protocol. Tools include send_command, send_config, send_command_structured, create_api_key, revoke_api_key, and job management. The package also includes prompt templates for common workflows.

pip install mcp-server-naas

See MCP Server and ADR 0010.

SSE Streaming

Subscribe to real-time job status updates instead of polling:

curl -N https://naas.example.com/v2/send-command/<job_id>/stream \
  -H "Authorization: Bearer eyJ..."

Events are delivered as the job transitions through queued → started → finished/failed.
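A client can also consume the stream programmatically. The sketch below parses the standard text/event-stream line format using only the standard library. The payload shape, a JSON object with a status field, is an assumption made for illustration; these notes do not specify the event body.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per Server-Sent Event from an iterable of decoded lines."""
    event_type, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield {"event": event_type, "data": "\n".join(data_lines)}
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format strips one leading space
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
        # "id" and "retry" fields are ignored in this sketch.


# Hypothetical stream contents; the real payload format may differ.
sample_stream = [
    ": keep-alive\n",
    'data: {"status": "queued"}\n',
    "\n",
    'data: {"status": "started"}\n',
    "\n",
    'data: {"status": "finished"}\n',
    "\n",
]

events = list(iter_sse_events(sample_stream))
statuses = [json.loads(event["data"])["status"] for event in events]
```

In practice the lines would come from a streaming HTTP response rather than a list, for example by iterating over the response body line by line with an HTTP client that supports streaming.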

Structured Error Codes

Job results now include error_code and error_retryable fields when a job fails:

{
  "status": "failed",
  "error": "Connection to 10.0.0.1 timed out after 30 seconds",
  "error_code": "CONNECTION_TIMEOUT",
  "error_retryable": true
}

Error codes map each netmiko/paramiko exception to a structured identifier. The Python client library gains exception subclasses routed by error code.

See Error Codes.
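As a rough illustration of routing by error code, the sketch below turns a failed job result into a typed exception. The class names and the raise_for_result helper are hypothetical and are not the client library's actual API; only the result fields (status, error, error_code, error_retryable) and the three code names come from these notes.

```python
class NaasJobError(Exception):
    """Hypothetical base class for failed jobs."""

    def __init__(self, message: str, code: str, retryable: bool) -> None:
        super().__init__(message)
        self.code = code
        self.retryable = retryable


class JobConnectionTimeout(NaasJobError):
    """Raised for CONNECTION_TIMEOUT."""


class JobAuthFailure(NaasJobError):
    """Raised for AUTH_FAILURE."""


class JobConfigRejected(NaasJobError):
    """Raised for CONFIG_REJECTED."""


# Only the codes named in these notes are mapped; others fall back to the base class.
_ERRORS_BY_CODE = {
    "CONNECTION_TIMEOUT": JobConnectionTimeout,
    "AUTH_FAILURE": JobAuthFailure,
    "CONFIG_REJECTED": JobConfigRejected,
}


def raise_for_result(result: dict) -> None:
    """Raise a typed exception if the job result describes a failure."""
    if result.get("status") != "failed":
        return
    code = result.get("error_code", "UNKNOWN")
    exc_type = _ERRORS_BY_CODE.get(code, NaasJobError)
    raise exc_type(
        result.get("error", ""),
        code,
        bool(result.get("error_retryable", False)),
    )


failed_result = {
    "status": "failed",
    "error": "Connection to 10.0.0.1 timed out after 30 seconds",
    "error_code": "CONNECTION_TIMEOUT",
    "error_retryable": True,
}

try:
    raise_for_result(failed_result)
    caught = None
except NaasJobError as exc:
    caught = exc  # callers can branch on exc.retryable to decide whether to resubmit
```

A caller can then catch a specific subclass where it matters and fall back to the base class, checking the retryable attribute before resubmitting the job.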

Rate Limiting

Per-caller and per-caller-per-device sliding window rate limits protect against abuse and accidental flooding:

Variable                      Default  Description
RATE_LIMIT_ENABLED            true     Enable/disable rate limiting
RATE_LIMIT_PER_CALLER         1000     Max requests per caller per window
RATE_LIMIT_PER_CALLER_DEVICE  20       Max requests per caller per device per window
RATE_LIMIT_WINDOW             60       Sliding window size in seconds

Requests that exceed a limit receive 429 Too Many Requests with a Retry-After header. The admin role is exempt by default.

See Security — Rate Limiting.
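The interaction of the two limits can be sketched with an in-memory sliding window log. This is an illustration only: the class and method names are hypothetical, and the actual NAAS limiter may be implemented quite differently (for example, backed by Redis). The default values mirror the table above.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Keep a timestamp log per key and allow at most `limit` hits per window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)

    def retry_after(self, key, now: float) -> float:
        """Return 0.0 if a request is allowed at `now`, else seconds to wait."""
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return hits[0] + self.window - now
        return 0.0

    def record(self, key, now: float) -> None:
        self._hits[key].append(now)


class CallerDeviceLimiter:
    """Apply a per-caller limit and a per-caller-per-device limit together."""

    def __init__(self, per_caller: int = 1000, per_caller_device: int = 20,
                 window_seconds: float = 60) -> None:
        self._caller = SlidingWindowLimiter(per_caller, window_seconds)
        self._device = SlidingWindowLimiter(per_caller_device, window_seconds)

    def admit(self, caller: str, device: str, now: float) -> float:
        """Return 0.0 and record the request if allowed, else the wait in seconds."""
        wait = max(
            self._caller.retry_after(caller, now),
            self._device.retry_after((caller, device), now),
        )
        if wait > 0:
            return wait  # a server would send 429 with Retry-After set from this
        # Record only when both limits pass, so rejected requests consume no quota.
        self._caller.record(caller, now)
        self._device.record((caller, device), now)
        return 0.0
```

With the defaults, a caller who sends 20 requests to one device within a minute is held back on that device until the oldest request leaves the window, while requests to other devices still pass as long as the per-caller limit has room.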

Documentation

  • Kubernetes deployment page restructured as Helm-first
  • K8s/Helm equivalents added across troubleshooting, security, observability, and architecture docs
  • Deployment overview page with method decision guide
  • All v1 API references updated to v2
  • ADR 0009: Command authorization deferred to AAA
  • ADR 0010: MCP server as thin client over REST API

Testing

  • Locust-based load testing with smoke (PR) and full (RC) CI profiles
  • OTel integration tests with OTLP collector
  • SSE streaming integration tests
  • RBAC, API key, and v2 field rejection integration tests

Compatibility

  • Python 3.11+
  • Netmiko 4.x
  • Redis 7+
  • Kubernetes 1.27+ (Helm chart)