NAAS v2.1 Release Notes
NAAS v2.1 adds OpenTelemetry tracing, an MCP server for AI-assisted operations, SSE streaming for real-time job updates, structured error codes, and rate limiting.
Highlights
- OpenTelemetry distributed tracing: End-to-end traces from API request through RQ queue to SSH device operation
- MCP server: AI assistants can manage network devices through the Model Context Protocol (mcp-server-naas on PyPI)
- SSE streaming: Real-time job status updates via Server-Sent Events
- Structured error codes: Machine-parseable error classification (CONNECTION_TIMEOUT, AUTH_FAILURE, CONFIG_REJECTED, etc.)
- Rate limiting: Per-caller and per-caller-per-device sliding window limits on submission endpoints
New Features
OpenTelemetry Distributed Tracing
Traces follow the full request lifecycle: API → RQ queue → worker → SSH device. Trace context propagates through the job queue via W3C traceparent, linking API and worker spans into a single trace.
# Enable on both API and worker
OTEL_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
Install the optional dependency: pip install naas[otel]. When disabled (default), tracing adds zero overhead.
See Observability and ADR 0008.
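To illustrate how trace context can ride along with a queued job, here is a minimal standard-library sketch of the W3C traceparent format (version, trace ID, parent span ID, flags). It is not the NAAS implementation, which relies on OpenTelemetry; the `traceparent` metadata key and the helper names `enqueue_with_trace` and `start_worker_span` are assumptions made for this example.

```python
import re
import secrets

# W3C Trace Context: version-traceid-spanid-flags, all lowercase hex.
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def make_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Format a version-00 traceparent value."""
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def parse_traceparent(value: str) -> dict:
    """Split a traceparent value into its fields, rejecting malformed input."""
    match = _TRACEPARENT_RE.match(value)
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    version, trace_id, span_id, flags = match.groups()
    # All-zero IDs are invalid per the specification.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        raise ValueError("all-zero trace or span id")
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_span_id": span_id,
        "sampled": bool(int(flags, 16) & 1),
    }


def enqueue_with_trace(job_meta: dict, trace_id: str, api_span_id: str) -> dict:
    """API side: store the trace context with the job (hypothetical meta key)."""
    return {**job_meta, "traceparent": make_traceparent(trace_id, api_span_id)}


def start_worker_span(job_meta: dict) -> dict:
    """Worker side: continue the same trace under a fresh span ID."""
    parent = parse_traceparent(job_meta["traceparent"])
    return {
        "trace_id": parent["trace_id"],
        "parent_span_id": parent["parent_span_id"],
        "span_id": secrets.token_hex(8),  # 16 hex characters
    }
```

Because the worker span reuses the trace ID and records the API span as its parent, a tracing backend can stitch both sides into one trace.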
MCP Server
The mcp-server-naas package exposes NAAS operations to AI assistants (Claude, Copilot, etc.) via the Model Context Protocol. Tools include send_command, send_config, send_command_structured, create_api_key, revoke_api_key, and job management. The package also includes prompt templates for common workflows.
See MCP Server and ADR 0010.
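For orientation, MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for the send_command tool. The argument names (`host`, `commands`) are hypothetical and not taken from the mcp-server-naas schema, so check the tool's published input schema before relying on them.

```python
import itertools
import json

# Monotonic request IDs, as JSON-RPC expects a unique id per request.
_request_ids = itertools.count(1)


def tools_call_request(tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request for the named tool."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


# Hypothetical arguments, for illustration only.
request = tools_call_request(
    "send_command", {"host": "10.0.0.1", "commands": ["show version"]}
)
```

In practice an MCP-aware assistant or SDK builds this message itself; the sketch only shows what travels over the wire.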
SSE Streaming
Subscribe to real-time job status updates instead of polling:
curl -N https://naas.example.com/v2/send-command/<job_id>/stream \
-H "Authorization: Bearer eyJ..."
Events are delivered as the job transitions through queued → started → finished/failed.
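A client can consume the stream with a small Server-Sent Events parser. The sketch below is a simplified reading of the SSE wire format (`event:` and `data:` fields, with a blank line ending each event). It assumes each event's data is a JSON object with a `status` key, which these notes do not specify, so treat the payload shape as an assumption.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of text lines."""
    event_type, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the event; emit it only if it carried data.
            if data_lines:
                yield {"event": event_type, "data": "\n".join(data_lines)}
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one optional space after the colon is dropped
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)


def follow_job(lines: Iterable[str], terminal=("finished", "failed")) -> list:
    """Collect statuses until a terminal one (assumed JSON payload shape)."""
    seen = []
    for event in iter_sse_events(lines):
        status = json.loads(event["data"]).get("status")
        seen.append(status)
        if status in terminal:
            break
    return seen
```

With an HTTP library that can iterate over response lines, the same `follow_job` function could be fed the live stream instead of a list.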
Structured Error Codes
Job results now include error_code and error_retryable fields when a job fails:
{
  "status": "failed",
  "error": "Connection to 10.0.0.1 timed out after 30 seconds",
  "error_code": "CONNECTION_TIMEOUT",
  "error_retryable": true
}
Error codes map each netmiko/paramiko exception to a structured identifier. The Python client library gains exception subclasses routed by error code.
See Error Codes.
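As a sketch of what routing by error code could look like on the client side, the example below maps the error_code field of a failed job result to an exception subclass. The class names and the `raise_for_result` helper are hypothetical and are not the actual API of the NAAS Python client.

```python
class NaasJobError(Exception):
    """Base class for failed jobs (hypothetical name)."""

    def __init__(self, message: str, code, retryable: bool):
        super().__init__(message)
        self.code = code
        self.retryable = retryable


class ConnectionTimeoutError(NaasJobError):
    pass


class AuthFailureError(NaasJobError):
    pass


class ConfigRejectedError(NaasJobError):
    pass


# Only the codes named in these notes; unknown codes fall back to the base class.
_BY_CODE = {
    "CONNECTION_TIMEOUT": ConnectionTimeoutError,
    "AUTH_FAILURE": AuthFailureError,
    "CONFIG_REJECTED": ConfigRejectedError,
}


def raise_for_result(result: dict) -> None:
    """Raise the exception matching a failed job result; do nothing otherwise."""
    if result.get("status") != "failed":
        return
    code = result.get("error_code")
    exc_type = _BY_CODE.get(code, NaasJobError)
    raise exc_type(
        result.get("error", "job failed"),
        code,
        bool(result.get("error_retryable", False)),
    )
```

Callers can then catch a specific subclass, or catch the base class and consult its `retryable` attribute to decide whether to resubmit.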
Rate Limiting
Per-caller and per-caller-per-device sliding window rate limits protect against abuse and accidental flooding:
| Variable | Default | Description |
|---|---|---|
| RATE_LIMIT_ENABLED | true | Enable/disable rate limiting |
| RATE_LIMIT_PER_CALLER | 1000 | Max requests per caller per window |
| RATE_LIMIT_PER_CALLER_DEVICE | 20 | Max requests per caller per device per window |
| RATE_LIMIT_WINDOW | 60 | Sliding window size in seconds |
When a limit is exceeded, the API returns 429 Too Many Requests with a Retry-After header. The admin role is exempt by default.
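To make the sliding window behavior concrete, here is a minimal in-memory sketch of a sliding window log limiter keyed by caller or by (caller, device). It only illustrates the technique. How NAAS stores its counters is not described in these notes, and the class name and `check` method are assumptions.

```python
import math
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within any `window` seconds."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def check(self, key) -> int:
        """Record a request. Return 0 if allowed, else seconds until retry."""
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) < self.limit:
            hits.append(now)
            return 0
        # The oldest hit leaves the window at hits[0] + window.
        return max(1, math.ceil(hits[0] + self.window - now))
```

A nonzero return value corresponds to the Retry-After value sent with a 429 response. A server would typically hold one limiter per limit type, for example one keyed by caller and another keyed by (caller, device).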
Documentation
- Kubernetes deployment page restructured as Helm-first
- K8s/Helm equivalents added across troubleshooting, security, observability, and architecture docs
- Deployment overview page with method decision guide
- All v1 API references updated to v2
- ADR 0009: Command authorization deferred to AAA
- ADR 0010: MCP server as thin client over REST API
Testing
- Locust-based load testing with smoke (PR) and full (RC) CI profiles
- OTel integration tests with OTLP collector
- SSE streaming integration tests
- RBAC, API key, and v2 field rejection integration tests
Compatibility
- Python 3.11+
- Netmiko 4.x
- Redis 7+
- Kubernetes 1.27+ (Helm chart)