ADR 0010: MCP Server as Thin Client over REST API

  • Status: Accepted
  • Date: 2026-04-16

Context and Problem Statement

AI assistants (Claude, ChatGPT, Cursor, Kiro) need a standardized way to discover and invoke NAAS operations. The Model Context Protocol (MCP) provides this standard, but the server could be implemented at different layers — directly against Redis/RQ internals, as a Flask blueprint, or as a standalone client of the existing REST API.

Decision Drivers

  • AI assistants should get the same auth, RBAC, audit logging, and rate limiting as any other consumer
  • The MCP server should not become a second API surface that diverges from the REST API
  • FastMCP 3.0 is the mature Python framework for MCP servers, with built-in testing support
  • The existing naas-client async SDK (AsyncNaasClient) already wraps the REST API

Considered Options

  • Option 1: MCP server embedded in the Flask app as a blueprint
  • Option 2: Standalone MCP server calling Redis/RQ directly
  • Option 3: Standalone MCP server as a thin client over the REST API via naas-client

Decision Outcome

Chosen option: Option 3 — thin client over REST API, because it reuses the existing API boundary and guarantees all security controls apply without duplication.

Architecture

AI Assistant (Claude / Cursor / ChatGPT)
    │ MCP Protocol (JSON-RPC 2.0 over stdio)
naas-mcp (FastMCP 3.0)
    │ AsyncNaasClient (httpx)
NAAS REST API (/v2/ routes)
    │ RQ + Redis
Workers → Netmiko → Devices
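
As an illustration of the top hop in the diagram, the sketch below builds the kind of JSON-RPC 2.0 message an assistant would write to the server's stdin to invoke a tool. The `tools/call` method and the `name`/`arguments` parameters follow the MCP specification, and the stdio transport frames each message as a single newline-terminated line. The argument names (`host`, `commands`) and the helper `frame_tool_call` are assumptions made for this example, not the actual naas-mcp tool schema.

```python
import json


def frame_tool_call(request_id: int, tool: str, arguments: dict) -> bytes:
    """Serialize an MCP tools/call request as one newline-terminated JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


# Hypothetical arguments: the real send_command schema may differ.
frame = frame_tool_call(
    1, "send_command", {"host": "192.0.2.10", "commands": ["show version"]}
)
print(frame.decode("utf-8"), end="")
```

Everything below this hop is ordinary HTTP to the REST API, so the MCP layer only has to translate between this framing and `AsyncNaasClient` calls.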

Key design choices

  • Separate package (packages/naas-mcp, published as mcp-server-naas) — independent release cycle, no coupling to server internals
  • Imports from naas-client, not naas — same API boundary as any external consumer
  • FastMCP 3.0 lifespan: AsyncNaasClient is created on startup, closed on shutdown, and injected via Context
  • Job polling in tools: send_command and send_config submit a job, then poll until it reaches a terminal state, so the AI gets a final result rather than a job ID
  • stdio transport — default for local AI assistants; streamable-http deferred to future work
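
The lifespan and polling choices above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the naas-mcp implementation: `StubClient` is a hypothetical stand-in for `AsyncNaasClient`, its method names (`submit`, `get_job`, `aclose`), the status strings, and the `interval`/`timeout` parameters are all invented for illustration, and the FastMCP wiring that would pass the client to tools via Context is omitted.

```python
import asyncio
from contextlib import asynccontextmanager

TERMINAL_STATES = {"finished", "failed"}  # assumed status names


class StubClient:
    """Hypothetical stand-in for AsyncNaasClient; the real SDK's methods may differ."""

    def __init__(self):
        self.closed = False
        self._polls = 0

    async def submit(self, host, command):
        return "job-1"

    async def get_job(self, job_id):
        self._polls += 1
        # Report "queued" twice, then a finished result.
        if self._polls < 3:
            return {"id": job_id, "status": "queued"}
        return {"id": job_id, "status": "finished", "output": "ok"}

    async def aclose(self):
        self.closed = True


@asynccontextmanager
async def client_lifespan():
    """Create the client on startup and close it on shutdown."""
    client = StubClient()
    try:
        yield client
    finally:
        await client.aclose()


async def send_command(client, host, command, interval=0.01, timeout=1.0):
    """Submit a job, then poll until a terminal state or the deadline."""
    job_id = await client.submit(host, command)
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        job = await client.get_job(job_id)
        if job["status"] in TERMINAL_STATES:
            return job  # final result, not just a job ID
        if loop.time() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        await asyncio.sleep(interval)


async def main():
    async with client_lifespan() as client:
        result = await send_command(client, "192.0.2.10", "show version")
    return result, client


result, client = asyncio.run(main())
print(result["status"], client.closed)
```

The deadline is an addition of this sketch: without some upper bound, a job stuck in the queue would block the tool call indefinitely, so a real tool would likely want a comparable limit.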

Consequences

  • Good: All auth, RBAC, audit logging, rate limiting, and circuit breakers apply automatically
  • Good: MCP server can't bypass API controls — it's just another client
  • Good: Independent versioning and release from the main server
  • Good: FastMCP test client enables in-process testing without network transport
  • Bad: Extra network hop (MCP → API → worker) adds latency — acceptable for interactive AI use
  • Bad: MCP server cannot expose operations not yet in the REST API — accepted, keeps surfaces aligned

Pros and Cons of the Options

Option 1: Embedded Flask blueprint

  • Good: No extra network hop, single deployment
  • Bad: Couples MCP to Flask lifecycle and release cycle
  • Bad: MCP protocol (JSON-RPC over stdio) doesn't fit naturally in a WSGI app
  • Bad: Harder to test MCP tools in isolation

Option 2: Standalone calling Redis/RQ directly

  • Good: Lowest latency, no API dependency
  • Bad: Bypasses all API security controls (auth, RBAC, audit, rate limiting)
  • Bad: Duplicates job submission logic, creates second API surface
  • Bad: Tight coupling to internal queue implementation

Option 3: Thin client over REST API

  • Good: Reuses all existing security and observability infrastructure
  • Good: Clean separation — MCP server is just another API consumer
  • Good: naas-client already handles auth, retries, and async
  • Bad: Extra hop adds ~10-50ms latency per call — negligible for AI interactions