Protocols

The canonical LLM protocol for Core-X is OpenResponses (POST /v1/responses) — not /v1/chat/completions.

Why OpenResponses?

Every LLM-capable service speaks POST /v1/responses with SSE streaming. The legacy /v1/chat/completions endpoint has been removed from the codebase.

Key Properties

  • Streaming: Server-Sent Events (SSE) for real-time token delivery
  • Unified: Same protocol for gateway, mlx-llm, and mlx-rag
  • A2A: Agent-to-Agent communication uses the event bus (:8085) over SSE
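SSE delivers each event as one or more `data:` lines terminated by a blank line. As an illustration, here is a minimal parser sketch in Python; the `data:` line syntax follows the SSE wire format, but the JSON token payloads shown are placeholders, not the actual OpenResponses event schema:

```python
import json

def parse_sse(stream_text):
    """Parse raw SSE text into a list of data payloads.

    Per the SSE wire format, each event is one or more `data:` lines
    terminated by a blank line; lines starting with ':' are comments.
    """
    events, data_lines = [], []
    for line in stream_text.splitlines():
        if line.startswith(":"):            # comment / keep-alive, skip
            continue
        if line.startswith("data:"):
            data_lines.append(line[5:].lstrip())
        elif line == "" and data_lines:     # blank line ends the event
            events.append("\n".join(data_lines))
            data_lines = []
    return events

# Hypothetical token-delta events; the real event shape may differ.
raw = 'data: {"delta": "Hel"}\n\ndata: {"delta": "lo"}\n\n'
tokens = [json.loads(e)["delta"] for e in parse_sse(raw)]
# tokens == ["Hel", "lo"]
```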

Request Flow

Client → POST /v1/responses → Gateway (:8090) → mlx-llm (:8091)
                                              → mlx-rag (:8092)
                                              → [other services]
Client ← SSE stream ← Gateway (:8090)
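A client call into this flow might look like the following sketch. The request body assumes an OpenAI-Responses-style schema (`model`, `input`, `stream` fields); Core-X's exact field names are an assumption here, as is the model name:

```python
import json
import urllib.request

def build_responses_request(base_url, model, prompt):
    """Build a streaming POST /v1/responses request.

    The JSON body schema (model/input/stream) is assumed, not confirmed
    by the Core-X docs; adjust to the gateway's actual contract.
    """
    body = json.dumps({"model": model, "input": prompt, "stream": True})
    return urllib.request.Request(
        url=f"{base_url}/v1/responses",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Accept": "text/event-stream"},
        method="POST",
    )

req = build_responses_request("http://localhost:8090", "mlx-llm", "Hello")
# req.full_url == "http://localhost:8090/v1/responses"
```

Sending the request (e.g. with `urllib.request.urlopen`) and iterating over the response line by line yields the SSE stream.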

Event Bus (A2A)

The event bus at :8085 provides SSE pub/sub for agent-to-agent communication. Services publish events and subscribe to channels for real-time coordination.
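The channel semantics can be sketched in-process as follows. This illustrates only the publish/subscribe pattern; the bus's actual HTTP endpoints are not shown, and the channel name used below is hypothetical:

```python
from collections import defaultdict

class EventBus:
    """Minimal in-process sketch of channel-based pub/sub.

    The real event bus exposes this over SSE on :8085; this sketch
    only models the publish/subscribe coordination pattern.
    """
    def __init__(self):
        self.subscribers = defaultdict(list)  # channel -> callbacks

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, event):
        for callback in self.subscribers[channel]:
            callback(event)

bus = EventBus()
received = []
bus.subscribe("agents.status", received.append)  # hypothetical channel name
bus.publish("agents.status", {"agent": "mlx-rag", "state": "ready"})
# received == [{"agent": "mlx-rag", "state": "ready"}]
```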

# Health check
curl http://localhost:8085/health
# Start standalone
python core-x/scripts/run_event_bus.py --host 127.0.0.1 --port 8085