# Protocols
The canonical LLM protocol for Core-X is OpenResponses (`POST /v1/responses`), not `/v1/chat/completions`.
## Why OpenResponses?

Every LLM-capable service speaks `POST /v1/responses` with SSE streaming. The legacy `/v1/chat/completions` endpoint has been removed from the codebase.
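A minimal sketch of issuing a streaming request against the endpoint. The `model`, `input`, and `stream` field names are assumptions based on common Responses-style APIs and are not confirmed by this document; only the path (`/v1/responses`), the method, and the SSE transport come from the text above.

```python
import json
import urllib.request

def build_responses_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a streaming POST /v1/responses request.

    NOTE: the body schema ("model", "input", "stream") is a hypothetical
    illustration; consult the actual OpenResponses schema before use.
    """
    body = json.dumps({"model": model, "input": prompt, "stream": True}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/responses",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )

# Target the gateway on :8090, as described in the request flow below.
req = build_responses_request("http://localhost:8090", "example-model", "Hello")
# urllib.request.urlopen(req) would then yield the SSE token stream line by line.
```

Sending the request with `urllib.request.urlopen(req)` returns a response object that can be iterated line by line to consume the SSE stream.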
## Key Properties
- Streaming: Server-Sent Events (SSE) for real-time token delivery
- Unified: Same protocol for gateway, mlx-llm, and mlx-rag
- A2A: Agent-to-Agent communication uses the event bus (:8085) over SSE
## Request Flow
```
Client → POST /v1/responses → Gateway (:8090) → mlx-llm (:8091) → mlx-rag (:8092) → [other services]
                                                      ← SSE stream ← Event Bus (A2A)
```
The event bus at :8085 provides SSE pub/sub for agent-to-agent communication. Services publish events and subscribe to channels for real-time coordination.
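A sketch of subscribing to an event-bus channel over SSE. The document specifies only the port (:8085) and the SSE pub/sub model; the `/subscribe?channel=...` path is a hypothetical placeholder, and the `parse_sse_event` helper simply implements the standard `text/event-stream` framing (events separated by blank lines, payloads carried in `data:` fields).

```python
import urllib.request

def parse_sse_event(lines):
    """Join the data: fields of one SSE event into a single payload string."""
    return "\n".join(ln[5:].lstrip() for ln in lines if ln.startswith("data:"))

def subscribe(channel: str, base_url: str = "http://localhost:8085"):
    """Yield event payloads from a channel on the event bus.

    NOTE: the subscribe path below is an assumption for illustration;
    check the event bus implementation for the real endpoint.
    """
    with urllib.request.urlopen(f"{base_url}/subscribe?channel={channel}") as resp:
        buffer = []
        for raw in resp:
            line = raw.decode("utf-8").rstrip("\r\n")
            if line == "":  # a blank line terminates one SSE event
                if buffer:
                    yield parse_sse_event(buffer)
                    buffer = []
            else:
                buffer.append(line)
```

A consumer would iterate `subscribe("some-channel")` and react to each decoded payload as it arrives.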
```bash
# Health check
curl http://localhost:8085/health

# Start standalone
python core-x/scripts/run_event_bus.py --host 127.0.0.1 --port 8085
```