## Documentation Index
Fetch the complete documentation index at: https://docs.blink.so/docs/llms.txt
Use this file to discover all available pages before exploring further.
## Overview
The Blink Server acts as the control plane for AI agent deployments. Key architectural principles:
- Agents are HTTP servers deployed as Docker containers
- The control loop runs inside the server, not in agents
- Communication is over HTTP; chat streaming reaches clients via SSE and WebSockets
- State is centralized in PostgreSQL (chats, runs, deployments, files, logs, traces, KV)
The server also hosts the web UI and the HTTP API used by the CLI and SDK.
## Server Components
- API server handles chats, agents, webhooks, files, logs, traces, and devhook routing
- WebSocket server is used for chat streaming and auth token handshakes
- Startup runs database migrations before accepting traffic
## Agent Execution Model
Agents are deployed as Docker containers using a configurable image (default: `ghcr.io/coder/blink-agent:latest`).
### Container Structure
Each agent container includes:
| Component | Purpose |
|---|---|
| Agent bundle | Built files staged into /app from the deployment output |
| Runtime wrapper | Starts the agent and internal API server, proxies requests, injects auth |
| Internal API server | Serves /kv, /chat, and /otlp/v1/traces for agent code and forwards to the Blink Server |
| OpenTelemetry Collector | Collects agent logs and forwards them to the server |
### Runtime Wiring (self-hosted)
On deployment the server:
- Downloads deployment output files, writes them to a temp dir, and adds a runtime wrapper (`__wrapper.js`).
- Launches a container and sets environment variables such as `ENTRYPOINT`, `PORT`, `INTERNAL_BLINK_API_SERVER_URL`, `INTERNAL_BLINK_API_SERVER_LISTEN_PORT`, `BLINK_REQUEST_URL`, `BLINK_REQUEST_ID`, and `BLINK_DEPLOYMENT_TOKEN`.
- The wrapper starts an internal API server inside the container and patches `fetch` so internal API calls include `x-blink-internal-auth`.
- The wrapper runs the agent entrypoint on `PORT+1` and proxies incoming requests on `PORT` to the agent.
- The OpenTelemetry collector starts and reads the agent log pipe.
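The wrapper's wiring can be sketched as follows. The helper names (`patchFetch`, `agentPort`) are illustrative; the real internals of `__wrapper.js` are not shown in this document.

```typescript
// Minimal shape of a fetch-like function for the sketch below.
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> },
) => Promise<unknown>;

// Wrap fetch so calls targeting the internal API server gain the
// x-blink-internal-auth header, as described above. Other calls pass
// through untouched.
function patchFetch(
  baseFetch: FetchLike,
  internalApiUrl: string,
  token: string,
): FetchLike {
  return (url, init = {}) => {
    const headers = { ...(init.headers ?? {}) };
    if (url.startsWith(internalApiUrl)) {
      headers["x-blink-internal-auth"] = token;
    }
    return baseFetch(url, { ...init, headers });
  };
}

// The wrapper listens on PORT and proxies to the agent entrypoint,
// which it starts on PORT+1.
function agentPort(wrapperPort: number): number {
  return wrapperPort + 1;
}
```

The patched `fetch` only decorates requests aimed at the internal API base URL, so agent code can keep making ordinary outbound HTTP calls without leaking the auth header.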
## Control Loop
The control loop is the core orchestration mechanism. It runs inside the server, not in agents.
### Request Flow
1. External event arrives (API call, Slack message, GitHub webhook)
2. Server routes the event to the appropriate agent deployment
3. Server invokes the agent's `/_agent/chat` endpoint with an invocation token
4. Agent processes the request and streams a response back (SSE)
5. Server persists messages and run/step state to PostgreSQL and fans out to clients
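The agent's streamed response is a standard SSE stream. A minimal sketch of extracting `data:` payloads from a raw SSE body (framing simplified; the server's real parser, and its handling of `event:`/`id:` fields and partial chunks, are not shown here):

```typescript
// Minimal SSE parser sketch: split the stream into blank-line-delimited
// events and collect each event's "data:" payload lines.
function parseSseData(raw: string): string[] {
  return raw
    .split("\n\n")
    .map((block) =>
      block
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice("data:".length).trim())
        .join("\n"),
    )
    .filter((data) => data.length > 0);
}
```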
### Chat Run Lifecycle
- Each chat run has one or more steps stored in the DB.
- The server selects the latest step, invokes the active deployment, and streams chunks as they arrive.
- If the response includes tool calls, the server creates a new step and continues the loop.
- Interrupts cancel an in-flight step and restart with the latest state.
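The step loop above can be modeled as a toy function. The types and the `maxSteps` cap are illustrative, not the actual Blink schema or limits:

```typescript
// Illustrative result of invoking the active deployment for one step.
interface StepResult {
  text: string;
  toolCalls: string[];
}

// Invoke the deployment once per step; tool calls in a response mean the
// server creates a new step and continues the loop. A no-tool-call
// response (or hitting the cap) ends the run.
function runChatLoop(
  invoke: (step: number) => StepResult,
  maxSteps = 25,
): { steps: number; final: string } {
  let step = 1;
  for (;;) {
    const result = invoke(step);
    if (result.toolCalls.length === 0 || step >= maxSteps) {
      return { steps: step, final: result.text };
    }
    step += 1; // response contained tool calls: create the next step
  }
}
```

An interrupt corresponds to cancelling the in-flight `invoke` and re-entering the loop at the latest persisted step.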
### Streaming and Buffering
- The server broadcasts `message.chunk.added` events to WebSocket and SSE clients.
- The current streaming buffer is kept in memory to allow reconnects.
- This in-memory session state is the main blocker for horizontal scaling today.
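A sketch of such an in-memory buffer (illustrative; the server's actual session state is not shown in this document):

```typescript
// Per-stream buffer of emitted chunks that lets a reconnecting client
// catch up before live streaming resumes.
class ChunkBuffer {
  private chunks: string[] = [];

  append(chunk: string): void {
    this.chunks.push(chunk);
  }

  // A reconnecting client reports how many chunks it already received
  // and gets the remainder replayed.
  replayFrom(offset: number): string[] {
    return this.chunks.slice(offset);
  }
}
```

Because a buffer like this lives in one process's memory, every client of a given chat must reach the same node, which is exactly the horizontal-scaling constraint noted above.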
### Why the Control Loop is Server-Side
Running the control loop in the server rather than agents provides:
- Centralized state in PostgreSQL
- Agent simplicity (no orchestration logic)
- Observability and auditability
- Consistent tool-call looping behavior
For more details about the control loop, see the agent structure guide.
## Request Routing
For details on webhook routing and devhooks, see the webhooks and devhooks guide.
## Communication
### Server -> Agent
The server communicates with agents via HTTP:
| Endpoint | Method | Purpose |
|---|---|---|
| `/_agent/health` | GET | Health check |
| `/_agent/chat` | POST | Chat request, SSE response |
| `/_agent/capabilities` | GET | Check supported handlers |
| `/_agent/ui` | GET | UI schema for dynamic inputs |
| `/_agent/flush-otel` | POST | Flush telemetry buffers |
| `/_agent/*` | ANY | Custom request handler |
Older deployments may still be called via `/sendMessages` or `/_agent/send-messages`.
All server -> agent calls include `x-blink-invocation-token`. Chat runs also include run, step, and chat ID headers.
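A hypothetical helper assembling these headers. Only `x-blink-invocation-token` is documented above; the run/step/chat header names below are placeholders, not the real ones:

```typescript
// Build the header set for a server -> agent call. Chat runs attach the
// run/step/chat IDs; other invocations send only the invocation token.
function invocationHeaders(
  token: string,
  ids?: { runId: string; stepId: string; chatId: string },
): Record<string, string> {
  const headers: Record<string, string> = {
    "x-blink-invocation-token": token, // documented header
  };
  if (ids) {
    headers["x-run-id"] = ids.runId; // placeholder header name
    headers["x-step-id"] = ids.stepId; // placeholder header name
    headers["x-chat-id"] = ids.chatId; // placeholder header name
  }
  return headers;
}
```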
### Agent -> Server
Agent code running in a container does not call the public API directly. Instead, the wrapper exposes an internal API server:
- `/kv` for agent key-value storage
- `/chat` for chat CRUD and message operations
- `/otlp/v1/traces` for trace export (logs are forwarded by the collector)
The wrapper forwards these to the Blink Server using the invocation token and the deployment token.
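A sketch of agent-side access to the internal `/kv` endpoint. Only the `/kv` prefix appears above; the class name, HTTP verbs, and path shape here are assumptions:

```typescript
// Minimal fetch-like shape so the client can be exercised without a network.
type Fetcher = (
  url: string,
  init?: { method?: string; body?: string },
) => Promise<{ text(): Promise<string> }>;

class InternalKvClient {
  // base would come from INTERNAL_BLINK_API_SERVER_URL in the container.
  constructor(private base: string, private fetcher: Fetcher) {}

  get(key: string): Promise<string> {
    return this.fetcher(`${this.base}/kv/${encodeURIComponent(key)}`).then(
      (res) => res.text(),
    );
  }

  set(key: string, value: string): Promise<void> {
    return this.fetcher(`${this.base}/kv/${encodeURIComponent(key)}`, {
      method: "PUT", // assumed verb; not specified above
      body: value,
    }).then(() => undefined);
  }
}
```

With the wrapper's patched `fetch`, calls like these automatically carry `x-blink-internal-auth` and are forwarded to the Blink Server.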
## Data and Storage
PostgreSQL stores:
- chat messages, runs, and steps
- agents, deployments, and deployment targets
- files and attachments
- logs and traces (self-hosted)
Migrations run automatically at server startup.
## Limitations
Current architectural constraints to be aware of:
| Limitation | Details |
|---|---|
| Single node only | In-memory chat streaming buffers prevent horizontal scaling |
| Docker required | Agents must run as Docker containers (no Kubernetes, ECS, etc.) |
| Local Docker daemon | Server must have direct access to Docker socket |
These limitations exist because Blink is in early access. We plan to support horizontal scaling and other deployment options in the future.