Observability
Request IDs, generation IDs, usage logs, latency telemetry. Trace every request from your code to the answer.
Every request through Synapse Garden produces three correlatable IDs and a structured log row. Use them to debug a slow request, audit a user-reported answer, or attach token usage to your application traces — the AI SDK lets you supply a custom fetch to your provider, so you can stamp your own X-Request-Id header and stitch our spans into your trace tree.
Three IDs per request
| ID | Where | Use for |
|---|---|---|
| `X-Request-Id` | Response header on every request | Correlating with our internal logs |
| `requestId` (in your code) | Generated by the SDK or by you, sent as `X-Request-Id` | Tying upstream calls to your application traces |
| `generationId` | `providerMetadata.gateway.generationId` after the call | Looking up provider-side traces (admin / debugging) |
import { streamText } from "ai"
import { createOpenAICompatible } from "@ai-sdk/openai-compatible"

// streamText doesn't accept baseURL, apiKey, or fetch directly;
// configure them on a provider instance. The fetch wrapper stamps
// and logs an X-Request-Id on every outgoing call.
const synapse = createOpenAICompatible({
  name: "synapse-garden",
  baseURL: "https://synapse.garden/api/v1",
  apiKey: process.env.MG_KEY,
  fetch: async (url, init) => {
    const headers = new Headers(init?.headers)
    // Keep a caller-supplied ID (e.g. a trace ID); otherwise stamp one.
    if (!headers.has("X-Request-Id")) headers.set("X-Request-Id", crypto.randomUUID())
    const res = await fetch(url, { ...init, headers })
    console.log("[mg] sent X-Request-Id =", headers.get("X-Request-Id"), "status:", res.status)
    return res
  },
})

const result = streamText({
  model: synapse("openai/gpt-5.4"),
  prompt: "...",
})
const meta = await result.providerMetadata
console.log("[mg] generationId =", meta?.gateway?.generationId)

Per-request log row
Every request produces a row in request_logs (per docs/ERD.md §2.7) with:
{
requestId: string, // X-Request-Id (yours or generated)
orgId, projectId, apiKeyId, // governance scope
model: string, // requested model
endpoint: "chat.completions" | "messages" | "embeddings",
status: "success" | "error" | "timeout" | "rate_limited" | "budget_exceeded",
inputTokens, outputTokens,
costPassthroughCents,
costChargedCents,
latencyMs,
ttftMs, // time-to-first-token (streaming)
errorCode, errorMessage, // null on success
createdAt
}

Visible in the dashboard at Usage → Recent requests, with filters for project, key, model, status, and date range. The full payload is not logged by default — see Privacy below.
Viewing requests in the dashboard
Dashboard → Usage → Recent requests shows the last 1,000 rows for your workspace. Click any row to see:
- Full timing breakdown (TTFT, total latency, queue lag)
- Token usage (input, output, cached)
- Cost (passthrough + charged)
- Provider that served it
- Full request/response only if you've opted into payload logging at the workspace level
- Curl reproduction snippet (we re-render your call as a curl command for easy debugging)
Real-time SSE updates
The dashboard subscribes to a server-sent-events stream (/api/sse) for live usage updates. New rows appear in Recent requests within ~1 second of completion — useful when you're tail-debugging a problem.
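If you want the same feed outside the dashboard, a consumer can be as small as the sketch below (browser context, authenticated with your dashboard session). The request.completed event name and the row-shaped payload are assumptions; inspect the stream for the actual event names before relying on them:

// Assumes /api/sse accepts your dashboard session cookie.
const source = new EventSource("https://synapse.garden/api/sse", {
  withCredentials: true,
})
// "request.completed" is a hypothetical event name; check the stream.
source.addEventListener("request.completed", (event) => {
  const row = JSON.parse((event as MessageEvent).data)
  console.log("[mg] live:", row.requestId, row.status, `${row.latencyMs} ms`)
})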
OpenTelemetry integration
Synapse Garden emits OTel spans on every proxy call. If your app already uses OpenTelemetry, the spans plug into your existing trace tree:
import { trace } from "@opentelemetry/api"
import { generateText } from "ai"

const tracer = trace.getTracer("my-app")

const summary = await tracer.startActiveSpan("generate-summary", async (span) => {
  // Reuse the OTel trace ID as the request ID so both sides share one correlator.
  const requestId = span.spanContext().traceId
  const { text } = await generateText({
    model: synapse("openai/gpt-5.4"), // the provider configured above
    prompt,
    headers: { "X-Request-Id": requestId },
  })
  span.end()
  return text
})

When you set X-Request-Id to your trace ID, our internal span propagates the same trace ID — your APM tool stitches the timeline together automatically.
Latency budget
We surface our own overhead separately from the upstream model time. Browse /internal/ops (admin only) for live numbers; targets per docs/ARCHITECTURE.md §3:
| Stage | Target P50 |
|---|---|
| Header parse + key format check | 1 ms |
| Key validation (cache hit) | 3 ms |
| Body validation (Zod) | 2 ms |
| Model allowlist check | <1 ms |
| Rate limit check | 2 ms |
| Token estimate + budget check | 3 ms |
| Synapse Garden total | ~12 ms |
| Upstream model | varies |
| Logging | 0 ms (fire-and-forget) |
Hot path latency is monitored with k6 in CI; if P95 overhead crosses 50 ms, the build fails.
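For reference, that gate looks roughly like the k6 sketch below. It is illustrative rather than the actual CI script: a client can't separate gateway overhead from upstream model time, so this version thresholds /api/health (no model call) as a stand-in for pure proxy latency:

import http from "k6/http"
import { check } from "k6"

export const options = {
  vus: 10,
  duration: "30s",
  thresholds: {
    // k6 exits non-zero when a threshold fails, failing the CI job.
    http_req_duration: ["p(95)<50"],
  },
}

export default function () {
  const res = http.get("https://synapse.garden/api/health")
  check(res, { "status is 200": (r) => r.status === 200 })
}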
Token usage tracking
In your app:
const { text, usage } = await generateText({ model: "...", prompt })
console.log(usage)
// { promptTokens: 234, completionTokens: 156, totalTokens: 390 }

For streaming:
const result = streamText({ model: "...", prompt })
for await (const part of result.textStream) process.stdout.write(part)
const usage = await result.usage // resolves at end of stream

Aggregate token usage in your own metrics:
metrics.recordHistogram("llm.tokens.input", usage.promptTokens, { model })
metrics.recordHistogram("llm.tokens.output", usage.completionTokens, { model })Webhooks (v2)
We're shipping outbound webhooks in v2 — subscribe a URL to per-request events (request.completed, budget.exceeded, key.revoked). Useful for piping Synapse Garden activity into your own data warehouse without polling.
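Ahead of that release, a receiver might look like the sketch below. The event names come from the list above; the envelope shape (type plus a request_logs-style row under data) and any signature verification are assumptions until the v2 docs land:

import { createServer } from "node:http"

createServer((req, res) => {
  let body = ""
  req.on("data", (chunk) => (body += chunk))
  req.on("end", () => {
    // Hypothetical envelope: { type, data }. Verify the (presumed)
    // signature header here once it's documented.
    const event = JSON.parse(body)
    switch (event.type) {
      case "request.completed":
        // e.g. append event.data to your warehouse ingest queue
        break
      case "budget.exceeded":
      case "key.revoked":
        // e.g. page on-call or rotate credentials
        break
    }
    res.writeHead(204).end()
  })
}).listen(8080)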
Privacy
By default, no prompt or completion content is logged. We only log:
- Request metadata (timing, status, tokens, cost)
- Model and endpoint
- API key prefix (not full key)
- IP address (for abuse detection; truncated to /24 after 7 days)
Opt-in payload logging is available at the workspace level for teams that need full request/response audit trails for compliance. PII redaction runs automatically on opt-in payloads.
Status & uptime
- Status page: status.synapse.garden — real-time component health, incident history, scheduled maintenance.
- API health: `GET /api/health` returns `{ db, redis, queue, status }` for synthetic monitoring.
- Per-provider uptime: see model detail pages at `/models/[creator]/[slug]` for live provider availability.
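A minimal synthetic monitor against that endpoint can look like the sketch below. The response fields come from the list above; treating anything other than an "ok" status as unhealthy is an assumption about the field's values:

// Polls the documented health endpoint; "ok" as the healthy value is assumed.
const res = await fetch("https://synapse.garden/api/health")
const health = (await res.json()) as {
  db: string; redis: string; queue: string; status: string
}
if (!res.ok || health.status !== "ok") {
  console.error("[monitor] unhealthy:", health)
  process.exitCode = 1
}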
What we monitor
| Metric | Alert threshold |
|---|---|
| Proxy P95 overhead | >50 ms for 5 min |
| Key cache hit rate | <99% for 10 min |
| /v1/* 5xx rate | >1% for 5 min |
| Upstream error rate | >3% for 5 min |
| Webhook lag | >30 s |
| Queue backlog | >1,000 |
| Daily token spend | 3× rolling 7-day avg (anomaly) |
Sentry captures unhandled exceptions; alerts route to on-call. Full incident response process documented at /legal/security.