Idempotency

Safe retries with `Idempotency-Key`. 24-hour dedup window. Same key, same response.

FIG. 00 · IDEMPOTENCY · DEDUPE BY KEY

Network failures happen. Timeouts happen. Mid-stream disconnects happen. Idempotency keys let you safely retry a request without paying for it twice or kicking off duplicate side effects. Pass the header from any client, including the AI SDK's `streamText`, and our edge takes care of the rest.

FIG. 01 · DEDUP WINDOW · SCHEMATIC

The first request with a given `Idempotency-Key` runs normally and the response is cached for 24 hours. A repeat with the same key returns the cached response without hitting the upstream model. A repeat with a *different* body for the same key is a 409.

How it works

Pass an `Idempotency-Key` header on any request. We hash the request body and key together, and:

First request with that key: runs normally, response cached for 24 hours.
Repeat request with the same key (within 24h): returns the cached response without hitting the upstream model.
Different body, same key: returns 409 Conflict — keys are bound to the request body they were first seen with.

Result: safe automatic retries, no double-billing, no duplicate webhooks.
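
The three outcomes above can be modeled with a toy in-memory cache. This is only a sketch of the documented behavior, not the service's implementation: the `DedupModel` class, the `Outcome` type, and the `run` callback are hypothetical names, and the raw body string stands in for the body hash.

```typescript
type Outcome =
  | { kind: "executed"; response: string }
  | { kind: "replayed"; response: string }
  | { kind: "conflict"; status: 409 };

interface Entry {
  body: string; // stand-in for the hash of the request body
  response: string;
  expiresAt: number; // epoch milliseconds
}

const WINDOW_MS = 24 * 60 * 60 * 1000; // 24-hour dedup window

class DedupModel {
  private entries = new Map<string, Entry>();

  // `run` represents the upstream model call; `now` is injectable for testing.
  handle(key: string, body: string, run: () => string, now: number = Date.now()): Outcome {
    const entry = this.entries.get(key);
    if (entry !== undefined && entry.expiresAt > now) {
      if (entry.body !== body) {
        // Same key, different body: the key is bound to the first body seen.
        return { kind: "conflict", status: 409 };
      }
      // Same key, same body: return the cached response without calling upstream.
      return { kind: "replayed", response: entry.response };
    }
    // First use of the key (or the entry expired): run and cache the result.
    const response = run();
    this.entries.set(key, { body, response, expiresAt: now + WINDOW_MS });
    return { kind: "executed", response };
  }
}
```

Calling `handle` twice with the same key and body invokes `run` only once; changing the body yields the conflict outcome, and a call made after the window has passed runs again as a fresh request.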

Usage

fetch / curl

curl https://synapse.garden/api/v1/chat/completions \
  -H "Authorization: Bearer $MG_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: req_2026-05-10_abc123def456" \
  -d '{
    "model": "openai/gpt-5.4",
    "messages": [{"role": "user", "content": "Say hello"}]
  }'

OpenAI SDK

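// Create the key once per logical request so every retry sends the same value.
const idempotencyKey = `req_${Date.now()}_${crypto.randomUUID()}`
const res = await client.chat.completions.create(
  { model: "openai/gpt-5.4", messages: [{ role: "user", content: "..." }] },
  { headers: { "Idempotency-Key": idempotencyKey } },
)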

Anthropic SDK

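// Create the key once per logical request so every retry sends the same value.
const idempotencyKey = `req_${Date.now()}_${crypto.randomUUID()}`
const message = await client.messages.create(
  { model: "anthropic/claude-opus-4.6", max_tokens: 1024, messages: [...] },
  { headers: { "Idempotency-Key": idempotencyKey } },
)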

AI SDK

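import { generateText } from "ai"
import { createOpenAICompatible } from "@ai-sdk/openai-compatible"

// generateText does not accept baseURL or apiKey; they are set on the provider.
// Using the OpenAI-compatible provider here is an assumption.
const synapse = createOpenAICompatible({
  name: "synapse",
  baseURL: "https://synapse.garden/api/v1",
  apiKey: process.env.MG_KEY,
})

// Create the key once per logical request so every retry sends the same value.
const idempotencyKey = `req_${Date.now()}_${crypto.randomUUID()}`

await generateText({
  model: synapse("openai/gpt-5.4"),
  prompt: "...",
  headers: { "Idempotency-Key": idempotencyKey },
})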

Generating keys

A good key is unique per request, deterministic on retry. Patterns:

// Per request (generate once, store it, and reuse the stored value on retry)
const key = crypto.randomUUID()

// Per business event (when retrying a workflow step)
const key = `workflow_${jobId}_step_${stepIndex}`

// Per user action
const key = `user_${userId}_action_${actionId}`

Don't reuse the same key across different requests — if the bodies differ, you'll get 409 errors.

Do reuse the same key for retries of the same logical request. That's the whole point.
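
A minimal sketch of that rule in a retry loop: the key is chosen by the caller before the first attempt and passed unchanged to every attempt. The `withIdempotentRetry` helper and the `Send` callback type are hypothetical, not part of any SDK; `send` stands in for whatever HTTP call you make.

```typescript
// `send` performs one attempt with the given headers and rejects on a
// network error or timeout (hypothetical callback, supplied by the caller).
type Send = (headers: Record<string, string>) => Promise<{ status: number }>;

async function withIdempotentRetry(
  idempotencyKey: string,
  send: Send,
  maxAttempts: number = 3,
): Promise<{ status: number }> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Every attempt carries the same key, so a request that already
      // completed is replayed from cache instead of running twice.
      return await send({ "Idempotency-Key": idempotencyKey });
    } catch (err) {
      // Outcome unknown (timeout, dropped connection): try again.
      lastError = err;
    }
  }
  throw lastError;
}
```

A production version would likely add a delay between attempts. The important part is that the key is created outside the loop, never inside it.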

TTL and storage

24 hours. After that, the key expires and a repeat request will run again as a fresh call.
Per API key. The idempotency cache is keyed by (api_key_id, idempotency_key), so two different API keys can use the same idempotency string without collision.
Per-region. Idempotency cache is regional. Cross-region retries don't dedupe (rare in practice — same request from the same client typically hits the same region).
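
To make the scoping concrete, here is a sketch of how a cache entry could be addressed. The `storageKey` function and its encoding are assumptions for illustration; the source only states that entries are scoped by (api_key_id, idempotency_key) and that the cache is regional.

```typescript
// Hypothetical storage key combining region, API key id, and idempotency key.
function storageKey(region: string, apiKeyId: string, idempotencyKey: string): string {
  // JSON-encoding the tuple keeps the parts unambiguous even if one of them
  // contains a character that would otherwise act as a delimiter.
  return JSON.stringify([region, apiKeyId, idempotencyKey]);
}
```

Under this model, the same idempotency string sent with two different API keys, or to two different regions, maps to two separate entries, which is why cross-region retries do not dedupe.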

When to use

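Retries on network errors. If your client times out and you don't know whether the request reached us, send the same idempotency key on retry. Within the 24-hour window and the same region, a request that already completed is replayed from cache rather than run again.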
Webhook handlers. When you call us from a webhook (which can fire multiple times), use the webhook's event id as the idempotency key.
Workflow orchestration. Each step of a durable workflow is a logical unit — use the step id.
User-initiated actions. "Submit form" buttons that might be double-clicked. Use a per-form-submission UUID.
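
As an illustration of the webhook case, the sketch below derives both the key and the body from the event alone, so a redelivered event produces a byte-identical request and is served from cache instead of triggering a 409. The `WebhookEvent` shape, the `webhook:` prefix, and the `prepareChatRequest` helper are hypothetical.

```typescript
// Hypothetical webhook event shape; real providers name these fields differently.
interface WebhookEvent {
  id: string;
  text: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function prepareChatRequest(event: WebhookEvent, apiKey: string): PreparedRequest {
  return {
    url: "https://synapse.garden/api/v1/chat/completions",
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
      // Redeliveries of the same event carry the same id, so they share a key.
      "Idempotency-Key": `webhook:${event.id}`,
    },
    // Build the body only from the event so a redelivery serializes identically.
    body: JSON.stringify({
      model: "openai/gpt-5.4",
      messages: [{ role: "user", content: event.text }],
    }),
  };
}
```

The result can be passed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Avoid mixing timestamps or other per-delivery values into the body, since any difference in the body for the same key is a conflict.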

When NOT to use

Streaming requests. Idempotency cache only stores final responses, not intermediate stream events. If a streamed response is cached and you retry, you get the full final body in one shot, not re-streamed deltas. (Most clients handle this fine, but it's a behavior change.)
Always-fresh content. If you genuinely want a different response each time (e.g. random sampling), don't pass an idempotency key.

Errors

HTTP status and code	When
409 `IDEMPOTENCY_KEY_CONFLICT`	The same key was sent with a different request body
429 `RATE_LIMITED`	You exceeded your rate limit; cached idempotency hits return fast but still count toward it
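
One way a client might react to these two statuses is sketched below. The `nextAction` function and its return labels are hypothetical, and it looks only at the HTTP status because the error body format is not described here.

```typescript
type NextAction = "done" | "conflict" | "back_off_then_retry" | "other";

function nextAction(status: number): NextAction {
  if (status >= 200 && status < 300) return "done";
  // 409: the key was already used with a different body. Retrying as-is
  // cannot succeed; send the original body or pick a new key.
  if (status === 409) return "conflict";
  // 429: rate limited. Wait, then retry with the SAME idempotency key.
  if (status === 429) return "back_off_then_retry";
  // Anything else is outside the scope of this table.
  return "other";
}
```

A 409 usually points to a client bug, such as a key reused across different logical requests, so it is worth logging the key alongside the error.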

Best practices

Always pass an idempotency key from servers. It's free safety. Browsers can skip it for one-shot UX requests.
Make keys long and unique. 32+ random chars. Don't use predictable values like timestamps alone.
Log the key with your request. Useful for debugging "did my retry actually dedupe?".
For long workflows, namespace your keys. `wf:{workflow_id}:step:{step_id}` lets you trace which step a key belongs to.
Don't try to hand-roll dedup. Our cache is tighter than anything you could build at the application layer — same DB row, same provider call, same cost.

Cost

Cached idempotency hits are free — no upstream call, no token billing. They count toward your rate limit (so you can't use them as a free-DDoS vector), but they don't deduct from your spend.