Idempotency
Safe retries with `Idempotency-Key`. 24-hour dedup window. Same key, same response.
Network failures happen. Timeouts happen. Mid-stream disconnects happen. Idempotency keys let you safely retry a request without paying for it twice or kicking off duplicate side effects — pass the header on any client, including the AI SDK's streamText, and our edge takes care of the rest.
How it works
Pass an Idempotency-Key header on any request. We hash the request body and key together, and:
- First request with that key: runs normally, response cached for 24 hours.
- Repeat request with the same key (within 24h): returns the cached response without hitting the upstream model.
- Different body, same key: returns `409 Conflict`. Keys are bound to the request body they were first seen with.
Result: safe automatic retries, no double-billing, no duplicate webhooks.
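The three cases above can be modeled in a few lines. This is a minimal in-memory sketch for intuition, not the edge implementation: the `IdempotencyCache` class, the `fingerprint` helper, and the toy FNV-1a hash are assumptions made for illustration. A real service would presumably use a cryptographic hash and shared storage.

```typescript
// Toy stand-in for "hash the request body" (FNV-1a, 32-bit). Illustration only.
function fingerprint(body: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < body.length; i++) {
    h ^= body.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

const TTL_MS = 24 * 60 * 60 * 1000; // the 24-hour dedup window

type Entry = { bodyHash: string; response: string; expiresAt: number };
type Outcome = { status: number; body: string; cached: boolean };

class IdempotencyCache {
  private entries = new Map<string, Entry>();

  // `run` stands for the upstream model call; `now` is a timestamp in milliseconds.
  handle(idempotencyKey: string, body: string, now: number, run: () => string): Outcome {
    const bodyHash = fingerprint(body);
    const hit = this.entries.get(idempotencyKey);

    if (hit && hit.expiresAt > now) {
      if (hit.bodyHash !== bodyHash) {
        // Same key, different body: the key is bound to the first body seen.
        return { status: 409, body: "IDEMPOTENCY_KEY_CONFLICT", cached: false };
      }
      // Same key, same body: replay the stored response, no upstream call.
      return { status: 200, body: hit.response, cached: true };
    }

    // First request with this key (or the old entry expired): run and store.
    const response = run();
    this.entries.set(idempotencyKey, { bodyHash, response, expiresAt: now + TTL_MS });
    return { status: 200, body: response, cached: false };
  }
}
```

Passing `now` in explicitly keeps the sketch deterministic, which makes the expiry case easy to exercise without waiting 24 hours.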
Usage
fetch / curl
curl https://synapse.garden/api/v1/chat/completions \
  -H "Authorization: Bearer $MG_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: req_2026-05-10_abc123def456" \
  -d '{
    "model": "openai/gpt-5.4",
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
OpenAI SDK
// `client` is an already-configured OpenAI SDK client.
const res = await client.chat.completions.create(
  { model: "openai/gpt-5.4", messages: [{ role: "user", content: "..." }] },
  { headers: { "Idempotency-Key": `req_${Date.now()}_${crypto.randomUUID()}` } },
)
Anthropic SDK
// `client` is an already-configured Anthropic SDK client.
const message = await client.messages.create(
  { model: "anthropic/claude-opus-4.6", max_tokens: 1024, messages: [...] },
  { headers: { "Idempotency-Key": `req_${Date.now()}_${crypto.randomUUID()}` } },
)
AI SDK
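import { generateText } from "ai"
import { createOpenAICompatible } from "@ai-sdk/openai-compatible"
// generateText takes no baseURL or apiKey, so they go on a provider instance.
// Assumption: the endpoint is used through the OpenAI-compatible provider package.
const synapse = createOpenAICompatible({
  name: "synapse",
  baseURL: "https://synapse.garden/api/v1",
  apiKey: process.env.MG_KEY,
})
await generateText({
  model: synapse("openai/gpt-5.4"),
  prompt: "...",
  headers: {
    "Idempotency-Key": `req_${Date.now()}_${crypto.randomUUID()}`,
  },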
})
Generating keys
A good key is unique per request, deterministic on retry. Patterns:
// Per request (generate once, then reuse the same value for every retry)
const requestKey = crypto.randomUUID()
// Per business event (when retrying a workflow step)
const stepKey = `workflow_${jobId}_step_${stepIndex}`
// Per user action
const actionKey = `user_${userId}_action_${actionId}`
Don't reuse the same key across different requests: if the bodies differ, you'll get 409 errors.
Do reuse the same key for retries of the same logical request. That's the whole point.
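The most common mistake is generating the key inside the retry loop, which gives every attempt a fresh key and defeats deduplication. The sketch below fixes the key before the first attempt and hands the same value to each try. `withIdempotentRetry`, its parameters, and the exponential backoff schedule are hypothetical choices for illustration, not part of any SDK.

```typescript
// Retries `send` up to `maxAttempts` times, passing the SAME key to every attempt.
// `send` performs one attempt and must put the key in the Idempotency-Key header.
async function withIdempotentRetry<T>(
  idempotencyKey: string,
  send: (idempotencyKey: string) => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await send(idempotencyKey);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Exponential backoff between attempts (assumed schedule: 200ms, 400ms, ...).
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

A caller would create the key once, for example with `crypto.randomUUID()`, and pass a `send` function that issues the request with that key in its headers. Keeping key creation outside the helper makes it easy to persist the key alongside a job record and reuse it after a process restart.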
TTL and storage
- 24 hours. After that, the key expires and a repeat request will run again as a fresh call.
- Per-key cache. Idempotency cache is keyed by `(api_key_id, idempotency_key)`, so two different API keys can use the same idempotency string without collision.
- Per-region. Idempotency cache is regional. Cross-region retries don't dedupe (rare in practice: the same request from the same client typically hits the same region).
When to use
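- Retries on network errors. If your client times out and you don't know whether the request went through, send the same idempotency key on retry. Within the 24-hour window and the same region, the retry either runs for the first time or returns the cached response; it never triggers a second billed call.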
- Webhook handlers. When you call us from a webhook (which can fire multiple times), use the webhook's event id as the idempotency key.
- Workflow orchestration. Each step of a durable workflow is a logical unit — use the step id.
- User-initiated actions. "Submit form" buttons that might double-click. Use a per-form-submission UUID.
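For the webhook case, the request body has to be as stable as the key: a redelivered event that produces a different body would hit the 409 conflict instead of the cache. The sketch below derives both from the event alone. The `WebhookEvent` shape, the `buildChatRequest` helper, and the `webhook_` key prefix are hypothetical; only the endpoint headers and model name come from the examples above.

```typescript
// Hypothetical webhook event shape; real providers differ.
interface WebhookEvent {
  id: string;
  type: string;
  payload: unknown;
}

// Builds fetch-style request options whose key AND body depend only on the event,
// so a redelivery of the same event yields a byte-identical request.
function buildChatRequest(event: WebhookEvent, apiKey: string) {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
      // The event id is stable across redeliveries, so it works as the key.
      "Idempotency-Key": `webhook_${event.id}`,
    },
    // No timestamps or random values here: they would change the body on redelivery.
    body: JSON.stringify({
      model: "openai/gpt-5.4",
      messages: [
        { role: "user", content: `Summarize this ${event.type} event: ${JSON.stringify(event.payload)}` },
      ],
    }),
  };
}
```

The same idea applies to workflow steps and form submissions: pick an identifier that already exists for the logical action, and build the body only from data tied to that identifier.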
When NOT to use
- Streaming requests. Idempotency cache only stores final responses, not intermediate stream events. If a streamed response is cached and you retry, you get the full final body in one shot, not re-streamed deltas. (Most clients handle this fine, but it's a behavior change.)
- Always-fresh content. If you genuinely want a different response each time (e.g. random sampling), don't pass an idempotency key.
Errors
| HTTP | When |
|---|---|
| `409 IDEMPOTENCY_KEY_CONFLICT` | Same key used with a different body |
| `429 RATE_LIMITED` | Cached idempotency hits count toward your rate limit (the cache hit returns fast, but it still counts) |
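The two errors call for opposite reactions: a 409 means the request itself is wrong and retrying will not help, while a 429 is worth retrying later with the same key. Here is one way a client might encode that decision. The `decideAfterResponse` function, the `RetryDecision` type, and the backoff numbers are assumptions for illustration; other status codes are out of scope here and left to the general retry strategy.

```typescript
type RetryDecision =
  | { action: "return" }
  | { action: "retry-same-key"; delayMs: number }
  | { action: "fail"; reason: string };

// `attempt` is 1 for the first try. Backoff values are assumed, not prescribed.
function decideAfterResponse(status: number, attempt: number): RetryDecision {
  if (status >= 200 && status < 300) {
    return { action: "return" };
  }
  if (status === 409) {
    // The key was first seen with a different body. Retrying the same
    // request would conflict again, so surface this as a client bug.
    return { action: "fail", reason: "IDEMPOTENCY_KEY_CONFLICT: key reused with a different body" };
  }
  if (status === 429) {
    // Rate limited. Keep the same key so the eventual retry still dedupes.
    const delayMs = Math.min(1000 * 2 ** (attempt - 1), 30_000);
    return { action: "retry-same-key", delayMs };
  }
  return { action: "fail", reason: `unhandled status ${status}` };
}
```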
Best practices
- Always pass an idempotency key from servers. It's free safety. Browsers can skip it for one-shot UX requests.
- Make keys long and unique. 32+ random chars. Don't use predictable values like timestamps alone.
- Log the key with your request. Useful for debugging "did my retry actually dedupe?".
- For long workflows, namespace your keys. `wf:{workflow_id}:step:{step_id}` lets you trace which step a key belongs to.
- Don't try to hand-roll dedup. Our cache is tighter than anything you could build at the application layer: a retry resolves to the same DB row, the same provider call, and the same cost.
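A namespaced key is most useful when it can be turned back into its parts while reading logs. The pair of helpers below is a small sketch of that round trip for the `wf:{workflow_id}:step:{step_id}` pattern. The function names are hypothetical, and rejecting ids that contain `:` is an assumption made so that parsing stays unambiguous.

```typescript
// Builds a key in the form wf:{workflow_id}:step:{step_id}.
function workflowStepKey(workflowId: string, stepId: string): string {
  if (workflowId === "" || stepId === "" || workflowId.includes(":") || stepId.includes(":")) {
    throw new Error("ids must be non-empty and must not contain ':'");
  }
  return `wf:${workflowId}:step:${stepId}`;
}

// Recovers the ids from a key, or returns null if the key is not in that form.
function parseWorkflowStepKey(key: string): { workflowId: string; stepId: string } | null {
  const match = /^wf:([^:]+):step:([^:]+)$/.exec(key);
  if (!match) return null;
  return { workflowId: match[1], stepId: match[2] };
}
```

Logging the key next to each request then lets a log search on one workflow id show every step and every retry of that step.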
Cost
Cached idempotency hits are free — no upstream call, no token billing. They count toward your rate limit (so you can't use them as a free-DDoS vector), but they don't deduct from your spend.
See also
- Errors & retries — full retry strategy
- Rate limits — RPM / TPM caps