API Reference
POST /v1/rerank
Cross-encoder reranking. Pass a query and candidates, get back relevance-ordered results with calibrated scores.
FIG. 00 · POST /v1/rerank · query × docs → scores
/v1/rerank is the second stage of a two-stage retrieval pipeline. You send a query and up to 1 000 documents; it returns each document with a calibrated relevance score (0–1), ordered by relevance and optionally truncated to the top_n highest-scoring results. The AI SDK's embed gives you the first stage (candidate retrieval); Synapse Garden gives you the second stage on the same key.
FIG. 01 · Rank by score
Request
curl https://synapse.garden/api/v1/rerank \
  -H "Authorization: Bearer $MG_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cohere/rerank-english-v3.0",
    "query": "How do I rotate an API key?",
    "documents": [
      "Spend caps return HTTP 402 when exceeded.",
      "API keys can be rotated under Keys → Rotate.",
      "Synapse Garden proxies 100+ language models."
    ],
    "top_n": 2
  }'
Body schema
| Field | Type | Required | Notes |
|---|---|---|---|
| model | string | yes | provider/model-id, e.g. cohere/rerank-english-v3.0 |
| query | string | yes | The search intent. Max 8 000 chars. |
| documents | string[] | yes | 1–1 000 candidate strings. Results come back in relevance order; results[].index maps each one to its position in this array. |
| top_n | integer | no | Truncate to the top n results. Defaults to all. Range 1–1 000. |
| user | string | no | Caller-defined identifier (passes through). Max 256 chars. |
| providerOptions | object | no | Provider-namespaced overrides (e.g. { cohere: { return_documents: true } }). |
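The limits in the table can be checked client-side before a request is sent, which avoids a round trip that would end in a 400. The following is a minimal sketch, not part of any official SDK: `build_rerank_body` and the constant names are hypothetical, and it assumes "chars" corresponds to Python's `len()` on a string.

```python
from typing import Any, Optional

# Limits copied from the body schema table; the names are illustrative.
MAX_QUERY_CHARS = 8_000
MAX_DOCUMENTS = 1_000
MAX_USER_CHARS = 256


def build_rerank_body(
    model: str,
    query: str,
    documents: list[str],
    top_n: Optional[int] = None,
    user: Optional[str] = None,
    provider_options: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Return a JSON-serializable body; raise ValueError on a limit violation."""
    # Assumption: the 8 000-char limit is measured the way len() counts.
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError(f"query exceeds {MAX_QUERY_CHARS} chars")
    if not 1 <= len(documents) <= MAX_DOCUMENTS:
        raise ValueError(f"documents must contain 1 to {MAX_DOCUMENTS} items")

    body: dict[str, Any] = {
        "model": model,
        "query": query,
        "documents": list(documents),
    }
    # Optional fields are only included when the caller sets them.
    if top_n is not None:
        if not 1 <= top_n <= MAX_DOCUMENTS:
            raise ValueError(f"top_n must be in 1..{MAX_DOCUMENTS}")
        body["top_n"] = top_n
    if user is not None:
        if len(user) > MAX_USER_CHARS:
            raise ValueError(f"user exceeds {MAX_USER_CHARS} chars")
        body["user"] = user
    if provider_options is not None:
        body["providerOptions"] = provider_options
    return body
```

The server remains the source of truth for validation; this check only catches the documented limits early.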
Headers
| Header | Required | Notes |
|---|---|---|
| Authorization: Bearer mg_live_* | yes | Production key. Sandbox keys use mg_test_*. |
| x-mg-idempotency-key | no | ULID/UUID. Replays for 24 h. |
| x-mg-trace-id | no | Surfaces in OTEL spans + logs for correlation. |
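As a rough illustration of how these headers fit together, the sketch below assembles them and posts a body using only the Python standard library. The helper names (`build_headers`, `new_idempotency_key`, `post_rerank`) are hypothetical, and the 30-second timeout is an arbitrary choice rather than a documented value.

```python
import json
import urllib.request
import uuid
from typing import Any, Optional

# Endpoint taken from the curl example in the Request section.
RERANK_URL = "https://synapse.garden/api/v1/rerank"


def new_idempotency_key() -> str:
    """Generate a UUID; the header table lists ULID/UUID as accepted formats."""
    return str(uuid.uuid4())


def build_headers(
    api_key: str,
    idempotency_key: Optional[str] = None,
    trace_id: Optional[str] = None,
) -> dict[str, str]:
    """Assemble request headers; the two x-mg-* headers are optional."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if idempotency_key is not None:
        headers["x-mg-idempotency-key"] = idempotency_key
    if trace_id is not None:
        headers["x-mg-trace-id"] = trace_id
    return headers


def post_rerank(body: dict[str, Any], headers: dict[str, str],
                timeout: float = 30.0) -> dict[str, Any]:
    """POST the body and return the parsed JSON response.

    urlopen raises urllib.error.HTTPError for 4xx/5xx statuses.
    """
    request = urllib.request.Request(
        RERANK_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A reasonable pattern is to generate the idempotency key once per logical request and reuse it on every retry of that request, so a retry after a timeout does not count as a new call.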
Response
{
  "id": "rrk_01J9Z...",
  "model": "cohere/rerank-english-v3.0",
  "results": [
    { "index": 1, "relevance_score": 0.927, "document": "API keys can be rotated under Keys → Rotate." },
    { "index": 0, "relevance_score": 0.131, "document": "Spend caps return HTTP 402 when exceeded." }
  ],
  "usage": { "search_units": 1 }
}
| Field | Type | Notes |
|---|---|---|
| id | string | Server-assigned, useful for support tickets. |
| model | string | Echoes the requested model. |
| results[].index | integer | Position of this document in the original documents array. |
| results[].relevance_score | number | Calibrated probability (0–1). |
| results[].document | string | Echoes the document text (omitted if you set providerOptions.cohere.return_documents: false). |
| usage.search_units | integer | Always 1 per request: reranking is billed per query, not per token or per document. |
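Because results[].index points back into the request's documents array, a client can recover the document text itself even when the echo is turned off. The sketch below shows one way to do that; both function names are hypothetical and the response shape is taken from the example above.

```python
from typing import Any, Optional


def ranked_documents(
    response: dict[str, Any], documents: list[str]
) -> list[tuple[str, float]]:
    """Pair each result with its source text, keeping the response order."""
    return [
        (documents[result["index"]], result["relevance_score"])
        for result in response["results"]
    ]


def scores_in_input_order(
    response: dict[str, Any], n_documents: int
) -> list[Optional[float]]:
    """Return scores aligned with the request's documents array.

    Positions cut off by top_n stay None.
    """
    scores: list[Optional[float]] = [None] * n_documents
    for result in response["results"]:
        scores[result["index"]] = result["relevance_score"]
    return scores
```

With the example request (three documents, top_n of 2), `scores_in_input_order` would return `[0.131, 0.927, None]`: the third document was dropped by the truncation, so it has no score.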
Errors
Same envelope as the rest of /v1/*:
{ "error": { "code": "BAD_REQUEST", "message": "documents must contain at least 1 item", "type": "invalid_request_error" } }
| Status | error.code | When |
|---|---|---|
| 400 | BAD_REQUEST | Body fails Zod validation. |
| 401 | UNAUTHORIZED | Missing/invalid Authorization header. |
| 402 | BUDGET_EXCEEDED | Project spend cap reached. |
| 403 | MODEL_NOT_ALLOWED | Model is not on this project's allowlist. |
| 429 | RATE_LIMITED | Per-key RPM/TPM exceeded. |
| 5xx | UPSTREAM_ERROR | Provider failed; retry with backoff. |
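The table suggests a simple client policy: retry 5xx with backoff and surface everything else to the caller. Here is a minimal sketch of that policy. It also retries 429, which is an assumption on my part, since the table only prescribes retries for 5xx. `RerankError`, `call_with_retry`, and the `send` callable (returning a status code and a parsed JSON body) are hypothetical names, and the delays are illustrative.

```python
import time
from typing import Any, Callable


class RerankError(Exception):
    """Carries the fields of the documented error envelope."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def is_retryable(status: int) -> bool:
    # 5xx is retryable per the table; treating 429 the same is an assumption.
    return status == 429 or 500 <= status <= 599


def call_with_retry(
    send: Callable[[], tuple[int, dict[str, Any]]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Call send() until it succeeds, fails permanently, or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, payload = send()
        if 200 <= status < 300:
            return payload
        error = payload.get("error", {})
        last_attempt = attempt == max_attempts - 1
        if last_attempt or not is_retryable(status):
            raise RerankError(
                status, error.get("code", "UNKNOWN"), error.get("message", "")
            )
        # Exponential backoff: base_delay, then 2x, 4x, ...
        sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Production code would typically add random jitter to the delay so that many clients do not retry in lockstep; it is left out here to keep the behavior deterministic.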
Limits
- 1 000 documents per request. Split larger candidate sets across multiple requests.
- 8 000-char query.
- Documents longer than the model's per-doc input window are truncated upstream; chunk first if precision matters.
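For candidate sets above the per-request limit, one approach is to rerank in batches and merge the results client-side. The sketch below assumes that calibrated scores are comparable across requests made with the same query and model, which the reference does not state explicitly. `chunk_documents` and `merge_results` are hypothetical helpers.

```python
from typing import Any, Iterator, Optional

MAX_DOCUMENTS = 1_000  # per-request limit from the list above


def chunk_documents(
    documents: list[str], size: int = MAX_DOCUMENTS
) -> Iterator[tuple[int, list[str]]]:
    """Yield (offset, batch) pairs; offset is the batch's start in the full list."""
    for offset in range(0, len(documents), size):
        yield offset, documents[offset:offset + size]


def merge_results(
    batches: list[tuple[int, list[dict[str, Any]]]],
    top_n: Optional[int] = None,
) -> list[dict[str, Any]]:
    """Combine per-batch results into one list sorted by score, highest first.

    Each result's index is shifted by its batch offset so that it refers to
    the position in the full, unchunked document list.
    """
    merged = [
        {**result, "index": offset + result["index"]}
        for offset, results in batches
        for result in results
    ]
    merged.sort(key=lambda result: result["relevance_score"], reverse=True)
    return merged if top_n is None else merged[:top_n]
```

Since usage.search_units is 1 per request, a 2 500-document candidate set split this way would be billed as three requests.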