POST /v1/rerank

Cross-encoder reranking. Pass a query and candidates, get back relevance-ordered results with calibrated scores.

FIG. 00 · POST /v1/rerank · query × docs → scores

/v1/rerank is the second stage of a two-stage retrieval pipeline. You send a query and up to 1 000 documents; it returns each document with a calibrated relevance score (0–1), sorted by score and optionally truncated to the `top_n` highest-scoring results. The AI SDK's embed gives you the first stage; Synapse Garden gives you the second stage on the same key.

FIG. 01 · Rank by score (schematic)

The cross-encoder reads `(query, doc_i)` pairs and emits a calibrated score per pair. Documents are reordered by score and truncated to `top_n` (or the full list if omitted). One request = one billed query, regardless of document count.
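
The reorder-and-truncate step described above can be sketched locally. This is only an illustration of the ordering logic, not the service's implementation; `rank_by_score` is a hypothetical helper and the scores are made-up placeholders rather than model output.

```python
from typing import Optional


def rank_by_score(scores: list[float], top_n: Optional[int] = None) -> list[dict]:
    """Order document positions by descending score, then keep the first top_n."""
    # Pair each score with its original position so results can be traced back.
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    return [{"index": i, "relevance_score": s} for i, s in ranked]


# Placeholder scores for three documents; omit top_n to keep the full list.
print(rank_by_score([0.131, 0.927, 0.044], top_n=2))
```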

Request

curl https://synapse.garden/api/v1/rerank \
  -H "Authorization: Bearer $MG_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "cohere/rerank-english-v3.0",
    "query": "How do I rotate an API key?",
    "documents": [
      "Spend caps return HTTP 402 when exceeded.",
      "API keys can be rotated under Keys → Rotate.",
      "Synapse Garden proxies 100+ language models."
    ],
    "top_n": 2
  }'
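
For callers not using curl, the same request could be built with Python's standard library. This is a minimal sketch that mirrors the curl example above; `build_rerank_request` and `rerank` are hypothetical helper names, and the 30-second timeout is an arbitrary assumption.

```python
import json
import urllib.request

RERANK_URL = "https://synapse.garden/api/v1/rerank"  # taken from the curl example


def build_rerank_request(api_key, model, query, documents, top_n=None):
    """Assemble the POST request without sending it."""
    body = {"model": model, "query": query, "documents": documents}
    if top_n is not None:  # top_n is optional; omit it to get every document back
        body["top_n"] = top_n
    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def rerank(api_key, model, query, documents, top_n=None):
    """Send the request and decode the JSON response (timeout is an assumption)."""
    req = build_rerank_request(api_key, model, query, documents, top_n)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


# Usage, with the key read from the environment as in the curl example:
#   rerank(os.environ["MG_KEY"], "cohere/rerank-english-v3.0",
#          "How do I rotate an API key?", docs, top_n=2)
```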

Body schema

Field	Type	Required	Notes
`model`	string	yes	`provider/model-id`, e.g. `cohere/rerank-english-v3.0`
`query`	string	yes	The search intent. Max 8 000 chars.
`documents`	string[]	yes	1–1 000 candidate strings. Results come back sorted by score; use `results[].index` to map each result to its position in this array.
`top_n`	integer	no	Return only the `top_n` highest-scoring results. Defaults to all. Range 1–1 000.
`user`	string	no	Caller-defined identifier (passes through). Max 256 chars.
`providerOptions`	object	no	Provider-namespaced overrides (e.g. `{ cohere: { return_documents: true } }`).
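
The constraints in the table above can be checked client-side before a request is sent, which avoids a round trip that would end in a 400. The sketch below is a hypothetical pre-flight check written from the table alone; the server's own validation may apply rules that are not listed here.

```python
MAX_QUERY_CHARS = 8000
MAX_DOCUMENTS = 1000
MAX_USER_CHARS = 256


def validate_rerank_body(body: dict) -> list[str]:
    """Return a list of problems; an empty list means the body passes these checks."""
    problems = []
    model = body.get("model")
    if not isinstance(model, str) or "/" not in model:
        problems.append("model: expected a 'provider/model-id' string")
    query = body.get("query")
    if not isinstance(query, str):
        problems.append("query: required string")
    elif len(query) > MAX_QUERY_CHARS:
        problems.append("query: longer than 8 000 chars")
    docs = body.get("documents")
    if not isinstance(docs, list) or not all(isinstance(d, str) for d in docs):
        problems.append("documents: expected a list of strings")
    elif not 1 <= len(docs) <= MAX_DOCUMENTS:
        problems.append("documents: expected 1 to 1 000 items")
    top_n = body.get("top_n")
    # bool is a subclass of int in Python, so reject it explicitly.
    if top_n is not None and (
        isinstance(top_n, bool)
        or not isinstance(top_n, int)
        or not 1 <= top_n <= MAX_DOCUMENTS
    ):
        problems.append("top_n: expected an integer from 1 to 1 000")
    user = body.get("user")
    if user is not None and (not isinstance(user, str) or len(user) > MAX_USER_CHARS):
        problems.append("user: expected a string of at most 256 chars")
    options = body.get("providerOptions")
    if options is not None and not isinstance(options, dict):
        problems.append("providerOptions: expected an object")
    return problems
```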

Headers

Header	Required	Notes
`Authorization: Bearer mg_live_*`	yes	Production key. Sandbox keys use `mg_test_*`.
`x-mg-idempotency-key`	no	ULID or UUID. A repeated request with the same key is replayed for 24 h.
`x-mg-trace-id`	no	Surfaces in OTEL spans + logs for correlation.
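
One way to assemble these headers is sketched below. `build_headers` and `new_idempotency_key` are hypothetical helpers, and the prefix check assumes every key starts with `mg_live_` or `mg_test_`, as the table suggests. The idempotency key is generated once per logical operation and reused on every retry of it, so a retried request can be recognized as a repeat.

```python
import uuid
from typing import Optional


def new_idempotency_key() -> str:
    """Generate a UUID; create it once per operation and reuse it across retries."""
    return str(uuid.uuid4())


def build_headers(
    api_key: str,
    idempotency_key: Optional[str] = None,
    trace_id: Optional[str] = None,
) -> dict:
    # Assumption: keys always carry one of the two prefixes shown in the table.
    if not api_key.startswith(("mg_live_", "mg_test_")):
        raise ValueError("expected a key starting with mg_live_ or mg_test_")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # Both x-mg-* headers are optional, so only send the ones that were supplied.
    if idempotency_key is not None:
        headers["x-mg-idempotency-key"] = idempotency_key
    if trace_id is not None:
        headers["x-mg-trace-id"] = trace_id
    return headers
```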

Response

{
  "id": "rrk_01J9Z...",
  "model": "cohere/rerank-english-v3.0",
  "results": [
    { "index": 1, "relevance_score": 0.927, "document": "API keys can be rotated under Keys → Rotate." },
    { "index": 0, "relevance_score": 0.131, "document": "Spend caps return HTTP 402 when exceeded." }
  ],
  "usage": { "search_units": 1 }
}

Field	Type	Notes
`id`	string	Server-assigned, useful for support tickets.
`model`	string	Echoes the requested model.
`results[].index`	integer	Position of this document in the original `documents` array.
`results[].relevance_score`	number	Calibrated probability (`0`–`1`).
`results[].document`	string	Echoes the document text (omitted if you set `providerOptions.cohere.return_documents: false`).
`usage.search_units`	integer	Always `1` per request — reranking is billed per query, not per token or per document.
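
Because `results[].index` points back into the original `documents` array, results can be joined to whatever metadata the caller keeps alongside each document, even when the document text is omitted from the response. A minimal sketch follows; `join_results`, the `records` list, and the `doc_id` values are hypothetical.

```python
def join_results(results: list[dict], records: list[dict]) -> list[dict]:
    """Attach each result's score to the caller's record at the same position."""
    joined = []
    for result in results:
        record = records[result["index"]]  # index refers to the request's documents
        joined.append({**record, "relevance_score": result["relevance_score"]})
    return joined


# Records in the same order as the `documents` array that was sent.
records = [
    {"doc_id": "kb-1", "text": "Spend caps return HTTP 402 when exceeded."},
    {"doc_id": "kb-2", "text": "API keys can be rotated under Keys → Rotate."},
    {"doc_id": "kb-3", "text": "Synapse Garden proxies 100+ language models."},
]
# The `results` array from the sample response, without the document text.
results = [
    {"index": 1, "relevance_score": 0.927},
    {"index": 0, "relevance_score": 0.131},
]

top = join_results(results, records)
print(top[0]["doc_id"])  # kb-2
```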

Errors

Same envelope as the rest of /v1/*:

{ "error": { "code": "BAD_REQUEST", "message": "documents must contain at least 1 item", "type": "invalid_request_error" } }

Status	`error.code`	When
400	`BAD_REQUEST`	Body fails Zod validation.
401	`UNAUTHORIZED`	Missing/invalid `Authorization` header.
402	`BUDGET_EXCEEDED`	Project spend cap reached.
403	`MODEL_NOT_ALLOWED`	Model is not on this project's allowlist.
429	`RATE_LIMITED`	Per-key RPM/TPM exceeded.
5xx	`UPSTREAM_ERROR`	Provider failed; retry with backoff.
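
A retry loop built on this table might look like the sketch below. It retries 5xx responses with exponential backoff and jitter, as the table recommends. Treating 429 as retryable is an assumption that the table does not state. `send` is a hypothetical callable returning `(status, parsed_body)`, and `RerankError` is an illustrative exception type.

```python
import random
import time


class RerankError(Exception):
    """Carries the HTTP status plus the code and message from the error envelope."""

    def __init__(self, status, code, message):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code


def is_retryable(status: int) -> bool:
    # 5xx per the table; 429 is an assumption on top of it.
    return status == 429 or 500 <= status <= 599


def call_with_backoff(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status < 400:
            return body
        last_attempt = attempt == max_attempts - 1
        if not is_retryable(status) or last_attempt:
            error = body.get("error", {})
            raise RerankError(status, error.get("code"), error.get("message"))
        # Exponential backoff with jitter: roughly 0.5 s, 1 s, 2 s, ...
        sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
```

Errors such as 400, 401, 402, and 403 are raised immediately, since repeating the same request would not change the outcome.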

Limits

1 000 documents per request. Split larger candidate sets across multiple requests.
8 000-char query.
Documents longer than the model's per-doc input window are truncated upstream; chunk first if precision matters.
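
Candidate sets above the 1 000-document limit could be handled by batching, as in the sketch below. `rerank_batch` is a hypothetical callable that sends one request and returns its `results` list. Merging across batches assumes the calibrated scores are comparable between requests, which this page implies but does not guarantee. Each batch is a separate request and is therefore billed separately.

```python
from typing import Callable, Optional

MAX_DOCS_PER_REQUEST = 1000


def rerank_in_batches(
    documents: list[str],
    rerank_batch: Callable[[list[str]], list[dict]],
    top_n: Optional[int] = None,
    batch_size: int = MAX_DOCS_PER_REQUEST,
) -> list[dict]:
    """Rerank any number of documents by splitting them across requests."""
    merged = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        for result in rerank_batch(batch):
            # Shift the batch-local index back into the full list's numbering.
            merged.append({**result, "index": result["index"] + start})
    # Assumption: scores from different requests can be compared directly.
    merged.sort(key=lambda r: r["relevance_score"], reverse=True)
    return merged if top_n is None else merged[:top_n]
```

Apply `top_n` only after merging. Truncating inside each batch could drop documents that would rank highly overall.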