Web search

Models that ground their answers in live web results. Citations, query budgets, when to use it.

FIG. 00 · WEB SEARCH · QUERY ⇄ CITATIONS

Some models can search the web during generation, ground their answers in live results, and return citations. Useful for "what happened today," news summarization, fact-checking, and any prompt where the model's training cutoff is the wrong answer. With the AI SDK, models without native search can reach the web through user-defined tools backed by search APIs like Tavily or Exa.

FIG. 01 · GROUND-AND-ANSWER SCHEMATIC
The model emits a search query, the upstream provider (or your tool) fetches results, and the answer is generated grounded in those snippets. Citations land in `providerMetadata` so you can render them next to the answer.
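
The loop is easy to see without any SDK. A minimal sketch, where `searchWeb` and `buildGroundedPrompt` are hypothetical stand-ins, not part of any provider API:

```typescript
// A provider-agnostic sketch of the ground-and-answer loop.
// `searchWeb` is a stand-in for your search tool or the provider's
// built-in search; nothing here is a real API.
type Snippet = { url: string; title: string; text: string }

async function searchWeb(query: string): Promise<Snippet[]> {
  // stand-in: a real implementation calls Tavily, Exa, etc.
  return [{ url: "https://example.com", title: "Example", text: `snippet about ${query}` }]
}

// Number each source so the model can cite them as [1], [2], ...
function buildGroundedPrompt(question: string, snippets: Snippet[]): string {
  const sources = snippets
    .map((s, i) => `[${i + 1}] ${s.title} (${s.url})\n${s.text}`)
    .join("\n\n")
  return `Answer using only the sources below. Cite them as [n].\n\n${sources}\n\nQuestion: ${question}`
}

async function groundAndAnswer(question: string) {
  const snippets = await searchWeb(question)             // 1. emit a query
  const prompt = buildGroundedPrompt(question, snippets) // 2. ground in results
  // 3. send `prompt` to the model; keep the snippets as citations
  return { prompt, citations: snippets.map(({ url, title }) => ({ url, title })) }
}
```

In practice, steps 1 and 3 are fused inside the provider; the SDK examples below show that fused form.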

Filter the catalog by the Web search capability on /models. Top picks:

| Model | Search style |
| --- | --- |
| `xai/grok-4` | Auto-enabled; integrated with X / web |
| `perplexity/sonar-pro` | Citation-first; web-grounded by design |
| `perplexity/sonar` | Cheaper Perplexity tier |
| `anthropic/claude-opus-4.6` | Web search beta; explicit opt-in per request |
| `google/gemini-2.5-pro` | Google grounding; uses Search results |
| `openai/gpt-5.4` | Web search via tool use |

With xAI Grok (auto-enabled)

Grok decides on its own whether to search. No flag needed:

import { streamText } from "ai"

const result = streamText({
  model: "xai/grok-4",
  baseURL: "https://synapse.garden/api/v1",
  apiKey: process.env.MG_KEY,
  prompt: "What's the latest news about open-source LLMs?",
})

for await (const part of result.textStream) {
  process.stdout.write(part)
}

With Perplexity (always grounded)

Perplexity Sonar is search-first — every answer is grounded in retrieved web pages with explicit citations:

const result = streamText({
  model: "perplexity/sonar-pro",
  prompt: "Summarize recent advances in vector databases.",
  providerOptions: {
    perplexity: { searchMode: "web" }, // "web" | "academic"
  },
})

const final = await result.text
const meta = await result.providerMetadata
console.log(meta?.perplexity?.citations)
// [{ url: "https://...", title: "..." }, ...]

With Anthropic Claude (beta)

Claude's web search is a separate tool:

const result = streamText({
  model: "anthropic/claude-opus-4.6",
  prompt: "What new AI papers were published this week?",
  providerOptions: {
    anthropic: {
      tools: [
        { type: "web_search_20250305", name: "web_search", max_uses: 3 },
      ],
    },
  },
})

`max_uses` caps how many search calls Claude can make per prompt — a defense against runaway exploration.

With Google Gemini (grounding)

Gemini's "grounding" is opt-in:

const result = streamText({
  model: "google/gemini-2.5-pro",
  prompt: "What's the population of Tokyo today?",
  providerOptions: {
    google: { useSearchGrounding: true },
  },
})

The response carries grounding chunks in `providerMetadata.google.groundingMetadata`.

With OpenAI (tool-use pattern)

OpenAI doesn't have native web search yet — use a tool with a search API of your choice:

import { tool } from "ai"
import { z } from "zod"

streamText({
  model: "openai/gpt-5.4",
  prompt: "What's the latest in quantum computing?",
  tools: {
    webSearch: tool({
      description: "Search the web and return top 5 results",
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => {
        // Tavily's search endpoint takes a POST with a JSON body
        const res = await fetch("https://api.tavily.com/search", {
          method: "POST",
          headers: {
            Authorization: `Bearer ${process.env.TAVILY_KEY}`,
            "Content-Type": "application/json",
          },
          body: JSON.stringify({ query, max_results: 5 }),
        })
        return await res.json()
      },
    }),
  },
  maxSteps: 3,
})

This pattern works with any model that supports tools — bring your own search API (Tavily, Exa, Brave, SerpAPI, etc.).
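
When you own the tool, you also own the query budget. A sketch of a client-side cap mirroring Claude's `max_uses`, using a hypothetical `withBudget` wrapper:

```typescript
// Sketch: cap tool invocations per request, mirroring `max_uses`.
// `withBudget` is a hypothetical helper, not part of the AI SDK.
function withBudget<A, R>(
  max: number,
  fn: (args: A) => Promise<R>,
  exhausted: R, // returned once the budget is spent
): (args: A) => Promise<R> {
  let used = 0
  return async (args) => {
    if (used >= max) return exhausted
    used += 1
    return fn(args)
  }
}

// Wrap the search tool's execute before passing it to `tool({ ... })`
const budgetedSearch = withBudget(
  3,
  async ({ query }: { query: string }) => [`result for ${query}`], // stand-in search
  [] as string[],
)
```

Because the counter lives in the closure, the cap applies per request as long as you build the tool fresh for each one.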

Citations

Models that ground in web results return citations in their `providerMetadata`:

const meta = await result.providerMetadata

// Perplexity:
meta?.perplexity?.citations
// [{ url, title, snippet }, ...]

// Claude web_search tool:
meta?.anthropic?.toolUses
// [{ type: "web_search", input: { query }, results: [...] }, ...]

// Gemini grounding:
meta?.google?.groundingMetadata?.groundingChunks
// [{ web: { uri, title } }, ...]

For citation-rich UX, render the citations alongside the answer with hover/click-through to the source URL.
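
Since each provider nests citations differently, it helps to flatten them before rendering. A sketch, assuming the metadata shapes shown above (verify against your provider's actual payload):

```typescript
// Sketch: flatten the three provider citation shapes shown above into
// one { url, title } list. Field names follow the examples in this
// doc; treat them as assumptions, not a stable contract.
type Citation = { url: string; title: string }

function normalizeCitations(meta: any): Citation[] {
  if (meta?.perplexity?.citations) {
    return meta.perplexity.citations.map((c: any) => ({ url: c.url, title: c.title ?? c.url }))
  }
  if (meta?.google?.groundingMetadata?.groundingChunks) {
    return meta.google.groundingMetadata.groundingChunks.map((c: any) => ({
      url: c.web.uri,
      title: c.web.title ?? c.web.uri,
    }))
  }
  if (meta?.anthropic?.toolUses) {
    return meta.anthropic.toolUses
      .filter((t: any) => t.type === "web_search")
      .flatMap((t: any) => t.results ?? [])
      .map((r: any) => ({ url: r.url, title: r.title ?? r.url }))
  }
  return []
}

// Render as markdown footnotes next to the answer
const footnotes = (cites: Citation[]) =>
  cites.map((c, i) => `[${i + 1}]: [${c.title}](${c.url})`).join("\n")
```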

Pricing

Web search is billed per search query, not per token. Browse /models for live rates. Typical:

  • Perplexity Sonar — built into the per-token rate; no separate query fee
  • Anthropic web search beta — ~$10 per 1,000 queries
  • Google grounding — ~$35 per 1,000 grounded responses
  • xAI Grok — auto-included; small surcharge for heavy search usage

Caveats

  • Stale-by-a-day. Even "live" web search isn't real-time — the underlying search index typically lags by hours. For minute-by-minute data (stock prices, sports scores), use a dedicated API tool.
  • Hallucinated citations. Models sometimes fabricate URLs or misattribute quotes. Always render citations as links and let the user click through.
  • Geographic bias. Search results lean toward English / US sources unless you specify a region. Pass country / lang hints in your tool call when relevant.
  • Rate limits. Web search has separate quotas per provider — heavy use can throttle you independently of token rate limits.
  • Privacy. Your prompt is sent to the upstream search engine. If your prompt contains PII, redact before searching.
When not to use it

  • Facts from before the training cutoff — the model already knows them; adding search wastes tokens and adds latency.
  • Reasoning-heavy questions — grounding doesn't help with "explain why" or "design a system." Use a reasoning model instead.
  • Internal data — RAG over your own corpus (see Embeddings) is faster, cheaper, and more accurate than asking a model to "search the web for our company docs."
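
For the privacy caveat above, a light redaction pass before the query leaves your process helps. A coarse sketch (two regexes; real PII handling needs much more than this):

```typescript
// Sketch: strip obvious PII from a search query before it reaches the
// upstream search engine. These two patterns (emails, long digit runs)
// are deliberately coarse; use a proper PII detector in production.
function redactForSearch(query: string): string {
  return query
    .replace(/[\w.+-]+@[\w-]+\.[\w.]+/g, "[email]")
    .replace(/\b\d{7,}\b/g, "[number]")
}
```

Call it inside your tool's `execute`, on the model-emitted query, so redaction applies even when the model writes the search string itself.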