Advisor
Consult a stronger model mid-generation as a server tool
Beta
Server tools are currently in beta. The API and behavior may change.
The openrouter:advisor server tool lets a model consult a higher-intelligence advisor model mid-generation. When your model hits a decision point — before committing to an approach, when it’s stuck, or before declaring a task done — it invokes the tool with a prompt. The advisor model thinks, returns its guidance as the tool result, and your model continues, informed by the advice.
Unlike a fixed model pairing, the advisor can be any OpenRouter model, and it can optionally run as a sub-agent with its own tools (for example openrouter:web_search). The tool returns the advisor model’s response directly as the tool result — your model writes the final answer.
You can offer the model a choice of several named advisors by including multiple openrouter:advisor entries in the tools array — one per advisor (see Multiple advisors). At most one entry may omit name to act as the default advisor.
Each advisor also remembers its own prior consultations across requests when you replay the conversation transcript (see Cross-request memory), and the tool is available on the Chat Completions, Responses, and Anthropic Messages APIs (see Anthropic Messages API).
Quick start
Choosing the advisor model
The advisor model is resolved with the following precedence:
parameters.modelon the tool definition, if set.- The
modelargument the executor passes in the tool call, if the definition does not fix one. - The model from the outer API request, as a fallback.
This lets you either pin the advisor model up front (parameters.model) or let the executing model pick it per call. The advisor tool itself can never be the advisor model.
When does the model invoke it?
The tool’s description steers the model to consult the advisor before substantive work, when it’s stuck, or before declaring a task done — not for trivial steps a single model can resolve directly. To force a consultation on every request, set tool_choice: "required" (with multiple advisors this forces the first entry — see Multiple advisors).
Parameters
Pass an optional parameters object on the tool entry:
Tool-call arguments
When invoking the tool, the model passes:
Multiple advisors
To offer the model a choice of advisors, include multiple openrouter:advisor entries in the tools array — one per advisor. Give each its own name (plus its own model, instructions, and the other advisor fields); the model sees one distinct tool per named advisor and calls whichever fits the task:
Rules for advisor entries:
- At most one entry may omit
name— it becomes the default advisor. Two or more unnamed advisor entries fail the request with a400: “Only one advisor tool can serve as the default. All other advisor tools must have a name defined.” - Names must be unique across entries (compared after trimming whitespace). A duplicate name fails the request with a
400. - Names allow letters, digits, spaces, underscores, and dashes (e.g.
"Lead Architect"), are trimmed, and must be 1–64 characters.
A single advisor is just one entry — name it, or leave name off to keep it as the default. Each advisor’s result reports the model it consulted, so you can tell the advisors apart in the response.
tool_choice and named advisors
Forcing the advisor with tool_choice (e.g. tool_choice: "required", or selecting the openrouter:advisor tool) targets the first advisor entry. Forcing a specific named advisor via tool_choice is not yet supported.
Cross-request memory
Each advisor remembers its own prior prompt → advice exchanges across API requests in a conversation. When you send a follow-up request that replays the prior transcript — assistant messages with their advisor tool calls and results included, as returned by the API — the advisor sees its earlier consultations replayed into its context before the new prompt. Tell the advisor a fact in one request, and it can recall it in the next without the executor restating it.
This works on all three APIs; the only requirement is that you replay the advisor exchanges you received:
- Chat Completions: include the assistant message’s advisor
tool_callsand the pairedrole: "tool"result messages from prior turns. - Responses API: include the
openrouter:advisoroutput items from prior responses ininput, unchanged. - Anthropic Messages API: include the assistant message’s advisor
server_tool_useandadvisor_tool_resultcontent blocks from prior turns.
Memory is per advisor: in a multi-advisor setup, each advisor recalls only its own prior exchanges — a “reviewer” advisor never sees what the “architect” was told. There is no fixed limit on the number of replayed exchanges; if the history exceeds the advisor model’s context window, it is compressed with the middle-out transform, which trims the middle of the conversation and keeps the oldest and newest exchanges.
Memory applies to prompt-mode consultations. With forward_transcript: true the advisor already sees the full parent conversation, so prior exchanges are not separately replayed.
Keep advisor entry order stable
Advisor identity is positional — derived from the entry’s index in the request tools array. Keep the order of advisor entries stable across the requests of a conversation (and echo the instance_name field on replayed Responses items unchanged). Reordering or inserting advisor entries between requests shifts identities, and each advisor reconstructs another’s memory.
Streaming advice
By default the advice arrives only once the advisor has finished — as a single tool result. Set parameters.stream to true to have the advice stream out incrementally as the advisor model produces it:
In the Responses API, the advisor’s output item then emits response.output_text.delta events as the advice is generated, followed by a response.output_text.done and the completed item. The completed item still carries the full advice string, so consumers that don’t read the deltas are unaffected. stream can be set per advisor entry, so you can stream some advisors and not others.
The streamed deltas mirror how a normal assistant message streams text — the item_id on each delta is the advisor output item’s id.
Streaming has no effect on the Chat Completions API (the advice arrives only as the final tool result regardless of stream). Streaming the advice in the Anthropic Messages API is a planned fast-follow; today a Messages request behaves as if stream were false.
What the tool returns
On success the tool result contains the advice text and the model that produced it:
On failure the result has status: "error" with a message; the calling model continues without the advice:
Anthropic Messages API
On /api/v1/messages, request the advisor with the native Anthropic tool shape — and it works with any executor model, not just Anthropic ones:
The response carries the advisor consultation as the official Anthropic block shapes — a server_tool_use block with name: "advisor" for the call, followed by an advisor_tool_result block with the advice:
Replay these blocks unchanged on the assistant message of follow-up requests for cross-request memory.
Notes on the native shape:
modelis the only advisor configuration the native shape carries. Forinstructions, sub-agenttools,forward_transcript, and the other parameters, use theopenrouter:advisorform on Chat Completions or Responses.max_usesis not honored: consultations are capped per request by OpenRouter’s fixed limit, and amax_usesbelow that limit does not lower it.caching,allowed_callers, anddefer_loadingare also ignored.- Forcing the advisor via
tool_choice: { "type": "tool", "name": "advisor" }is supported.
Sub-agent tools
When you pass tools, the advisor runs as an agentic sub-agent over them before producing its advice — for example, giving the advisor openrouter:web_search lets it ground its guidance in fresh sources. The advisor’s tool use happens inside the tool call; only its final text is returned to your model.
Recursion protection
The advisor tool cannot invoke itself. Two guards enforce this:
- A self-reference check rejects an advisor entry inside the advisor’s own
toolsarray (and rejects the advisor tool name as the advisormodel). - Each inner advisor call carries an
x-openrouter-advisor-depthheader; the advisor tool is stripped from any sub-call, so an advisor sub-agent can never re-enter the advisor.
Consultations are also capped per request to bound cost and latency.
Related
- Fusion server tool — multi-model deliberation
- Web Search server tool
- Web Fetch server tool