Research API Overview
What is the Research API?
The Research API returns grounded, natural language answers to questions of varying complexity.
It runs multiple searches, processes the results, cross-references sources, and synthesizes everything into a thorough, Markdown-formatted answer with inline citations.
When you need a typed response, you can also get structured JSON by defining an output_schema.
Ask a hard question, get a researched answer with sources.
How it’s different from Search
The Search API and the Research API serve different purposes and deliver different outputs:
Use the Search API when you want raw results to feed into your own pipeline. Use the Research API when you want a ready-to-use answer backed by sources.
How it works
Research operates as an agentic system that autonomously plans and executes a multi-step research strategy for your question.
Search, Contents, and Live News as retrieval primitives
Research uses You.com’s Search, Contents, and Live News APIs as its core tools. Rather than firing generic web queries, the system selects the right tool for each sub-question — search for discovery, contents for deep page reads, live news for time-sensitive information, and several other internal tools to aid in generating the best possible answer. This targeted tool selection reduces wasted calls and gives the reasoning model cleaner inputs at each step.
The system also evaluates retrieved sources for freshness, diversity, and relevance before incorporating them into the answer.
Context management at scale
Deep research generates far more information than any single LLM context window can hold. Research uses context-masking and compaction strategies that let it operate well beyond those limits — maintaining coherent reasoning across hundreds or thousands of turns without losing track of what it found, what it verified, and what remains unresolved.
At higher effort levels, a single query can run more than 1,000 reasoning turns and process up to 10 million tokens.
Budget-based planning
The system receives a compute budget determined by the research_effort tier you choose. It plans its approach around that budget, allocating more effort to verifying ambiguous or high-stakes claims and moving quickly through well-sourced facts. This is the mechanism that enables the range of latency, accuracy, and cost tradeoffs across tiers.
What you get
Every Research API response includes:
- content: A Markdown-formatted answer by default, or a JSON object when you provide output_schema. Inline citations such as [[1, 2]] reference items in the sources array.
- content_type: The format of the content field. text is returned for default Markdown responses; object is returned for structured output.
- sources: The web pages the API read and cited in the answer, each with a URL, title, and relevant snippets.
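As a sketch of how these fields fit together (the content, content_type, and sources names come from the list above; the output envelope, the snippets field name, and the sample data are assumptions for illustration):

```python
# Hypothetical response shape -- the "output" wrapper and snippet field
# name are assumptions; field names match the list above.
response = {
    "output": {
        "content": "Paris cut NO2 levels sharply over the decade [[1]].",
        "content_type": "text",
        "sources": [
            {
                "url": "https://example.com/air-quality-report",
                "title": "Air quality report",
                "snippets": ["NO2 fell roughly 40% between 2014 and 2024."],
            }
        ],
    }
}

output = response["output"]
answer = output["content"]  # Markdown when content_type == "text",
                            # a JSON object when content_type == "object"

# Map citation numbers (1-based) to source URLs for verification.
citation_urls = {i + 1: s["url"] for i, s in enumerate(output["sources"])}
```

Because citation numbers index into the sources array, keeping this mapping around makes it cheap to verify any claim later.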
Key features
Research effort levels
The research_effort parameter controls how much compute the API allocates to your question. Higher effort means more searches, deeper source reading, and more cross-referencing — at the cost of longer response times.
For the same query, the difference between tiers is substantial. Here’s an abridged comparison for the question “Which global cities improved air quality the most over the past 10 years, and what measurable actions contributed?”:
research_effort = standard
research_effort = exhaustive
The exhaustive response identifies additional cities (Seoul, with specific UNEP data), includes more granular measurements (µg/m³ ranges, percentage reductions over specific date ranges), and cross-references more sources to verify claims.
Citation-backed answers
Every claim in the response links back to a specific source via inline citations. Your users (or your system) can verify any statement by following the numbered references to the sources array.
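If your system needs to verify claims programmatically, the inline [[n]] markers can be extracted with a small regex pass. A minimal sketch (the sample content string is illustrative):

```python
import re

content = "Seoul cut PM2.5 sharply [[1, 2]], driven by diesel restrictions [[3]]."

# Pull the numbers out of each [[...]] citation group and flatten them
# into a single list of 1-based indices into the sources array.
citations = [
    int(n)
    for group in re.findall(r"\[\[([\d,\s]+)\]\]", content)
    for n in group.split(",")
]
# citations -> [1, 2, 3]
```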
Markdown output
The content field is formatted in Markdown with headers, lists, and inline citations — ready to render in a UI or feed into downstream processing.
Source Control
source_control lets you constrain which web sources the research agent searches and visits. Use it when you want results from trusted domains only, need to block specific sites, want recent content, or need results focused on a specific country.
source_control is a top-level request field alongside input and research_effort.
include_domains and exclude_domains cannot be used together in the same request.
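A sketch of a request payload using source_control (the input, research_effort, and include_domains field names come from this page; the question text and domains are illustrative):

```python
payload = {
    "input": "What changed in the EU AI Act implementing rules this year?",
    "research_effort": "standard",
    "source_control": {
        # include_domains and exclude_domains are mutually exclusive --
        # use one or the other in a single request, never both.
        "include_domains": ["europa.eu", "ec.europa.eu"],
    },
}
```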
You can also combine filters:
Structured Output
Use output_schema when you want output.content returned as a JSON object instead of free-form text. This is useful for returning predictable fields, extracting entities, or feeding Research API output into another typed system.
output_schema is supported with standard, deep, and exhaustive research effort. It is not supported with lite. Sending output_schema with research_effort: "lite" returns a 422 error.
When output_schema is provided, the structured result is returned in output.content and output.content_type is object. Sources remain in output.sources. The API does not add citation fields into your schema object automatically.
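A sketch of a request that asks for structured output (field names from this page; the question and schema contents are illustrative):

```python
# research_effort must not be "lite" when output_schema is present.
payload = {
    "input": "List the three largest public cloud providers and their flagship regions.",
    "research_effort": "standard",
    "output_schema": {
        "type": "object",
        "properties": {
            "providers": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "flagship_region": {"type": "string"},
                    },
                    "additionalProperties": False,
                    "required": ["name", "flagship_region"],
                },
            }
        },
        "additionalProperties": False,
        "required": ["providers"],
    },
}
```

With this payload, output.content comes back as a JSON object matching the schema, and output.content_type is object.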
Schema Rules
output_schema follows a narrow JSON Schema subset designed for reliable structured generation.
Required rules:
- The root must be an object.
- The root must not use top-level anyOf.
- Every object must define properties.
- Every object must set additionalProperties: false.
- Every property must be listed in required.
- Recursive schemas are not supported.
- Standalone {"type": "null"} is not supported outside anyOf. Use a nullable union such as ["string", "null"] instead.
Supported patterns include nested objects, arrays, enums, nested anyOf, and non-recursive $defs and $ref.
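A sketch of a schema that follows these rules, using a nested object, an enum, and a nullable union (the field names are illustrative):

```python
schema = {
    "type": "object",
    "properties": {
        "company": {"type": "string"},
        "stage": {"type": "string", "enum": ["seed", "series_a", "series_b"]},
        "headquarters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                # Nullable union instead of a standalone {"type": "null"}.
                "country": {"type": ["string", "null"]},
            },
            "additionalProperties": False,
            "required": ["city", "country"],
        },
    },
    "additionalProperties": False,
    "required": ["company", "stage", "headquarters"],
}

# Spot-check the rules above on every object node.
def check(node):
    if node.get("type") == "object":
        assert node.get("additionalProperties") is False
        assert sorted(node["required"]) == sorted(node["properties"])
        for child in node["properties"].values():
            check(child)

check(schema)
```

Note that every object level, not just the root, sets additionalProperties: false and lists all of its properties in required.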
Unsupported keywords:
- allOf
- contains
- not
- dependentRequired
- dependentSchemas
- format
- if/then/else
- maxContains/minContains
- maxItems/minItems
- maxLength/minLength
- maxProperties/minProperties
- maximum/minimum
- multipleOf
- pattern
- patternProperties
- propertyNames
- unevaluatedItems/unevaluatedProperties
- uniqueItems
Selected limits:
If the schema is invalid, the request fails validation before model execution. The schema string budget counts property names, $defs names, enum values, and const values. It applies to schema shape only. Request-level limits such as total task spec size are enforced separately at the request layer.
Using Source Control and Structured Output Together
source_control and output_schema can be combined in a single request. For example, you can restrict research to specific domains while requesting a structured response:
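A sketch of such a combined request (field names from this page; the question, domains, and schema contents are illustrative):

```python
payload = {
    "input": "Compare recent policy changes across the three largest EU carbon markets.",
    "research_effort": "deep",
    # Restrict research to trusted domains...
    "source_control": {"include_domains": ["europa.eu", "icapcarbonaction.com"]},
    # ...and request a typed response instead of Markdown.
    "output_schema": {
        "type": "object",
        "properties": {
            "markets": {"type": "array", "items": {"type": "string"}},
            "summary": {"type": "string"},
        },
        "additionalProperties": False,
        "required": ["markets", "summary"],
    },
}
```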
Quickstart
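A minimal sketch of a request, assuming a hypothetical endpoint URL and X-API-Key header name; check your API dashboard for the real values:

```python
import json
import urllib.request

API_URL = "https://api.you.com/research"  # hypothetical endpoint -- verify yours
API_KEY = "YOUR_API_KEY"

payload = {
    "input": "Which global cities improved air quality the most over the past 10 years?",
    "research_effort": "standard",
}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json", "X-API-Key": API_KEY},
    method="POST",
)

# Send it with real credentials (uncomment to run):
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```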
Parameters
Common use cases
Complex question answering
When a question can’t be answered from a single source — comparative analyses, multi-factor evaluations, questions that span multiple domains — the Research API handles the synthesis for you.
“Compare the pricing models of the top 3 vector databases and their tradeoffs for a 10M-document collection”
Due diligence and market research
Quickly gather verified, cited information about companies, markets, or technologies. The citation-backed output gives you traceability that raw LLM generation can’t.
Internal tools and knowledge assistants
Build internal research tools where employees can ask complex questions and get sourced answers — product comparisons, regulatory summaries, technical deep dives — without manually reading dozens of pages.
Content creation pipelines
Use the Research API as the first step in a content pipeline: ask a research question, get a cited draft, then use it as source material for blog posts, reports, or briefings.
Best practices
Match research effort to the question
Don’t use exhaustive for simple factual questions — lite or standard will be faster and cheaper. Save deep and exhaustive for questions where thoroughness and accuracy justify the longer response time.
Verify citations for high-stakes use cases
The inline citations make verification straightforward. For legal, financial, or medical contexts, build a step that follows citation URLs to confirm claims before surfacing them to end users.
Use structured inputs for better results
The input field supports up to 40,000 characters. For complex research tasks, include context, constraints, or specific angles you want covered. A well-scoped question produces a more focused answer.
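For example, a scoped research prompt might bundle the question, context, and constraints into the single input field (the content here is purely illustrative):

```python
prompt = """\
Question: Which battery chemistries are most viable for grid-scale storage by 2030?

Context: Evaluating vendors for a 100 MWh utility project in a cold climate.

Constraints:
- Focus on lithium iron phosphate, sodium-ion, and flow batteries.
- Prefer peer-reviewed or government sources.
- Include cost-per-kWh estimates where available.
"""

payload = {"input": prompt, "research_effort": "deep"}
```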
Pricing
Research API pricing is tiered by effort level. All new accounts receive $100 in free credits to get started.
Higher effort tiers allocate more compute for deeper reasoning, more source verification, and higher accuracy. See the research effort levels table above for pricing and latency by tier.
For volume discounts, annual pricing, or enterprise features, visit you.com/pricing or contact [email protected].