Research API Example | You.com

When a Simple Search Isn’t Enough

You’ve used the Search API. You send a query, you get back a list of web results. Titles, URLs, snippets. It works great for quick lookups, building search UIs, and feeding results into RAG pipelines.

But some questions need more than a list of links. Questions like “what are the latest breakthroughs in quantum computing” or “how do mRNA vaccines work” need someone (or something) to actually read through dozens of sources, cross-reference them, and write up a coherent answer with citations.

That’s what the Research API does. It’s like having a research assistant that reads the web and writes you a report with footnotes.

What You’ll Build

By the end of this guide you’ll be able to call the Research API in Python or TypeScript and get back comprehensive, cited answers to questions of varying complexity. You’ll also see how to use beta request fields with direct HTTP calls.

Here’s the difference between Search and Research for the same query:

Search API returns raw results:

[
  { "title": "Google Quantum AI: Willow Chip",
    "url": "https://blog.google/...",
    "snippet": "Google's Willow chip demonstrated..."
  },
  { "title": "IBM Quantum: Heron Processor",
    "url": "https://www.ibm.com/...",
    "snippet": "IBM announced..."
  }
]

Research API returns a synthesized, grounded answer with inline citations:

## Recent Breakthroughs in Quantum Computing

Quantum computing has seen several major advances in recent years...

**Error correction milestones.** Google's Willow chip demonstrated that
increasing the number of qubits can actually reduce errors [1], a key
threshold for practical quantum computing...

**Hardware scaling.** IBM's Heron processor achieved... [2]

Sources:
[1] Google Quantum AI: Willow Chip
    https://blog.google/technology/research/google-willow-quantum-chip/
[2] IBM Quantum: Heron Processor
    https://www.ibm.com/quantum/blog/ibm-quantum-heron

Same question, fundamentally different output. Search gives you building blocks. Research gives you the finished product.

Why This Approach

What the Research API Does Under the Hood

The Research API doesn’t just run one search. It’s powered by the You.com search index, purpose-built for speed, accuracy, and relevance. Instead of relying on third-party search providers, the Research API searches and reads pages directly from this index. That means lower latency, fresher results, and better extraction quality than you’d get stitching together external services yourself.

On top of that index sits an agentic system that does the actual research. It doesn’t follow a fixed pipeline. Instead it works in an iterative loop:

Reads your question and decides what to search for
Searches the web and reviews the results
Visits promising pages and extracts the relevant content
Decides whether to search again, visit more pages, or start writing based on what it’s found so far
Repeats until it has enough information (controlled by the effort level you choose)
Synthesizes everything into a cited markdown answer

The agent adapts its research strategy as it goes. A question about quantum computing might lead it down a different path than one about financial regulations. Higher effort levels give the agent more time to iterate, which means more searches, more pages read, and a more thorough answer.
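To make the shape of that loop concrete, here is a toy model in Python. It is not You.com's implementation: the iteration budgets, the `ResearchState` class, and the four callables (`plan_query`, `search`, `read_page`, `enough`) are all hypothetical stand-ins, and the final synthesis step is left out.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical iteration budgets; the real per-effort limits are not documented here.
ITERATION_BUDGET = {"lite": 1, "standard": 2, "deep": 4, "exhaustive": 8}


@dataclass
class ResearchState:
    question: str
    visited: list[str] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)


def research_loop(
    question: str,
    effort: str,
    plan_query: Callable[[ResearchState], str],
    search: Callable[[str], list[str]],
    read_page: Callable[[str], str],
    enough: Callable[[ResearchState], bool],
) -> ResearchState:
    """Plan, search, read, and re-evaluate until `enough` says stop or the budget runs out."""
    state = ResearchState(question)
    for _ in range(ITERATION_BUDGET[effort]):
        query = plan_query(state)  # decide what to search for next
        for url in search(query):  # review the results
            if url in state.visited:
                continue  # skip pages that were already read
            state.visited.append(url)
            state.notes.append(read_page(url))  # extract the relevant content
        if enough(state):  # stop early once there is enough material
            break
    return state
```

The only thing the effort level changes in this sketch is how many passes the loop is allowed, which mirrors the idea that higher effort buys more searches and more pages read.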

Because the search index, page extraction, and synthesis are all built and optimized together, the Research API can deliver results that would be difficult to replicate by chaining separate tools.

When to Use Research vs Search

	Search API	Research API
Speed	Fast (~1s)	Slower (5–60s depending on effort)
Output	List of web results (title, URL, snippet)	Comprehensive markdown answer with citations
Best for	Quick lookups, search UIs, and RAG pipelines	Complex questions, report generation, and deep analysis
Example	"nextjs docs"	"how does next.js compare to remix for production apps"

Use Research when:

The question requires reading and synthesizing multiple sources
You need a cited, comprehensive answer (not just links)
You’re building report generation, fact-checking, or deep analysis features
Your users expect a written answer, not a list of results

Use Search when:

You need raw results fast
You’re building a search UI or autocomplete
You’re feeding results into your own RAG pipeline
You need filters such as result count, language, or livecrawl

Prerequisites

You need two things:

A You.com API key. Get one at you.com/platform
The SDK for your language:

# Python
$ pip install youdotcom

# TypeScript
$ npm install @youdotcom-oss/sdk

That’s it. No other dependencies needed.

Research API fields such as source_control and output_schema may not be available in SDKs yet. Use cURL or fetch for those fields until SDK support ships.
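If you work in Python and need one of those fields, a direct HTTP call with the standard library is one option. The sketch below assumes the endpoint and `X-API-Key` header used in the cURL example later in this guide. The helper names and the 90-second timeout are illustrative choices, not part of any SDK.

```python
import json
import os
import urllib.request

RESEARCH_URL = "https://api.you.com/v1/research"  # endpoint from the cURL example


def build_research_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Research API."""
    return urllib.request.Request(
        RESEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def run_research(payload: dict, timeout_s: float = 90.0) -> dict:
    """Send the request and return the decoded JSON body.

    The timeout is an assumption chosen to leave headroom over the ~60s
    upper bound quoted for the slowest effort level.
    """
    request = build_research_request(payload, os.environ["YDC_API_KEY"])
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

Calling `run_research({"input": "what is RAG", "research_effort": "lite"})` would then return the same JSON structure described under Response Format.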

Step-by-Step Walkthrough

Python

Set your API key

$ export YDC_API_KEY="your-api-key-here"

Make a research call

import os
from youdotcom import You

you = You(api_key_auth=os.environ["YDC_API_KEY"])
response = you.research(input="What are the latest breakthroughs in quantum computing?")
print(response.output.content)

response.output.content is a markdown string with inline citations like [1], [2]. response.output.sources is the list of sources used.

Print the sources

for i, source in enumerate(response.output.sources, 1):
    print(f"[{i}] {source.title}")
    print(f"    {source.url}")

Try different effort levels

import os
from youdotcom import You, models

you = You(api_key_auth=os.environ["YDC_API_KEY"])

# Quick answer (~5s)
response = you.research(input="what is RAG", research_effort=models.ResearchEffort.LITE)

# Thorough research (~20-30s)
response = you.research(input="compare RAG architectures", research_effort=models.ResearchEffort.DEEP)

# Most comprehensive (~30-60s)
response = you.research(input="full analysis of RAG vs fine-tuning", research_effort=models.ResearchEffort.EXHAUSTIVE)

TypeScript

Install and initialize

import { You } from "@youdotcom-oss/sdk";

const you = new You({ apiKeyAuth: process.env.YDC_API_KEY });

Make a research call

const result = await you.research({
  input: "What are the latest breakthroughs in quantum computing?",
  researchEffort: "standard",
});

console.log(result.output.content);     // markdown answer
console.log(result.output.sources);     // source list

Use in a Next.js API route

This is the pattern used in the live demo. The API route calls Research server-side so the API key stays safe:

import { NextResponse } from "next/server";
import { You } from "@youdotcom-oss/sdk";

export const maxDuration = 60; // Research can take up to 60s

export async function POST(request: Request) {
  const { input, research_effort } = await request.json();
  const you = new You({ apiKeyAuth: process.env.YDC_API_KEY });
  const result = await you.research({ input, researchEffort: research_effort });
  return NextResponse.json(result);
}

Full Working Example

We’ve built a complete sample app you can fork and deploy:

GitHub: youdotcom-oss/ydc-research-sample
Live demo: https://you.com/examples/research

The repo includes:

research.py is a standalone Python CLI script
app/api/research/route.ts is the Next.js API route using the TypeScript SDK
app/page.tsx is the React frontend with effort level picker and markdown rendering

To run it yourself:

$ git clone https://github.com/youdotcom-oss/ydc-research-sample.git
$ cd ydc-research-sample

# Python
$ pip install youdotcom
$ export YDC_API_KEY="your-key"
$ python research.py "your question here"

# Web app
$ cp .env.example .env.local   # add your API key
$ npm install
$ npm run dev                   # open localhost:3000

Customization Guide

Effort Levels

The research_effort parameter controls how deep the Research API digs. Higher effort means more searches, more sources read, and more thorough analysis.

Level	Speed	When to Use
`lite`	~5s	Quick factual questions, simple lookups that still benefit from synthesis
`standard`	~10–15s	Default. Balanced speed and depth. Good for most questions
`deep`	~20–30s	Complex topics that need cross-referencing. Competitive analysis, technical comparisons
`exhaustive`	~30–60s	Maximum depth. Due diligence, comprehensive reports, academic-style research

Choosing the Right Effort Level

lite for anything you’d normally just google but want a synthesized answer instead of clicking through links
standard when you’re not sure. It’s the sweet spot for most use cases
deep when accuracy and thoroughness matter more than speed
exhaustive when you need to be really sure. Compliance checks, investment research, technical deep dives
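If your application has a latency budget, the choice can be automated. The sketch below picks the deepest level whose approximate upper bound from the table above fits the budget. `pick_effort` and the fallback to `lite` are illustrative assumptions, and the timings are the guide's rough figures rather than guarantees.

```python
# Approximate upper-bound latencies in seconds, taken from the effort table above.
EFFORT_MAX_SECONDS = [
    ("exhaustive", 60),
    ("deep", 30),
    ("standard", 15),
    ("lite", 5),
]


def pick_effort(latency_budget_s: float) -> str:
    """Return the deepest effort level whose typical upper bound fits the budget."""
    for level, max_seconds in EFFORT_MAX_SECONDS:
        if max_seconds <= latency_budget_s:
            return level
    # Nothing fits: fall back to the fastest level rather than failing.
    return "lite"
```

For example, `pick_effort(20)` selects `standard`, because `deep` can take up to about 30 seconds.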

Source Control

Use source_control to constrain which web sources the research agent searches and visits. You can allow specific domains, block specific domains, filter by recency, or focus results by country.

$ curl -X POST https://api.you.com/v1/research \
    -H "X-API-Key: $YDC_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "input": "What are the latest developments in quantum computing?",
      "research_effort": "deep",
      "source_control": {
        "include_domains": ["nature.com", "arxiv.org", "science.org"]
      }
    }'

Supported source_control fields:

Field	Description
`include_domains`	Only return results from these domains. Max 500 domains. Cannot be used with `exclude_domains`.
`exclude_domains`	Never return results from these domains. Max 500 domains. Also blocks browsing on those domains.
`freshness`	Accepts `day`, `week`, `month`, `year`, or a custom `YYYY-MM-DDtoYYYY-MM-DD` range.
`country`	ISO 3166-1 alpha-2 country code, such as `US`, `GB`, or `DE`.
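Since these constraints are easy to violate by accident, it can help to check them client-side before sending a request. The following sketch encodes only the rules listed in the table. `build_source_control` is a hypothetical helper, and the two-letter check on `country` is a loose approximation of ISO 3166-1 alpha-2 rather than a real lookup.

```python
import re

FRESHNESS_PRESETS = {"day", "week", "month", "year"}
DATE_RANGE = re.compile(r"^\d{4}-\d{2}-\d{2}to\d{4}-\d{2}-\d{2}$")
MAX_DOMAINS = 500


def build_source_control(include_domains=None, exclude_domains=None,
                         freshness=None, country=None) -> dict:
    """Validate the documented constraints and return a source_control dict."""
    if include_domains and exclude_domains:
        raise ValueError("include_domains cannot be used with exclude_domains")
    for name, domains in (("include_domains", include_domains),
                          ("exclude_domains", exclude_domains)):
        if domains and len(domains) > MAX_DOMAINS:
            raise ValueError(f"{name} accepts at most {MAX_DOMAINS} domains")
    if freshness is not None:
        if freshness not in FRESHNESS_PRESETS and not DATE_RANGE.match(freshness):
            raise ValueError("freshness must be day, week, month, year, "
                             "or YYYY-MM-DDtoYYYY-MM-DD")
    if country is not None and not (len(country) == 2 and country.isalpha()):
        raise ValueError("country must be a two-letter code such as US")
    fields = {
        "include_domains": include_domains,
        "exclude_domains": exclude_domains,
        "freshness": freshness,
        "country": country,
    }
    # Drop unset fields so the payload only carries what was requested.
    return {key: value for key, value in fields.items() if value}
```

The returned dict can be placed under the `source_control` key of the request body.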

Structured Output (Beta)

Use output_schema to return a JSON object in output.content instead of a Markdown string. This is useful when you need predictable fields for downstream systems.

output_schema works with the standard, deep, and exhaustive effort levels. It is not supported with lite: a lite request that includes output_schema returns a 422 error.

const response = await fetch("https://api.you.com/v1/research", {
  method: "POST",
  headers: {
    "X-API-Key": process.env.YDC_API_KEY ?? "",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    input: "Compare Snowflake, Databricks, and BigQuery for enterprise data warehousing",
    research_effort: "deep",
    output_schema: {
      type: "object",
      properties: {
        comparison: {
          type: "array",
          items: {
            type: "object",
            properties: {
              product: { type: "string" },
              strengths: {
                type: "array",
                items: { type: "string" },
              },
              weaknesses: {
                type: "array",
                items: { type: "string" },
              },
              pricing_model: { type: "string" },
              best_for: { type: "string" },
            },
            required: ["product", "strengths", "weaknesses", "pricing_model", "best_for"],
            additionalProperties: false,
          },
        },
        recommendation: { type: "string" },
      },
      required: ["comparison", "recommendation"],
      additionalProperties: false,
    },
  }),
});

const data = await response.json();
console.log(data.output.content);

output_schema supports a reliability-focused JSON Schema subset:

The root must be an object.
The root must not use top-level anyOf.
Every object must define properties.
Every object must set additionalProperties: false.
Every property must be listed in required.
Recursive schemas are not supported.
Standalone {"type": "null"} is not supported outside anyOf. Use a nullable union such as ["string", "null"] instead.

Supported patterns include nested objects, arrays, enums, nested anyOf, and non-recursive $defs and $ref. Unsupported keywords include allOf, format, pattern, min/max constraints, uniqueItems, and conditional schema keywords. See the Research API overview for the full rules and limits.
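A small pre-flight check can catch the most common violations before a request is sent. The sketch below covers only the object-level rules from the list above (root type, top-level anyOf, properties, additionalProperties, required). It does not resolve $ref, detect recursion, or flag unsupported keywords, and `schema_problems` is a hypothetical helper rather than part of the API.

```python
def schema_problems(schema: dict, path: str = "$") -> list[str]:
    """Return human-readable rule violations found in an output_schema."""
    problems = []
    schema_type = schema.get("type")
    if path == "$":
        if schema_type != "object":
            problems.append("$: root must be an object")
        if "anyOf" in schema:
            problems.append("$: root must not use anyOf")
    # Treat nullable unions such as ["object", "null"] as objects too.
    is_object = schema_type == "object" or (
        isinstance(schema_type, list) and "object" in schema_type
    )
    if is_object:
        props = schema.get("properties")
        if not isinstance(props, dict):
            problems.append(f"{path}: object must define properties")
            props = {}
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not listed in required: {', '.join(missing)}")
        for name, sub in props.items():
            if isinstance(sub, dict):
                problems += schema_problems(sub, f"{path}.{name}")
    if isinstance(schema.get("items"), dict):
        problems += schema_problems(schema["items"], f"{path}[]")
    for i, branch in enumerate(schema.get("anyOf", [])):
        if isinstance(branch, dict):
            problems += schema_problems(branch, f"{path}.anyOf[{i}]")
    return problems
```

An empty list means none of the checked rules were violated. It does not guarantee the API will accept the schema.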

Response Format

The API returns a structured response. By default, output.content is Markdown and output.content_type is text:

{
  "output": {
    "content": "## Quantum Computing Breakthroughs\n\nQuantum computing has seen several major advances... [1]\n\n...",
    "content_type": "text",
    "sources": [
      {
        "url": "https://blog.google/technology/research/google-willow-quantum-chip/",
        "title": "Google Quantum AI: Willow Chip",
        "snippets": ["Google's Willow chip demonstrated..."]
      }
    ]
  }
}

output.content is a Markdown string with numbered inline citations ([1], [2], etc.)
output.sources is an array of sources, each with URL, title, and relevant snippets
Citations in the content map to the sources array by position: [1] refers to the first entry in output.sources, [2] to the second, and so on
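To show only the sources an answer actually cites, the citation markers can be resolved against the sources array. This sketch works on the raw JSON shape shown above and assumes citations are 1-based, so that [1] is `sources[0]`. `cited_sources` is a hypothetical helper, and out-of-range markers are silently ignored.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")


def cited_sources(output: dict) -> dict[int, dict]:
    """Map each citation number found in the content to its source entry."""
    sources = output["sources"]
    cited = {}
    for match in CITATION.finditer(output["content"]):
        number = int(match.group(1))
        # Assumption: [n] refers to sources[n - 1]; skip markers with no source.
        if 1 <= number <= len(sources):
            cited[number] = sources[number - 1]
    return cited
```

The result keeps citations in order of first appearance, which is convenient for rendering footnotes under the answer.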

When you provide output_schema, output.content is a JSON object and output.content_type is object. Sources remain in output.sources.
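Client code that handles both modes can branch on `content_type`. The sketch below takes the decoded JSON response as a plain dict. `read_output` is a hypothetical helper, and decoding a JSON-encoded string in object mode is a defensive assumption, not documented behavior.

```python
import json


def read_output(response: dict):
    """Return ("text", markdown_string) or ("object", decoded_value)."""
    output = response["output"]
    content = output["content"]
    if output.get("content_type") == "object":
        # Defensive assumption: accept a JSON-encoded string as well as a decoded object.
        value = json.loads(content) if isinstance(content, str) else content
        return "object", value
    return "text", content
```

A caller can then render Markdown for the `"text"` case and pass the dict straight to downstream code for the `"object"` case, reading `output.sources` the same way in both.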

What’s Next

Now that you can call the Research API, here are some directions to explore:

Research API Reference. Full API docs with all parameters and response fields.
10 Creative Ways to Use AI Web Search & Research in Your n8n Workflows. Automate research with n8n workflows.
Simple Search Sample. If you need raw search results instead.
Full API Docs. Documentation for all You.com APIs.
Get an API Key. Sign up and start building.

Resources

Research API Reference
Python SDK (pip install youdotcom)
TypeScript SDK (npm install @youdotcom-oss/sdk)
GitHub: ydc-research-sample
Live Demo
Discord