Parent Document Retriever

A RAG technique that indexes small chunks for precise matching but retrieves larger parent documents for LLM context. It balances retrieval precision with comprehensive context by separating indexing granularity from context size.

Overview

Parent Document Retriever is a RAG technique that separates what you index from what you retrieve. It indexes small, focused chunks for precise matching but returns larger parent documents to provide comprehensive context to the LLM.

The Problem with Standard Chunking

Large Chunks:

Good context for LLM
Poor retrieval precision
May not match specific queries

Small Chunks:

Good retrieval precision
Insufficient context for LLM
Missing surrounding information

Solution: Two-Level Chunking

Index Level: Small chunks (e.g., 200 tokens)
Retrieval Level: Parent documents (e.g., 1000 tokens)

When a small chunk matches, return its entire parent document.

How It Works

Indexing

Parent Doc: [A B C D E F G H I J]
  ↓ split into
Child Chunks: [A B] [C D] [E F] [G H] [I J]
  ↓ embed and index
Vector DB: stores child chunk embeddings with parent doc IDs

Retrieval

Query → matches [C D]
  ↓ retrieve parent
Return: [A B C D E F G H I J]
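
The indexing and retrieval flow above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not LangChain's implementation: word overlap stands in for embedding similarity, a list and a dict stand in for the vector DB and document store, and the helper names (split_words, build_index, overlap, retrieve_parent) are hypothetical.

```python
def split_words(text, size):
    """Split text into chunks of `size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_index(parents, child_size):
    """Index small child chunks, each tagged with the ID of its parent."""
    docstore = {}      # parent_id -> full parent text (stand-in for the document store)
    child_index = []   # (child_text, parent_id) pairs (stand-in for the vector DB)
    for parent_id, text in enumerate(parents):
        docstore[parent_id] = text
        for chunk in split_words(text, child_size):
            child_index.append((chunk, parent_id))
    return child_index, docstore


def overlap(query, chunk):
    """Toy relevance score: shared words (assumed stand-in for vector similarity)."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def retrieve_parent(query, child_index, docstore):
    """Match against child chunks, then return the parent of the best match."""
    _, parent_id = max(child_index, key=lambda item: overlap(query, item[0]))
    return docstore[parent_id]


parents = ["A B C D E F G H I J", "K L M N O P Q R S T"]
child_index, docstore = build_index(parents, child_size=2)
result = retrieve_parent("C D", child_index, docstore)
print(result)  # A B C D E F G H I J
```

The query is scored only against the two-word children, yet the caller receives the full ten-word parent, which is the separation of indexing granularity from context size described above.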

Implementation

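from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Child splitter (for indexing). chunk_size counts characters by default, not tokens.
child_splitter = RecursiveCharacterTextSplitter(chunk_size=200)

# Parent splitter (for retrieval). Omit it to use each full document as the parent.
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=1000)

# Key-value store that holds the parent documents.
docstore = InMemoryStore()

# `vectorstore` is assumed to be an existing LangChain vector store
# (created elsewhere with an embedding model); it holds the child chunks.
retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=docstore,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter,
)

# Index documents; `docs` is assumed to be a list of LangChain Document objects.
retriever.add_documents(docs)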

Benefits

Better Retrieval: Small chunks match queries precisely
Better Context: Large parents give LLM full picture
Reduced Redundancy: When several matching child chunks share a parent, that parent is returned only once
Flexible: Tune child/parent sizes independently
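
Deduplication at the parent level can be sketched as follows. This is a hedged illustration in plain Python: the function name unique_parents and the sample hits are hypothetical, and the hits are assumed to arrive already ranked by the vector search.

```python
def unique_parents(child_hits):
    """Collapse ranked child hits into parent IDs, keeping first-seen order.

    child_hits: list of (child_text, parent_id) pairs in rank order.
    """
    seen = set()
    ordered = []
    for _, parent_id in child_hits:
        if parent_id not in seen:
            seen.add(parent_id)
            ordered.append(parent_id)
    return ordered


# Hypothetical search result: three of the four child hits share one parent.
hits = [("C D", "doc-1"), ("G H", "doc-1"), ("M N", "doc-2"), ("A B", "doc-1")]
parent_ids = unique_parents(hits)
print(parent_ids)  # ['doc-1', 'doc-2']
```

Four child hits become two parent documents, so the LLM sees each parent once instead of several overlapping fragments of it.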

Variants

Full Document as Parent

Children: Sentences or paragraphs
Parent: Entire document
Best for short documents

Hierarchical Chunks

Multiple levels (sentence → paragraph → section)
Flexible context size
More complex implementation
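
One way to picture the hierarchical variant is a tree of nodes with parent pointers, where the caller chooses how many levels to climb from the matched node. The sketch below is an assumption-laden illustration: the Node class, the context_for helper, and the sample text are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    text: str
    level: str                    # "sentence", "paragraph", or "section"
    parent: Optional[str] = None  # ID of the enclosing node, if any


def context_for(node_id: str, nodes: Dict[str, Node], levels_up: int) -> Node:
    """Climb `levels_up` steps from the matched node, stopping at the root."""
    current = node_id
    for _ in range(levels_up):
        parent = nodes[current].parent
        if parent is None:
            break
        current = parent
    return nodes[current]


nodes = {
    "sec": Node("Install the package. Restart the service. Check the logs.", "section"),
    "par": Node("Install the package. Restart the service.", "paragraph", parent="sec"),
    "s1": Node("Install the package.", "sentence", parent="par"),
    "s2": Node("Restart the service.", "sentence", parent="par"),
}

# A query matched sentence "s2"; pick the context size by climbing the tree.
print(context_for("s2", nodes, levels_up=0).level)  # sentence
print(context_for("s2", nodes, levels_up=1).level)  # paragraph
print(context_for("s2", nodes, levels_up=2).level)  # section
```

The levels_up parameter is what makes the context size flexible; the extra bookkeeping for parent pointers is the added implementation complexity.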

Storage Requirements

Vectors: Only child chunks (smaller footprint)
Documents: Both child and parent docs
Document store: Key-value store for parents

Trade-offs

Advantages:

Best of both worlds (precision + context)
Better LLM answers
Can reduce token usage when many matching chunks share a parent, since each parent is sent to the LLM only once

Costs:

More complex implementation
Need document store in addition to vector DB
Slightly higher storage

When to Use

Documents with clear hierarchical structure
When small chunks lack context
When retrieval precision is critical
Long documents that need chunking

Pricing

Implementation-dependent. Requires a vector DB and a document store (both can live in the same database).

Information

Website: python.langchain.com

Published: Mar 15, 2026

Tags

#rag #retrieval #chunking

Similar Products

Contextual Retrieval

A RAG enhancement technique from Anthropic that adds chunk-specific explanatory context to each document chunk before embedding. Contextual Retrieval reduces retrieval failure rates by 49% and improves accuracy by 67% compared to traditional RAG methods.

Sentence Window Retrieval

A RAG technique that indexes individual sentences for precise matching but retrieves surrounding sentences (a window) for context. Provides fine-grained retrieval precision while maintaining adequate context for LLM generation.

Cascading Retrieval

Advanced retrieval approach combining dense vectors, sparse vectors, and reranking in a multi-stage pipeline, achieving up to 48% better performance than single-method retrieval.

RecursiveCharacterTextSplitter

LangChain's hierarchical text chunking strategy achieving 85-90% accuracy by recursively splitting using progressively finer separators to preserve semantic boundaries.

Chunk Size Optimization

The process of determining optimal text segment sizes for embedding and retrieval in vector databases. Chunk size significantly impacts RAG quality, balancing between capturing complete context (larger chunks) and retrieval precision (smaller chunks), typically ranging from 256 to 1024 tokens.

Reranking

A two-stage retrieval process where initial candidates from vector search are reordered using more sophisticated models like cross-encoders. Reranking significantly improves result quality by applying computationally expensive models to a small set of candidates, commonly used in RAG systems and search applications.
