Vectorless RAG

Skip embeddings entirely. Build a tree index from document structure, summarize nodes with an LLM, and use structured output to find the answer. RAG without a single vector.

Every RAG tutorial starts the same way: chunk your document, embed the chunks, store them in a vector database. But open any earnings report and look at the table of contents. The document already tells you where everything is. Headers, sections, subsections - a built-in index that we flatten into vectors and throw away.

Vectorless RAG keeps that structure. Parse the document's hierarchy into a tree, summarize each node with an LLM, and at query time the model reads the tree index - titles and summaries, no content - and reasons about which nodes contain the answer. No embeddings, no vector math, no database.
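The first step, parsing the hierarchy, needs nothing more than the heading levels the document already carries. Here is a minimal sketch (the `Node` class and `parse_markdown_tree` function are illustrative names, not from any library) that turns markdown headings into a tree, keeping each section's body text attached to its node:

```python
import re
from dataclasses import dataclass, field

@dataclass
class Node:
    title: str
    level: int                      # heading depth: 1 for '#', 2 for '##', ...
    content: str = ""               # body text directly under this heading
    summary: str = ""               # filled in later by the LLM
    children: list["Node"] = field(default_factory=list)

def parse_markdown_tree(text: str) -> Node:
    """Build a tree from markdown headings, preserving the document's hierarchy."""
    root = Node(title="ROOT", level=0)
    stack = [root]                  # path from the root down to the current node
    for line in text.splitlines():
        m = re.match(r"^(#{1,6})\s+(.*)", line)
        if m:
            node = Node(title=m.group(2).strip(), level=len(m.group(1)))
            # pop back up until the top of the stack is this node's parent
            while stack[-1].level >= node.level:
                stack.pop()
            stack[-1].children.append(node)
            stack.append(node)
        else:
            stack[-1].content += line + "\n"
    return root
```

A `## Revenue` heading becomes a child of the preceding `# Financials`, so the tree mirrors the table of contents an earnings report already prints.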

What You'll Build

  • Parse markdown into a hierarchical tree structure
  • Generate bottom-up LLM summaries for each node
  • Use structured output for reasoning-based retrieval
  • Build a complete RAG pipeline without any embeddings
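The last three steps can be sketched together. In this illustration, `llm` is a hypothetical prompt-to-text callable standing in for whatever model client you use, and the `Node` shape mirrors the parsed tree (titles, content, children); the exact prompts and JSON schema are assumptions, not a fixed API:

```python
import json
from dataclasses import dataclass, field

@dataclass
class Node:                 # same shape as the parsed tree: title, content, children
    title: str
    content: str = ""
    summary: str = ""
    children: list["Node"] = field(default_factory=list)

def summarize_tree(node, llm):
    """Bottom-up pass: summarize children first, then fold their summaries
    into the parent's own summary."""
    for child in node.children:
        summarize_tree(child, llm)
    child_part = "\n".join(f"- {c.title}: {c.summary}" for c in node.children)
    node.summary = llm(
        "Summarize this section in one sentence.\n"
        f"Title: {node.title}\nContent: {node.content}\nSubsections:\n{child_part}"
    )

def render_index(node, depth=0):
    """Titles and summaries only - no body content reaches the retrieval prompt."""
    lines = [f"{'  ' * depth}[{node.title}] {node.summary}"]
    for child in node.children:
        lines += render_index(child, depth + 1)
    return lines

def retrieve(root, question, llm):
    """Ask the model, via structured (JSON) output, which nodes hold the answer."""
    index = "\n".join(render_index(root))
    raw = llm(
        "Given this index (titles and summaries only), return JSON of the form "
        '{"nodes": ["<title>", ...]} naming the sections most likely to answer '
        f"the question.\n\nIndex:\n{index}\n\nQuestion: {question}"
    )
    return json.loads(raw)["nodes"]
```

At answer time you fetch the `content` of the returned nodes and pass only that to the model: retrieval by reasoning over the index, with no embedding ever computed.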

How It Works
