Embeddings and Vector Databases

Convert text to vectors with Ollama embeddings. Store them in PostgreSQL (pgvector) with HNSW indexes for persistent, scalable similarity search.

In the previous tutorial, we created enriched chunks with structure-aware splitting and LLM-generated context. But those chunks exist only in Python memory: they're lost as soon as the interpreter exits.

Production RAG systems need three things:

  1. Persistence - Chunks survive restarts
  2. Scalability - Handle thousands of documents without re-processing
  3. Speed - Sub-second similarity search, even with millions of vectors

We'll solve this with embeddings (converting text to vectors) and pgvector (PostgreSQL extension for vector similarity search), running locally via Supabase.
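To ground what "similarity search" means before touching the database: embeddings map text to vectors so that semantically related texts end up pointing in similar directions. A minimal sketch, assuming cosine similarity as the metric; the tiny 3-dimensional vectors below are hypothetical stand-ins for real Ollama embeddings, which have hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (real ones have far more dimensions).
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.2, 0.05]
invoice = [0.0, 0.1, 0.95]

print(cosine_similarity(cat, kitten))   # high: semantically close
print(cosine_similarity(cat, invoice))  # low: unrelated
```

This is the whole idea behind vector search: a query is embedded the same way, and the database returns the stored chunks whose vectors score highest against it.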

What You'll Build

  • Understand how embeddings capture semantic meaning
  • Store chunks with vector embeddings in PostgreSQL (pgvector)
  • Query chunks using full-text search (keywords) and vector search (semantic)
  • Index vectors with HNSW for fast similarity search
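As a preview of the storage side, here is a sketch of the SQL we'll be running, held as Python strings. The table name `chunks`, its columns, and the 1024-dimension vector size are assumptions to adjust for your embedding model; `vector(n)`, `USING hnsw (… vector_cosine_ops)`, and the `<=>` cosine-distance operator are pgvector syntax, and the database must have `CREATE EXTENSION vector;` run first:

```python
# Hypothetical schema for persisting chunks with embeddings in pgvector.
# Names and the 1024-dim size are assumptions; match your model's output.
CREATE_TABLE = """
CREATE TABLE chunks (
    id        bigserial PRIMARY KEY,
    content   text NOT NULL,
    embedding vector(1024)  -- pgvector column type
);
"""

# HNSW index for fast approximate nearest-neighbour search, cosine distance.
CREATE_INDEX = """
CREATE INDEX ON chunks USING hnsw (embedding vector_cosine_ops);
"""

# Semantic search: <=> is pgvector's cosine distance operator,
# so ORDER BY ascending distance returns the most similar chunks first.
SEARCH = """
SELECT content
FROM chunks
ORDER BY embedding <=> %(query_embedding)s::vector
LIMIT 5;
"""
```

With any PostgreSQL driver (psycopg, asyncpg, or Supabase's client), these statements would be executed against the local database; we'll build that up step by step below.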

What are Embeddings?

Footnotes

  1. Qwen3 Embedding

  2. Massive Text Embedding Benchmark (MTEB) Leaderboard

  3. Qwen3 Embedding on Ollama

  4. Why vector databases are a scam

  5. HNSW Indexing 2