Beginner · 1 min read

Understanding Vector Embeddings for Search

What embeddings actually are, why cosine similarity works, and how semantic search beats keyword matching for real questions.

Stack
  • Python
  • NumPy
  • OpenAI

Part 1 of 3

Series

Building an LLM Knowledge System

  1. Understanding Vector Embeddings for Search
  2. Prompt Engineering Patterns for Production
  3. Building a RAG Pipeline with Qdrant and Gemini

What you’ll learn

  • What an embedding vector represents and where it comes from
  • Why cosine similarity captures meaning better than keyword overlap
  • How semantic search answers questions exact-match search misses

An embedding turns text into a fixed-length list of numbers (a vector), produced by an embedding model, that captures the text's meaning. Texts with similar meanings land close together in the resulting vector space, which is what makes semantic search possible.
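As a rough illustration of what "close together" means, the sketch below uses hand-made 3-dimensional vectors. The numbers are invented for this example. Real embeddings come from a trained model and typically have hundreds or thousands of dimensions.

```python
import numpy as np

# Toy vectors, invented for illustration only. A real embedding model
# would produce these, with far more dimensions.
toy_embeddings = {
    "cancel my plan":        np.array([0.9, 0.1, 0.0]),
    "ending a subscription": np.array([0.8, 0.2, 0.1]),
    "reset my password":     np.array([0.1, 0.9, 0.2]),
}

def distance(text_a, text_b):
    """Euclidean distance between two toy embeddings (smaller = closer)."""
    return float(np.linalg.norm(toy_embeddings[text_a] - toy_embeddings[text_b]))

near = distance("cancel my plan", "ending a subscription")
far = distance("cancel my plan", "reset my password")
print(near < far)  # True: the two cancellation phrases sit closer together
```

The phrases share no words, yet their vectors are neighbors because the (assumed) model placed them by meaning.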

Why not just keywords?

Keyword search matches the literal words in a query. Embeddings match meaning, so "how do I cancel my plan" can find a doc titled "Ending a subscription" even though the two share no words.
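To make the failure concrete, here is a minimal sketch of a naive keyword scorer that counts shared words. The helper `shared_words` and the two extra doc titles are hypothetical, added only for this example.

```python
def shared_words(query, title):
    """Count the words two strings have in common (case-insensitive)."""
    return len(set(query.lower().split()) & set(title.lower().split()))

query = "how do I cancel my plan"
titles = [
    "Ending a subscription",          # the doc we actually want
    "Choosing a plan",                # hypothetical distractor
    "Updating your billing address",  # hypothetical distractor
]

scores = {title: shared_words(query, title) for title in titles}
best = max(titles, key=lambda t: scores[t])

print(scores["Ending a subscription"])  # 0: no words in common
print(best)                             # Choosing a plan
```

The scorer not only misses the right doc, it prefers the wrong one because of the shared word "plan". A search that compares meaning rather than characters does not depend on that overlap.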

Measuring closeness

Most systems compare vectors with cosine similarity, which measures the angle between them and ignores their length. Scores range from -1 (opposite directions) to 1 (same direction):

py
import numpy as np

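def cosine(a, b):
    # Dot product divided by the product of the vector lengths (L2 norms).
    # Undefined (division by zero) if either vector is all zeros.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))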
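The sketch below checks the "angle, not length" property with the same formula, then ranks a few documents against a query. The `top_k` helper and all vectors are assumptions made up for this example, not part of any library.

```python
import numpy as np

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 2.0, 3.0])
print(np.isclose(cosine(a, 2 * a), 1.0))  # True: same direction, different length
print(np.isclose(cosine(a, -a), -1.0))    # True: opposite direction

def top_k(query_vec, doc_matrix, k=3):
    """Return indices of the k rows of doc_matrix most similar to query_vec."""
    # Normalize once so a single matrix product yields every cosine score.
    docs = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    scores = docs @ q
    return np.argsort(-scores)[:k]

# Made-up document vectors, one per row.
docs = np.array([
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.2],
    [9.0, 1.0, 0.0],  # same direction as the query, ten times longer
])
query = np.array([0.9, 0.1, 0.0])

ranked = top_k(query, docs, k=2)
print(ranked.tolist())  # [2, 0]
```

Row 2 ranks first even though it is far from the query in raw distance, because cosine similarity only compares direction.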

Where it goes wrong

Embeddings inherit the biases and blind spots of their training data, and chunking choices (how documents are split before they are embedded) strongly affect what can be retrieved. Test retrieval on real queries before trusting it.
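One simple way to test retrieval is to collect real queries, note which doc each one should find, and measure how often that doc shows up in the top results. The sketch below assumes a hypothetical `search(query, k)` callable that returns a list of doc ids. The doc ids and the canned `fake_search` stand-in are invented so the example runs on its own.

```python
def hit_rate_at_k(search, labeled_queries, k=5):
    """Fraction of queries whose expected doc id appears in the top-k results.

    `search(query, k)` is a hypothetical callable returning a list of doc ids.
    `labeled_queries` is a list of (query, expected_doc_id) pairs.
    """
    if not labeled_queries:
        raise ValueError("labeled_queries is empty")
    hits = sum(
        1 for query, expected in labeled_queries
        if expected in search(query, k)
    )
    return hits / len(labeled_queries)

# Stand-in for a real retriever, with canned (made-up) results.
def fake_search(query, k):
    canned = {
        "how do I cancel my plan": ["ending-a-subscription", "choosing-a-plan"],
        "change my card": ["choosing-a-plan"],
    }
    return canned.get(query, [])[:k]

labeled = [
    ("how do I cancel my plan", "ending-a-subscription"),
    ("change my card", "updating-billing"),
]

print(hit_rate_at_k(fake_search, labeled, k=2))  # 0.5
```

Re-running the same check after changing the chunk size or the embedding model shows whether the change helped or hurt on the queries that matter.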