What Is an Embedding (Vector)? How Meaning Becomes Numbers, Uses, and Choosing a Model
RAG, semantic search, and recommendations all rely on an unsung workhorse: the embedding (vector). An embedding represents the meaning of text (or an image) as a sequence of numbers, that is, a vector. The word "dog" becomes a list of hundreds to thousands of numbers that act as "coordinates of meaning." Words close in meaning sit near each other ("dog" and "puppy" are close; "dog" and "car" are far), and that closeness is quantified with measures such as cosine similarity. The famous example is "king − man + woman ≈ queen." As a result, a machine can judge whether two pieces of text are close in meaning even when their characters don't match.

This beginner guide covers:
- what an embedding is (a "map of meaning");
- why closeness measures meaning (dimensions and cosine similarity);
- what embeddings are used for (RAG, semantic search, classification and dedup, recommendations, and multimodal);
- how to choose an embedding model: API-based models such as OpenAI text-embedding-3, Cohere, Gemini, and Voyage; open-source models such as BGE-M3, Nomic, and Qwen3; and Matryoshka embeddings, which can cut 3,072 dimensions to 1,024 while keeping about 95% of quality at roughly a third of the cost;
- vector DBs (Pinecone, Weaviate, Qdrant, Chroma, pgvector) and a three-step start: pick a model, vectorize and store documents, then vectorize the question and search.

Embeddings are the foundation of implementing RAG.
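To make cosine similarity, dimension truncation, and the three-step start concrete, here is a minimal sketch in Python. It uses hand-made 4-dimensional toy vectors, not output from any real embedding model, and a plain dict stands in for a vector DB. The names `TOY_EMBEDDINGS`, `cosine_similarity`, `truncate`, and `search` are hypothetical, chosen only for this illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def truncate(vec, dims):
    """Keep the first `dims` values and rescale to unit length.

    This mirrors how Matryoshka-style embeddings are shortened. It only
    preserves quality for models trained for it; the toy vectors here are not.
    """
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


def search(query_vec, store, top_k=1):
    """Rank stored items by cosine similarity to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in store.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Step 1: "pick a model". Assumption: a fixed table of made-up vectors
# plays the role of the model here.
TOY_EMBEDDINGS = {
    "dog": [0.9, 0.8, 0.1, 0.0],
    "puppy": [0.8, 0.9, 0.2, 0.0],
    "car": [0.0, 0.1, 0.9, 0.8],
}

# Step 2: vectorize and store the documents (a dict stands in for a vector DB).
store = {name: TOY_EMBEDDINGS[name] for name in ("dog", "car")}

# Step 3: vectorize the question and search.
best_match = search(TOY_EMBEDDINGS["puppy"], store)

sim_dog_puppy = cosine_similarity(TOY_EMBEDDINGS["dog"], TOY_EMBEDDINGS["puppy"])
sim_dog_car = cosine_similarity(TOY_EMBEDDINGS["dog"], TOY_EMBEDDINGS["car"])
short_dog = truncate(TOY_EMBEDDINGS["dog"], 2)

print(f"dog vs puppy: {sim_dog_puppy:.2f}")
print(f"dog vs car:   {sim_dog_car:.2f}")
print(f"best match for 'puppy': {best_match}")
print(f"'dog' truncated to 2 dims: {short_dog}")
```

With these toy values, "dog" and "puppy" score about 0.99 while "dog" and "car" score about 0.12, so the query "puppy" retrieves "dog" even though the two words share no characters. In a real setup, the lookup table would be replaced by a call to an embedding model and the dict by one of the vector DBs listed above.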