What is a Vector Database?

A vector database stores and indexes high-dimensional embedding vectors from ML models, enabling fast similarity search for AI applications like RAG and semantic search.

HyperStore · Published on 2026-06-20

#embeddings #RAG #retrieval-augmented generation #semantic search #similarity search #vector database

A vector database is a specialized storage system designed to handle high-dimensional vectors: numeric arrays, typically of hundreds or thousands of floating-point numbers, that machine learning models use to represent meaning. Words, sentences, images, audio clips, and user behaviors can all be encoded as vectors, and a vector database makes it possible to store up to billions of these embeddings and find the closest matches to a new query, typically in milliseconds.

How a vector database works

When an ML model such as a large language model or a vision encoder produces an embedding, that vector is sent to the database along with a reference to the original item (a piece of text, an image file, a product record, and so on). The database builds an index using an approximate nearest neighbor (ANN) algorithm such as HNSW (Hierarchical Navigable Small World) or IVF (Inverted File Index). These structures sacrifice a small amount of exactness in exchange for dramatically faster queries on large datasets. At search time, the application sends a fresh embedding as a query, and the index returns the top-k vectors ranked by a similarity metric, commonly cosine similarity, dot product, or Euclidean distance.
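The store-then-search flow described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product is implemented: the `ToyVectorStore` class, its method names, and the three-dimensional example vectors are all hypothetical, and the search is an exact scan with cosine similarity rather than a real ANN index.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical in-memory store: exact scan, no ANN index."""

    def __init__(self):
        self._items = []  # list of (vector, reference) pairs

    def add(self, vector, reference):
        """Keep the embedding together with a reference to the original item."""
        self._items.append((list(vector), reference))

    def search(self, query, k=3):
        """Return the k (score, reference) pairs most similar to the query."""
        scored = ((cosine_similarity(query, vec), ref) for vec, ref in self._items)
        return heapq.nlargest(k, scored, key=lambda pair: pair[0])


# Made-up 3-dimensional "embeddings"; real ones have hundreds of dimensions.
store = ToyVectorStore()
store.add([0.9, 0.1, 0.0], "doc: house cat care")
store.add([0.8, 0.2, 0.1], "doc: feline companions")
store.add([0.0, 0.1, 0.9], "doc: quarterly tax filing")

results = store.search([1.0, 0.0, 0.0], k=2)
top_refs = [ref for _, ref in results]
print(top_refs)
```

Swapping `cosine_similarity` for a dot product or a negated Euclidean distance changes the ranking metric without changing the overall shape of the flow.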

Why it matters

Traditional keyword search cannot tell that "feline companion" and "house cat" mean nearly the same thing, but their embeddings land close together in vector space, so a vector database surfaces them as matches anyway. This capability is what underpins modern semantic search, recommendation engines, image and audio retrieval, anomaly detection, and the retrieval step in Retrieval-Augmented Generation (RAG), where an LLM is grounded in documents fetched from a vector store. Without purpose-built indexing, comparing a query against millions of vectors one by one would be far too slow for production traffic.
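To make the cost argument concrete, here is a rough sketch of the idea behind an IVF-style index: vectors are grouped into buckets around centroids, and a query scans only the nearest bucket or buckets instead of every vector. The function names, the `nprobe` parameter name, and the hand-picked centroids are assumptions made for illustration; a real index would typically learn its centroids (for example with k-means) and would be far more optimized.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(vector, centroids):
    """Index of the centroid closest to the vector."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(vector, centroids[i]))


def build_ivf(vectors, centroids):
    """Assign each vector id to the bucket of its nearest centroid."""
    buckets = {i: [] for i in range(len(centroids))}
    for vec_id, vec in enumerate(vectors):
        buckets[nearest_centroid(vec, centroids)].append(vec_id)
    return buckets


def ivf_search(query, vectors, centroids, buckets, k=1, nprobe=1):
    """Scan only the nprobe buckets whose centroids are closest to the query.

    Returns the ids of the k closest candidates and the number of vectors
    that were actually compared against the query.
    """
    order = sorted(range(len(centroids)),
                   key=lambda i: squared_distance(query, centroids[i]))
    candidates = [vid for i in order[:nprobe] for vid in buckets[i]]
    candidates.sort(key=lambda vid: squared_distance(query, vectors[vid]))
    return candidates[:k], len(candidates)


# Two obvious clusters; the centroids are hand-picked for this sketch.
vectors = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
           [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centroids = [[0.1, 0.1], [5.0, 5.0]]

buckets = build_ivf(vectors, centroids)
hits, scanned = ivf_search([5.0, 5.05], vectors, centroids, buckets, k=1)
print(hits, scanned)
```

In this toy setup the query is compared against half of the stored vectors. The trade-off is that a true nearest neighbor sitting in an unprobed bucket would be missed, which is why such indexes are called approximate.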

Key types and examples

Dedicated vector databases: purpose-built engines such as Milvus, Qdrant, Weaviate, and Pinecone, designed from the ground up around ANN indexes.
Vector search libraries: lightweight engines like FAISS and Annoy that run inside an application rather than as a standalone service.
Hybrid databases: conventional stores such as PostgreSQL (via pgvector), Elasticsearch, and MongoDB that add vector indexing to existing document or relational features.
Managed cloud services: hosted offerings from major cloud providers that integrate vector search with broader data platforms.

Choosing between them usually comes down to scale, latency requirements, whether the data lives alongside structured records, and how much operational overhead a team is willing to take on. The underlying ANN algorithms and libraries are commonly compared on benchmarks such as ANN-Benchmarks, which measures recall against queries per second across representative datasets.
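The two quantities in that trade-off are simple to compute. The sketch below assumes the common definition of recall at k (the share of the true top-k neighbors that the approximate search returned) and a plain wall-clock throughput measurement; the function names and the example id lists are made up for illustration and are not taken from any benchmark suite.

```python
import time


def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbors that the approximate result recovered."""
    if k <= 0:
        raise ValueError("k must be positive")
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k


def queries_per_second(search_fn, queries):
    """Run every query once and return throughput based on a monotonic clock."""
    start = time.perf_counter()
    for query in queries:
        search_fn(query)
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed if elapsed > 0 else float("inf")


# Hypothetical result ids: the exact top-5 versus what an ANN index returned.
exact = [7, 2, 9, 4, 1]
approx = [7, 9, 2, 8, 1]

recall = recall_at_k(approx, exact, k=5)
print(recall)  # 4 of the 5 true neighbors were found -> 0.8
```

Order within the top-k does not affect this definition of recall, only membership does. Tuning an index for higher recall generally lowers its queries per second, which is why the two are reported together.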
