Vector Database
A database optimized for storing and searching embeddings, enabling fast similarity search across millions of items.
Why Vector Databases Exist
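Traditional databases are built for exact matches, such as "find all users named John." Vector databases find similar items, such as "find documents similar to this query."
This requires different technology. Conventional indexes such as B-trees are designed for exact and range lookups, and they do not work well for similarity search over high-dimensional vectors. Vector databases instead use approximate nearest neighbor (ANN) indexes, such as HNSW or IVF, which trade a small amount of accuracy for much faster search.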
How They Work
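- Store: Convert your content to embeddings with an embedding model and store them, usually alongside an ID and metadata
- Index: Build a search structure over the stored embeddings, such as an approximate nearest neighbor index
- Query: Convert the search input to an embedding with the same model, then find the nearest stored embeddings
- Return: Get the most similar items, ranked by similarity score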
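The steps above can be sketched with a tiny in-memory example. This is a minimal illustration, not how any particular vector database is implemented: the `TinyVectorStore` class and its methods are hypothetical, the 3-dimensional embeddings are made up rather than produced by a model, and the "index" is a plain list scanned in full on every query (exact brute-force search, where a real system would typically use an approximate index).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store that scans every vector on each query."""

    def __init__(self):
        self.items = []  # list of (item_id, embedding) pairs

    def add(self, item_id, embedding):
        # Store step: keep the embedding together with its ID
        self.items.append((item_id, embedding))

    def query(self, embedding, top_k=2):
        # Query step: score every stored embedding against the query
        scored = [
            (item_id, cosine_similarity(embedding, vec))
            for item_id, vec in self.items
        ]
        # Return step: highest similarity first, truncated to top_k
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


# Made-up embeddings; real ones would come from an embedding model
store = TinyVectorStore()
store.add("doc-cats", [0.9, 0.1, 0.0])
store.add("doc-dogs", [0.8, 0.3, 0.0])
store.add("doc-taxes", [0.0, 0.1, 0.9])

results = store.query([1.0, 0.2, 0.0], top_k=2)
for item_id, score in results:
    print(item_id, round(score, 3))
```

A full scan like this costs time proportional to the number of stored vectors on every query, which is the cost that specialized indexes are meant to avoid at larger scales.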
Common Vector Databases
Pinecone: Managed service, easy to start, scales well.
Weaviate: Open-source, feature-rich, can be self-hosted.
Chroma: Lightweight, great for prototyping.
Qdrant: Open-source, high performance, Rust-based.
pgvector: PostgreSQL extension, adds vector storage and search to your existing database.
When You Need One
RAG systems, semantic search, and recommendation engines usually need a vector database once the collection grows beyond what a simple in-memory scan can handle. The choice depends on:
- Scale (how many vectors?)
- Infrastructure (managed vs self-hosted?)
- Features (filtering, hybrid search?)
- Budget
Performance Considerations
Vector search speed depends on:
- Database choice and configuration
- Embedding dimensions
- Index type and settings
- Hardware (especially memory)
For production systems, benchmark the candidates with realistic data and query patterns before committing to one.
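Memory is often the first constraint to check, and a back-of-the-envelope estimate is simple arithmetic. The sketch below assumes embeddings stored as 32-bit floats (4 bytes per value) and uses illustrative figures of one million vectors at 768 dimensions; the function name is hypothetical, and the result covers raw vectors only, not index structures, metadata, or replication.

```python
def raw_vector_memory_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Bytes needed for the raw vectors alone (assumes float32 by default)."""
    return num_vectors * dimensions * bytes_per_value


# Illustrative figures, not measurements from any specific system
num_vectors = 1_000_000
dimensions = 768

raw_bytes = raw_vector_memory_bytes(num_vectors, dimensions)
raw_gib = raw_bytes / (1024 ** 3)
print(f"{raw_gib:.2f} GiB")  # prints "2.86 GiB"
```

Because the estimate scales linearly with both vector count and dimensions, halving the embedding size halves the raw footprint, which is one reason embedding dimensions appear in the list above.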