Why this choice matters more than people think
The vector database is the memory of every RAG system. Get it wrong and you either overpay, bottleneck at P95, or carry an operational burden that chews through staff hours every month. Get it right and it recedes into the background like Postgres itself — present, reliable, not a topic of conversation. Federal programs tend to over-engineer this layer because the commercial marketing implies you need a dedicated vector database for anything serious. You usually do not.
This post walks through the real options in 2026: what each is good at, what each is bad at, and how to pick without listening to the vendor that most wants your business.
The contenders in 2026

pgvector
An extension to PostgreSQL that adds vector columns, distance operators, and HNSW / IVFFlat indexes. Available on RDS for PostgreSQL in GovCloud, Azure Database for PostgreSQL in Azure Government, and Aurora PostgreSQL. Zero new service to authorize — inherits from the Postgres authorization.
Strengths: you are already running Postgres; metadata and vectors in one database; transactional semantics when you need them; no separate operational surface.
Limits: HNSW memory footprint grows with vector count; filtered ANN is less sophisticated than Qdrant; above ~50M vectors you start tuning carefully.
Qdrant
Open-source vector database in Rust. Self-hosted or managed (managed Qdrant Cloud is not FedRAMP-authorized as of early 2026; self-hosted in GovCloud is the federal path). Excellent filtered ANN, payload support, and snapshot tooling.
Strengths: strong filtered search with complex boolean conditions on metadata; good performance at 100M+ vectors; clean API; mature client libraries.
Limits: separate service to operate, back up, and include in your ATO package; no native BM25 (hybrid is your orchestration responsibility).
Weaviate
Open-source vector DB with strong module ecosystem (built-in vectorizers, reranker integration). Self-hosted; managed cloud is not FedRAMP for federal use cases as of early 2026.
Strengths: GraphQL API, multi-tenancy, native hybrid with BM25, cross-references between objects.
Limits: steeper learning curve; the module ecosystem can pull you toward features that are nice to have but create lock-in.
Milvus
Open-source vector DB from Zilliz. Strong at massive scale. Self-hosted on Kubernetes is the federal path.
Strengths: designed for billion-vector workloads; GPU indexing options; strong community.
Limits: operational complexity (multiple components, object storage dependency); overkill for most federal corpora; K8s expertise required.
Amazon OpenSearch Service (k-NN)
Managed OpenSearch in AWS GovCloud with native k-NN vector search plus full BM25 and rich querying. FedRAMP High authorized.
Strengths: single engine for hybrid (vector + BM25 + filters); mature ops; same service for logs and vectors; familiar query DSL.
Limits: k-NN is solid but not best-in-class for pure ANN at very high QPS; indexing can be slow on large bulk loads; expect JVM tuning.
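For the "single engine for hybrid" point above, a minimal sketch of what an OpenSearch request body combining approximate k-NN with a BM25 match clause might look like. The field names (`embedding`, `body`) and the toy vector are illustrative assumptions, not from this article, and naively summing the two scores in a bool `should` is one simple option with known score-normalization caveats:

```python
def hybrid_query(query_vector, query_text, k=10):
    """Build an OpenSearch bool query that scores on both approximate
    k-NN similarity and BM25 relevance. Field names are assumptions."""
    return {
        "size": k,
        "query": {
            "bool": {
                "should": [
                    # Approximate k-NN clause against a knn_vector field.
                    {"knn": {"embedding": {"vector": query_vector, "k": k}}},
                    # Plain BM25 full-text clause against the document body.
                    {"match": {"body": {"query": query_text}}},
                ]
            }
        },
    }

request_body = hybrid_query([0.1, 0.2, 0.3], "continuity of operations plan")
```

In practice you would send `request_body` with an OpenSearch client's `search` call; the value here is that vector, keyword, and metadata clauses live in one request instead of two services.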
Azure AI Search
Managed search service in Azure Government with vector support, BM25, semantic ranker, and filtered search. FedRAMP High authorized.
Strengths: hybrid out of the box; semantic ranker adds a cross-encoder-style boost without self-hosting one; integrates tightly with Azure OpenAI for embeddings; low ops.
Limits: tier-based pricing that scales with storage and replicas; less control than a self-hosted option; index schema is less flexible than OpenSearch.
pgvector + ParadeDB / pg_search
Emerging pattern: pgvector for vectors plus ParadeDB or pg_search for high-quality BM25 inside Postgres. Single-database hybrid. Worth watching, especially for programs already committed to Postgres.
Comparison matrix
| Option | Authorization path | Comfort zone (vectors) | Native hybrid | Ops burden |
|---|---|---|---|---|
| pgvector on RDS/Azure PG | Inherits from managed Postgres | Up to ~50M | No (combine with FTS) | Low |
| Qdrant self-hosted | Your ATO + platform inheritance | 10M-500M+ | No (orchestrate BM25) | Medium |
| Weaviate self-hosted | Your ATO + platform inheritance | 10M-200M | Yes | Medium |
| Milvus self-hosted | Your ATO + platform inheritance | 50M-10B+ | Limited | High |
| OpenSearch managed | FedRAMP High (GovCloud) | 10M-500M | Yes | Low |
| Azure AI Search | FedRAMP High (Azure Gov) | 10M-200M | Yes (+ semantic ranker) | Very low |
Decision framework
Start with the question: am I already using Postgres?
If yes, default to pgvector until you have a measured reason to move. The operational simplicity is worth a lot. Most federal document-corpus workloads sit comfortably in pgvector territory.
Next: do I need native hybrid, or will I orchestrate it?
Native hybrid (OpenSearch, Azure AI Search, Weaviate) saves engineering effort. Orchestrated hybrid (pgvector + Postgres FTS, Qdrant + a separate BM25 service) offers more control but requires you to implement RRF or a similar fusion method yourself.
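The fusion step in the orchestrated path is small but easy to get subtly wrong. A minimal Reciprocal Rank Fusion (RRF) sketch, combining a vector-search ranking with a BM25 ranking; the `k=60` constant comes from the original RRF paper and the doc IDs are illustrative:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion over several ranked lists of doc IDs.

    A document's fused score is the sum of 1 / (k + rank) across every
    list it appears in, so documents that show up in both the ANN and
    BM25 results are rewarded without trusting either engine's raw scores.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# ANN and BM25 disagree on ordering; fusion promotes docs on both lists.
fused = rrf_fuse([["a", "b", "c"], ["b", "c", "d"]])
# fused == ["b", "c", "a", "d"]
```

Because RRF only uses ranks, it sidesteps the problem that cosine distances and BM25 scores live on incomparable scales.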
Next: where are my governance and boundary requirements?
If you are on Azure Government, Azure AI Search is the path of least resistance. If you are on AWS GovCloud, pgvector on RDS or OpenSearch is the path of least resistance. Self-hosted options (Qdrant, Weaviate, Milvus) are appropriate when a managed service does not meet a specific constraint or when you are in a classified enclave where managed services do not run.
Next: what is my scale and query pattern?
Honest numbers: most federal corpora we have worked with fit in 1M-20M vectors. A few fit in 50M-200M. A handful of crawl-and-retrieve workloads cross 500M. Size your choice to the actual corpus, not the hypothetical one.
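A back-of-envelope sizing check helps keep that honesty. This sketch computes raw float32 vector storage; the 1.5x HNSW graph-and-tombstone overhead factor is a rough assumption for planning, not a measured constant for any particular engine:

```python
def raw_vector_bytes(n_vectors, dims, bytes_per_dim=4):
    """Raw storage for float32 vectors, before any index overhead."""
    return n_vectors * dims * bytes_per_dim

def hnsw_estimate_gb(n_vectors, dims, overhead=1.5):
    """Very rough HNSW working-set estimate. The 1.5x overhead factor
    is an assumption; benchmark your own engine and parameters."""
    return raw_vector_bytes(n_vectors, dims) * overhead / 1e9

# 20M vectors at 1024 dims: ~82 GB raw, ~123 GB with the assumed overhead.
raw_gb = raw_vector_bytes(20_000_000, 1024) / 1e9
est_gb = hnsw_estimate_gb(20_000_000, 1024)
```

If the estimate fits comfortably in the memory of a managed Postgres instance, that is one more data point in favor of not adding a dedicated vector service.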
Benchmarks you actually need to run
Do not pick based on vendor benchmarks. Load a sample of your real corpus, generate embeddings with your real embedder, and measure:
- Recall@10 against a held-out ground-truth set of query/relevant-doc pairs from your corpus.
- Latency P50, P95, P99 at your target QPS.
- Indexing throughput for a full reindex, because you will do full reindexes.
- Filtered ANN with realistic metadata filters (classification, effective date, source).
- Hybrid quality if you plan to use native hybrid.
Expect surprises. We have seen pgvector outperform Qdrant on small corpora, Qdrant dominate at scale, and Azure AI Search close the gap on everything it is priced for. No vendor's marketing matches every workload.
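The first two measurements in that list can be sketched in a few lines; the example query, doc IDs, and latency samples are illustrative:

```python
import statistics

def recall_at_k(retrieved, relevant, k=10):
    """Fraction of ground-truth relevant docs that appear in the top-k."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant) if relevant else 0.0

def latency_percentiles(samples_ms):
    """P50/P95/P99 from a list of per-query latencies in milliseconds."""
    qs = statistics.quantiles(samples_ms, n=100)
    return {"p50": qs[49], "p95": qs[94], "p99": qs[98]}

# One query's result against its ground-truth pair: d1 and d3 found, d2 missed.
r = recall_at_k(["d1", "d9", "d3"], relevant=["d1", "d2", "d3"], k=10)

# Percentiles over a run of per-query latencies (here a synthetic ramp).
lat = latency_percentiles([float(x) for x in range(1, 101)])
```

Average recall@10 over the whole held-out query set, measured at your target QPS, is the number to compare across candidates.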
CUI, classification, and multi-tenancy
Vector stores must respect classification boundaries. Three patterns we use:
- Separate index per classification. Cleanest. Different authorization boundaries, different retrieval endpoints. Works for any vector store.
- Single index with filter-at-query. Simpler ops, thinner security boundary. Requires the filter to be applied server-side and verified from an authenticated identity, not client-side metadata.
- Separate tenant / collection per program. Qdrant collections, Weaviate tenants, OpenSearch indices. Good for multi-program consolidations on one platform.
For CUI we default to separate indices. The extra ops cost is small; the security clarity is worth it.
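For the filter-at-query pattern, the load-bearing detail is where the filter comes from. A minimal sketch with an in-memory stand-in for the index; the role names, clearance table, and `$in` filter shape are illustrative assumptions:

```python
# Map each authenticated role to the classification levels it may read.
# The filter is derived from the verified identity on the server; it is
# never accepted from, or overridable by, the client request.
ALLOWED = {
    "public": {"public"},
    "cui": {"public", "cui"},
}

def build_filter(authenticated_role):
    """Derive the metadata filter server-side from the verified identity."""
    levels = ALLOWED.get(authenticated_role)
    if levels is None:
        raise PermissionError(f"unknown role: {authenticated_role}")
    return {"classification": {"$in": sorted(levels)}}

def search(query_vector, authenticated_role, index):
    flt = build_filter(authenticated_role)  # client cannot override this
    allowed = set(flt["classification"]["$in"])
    # Stand-in for a filtered ANN call: filter applied with the query.
    return [doc for doc in index if doc["classification"] in allowed]

docs = [{"id": 1, "classification": "public"},
        {"id": 2, "classification": "cui"}]
public_hits = search([0.1], "public", docs)   # only the public doc
cui_hits = search([0.1], "cui", docs)         # both docs
```

The same shape translates to Qdrant payload filters, OpenSearch filter clauses, or Azure AI Search `$filter` expressions; what must not change is that the filter is computed server-side.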
Update cadence and reindexing
Corpora churn. Documents are added, superseded, redacted, or removed. Your vector store strategy must handle this without a nightly full rebuild (on large corpora) and without leaving ghost chunks in the index.
- Track chunk-to-document relationships in a system of record (a relational table). The vector store is a secondary index, not the source of truth.
- On document update, delete all chunks for that doc version and insert new chunks. Atomic where possible.
- Run a periodic reconciliation that compares the system of record against the vector index and repairs drift.
- Version the embedding model in the chunk metadata so you can do incremental re-embed when you change models.
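The reconciliation step above reduces to a set comparison between chunk IDs. A minimal sketch; in practice the two sets would come from a SQL query against the system of record and a scan of the vector index, and the IDs here are illustrative:

```python
def reconcile(system_of_record, vector_index):
    """Compare chunk IDs in the relational system of record against the
    vector index and report drift.

    Returns (ghosts, missing): ghost chunks exist only in the index and
    should be deleted; missing chunks exist only in the system of record
    and should be re-embedded and inserted.
    """
    ghosts = vector_index - system_of_record
    missing = system_of_record - vector_index
    return ghosts, missing

# c1 was added but never indexed; c4 belongs to a deleted document.
ghosts, missing = reconcile({"c1", "c2", "c3"}, {"c2", "c3", "c4"})
# ghosts == {"c4"}, missing == {"c1"}
```

Run on a schedule, this catches both orphan chunks from incomplete deletes and documents whose indexing job silently failed.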
Failure modes we have debugged
Orphan chunks after doc deletion
Delete path existed but did not include all index shards. Retrieval returned chunks from documents that no longer existed.
Silent embedding model change
SDK auto-updated to a new version; new vectors had different distributions; recall collapsed. Pin versions.
Missing filter enforcement
Classification filter was applied client-side. A bug in the client path bypassed it. Filters must be enforced server-side, derived from a signed, authenticated identity.
HNSW tuning surprise
Default efConstruction was too low for the corpus size; recall was bad for a week before anyone noticed, because the eval harness was not running.
Index bloat on frequent updates
pgvector HNSW can accumulate deleted-tombstone overhead under heavy churn. Schedule REINDEX.
Where this fits in our practice
We size and stand up vector stores as part of the full RAG platform. See our RAG architecture for the surrounding pipeline, and our MLOps on GovCloud for how we wire indexing into CI/CD.