Vector database selection for federal AI systems.

April 3, 2026 · 14 min read · pgvector, Milvus, Weaviate, Qdrant, OpenSearch, Azure AI Search — honest comparison on FedRAMP status, cost, and scaling.

Why this choice matters more than people think

The vector database is the memory of every RAG system. Get it wrong and you either overpay, bottleneck at P95, or carry an operational burden that chews through staff hours every month. Get it right and it recedes into the background like Postgres itself — present, reliable, not a topic of conversation. Federal programs tend to over-engineer this layer because the commercial marketing implies you need a dedicated vector database for anything serious. You usually do not.

This post walks the real options in 2026, what each is good at, what each is bad at, and how to pick without listening to the vendor that most wants your business.

Short version. Default to pgvector on managed Postgres if you are already using Postgres. Move to Qdrant self-hosted, Azure AI Search, or Amazon OpenSearch when you have specific scale or feature requirements pgvector does not meet.

[Figure: Vector DB comparison — compliance, speed at 1M vectors, self-hosted support]

The contenders in 2026

pgvector

An extension to PostgreSQL that adds vector columns, distance operators, and HNSW / IVFFlat indexes. Available on RDS for PostgreSQL in GovCloud, Azure Database for PostgreSQL in Azure Government, and Aurora PostgreSQL. Zero new service to authorize — inherits from the Postgres authorization.

Strengths: you are already running Postgres; metadata and vectors in one database; transactional semantics when you need them; no separate operational surface.

Limits: HNSW memory footprint grows with vector count; filtered ANN is less sophisticated than Qdrant; above ~50M vectors you start tuning carefully.
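The pgvector access pattern is plain SQL, which is most of its appeal. The sketch below shows the shape of an HNSW index definition and a filtered KNN query as SQL strings; table and column names (`chunks`, `embedding`, `classification`) are illustrative, not a prescribed schema, and no live database is assumed.

```python
# Hypothetical schema: chunks(doc_id, chunk_text, classification, embedding vector(N)).
CREATE_INDEX_SQL = """
CREATE INDEX ON chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 200);
"""

def knn_query(k: int = 10) -> str:
    # <=> is pgvector's cosine-distance operator (<-> is L2, <#> is
    # negative inner product). Metadata filters ride along as ordinary
    # SQL predicates — vectors and metadata in one database.
    return f"""
SELECT doc_id, chunk_text, embedding <=> %(query_vec)s AS distance
FROM chunks
WHERE classification = %(max_class)s
ORDER BY embedding <=> %(query_vec)s
LIMIT {k};
""".strip()
```

Raising `ef_construction` above the default trades indexing time for recall, which matters once the corpus grows past a few million vectors.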

Qdrant

Open-source vector database in Rust. Self-hosted or managed (managed Qdrant Cloud is not FedRAMP-authorized as of early 2026; self-hosted in GovCloud is the federal path). Excellent filtered-ANN, payload support, and snapshot tooling.

Strengths: strong filtered search with complex boolean conditions on metadata; good performance at 100M+ vectors; clean API; mature client libraries.

Limits: separate service to operate, back up, and include in your ATO package; no native BM25 (hybrid is your orchestration responsibility).

Weaviate

Open-source vector DB with strong module ecosystem (built-in vectorizers, reranker integration). Self-hosted; managed cloud is not FedRAMP for federal use cases as of early 2026.

Strengths: GraphQL API, multi-tenancy, native hybrid with BM25, cross-references between objects.

Limits: steeper learning curve; the module ecosystem can pull you toward features that are nice to have but create lock-in.

Milvus

Open-source vector DB from Zilliz. Strong at massive scale. Self-hosted on Kubernetes is the federal path.

Strengths: designed for billion-vector workloads; GPU indexing options; strong community.

Limits: operational complexity (multiple components, object storage dependency); overkill for most federal corpora; K8s expertise required.

Amazon OpenSearch Service (k-NN)

Managed OpenSearch in AWS GovCloud with native k-NN vector search plus full BM25 and rich querying. FedRAMP High authorized.

Strengths: single engine for hybrid (vector + BM25 + filters); mature ops; same service for logs and vectors; familiar query DSL.

Limits: k-NN is solid but not best-in-class for pure ANN at very high QPS; indexing can be slow on large bulk loads; JVM tuning is on you.

Azure AI Search

Managed search service in Azure Government with vector support, BM25, semantic ranker, and filtered search. FedRAMP High authorized.

Strengths: hybrid out of the box; semantic ranker adds a cross-encoder-style boost without self-hosting one; integrates tightly with Azure OpenAI for embeddings; low ops.

Limits: tier-based pricing that scales with storage and replicas; less control than a self-hosted option; index schema is less flexible than OpenSearch.

pgvector + ParadeDB / pg_search

Emerging pattern: pgvector for vectors plus ParadeDB or pg_search for high-quality BM25 inside Postgres. Single-database hybrid. Worth watching, especially for programs already committed to Postgres.

Comparison matrix

| Option | Authorization path | Comfort zone (vectors) | Native hybrid | Ops burden |
| --- | --- | --- | --- | --- |
| pgvector on RDS/Azure PG | Inherits from managed Postgres | Up to ~50M | No (combine with FTS) | Low |
| Qdrant self-hosted | Your ATO + platform inheritance | 10M-500M+ | No (orchestrate BM25) | Medium |
| Weaviate self-hosted | Your ATO + platform inheritance | 10M-200M | Yes | Medium |
| Milvus self-hosted | Your ATO + platform inheritance | 50M-10B+ | Limited | High |
| OpenSearch managed | FedRAMP High (GovCloud) | 10M-500M | Yes | Low |
| Azure AI Search | FedRAMP High (Azure Gov) | 10M-200M | Yes (+ semantic ranker) | Very low |

Decision framework

Start with the question: am I already using Postgres?

If yes, default to pgvector until you have a measured reason to move. The operational simplicity is worth a lot. Most federal document-corpus workloads sit comfortably in pgvector territory.

Next: do I need native hybrid, or will I orchestrate it?

Native hybrid (OpenSearch, Azure AI Search, Weaviate) saves engineering effort. Orchestrated hybrid (pgvector + Postgres FTS, Qdrant + a separate BM25 service) offers more control but requires you to implement RRF or a similar fusion method yourself.
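If you orchestrate hybrid yourself, the fusion step is small. A minimal Reciprocal Rank Fusion sketch, assuming each retriever returns an ordered list of doc IDs (the IDs here are placeholders):

```python
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: merge several ranked lists of doc IDs.

    Each list contributes 1 / (k + rank) per document; k=60 is the
    constant from the original RRF paper and usually works untuned.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Fuse a dense (vector) ranking with a sparse (BM25) ranking:
dense = ["d3", "d1", "d7", "d2"]
sparse = ["d1", "d9", "d3"]
print(rrf_fuse([dense, sparse]))  # d1 and d3 rise to the top
```

The appeal of RRF is that it needs no score normalization across engines, which is exactly the problem when one ranking comes from cosine distance and the other from BM25.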

Next: where are my governance and boundary requirements?

If you are on Azure Government, Azure AI Search is the path of least resistance. If you are on AWS GovCloud, pgvector on RDS or OpenSearch is the path of least resistance. Self-hosted options (Qdrant, Weaviate, Milvus) are appropriate when a managed service does not meet a specific constraint or when you are in a classified enclave where managed services do not run.

Next: what is my scale and query pattern?

Honest numbers: most federal corpora we have worked with fit in 1M-20M vectors. A few fit in 50M-200M. A handful of crawl-and-retrieve workloads cross 500M. Size your choice to the actual corpus, not the hypothetical one.

Benchmarks you actually need to run

Do not pick based on vendor benchmarks. Load a sample of your real corpus, generate embeddings with your real embedder, and measure:

  • Recall@10 against a held-out ground-truth set of query/relevant-doc pairs from your corpus.
  • Latency P50, P95, P99 at your target QPS.
  • Indexing throughput for a full reindex, because you will do full reindexes.
  • Filtered ANN with realistic metadata filters (classification, effective date, source).
  • Hybrid quality if you plan to use native hybrid.

Expect surprises. We have seen pgvector outperform Qdrant on small corpora, Qdrant dominate at scale, and Azure AI Search close the gap on everything it is priced for. No vendor's marketing matches every workload.

The best vector database on your program is the one you can run correctly for the next five years without a dedicated person. Everything else is a distant second.

CUI, classification, and multi-tenancy

Vector stores must respect classification boundaries. Three patterns we use:

  • Separate index per classification. Cleanest. Different authorization boundaries, different retrieval endpoints. Works for any vector store.
  • Single index with filter-at-query. Simpler ops, thinner security boundary. Requires the filter to be applied server-side and verified from an authenticated identity, not client-side metadata.
  • Separate tenant / collection per program. Qdrant collections, Weaviate tenants, OpenSearch indices. Good for multi-program consolidations on one platform.

For CUI we default to separate indices. The extra ops cost is small; the security clarity is worth it.
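For the filter-at-query pattern, the key property is that the filter is derived server-side from the verified identity and never accepted from the client. A sketch under assumed names (`User`, the clearance ladder, and the `$in` filter shape are illustrative, not any specific product's API):

```python
from dataclasses import dataclass

CLEARANCE_ORDER = ["PUBLIC", "CUI"]  # ordered low to high; extend as needed

@dataclass(frozen=True)
class User:
    subject: str    # from the verified auth token, not a request parameter
    clearance: str

def allowed_classifications(user: User) -> list[str]:
    """Everything at or below the caller's clearance."""
    cutoff = CLEARANCE_ORDER.index(user.clearance)
    return CLEARANCE_ORDER[: cutoff + 1]

def build_server_side_filter(user: User) -> dict:
    # The vector store only ever sees this filter. Any filter supplied
    # by the client is ignored, so a buggy client cannot widen scope.
    return {"classification": {"$in": allowed_classifications(user)}}

print(build_server_side_filter(User("alice@agency.gov", "CUI")))
# {'classification': {'$in': ['PUBLIC', 'CUI']}}
```

The same construction maps onto Qdrant payload filters, OpenSearch query clauses, or a pgvector WHERE predicate; what matters is where the filter is built, not which engine applies it.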

Update cadence and reindexing

Corpora churn. Documents are added, superseded, redacted, or removed. Your vector store strategy must handle this without a nightly full rebuild (on large corpora) and without leaving ghost chunks in the index.

  • Track chunk-to-document relationships in a system of record (a relational table). The vector store is a secondary index, not the source of truth.
  • On document update, delete all chunks for that doc version and insert new chunks. Atomic where possible.
  • Run a periodic reconciliation that compares the system of record against the vector index and repairs drift.
  • Version the embedding model in the chunk metadata so you can do incremental re-embed when you change models.
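The reconciliation step reduces to set arithmetic over chunk IDs and doc-version hashes. A sketch with both stores stubbed as in-memory dicts (real code would page through the actual stores; repairs are reported, not executed):

```python
def reconcile(system_of_record: dict[str, str],
              vector_index: dict[str, str]) -> dict[str, list[str]]:
    """Compare chunk_id -> doc_version_hash maps; report drift to repair."""
    sor_ids, idx_ids = set(system_of_record), set(vector_index)
    return {
        "insert": sorted(sor_ids - idx_ids),             # missing from index
        "delete": sorted(idx_ids - sor_ids),             # ghost chunks
        "reembed": sorted(c for c in sor_ids & idx_ids   # stale doc version
                          if system_of_record[c] != vector_index[c]),
    }

sor = {"c1": "v2", "c2": "v1", "c3": "v1"}   # source of truth
idx = {"c1": "v1", "c2": "v1", "c4": "v9"}   # what the vector store holds
print(reconcile(sor, idx))
# {'insert': ['c3'], 'delete': ['c4'], 'reembed': ['c1']}
```

Run this on a schedule; the "delete" bucket is exactly the ghost-chunk failure mode described above, caught before a user sees it.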

Failure modes we have debugged

Orphan chunks after doc deletion

Delete path existed but did not include all index shards. Retrieval returned chunks from documents that no longer existed.

Silent embedding model change

SDK auto-updated to a new version; new vectors had different distributions; recall collapsed. Pin versions.
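Pinning is cheap to enforce at write time: refuse vectors whose model ID or dimension does not match what the index was built with. The index-metadata shape and model name below are assumptions for illustration, not any specific product's API:

```python
def check_embedding_compat(index_meta: dict, model_id: str,
                           vector: list[float]) -> None:
    """Raise before writing a vector produced by the wrong model."""
    if index_meta["model_id"] != model_id:
        raise ValueError(
            f"index built with {index_meta['model_id']!r}, got {model_id!r}; "
            "re-embed the corpus before mixing models"
        )
    if index_meta["dim"] != len(vector):
        raise ValueError(f"expected dim {index_meta['dim']}, got {len(vector)}")

meta = {"model_id": "text-embed-v1", "dim": 4}        # hypothetical pinned model
check_embedding_compat(meta, "text-embed-v1", [0.1, 0.2, 0.3, 0.4])  # ok
```

A dimension check alone is not enough: two model versions can share a dimension while producing incompatible distributions, which is exactly the silent failure described above.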

Missing filter enforcement

Classification filter was applied client-side. A bug in the client path bypassed it. Filters must be server-side and signed.

HNSW tuning surprise

Default efConstruction was too low for the corpus size; recall was bad for a week before anyone noticed, because the eval harness was not running.

Index bloat on frequent updates

pgvector HNSW can accumulate deleted-tombstone overhead under heavy churn. Schedule REINDEX.

Where this fits in our practice

We size and stand up vector stores as part of the full RAG platform. See our RAG architecture for the surrounding pipeline, and our MLOps on GovCloud for how we wire indexing into CI/CD.

FAQ

Is pgvector enough for federal production workloads?
For the vast majority of federal document-corpus workloads, yes. RDS for PostgreSQL in GovCloud and Azure Database for PostgreSQL in Azure Government both support pgvector with HNSW indexes. It scales to tens of millions of vectors with good latency and avoids operating a separate service.
When do you outgrow pgvector?
Around 50M-100M vectors, or when P95 latency under 50ms at high QPS becomes a hard requirement, or when you need rich filtered-ANN semantics that go beyond what pgvector provides. At that point Qdrant or Milvus self-hosted, or Azure AI Search managed, become more attractive.
Does Milvus work in GovCloud?
Milvus is self-hosted open-source software. It runs in GovCloud on EKS or EC2 the same way any container workload does. There is no FedRAMP authorization on Milvus itself because it is software you deploy, not a service. The authorization inheritance comes from the underlying AWS GovCloud platform plus your own ATO package.
What is hybrid search and why does federal need it?
Hybrid search combines dense vector similarity with sparse keyword search (typically BM25) and fuses the results. Federal documents have high-signal exact tokens (case numbers, statute citations, contract numbers, proper nouns) that vector search smooths over. Hybrid recovers those.
Should I use Azure AI Search or OpenSearch?
Azure AI Search is simpler, offers semantic ranking out of the box, and is a good fit for Azure Government programs. OpenSearch is a better fit if you are already on AWS GovCloud and want full control over scoring, analyzers, and plugins. Both support native hybrid.
How much does vector storage cost in federal cloud?
The dominant cost is the compute hosting the index, not storage. A 10M-vector pgvector index fits on a $300-500/month RDS instance. A 100M-vector Qdrant cluster in GovCloud runs $3K-8K/month in EC2. Managed services like Azure AI Search scale by tier and query volume; budget $1K-10K/month for production.

Related insights

Picking the right vector store for a federal AI program?

We benchmark vector databases against your corpus, your query mix, and your authorization boundary, and help you choose without vendor lock-in.