What classified AI actually means
Most AI vendors have never shipped a system that runs on a classified network. Classified AI is not a different algorithm — it is a different delivery model. The model weights, the training data, the prompt logs, the gradient updates, the evaluation harness, the developer's laptop, the CI runner, the observability stack, and the human in the loop all live inside a classified enclave. There is no outbound internet. There is no pip install at runtime. There is no hosted LLM API. There is no screenshot leaving the room. Every dependency must be brought in through an accredited transfer path, scanned for malware, and approved by the information system security manager. This changes how you build, test, deploy, and monitor an AI system in ways that only show up after you try it.
Precision Federal builds classified AI the way the Intelligence Community and Defense mission owners actually consume it: open-weight large language models running on accredited GPU clusters inside SCIFs, retrieval systems that honor clearance-based data labels, inference stacks that emit classified audit logs to the enclave SIEM, and ATO packages that survive a Special Access Program assessor. The founder's prior federal delivery includes SCIF-adjacent work at Harmonia (Harmonia Holdings) and deep experience with FedRAMP High and DoD IL5 patterns that translate directly into Secret and TS/SCI design.
Why this matters federally: the agencies with the most valuable AI use cases — IC analytic tradecraft, DoD targeting workflows, counterintelligence triage, cleared law enforcement collection — cannot use commercial SaaS LLMs at all. They need builders who can run the same modern AI stack behind the green door, and they need those builders to understand that a missing sanitization step is a spillage, not a bug.
The classified AI stack we use
Every component below was selected because it runs inside a closed enclave without phoning home, supports offline license activation, and produces auditable artifacts.
- Open-weight LLMs: Llama 3.1 (8B/70B/405B), Mistral Large, Mixtral 8x22B, Qwen 2.5, DeepSeek-V3, Phi-4. Weights imported via accredited one-way transfer, SHA-256 verified, scanned, and registered in a classified model registry.
- Inference runtimes: vLLM, TensorRT-LLM, llama.cpp, and Text Generation Inference, all usable fully offline. NVIDIA Triton Inference Server when multiple model frameworks must share the same GPU pool.
- Fine-tuning: LoRA and QLoRA via PEFT, full-parameter SFT on DGX-class hardware, DPO/KTO for preference alignment. Training logs stay inside the enclave; no W&B cloud.
- Retrieval and RAG: self-hosted vector stores (Qdrant, Milvus, Weaviate, pgvector). Classification labels attached at the chunk level; retrieval filter enforces clearance-based access before the LLM ever sees a document.
- Classified MLOps: MLflow self-hosted, DVC on classified S3-equivalent, Airflow or Argo Workflows for orchestration. See our MLOps page for the unclassified baseline.
- Red team and eval: Garak, PyRIT, lm-evaluation-harness, custom mission-specific evals. Adversarial probes stored in the enclave; findings routed to the ISSM.
- GPU platforms: NVIDIA DGX, HPE Cray, and Supermicro GPU nodes, often on DISA-accredited hosting or SCIF-resident compute. Kubernetes via OpenShift, or vanilla Kubernetes with a STIG-hardened control plane.
- Guardrails: NVIDIA NeMo Guardrails, Llama Guard, classification-aware output filters built to the agency's marking standard (CAPCO).
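The import step for open weights can be sketched in a few lines. This is a minimal illustration of streaming SHA-256 verification against a manifest shipped with the transfer; the file names, manifest format, and `verify_import` helper are hypothetical, not the actual registry tooling.

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a large weights file through SHA-256 without loading it all."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def verify_import(weights_dir: Path, manifest_path: Path) -> list[str]:
    """Compare every transferred file against the manifest digest.

    Returns the names of files that failed verification; an empty list
    means the artifact can proceed to scanning and registration.
    """
    # Manifest shape (illustrative): {"model.safetensors": "<hex digest>", ...}
    manifest = json.loads(manifest_path.read_text())
    failures = []
    for name, expected in manifest.items():
        if sha256_of(weights_dir / name) != expected:
            failures.append(name)
    return failures
```

The point of streaming in fixed-size chunks is that 405B-class checkpoints never fit in memory; verification cost is one sequential read per file.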
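The chunk-level clearance filtering described in the retrieval bullet can be illustrated vendor-neutrally. This is a simplified sketch with an invented `Chunk` type and a four-level hierarchy; a real deployment pushes the same filter down into the vector store's query (Qdrant, Milvus, Weaviate, and pgvector all support metadata filtering) so over-classified chunks never leave the database.

```python
from dataclasses import dataclass

# Ordered clearance hierarchy; a higher index means more sensitive.
LEVELS = ["UNCLASSIFIED", "CONFIDENTIAL", "SECRET", "TOP SECRET"]

@dataclass
class Chunk:
    text: str
    classification: str  # label attached at ingest time
    score: float         # similarity score from the vector search

def releasable(chunk: Chunk, user_clearance: str) -> bool:
    """A chunk is releasable only at or below the user's clearance."""
    return LEVELS.index(chunk.classification) <= LEVELS.index(user_clearance)

def filter_for_user(candidates: list[Chunk], user_clearance: str, k: int = 5) -> list[Chunk]:
    """Drop over-classified chunks BEFORE prompt assembly, then take top-k.

    The filter sits on the retrieval side, so a Secret user's prompt can
    never contain TS text regardless of what the similarity search returned.
    """
    allowed = [c for c in candidates if releasable(c, user_clearance)]
    return sorted(allowed, key=lambda c: c.score, reverse=True)[:k]
```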
Multi-level security (MLS) design
When users at different clearance levels share an AI system, leakage is the first failure mode. We design retrieval, prompting, and logging so a Secret user never sees TS content and a TS user sees correctly marked output. Controls include per-document classification labels, user-attribute filtering on the retriever, mandatory access control in the application layer, output classification that defaults to the highest input level, and CAPCO-compliant banners on every response. For cross-domain flows we integrate with accredited cross-domain solutions (Forcepoint, Owl, Arbit) rather than rolling our own guard.
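The "highest input level" default is a high-water-mark rule. A minimal sketch, assuming a simple ordered hierarchy and a deliberately simplified banner; real markings include caveats and dissemination controls and must follow the agency's CAPCO guidance, not this two-line wrapper.

```python
LEVELS = ["UNCLASSIFIED", "CONFIDENTIAL", "SECRET", "TOP SECRET"]

def output_marking(input_markings: list[str]) -> str:
    """Default the response marking to the highest classification of any
    input that contributed to it (the high-water mark)."""
    return max(input_markings, key=LEVELS.index)

def banner(text: str, marking: str) -> str:
    """Wrap a response in top and bottom marking lines (simplified; real
    banners carry full CAPCO marking strings)."""
    return f"{marking}\n\n{text}\n\n{marking}"
```

Defaulting up is deliberate: a downgrade requires a human decision, while an automatic upgrade can never cause spillage.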
Federal deployment considerations
Classified AI engagements almost never start as greenfield. The accrediting authority already has opinions about where the enclave lives, who can touch it, and what an acceptable change management process looks like. Our delivery pattern:
- Boundary first: the authorization boundary is drawn before the first line of code. Every LLM, retriever, ingestion pipeline, and logging component is mapped to it.
- Classification of weights: open-weight models arrive unclassified; fine-tunes on classified data are classified derivatively, at the level of the data that produced them. The model registry enforces this.
- ATO path: we align to the agency's assessment body — DCSA, DIA SAP office, NSA CSfC, or service-specific accrediting authorities — and reuse control inheritance from the hosting enclave wherever possible. See ATO engineering.
- ITAR and EAR: training data and model artifacts that touch defense articles or dual-use technologies are treated as controlled. Export control officers review before any cross-boundary transfer.
- Cleared ops: only cleared personnel touch classified artifacts. Development on unclassified benches uses synthetic data of equivalent structure, and promotion to the enclave follows the accredited transfer path.
Where this fits in Precision Federal engagements
Classified AI sits at the top of the sensitivity pyramid for our federal work. It layers on top of generative AI, RAG systems, and MLOps, and inherits authorization patterns from ATO engineering and cybersecurity. Typical engagements: stand up a sanitized open-weight LLM inside a mission enclave, harden a retrieval pipeline against spillage, deliver an ATO package for a TS/SCI AI application, or embed a cleared engineer for sustained mission AI delivery.