What CUI is, and is not
Controlled Unclassified Information is federal information that requires safeguarding or dissemination controls but is not classified. It is the category that covers most of what a federal program generates: procurement-sensitive and law-enforcement-sensitive material, tax return information, export-controlled technical data, Privacy Act-covered PII, PHI on federal systems, and much more. NARA maintains the official CUI Registry, which currently lists 125+ categories organized under 20 groupings.
On contractor systems, Controlled Unclassified Information triggers NIST SP 800-171 compliance. AI systems that ingest, process, or output CUI, including federal health records, procurement data, and law enforcement information, must meet all 800-171 requirements (110 in Revision 2) before first use.
CUI is not "secret lite." It is an administrative category. Losing control of CUI does not produce a classified spill, but it does produce an incident reportable under the applicable contract clause (DFARS 252.204-7012 for DoD, FAR 52.204-21 for general federal, agency clauses for many others).
Basic vs Specified

| Type | What it means | Examples |
|---|---|---|
| CUI Basic | General CUI. Standard 800-171 safeguarding. No additional legal overlay. | Procurement-sensitive, general law-enforcement, ordinary administrative records. |
| CUI Specified | CUI with additional specific safeguarding or dissemination controls mandated by law, regulation, or policy. | Tax return information (IRS Pub 1075), export-controlled (ITAR/EAR), nuclear-related, law-enforcement-sensitive (LES). |
Basic obeys the baseline. Specified adds obligations on top — personnel clearance, encryption at specific levels, storage restrictions, transport constraints. An AI system touching tax return information inherits IRS Publication 1075 obligations that are meaningfully more stringent than Basic.
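The "baseline plus overlay" relationship can be sketched as a set union. This is a minimal illustration, not an authoritative control catalog: the category keys, the control labels, and the `required_controls` helper are all hypothetical names chosen for the example.

```python
# Hypothetical labels: real obligations come from the governing authority.
BASELINE = frozenset({"800-171 safeguarding"})

# Illustrative mapping from a Specified category to its overlay (assumed names).
OVERLAYS = {
    "tax-return-information": frozenset({"IRS Pub 1075"}),
    "export-controlled": frozenset({"ITAR/EAR"}),
}


def required_controls(categories):
    """Return the baseline plus every overlay triggered by the categories."""
    controls = set(BASELINE)
    for category in categories:
        # Basic categories have no entry here, so they add nothing.
        controls |= OVERLAYS.get(category, frozenset())
    return controls
```

A system that touches only Basic categories resolves to the baseline alone. Adding one Specified category pulls in its overlay without removing anything from the baseline.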
Marking requirements
The marking rules come from 32 CFR Part 2002 and NARA's CUI Marking Handbook:
- Banner line on every page: starts with "CUI", optional category codes (e.g., "CUI//SP-PROCURE").
- Portion marks for documents mixing CUI and non-CUI content.
- Designation indicator: who designated, contact info, authority.
- Decontrol schedule if the CUI has a finite control period.
For electronic systems, markings carry forward as metadata. Unmarked CUI is still CUI — unmarked is a finding, not a free pass.
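One way to carry markings forward as metadata is to store the banner and designation fields next to each document record. The sketch below is an assumption-laden illustration: the field names and both helper functions are hypothetical, and the single-slash separator between multiple category codes should be checked against the marking handbook before use.

```python
def banner_line(category_codes=()):
    """Build a banner such as 'CUI' or 'CUI//SP-PROCURE'."""
    if not category_codes:
        return "CUI"
    # Assumption: multiple category codes are joined with a single slash.
    return "CUI//" + "/".join(category_codes)


def marking_metadata(category_codes, designated_by, decontrol_on=None):
    """Return a metadata record to store alongside an electronic document."""
    return {
        "banner": banner_line(category_codes),
        "designated_by": designated_by,  # designation indicator
        "decontrol_on": decontrol_on,    # None when no finite control period
    }
```

For example, `banner_line(["SP-PROCURE"])` yields the banner used in the list above, and `marking_metadata` keeps the designation indicator and the decontrol date attached to the record.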
NIST 800-171 is the floor
For contractor systems handling CUI, 800-171 is the baseline. For federal-agency-operated systems, the baseline is NIST 800-53 Rev 5 Moderate (or High, depending on categorization). FedRAMP authorizations sit on top of 800-53, so cloud services inherit that baseline.
Why most commercial LLM vendors are not CUI-safe
Not authorized for CUI:
- OpenAI (commercial API): no FedRAMP
- Anthropic (commercial API): no FedRAMP
- Google Gemini (consumer): no 800-171
- Hugging Face Inference API: no CUI controls
Options that can be authorized for CUI:
- Amazon Bedrock (GovCloud): FedRAMP High
- Azure OpenAI (Gov): FedRAMP High / IL4
- Self-hosted open weights on FedRAMP IaaS
- On-prem deployment inside cleared boundary
Commercial endpoints fall short for the following reasons:
- No FedRAMP authorization on the commercial service.
- No NIST 800-171 attestation — commercial endpoints are not built against it.
- Default terms allow prompts to be used for product improvement or training in some configurations.
- Data residency cannot be guaranteed — commercial regions span multiple countries.
- Vendor-side prompt logging is standard operational practice, not handled at CUI sensitivity.
The moment a developer pastes CUI into a commercial API, you have a CUI spill. On a DoD contract, it is reportable under DFARS 252.204-7012 within 72 hours of discovery.
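The 72-hour window is simple arithmetic, but it is worth computing from a timezone-aware timestamp so that nobody miscounts across a weekend. A minimal sketch, where `report_deadline` is a hypothetical helper and the clock is assumed to start at discovery:

```python
from datetime import datetime, timedelta, timezone

REPORTING_WINDOW = timedelta(hours=72)


def report_deadline(discovered_at):
    """Return the latest time a report can be filed for an incident."""
    if discovered_at.tzinfo is None:
        raise ValueError("discovered_at must be timezone-aware")
    return discovered_at + REPORTING_WINDOW


discovered = datetime(2025, 3, 3, 9, 0, tzinfo=timezone.utc)
print(report_deadline(discovered))  # 2025-03-06 09:00:00+00:00
```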
What is safe
| Service | CUI-safe for | Notes |
|---|---|---|
| Azure OpenAI (Azure Government) | CUI Basic, many Specified | FedRAMP High, IL4/IL5. No training on customer data. |
| Amazon Bedrock (AWS GovCloud) | CUI Basic, many Specified | FedRAMP High, IL4/IL5. Claude, Titan, select Meta/Mistral. |
| Google Vertex AI (Assured Workloads) | CUI Basic, many Specified | FedRAMP High, IL4/IL5. Gemini family. |
| Self-hosted open-weight on GovCloud | Most CUI, including some Specified | Inherit platform controls, own model-layer controls. |
| Specified with statutory overlay (IRS 1075, ITAR) | Check the specific statute | Beyond FedRAMP — need the specific overlay. |
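The table can be turned into a guard in application code, so a request that carries CUI is refused unless it targets an endpoint the program has approved. The sketch below is illustrative only: the endpoint identifiers, `CuiRoutingError`, and `select_endpoint` are hypothetical names, and the real allowlist should come from the system's own authorization documentation.

```python
# Hypothetical identifiers for endpoints the program has approved for CUI.
AUTHORIZED_FOR_CUI = frozenset({
    "azure-openai-gov",
    "bedrock-govcloud",
    "vertex-assured-workloads",
    "self-hosted-govcloud",
})


class CuiRoutingError(Exception):
    """Raised when CUI would be sent to an endpoint not approved for it."""


def select_endpoint(endpoint_id, contains_cui):
    """Return endpoint_id if the request may go there, otherwise raise."""
    if contains_cui and endpoint_id not in AUTHORIZED_FOR_CUI:
        raise CuiRoutingError(f"{endpoint_id} is not approved for CUI")
    return endpoint_id
```

Failing closed is a deliberate choice here: an endpoint missing from the allowlist is treated as unapproved, so a new integration cannot receive CUI by accident.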
RAG, fine-tuning, and the forgotten CUI pathways
RAG retrieval from unauthorized stores
You harden the LLM endpoint, then connect it to a retrieval store containing CUI documents and let the model return content without respecting source-document classification. Retrieval must enforce source-document access controls at query time, not only at index time.
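Query-time enforcement can be sketched as a filter between the retriever and the prompt builder. The `Chunk` type, its `allowed_groups` field, and `authorized_chunks` are assumptions made for this example. A real system would read the access list from the source document's authoritative store rather than from a copy frozen into the index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source_id: str
    allowed_groups: frozenset  # groups permitted to read the source document


def authorized_chunks(candidates, user_groups):
    """Keep only chunks whose source document the caller may read."""
    user_groups = set(user_groups)
    # A chunk passes when the user belongs to at least one allowed group.
    return [c for c in candidates if c.allowed_groups & user_groups]
```

Only the chunks returned by `authorized_chunks` should reach the model's context. Anything filtered out never appears in the prompt, so the model cannot repeat it.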
Fine-tuning data leaking into model weights
If you fine-tune on CUI, the weights inherit the CUI sensitivity. The fine-tuned model is now a CUI artifact. Storage, transport, inference all at the CUI level. You cannot export the weights to a commercial endpoint for "just the inference."
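Inheritance of sensitivity can be tracked as a maximum over everything that went into the model. In this sketch the `Sensitivity` levels and their ordering (Specified above Basic) are an assumption for illustration, as is the `model_sensitivity` helper.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    CUI_BASIC = 1
    CUI_SPECIFIED = 2  # assumed to rank above Basic for handling purposes


def model_sensitivity(base_model, training_sets):
    """A fine-tuned model is at least as sensitive as its most sensitive input."""
    return max([base_model, *training_sets])
```

A public base model fine-tuned on a single CUI Basic dataset therefore resolves to `Sensitivity.CUI_BASIC`, and the weights should be stored and served accordingly.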
Prompt logs as CUI stores
Your audit log captures full prompts. Users will put CUI in prompts. Your log is now a CUI store and needs the same treatment: FIPS-validated encryption at rest, retention per the System Security Plan (SSP), access limited to cleared staff, and a scheduled purge.
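The scheduled purge can be sketched as a retention filter over log records. The 365-day period, the record shape (a dict with a `logged_at` timestamp), and `purge_expired` are placeholders for this example. The actual retention period is whatever the SSP specifies.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=365)  # placeholder; use the period in the SSP


def purge_expired(records, now, retention=RETENTION):
    """Return only the records that are still inside the retention period."""
    cutoff = now - retention
    return [r for r in records if r["logged_at"] >= cutoff]
```

A scheduled job would call `purge_expired(records, datetime.now(timezone.utc))` and then securely delete whatever was dropped. The deletion step depends on the storage layer and is not shown.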
The designation and decontrol problem
Agencies sometimes over-mark, stamping "CUI" on content that does not meet the criteria. They also sometimes under-mark, sending you unmarked documents with CUI-category content. The obligation runs to the information, not the marking. Unmarked CUI is still CUI. When in doubt, ask the contracting officer.
Bottom line
CUI is the default data category in federal AI work. Mark it, handle under 800-171 for contractor systems or 800-53 for federal systems, and keep it off commercial LLM endpoints. Use FedRAMP-authorized services in Government regions or self-host on authorized infrastructure. Treat RAG stores, fine-tuned weights, and prompt logs as CUI artifacts when they carry CUI content.
Frequently asked questions
What is CUI?
Controlled Unclassified Information: federal information requiring safeguarding or dissemination controls under 32 CFR Part 2002. The NARA CUI Registry lists 125+ categories, grouped into Basic and Specified.
What is the difference between CUI Basic and CUI Specified?
Basic follows baseline 800-171 or 800-53 controls. Specified carries additional statutory or regulatory obligations, for example tax return information (IRS Pub 1075), ITAR/EAR, law-enforcement-sensitive, and nuclear-related information.
Can CUI be sent to a commercial LLM API?
No. Commercial endpoints are not FedRAMP-authorized and not built against 800-171. Sending CUI to them is a reportable incident. Use Azure OpenAI Government, Bedrock GovCloud, Vertex Assured Workloads, or a self-hosted model on authorized infrastructure.
Does CUI have to be marked?
Yes. Use a banner line, portion marks when mixing CUI and non-CUI content, a designation indicator, and a decontrol schedule when applicable. 32 CFR Part 2002 and NARA's marking handbook are the source of truth.
Is a fine-tuned model itself CUI?
If fine-tuned on CUI, yes. The weights inherit the training-data sensitivity.
What should you do after a CUI spill?
Report to the contracting officer and, for DFARS 7012 contracts, to DIBNet within 72 hours. Preserve affected media. Expect a damage assessment. Consequences range from stop-work orders to debarment.