CUI Handling for Federal AI Systems

What CUI is, and is not

Controlled Unclassified Information is federal information that requires safeguarding or dissemination controls but is not classified. It is the category that covers most of what a federal program generates: procurement-sensitive, law-enforcement-sensitive, tax-return information, export-controlled technical data, privacy-act-covered PII, PHI on federal systems, and much more. NARA maintains the official CUI Registry, currently listing 125+ categories organized under 20 groupings.

CUI CHANGES EVERYTHING

On contractor systems, Controlled Unclassified Information triggers NIST SP 800-171 compliance. AI systems that ingest, process, or output CUI, including federal health records, procurement data, and law enforcement information, must meet all 110 security requirements of 800-171 (Rev. 2) before first use.

CUI is not "secret lite." It is an administrative category. Losing control of CUI does not produce a classified spill, but it does produce an incident reportable under the applicable contract clause (DFARS 252.204-7012 for DoD, FAR 52.204-21 for general federal, agency clauses for many others).

If data came from a federal source and is not already public, assume it is CUI until a contracting officer tells you otherwise.

Basic vs Specified

Type	What it means	Examples
CUI Basic	General CUI. Standard 800-171 safeguarding. No additional legal overlay.	Procurement-sensitive, general law-enforcement, ordinary administrative records.
CUI Specified	CUI with additional specific safeguarding or dissemination controls mandated by law, regulation, or policy.	Tax return information (IRS Pub 1075), export-controlled (ITAR/EAR), nuclear-related, law-enforcement-sensitive (LES).

Basic obeys the baseline. Specified adds obligations on top — personnel clearance, encryption at specific levels, storage restrictions, transport constraints. An AI system touching tax return information inherits IRS Publication 1075 obligations that are meaningfully more stringent than Basic.

Marking requirements

The marking rules come from 32 CFR Part 2002 and NARA's CUI Marking Handbook. The core elements are:

Banner line on every page: starts with "CUI", optional category codes (e.g., "CUI//SP-PROCURE").
Portion marks for documents mixing CUI and non-CUI content.
Designation indicator: who designated, contact info, authority.
Decontrol schedule if the CUI has a finite control period.

For electronic systems, markings carry forward as metadata. Unmarked CUI is still CUI — unmarked is a finding, not a free pass.
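As a rough illustration of carrying a marking as metadata, the sketch below builds a banner line from category codes. The function name and its parameters are hypothetical, and the layout it assumes ("CUI", then "//", then categories separated by "/", with Specified categories prefixed "SP-" and listed first) should be checked against the current NARA marking handbook and your agency's policy before use.

```python
from typing import Iterable


def banner_line(specified: Iterable[str] = (), basic: Iterable[str] = ()) -> str:
    """Build a CUI banner string from category codes (hypothetical helper).

    Assumption: Specified categories take an "SP-" prefix and come first,
    categories are joined with "/", and the category block follows "CUI//".
    """
    categories = [f"SP-{code}" for code in sorted(specified)] + sorted(basic)
    if not categories:
        # Category codes are optional; the bare control marking still applies.
        return "CUI"
    return "CUI//" + "/".join(categories)


# Example: attach the banner to a record so it travels with the content.
record = {
    "doc_id": "example-001",  # illustrative identifier
    "banner": banner_line(specified=["PROCURE"]),
}
```

Storing the banner next to the content, rather than only in a rendered header, is what lets downstream components (retrieval, logging, export) act on it.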

NIST 800-171 is the floor

For contractor systems handling CUI, 800-171 is the baseline. For federal-agency-operated systems, the baseline is NIST 800-53 Rev 5 Moderate (or High, depending on categorization). FedRAMP authorizations sit on top of 800-53, so cloud services inherit that baseline.

Why most commercial LLM vendors are not CUI-safe

NOT CUI-safe

OpenAI (commercial API) — no FedRAMP
Anthropic (commercial API) — no FedRAMP
Google Gemini (consumer) — no 800-171
Hugging Face Inference API — no CUI controls

CUI-capable options

Amazon Bedrock (GovCloud) — FedRAMP High
Azure OpenAI (Gov) — FedRAMP High / IL4
Self-hosted open weights on FedRAMP IaaS
On-prem deployment inside cleared boundary

Commercial endpoints tend to fall short for the same reasons:

No FedRAMP authorization on the commercial service.
No NIST 800-171 attestation — commercial endpoints are not built against it.
Default terms allow prompts to be used for product improvement or training in some configurations.
Data residency cannot be guaranteed — commercial regions span multiple countries.
Vendor-side prompt logging is standard operational practice, not handled at CUI sensitivity.

The moment a developer pastes CUI into a commercial API, you have a CUI spill. On DoD contracts it is reportable under DFARS 252.204-7012 within 72 hours of discovery.
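One way to reduce the chance of that paste is a guard in the client code path that refuses to send anything that might be CUI to an endpoint you have not authorized. The sketch below is a minimal version under stated assumptions: `Endpoint`, `check_egress`, and `CuiEgressBlocked` are hypothetical names, and the `cui_authorized` flag is something your own authorization review sets. It is not read from any vendor API.

```python
from dataclasses import dataclass
from typing import Optional


class CuiEgressBlocked(Exception):
    """Raised when possible CUI is bound for an endpoint not authorized for it."""


@dataclass(frozen=True)
class Endpoint:
    name: str
    cui_authorized: bool  # result of your own authorization review (assumption)


def check_egress(endpoint: Endpoint, contains_cui: Optional[bool]) -> None:
    """Raise unless the payload may go to this endpoint.

    contains_cui is True, False, or None. None means nobody has made a
    determination, and is treated as CUI (assume CUI until told otherwise).
    """
    treat_as_cui = contains_cui is not False
    if treat_as_cui and not endpoint.cui_authorized:
        raise CuiEgressBlocked(
            f"refusing to send possible CUI to {endpoint.name}"
        )
```

A guard like this does not replace network-level controls. It only makes the default path fail closed when the sensitivity of a payload is unknown.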

What is safe

Service	CUI-safe for	Notes
Azure OpenAI (Azure Government)	CUI Basic, many Specified	FedRAMP High, IL4/IL5. No training on customer data.
Amazon Bedrock (AWS GovCloud)	CUI Basic, many Specified	FedRAMP High, IL4/IL5. Claude, Titan, select Meta/Mistral.
Google Vertex AI (Assured Workloads)	CUI Basic, many Specified	FedRAMP High, IL4/IL5. Gemini family.
Self-hosted open-weight on GovCloud	Most CUI, including some Specified	Inherit platform controls, own model-layer controls.
Specified with statutory overlay (IRS 1075, ITAR)	Check the specific statute	Beyond FedRAMP — need the specific overlay.

RAG, fine-tuning, and the forgotten CUI pathways

RAG retrieval from unauthorized stores

You harden the LLM endpoint, then connect it to a retrieval store containing CUI documents and let the model return content without respecting source-document classification. Retrieval must enforce source-document access controls at query time, not only at index time.
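A minimal sketch of query-time enforcement follows. It assumes a hypothetical `acl_lookup` callable that returns the groups currently allowed to read a source document; `Chunk` and `filter_at_query_time` are illustrative names, not part of any retrieval library.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Chunk:
    text: str
    source_id: str  # identifier of the document the chunk was cut from


def filter_at_query_time(
    candidates: Iterable[Chunk],
    user_groups: FrozenSet[str],
    acl_lookup: Callable[[str], FrozenSet[str]],
) -> List[Chunk]:
    """Keep only chunks whose source document the user may read right now."""
    allowed = []
    for chunk in candidates:
        try:
            # Consult the live ACL, not a copy frozen into the index.
            readers = acl_lookup(chunk.source_id)
        except KeyError:
            continue  # unknown source: fail closed
        if readers & user_groups:
            allowed.append(chunk)
    return allowed
```

The filter runs after vector search and before any text reaches the prompt, so a permission revoked after indexing still takes effect on the next query.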

Fine-tuning data leaking into model weights

If you fine-tune on CUI, the weights inherit the CUI sensitivity. The fine-tuned model is now a CUI artifact, so storage, transport, and inference all have to happen at the CUI level. You cannot export the weights to a commercial endpoint for "just the inference."
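The inheritance rule can be tracked mechanically in a model registry. The sketch below assumes each training dataset carries a set of CUI category labels and gives the resulting weights the union of them. The function name and the example labels are illustrative only.

```python
from typing import FrozenSet, Iterable


def inherited_categories(
    training_sets: Iterable[FrozenSet[str]],
) -> FrozenSet[str]:
    """Return the CUI categories a fine-tuned model inherits from its data.

    Assumption: each dataset is labelled with the CUI categories it contains
    (an empty set means no CUI). The weights inherit every label present.
    """
    result: FrozenSet[str] = frozenset()
    for labels in training_sets:
        result = result | labels
    return result


# Example: one public dataset plus two CUI-labelled datasets.
weights_labels = inherited_categories(
    [frozenset(), frozenset({"PROCURE"}), frozenset({"SP-TAX"})]
)
weights_are_cui = bool(weights_labels)  # any inherited label makes the weights CUI
```

Taking the union instead of a single "highest" level avoids assuming that categories can be ranked against each other, which matters once Specified overlays are involved.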

Prompt logs as CUI stores

Your audit log captures full prompts. Users will put CUI in prompts. Your log is now a CUI store and needs the same treatment: FIPS-validated crypto at rest, retention per the SSP, access limited to cleared staff, and a scheduled purge.
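The scheduled purge is the piece most easily expressed in code. The sketch below assumes log records are dictionaries with a timezone-aware `logged_at` timestamp and that the retention period comes from your SSP. Both the record shape and the function name are hypothetical, and encryption and access control are out of scope here.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def purge_expired(
    records: List[Dict],
    retention_days: int,
    now: Optional[datetime] = None,
) -> List[Dict]:
    """Return only the records still inside the retention window.

    Assumption: each record has a timezone-aware "logged_at" datetime.
    retention_days should come from the SSP, not from a code default.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if r["logged_at"] >= cutoff]
```

In a real system the dropped records would also need to be removed from backups and replicas on the schedule the SSP defines.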

The designation and decontrol problem

Agencies sometimes over-mark, stamping "CUI" on content that does not meet the criteria. They also sometimes under-mark, sending you unmarked documents with CUI-category content. The obligation runs to the information, not the marking. Unmarked CUI is still CUI. When in doubt, ask the contracting officer.

Marking is a service to the reader, not a determinant of the information's sensitivity. An unmarked email with procurement-sensitive content is still CUI.

Bottom line

CUI is the default data category in federal AI work. Mark it, handle under 800-171 for contractor systems or 800-53 for federal systems, and keep it off commercial LLM endpoints. Use FedRAMP-authorized services in Government regions or self-host on authorized infrastructure. Treat RAG stores, fine-tuned weights, and prompt logs as CUI artifacts when they carry CUI content.

Frequently asked questions

What is CUI?

Controlled Unclassified Information — federal information requiring safeguarding or dissemination controls under 32 CFR Part 2002. 125+ categories in the NARA CUI Registry, grouped into Basic and Specified.

What is the difference between CUI Basic and CUI Specified?

Basic follows baseline 800-171 or 800-53 controls. Specified carries additional statutory or regulatory obligations — tax return information (IRS Pub 1075), ITAR/EAR, law-enforcement-sensitive, nuclear-related.

Can I send CUI to commercial OpenAI or Claude?

No. Commercial endpoints are not FedRAMP-authorized and not built against 800-171. Sending CUI to them is a reportable incident. Use Azure OpenAI Government, Bedrock GovCloud, Vertex Assured Workloads, or self-hosted on authorized infrastructure.

Does CUI need to be marked?

Yes. Banner line, portion marks when mixing, designation indicator, decontrol schedule when applicable. 32 CFR Part 2002 and NARA's marking handbook are the source of truth.

Is a fine-tuned model weight a CUI artifact?

If fine-tuned on CUI, yes. The weights inherit the training-data sensitivity.

What happens if I spill CUI?

Report to the contracting officer and, for DFARS 7012 contracts, to DIBNet within 72 hours. Preserve affected media. Expect a damage assessment. Consequences range from stop-work orders to debarment.
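For incident runbooks it helps to compute the reporting deadline rather than estimate it. The sketch below assumes the 72-hour clock starts at the moment of discovery and works in UTC. The function name is illustrative, and the governing contract clause decides the actual start of the clock.

```python
from datetime import datetime, timedelta, timezone

REPORTING_WINDOW = timedelta(hours=72)


def reporting_deadline(discovered_at: datetime) -> datetime:
    """Return the UTC time by which the incident report is due.

    Assumption: the window runs from discovery. Naive datetimes are rejected
    so a local-time value is never silently read as UTC.
    """
    if discovered_at.tzinfo is None:
        raise ValueError("discovered_at must be timezone-aware")
    return discovered_at.astimezone(timezone.utc) + REPORTING_WINDOW
```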
