CUI handling for federal AI systems

Controlled Unclassified Information is the default data category in federal AI work. Getting it wrong is a reportable incident. This guide covers the categories, the marking rules, and which model endpoints can actually take CUI.

What CUI is, and is not

Controlled Unclassified Information is federal information that requires safeguarding or dissemination controls but is not classified. It is the category that covers most of what a federal program generates: procurement-sensitive, law-enforcement-sensitive, tax-return information, export-controlled technical data, privacy-act-covered PII, PHI on federal systems, and much more. NARA maintains the official CUI Registry, currently listing 125+ categories organized under 20 groupings.

CUI CHANGES EVERYTHING

On contractor systems, Controlled Unclassified Information triggers NIST SP 800-171 compliance. AI systems that ingest, process, or output CUI, including federal health records, procurement data, and law enforcement information, must meet all 110 requirements of 800-171 Rev 2 before first use.

CUI is not "secret lite." It is an administrative category. Losing control of CUI does not produce a classified spill, but it does produce an incident reportable under the applicable contract clause: DFARS 252.204-7012 for DoD, agency-specific clauses for many others. FAR 52.204-21 sets basic safeguarding for federal contract information in general, but does not itself carry an incident-reporting requirement.

If data came from a federal source and is not already public, assume it is CUI until a contracting officer tells you otherwise.

Basic vs Specified

| Type | What it means | Examples |
|---|---|---|
| CUI Basic | General CUI. Standard 800-171 safeguarding. No additional legal overlay. | Procurement-sensitive, general law-enforcement, ordinary administrative records. |
| CUI Specified | CUI with additional specific safeguarding or dissemination controls mandated by law, regulation, or policy. | Tax return information (IRS Pub 1075), export-controlled (ITAR/EAR), nuclear-related, law-enforcement-sensitive (LES). |

Basic obeys the baseline. Specified adds obligations on top — personnel clearance, encryption at specific levels, storage restrictions, transport constraints. An AI system touching tax return information inherits IRS Publication 1075 obligations that are meaningfully more stringent than Basic.

Marking requirements

The rules live in 32 CFR Part 2002 and NARA's CUI Marking Handbook. Each CUI document needs:

  • Banner line on every page: starts with "CUI", optional category codes (e.g., "CUI//SP-PROCURE").
  • Portion marks for documents mixing CUI and non-CUI content.
  • Designation indicator: who designated, contact info, authority.
  • Decontrol schedule if the CUI has a finite control period.

For electronic systems, markings carry forward as metadata. Unmarked CUI is still CUI — unmarked is a finding, not a free pass.
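As a rough illustration of markings carried as metadata, the sketch below assembles a banner line and attaches it to a record. `build_banner` and `MarkedRecord` are hypothetical names, not part of any NARA tooling. The ordering rule used here (Specified categories first with an `SP-` prefix, then Basic, each group alphabetized) reflects our reading of the marking handbook and should be checked against the current edition. `PRVCY` is used as an assumed Basic category code, and limited dissemination controls are left out.

```python
from dataclasses import dataclass
from typing import Optional


def build_banner(specified=(), basic=()):
    """Assemble a banner such as 'CUI//SP-PROCURE' (ordering rule is an assumption)."""
    # Specified categories get the SP- prefix and are listed before Basic ones.
    categories = [f"SP-{code}" for code in sorted(specified)] + sorted(basic)
    if not categories:
        return "CUI"
    return "CUI//" + "/".join(categories)


@dataclass(frozen=True)
class MarkedRecord:
    """Hypothetical record type: the marking travels with the content."""
    body: str
    banner: str
    designated_by: str                  # designation indicator
    decontrol_on: Optional[str] = None  # ISO date, only if control is finite


record = MarkedRecord(
    body="Source selection notes ...",
    banner=build_banner(specified=["PROCURE"]),
    designated_by="Hypothetical Agency, Contracts Office",
)
print(record.banner)  # CUI//SP-PROCURE
```

Keeping the banner on the record itself, instead of in a separate lookup table, means any downstream consumer (index, log, export job) sees the marking without an extra join.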

NIST 800-171 is the floor

For contractor systems handling CUI, 800-171 is the baseline. For federal-agency-operated systems, the baseline is NIST 800-53 Rev 5 Moderate (or High, depending on categorization). FedRAMP authorizations sit on top of 800-53, so cloud services inherit that baseline.

Why most commercial LLM vendors are not CUI-safe

NOT CUI-safe
  • OpenAI (commercial API) — no FedRAMP
  • Anthropic (commercial API) — no FedRAMP
  • Google Gemini (consumer) — no 800-171
  • Hugging Face Inference API — no CUI controls
CUI-capable options
  • Amazon Bedrock (GovCloud) — FedRAMP High
  • Azure OpenAI (Gov) — FedRAMP High / IL4
  • Self-hosted open weights on FedRAMP IaaS
  • On-prem deployment inside cleared boundary

Commercial endpoints fall short for these reasons:

  • No FedRAMP authorization on the commercial service.
  • No NIST 800-171 attestation — commercial endpoints are not built against it.
  • Default terms allow prompts to be used for product improvement or training in some configurations.
  • Data residency cannot be guaranteed — commercial regions span multiple countries.
  • Vendor-side prompt logging is standard operational practice, not handled at CUI sensitivity.

The moment a developer pastes CUI into a commercial API, you have a CUI spill. On a DoD contract, it is reportable under DFARS 252.204-7012 within 72 hours of discovery.
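One way to make that mistake harder is an egress check in front of every model call. The following is a minimal sketch, not a complete control: `check_egress`, `CuiEgressBlocked`, and the hostname in the allowlist are all hypothetical, and a real allowlist would come from the endpoints named in your system security plan. The default treats every payload as possibly containing CUI.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the endpoints authorized in your SSP.
CUI_AUTHORIZED_HOSTS = {
    "llm.enclave.example.gov",
}


class CuiEgressBlocked(Exception):
    """Raised when a request that may carry CUI targets an unlisted host."""


def check_egress(url: str, may_contain_cui: bool = True) -> str:
    """Return the target host, or raise if CUI could leave the boundary."""
    host = (urlparse(url).hostname or "").lower()
    if may_contain_cui and host not in CUI_AUTHORIZED_HOSTS:
        raise CuiEgressBlocked(f"{host or url!r} is not an authorized CUI endpoint")
    return host


check_egress("https://llm.enclave.example.gov/v1/chat")  # passes
try:
    check_egress("https://api.example.com/v1/chat")
except CuiEgressBlocked as err:
    print("blocked:", err)
```

An application-level check like this complements network egress controls; it does not replace them.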

What is safe

| Service | CUI-safe for | Notes |
|---|---|---|
| Azure OpenAI (Azure Government) | CUI Basic, many Specified | FedRAMP High, IL4/IL5. No training on customer data. |
| Amazon Bedrock (AWS GovCloud) | CUI Basic, many Specified | FedRAMP High, IL4/IL5. Claude, Titan, select Meta/Mistral. |
| Google Vertex AI (Assured Workloads) | CUI Basic, many Specified | FedRAMP High, IL4/IL5. Gemini family. |
| Self-hosted open-weight on GovCloud | Most CUI, including some Specified | Inherit platform controls, own model-layer controls. |
| Specified with statutory overlay (IRS 1075, ITAR) | Check the specific statute | Beyond FedRAMP; need the specific overlay. |

RAG, fine-tuning, and the forgotten CUI pathways

RAG retrieval from unauthorized stores

A common failure: you harden the LLM endpoint, then connect it to a retrieval store containing CUI documents and let the model return content without checking who may see each source document. Retrieval must enforce source-document access controls and CUI markings at query time, not only at index time.
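A query-time check can be sketched as a filter between the retriever and the prompt builder. The `Chunk` type, its `allowed_groups` field, and `authorized_chunks` are hypothetical; the assumption is that each chunk carries the access list of its source document and that the caller's group membership is resolved on every request. A chunk with no allowed groups is dropped, so the filter fails closed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to read the source document


def authorized_chunks(candidates, user_groups):
    """Keep only chunks whose source document the caller may read right now."""
    groups = frozenset(user_groups)
    # Empty intersection (including an empty allowed_groups) means no access.
    return [c for c in candidates if c.allowed_groups & groups]


retrieved = [
    Chunk("doc-1", "Bid evaluation notes", frozenset({"contracts"})),
    Chunk("doc-2", "Public press release", frozenset({"contracts", "staff"})),
    Chunk("doc-3", "Unlabeled import", frozenset()),
]
visible = authorized_chunks(retrieved, user_groups={"staff"})
print([c.doc_id for c in visible])  # ['doc-2']
```

Because the check runs per request, a user removed from a group loses access immediately, without waiting for a re-index.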

Fine-tuning data leaking into model weights

If you fine-tune on CUI, the weights inherit the CUI sensitivity. The fine-tuned model is now a CUI artifact. Storage, transport, inference all at the CUI level. You cannot export the weights to a commercial endpoint for "just the inference."
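The inheritance rule can be expressed as "a derived artifact is at least as sensitive as its most sensitive input." The sketch below assumes a simple three-level ordering, which is a simplification: real Specified categories carry distinct overlays that do not rank neatly, so a production lineage tracker would likely carry the full set of category codes. `Sensitivity` and `derived_sensitivity` are hypothetical names.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    # Simplified ordering, assumed for illustration only.
    PUBLIC = 0
    CUI_BASIC = 1
    CUI_SPECIFIED = 2


def derived_sensitivity(inputs):
    """Label for an artifact built from the given inputs.

    Raises ValueError when no inputs are declared, so an artifact
    with unknown lineage never gets a label by default.
    """
    return max(inputs)


base_model = Sensitivity.PUBLIC        # open-weight checkpoint
training_set = Sensitivity.CUI_BASIC   # fine-tuning corpus containing CUI
fine_tuned = derived_sensitivity([base_model, training_set])
print(fine_tuned.name)  # CUI_BASIC
```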

Prompt logs as CUI stores

Your audit log captures full prompts. Users will put CUI in prompts. Your log is now a CUI store — FIPS-validated crypto at rest, retention per SSP, cleared-staff access, scheduled purge.
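The scheduled-purge part of that list can be sketched as follows. This covers retention only; encryption at rest and access control are out of scope here. `purge_expired` and the record shape (a dict with a timezone-aware `logged_at`) are assumptions, and the 7-day window is illustrative, since the actual period is whatever the SSP specifies.

```python
from datetime import datetime, timedelta, timezone


def purge_expired(records, retention_days, now=None):
    """Return only the log records still inside the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if r["logged_at"] >= cutoff]


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
log = [
    {"id": "a", "logged_at": datetime(2025, 1, 1, tzinfo=timezone.utc)},
    {"id": "b", "logged_at": datetime(2025, 1, 30, tzinfo=timezone.utc)},
]
kept = purge_expired(log, retention_days=7, now=now)
print([r["id"] for r in kept])  # ['b']
```

Passing `now` explicitly keeps the function testable and lets the purge job record exactly which cutoff it applied.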

The designation and decontrol problem

Agencies sometimes over-mark, stamping "CUI" on content that does not meet the criteria. They also sometimes under-mark, sending you unmarked documents with CUI-category content. The obligation runs to the information, not the marking. Unmarked CUI is still CUI. When in doubt, ask the contracting officer.

Marking is a service to the reader, not a determinant of the information's sensitivity. An unmarked email with procurement-sensitive content is still CUI.

Bottom line

CUI is the default data category in federal AI work. Mark it, handle under 800-171 for contractor systems or 800-53 for federal systems, and keep it off commercial LLM endpoints. Use FedRAMP-authorized services in Government regions or self-host on authorized infrastructure. Treat RAG stores, fine-tuned weights, and prompt logs as CUI artifacts when they carry CUI content.

Frequently asked questions

What is CUI?

Controlled Unclassified Information is federal information requiring safeguarding or dissemination controls under 32 CFR Part 2002. The NARA CUI Registry lists 125+ categories under 20 groupings, and each category is handled as either Basic or Specified.

What is the difference between CUI Basic and CUI Specified?

Basic follows baseline 800-171 or 800-53 controls. Specified carries additional statutory or regulatory obligations — tax return information (IRS Pub 1075), ITAR/EAR, law-enforcement-sensitive, nuclear-related.

Can I send CUI to commercial OpenAI or Claude?

No. Commercial endpoints are not FedRAMP-authorized and not built against 800-171. Sending CUI to them is a reportable incident. Use Azure OpenAI Government, Bedrock GovCloud, Vertex Assured Workloads, or self-hosted on authorized infrastructure.

Does CUI need to be marked?

Yes. Banner line, portion marks when mixing, designation indicator, decontrol schedule when applicable. 32 CFR Part 2002 and NARA's marking handbook are the source of truth.

Is a fine-tuned model weight a CUI artifact?

If fine-tuned on CUI, yes. The weights inherit the training-data sensitivity.

What happens if I spill CUI?

Report to the contracting officer and, for DFARS 7012 contracts, to DIBNet within 72 hours. Preserve affected media. Expect a damage assessment. Consequences range from stop-work orders to debarment.
