SBOM and software supply chain for federal AI

Federal contracting increasingly requires a Software Bill of Materials. For AI systems the inventory extends to model weights, training-data provenance, and inference-time tooling. Here is what to ship.

Why SBOM matters now

Executive Order 14028 (Improving the Nation's Cybersecurity, May 2021) directed NTIA to publish the minimum elements for an SBOM and NIST to issue secure software development guidance. NTIA's minimum SBOM elements (July 2021) and NIST's Secure Software Development Framework (SSDF, SP 800-218) followed. OMB M-22-18 and M-23-16 require agencies to collect secure-development attestations from software producers and allow them to request SBOMs in solicitations. For FedRAMP, SBOM is increasingly a standard ConMon deliverable. For contractors delivering AI systems, SBOM is now an expected artifact in the body of evidence.

SBOM IS NOW MANDATORY

Executive Order 14028 put SBOMs at the center of federal software procurement, and agencies increasingly require them by contract. For AI systems, the SBOM must cover model weights, training frameworks, and all dependencies, not just application code. ML-specific SBOM support (CycloneDX ML-BOM) is available, but adoption is still maturing.

SBOM is the lowest-effort supply-chain control that produces the most operational value. Teams that still argue about whether to generate one are having the wrong conversation.

SPDX vs CycloneDX

Two major machine-readable SBOM formats exist.

| Format | Origin | Strengths |
| --- | --- | --- |
| SPDX | Linux Foundation; ISO/IEC 5962:2021 standard | Mature tooling, strong in license compliance, broad adoption in enterprise. |
| CycloneDX | OWASP | Stronger vulnerability linkage (VEX), explicit ML-BOM extension, simpler schema. |

Both are accepted by federal consumers. CycloneDX is increasingly preferred for AI/ML workloads because of its explicit ML-BOM extension. SPDX remains dominant for traditional software. Many teams generate both.

Minimum NTIA elements

NTIA defined the minimum fields an SBOM must contain. Every component entry needs:

  • Supplier name
  • Component name
  • Component version
  • Other unique identifier (e.g., PURL, CPE)
  • Dependency relationships
  • Author of the SBOM entry
  • Timestamp

Modern SBOMs add more: hashes, license information, vulnerabilities (via VEX), and source URLs. For federal delivery, target NTIA-plus: the minimum plus hashes, licenses, and a CycloneDX VEX covering known-exploited-vulnerability status for any Critical/High CVEs.
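A minimal sketch of how the minimum fields could be checked automatically, assuming a CycloneDX-style JSON document already loaded into a Python dict. The mapping of NTIA fields onto the keys `supplier`, `purl`, `cpe`, `metadata.authors`, `metadata.timestamp`, and `dependencies` is this example's interpretation, and `missing_ntia_fields` is a hypothetical helper rather than part of any library.

```python
from typing import Any, Dict, List


def missing_ntia_fields(bom: Dict[str, Any]) -> List[str]:
    """Return human-readable gaps against the NTIA minimum data fields."""
    gaps: List[str] = []
    metadata = bom.get("metadata", {})

    # Document-level fields: who produced the SBOM data, and when.
    if not metadata.get("authors"):
        gaps.append("SBOM author")
    if not metadata.get("timestamp"):
        gaps.append("timestamp")

    # This sketch expects dependency relationships in a top-level list.
    if not bom.get("dependencies"):
        gaps.append("dependency relationships")

    # Component-level fields.
    for comp in bom.get("components", []):
        label = comp.get("name") or comp.get("bom-ref") or "<unnamed>"
        if not comp.get("name"):
            gaps.append(f"{label}: component name")
        if not comp.get("version"):
            gaps.append(f"{label}: component version")
        if not (comp.get("supplier") or {}).get("name"):
            gaps.append(f"{label}: supplier name")
        if not (comp.get("purl") or comp.get("cpe")):
            gaps.append(f"{label}: unique identifier (PURL or CPE)")
    return gaps


# Illustrative data only: the second component lacks a supplier and an identifier.
example_bom = {
    "metadata": {
        "timestamp": "2024-01-15T12:00:00Z",
        "authors": [{"name": "Example Build System"}],
    },
    "components": [
        {
            "name": "requests",
            "version": "2.31.0",
            "supplier": {"name": "Python Software Foundation"},
            "purl": "pkg:pypi/requests@2.31.0",
        },
        {"name": "numpy", "version": "1.26.4"},
    ],
    "dependencies": [{"ref": "pkg:pypi/requests@2.31.0", "dependsOn": []}],
}

for gap in missing_ntia_fields(example_bom):
    print(gap)
```

A check like this fits naturally as a CI gate: fail the build when the returned list is non-empty.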

ML-specific supply-chain elements

Traditional SBOM covers code. For AI systems you also need to inventory:

| Element | Why it matters | How to record |
| --- | --- | --- |
| Model weights | Provenance, fine-tuning lineage, classification inheritance | CycloneDX ML-BOM or custom component entry with model hash, version, base-model reference, training-data reference. |
| Training data | Licensing, PII/CUI exposure, bias source | Dataset name, version, hash, license, classification, source provenance. |
| Fine-tuning data | Same as training; sensitivity travels into weights | Same schema as training. |
| Inference tooling | Quantization, serving stack, tokenizers | Standard SBOM entries for llama.cpp, vLLM, transformers, etc. |
| Retrieval indexes | Source-document provenance for RAG | Index build pipeline version, source corpus reference, embedding model used. |
| Prompt templates and system prompts | Behavioral configuration, traceable to change management | Versioned in git, hashed, referenced in SBOM. |
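As a rough illustration of the model-weights row, the sketch below builds a CycloneDX-style component entry of type `machine-learning-model` with a SHA-256 hash and lineage references. The model name, base-model name, dataset reference, and the `example:` property names are all hypothetical; a real ML-BOM would follow whatever property taxonomy your program agrees on.

```python
import hashlib
import json
from typing import Any, Dict


def model_component(
    name: str,
    version: str,
    weights: bytes,
    base_model: str,
    training_data_ref: str,
) -> Dict[str, Any]:
    """Build a component entry describing one set of model weights."""
    digest = hashlib.sha256(weights).hexdigest()
    return {
        "type": "machine-learning-model",
        "bom-ref": f"model:{name}@{version}",
        "name": name,
        "version": version,
        "hashes": [{"alg": "SHA-256", "content": digest}],
        # Generic name/value properties carry the lineage references.
        # The "example:" names are placeholders, not a registered taxonomy.
        "properties": [
            {"name": "example:base-model", "value": base_model},
            {"name": "example:training-data", "value": training_data_ref},
        ],
    }


# Stand-in bytes; in practice, hash the real weights file in chunks.
fake_weights = b"not real model weights"

component = model_component(
    name="example-classifier",
    version="1.2.0",
    weights=fake_weights,
    base_model="example-org/base-7b",
    training_data_ref="dataset:example-corpus@2024.01",
)

ml_bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [component],
}

print(json.dumps(ml_bom, indent=2))
```

Training and fine-tuning datasets can be recorded the same way as separate components and cross-referenced by `bom-ref`.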

How to generate SBOMs in practice

You do not hand-author SBOMs. You generate them from the build and deploy pipelines.

SBOM Generation and Maintenance Pipeline

  1. Source scan (syft / cyclonedx-python). Stage: build CI.
  2. Container image scan (syft on final image). Stage: image build.
  3. ML-BOM metadata from model registry. Stage: model push.
  4. Vulnerability enrichment (grype / VEX). Stage: post-scan.
  5. SBOM signing (cosign / Sigstore). Stage: attestation.
  6. KEV delta monitoring and ConMon reporting. Stage: monthly.

Language-ecosystem tools

Use syft (Anchore) for general containers and filesystems, and cyclonedx-python, cyclonedx-maven, or cyclonedx-node-npm for language-specific builds.

Container images

Run syft on the final image, scan the resulting SBOM with grype for vulnerabilities, and emit VEX statements for the findings.

ML artifacts

CycloneDX ML-BOM fields populated from model-registry metadata (Hugging Face, internal registries).

Signing

Use cosign (part of the Sigstore project) for SBOM attestation, and sign with keys managed inside your authorization boundary.

The output belongs alongside the artifact it describes. A container image without its SBOM is incomplete. A deployed model without its ML-BOM is incomplete.
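The generate, scan, and sign steps can be chained in a build job roughly as sketched below. The code only assembles and prints the command lines; it does not run them. The image name, SBOM path, and key reference are placeholders, and the flag spellings reflect this sketch's understanding of recent syft, grype, and cosign releases, so check them against the versions you have installed.

```python
import shlex
from typing import List


def sbom_pipeline_commands(
    image: str,
    sbom_path: str = "sbom.cdx.json",
    key_ref: str = "cosign.key",
) -> List[List[str]]:
    """Return the argv lists for generate -> scan -> attest, in order."""
    return [
        # 1. Generate a CycloneDX JSON SBOM from the final image.
        ["syft", image, "-o", f"cyclonedx-json={sbom_path}"],
        # 2. Scan the SBOM (not the image) for known vulnerabilities.
        ["grype", f"sbom:{sbom_path}", "-o", "json"],
        # 3. Attach the SBOM to the image as a signed attestation.
        [
            "cosign", "attest",
            "--key", key_ref,
            "--predicate", sbom_path,
            "--type", "cyclonedx",
            image,
        ],
    ]


# Hypothetical image reference, for illustration only.
commands = sbom_pipeline_commands("registry.example.gov/app:1.4.2")
for cmd in commands:
    print(shlex.join(cmd))
```

In a real job each list would be passed to `subprocess.run(cmd, check=True)` so that a failing step stops the build.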

Known-exploited-vulnerability posture

CISA publishes the Known Exploited Vulnerabilities (KEV) catalog. Federal agencies use KEV to prioritize remediation. For any SBOM you deliver, you should be able to answer:

  • Does any component map to a KEV CVE?
  • For each KEV, is the vulnerable code path reachable in your deployment?
  • What is the remediation plan?
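The first question above can be answered mechanically. The sketch below cross-references the `vulnerabilities` section of a CycloneDX-style SBOM against a KEV catalog loaded from CISA's JSON feed, assuming each catalog entry carries a `cveID` field. `kev_hits` is a hypothetical helper, and `CVE-2099-0001` is a made-up placeholder identifier.

```python
from typing import Any, Dict, List


def kev_hits(bom: Dict[str, Any], kev_catalog: Dict[str, Any]) -> List[Dict[str, Any]]:
    """List SBOM vulnerabilities whose CVE ID appears in the KEV catalog."""
    kev_ids = {entry["cveID"] for entry in kev_catalog.get("vulnerabilities", [])}
    hits = []
    for vuln in bom.get("vulnerabilities", []):
        if vuln.get("id") in kev_ids:
            affected = [a.get("ref") for a in vuln.get("affects", [])]
            hits.append({"cve": vuln["id"], "components": affected})
    return hits


# Illustrative inputs; a real run would load both documents from disk.
sample_bom = {
    "vulnerabilities": [
        {
            "id": "CVE-2021-44228",
            "affects": [
                {"ref": "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1"}
            ],
        },
        {
            "id": "CVE-2099-0001",  # placeholder, not a real CVE
            "affects": [{"ref": "pkg:pypi/example-lib@1.0.0"}],
        },
    ]
}
sample_kev = {"vulnerabilities": [{"cveID": "CVE-2021-44228"}]}

for hit in kev_hits(sample_bom, sample_kev):
    print(hit["cve"], "->", ", ".join(hit["components"]))
```

Reachability and remediation planning still need human analysis; this only narrows the list to entries that deserve it first.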

CycloneDX VEX is a widely used format for expressing "this CVE is present in the component but not exploitable in this deployment because..." statements. Federal consumers value a VEX-annotated SBOM over a raw SBOM because it reduces the manual triage burden.
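For illustration, a single "present but not exploitable" statement might be assembled as below, using the `analysis` object from CycloneDX's vulnerability schema. The `vex_statement` helper and the detail text are hypothetical, and the list of states is reproduced from the CycloneDX schema as this sketch understands it.

```python
from typing import Any, Dict, Optional

# Analysis states defined by the CycloneDX vulnerability schema.
VALID_STATES = {
    "resolved",
    "resolved_with_pedigree",
    "exploitable",
    "in_triage",
    "false_positive",
    "not_affected",
}


def vex_statement(
    cve: str,
    component_ref: str,
    state: str,
    justification: Optional[str] = None,
    detail: str = "",
) -> Dict[str, Any]:
    """Build one CycloneDX-style vulnerability entry with a VEX analysis."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown analysis state: {state!r}")
    analysis: Dict[str, Any] = {"state": state, "detail": detail}
    if justification:
        analysis["justification"] = justification
    return {
        "id": cve,
        "affects": [{"ref": component_ref}],
        "analysis": analysis,
    }


statement = vex_statement(
    cve="CVE-2021-44228",
    component_ref="pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1",
    state="not_affected",
    justification="code_not_reachable",
    detail="Illustrative text: JNDI lookup is never invoked in this deployment.",
)
print(statement["analysis"])
```

The `detail` field is where the "because..." reasoning goes, so write it for the reviewer who has to accept or reject the claim.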

Three practical patterns

1. SBOM-in-CI

Generate SBOMs on every build, sign them, attach them to the artifact registry. Do not treat SBOM generation as a compliance afterthought — it is a build step.

2. Continuous VEX

Run vulnerability scans against SBOMs on a schedule. When a new CVE lands, regenerate the VEX. Federal consumers value fresh VEX data over historical static scans.
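One way to decide when a VEX needs regenerating is to diff the findings of the latest scheduled scan against the previous one. The sketch below assumes each scan has already been reduced to a set of (CVE, component) pairs; `scan_delta` and the sample identifiers are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Finding = Tuple[str, str]  # (CVE ID, component reference)


def scan_delta(previous: Set[Finding], current: Set[Finding]) -> Dict[str, List[Finding]]:
    """Split findings into newly appeared and no-longer-present."""
    return {
        "new": sorted(current - previous),
        "cleared": sorted(previous - current),
    }


# Placeholder findings for illustration only.
last_week = {
    ("CVE-2099-0001", "pkg:pypi/example-lib@1.0.0"),
    ("CVE-2099-0002", "pkg:pypi/other-lib@2.3.0"),
}
this_week = {
    ("CVE-2099-0002", "pkg:pypi/other-lib@2.3.0"),
    ("CVE-2099-0003", "pkg:pypi/example-lib@1.0.0"),
}

delta = scan_delta(last_week, this_week)
if delta["new"]:
    print("VEX needs new statements for:", delta["new"])
if delta["cleared"]:
    print("Findings no longer reported:", delta["cleared"])
```

Entries under "new" need a fresh analysis; entries under "cleared" are candidates for a resolved status once the fix is confirmed.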

3. ML-BOM at model-register time

When a new model version is added to the internal registry, generate the ML-BOM as part of the registration workflow. Base-model reference, training-data hash, fine-tuning lineage, evaluation results — all attached at registration.

The SBOM is not the deliverable. The SBOM pipeline is the deliverable. A one-time SBOM is worthless next month.

Common mistakes

  • Generating an SBOM once at contract award and never refreshing it.
  • Omitting transitive dependencies — Python's transitive graph alone can be hundreds of entries.
  • Missing container base-image components.
  • No ML-BOM at all for AI systems.
  • No signing — an unsigned SBOM is trivially forgeable.
  • No VEX — handing consumers a raw SBOM and expecting them to triage.
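The first mistake in the list is easy to catch automatically. As a rough sketch, the check below reads `metadata.timestamp` from a CycloneDX-style SBOM and flags it when older than a threshold. The 30-day default is an assumption chosen to line up with a monthly reporting cycle, and `sbom_is_stale` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def sbom_is_stale(
    bom: Dict[str, Any],
    max_age_days: int = 30,
    now: Optional[datetime] = None,
) -> bool:
    """Return True when the SBOM has no timestamp or is older than the limit."""
    raw = bom.get("metadata", {}).get("timestamp")
    if not raw:
        return True  # an undated SBOM cannot be shown to be fresh
    # Accept a trailing "Z" on Python versions whose parser rejects it.
    generated = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if generated.tzinfo is None:
        generated = generated.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now - generated > timedelta(days=max_age_days)


# Fixed reference time so the example is reproducible.
reference = datetime(2024, 3, 1, tzinfo=timezone.utc)
old_bom = {"metadata": {"timestamp": "2024-01-01T00:00:00Z"}}
fresh_bom = {"metadata": {"timestamp": "2024-02-20T00:00:00Z"}}

print("old SBOM stale:", sbom_is_stale(old_bom, now=reference))
print("fresh SBOM stale:", sbom_is_stale(fresh_bom, now=reference))
```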

Bottom line

SBOM is a federal expectation. CycloneDX with the ML-BOM extension is the right default for AI systems. Generate in CI, sign with cosign (Sigstore), annotate with VEX, and regenerate continuously. For AI specifically, inventory model weights, training data, fine-tuning lineage, and retrieval indexes. Treat the SBOM pipeline as a product, not a paperwork step.

Frequently asked questions

Is SBOM required for federal contracts?

Increasingly, yes. EO 14028 and OMB M-22-18 and M-23-16 established the direction. FedRAMP ConMon is moving toward SBOM-as-standard. Many agency contracts now specify SBOM delivery directly.

SPDX or CycloneDX?

Both are accepted. CycloneDX is increasingly preferred for AI/ML due to the ML-BOM extension and stronger VEX integration. SPDX is standardized as ISO/IEC 5962:2021. Many teams generate both.

What are the NTIA minimum SBOM elements?

Supplier name, component name, component version, unique identifier (PURL or CPE), dependency relationships, SBOM author, and timestamp. Most real SBOMs include hashes, licenses, and vulnerabilities on top.

Do I need to include model weights in the SBOM?

Yes, for AI systems. CycloneDX ML-BOM or custom components with model hash, version, base-model reference, and training-data lineage.

What is VEX?

Vulnerability Exploitability eXchange — a way to annotate an SBOM with statements like 'CVE-X is present but not exploitable in this deployment because...'. Reduces triage burden for consumers.

How do I sign an SBOM?

Use cosign (part of the Sigstore project), with keys managed inside your authorization boundary. Attach the signed SBOM to the artifact registry alongside the artifact it describes.
