Skip to main content
Policy

The White House National AI Policy Framework (March 2026) and what it means for SBIR offerors

The March 2026 framework set federal AI acquisition, deployment, and governance direction. For SBIR Phase I, II, and III offerors the practical effect is a new layer of evidence the technical volume is expected to produce — safety testing, provenance, continuous monitoring, red-teaming, and inventory alignment.

A new policy baseline, not a new compliance regime

On March 20, 2026, the White House issued the National AI Policy Framework, consolidating direction to federal agencies on how AI systems are to be acquired, deployed, monitored, and retired. For SBIR offerors, the framework does not create a new compliance regime from scratch — it pulls together existing obligations under NIST AI RMF 1.0, NIST SP 800-53 Rev 5, NIST SP 800-37 RMF, and OMB M-24-10, and raises the bar on how those obligations are documented at the proposal stage. The practical effect: Phase I technical volumes, Phase II continuous-monitoring plans, and Phase III transition narratives all need to address a specific list of safety, provenance, and governance artifacts that were previously optional. This article walks through what changed, what stayed the same, and what to add to a proposal being written today.

WHAT OFFERORS SHOULD ADD TO A PROPOSAL WRITTEN AFTER MARCH 20, 2026

Safety-testing evidence package, model-provenance attestation, red-team plan with threat-model scope, continuous-monitoring plan mapped to NIST SP 800-37 Step 6, agency use-case inventory alignment statement, civil-rights / disparate-impact assessment where applicable, and a supply-chain attestation covering training-data jurisdiction, model weights, and inference dependencies. Each item maps to a specific clause or control family — do not treat them as narrative bullets.

The March 2026 framework did not invent new obligations. It compressed the timeline for documenting obligations that were already implicit in NIST AI RMF 1.0 and OMB M-24-10, and raised the expectation that SBIR offerors show the evidence at proposal time, not deployment time.

The framework's pillars, in the order they bind an offeror

The framework is organized around six operative pillars. The ordering below reflects which pillars most directly change how an SBIR proposal is written, not the document's internal taxonomy.

1. Safety testing before deployment

The framework treats pre-deployment safety testing as a required input to any authorization decision for an AI-enabled federal system. For SBIR offerors, this means a Phase I technical volume that proposes to deliver an AI capability needs to describe what safety testing will look like before any government user touches the output. Vague language (“we will conduct comprehensive testing”) is no longer sufficient; reviewers will look for a specific methodology, a named threat model, and a deliverable artifact (test plan, test results, remediation log). The methodology should trace to NIST AI RMF 1.0 functions (Govern, Map, Measure, Manage) and the test results should be produced in a form that fits into the eventual system security plan.

2. Model provenance and supply-chain attestation

The framework directs agencies to document the provenance of models deployed in federal environments — training-data jurisdiction, model-weight origin, inference-dependency supply chain, and any fine-tuning datasets. For offerors using commercial foundation models (Claude, GPT, Llama, Gemini, Mistral), this reinforces FY26 NDAA §1532 provenance requirements and extends them to the full inference stack. For offerors proposing custom models, it raises the documentation burden on training-data sourcing. At proposal time, an offeror should be able to state: which foundation model (if any) is in the loop, which vendor, what the vendor's contractual data-handling commitments are, and how the stack will be attested in the SSP.

3. Governance and agency inventories

Under OMB M-24-10, agencies already maintain AI use-case inventories. The March 2026 framework strengthens that requirement and pushes agencies to align new AI acquisitions to specific inventoried use cases with defined risk tiers. SBIR offerors should expect TPOCs and contracting officers to ask which inventoried use case the proposed capability corresponds to (if any) and what the risk-tier classification would be. Phase II proposals in particular should describe how the capability will be registered in the agency's inventory at transition, what the risk tier is, and what governance review cadence applies.

4. Red-teaming

The framework treats red-teaming as a standard control expectation, not an optional enhancement. For offerors, the effect is that a Phase II proposal is expected to include a red-team plan — scope, adversarial-prompt categories (prompt injection, jailbreaks, data exfiltration, model-extraction attempts), cadence, and remediation protocol. This is not the same as penetration testing of the underlying infrastructure; it is adversarial testing of the AI-specific behavior. We cover the mechanics of federal AI red-teaming separately in AI red-teaming for federal systems.

5. Continuous monitoring

AI systems drift. Inputs change, fine-tuned models degrade, upstream providers update weights, and behavior shifts over time. The framework treats continuous monitoring as the operational counterpart to pre-deployment safety testing, and maps it squarely onto NIST SP 800-37 RMF Step 6. Phase II proposals are expected to describe a continuous-monitoring plan with defined metrics, cadence, triggering thresholds for re-authorization, and an incident-response protocol for behavior regressions.

6. Civil-rights and disparate-impact considerations

For AI systems that touch decisions affecting individuals — benefits adjudication, hiring, credit, healthcare access, law enforcement workflows — the framework incorporates civil-rights review and disparate-impact assessment as part of the governance cycle. Not every SBIR topic is affected, but those that are need to address the assessment methodology in the proposal rather than treating it as a post-deployment afterthought.

How the framework ties to existing controls

None of the pillars stand alone. Each has a predecessor in existing guidance, and the framework's principal effect is to draw those predecessors into a single operating picture. Knowing the crosswalk matters for offerors because it is the crosswalk — not the framework's own language — that will be cited in the SSP, the ATO package, and the continuous-monitoring reports.

Framework pillarUnderlying control or guidanceWhat an SBIR offeror cites
Safety testingNIST AI RMF 1.0 (Measure, Manage functions); NIST SP 800-53 Rev 5 CA-2, CA-8, SI-7Test-plan artifact mapped to RMF functions; results feed SSP §3 and ATO package
Model provenanceNIST SP 800-53 Rev 5 SA-8, SR family (supply chain); FY26 NDAA §1532; OMB M-24-10 §5Provenance attestation as SSP appendix; vendor contractual terms referenced in cost proposal
Governance / inventoryOMB M-24-10; agency-specific AI governance directivesUse-case registration plan in Phase II transition narrative
Red-teamingNIST AI RMF 1.0 (Measure 2.7); NIST SP 800-53 Rev 5 CA-8(2), RA-10Red-team plan as Phase II deliverable; findings mapped to RMF Measure function
Continuous monitoringNIST SP 800-37 Rev 2 RMF Step 6; NIST SP 800-137 ConMonConMon plan as Phase II artifact; triggers for re-authorization documented
Civil-rights reviewExecutive branch civil-rights guidance; agency general counsel reviewDisparate-impact methodology in technical volume where applicable
Acquisition alignmentFAR Part 39 (IT acquisition); agency AI acquisition supplementsReferenced in cost proposal and Phase III transition plan

What Phase I technical volumes now need to address

Phase I proposals have always emphasized feasibility over compliance mechanics. The framework does not change that posture, but it does raise the floor. A Phase I technical volume that is silent on safety testing, provenance, or red-teaming now reads as incomplete regardless of how strong the underlying feasibility argument is. The following are the specific additions we now recommend for any Phase I proposal being drafted in the post-framework window.

  • A one-paragraph safety-testing methodology. Name the NIST AI RMF functions in scope, describe the threat model (prompt injection, hallucination, data leakage, unsafe tool use as applicable), and specify the artifact produced (test plan, test log, remediation register).
  • A provenance statement. Identify the foundation model if any, the vendor, the authorization posture (FedRAMP, DoD IL), and the contractual data-handling terms. If using open-weight models, name the weights, the hosting environment, and the provenance of any fine-tuning data.
  • A red-team concept paragraph. Phase I does not require a full red-team execution, but it should describe how red-teaming will be approached in Phase II — scope, adversarial categories, cadence.
  • An inventory-alignment note. Indicate how the capability maps to the agency's AI use-case inventory under OMB M-24-10, or flag that the mapping will be completed in Phase II.
  • A continuous-monitoring teaser. One paragraph outlining the metrics and cadence the Phase II plan will elaborate on.
  • A civil-rights / disparate-impact note where applicable. Only relevant for topics touching individual decisions; if irrelevant, state so explicitly rather than omit.

These additions fit inside the existing Phase I technical-volume page budget without requiring wholesale restructuring. They add roughly two to three pages of content when handled concisely. Omitting them does not disqualify a proposal, but it does reduce the evaluation score on governance and risk-management axes that several agencies now weight explicitly.

What Phase II proposals need to add on continuous monitoring

Phase II is where the framework's operational implications land hardest. A Phase II proposal now needs to describe a continuous-monitoring plan with enough specificity that an ISSO and an Authorizing Official can evaluate it without follow-up questions. At a minimum that plan covers:

  • Metric definitions. Which behavioral metrics are monitored (accuracy drift, hallucination rate, refusal-rate deltas, adversarial-input detection rate, latency-distribution shift) and how each is measured.
  • Cadence. How often each metric is sampled and reported. Daily, weekly, monthly, event-triggered — named.
  • Thresholds. The numeric or categorical conditions that trigger an alert, a re-test, or a re-authorization cycle.
  • Governance routing. Who receives the alerts, what the escalation path is, and how findings are documented back into the SSP.
  • Re-authorization triggers. The conditions under which a significant change (per NIST SP 800-37 RMF Step 6) forces an update to the ATO package — model-version upgrades, training-data refreshes, scope expansions.
  • Tooling. What logs, dashboards, and alerting infrastructure implement the plan, and which systems host them.

This level of specificity is already common in mature federal AI contracts. What the framework changes is that it is now expected in the Phase II proposal itself, not deferred to a post-award SSP exercise. Offerors who treat ConMon as boilerplate will lose points to offerors who treat it as a deliverable artifact.

FedRAMP authorization inheritance and the framework

FedRAMP remains the authorization backbone for cloud-delivered AI services. The framework does not replace FedRAMP — it runs on top of it. For offerors using FedRAMP-authorized AI services (Amazon Bedrock in GovCloud, Azure OpenAI in Azure Government, Vertex AI at applicable impact levels), the practical effect is that a significant portion of the safety-testing, monitoring, and provenance obligations can be partially inherited through the cloud provider's authorization package. Offerors still own the application-layer implementation (prompt handling, tool-scope restriction, output filtering), but the underlying service's safety posture is established through the authorization.

The crosswalk to document in the SSP is: which controls are inherited from the cloud provider, which are shared responsibility, and which are fully owned by the offeror. Shared-responsibility matrices for Bedrock GovCloud and Azure Government OpenAI are published by the respective vendors; use them verbatim rather than paraphrasing. A useful default: the model itself is vendor-responsibility, the orchestration and prompts are offeror-responsibility, the logging is shared.

The "safety testing evidence package" — what reviewers expect

The framework uses the phrase "safety testing" across several pillars. In practice, what reviewers evaluate is a coherent evidence package built from a specific set of artifacts. The package need not be complete at Phase I — it is built up through Phase II and finalized at Phase III transition — but the proposal should describe what the package will contain. We recommend the following structure:

  • Threat model. A written description of the adversarial surface. For a federal AI capability this typically covers prompt injection, jailbreaks, data-exfiltration via model output, unauthorized tool-use, sensitive-data disclosure, and hallucination-driven decision errors. Map each to the NIST AI RMF Measure function.
  • Test plan. A protocol describing the adversarial inputs, test harness, pass/fail criteria, and remediation workflow.
  • Test results. Logs, pass/fail rates per category, failure exemplars, and remediation actions taken.
  • Red-team findings and remediations. For Phase II and beyond, independent red-team exercises with findings and remediation status.
  • Continuous-monitoring dashboard. Live operational evidence that safety metrics are being tracked in production.
  • Attestation. A signed statement from the principal investigator or responsible officer that the above components exist, are current, and reflect the deployed system.

At proposal time, the offeror is committing to produce this package, not to show it in finished form. The specificity of the commitment is what distinguishes a competitive proposal from a generic one.

Agency variance — DoD vs. civilian implementation pace

Federal agencies will not implement the framework at the same pace. Based on the pattern of prior NIST and OMB guidance adoption, we expect the following variance through 2026 and into 2027.

DoD components — working under the established NIST SP 800-53 / RMF baseline and with years of AI risk-management muscle in programs like JAIC-legacy and CDAO efforts — will operationalize the framework's expectations inside existing ATO cycles relatively quickly. Expect SBIR topic descriptions from Air Force, Army, Navy, SOCOM, and SDA to start citing framework-aligned language in SITIS responses and topic Q&A by late 2026. The framework does not override DoDI 5000.82 or the CJCSI AI governance expectations — it layers on top.

Civilian agencies vary more. Agencies with mature chief AI officer functions (VA, HHS, Treasury, GSA) will move quickly. Agencies with smaller AI footprints will lag, and in practice their SBIR topics may not cite the framework explicitly for another cycle. Offerors writing across both DoD and civilian topics should default to the higher bar (DoD-aligned framework compliance) in their boilerplate, since the incremental writing cost is trivial and the optionality is worth it.

The IC operates on separate authorization tracks (ICD 503, ICD 705) but has historically harmonized to NIST controls for unclassified-adjacent work. Framework alignment will flow through the ODNI-coordinated pathway; offerors in that space should track their sponsor's AI governance directive rather than the White House framework directly.

Practical Phase I checklist — 10 additions post-framework

The following checklist is what we now use internally when drafting any Phase I SBIR technical volume for an AI-touching topic. Every item adds measured content; collectively they total roughly two to four pages inside the standard page budget and raise the evaluation signal on governance and risk-management dimensions.

#AdditionWhere it lives in the volume
1Safety-testing methodology paragraph. Named threat model, NIST AI RMF mapping, artifact produced.Technical approach section
2Model-provenance statement. Foundation model, vendor, authorization posture, data-handling terms.Approach / technical architecture
3Red-team concept paragraph. Phase II scope, adversarial categories, cadence.Phase II transition
4Agency inventory alignment. Mapping to OMB M-24-10 use-case inventory (or commitment to complete in Phase II).Phase III / transition narrative
5Continuous-monitoring teaser. Metrics, cadence, triggers — concise.Technical approach or Phase II transition
6Civil-rights / disparate-impact note. Only if topic touches individual decisions; state scope explicitly.Technical approach
7FedRAMP inheritance statement. Which controls are inherited, shared, owned.Technical architecture
8Supply-chain attestation commitment. Training-data jurisdiction, model-weight origin, inference dependencies.Risk / compliance section
9Re-authorization trigger definition. What constitutes a significant change under NIST SP 800-37 RMF Step 6.Risk / compliance section
10PI attestation language. Principal investigator sign-off on the safety-testing evidence package commitment.Cover / signature block

Red-teaming, specifically

Because red-teaming is the pillar most likely to be underestimated by SBIR offerors who come from a pure research or ML-engineering background, it warrants its own note. Federal AI red-teaming is not web-application penetration testing. The target is the model's behavior — the prompts, the tool calls, the outputs — not the network perimeter or the authentication layer. Standard categories include: prompt-injection via untrusted data sources, jailbreaking refusal behavior, extracting sensitive information from training or fine-tuning data, causing the model to call tools out of scope, triggering hallucinations on policy-sensitive topics, and probing for bias-driven disparate outputs.

A Phase II red-team plan typically specifies the test harness, the categories of adversarial inputs, the cadence, the severity rubric, and the remediation protocol. Some agencies accept internal red-teaming by the offeror; others require third-party red-teaming or agency-led exercises. Budget accordingly in the Phase II cost volume — treat red-teaming as a line item, not as overhead. Our detailed treatment of how this operates in a federal context lives at AI red-teaming for federal systems.

Phase III transition — governance handoff

Phase III transition narratives often read as commercialization plans divorced from the governance posture of the delivered capability. The framework makes that separation harder to defend. A Phase III narrative now benefits from addressing: how the safety-testing evidence package transfers to the transition customer, how continuous-monitoring operational responsibility shifts, how re-authorization triggers are documented for the receiving agency, and how use-case inventory entries move from Phase II registration to steady-state governance. Treating the governance handoff as part of the commercialization story — not as compliance cleanup — is one of the clearest differentiators in evaluation.

What has not changed

  • FedRAMP remains the cloud-authorization backbone. The framework layers on top; it does not replace.
  • NIST AI RMF 1.0 and NIST SP 800-53 Rev 5 remain the underlying control catalogs. The framework cites them; offerors cite them.
  • Phase I page budgets and evaluation criteria are unchanged. The additions fit inside the existing volume structure.
  • SBIR Phase I dollar ceilings are unchanged. The documentation expansion does not come with additional funding.
  • FAR Part 39 IT acquisition rules apply as before. The framework does not amend the FAR directly; agencies translate its direction into acquisition supplements over time.

Frequently asked questions

Does the March 2026 framework create new statutory obligations for SBIR offerors?

No. The framework is executive-branch policy direction. Statutory obligations flow from the underlying instruments — NIST AI RMF 1.0 (referenced), NIST SP 800-53 Rev 5 (authority), OMB M-24-10 (implementation), FY26 NDAA §1532 (provenance). The framework's practical effect is to compress the documentation timeline and raise proposal-stage expectations.

Do I need a completed safety-testing evidence package at Phase I submission?

No. At Phase I you commit to producing the package. The proposal should name the artifacts, the methodology, and the NIST AI RMF functions in scope. The package itself is built through Phase I execution and matured in Phase II.

How does the framework interact with FedRAMP authorization inheritance?

FedRAMP remains the cloud-authorization pathway. The framework pushes offerors to document which safety, monitoring, and provenance controls are inherited from the FedRAMP-authorized service (e.g., Bedrock in GovCloud, Azure OpenAI in Azure Government), which are shared responsibility, and which are owned by the offeror at the application layer.

Do I need third-party red-teaming at Phase II?

Agency-dependent. Some agencies accept offeror-internal red-teaming; others require third-party or agency-led exercises. Confirm during SITIS Q&A or TPOC engagement. Budget as a Phase II cost line item, not as overhead.

Does this apply to topics where AI is a component but not the central deliverable?

Yes, in proportion to the AI role. If the capability's correctness materially depends on AI inference or decisioning, the framework's expectations apply. If AI is incidental (e.g., autocomplete in a UI), a brief treatment is usually sufficient.

How should I cite the framework in a technical volume?

Reference it thematically with the underlying authoritative document — NIST AI RMF 1.0, NIST SP 800-53 Rev 5, OMB M-24-10 — as the binding citation. Do not paraphrase framework language as if it were regulatory text. The evaluator will look for the underlying control citation, not framework prose.

1 business day response

Book a federal AI compliance review

We help SBIR offerors and federal primes align Phase I, II, and III proposals to the March 2026 AI Policy Framework — safety testing, provenance, continuous monitoring, red-teaming, and inventory alignment. No vendor religion.

Book a compliance reviewRead more insights →
UEI Y2JVCZXT9HP5CAGE 1AYQ0NAICS 541512SAM.GOV ACTIVE