What "edge" means in a DoD context
The word "edge" has been so heavily commercialized that it now refers to anything not running in a hyperscale cloud. That is useful vocabulary for enterprise IT and useless vocabulary for military systems. In the DoD context, edge has a sharper meaning: compute that sits at or near the point of sensing or action, under hard SWAP-C constraints, often with no reliable backhaul to a cloud, and with mission consequences if inference fails or misbehaves. A drone doing object detection on its own gimbal is edge. A vehicle doing obstacle avoidance from onboard cameras is edge. A handheld translator running in a jammed environment is edge.
SWAP-C — Size, Weight, Power, and Cost — is the governing constraint. A forward-deployed compute box has a fixed footprint, a power budget (often battery-limited), a thermal envelope, and a unit cost that must survive attrition. A model that requires 300W and liquid cooling does not get to be edge no matter how good it is. A model that requires 15W on a Jetson Orin NX and produces usable results does. Engineering trade-offs — accuracy for latency, latency for power, model size for thermal margin — are the work.
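The power side of that trade can be made concrete with back-of-envelope arithmetic. The figures below are illustrative, not taken from any specific platform: mission endurance scales inversely with the total power draw, so every watt the inference budget claims comes straight out of time on station.

```python
# Back-of-envelope SWAP arithmetic with assumed, illustrative figures.
battery_wh = 90.0            # assumed battery capacity in watt-hours
platform_overhead_w = 10.0   # assumed draw for flight systems, sensors, radios

def endurance_hours(inference_w):
    """Hours of endurance given an inference power budget in watts."""
    return battery_wh / (platform_overhead_w + inference_w)

print(round(endurance_hours(15.0), 2))   # ~3.6 h at a 15 W inference budget
print(round(endurance_hours(60.0), 2))   # ~1.29 h at a 60 W budget
```

The same arithmetic run in reverse — start from required endurance, solve for the allowable inference wattage — is often the first line of a credible SWAP section in a proposal.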
The second governing constraint is connectivity: denied, degraded, intermittent, or limited (DDIL). At the edge, you cannot assume a network, cannot assume bandwidth, cannot assume reachback to a model API. The system has to do useful work with what it has on board. That eliminates entire classes of architectures — anything that depends on a cloud-hosted LLM, anything that requires model updates every few minutes, anything that assumes always-on telemetry — and forces a different design posture.
[Graphic: Edge AI Capability Fit — DoD Platform Types]
Hardware platforms
The dominant platform for embedded GPU inference in DoD systems is NVIDIA Jetson. The Orin family — Nano, NX, AGX — covers most of the practical performance range for SWAP-constrained inference, from a few watts up to around 60W. Jetson has become a de facto standard because the software stack (CUDA, TensorRT, the Jetpack SDK) is mature, the ecosystem is broad, and agencies know what to expect from it. If a proposal says "we will run on Jetson Orin NX at 15W," reviewers immediately understand the performance envelope.
For x86 workloads, Intel NUC-class boxes and rugged variants (Klas, Crystal, Curtiss-Wright, Mercury) are common in ground-vehicle and fixed-site contexts. These run standard Linux stacks and accommodate larger models if the power budget allows. For deterministic low-latency requirements — signal processing, SAR imagery, radar fusion — FPGA accelerators (Xilinx/AMD Versal, Intel Agilex) appear frequently. Custom ASICs for specific mission profiles exist in the higher-volume programs but are rarely practical targets for SBIR prototypes.
A newer class worth naming: Qualcomm and Ambarella SoCs show up in UAS and handheld form factors where battery life is the binding constraint. Apple Silicon appears occasionally in dismounted soldier-system prototypes where the vendor is willing to accept ITAR friction in exchange for M-series efficiency. Most DoD SBIR topics will not name a specific platform but will specify SWAP envelopes; matching an envelope to a real platform is the first engineering decision in a proposal.
Model compression viable for field deployment
A foundation-scale model does not fit on an edge platform. Getting usable inference into a SWAP envelope requires compression, and the three techniques that matter most are quantization, pruning, and distillation. None of them is free — each trades some accuracy for resource efficiency — but the trade can be managed with care.
Quantization — reducing parameter precision from fp32 to fp16, int8, or lower — is the first lever. Modern quantization-aware training can preserve most of a model's accuracy at int8 with a 4x memory and compute savings. Int4 and lower quantization schemes are viable for specific architectures (especially for LLMs via GPTQ/AWQ-style approaches) and are increasingly used in edge deployments where memory bandwidth is the binding constraint.
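The core mechanic is simple enough to sketch in a few lines. This is a minimal symmetric per-tensor int8 scheme written in plain Python for clarity; a production pipeline would use the quantization tooling in the training framework (and quantization-aware training, not this post-hoc rounding), but the scale-and-round logic is the same.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: w ~= scale * q, q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0                      # one scale for the whole tensor
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate fp values; error is bounded by half a step."""
    return [qi * scale for qi in q]

weights = [0.31, -0.08, 1.27, -1.27, 0.0]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
# Each recovered value is within half a quantization step of the original.
assert all(abs(w - r) <= scale / 2 + 1e-9 for w, r in zip(weights, recovered))
print(q, round(scale, 5))
```

Storing `q` as int8 instead of fp32 is where the 4x memory saving comes from; the compute saving requires hardware with int8 execution paths, which is exactly what Jetson-class accelerators provide.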
Pruning removes redundant parameters or entire structural elements (heads, layers, channels). Structured pruning is more edge-relevant than unstructured pruning because structured reductions produce actual latency improvements on real hardware, not just smaller parameter counts. Knowledge distillation — training a small student model from a large teacher — is the third lever, often combined with quantization and pruning in production edge stacks. The combination can produce a deployable model 10x to 100x smaller than the research-grade original with tolerable accuracy loss.
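Why structured pruning maps to real speedups is easiest to see in code: dropping whole rows of a weight matrix shrinks the matrix the hardware actually multiplies, whereas zeroing scattered individual weights does not. A minimal magnitude-based sketch, with the function name and toy matrix as illustrative stand-ins:

```python
def prune_channels(weight_matrix, keep_ratio):
    """Structured pruning sketch: drop output channels (rows) with the
    smallest L1 norm. Removing whole rows shrinks the actual matmul,
    which is what yields real latency gains on hardware."""
    norms = [(sum(abs(w) for w in row), i) for i, row in enumerate(weight_matrix)]
    norms.sort(reverse=True)                      # largest-norm rows first
    n_keep = max(1, int(len(weight_matrix) * keep_ratio))
    kept = sorted(i for _, i in norms[:n_keep])   # preserve original row order
    return [weight_matrix[i] for i in kept], kept

W = [[0.9, -0.8], [0.01, 0.02], [0.5, 0.4], [0.001, 0.0]]
pruned, kept_idx = prune_channels(W, keep_ratio=0.5)
print(kept_idx)  # rows with the largest L1 norms survive: [0, 2]
```

In a real stack the pruned model is then fine-tuned (or distilled from the original as teacher) to recover accuracy before quantization is applied on top.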
Communication-denied inference
The most distinctive requirement in DoD edge AI is the denied-comms scenario: the system must do useful work without reaching back to a cloud. That is the opposite of the "assume network" posture most commercial AI is built on. In practice, it means the model lives on the device, inference happens locally, and any external call is either asynchronous, queued, or absent entirely.
This rules out cloud-hosted LLM APIs for any mission path. A translation assistant for a soldier in a jammed environment cannot depend on calling OpenAI; it has to run on-device, even if the on-device model is smaller and less capable. The same logic applies to fleet-scale model updates: you cannot assume a drone can download new weights mid-mission. Edge systems need a baseline that is good enough at the start of the mission, with updates that happen at a known base, not in the field.
The architectural pattern that has stabilized is a local inference stack plus opportunistic reachback. When connectivity exists, the system syncs telemetry and pulls updates. When it does not, the system continues operating from what it has on board. The design work is in defining what "continues operating" means gracefully — what confidence thresholds to use, when to fall back to human operator, how to communicate system state without the usual telemetry.
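The pattern can be sketched in a few dozen lines. All names here are illustrative, not a real DoD API: the point is that inference never touches the network, telemetry queues locally, low-confidence results fall back to the operator, and the outbox drains only when a link exists.

```python
import collections

class EdgeNode:
    """Sketch of local inference plus opportunistic reachback (hypothetical
    names). Inference always runs on-device; telemetry is queued and
    flushed only when a link is available."""

    def __init__(self, model, confidence_floor=0.6):
        self.model = model                    # local, on-device model
        self.confidence_floor = confidence_floor
        self.outbox = collections.deque()     # queued telemetry

    def infer(self, frame):
        label, confidence = self.model(frame)
        self.outbox.append({"label": label, "conf": confidence})
        if confidence < self.confidence_floor:
            return ("DEFER_TO_OPERATOR", confidence)  # graceful fallback
        return (label, confidence)

    def sync(self, link_up):
        """Flush queued telemetry only when connectivity exists."""
        sent = 0
        while link_up and self.outbox:
            self.outbox.popleft()   # stand-in for an actual uplink send
            sent += 1
        return sent

node = EdgeNode(model=lambda frame: ("vehicle", 0.9))
print(node.infer("frame-001"))   # ('vehicle', 0.9)
print(node.sync(link_up=False))  # 0 -- denied comms, telemetry stays queued
print(node.sync(link_up=True))   # 1 -- backlog drains when the link returns
```

The design decisions the text names — the confidence floor, the fallback signal, what the outbox holds — are exactly the parameters of this sketch, and they are mission-specific rather than generic.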
FedRAMP and security requirements for edge systems
Edge systems do not escape federal security requirements. If data is classified, the device must be accredited to handle it. If data is CUI, the corresponding controls apply: the NIST SP 800-171 control set is the baseline for CUI on DoD systems, with STIGs applied to specific components. For classified processing, TEMPEST and cross-domain considerations layer on top.
FedRAMP specifically applies to cloud services, not to edge devices in isolation, but the edge system almost always interacts with back-end services that are FedRAMP-scoped. The boundary question — which components sit inside the authorization, which are external, what crosses — is answered in the System Security Plan. Firms building edge AI should treat the boundary documentation as seriously as the technical work; an elegant edge system that cannot get an ATO does not field.
For AI-specific security, the NIST AI RMF and emerging CDAO guidance (see our piece on federal AI red-teaming) are where expectations are crystallizing. Adversarial robustness on-device — ensuring the model does not collapse under perturbed inputs — is a near-term requirement for any fielded CV or decision-support system.
Computer vision at the edge — the primary use case
Computer vision is the dominant DoD edge AI application today. Object detection, tracking, classification, and change detection from EO/IR sensors are the largest category of deployed edge AI by far. The workflow is well-understood: collect representative data, fine-tune a backbone (YOLO, DETR, or a specialized architecture), compress via quantization and pruning, deploy on Jetson or equivalent, evaluate against mission-representative data, iterate.
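That compress-evaluate-iterate loop is worth making explicit, because it terminates on an accuracy budget rather than a size target. The sketch below uses toy stand-ins (a dict for the model, lambdas for the stages); a real pipeline would wrap fine-tuning, quantization/pruning, and TensorRT export behind the same interfaces.

```python
def edge_cv_pipeline(evaluate, compress, model, accuracy_floor, max_rounds=5):
    """Sketch of the compress-evaluate-iterate loop. Stops compressing
    at the last candidate that still clears the accuracy floor."""
    for _ in range(max_rounds):
        candidate = compress(model)
        if evaluate(candidate) < accuracy_floor:
            return model          # stop before the accuracy budget is blown
        model = candidate
    return model

# Toy stand-ins: each compression round halves size and costs 2 points of mAP.
compress = lambda m: {"size_mb": m["size_mb"] / 2, "map": m["map"] - 2.0}
evaluate = lambda m: m["map"]

final = edge_cv_pipeline(evaluate, compress,
                         model={"size_mb": 160.0, "map": 52.0},
                         accuracy_floor=47.0)
print(final)  # {'size_mb': 40.0, 'map': 48.0} -- last model above the floor
```

In practice each round also includes a fine-tuning pass to recover accuracy before re-evaluation, which is what lets the loop run further than this toy suggests.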
Where this gets hard is not in the nominal computer-vision pipeline but in the edge cases. Sensor degradation, adversarial environments (camouflage, spoofing, physical-world adversarial examples), long-tail classes (rare but high-consequence targets), and the evaluation burden — proving the model works across an operationally meaningful distribution — are the real work. Firms that have a credible story for these are differentiated; firms that can demo YOLO on a benchmark dataset are not.
Adjacent to CV, edge speech and language are growing. Tactical translators, edge-deployed transcription for ISR, and on-device command-and-control voice interfaces are all active topic areas. The compression challenges for speech are comparable to vision; the evaluation burden is higher because linguistic variance is harder to characterize than visual variance.
SBIR topics funding edge AI in 2026
Edge AI is one of the richest topic areas across DoD SBIR cycles. Specific agencies and their typical focus:
- SOCOM — tactical autonomy, wearable sensing, edge computer vision for dismounted operations. Often the most mission-specific topics with the fastest transition potential.
- DARPA — foundational research in edge inference efficiency, adversarial robustness, novel architectures. Higher risk, higher reward; often multi-year programs.
- DIU — commercial-tech solicitations with rapid-contract structure. Edge AI appears regularly, often in specific operational contexts (logistics, maintenance, ISR).
- Army — ground vehicle autonomy, UAS, and soldier systems. High topic volume with predictable cadence.
- Air Force / Space Force — airborne CV, satellite edge processing, autonomous collaborative platforms. AFWERX Open Topic provides flexible entry.
- Navy — shipboard edge, undersea systems, mission computing. Slower cadence but high transition rates on the Phase III side.
What a winning edge AI SBIR proposal needs to show
Reviewers of edge AI proposals are looking for specific signals. First, name the target SWAP envelope — not "efficient" but "15W on Jetson Orin NX." Name the target accuracy — not "high performance" but "mAP above X on this benchmark." Name the target latency — not "real-time" but "X ms per frame at Y FPS." Name the target environment — not "operational" but "maritime surface search in sea state 3 through 5 at altitudes between A and B." Every specific claim strengthens the proposal; every vague claim weakens it.
Second, show the compression plan. A proposal that says "we will deploy a state-of-the-art model" without addressing the quantization, pruning, and distillation path raises the question of whether the model will actually fit. A proposal that walks through a credible compression pipeline — and ideally shows preliminary evidence from a smaller prototype — is much more convincing.
Third, show the evaluation plan. Edge AI lives or dies on how it handles the long tail, and reviewers want to see a test plan that includes adversarial conditions, sensor degradation, out-of-distribution inputs, and operationally representative scenarios — not just benchmark numbers.
Bottom line
Edge AI for deployed military systems is a genuine discipline with its own engineering constraints, security requirements, and market structure. The firms that treat it seriously — as SWAP-C engineering with an AI component, rather than AI with a deployment afterthought — will find a large and growing opportunity across DoD SBIR. The firms that treat it as a trend to ride will produce demos that do not field.
Frequently asked questions
What is edge AI in a DoD context?
Model inference on forward-deployed hardware under SWAP-C constraints, often in denied-comms environments. It is distinct from cloud AI (latency, connectivity) and commercial IoT (security, mission consequence, ruggedization).
What hardware platforms dominate DoD edge AI?
NVIDIA Jetson is the dominant embedded GPU platform. Intel NUC-class x86 for general compute, Xilinx/AMD and Intel FPGAs for deterministic low-latency workloads, and a growing set of ASICs for specific mission profiles.
Which agencies fund edge AI through SBIR?
SOCOM, DARPA, DIU, Army, Air Force/Space Force, and Navy all run regular edge AI topics. SOCOM is the most mission-specific; DARPA the most foundational; DIU the fastest to contract.
Does FedRAMP apply to edge devices?
FedRAMP applies to cloud services, not to edge devices in isolation. Edge systems inherit the control expectations of the sensitivity level of their data (NIST 800-171 for the CUI baseline) and must document the security boundary between on-device and back-end services in their SSP.
What model compression techniques are viable for field deployment?
Quantization-aware training to int8 or int4, structured pruning, and knowledge distillation — usually in combination. The right combination depends on the target platform, the base architecture, and the accuracy budget.