What "space domain awareness" means in public

Space domain awareness, in publicly available DoD and commercial documents, refers to the ongoing detection, tracking, characterization, and behavior analysis of objects in orbit. The public catalog — provided through space-track.org by U.S. Space Force units and mirrored by commercial SSA providers — is the data substrate for nearly all open research in the field. Civil-SSA provisioning is currently transitioning to the Department of Commerce under the Traffic Coordination System for Space (TraCSS) program. The published ML literature on SDA reads as a long argument with the catalog: how to clean it, how to associate new observations to it, how to detect when an object is doing something unusual, and how to handle the cases where the catalog is wrong.
Three findings recur across the open literature: hybrid filters dominate, meaning physics-grounded propagators with learned residuals; pure black-box ML approaches have not survived peer review on independent data; and data engineering accounts for the bulk of practitioner effort.
The data formats and protocols are also public and worth knowing. The Two-Line Element (TLE) format remains the workhorse despite well-documented limitations on accuracy and uncertainty representation. The CCSDS Orbit Data Messages (ODM) — Orbit Mean Elements Message (OMM), Orbit Parameter Message (OPM), Orbit Ephemeris Message (OEM) — are the modern standard. The Conjunction Data Message (CDM) carries the screening output between catalog and operator. Any pipeline that consumes the catalog must handle TLE and CCSDS-format ingest, and any output that feeds an operator must respect those schemas.
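TLE ingest is concrete enough to sketch. The format's per-line checksum (a mod-10 sum in which digits count their value and a minus sign counts as one) is the cheapest first-pass validity filter a pipeline can apply. A minimal, illustrative check in Python:

```python
def tle_line_checksum(line: str) -> int:
    """Compute the TLE mod-10 checksum over the first 68 characters.

    Digits contribute their value; a minus sign counts as 1; every other
    character (letters, spaces, periods, plus signs) counts as 0.
    """
    total = 0
    for ch in line[:68]:
        if ch.isdigit():
            total += int(ch)
        elif ch == "-":
            total += 1
    return total % 10


def tle_line_is_valid(line: str) -> bool:
    """A TLE line is 69 characters; the final one is the checksum digit."""
    return (
        len(line) == 69
        and line[68].isdigit()
        and int(line[68]) == tle_line_checksum(line)
    )
```

This catches truncated and mistyped lines, not semantic errors like stale epochs; those need the deeper validation discussed later.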
The methodological orientation that follows is one of catalog augmentation rather than catalog replacement. The catalog is the system of record; ML pipelines either improve specific stages of catalog maintenance (initial orbit determination, association, propagation residuals) or extract higher-order behaviors (maneuver detection, intent characterization, conjunction risk) from the catalog plus auxiliary observations. Published work that treats the catalog as the ground truth and the ML as the contribution is more credible than work that proposes to learn the catalog itself end-to-end.
Association and tracking
The classical pipeline for SDA — initial orbit determination, observation-to-track association, batch and sequential filtering — is well documented in the public astrodynamics literature. The ML-flavored variations replace specific stages: learned association functions when classical Mahalanobis distance fails on dense fields, learned residual models on top of analytical orbit dynamics, and learned classifiers for object type from light-curve data. The most important methodological point in the published work is that none of these ML components replaces the analytical core; they augment it.
The classical methods are well-anchored. SGP4/SDP4 propagation handles general-perturbation theory. Gauss's, Laplace's, and Gooding's methods cover initial orbit determination. The unscented Kalman filter (Julier and Uhlmann), the extended Kalman filter, and the Gaussian mixture filters of Vallado, DeMars, and others are the publishing-grade sequential-filtering tools. The Multiple Hypothesis Tracker (Reid; later Blackman, Bar-Shalom) is the standard for ambiguous-association regimes. Modern open-source ecosystems — astropy, poliastro, Orekit, the Stone Soup tracking framework — implement these in code that practitioners can read.
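The gating step at the heart of classical association can be sketched directly. For a 2-D measurement innovation, the squared Mahalanobis distance against the innovation covariance is compared to a chi-square threshold; the illustrative sketch below hardcodes the 2-degree-of-freedom case, where the 99% gate is -2 ln(0.01):

```python
import math

# Chi-square gate for a 2-D measurement: with 2 degrees of freedom the
# CDF is 1 - exp(-x/2), so the 99% gate is -2 * ln(0.01), about 9.21.
GATE_99 = -2.0 * math.log(0.01)


def mahalanobis_sq_2d(innov, S):
    """Squared Mahalanobis distance d^2 = y^T S^-1 y for a 2-D innovation.

    innov: (y0, y1), observed-minus-predicted measurement.
    S: 2x2 innovation covariance [[a, b], [c, d]].
    """
    (a, b), (c, d) = S
    det = a * d - b * c
    y0, y1 = innov
    # Explicit 2x2 inverse: S^-1 = [[d, -b], [-c, a]] / det
    return (d * y0 * y0 - (b + c) * y0 * y1 + a * y1 * y1) / det


def gate(track_pred, obs, S, threshold=GATE_99):
    """Accept the observation-to-track pairing if d^2 falls inside the gate."""
    innov = (obs[0] - track_pred[0], obs[1] - track_pred[1])
    return mahalanobis_sq_2d(innov, S) <= threshold
```

In dense fields this is exactly the step that fails: many observations fall inside many overlapping gates, which is where the learned association functions discussed next come in.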
The ML augmentations published in journals like the Journal of Guidance, Control, and Dynamics, Acta Astronautica, and the AIAA/AAS conferences cluster around three patterns. Learned association functions for dense-field observation-to-track matching, where classical residual gating fails because the gates overlap. Residual models trained on observed minus predicted state — the Kalman innovations — to absorb un-modeled dynamics like atmospheric drag variation, solar radiation pressure asymmetries, and thruster outgassing. Light-curve classifiers (Lambert, Schmidt, others) that infer object class — debris, deactivated satellite, active satellite, rocket body — from photometric time series. Each augmentation publishes its own evaluation against catalog ground truth.
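The residual-model pattern is simple to illustrate. The sketch below is not any published architecture, just the minimal version of the idea: an exponentially weighted estimate of a persistent innovation bias, added back onto the analytic prediction:

```python
class InnovationBias:
    """Exponentially weighted estimate of a persistent innovation bias.

    A stand-in for the learned-residual pattern: the analytic propagator
    produces a prediction, and a slowly adapting bias term absorbs
    un-modeled dynamics (drag variation, SRP asymmetry). alpha controls
    how quickly the bias tracks new innovations.
    """

    def __init__(self, alpha=0.1):
        self.alpha = alpha
        self.bias = 0.0

    def correct(self, analytic_prediction):
        """Analytic prediction plus the current learned bias."""
        return analytic_prediction + self.bias

    def update(self, observed, analytic_prediction):
        """Fold the new innovation (observed minus predicted) into the bias."""
        innovation = observed - analytic_prediction
        self.bias = (1 - self.alpha) * self.bias + self.alpha * innovation
```

The published variants swap this scalar estimator for Gaussian processes or sequence models, but the contract is the same: the analytic core predicts, the learned term only corrects.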
Anomaly detection on orbital behavior
"Anomaly" in SDA has multiple meanings — maneuvers, debris events, payload activations, attitude changes — and each has its own public detection literature. Maneuver detection on cataloged objects has a substantial public corpus, including work from the Air Force Research Laboratory, university astrodynamics groups, and commercial SSA providers. The methodological convergence is on hybrid filters: physics-grounded propagators with learned residual models. Pure black-box ML approaches have not survived peer review well in this space.
The maneuver-detection problem has a specific shape. Given a sequence of state estimates over time, decide whether the unmodeled acceleration that explains the residuals is consistent with a planned thruster firing, a station-keeping maneuver, or a more substantial transfer. Lemmens and Krag (ESA), Holzinger, Scheeres, and others have published frameworks built on innovations testing, sliding-window CUSUM, and multiple-model adaptive estimation (IMM-style). Recent work has experimented with sequence transformers and Gaussian process models for residual learning, with the reproducibility caveats that the published comparisons make explicit.
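The innovations-testing baseline is easy to make concrete. A one-sided CUSUM over normalized filter residuals, with illustrative (untuned) slack and threshold values, looks like:

```python
def cusum_alarm(residuals, drift=0.5, threshold=5.0):
    """One-sided CUSUM over filter residuals, a common public baseline
    for maneuver detection. The drift and threshold values here are
    illustrative, not tuned for any real sensor.

    residuals are assumed to be normalized innovations (zero mean, unit
    variance under the no-maneuver hypothesis). drift is the slack k; an
    alarm fires when the cumulative sum exceeds the threshold h.
    Returns the index of the first alarm, or None.
    """
    s = 0.0
    for i, r in enumerate(residuals):
        s = max(0.0, s + r - drift)
        if s > threshold:
            return i
    return None
```

An operational detector would run two-sided (maneuvers can bias residuals either way) and tune k and h against the false-alarm budget, but the windowed-accumulation structure is the same.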
Debris-event detection has a different shape — a sudden increase in the catalog from a fragmentation event — and a different methodology, leaning on collision-induced breakup models (NASA Standard Breakup Model and successors) and on rapid-cadence observation campaigns. Payload-activation and attitude-change detection rely on photometric and radar cross-section signal analysis. The cross-cutting finding is that detection performance scales with observation cadence and sensor diversity more than with model sophistication.
Data engineering as the bottleneck
The most under-appreciated finding in the public SDA-ML literature is that data engineering accounts for the bulk of practitioner effort. A clean, calibrated, well-aligned dataset is more valuable than a clever model. Practitioners entering this space typically discover that the model is the easy part.
The specific data-engineering work breaks into well-known stages. Time-system handling — UTC, UT1, TAI, GPS, TT — is a recurring source of small errors that compound. Reference-frame transformations — TEME for TLE outputs, ICRF for catalog work, ITRF for ground-station-relative coordinates, RIC frames for relative motion — must be unambiguous in any pipeline. Outlier handling on the catalog is its own subfield: TLE entries with stale epoch, with duplicate satellite numbers, with retrograde-orbit anomalies, with mistyped fields. The published reference tradition (Vallado's Fundamentals of Astrodynamics and Applications and its successors) and the open Orekit and astropy implementations are the public references.
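The fixed offsets among these time scales can be sketched, with one loud caveat: the TAI-UTC term is a leap-second count that changes over time, so the hard-coded value below (correct since 2017-01-01) must come from a maintained leap-second table in any real pipeline:

```python
# Offsets between time scales, in seconds.
# TAI-UTC is a leap-second count: 37 s has been correct since 2017-01-01,
# but a real pipeline must read it from a maintained leap-second table.
TAI_MINUS_UTC = 37.0    # leap seconds (hard-coded; see caveat above)
TT_MINUS_TAI = 32.184   # fixed by definition
GPS_MINUS_TAI = -19.0   # fixed since GPS time began in 1980


def utc_to_scale(utc_seconds: float, scale: str) -> float:
    """Convert a UTC timestamp (seconds past some shared epoch) to TAI,
    TT, or GPS time on the same epoch."""
    tai = utc_seconds + TAI_MINUS_UTC
    if scale == "TAI":
        return tai
    if scale == "TT":
        return tai + TT_MINUS_TAI
    if scale == "GPS":
        return tai + GPS_MINUS_TAI
    raise ValueError(f"unknown time scale: {scale}")
```

UT1 is deliberately absent: it differs from UTC by a measured, irregular quantity (published in IERS bulletins), which is exactly why it cannot be hard-coded.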
The data-quality investment also feeds evaluation. A model whose performance is reported on a "clean" subset of the catalog but degrades on the messy long tail is not yet operational. Practitioners who build evaluation harnesses that include the failure cases the catalog actually contains — TLE epoch drift, observation gaps over the South Pacific where sensor coverage is thin, association ambiguity in dense GEO clusters — are doing the work that converts a research model into a deployable component.
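One way to make that concrete: report the metric per failure stratum rather than as a single pooled number. The schema below, tag names included, is illustrative rather than any standard:

```python
from collections import defaultdict


def stratified_report(records):
    """Report mean error separately on the clean subset and on each
    tagged failure stratum, rather than as one pooled number.

    records: iterable of (tags, error) pairs, where tags is a set of
    failure-mode labels (empty set = clean record) and error is
    |predicted - observed| in whatever unit the pipeline uses.
    Tag names like "stale_epoch" are illustrative, not a standard schema.
    """
    buckets = defaultdict(list)
    for tags, err in records:
        if not tags:
            buckets["clean"].append(err)
        for t in tags:
            buckets[t].append(err)
    return {name: sum(errs) / len(errs) for name, errs in buckets.items()}
```

A model whose "clean" number is strong but whose "stale_epoch" or "dense_geo_cluster" numbers are weak is precisely the research model that is not yet a deployable component.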
Conjunction analysis and decision support
Public work on conjunction risk assessment has migrated from deterministic miss-distance calculations to probabilistic frameworks that quantify uncertainty in both objects' state estimates. Decision-support outputs — what action to take, when, under what confidence — are an active research area. The quality bar for any system that touches conjunction decisions is extraordinarily high: false positives generate wasteful maneuvers, false negatives are catastrophic. The published consensus is that automated decision recommendations must be calibrated and inspectable.
The standard methodology has clear public anchors. The probability-of-collision (Pc) calculations of Foster, Alfano, and Patera — and the comparisons summarized by Hejduk and others — define the analytical baselines. The CDM is the data carrier. The Kelvins ESA Collision Avoidance Challenge (2019) and the published Rich-Rendezvous datasets gave the community shared benchmarks for learned screening. The decision-support layer — when to maneuver, how much delta-v to allocate, how to coordinate with other operators — has migrated to multi-objective optimization frameworks and Bayesian decision-analysis in recent literature.
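The structure of a Pc calculation can be shown with a deliberately crude Monte Carlo stand-in for the Foster/Alfano-style analytic integrals: sample the encounter-plane relative position and count samples inside the combined hard-body radius. The isotropic, uncorrelated uncertainty assumed here is a simplification that real CDM covariances do not satisfy:

```python
import math
import random


def pc_monte_carlo(miss_distance, sigma, hard_body_radius, n=100_000, seed=0):
    """Monte Carlo probability of collision in the 2-D encounter plane.

    A crude stand-in for the analytic Pc integrals: sample the relative
    position from an isotropic Gaussian centered at the nominal miss
    distance and count samples inside the combined hard-body radius.
    Assumes circular, uncorrelated uncertainty; real covariances are
    elongated and correlated, which the analytic methods handle.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.gauss(miss_distance, sigma)
        y = rng.gauss(0.0, sigma)
        if math.hypot(x, y) < hard_body_radius:
            hits += 1
    return hits / n
```

The sketch makes the operational tension visible: Pc depends as much on the covariance (sigma) as on the nominal miss distance, which is why uncertainty realism dominates the published comparisons.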
The calibration requirement deserves explicit treatment. A system that says "Pc = 0.0001, do not maneuver" must be calibrated to actually deliver one collision per ten thousand such recommendations. The published reliability-diagram methodology, Brier-score comparisons, and conformal-prediction adaptations are the open tools for testing calibration. Inspectability is the second requirement: an operator must be able to trace the recommendation back to its constituent data points and modeling assumptions. Black-box scores without provenance are not acceptable in operational decision contexts.
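Both open tools are short enough to sketch. The Brier score is the mean squared error between forecast probabilities and binary outcomes; a reliability diagram compares each probability bin's mean forecast to its observed event rate:

```python
def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def reliability_bins(probs, outcomes, n_bins=10):
    """Bin forecasts by predicted probability and compare each bin's mean
    forecast to its observed event rate: the data behind a reliability
    diagram. Returns (mean_forecast, event_rate, count) per occupied bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, o))
    out = []
    for members in bins:
        if members:
            ps = [p for p, _ in members]
            os_ = [o for _, o in members]
            out.append((sum(ps) / len(ps), sum(os_) / len(os_), len(members)))
    return out
```

For conjunction work the relevant events are rare, so in practice the binning must be log-spaced in probability and the sample counts per bin reported alongside the rates; the uniform bins above are the textbook version.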
Fit for software SBIR
SDA is unusual among DoD problem domains in that the underlying data is largely public, the academic community is active, and the path from prototype to operational deployment runs through identifiable program offices. Software-first small businesses can earn standing here by publishing on benchmarks the community recognizes and by building evaluation harnesses that handle the catalog the way an operator would.
The named program offices are public. The 18th Space Defense Squadron at Vandenberg operates the catalog. The Space Surveillance Network feeds it. The Joint Commercial Operations cell coordinates with commercial SSA providers (LeoLabs, Slingshot, COMSPOC, Kayhan, ExoAnalytic, NorthStar, and others). The Space Systems Command program offices, AFRL/RV, and the Space Development Agency each fund work in this space. The Department of Commerce TraCSS office is the new civil counterpart. An offeror who cannot name which of these is the customer for a proposed capability is not yet positioned to win.
Public toolchains and datasets a reader should know
- **space-track.org / TraCSS.** The public catalog and the new civil-SSA provisioning path.
- **Orekit, astropy, poliastro, Stone Soup.** The open-source astrodynamics and tracking ecosystems.
- **CCSDS Blue Books.** The protocol standards for OMM/OPM/OEM and CDM.
- **Kelvins ESA challenges, AIAA/AAS proceedings.** The community benchmarks and venues where SDA-ML publication happens.
How we use this site
We write articles like this to make our reading visible — what we think the open literature says, what we think the open gaps are, and where careful work might land. We do not use these pages to preview proposed approaches in active program spaces. Precision Federal is a software-only SBIR firm. If your office is funding work in this area and would value a software-first partner with a documented public-reading habit, we welcome the introduction.
Common questions on the public-record framing
Why are hybrid filters preferred over pure-ML in this space?
Physics-grounded propagators have decades of validation. Learned residuals on top of analytical orbit dynamics outperform black-box approaches on independent data.
How is data engineering the under-appreciated bottleneck?
Schema reconciliation across observation sources, time synchronization, frame transformations, and sensor calibration metadata account for the bulk of practitioner effort. Clean datasets beat clever models.
What does this article not cover?
Specific catalog content under access restriction, specific maneuver-detection algorithms in operational use, or any Precision Federal solution.
Public catalog and tooling references
| Reference | Provider | Use |
|---|---|---|
| space-track.org | U.S. Space Force units | Public SSA catalog |
| Stone Soup | Open-source | Tracking and association framework |
| Orekit / SGP4 | Open-source | Orbit propagation |
| ESA Kelvins | ESA | Public benchmark competitions |
| TraCSS | Department of Commerce | Civil-SSA program transition |