NSF NCAR Research Applications Laboratory

WISP Quarterly Meeting · June 30, 2026

NCAR Model Manager (ncarmm)

Run NCAR's weather models in the cloud — without reinventing the wheel each time.

Co-leads: David Hahn & Victor Weeks
NSF National Center for Atmospheric Research

This material is based upon work supported by the NSF National Center for Atmospheric Research, a major facility sponsored by the U.S. National Science Foundation and managed by the University Corporation for Atmospheric Research.

Standing up WRF / MPAS on the cloud means rebuilding a deep, pinned dependency stack — compilers, MPI, and the parallel-I/O chain — from scratch, every time.

HDF5, NetCDF, PnetCDF, PIO
The real killer: all four must be built against the same MPI. One mismatch and you lose days.
  • Each project rebuilds this from scratch → weeks lost, undocumented tribal knowledge.
  • Scientists want to run science, not debug build toolchains.

Ease

One command to a cloud-ready, runnable model image.

No duplicate work

Pinned, reviewed build recipes shared across all projects.

Reproducible builds

Every dependency pinned by SHA / version, so the software stack rebuilds identically.

Portable

Same recipe targets AMIs today, containers next.

ncarmm build mpas --version 8.3.1
Models: WRF 4.4.0, WRF-Chem 4.4.0, MPAS 8.3.1
Cloud: Amazon Web Services
Output: Ready-to-run machine images (AMIs) + AWS ParallelCluster reference configs.
CLI: list-models, list-images, build, delete

Scope: builds the model and its toolchain — input-data staging & output movement stay in the user's workflow.

  • Manifest (manifest.yaml): Declares the model + every pinned dependency.
  • Automated build (Image Builder + CloudFormation): A right-sized EC2 instance compiles the parallel-I/O stack + model.
  • AMI (snapshot): Tagged & discoverable machine image.
  • Run the model (ParallelCluster): Launch the AMI on HPC-class nodes.

The build instance compiles the full parallel-I/O stack — HDF5 + PnetCDF + PIO + METIS + the model, then snapshots to the AMI. Networking self-provisions by default; locked-down accounts are accommodated with pre-created instance profiles.
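
As a rough illustration of what "declares the model + every pinned dependency" could mean in practice, the sketch below models a manifest in Python and flags any dependency that carries neither a version nor a SHA. The class names, field names, and pin values are assumptions made for this example; the real manifest.yaml schema is not shown in these slides.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Dependency:
    name: str
    version: Optional[str] = None  # release tag, if pinned by version
    sha: Optional[str] = None      # commit or archive hash, if pinned by SHA

    def is_pinned(self) -> bool:
        return bool(self.version or self.sha)


@dataclass(frozen=True)
class Manifest:
    model: str
    version: str
    dependencies: Tuple[Dependency, ...] = ()

    def unpinned(self) -> List[str]:
        """Names of dependencies that carry neither a version nor a SHA."""
        return [d.name for d in self.dependencies if not d.is_pinned()]


# Placeholder pins only; these are not the project's real versions.
example = Manifest(
    model="mpas",
    version="8.3.1",
    dependencies=(
        Dependency("hdf5", version="X.Y.Z"),
        Dependency("pnetcdf", version="X.Y.Z"),
        Dependency("pio", sha="<commit-sha>"),
        Dependency("metis"),  # deliberately unpinned to show the check
    ),
)

print(example.unpinned())  # ['metis']
```

A check like this could run before any build instance is launched, so an unpinned dependency fails fast instead of producing an image that cannot be rebuilt identically.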

Prototype · proof-of-concept · not yet released
container_spec.yaml (single spec) → Jinja2 → Dockerfile + Apptainer .def (rootless, --fakeroot)
Both reuse the AMI build script compile_mpas.sh byte-for-byte — zero diff, CI drift-guarded.
  • Validated single-node (real idealized forecast → NetCDF check).
  • Multi-node / EFA binding is the next step.
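
The single-spec rendering and the drift guard can be pictured with the following minimal sketch. It uses the standard library's string.Template as a stand-in for Jinja2 so it runs without extra packages, and the template bodies, spec keys, base image name, and function names are all hypothetical. The drift check simply compares SHA-256 digests of the build script copies, which is one plausible way to enforce "byte-for-byte"; the actual CI check may work differently.

```python
import hashlib
from pathlib import Path
from string import Template
from typing import Dict, Iterable, List

# Hypothetical templates; both reference the same build script by name.
DOCKERFILE = Template(
    "FROM $base_image\n"
    "COPY $build_script /opt/$build_script\n"
    "RUN bash /opt/$build_script\n"
)
APPTAINER_DEF = Template(
    "Bootstrap: docker\n"
    "From: $base_image\n"
    "%files\n"
    "    $build_script /opt/$build_script\n"
    "%post\n"
    "    bash /opt/$build_script\n"
)


def render_all(spec: Dict[str, str]) -> Dict[str, str]:
    """Render both container recipes from one spec."""
    return {
        "Dockerfile": DOCKERFILE.substitute(spec),
        "mpas.def": APPTAINER_DEF.substitute(spec),
    }


def drifted(reference: Path, copies: Iterable[Path]) -> List[Path]:
    """Return the copies whose bytes differ from the reference script."""
    want = hashlib.sha256(reference.read_bytes()).hexdigest()
    return [p for p in copies
            if hashlib.sha256(p.read_bytes()).hexdigest() != want]


rendered = render_all(
    {"base_image": "example/base:tag", "build_script": "compile_mpas.sh"}
)
```

In CI, a non-empty result from drifted() would fail the job, so the AMI and container paths cannot silently diverge.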
Figure: MPAS variable-resolution hexagonal mesh over Earth. The model now builds as both an AMI and a prototype container.

A new model or version is a new manifest + recipe directory; the framework does the rest.

  • Plugin-style layout — each model / version is self-contained and loaded dynamically.
  • Adding one never touches the core.
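
One way to picture the plugin-style layout is a discovery pass over recipe directories, sketched below. The directory convention (model/version/manifest.yaml under a common root) and the function name are assumptions for illustration, not the framework's actual layout or API.

```python
from pathlib import Path
from typing import Dict, Tuple


def discover_recipes(root: Path) -> Dict[Tuple[str, str], Path]:
    """Map (model, version) to its recipe directory.

    Assumes a hypothetical layout of <root>/<model>/<version>/manifest.yaml.
    Directories without a manifest are ignored.
    """
    recipes: Dict[Tuple[str, str], Path] = {}
    for manifest in sorted(root.glob("*/*/manifest.yaml")):
        version_dir = manifest.parent
        recipes[(version_dir.parent.name, version_dir.name)] = version_dir
    return recipes
```

Under this convention, adding a model or version means adding a directory; the discovery code itself does not change.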
On the radar: CheMPAS, WRF-Hydro, FastEddy, new versions of existing models
Including private models: bring-your-own manifest + recipe, build-time auth token

Import the model registry, register a private manifest + recipe, and pass a token at build time to pull restricted source — run a proprietary model (e.g. WRF WxMod, a non-public FastEddy) on the framework without that code living here.

CSP support is a pluggable interface — AWS is the first implementation. Adding a provider = implement two functions.

Interface: get_image_details(), build_image()
Implemented: Amazon Web Services
On the radar: Google Cloud, Microsoft Azure, Oracle, other CSPs
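
A two-function provider interface could look roughly like the sketch below. Only the two function names come from the slide; the class names, parameters, and return types are assumptions, and the in-memory provider is a toy that makes no cloud calls.

```python
from abc import ABC, abstractmethod
from typing import Dict, Tuple


class CloudProvider(ABC):
    """Hypothetical base class; signatures are assumed for illustration."""

    @abstractmethod
    def build_image(self, model: str, version: str) -> str:
        """Build an image for the model and return its identifier."""

    @abstractmethod
    def get_image_details(self, model: str, version: str) -> Dict[str, str]:
        """Return metadata for a previously built image."""


class InMemoryProvider(CloudProvider):
    """Toy provider that records images in a dict instead of a cloud."""

    def __init__(self) -> None:
        self._images: Dict[Tuple[str, str], Dict[str, str]] = {}

    def build_image(self, model: str, version: str) -> str:
        image_id = f"img-{model}-{version}"
        self._images[(model, version)] = {
            "id": image_id, "model": model, "version": version,
        }
        return image_id

    def get_image_details(self, model: str, version: str) -> Dict[str, str]:
        return self._images[(model, version)]


provider = InMemoryProvider()
image_id = provider.build_image("mpas", "8.3.1")
print(provider.get_image_details("mpas", "8.3.1")["id"] == image_id)  # True
```

Using an abstract base class means a provider that forgets one of the two functions fails at instantiation rather than midway through a build.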
01

Cluster lifecycle

Start / stop compute clusters on demand — the clearest cost-control story.

02

Containers, first-class

Container output alongside AMIs, growing out of the MPAS prototype.

03

More models + clouds

Continued breadth across the roadmap — new models, new providers.

Cost honesty: for cloud NWP the bill is dominated by EFA HPC instances + FSx Lustre + data egress — not idle time alone.

Today
GitHub repository
Direct pip install from a Git / URL ref. Not on PyPI.
Planned
NCAR-internal PyPI
A clean pip install ncarmm for NCAR staff.
Possible future
Open source
On the table — gauging the room's appetite.

Repository: github.com/NCAR/ncarmm

ncarmm: build once, run where NCAR computes — cloud today, on-prem container next.

  • Solves duplicated cloud-model effort across projects.
  • Working today: WRF, WRF-Chem, MPAS on AWS.
  • Extensible by design: more models, more clouds, containers.
Let's discuss
What models / clouds do you need?
Would containers help your HPC workflow?
Interest in open source?

Not presented — reference for the co-leads.

Do cloud runs reproduce our Derecho results bit-for-bit?
No, and that's expected. We reproduce the build (pinned sources/compiler/libs). Forecasts differ at round-off across hardware, AVX paths, and MPI rank counts — same as any MPI code.
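
A small illustration of why round-off differences are expected: floating-point addition is not associative, so changing the order of a reduction (as happens when the rank count or vectorization path changes) can change the last bits of a sum. The values below are arbitrary and chosen only to make the effect visible.

```python
import math

values = [0.1, 0.2, 0.3]

# Same three numbers, two different summation orders.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

print(left_to_right == right_to_left)              # False
print(math.isclose(left_to_right, right_to_left))  # True
```

The two results agree to within round-off but are not bit-identical, which is the same distinction the answer above draws between reproducing the build and reproducing the forecast.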
Did you validate the MPAS container?
It passes an end-to-end idealized (Jablonowski–Williamson) forecast with NetCDF output checks — a build-integration test. Real-data IC/LBC validation against observations is future work.
Does WRF-Chem come with the emissions preprocessors?
No. We deliver the chem-enabled WRF binary + dependency stack; emissions / mechanism setup is application-specific and stays with the user.
Multi-node on EFA / InfiniBand?
Validated single-node today. Multi-node needs host-MPI/EFA binding (the classic hybrid-Apptainer model) — known next step.
How do I get data in/out, and what's the cost?
Input staging + output egress are the user's workflow (FSx Lustre + S3). Cost is driven by EFA HPC instances, Lustre, and egress; start/stop clusters targets the idle-compute slice.
Is it on PyPI / open source?
Not yet on PyPI (GitHub + URL install today; internal PyPI planned). Open source is possible, not committed.
Can we run a proprietary / non-public model (e.g. WRF WxMod, a private FastEddy)?
Yes, by design. A downstream project imports the model registry, registers its own manifest + Image Builder recipe, and supplies a build-time auth token (OAuth / PAT) to pull the restricted source. The proprietary code never lives in ncarmm — the framework just builds & runs it. Registry hook is a near-term design item.