This notebook illustrates how ModelSpec works with models.yml to define and access the attributes of a model configuration.
Overview
The ModelSpec class defines the complete specification for an ocean model configuration, including:
Templates: Jinja2 template locations for compile-time and run-time configuration files
Settings: Default settings and configuration files
Code: Repository specifications for ROMS, MARBL, and associated code
Inputs: Default specifications for grid, initial conditions, and forcing data
Datasets: List of required source datasets
Model specifications are stored in models.yml and loaded using load_models_yaml().
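To make the nesting described above concrete, here is a minimal stand-in built from standard-library dataclasses. It is only a sketch: the real ModelSpec is a Pydantic model defined in cson_forge, and the class names `CodeSpecSketch` and `ModelSpecSketch` are hypothetical. Only the `code` and `datasets` components are mirrored, using values that this notebook prints further down.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CodeRepository:
    """Stand-in for one repository entry: a location plus a branch or commit."""
    location: str
    branch: str = ""
    commit: str = ""


@dataclass
class CodeSpecSketch:
    """Hypothetical container for the repositories a model needs."""
    roms: CodeRepository
    marbl: Optional[CodeRepository] = None


@dataclass
class ModelSpecSketch:
    """Hypothetical, heavily reduced analogue of ModelSpec."""
    name: str
    code: CodeSpecSketch
    datasets: list = field(default_factory=list)


spec = ModelSpecSketch(
    name="cson_roms-marbl_v0.1",
    code=CodeSpecSketch(
        roms=CodeRepository(
            location="https://github.com/CWorthy-ocean/ucla-roms.git",
            branch="main",
        ),
        marbl=CodeRepository(
            location="https://github.com/marbl-ecosys/MARBL.git",
            commit="marbl0.45.0",
        ),
    ),
    datasets=["ERA5", "GLORYS_REGIONAL", "TPXO", "UNIFIED_BGC"],
)

# Nested components are reached with plain attribute access.
print(spec.code.roms.branch)
print(spec.code.marbl.commit)
```

Unlike this stand-in, a Pydantic model also validates field types when the YAML is loaded.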
Setup
Import the necessary modules and enable autoreload for development.
%load_ext autoreload
%autoreload 2
from pathlib import Path
from cson_forge import models, config
Load ModelSpec
Load a model specification from models.yml using load_models_yaml(). This function takes the path to the YAML file and the model name.
# Load a model specification
model_name = "cson_roms-marbl_v0.1"
model_spec = models.load_models_yaml(config.paths.models_yaml, model_name)
print(f"Loaded ModelSpec: {model_spec.name}")
print(f"Type: {type(model_spec)}")
Loaded ModelSpec: cson_roms-marbl_v0.1
Type: <class 'cson_forge.models.ModelSpec'>
Inspect ModelSpec Structure
The ModelSpec is a Pydantic model with several main components. Let’s explore each one:
# View all ModelSpec attributes
print("ModelSpec attributes:")
# Use the class to access model_fields (not the instance) to avoid deprecation warning
for attr in model_spec.__class__.model_fields.keys():
    value = getattr(model_spec, attr)
    if isinstance(value, (list, dict)) and len(str(value)) > 100:
        print(f" - {attr}: {type(value).__name__} (length: {len(value)})")
    else:
        print(f" - {attr}: {value}")
ModelSpec attributes:
- name: cson_roms-marbl_v0.1
- templates: compile_time=CodeRepository(documentation='', locked=False, location='/Users/mclong/codes/cson-forge/cson_forge/model-configs/cson_roms-marbl_v0.1/templates/compile-time', commit='', branch='main', filter=PathFilter(directory='', files=['bgc.opt.j2', 'blk_frc.opt.j2', 'cdr_frc.opt.j2', 'cppdefs.opt.j2', 'diagnostics.opt.j2', 'ocean_vars.opt.j2', 'param.opt.j2', 'river_frc.opt.j2', 'surf_flux.opt.j2', 'tides.opt.j2', 'tracers.opt.j2', 'Makefile'])) run_time=CodeRepository(documentation='', locked=False, location='/Users/mclong/codes/cson-forge/cson_forge/model-configs/cson_roms-marbl_v0.1/templates/run-time', commit='', branch='main', filter=PathFilter(directory='', files=['roms.in.j2', 'marbl_in', 'marbl_tracer_output_list', 'marbl_diagnostic_output_list']))
- settings: properties=PropertiesSpec(n_tracers=34) compile_time=SettingsStage(settings_dict={'bgc': {'wrt_his': True, 'output_period_his': 86400, 'nrpf_his': 7, 'wrt_avg': False, 'output_period_avg': 86400, 'nrpf_avg': 7, 'wrt_his_dia': True, 'output_period_his_dia': 86400, 'nrpf_his_dia': 7, 'wrt_avg_dia': False, 'output_period_avg_dia': 60, 'nrpf_avg_dia': 1, 'nbgc_flx': 2, 'interp_frc': 1}, 'blk_frc': {'interp_frc': 1}, 'cppdefs': {'obc_west': True, 'obc_east': True, 'obc_north': True, 'obc_south': True, 'marbl': True}, 'param': {'LLm': 512, 'MMm': 512, 'N': 60, 'NP_XI': 16, 'NP_ETA': 16, 'NSUB_X': 1, 'NSUB_E': 1, 'nt_passive': 0, 'ntrc_bio': 32}, 'ocean_vars': {'wrt_file_rst': True, 'output_period_rst': 43200, 'monthly_restarts': False, 'nrpf_rst': 2, 'wrt_file_his': True, 'output_period_his': 86400, 'nrpf_his': 7, 'wrt_Z': True, 'wrt_Ub': True, 'wrt_Vb': True, 'wrt_U': True, 'wrt_V': True, 'wrt_R': False, 'wrt_O': False, 'wrt_W': True, 'wrt_Akv': False, 'wrt_Akt': False, 'wrt_Aks': False, 'wrt_Hbls': False, 'wrt_Hbbl': False, 'wrt_file_avg': False, 'output_period_avg': 604800, 'nrpf_avg': 1, 'wrt_avg_Z': True, 'wrt_avg_Ub': True, 'wrt_avg_Vb': True, 'wrt_avg_U': True, 'wrt_avg_V': True, 'wrt_avg_R': True, 'wrt_avg_O': True, 'wrt_avg_W': True, 'wrt_avg_Akv': True, 'wrt_avg_Akt': True, 'wrt_avg_Aks': True, 'wrt_avg_Hbls': True, 'wrt_avg_Hbbl': True, 'code_check': False}, 'surf_flux': {'wrt_smflx': False, 'wrt_stflx': False, 'sflx_avg': False, 'output_period': 31536000, 'nrpf': 10, 'sst_vname': 'sst', 'sst_tname': 'sst_time', 'sss_vname': 'sss', 'sss_tname': 'sss_time', 'interp_frc': 1}, 'tides': {'ntides': 10, 'bry_tides': True, 'pot_tides': True, 'ana_tides': False}, 'river_frc': {'river_source': False, 'analytical': False, 'nriv': 0, 'rvol_vname': 'river_volume', 'rvol_tname': 'river_time', 'rtrc_vname': 'river_tracer', 'rtrc_tname': 'river_time'}, 'diagnostics': {'diag_avg': False, 'output_period': 86400, 'nrpf': 7, 'diag_uv': False, 'diag_trc': False, 
'diag_pflx': True, 'timescale': 86400, 'diag_prec': 'nf90_double'}, 'tracers': {'interp_t': 1}, 'cdr_frc': {'cdr_source': False, 'cdr_volume': True, 'cdr_analytical': False, 'ncdr': 1, 'cdr_file': 'cdr_forcing.nc', 'cdrvol_vname': 'cdr_volume', 'cdrvol_tname': 'cdr_time', 'cdrtrc_vname': 'cdr_tracer', 'cdrtrc_tname': 'cdr_time', 'cdrflx_vname': 'cdr_trcflx', 'cdrflx_tname': 'cdr_time', 'cdr_loc_lon': 'cdr_lon', 'cdr_loc_lat': 'cdr_lat', 'cdr_loc_dep': 'cdr_dep', 'cdr_scl_hor': 'cdr_hsc', 'cdr_scl_vrt': 'cdr_vsc'}}) run_time=SettingsStage(settings_dict={'roms.in': {'title': {'casename': None}, 'time_stepping': {'ntimes': 1200, 'dt': 2160, 'ndtfast': 60, 'ninfo': 1}, 's_coord': {'theta_s': 5.0, 'theta_b': 2.0, 'tcline': 300.0}, 'grid': {'grid_file': None}, 'forcing': {'surface_forcing_path': None, 'surface_forcing_bgc_path': None, 'boundary_forcing_path': None, 'boundary_forcing_bgc_path': None, 'river_path': None}, 'lateral_visc': {'visc2': 0.0, 'visc4': 0.0, 'rho0': 1000.0}, 'output_root_name': {'output_root_name': None}, 'vertical_mixing': {'akv': 0.0, 'akt_default': 0.0}, 'tracer_diff2': {'tnu2_default': 0.0}, 'bottom_drag': {'rdrg': 0.0, 'rdrg2': 0.001, 'zob': 0.01, 'cdb_min': 0.0001, 'cdb_max': 0.01}, 'v_sponge': {'v_sponge': 0.0}, 'gamma2': 1.0, 'ubind': 0.1, 'initial': {'nrrec': 1, 'initial_file': None}}})
- code: roms=CodeRepository(documentation='', locked=False, location='https://github.com/CWorthy-ocean/ucla-roms.git', commit='', branch='main', filter=None) run_time=CodeRepository(documentation='', locked=False, location='placeholder://run_time', commit='', branch='main', filter=None) compile_time=CodeRepository(documentation='', locked=False, location='placeholder://compile_time', commit='', branch='main', filter=None) marbl=CodeRepository(documentation='', locked=False, location='https://github.com/marbl-ecosys/MARBL.git', commit='marbl0.45.0', branch='', filter=None)
- inputs: grid=GridInput(topography_source='ETOPO5') initial_conditions=InitialConditionsInput(source=SourceSpec(name='GLORYS', climatology=False), bgc_source=SourceSpec(name='UNIFIED', climatology=True)) forcing=ForcingInput(surface=[SurfaceForcingItem(source=SourceSpec(name='ERA5', climatology=False), type='physics', correct_radiation=True), SurfaceForcingItem(source=SourceSpec(name='UNIFIED', climatology=True), type='bgc', correct_radiation=False)], boundary=[BoundaryForcingItem(source=SourceSpec(name='GLORYS', climatology=False), type='physics'), BoundaryForcingItem(source=SourceSpec(name='UNIFIED', climatology=True), type='bgc')], tidal=[TidalForcingItem(source=SourceSpec(name='TPXO', climatology=False), ntides=15)], river=[RiverForcingItem(source=SourceSpec(name='DAI', climatology=False), include_bgc=True)])
- datasets: ['ERA5', 'GLORYS_REGIONAL', 'TPXO', 'UNIFIED_BGC']
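The loop above only abbreviates plain lists and dicts, so the nested components (templates, settings, code, inputs) are printed in full. One possible way to keep such a listing readable is to shorten any value by the length of its string form, whatever its type. The helper name `summarize` and the 100-character limit are assumptions for illustration, not part of cson_forge.

```python
def summarize(value, max_len=100):
    """Return str(value), shortened to max_len characters when it is longer."""
    text = str(value)
    if len(text) <= max_len:
        return text
    # Keep the type name and total length so the reader knows what was cut.
    return f"{type(value).__name__} ({len(text)} chars): {text[:max_len]}..."


# Stand-in values; with a real ModelSpec one would pass getattr(model_spec, attr).
short_value = ["ERA5", "GLORYS_REGIONAL", "TPXO", "UNIFIED_BGC"]
long_value = {"k": "x" * 200}

print(summarize(short_value))
print(summarize(long_value))
```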
Templates Specification
The templates field defines where Jinja2 templates are located for compile-time and run-time configuration files.
if model_spec.templates:
    print("Templates Specification:")
    print(f" Compile-time location: {model_spec.templates.compile_time.location}")
    print(f" Compile-time files: {model_spec.templates.compile_time.filter.files}")
    print(f"\n Run-time location: {model_spec.templates.run_time.location}")
    print(f" Run-time files: {model_spec.templates.run_time.filter.files}")
else:
    print("No templates specification")
Templates Specification:
Compile-time location: /Users/mclong/codes/cson-forge/cson_forge/model-configs/cson_roms-marbl_v0.1/templates/compile-time
Compile-time files: ['bgc.opt.j2', 'blk_frc.opt.j2', 'cdr_frc.opt.j2', 'cppdefs.opt.j2', 'diagnostics.opt.j2', 'ocean_vars.opt.j2', 'param.opt.j2', 'river_frc.opt.j2', 'surf_flux.opt.j2', 'tides.opt.j2', 'tracers.opt.j2', 'Makefile']
Run-time location: /Users/mclong/codes/cson-forge/cson_forge/model-configs/cson_roms-marbl_v0.1/templates/run-time
Run-time files: ['roms.in.j2', 'marbl_in', 'marbl_tracer_output_list', 'marbl_diagnostic_output_list']
Settings Specification
The settings field defines default settings and configuration file locations for compile-time and run-time stages.
if model_spec.settings:
    print("Settings Specification:")
    print(f" Properties: {model_spec.settings.properties}")
    print(f" Compile-time defaults: {model_spec.settings.compile_time._default_config_yaml}")
    print(f" Run-time defaults: {model_spec.settings.run_time._default_config_yaml}")
else:
    print("No settings specification")
Settings Specification:
Properties: n_tracers=34
Compile-time defaults: /Users/mclong/codes/cson-forge/cson_forge/model-configs/cson_roms-marbl_v0.1/templates/compile-time-defaults.yml
Run-time defaults: /Users/mclong/codes/cson-forge/cson_forge/model-configs/cson_roms-marbl_v0.1/templates/run-time-defaults.yml
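The attribute listing earlier shows that each settings stage carries a nested `settings_dict`. A small lookup helper can read individual defaults from such a dict. The sketch below works on a hand-copied excerpt of those values. The helper name `lookup` is hypothetical, and reaching the real dict through `model_spec.settings.compile_time.settings_dict` is an assumption based on the printed repr. Keys are passed as separate arguments rather than one dotted string because some keys, such as `roms.in`, contain a dot themselves.

```python
def lookup(settings, *keys, default=None):
    """Walk nested dicts along keys; return default if any key is missing."""
    node = settings
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Excerpt copied from the settings output shown earlier in this notebook.
settings_excerpt = {
    "param": {"LLm": 512, "MMm": 512, "N": 60},
    "tides": {"ntides": 10, "bry_tides": True},
    "roms.in": {"time_stepping": {"ntimes": 1200, "dt": 2160}},
}

print(lookup(settings_excerpt, "param", "N"))
print(lookup(settings_excerpt, "roms.in", "time_stepping", "dt"))
print(lookup(settings_excerpt, "param", "not_a_key", default="unset"))
```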
Code Repository Specification
The code field defines the code repositories (ROMS, MARBL) and their locations, branches, and commits.
print("Code Repository Specification:")
print(f" ROMS location: {model_spec.code.roms.location}")
print(f" ROMS branch: {model_spec.code.roms.branch}")
if model_spec.code.marbl:
    print(f" MARBL location: {model_spec.code.marbl.location}")
    print(f" MARBL commit: {model_spec.code.marbl.commit}")
else:
    print(" MARBL: Not specified")
Code Repository Specification:
ROMS location: https://github.com/CWorthy-ocean/ucla-roms.git
ROMS branch: main
MARBL location: https://github.com/marbl-ecosys/MARBL.git
MARBL commit: marbl0.45.0
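In the output above, ROMS is specified by a branch and MARBL by a commit. The sketch below shows one way such an entry could be turned into git commands. It assumes that a non-empty commit takes precedence over the branch, which this notebook does not confirm, and the `Repo` tuple and `clone_commands` helper are hypothetical stand-ins rather than cson_forge functions. The commands are only built, not executed.

```python
from typing import NamedTuple


class Repo(NamedTuple):
    """Stand-in for a repository entry with the fields printed above."""
    location: str
    branch: str = ""
    commit: str = ""


def clone_commands(repo, dest):
    """Build git commands: clone, then check out the commit if set, else the branch."""
    ref = repo.commit or repo.branch  # assumption: commit wins over branch
    commands = [["git", "clone", repo.location, dest]]
    if ref:
        commands.append(["git", "-C", dest, "checkout", ref])
    return commands


roms = Repo("https://github.com/CWorthy-ocean/ucla-roms.git", branch="main")
marbl = Repo("https://github.com/marbl-ecosys/MARBL.git", commit="marbl0.45.0")

for repo, dest in [(roms, "ucla-roms"), (marbl, "MARBL")]:
    for command in clone_commands(repo, dest):
        print(" ".join(command))
```

Each command is a list of arguments, so it could be passed to `subprocess.run` without shell quoting.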
Inputs Specification
The inputs field defines default specifications for grid, initial conditions, and forcing data. These serve as defaults when generating inputs.
print("Inputs Specification:")
print("\nGrid:")
print(f" Topography source: {model_spec.inputs.grid.topography_source}")
print("\nInitial Conditions:")
print(f" Source: {model_spec.inputs.initial_conditions.source}")
if model_spec.inputs.initial_conditions.bgc_source:
    print(f" BGC source: {model_spec.inputs.initial_conditions.bgc_source}")
print("\nForcing:")
if model_spec.inputs.forcing:
    if model_spec.inputs.forcing.surface:
        print(f" Surface forcing ({len(model_spec.inputs.forcing.surface)} sources):")
        for i, surf in enumerate(model_spec.inputs.forcing.surface, 1):
            print(f" {i}. {surf.source.name} ({surf.type})")
    if model_spec.inputs.forcing.boundary:
        print(f" Boundary forcing ({len(model_spec.inputs.forcing.boundary)} sources):")
        for i, bnd in enumerate(model_spec.inputs.forcing.boundary, 1):
            print(f" {i}. {bnd.source.name} ({bnd.type})")
    if model_spec.inputs.forcing.tidal:
        print(f" Tidal forcing ({len(model_spec.inputs.forcing.tidal)} sources):")
        for i, tide in enumerate(model_spec.inputs.forcing.tidal, 1):
            print(f" {i}. {tide.source.name} (ntides: {tide.ntides})")
    if model_spec.inputs.forcing.river:
        print(f" River forcing ({len(model_spec.inputs.forcing.river)} sources):")
        for i, riv in enumerate(model_spec.inputs.forcing.river, 1):
            print(f" {i}. {riv.source.name} (include_bgc: {riv.include_bgc})")
Inputs Specification:
Grid:
Topography source: ETOPO5
Initial Conditions:
Source: name='GLORYS' climatology=False
BGC source: name='UNIFIED' climatology=True
Forcing:
Surface forcing (2 sources):
1. ERA5 (physics)
2. UNIFIED (bgc)
Boundary forcing (2 sources):
1. GLORYS (physics)
2. UNIFIED (bgc)
Tidal forcing (1 sources):
1. TPXO (ntides: 15)
River forcing (1 sources):
1. DAI (include_bgc: True)
Required Datasets
The datasets field lists all source datasets required by this model configuration. These are derived from the inputs specification.
print("Required Datasets:")
print(f" Total: {len(model_spec.datasets)}")
for i, dataset in enumerate(model_spec.datasets, 1):
    print(f" {i}. {dataset}")
Required Datasets:
Total: 4
1. ERA5
2. GLORYS_REGIONAL
3. TPXO
4. UNIFIED_BGC
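As a rough illustration of deriving a dataset list from the inputs, the sketch below collects the distinct source names that appear in the inputs output above. It is not the cson_forge derivation: the printed dataset names differ from the source names (for example GLORYS_REGIONAL versus GLORYS), so the real logic evidently applies a mapping that this notebook does not show. The `inputs_excerpt` dict and `unique_sources` helper are hypothetical.

```python
# Source names copied from the inputs output above, grouped by input type.
inputs_excerpt = {
    "initial_conditions": ["GLORYS", "UNIFIED"],
    "surface": ["ERA5", "UNIFIED"],
    "boundary": ["GLORYS", "UNIFIED"],
    "tidal": ["TPXO"],
    "river": ["DAI"],
}


def unique_sources(inputs):
    """Return each source name once, in order of first appearance."""
    names = (name for group in inputs.values() for name in group)
    # dict.fromkeys removes duplicates while preserving insertion order.
    return list(dict.fromkeys(names))


print(unique_sources(inputs_excerpt))
```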
Accessing Nested Fields
You can access nested fields using dot notation. Here are some examples:
# Examples of accessing nested fields
print("Example field access:")
print(f" ROMS repository URL: {model_spec.code.roms.location}")
print(f" Number of tracers: {model_spec.settings.properties.n_tracers}")
print(f" First surface forcing source: {model_spec.inputs.forcing.surface[0].source.name}")
print(f" Grid topography source: {model_spec.inputs.grid.topography_source}")
Example field access:
ROMS repository URL: https://github.com/CWorthy-ocean/ucla-roms.git
Number of tracers: 34
First surface forcing source: ERA5
Grid topography source: ETOPO5
ModelSpec as Dictionary
You can convert the ModelSpec to a dictionary for inspection or serialization:
# Convert to dictionary (Pydantic model_dump)
model_dict = model_spec.model_dump()
print("ModelSpec as dictionary (top-level keys):")
for key in model_dict.keys():
    print(f" - {key}")
# model_dump_json() returns a JSON string directly (no json import needed):
# json_str = model_spec.model_dump_json(indent=2)
ModelSpec as dictionary (top-level keys):
- name
- templates
- settings
- code
- inputs
- datasets
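Once the specification is a plain dictionary, ordinary dict tools apply. As one example for inspection, the sketch below flattens a nested dict into (path, value) pairs so that every leaf can be listed or searched. The `flatten` helper is hypothetical, and `model_dict_excerpt` is a small hand-written stand-in for the result of `model_spec.model_dump()`.

```python
def flatten(mapping, prefix=()):
    """Yield (path, value) pairs for every leaf of a nested dict."""
    for key, value in mapping.items():
        path = prefix + (key,)
        if isinstance(value, dict) and value:
            # Recurse into non-empty dicts; the path grows by one key.
            yield from flatten(value, path)
        else:
            yield path, value


# Stand-in for part of model_spec.model_dump(); values copied from this notebook.
model_dict_excerpt = {
    "name": "cson_roms-marbl_v0.1",
    "code": {
        "roms": {"branch": "main", "commit": ""},
        "marbl": {"branch": "", "commit": "marbl0.45.0"},
    },
    "datasets": ["ERA5", "GLORYS_REGIONAL", "TPXO", "UNIFIED_BGC"],
}

for path, value in flatten(model_dict_excerpt):
    print(" / ".join(path), "=", repr(value))
```

Paths are tuples rather than dotted strings so that keys containing a dot, such as `roms.in`, stay unambiguous.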
Summary
The ModelSpec provides a structured, validated representation of model configurations:
Type-safe: Pydantic models provide validation and type checking
Accessible: Use dot notation to access nested fields
Serializable: Convert to dict/JSON for storage or inspection
Complete: Contains all information needed to configure and build a model
This specification is used by CstarSpecBuilder to configure model builds and input generation.