16 changes: 15 additions & 1 deletion CLAUDE.md
@@ -54,6 +54,21 @@ This updates `roadmap/README.md` from the feature frontmatter.

## Coding Conventions

### Public vs. Internal API

A name is **private** if any segment of its import path starts with `_`.
This is mechanically enforced by `PLC2701`: no module may import a `_foo` name
from another module. A leading underscore on a function name is otherwise only
a convention: authors may use it as a hint that a function is file-local, but
this is neither required nor enforced.

- Public: `modelskill.matching.match`, `modelskill.utils.rename_coords_xr`
- Private (module path): `modelskill._names.get_name`, `modelskill.timeseries._timeseries.validate_data_var_name`
- File-local hint: a leading `_` on a function name inside any module, signalling that the function is not used across modules

See [ADR-012](adr/012-public-private-api-convention.md). CI enforces Ruff rule
`PLC2701` with zero violations in `src/`; tests have a per-file ignore.
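
For illustration, this is the shape of import the rule forbids and the shape it allows (both taken from this PR's own changes):

```python
# Flagged by PLC2701: an underscore-prefixed name imported across modules.
from modelskill.utils import _get_name   # pre-ADR-012 style

# Allowed: the privacy signal lives in the module path, not the function name.
from modelskill._names import get_name   # post-ADR-012 style
```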

### Docstrings
- All docstrings use **NumPy format** (not Google or reStructuredText style)
- Include sections: Parameters, Returns, Raises, Examples, See Also, Notes as appropriate
@@ -169,7 +184,6 @@ Plots support both matplotlib (static) and plotly (interactive) backends.
- Internal data storage uses xarray Datasets with standardized coordinate/variable names
- Time coordinates use pandas datetime64
- Spatial coordinates: `x`, `y` (and `z` when applicable)
- Reserved names in `_RESERVED_NAMES` should not be used for model/observation names
- The `Quantity` class handles physical quantities with units and validation

### Testing Structure (`tests/`)
56 changes: 56 additions & 0 deletions adr/012-public-private-api-convention.md
@@ -0,0 +1,56 @@
# ADR-012: Public vs. internal API convention

**Status**: Accepted

**Date**: 2026-05-14

## Context

ModelSkill had accumulated an inconsistent privacy convention. Leading underscores appeared on both **file names** (`_comparison.py`, `_misc.py`, `_timeseries.py`) and **function or constant names** (`_get_name`, `_parse_metric`, `_validate_data_var_name`), with no written rule about what each level meant or how the two interacted. Symptoms:

- `from modelskill.timeseries._timeseries import _validate_data_var_name` — privacy claimed at both the path segment *and* the function name, while the function was imported from 4+ other files across subpackages.
- `timeseries/__init__.py` placed five underscore-prefixed names in `__all__`, simultaneously declaring them "officially public" and "private."
- Pyright (Pylance) flagged `reportPrivateUsage` on every cross-module `_foo` import, generating ambient warnings during development.
- No written policy distinguished "what users can rely on" from "what the package uses internally."

Three candidate definitions of "public" were considered:

- **Strict**: only names in `modelskill.__all__` at the top level.
- **Documented**: anything appearing in `docs/api/`.
- **Conventional**: anything reachable as `modelskill.<...>` without a leading `_` in any path segment.

The conventional definition was chosen because the team did not want documentation gaps to silently shrink the public API: a function reachable via a non-underscored import path could already have been adopted by users, even if undocumented.

## Decision

**Privacy is set by the module path, not the function name.** A name is private if any segment of its import path starts with `_`. The only mechanically enforced rule is `PLC2701`: no module may import a name starting with `_` from another module. A leading `_` on a function name is otherwise a convention with no enforced meaning — authors may use it as a hint that a function is file-local. Inside a private module, where the module path already carries the privacy signal, this hint is optional.

Examples:

- `modelskill.timeseries._timeseries.validate_data_var_name` — private (module path).
- `modelskill._names.get_name` — private (module path).
- `modelskill.metrics._format_directional_label` (hypothetical) — file-local by convention (underscore on the function name); used only inside `metrics.py`.
- `modelskill.matching.match` — public.

This is the same convention used by scikit-learn (`sklearn.utils._param_validation.validate_params`).

The convention is mechanically enforced by Ruff rule `PLC2701` (`import-private-name`), which flags `from x import _foo` style imports. After the renames applied with this ADR, the codebase has zero PLC2701 violations in `src/`. Tests are allowed to import private names via a per-file ignore, since white-box testing of internals is a legitimate need.
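
The corresponding Ruff configuration, as it appears in `pyproject.toml` after this change, is:

```toml
[tool.ruff.lint]
ignore = ["E501"]
select = ["E4", "E7", "E9", "F", "D200", "D205", "PLC2701"]
preview = true
explicit-preview-rules = true

[tool.ruff.lint.per-file-ignores]
# Tests legitimately exercise package internals.
"tests/**/*.py" = ["PLC2701"]
```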

## Alternatives Considered

**Snapshot-test the public surface.** A test that asserts `dir(modelskill)` matches a frozen list catches accidental public additions but requires updating the expected list on every intentional public change. Rejected as ongoing maintenance cost for limited additional coverage beyond what Ruff and PR review already provide.

**Introduce a `modelskill/_internal/` subpackage.** Concentrating all cross-cutting internal helpers in one location would make "shared internal API" structurally obvious. Rejected as pure churn: the helpers already live in natural subpackage homes, and moving them does not change the privacy story under this convention.

**Rename `utils.py` → `_utils.py` and rely on a deprecation shim.** Would break legitimately-public functions (`rename_coords_xr`, `rename_coords_pd`, `make_unique_index`) that are imported from non-underscored paths internally and could already be used by downstream consumers. The actual cross-cutting *private* helpers (`_get_name`, `_get_idx`, `_RESERVED_NAMES`) were split out into a new `modelskill/_names.py` module instead, leaving `utils.py` as a public module. The regression helper that previously lived as `metrics._linear_regression` was split into its own single-purpose private module `modelskill/_regression.py`. Naming the new module `_names.py` rather than `_utils.py` is deliberate: it describes the concern (reserved names plus name/index resolution) instead of a contentless role.

**Keep the status quo and configure Pyright to silence `reportPrivateUsage`.** Rejected because the warning is genuinely useful for catching future drift; removing the noise by suppression would also remove the signal.

## Consequences

- **Cross-module use of internal helpers is no longer flagged.** `from modelskill._names import get_name` is unambiguously legal; the module path says "internal." Pyright's `reportPrivateUsage` falls silent.
- **Function-name underscores remain a convention.** `PLC2701` enforces that no `_foo` may be imported across modules. Beyond that, underscores carry no formal meaning: authors may use them as a navigation hint that a function is file-local, in either public or private modules.
- **Future contributors have a clear default.** New internal helpers go in `_modules`, not in `_underscored_names` inside public modules. PRs that put internal helpers in `metrics.py` or `utils.py` with a leading underscore now stand out as the exception rather than the norm.
- **One breaking change is accepted.** Five underscored names were previously re-exported from `modelskill.timeseries` via `__all__` (`_parse_track_input`, etc.). These names are no longer reachable from `modelskill.timeseries`; the renamed functions live in `modelskill.timeseries._point`, `_track`, `_vertical`. By Python convention, leading-underscore names were never part of the stable API, so this is consistent with the new policy.
- **Tests retain access to internals** via a `tests/**/*.py` per-file ignore for `PLC2701`. White-box testing of private modules remains supported.
- **`metrics.py` no longer hosts `_parse_metric` or `_linear_regression`.** They moved to `comparison/_utils.py:parse_metric` and `_regression.py:linear_regression` respectively. The first is consumer-specific to the comparison subpackage; the second is a general regression helper used by both `metrics.lin_slope` and `plotting._scatter`.
1 change: 1 addition & 0 deletions adr/README.md
@@ -30,6 +30,7 @@ Each ADR follows this structure:
- [ADR-009](009-factory-pattern.md) - Factory pattern for type detection
- [ADR-010](010-optional-domain-dependencies.md) - Optional dependencies for domain-specific model types (Draft)
- [ADR-011](011-vertical-pre-extracted-columns.md) - VerticalModelResult ingests pre-extracted columns
- [ADR-012](012-public-private-api-convention.md) - Public vs. internal API convention

## Contributing

8 changes: 7 additions & 1 deletion pyproject.toml
@@ -75,7 +75,13 @@ extend-exclude = ["notebooks"]

[tool.ruff.lint]
ignore = ["E501"]
select = ["E4", "E7", "E9", "F", "D200", "D205"]
select = ["E4", "E7", "E9", "F", "D200", "D205", "PLC2701"]
preview = true
explicit-preview-rules = true

[tool.ruff.lint.per-file-ignores]
# Tests legitimately exercise package internals.
"tests/**/*.py" = ["PLC2701"]

[tool.mypy]
python_version = "3.10"
50 changes: 50 additions & 0 deletions src/modelskill/_names.py
@@ -0,0 +1,50 @@
"""Reserved coordinate/variable names and name-or-index resolution.

Centralises the names ModelSkill reserves on its xarray data structures and
the small resolver used wherever a user supplies a name or positional index
against a list of valid names. Private per ADR-012.
"""

from __future__ import annotations
from typing import Sequence, cast
from collections.abc import Hashable


RESERVED_COORD_NAMES = ["x", "y", "z", "time"]
RESERVED_COMPARER_VAR_NAMES = [*RESERVED_COORD_NAMES, "Observation"]


def get_name(x: int | str | None, valid_names: Sequence[Hashable]) -> str:
    """Parse name/idx from list of valid names (e.g. obs from obs_names), return name."""
    return cast(str, valid_names[get_idx(x, valid_names)])


def get_idx(x: int | str | None, valid_names: Sequence[Hashable]) -> int:
    """Parse name/idx from list of valid names (e.g. obs from obs_names), return idx."""
    if x is None:
        if len(valid_names) == 1:
            return 0
        else:
            raise ValueError(
                f"Multiple items available. Must specify name or index. Available items: {valid_names}"
            )

    n = len(valid_names)
    if n == 0:
        raise ValueError(f"Cannot select {x} from empty list!")
    elif isinstance(x, str):
        if x in valid_names:
            idx = valid_names.index(x)
        else:
            raise KeyError(f"Name {x} could not be found in {valid_names}")
    elif isinstance(x, int):
        if x < 0:
            x += n
        if 0 <= x < n:
            idx = x
        else:
            raise IndexError(f"Id {x} is out of range for {valid_names}")
    else:
        raise TypeError(f"Input {x} invalid! Must be None, str or int, not {type(x)}")
    return idx
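
A quick usage sketch of the resolver added above (the observation names are purely illustrative):

```python
from modelskill._names import get_name, get_idx

obs_names = ["Drogden", "Klagshamn"]  # illustrative list of valid names

get_name("Klagshamn", obs_names)  # -> "Klagshamn"
get_name(0, obs_names)            # -> "Drogden"
get_idx(-1, obs_names)            # -> 1 (negative indices wrap around)
get_name(None, obs_names)         # raises ValueError: more than one item available
```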
40 changes: 40 additions & 0 deletions src/modelskill/_regression.py
@@ -0,0 +1,40 @@
"""Linear regression helper shared by ``metrics.lin_slope`` and the scatter trendline.

Private per ADR-012: the leading underscore on the module name signals internal use.
"""

from __future__ import annotations
from typing import Tuple

import numpy as np
from numpy.typing import ArrayLike


def linear_regression(
    obs: ArrayLike, model: ArrayLike, reg_method: str = "ols"
) -> Tuple[float, float]:
    """Fit a linear regression of ``model`` against ``obs`` and return (slope, intercept)."""
    if len(obs) == 0:  # type: ignore[arg-type]
        return np.nan, np.nan

    if reg_method == "ols":
        from scipy.stats import linregress

        reg = linregress(obs, model)
        intercept = reg.intercept
        slope = reg.slope
    elif reg_method == "odr":
        from scipy import odr

        data = odr.Data(obs, model)
        odr_obj = odr.ODR(data, odr.unilinear)
        output = odr_obj.run()

        intercept = output.beta[1]
        slope = output.beta[0]
    else:
        raise NotImplementedError(
            f"Regression method: {reg_method} not implemented, select 'ols' or 'odr'"
        )

    return slope, intercept
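
A usage sketch of the helper added above (input values are illustrative):

```python
import numpy as np
from modelskill._regression import linear_regression

obs = np.array([1.0, 2.0, 3.0, 4.0])
mod = np.array([1.1, 1.9, 3.2, 4.1])

slope, intercept = linear_regression(obs, mod)                 # ordinary least squares (default)
slope_odr, _ = linear_regression(obs, mod, reg_method="odr")   # orthogonal distance regression
```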
36 changes: 18 additions & 18 deletions src/modelskill/comparison/_collection.py
@@ -28,13 +28,13 @@
from ..skill import SkillTable
from ..skill_grid import SkillGrid

from ..utils import _get_name
from .._names import get_name
from ._comparison import Comparer
from ..metrics import _parse_metric
from ._utils import (
_add_spatial_grid_to_df,
_groupby_df,
_parse_groupby,
add_spatial_grid_to_df,
groupby_df,
parse_groupby,
parse_metric,
IdxOrNameTypes,
TimeTypes,
)
@@ -230,7 +230,7 @@ def __getitem__(
return ComparerCollection([self[i] for i in idxs])

if isinstance(x, int):
name = _get_name(x, self.obs_names)
name = get_name(x, self.obs_names)
return self._comparers[name]

if isinstance(x, Iterable):
@@ -332,16 +332,16 @@ def sel(
models = [model]
else:
models = list(model)
mod_names: List[str] = [_get_name(m, self.mod_names) for m in models]
mod_names: List[str] = [get_name(m, self.mod_names) for m in models]
if observation is None:
observation = self.obs_names
else:
observation = [observation] if np.isscalar(observation) else observation # type: ignore
observation = [_get_name(o, self.obs_names) for o in observation] # type: ignore
observation = [get_name(o, self.obs_names) for o in observation] # type: ignore

if (quantity is not None) and (self.n_quantities > 1):
quantity = [quantity] if np.isscalar(quantity) else quantity # type: ignore
quantity = [_get_name(v, self.quantity_names) for v in quantity] # type: ignore
quantity = [get_name(v, self.quantity_names) for v in quantity] # type: ignore
else:
quantity = self.quantity_names

@@ -482,14 +482,14 @@ def skill(
"""
cc = self

pmetrics = _parse_metric(metrics)
pmetrics = parse_metric(metrics)

agg_cols = _parse_groupby(by, n_mod=cc.n_models, n_qnt=cc.n_quantities)
agg_cols = parse_groupby(by, n_mod=cc.n_models, n_qnt=cc.n_quantities)
agg_cols, attrs_keys = self._attrs_keys_in_by(agg_cols)

df = cc._to_long_dataframe(attrs_keys=attrs_keys, observed=observed)

res = _groupby_df(df, by=agg_cols, metrics=pmetrics)
res = groupby_df(df, by=agg_cols, metrics=pmetrics)
mtr_cols = [m.__name__ for m in pmetrics] # type: ignore
res = res.dropna(subset=mtr_cols, how="all") # TODO: ok to remove empty?
res = self._append_xy_to_res(res, cc)
@@ -634,19 +634,19 @@ def gridded_skill(
"""
cmp = self

metrics = _parse_metric(metrics)
metrics = parse_metric(metrics)

df = cmp._to_long_dataframe()
df = _add_spatial_grid_to_df(df=df, bins=bins, binsize=binsize)
df = add_spatial_grid_to_df(df=df, bins=bins, binsize=binsize)

agg_cols = _parse_groupby(by, n_mod=cmp.n_models, n_qnt=cmp.n_quantities)
agg_cols = parse_groupby(by, n_mod=cmp.n_models, n_qnt=cmp.n_quantities)
if "x" not in agg_cols:
agg_cols.insert(0, "x")
if "y" not in agg_cols:
agg_cols.insert(0, "y")

df = df.drop(columns=["x", "y"]).rename(columns=dict(xBin="x", yBin="y"))
res = _groupby_df(df, by=agg_cols, metrics=metrics, n_min=n_min)
res = groupby_df(df, by=agg_cols, metrics=metrics, n_min=n_min)
ds = res.to_xarray().squeeze()

# change categorial index to coordinates
@@ -724,7 +724,7 @@ def mean_skill(
case _:
raise ValueError("Invalid weights specification")

pmetrics = _parse_metric(metrics)
pmetrics = parse_metric(metrics)
skilldf = (
self.skill(metrics=pmetrics)
.to_dataframe()
@@ -809,7 +809,7 @@ def score(
{'mod': 8.414442957854142}
"""

metric = _parse_metric(metric)[0]
metric = parse_metric(metric)[0]

if not (callable(metric) or isinstance(metric, str)):
raise ValueError("metric must be a string or a function")