16 changes: 15 additions & 1 deletion CLAUDE.md
@@ -54,6 +54,21 @@ This updates `roadmap/README.md` from the feature frontmatter.

## Coding Conventions

### Public vs. Internal API

A name is **private** if any segment of its import path starts with `_`.
This is mechanically enforced by `PLC2701`: no module may import a `_foo` name
from another module. A leading underscore on a function name is otherwise only
a convention: authors may use it as a hint that a function is file-local, but
this is neither required nor enforced.

- Public: `modelskill.matching.match`, `modelskill.utils.rename_coords_xr`
- Private (module path): `modelskill._names.get_name`, `modelskill.timeseries._timeseries.validate_data_var_name`
- File-local hint: a leading `_` on a function name inside any module, signalling that the function is not used across modules

See [ADR-012](adr/012-public-private-api-convention.md). CI enforces Ruff rule
`PLC2701` with zero violations in `src/`; tests have a per-file ignore.
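
For illustration, this is the shape of import the rule forbids and the shape it allows (both taken from this PR's own changes):

```python
# Flagged by PLC2701: an underscore-prefixed name imported across modules.
from modelskill.utils import _get_name   # pre-ADR-012 style

# Allowed: the privacy signal lives in the module path, not the function name.
from modelskill._names import get_name   # post-ADR-012 style
```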

### Docstrings
- All docstrings use **NumPy format** (not Google or reStructuredText style)
- Include sections: Parameters, Returns, Raises, Examples, See Also, Notes as appropriate
@@ -169,7 +184,6 @@ Plots support both matplotlib (static) and plotly (interactive) backends.
- Internal data storage uses xarray Datasets with standardized coordinate/variable names
- Time coordinates use pandas datetime64
- Spatial coordinates: `x`, `y` (and `z` when applicable)
- Reserved names in `_RESERVED_NAMES` should not be used for model/observation names
- The `Quantity` class handles physical quantities with units and validation

### Testing Structure (`tests/`)
56 changes: 56 additions & 0 deletions adr/012-public-private-api-convention.md
@@ -0,0 +1,56 @@
# ADR-012: Public vs. internal API convention

**Status**: Accepted

**Date**: 2026-05-14

## Context

ModelSkill had accumulated an inconsistent privacy convention. Leading underscores appeared on both **file names** (`_comparison.py`, `_misc.py`, `_timeseries.py`) and **function or constant names** (`_get_name`, `_parse_metric`, `_validate_data_var_name`), with no written rule about what each level meant or how the two interacted. Symptoms:

- `from modelskill.timeseries._timeseries import _validate_data_var_name` — privacy claimed at both the path segment *and* the function name, while the function was imported from 4+ other files across subpackages.
- `timeseries/__init__.py` placed five underscore-prefixed names in `__all__`, simultaneously declaring them "officially public" and "private."
- Pyright (Pylance) flagged `reportPrivateUsage` on every cross-module `_foo` import, generating ambient warnings during development.
- No written policy distinguished "what users can rely on" from "what the package uses internally."

Three candidate definitions of "public" were considered:

- **Strict**: only names in `modelskill.__all__` at the top level.
- **Documented**: anything appearing in `docs/api/`.
- **Conventional**: anything reachable as `modelskill.<...>` without a leading `_` in any path segment.

The conventional definition was chosen because the team did not want documentation gaps to silently shrink the public API: a function reachable via a non-underscored import path could already have been adopted by users, even if undocumented.

## Decision

**Privacy is set by the module path, not the function name.** A name is private if any segment of its import path starts with `_`. The only mechanically enforced rule is `PLC2701`: no module may import a name starting with `_` from another module. A leading `_` on a function name is otherwise a convention with no enforced meaning — authors may use it as a hint that a function is file-local. Inside a private module, where the module path already carries the privacy signal, this hint is optional.

Examples:

- `modelskill.timeseries._timeseries.validate_data_var_name` — private (module path).
- `modelskill._names.get_name` — private (module path).
- `modelskill.metrics._format_directional_label` (hypothetical) — file-local by convention (underscore on the function name); used only inside `metrics.py`.
- `modelskill.matching.match` — public.

This is the same convention used by scikit-learn (`sklearn.utils._param_validation.validate_params`).

The convention is mechanically enforced by Ruff rule `PLC2701` (`import-private-name`), which flags `from x import _foo` style imports. After the renames applied with this ADR, the codebase has zero PLC2701 violations in `src/`. Tests are allowed to import private names via a per-file ignore, since white-box testing of internals is a legitimate need.
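
The corresponding Ruff configuration, as it appears in `pyproject.toml` after this change, is:

```toml
[tool.ruff.lint]
ignore = ["E501"]
select = ["E4", "E7", "E9", "F", "D200", "D205", "PLC2701"]
preview = true
explicit-preview-rules = true

[tool.ruff.lint.per-file-ignores]
# Tests legitimately exercise package internals.
"tests/**/*.py" = ["PLC2701"]
```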

## Alternatives Considered

**Snapshot-test the public surface.** A test that asserts `dir(modelskill)` matches a frozen list catches accidental public additions but requires updating the expected list on every intentional public change. Rejected as ongoing maintenance cost for limited additional coverage beyond what Ruff and PR review already provide.

**Introduce a `modelskill/_internal/` subpackage.** Concentrating all cross-cutting internal helpers in one location would make "shared internal API" structurally obvious. Rejected as pure churn: the helpers already live in natural subpackage homes, and moving them does not change the privacy story under this convention.

**Rename `utils.py` → `_utils.py` and rely on a deprecation shim.** Would break legitimately-public functions (`rename_coords_xr`, `rename_coords_pd`, `make_unique_index`) that are imported from non-underscored paths internally and could already be used by downstream consumers. The actual cross-cutting *private* helpers (`_get_name`, `_get_idx`, `_RESERVED_NAMES`) were split out into a new `modelskill/_names.py` module instead, leaving `utils.py` as a public module. The regression helper that previously lived as `metrics._linear_regression` was split into its own single-purpose private module `modelskill/_regression.py`. Naming the new module `_names.py` rather than `_utils.py` is deliberate: it describes the concern (reserved names plus name/index resolution) instead of a contentless role.

**Keep the status quo and configure Pyright to silence `reportPrivateUsage`.** Rejected because the warning is genuinely useful for catching future drift; removing the noise by suppression would also remove the signal.

## Consequences

- **Cross-module use of internal helpers is no longer flagged.** `from modelskill._names import get_name` is unambiguously legal; the module path says "internal." Pyright's `reportPrivateUsage` falls silent.
- **Function-name underscores remain a convention.** `PLC2701` enforces that no `_foo` may be imported across modules. Beyond that, underscores carry no formal meaning: authors may use them as a navigation hint that a function is file-local, in either public or private modules.
- **Future contributors have a clear default.** New internal helpers go in `_modules`, not in `_underscored_names` inside public modules. PRs that put internal helpers in `metrics.py` or `utils.py` with a leading underscore now stand out as the exception rather than the norm.
- **One breaking change is accepted.** Five underscored names were previously re-exported from `modelskill.timeseries` via `__all__` (`_parse_track_input`, etc.). These names are no longer reachable from `modelskill.timeseries`; the renamed functions live in `modelskill.timeseries._point`, `_track`, `_vertical`. By Python convention, leading-underscore names were never part of the stable API, so this is consistent with the new policy.
- **Tests retain access to internals** via a `tests/**/*.py` per-file ignore for `PLC2701`. White-box testing of private modules remains supported.
- **`metrics.py` no longer hosts `_parse_metric` or `_linear_regression`.** They moved to `comparison/_utils.py:parse_metric` and `_regression.py:linear_regression` respectively. The first is consumer-specific to the comparison subpackage; the second is a general regression helper used by both `metrics.lin_slope` and `plotting._scatter`.
1 change: 1 addition & 0 deletions adr/README.md
@@ -30,6 +30,7 @@ Each ADR follows this structure:
- [ADR-009](009-factory-pattern.md) - Factory pattern for type detection
- [ADR-010](010-optional-domain-dependencies.md) - Optional dependencies for domain-specific model types (Draft)
- [ADR-011](011-vertical-pre-extracted-columns.md) - VerticalModelResult ingests pre-extracted columns
- [ADR-012](012-public-private-api-convention.md) - Public vs. internal API convention

## Contributing

8 changes: 7 additions & 1 deletion pyproject.toml
@@ -75,7 +75,13 @@ extend-exclude = ["notebooks"]

[tool.ruff.lint]
ignore = ["E501"]
select = ["E4", "E7", "E9", "F", "D200", "D205"]
select = ["E4", "E7", "E9", "F", "D200", "D205", "PLC2701"]
preview = true
explicit-preview-rules = true

[tool.ruff.lint.per-file-ignores]
# Tests legitimately exercise package internals.
"tests/**/*.py" = ["PLC2701"]

[tool.mypy]
python_version = "3.10"
50 changes: 50 additions & 0 deletions src/modelskill/_names.py
@@ -0,0 +1,50 @@
"""Reserved coordinate/variable names and name-or-index resolution.

Centralises the names ModelSkill reserves on its xarray data structures and
the small resolver used wherever a user supplies a name or positional index
against a list of valid names. Private per ADR-012.
"""

from __future__ import annotations
from typing import Sequence, cast
from collections.abc import Hashable


RESERVED_COORD_NAMES = ["x", "y", "z", "time"]
RESERVED_COMPARER_VAR_NAMES = [*RESERVED_COORD_NAMES, "Observation"]


def get_name(x: int | str | None, valid_names: Sequence[Hashable]) -> str:
    """Parse name/idx from list of valid names (e.g. obs from obs_names), return name."""
    return cast(str, valid_names[get_idx(x, valid_names)])


def get_idx(x: int | str | None, valid_names: Sequence[Hashable]) -> int:
    """Parse name/idx from list of valid names (e.g. obs from obs_names), return idx."""
    if x is None:
        if len(valid_names) == 1:
            return 0
        else:
            raise ValueError(
                f"Multiple items available. Must specify name or index. Available items: {valid_names}"
            )

    n = len(valid_names)
    if n == 0:
        raise ValueError(f"Cannot select {x} from empty list!")
    elif isinstance(x, str):
        if x in valid_names:
            idx = valid_names.index(x)
        else:
            raise KeyError(f"Name {x} could not be found in {valid_names}")
    elif isinstance(x, int):
        if x < 0:
            x += n
        if 0 <= x < n:
            idx = x
        else:
            raise IndexError(f"Id {x} is out of range for {valid_names}")
    else:
        raise TypeError(f"Input {x} invalid! Must be None, str or int, not {type(x)}")
    return idx
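
A quick usage sketch of the resolver added above (the observation names are purely illustrative):

```python
from modelskill._names import get_name, get_idx

obs_names = ["Drogden", "Klagshamn"]  # illustrative list of valid names

get_name("Klagshamn", obs_names)  # -> "Klagshamn"
get_name(0, obs_names)            # -> "Drogden"
get_idx(-1, obs_names)            # -> 1 (negative indices wrap around)
get_name(None, obs_names)         # raises ValueError: more than one item available
```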
40 changes: 40 additions & 0 deletions src/modelskill/_regression.py
@@ -0,0 +1,40 @@
"""Linear regression helper shared by ``metrics.lin_slope`` and the scatter trendline.

Private per ADR-012: the leading underscore on the module name signals internal use.
"""

from __future__ import annotations
from typing import Tuple

import numpy as np
from numpy.typing import ArrayLike


def linear_regression(
    obs: ArrayLike, model: ArrayLike, reg_method: str = "ols"
) -> Tuple[float, float]:
    """Fit a linear regression of ``model`` against ``obs`` and return (slope, intercept)."""
    if len(obs) == 0:  # type: ignore[arg-type]
        return np.nan, np.nan

    if reg_method == "ols":
        from scipy.stats import linregress

        reg = linregress(obs, model)
        intercept = reg.intercept
        slope = reg.slope
    elif reg_method == "odr":
        from scipy import odr

        data = odr.Data(obs, model)
        odr_obj = odr.ODR(data, odr.unilinear)
        output = odr_obj.run()

        intercept = output.beta[1]
        slope = output.beta[0]
    else:
        raise NotImplementedError(
            f"Regression method: {reg_method} not implemented, select 'ols' or 'odr'"
        )

    return slope, intercept
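
A usage sketch of the helper added above (input values are illustrative):

```python
import numpy as np
from modelskill._regression import linear_regression

obs = np.array([1.0, 2.0, 3.0, 4.0])
mod = np.array([1.1, 1.9, 3.2, 4.1])

slope, intercept = linear_regression(obs, mod)                 # ordinary least squares (default)
slope_odr, _ = linear_regression(obs, mod, reg_method="odr")   # orthogonal distance regression
```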
36 changes: 18 additions & 18 deletions src/modelskill/comparison/_collection.py
@@ -28,13 +28,13 @@
from ..skill import SkillTable
from ..skill_grid import SkillGrid

from ..utils import _get_name
from .._names import get_name
from ._comparison import Comparer
from ..metrics import _parse_metric
from ._utils import (
_add_spatial_grid_to_df,
_groupby_df,
_parse_groupby,
add_spatial_grid_to_df,
groupby_df,
parse_groupby,
parse_metric,
IdxOrNameTypes,
TimeTypes,
)
@@ -230,7 +230,7 @@ def __getitem__(
return ComparerCollection([self[i] for i in idxs])

if isinstance(x, int):
name = _get_name(x, self.obs_names)
name = get_name(x, self.obs_names)
return self._comparers[name]

if isinstance(x, Iterable):
@@ -332,16 +332,16 @@ def sel(
models = [model]
else:
models = list(model)
mod_names: List[str] = [_get_name(m, self.mod_names) for m in models]
mod_names: List[str] = [get_name(m, self.mod_names) for m in models]
if observation is None:
observation = self.obs_names
else:
observation = [observation] if np.isscalar(observation) else observation # type: ignore
observation = [_get_name(o, self.obs_names) for o in observation] # type: ignore
observation = [get_name(o, self.obs_names) for o in observation] # type: ignore

if (quantity is not None) and (self.n_quantities > 1):
quantity = [quantity] if np.isscalar(quantity) else quantity # type: ignore
quantity = [_get_name(v, self.quantity_names) for v in quantity] # type: ignore
quantity = [get_name(v, self.quantity_names) for v in quantity] # type: ignore
else:
quantity = self.quantity_names

@@ -482,14 +482,14 @@ def skill(
"""
cc = self

pmetrics = _parse_metric(metrics)
pmetrics = parse_metric(metrics)

agg_cols = _parse_groupby(by, n_mod=cc.n_models, n_qnt=cc.n_quantities)
agg_cols = parse_groupby(by, n_mod=cc.n_models, n_qnt=cc.n_quantities)
agg_cols, attrs_keys = self._attrs_keys_in_by(agg_cols)

df = cc._to_long_dataframe(attrs_keys=attrs_keys, observed=observed)

res = _groupby_df(df, by=agg_cols, metrics=pmetrics)
res = groupby_df(df, by=agg_cols, metrics=pmetrics)
mtr_cols = [m.__name__ for m in pmetrics] # type: ignore
res = res.dropna(subset=mtr_cols, how="all") # TODO: ok to remove empty?
res = self._append_xy_to_res(res, cc)
@@ -634,19 +634,19 @@ def gridded_skill(
"""
cmp = self

metrics = _parse_metric(metrics)
metrics = parse_metric(metrics)

df = cmp._to_long_dataframe()
df = _add_spatial_grid_to_df(df=df, bins=bins, binsize=binsize)
df = add_spatial_grid_to_df(df=df, bins=bins, binsize=binsize)

agg_cols = _parse_groupby(by, n_mod=cmp.n_models, n_qnt=cmp.n_quantities)
agg_cols = parse_groupby(by, n_mod=cmp.n_models, n_qnt=cmp.n_quantities)
if "x" not in agg_cols:
agg_cols.insert(0, "x")
if "y" not in agg_cols:
agg_cols.insert(0, "y")

df = df.drop(columns=["x", "y"]).rename(columns=dict(xBin="x", yBin="y"))
res = _groupby_df(df, by=agg_cols, metrics=metrics, n_min=n_min)
res = groupby_df(df, by=agg_cols, metrics=metrics, n_min=n_min)
ds = res.to_xarray().squeeze()

# change categorial index to coordinates
@@ -724,7 +724,7 @@ def mean_skill(
case _:
raise ValueError("Invalid weights specification")

pmetrics = _parse_metric(metrics)
pmetrics = parse_metric(metrics)
skilldf = (
self.skill(metrics=pmetrics)
.to_dataframe()
@@ -809,7 +809,7 @@ def score(
{'mod': 8.414442957854142}
"""

metric = _parse_metric(metric)[0]
metric = parse_metric(metric)[0]

if not (callable(metric) or isinstance(metric, str)):
raise ValueError("metric must be a string or a function")