Add support for network model results (Res1d) #536
Conversation
The CI test for the latest Python version fails, and will likely continue to fail in the future due to the Pythonnet dependency of mikeio1d:
Current problem of integration between MIKE IO 1D and ModelSkill

A user of MIKE 1D would most often have observational data from a monitor at a specific location (e.g. a flow meter in a pipe, a gauge in a water distribution network, or some operational gauges at a WWTP). Track observations are not relevant (to my knowledge). ModelSkill has two available observation types: PointObservation and TrackObservation.
The workflow today is to manually save observation and model results to … Therefore, there's a need for a new …

Alternative designs

Design 1 - network location as kwargs, specific to MIKE 1D

```python
obs_12l1 = ms.NetworkLocationObservation(
    data="data/flow_meter.dfs0",
    item=0,
    reach="12l1",     # alternatively catchment=, or node= ...
    chainage=28.410,  # alternatively, x,y coords to choose nearest gridpoint, respecting graph connectivity
    name="Flow Meter A",
)
mod_12l1_v1 = ms.NetworkModelResult(
    data="data/network_v1.res1d",
)
mod_12l1_v2 = ms.NetworkModelResult(
    data="data/network_v2.res1d",
)
cc = ms.ComparerCollection([
    ms.match(obs_12l1, mod_12l1_v1),
    ms.match(obs_12l1, mod_12l1_v2),
])
```

Design 2 - generic location interface, implemented by each library

```python
obs_12l1 = ms.NetworkLocationObservation(
    data="data/flow_meter.dfs0",
    item=0,
    network_location=res.reaches["my_reach"][0],  # leave complexity of locating network in mikeio1d
    name="Flow Meter A",
)
mod_12l1_v1 = ms.NetworkModelResult(
    data="data/network_v1.res1d",
)
mod_12l1_v2 = ms.NetworkModelResult(
    data="data/network_v2.res1d",
)
cc = ms.ComparerCollection([
    ms.match(obs_12l1, mod_12l1_v1),
    ms.match(obs_12l1, mod_12l1_v2),
])
```
This requires updates to mikeio1d (and other network libraries) to implement interfaces somewhat like this:

```python
from dataclasses import dataclass
from typing import Protocol

import pandas as pd


@dataclass(frozen=True)
class NodeLocation:
    """Minimal explicit node location."""

    node_id: str
    group: str | None = None


@dataclass(frozen=True)
class ReachLocation:
    """Minimal explicit reach location."""

    reach_id: str
    chainage: float | None = None       # either chainage or gridpoint_index
    gridpoint_index: int | None = None  # needs to be provided
    group: str | None = None


NetworkLocation = NodeLocation | ReachLocation


class GetNetworkQuantityResult(Protocol):
    def __call__(self, location: NetworkLocation, quantity: str) -> pd.Series:
        ...
```
There would need to be a plugin mechanism where ModelSkill discovers packages providing these capabilities. This could be determined by file extension, or by a backend parameter where ambiguous. The plugin discovery should happen behind the scenes (e.g. …).

Work in progress...
Observations will typically be passed to … On the other hand, I do see that some sort of … Still, I see that this approach is the opposite of @ecomodeller's suggestion here. There, the observations are accessed by reach and chainage, but the model results are accessed directly by name. I will push my suggestions soon; just let me know if you have any thoughts.
A model result covers the entire domain, while each named observation is located at a single position. So in order to assess the quality of a model, several observations distributed across the domain are used. Thus, the location of the observation has to be recorded somewhere: NetCDF can carry the location as metadata, while for other formats (csv, dfs0) this information is typically stored elsewhere and is added to the observation object in order to extract matching model data from the model result.
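To make the point concrete, here is a toy illustration (plain Python, not ModelSkill code) of an observation read from a csv that carries no coordinates itself, with the location supplied externally and attached to the observation object:

```python
# Illustrative only: a csv time series has no spatial metadata, so the
# location must be recorded elsewhere and attached to the observation.
import csv
import io

raw = "time,discharge\n2020-01-01 00:00,1.2\n2020-01-01 01:00,1.4\n"
rows = list(csv.DictReader(io.StringIO(raw)))

observation = {
    "data": rows,
    # location supplied externally (e.g. from a sensor register)
    "x": 12.34,
    "y": 55.67,
    "name": "Gauge A",
}
```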
In such a case we can compare multiple …
In this design, … The most important aspect that I see is that we need to be consistent with the existing ModelSkill API. Here, the user expects the model result to cover an entire domain and the observation to include sufficient spatial metadata to extract the right data from the modelling domain for comparison. For example, when you compare a dfsu model result with a point observation that has an x, y, the spatial information on the sensor is sufficient to unambiguously extract data from the dfsu. Unfortunately, with network data, an x, y coordinate is not enough to uniquely identify a time series.
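The ambiguity can be shown with a toy example (plain Python, not ModelSkill code): two pipes crossing at the same coordinate carry distinct time series, so keying results by (x, y) silently loses one of them, while keying by (reach, chainage) does not.

```python
# Toy illustration: two network gridpoints share an (x, y) coordinate,
# e.g. pipes crossing at different depths.
points = [
    {"reach": "pipe_upper", "chainage": 10.0, "x": 5.0, "y": 5.0},
    {"reach": "pipe_lower", "chainage": 3.5, "x": 5.0, "y": 5.0},
]

# Keyed by coordinate: the second entry overwrites the first.
by_xy = {(p["x"], p["y"]): p for p in points}

# Keyed by network location: both time series remain addressable.
by_network = {(p["reach"], p["chainage"]): p for p in points}
```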
My current implementation works like this: … In this case, I use one model and compare it to multiple observations. I agree with being consistent with …
I see. I think the common case is that the sensors would be located at different spatial locations in the network. For example, one might have 5-20 sensors, each at a unique location in the network. How does the implementation handle that case? In the example above, I understand that these 4 sensors would all exist at a single location in the network (node 7). Here's a small example of such a use case. In this network, there are two "flow meters" that measure discharge at two unique locations in the network (shown in the image below). If I understand it correctly, the current implementation would need to handle it like this:

```python
quantity = "Discharge"
mod_item1 = ms.NetworkModelResult(path_to_res1d, quantity, reach="pipe1", chainage=0)
mod_item2 = ms.NetworkModelResult(path_to_res1d, quantity, reach="pipe2", chainage=0)
sens = [
    ms.PointObservation(observations, item="sensor_1"),
    ms.PointObservation(observations, item="sensor_2"),
]
cmp1 = ms.match(sens[0], mod_item1)
cmp2 = ms.match(sens[1], mod_item2)
cmp1.skill()
cmp2.skill()
```
Oh, of course, that makes sense. Let me update my changes and I'll let you know.
I was reflecting on this. Maybe neither of these is the right abstraction yet, since they weren't obvious. Perhaps there's a need for both a NetworkModelResult AND a NetworkModelPointResult to make the distinction more explicit. The parallel with the existing ModelSkill API would be Dfsu/Grid model results and PointModelResult, where the latter can be extracted from the former. Just thinking out loud with this for your consideration.
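A rough sketch of that split could look like the following. Both class names and the `extract` method are hypothetical, mirroring the existing dfsu-to-point extraction pattern rather than any current API:

```python
# Hypothetical sketch: a domain-wide NetworkModelResult from which point
# results at specific network locations are extracted.
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkModelPointResult:
    """A single time series at one network location."""

    reach: str
    chainage: float
    quantity: str


@dataclass(frozen=True)
class NetworkModelResult:
    """A domain-wide result file covering the whole network."""

    path: str

    def extract(self, reach: str, chainage: float, quantity: str) -> NetworkModelPointResult:
        # A real implementation would read the time series from the res1d
        # file here; this sketch only carries the location metadata through.
        return NetworkModelPointResult(reach, chainage, quantity)


mod = NetworkModelResult("data/network_v1.res1d")
pt = mod.extract(reach="pipe1", chainage=0.0, quantity="Discharge")
```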
