diff --git a/examples/collect_observations/ccss_swe_collect_observations.ipynb b/examples/collect_observations/ccss_swe_collect_observations.ipynb index a315d78..e257e0a 100644 --- a/examples/collect_observations/ccss_swe_collect_observations.ipynb +++ b/examples/collect_observations/ccss_swe_collect_observations.ipynb @@ -259,25 +259,11 @@ "for i in gdf_in_bbox.index:\n", " observation_utils.getCCSSData(gdf_in_bbox.name[i], gdf_in_bbox.code[i], 'Ca', StartDate, EndDate, OBS_OutputFolder)" ] - }, - { - "cell_type": "markdown", - "id": "7e4506d2", - "metadata": {}, - "source": [] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "208439e6", - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { "kernelspec": { - "display_name": "cssi_evaluation", + "display_name": "nwm_env", "language": "python", "name": "python3" }, diff --git a/examples/nwm/nwm_swe_point_scale_evaluation.ipynb b/examples/nwm/nwm_swe_point_scale_evaluation.ipynb index c01c054..999dd20 100644 --- a/examples/nwm/nwm_swe_point_scale_evaluation.ipynb +++ b/examples/nwm/nwm_swe_point_scale_evaluation.ipynb @@ -244,7 +244,7 @@ "outputs": [], "source": [ "# Create a folder to save results. Raise an error if the path already exists\n", - "Path(OBS_OutputFolder).mkdir(exist_ok=False)" + "Path(OBS_OutputFolder).mkdir(exist_ok=True)" ] }, { @@ -335,7 +335,7 @@ "outputs": [], "source": [ "# Create a folder to save results. If the folder already exists, an error will be raised.\n", - "Path(MOD_OutputFolder).mkdir(exist_ok=False)\n", + "Path(MOD_OutputFolder).mkdir(exist_ok=True)\n", "\n", "# Download NWM SWE data for the sites within the watershed bounding box, save to /mod_outputs folder\n", "input_crs = 'EPSG:4326' # WGS84 lat/lon. This is the CRS of the input geodataframe (gdf_in_bbox) \n", @@ -842,7 +842,7 @@ "source": [ "
\n", "

🧠 Reflect

\n", - "

Looking at the two plots, what could be some reasons for the model having simulated peak SWE so much earlier and less than observed in Paradise Meadow (PDS)? Perhaps try changing the my_site_code from earlier in the notebook to rerun nwm_utils.comparison_plots() to see the timeseries. \n", + "

Looking at the two plots, what could be some reasons for the model simulating peak SWE so much earlier and lower than observed at Paradise Meadow (PDS)? Perhaps try changing my_site_code from earlier in the notebook and rerunning plot_utils.comparison_plots() to see the timeseries. \n", "\n", "
What happens if you change the year that is plotted?
✏️ Try modifying the bar plot code from bar1 = year1.hvplot.bar to bar1 = year2.hvplot.bar

\n", "
" @@ -1086,6 +1086,56 @@ "metrics_df" ] }, + { + "cell_type": "markdown", + "id": "2cfc514c", + "metadata": {}, + "source": [ + "Look at plots of summary statistics for each station. Here we look at Bias and NSE for each station in the watershed:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6d6b779b", + "metadata": {}, + "outputs": [], + "source": [ + "# Bias scatter\n", + "scatter = metrics_df.hvplot.scatter(\n", + " x='site_id',\n", + " y='bias',\n", + " size=100,\n", + " rot=45,\n", + " ylabel='Bias (m)',\n", + " title='Mean SWE Bias by Station'\n", + ")\n", + "\n", + "hline = hv.HLine(0).opts(color='black', line_dash='dashed', line_width=1)\n", + "\n", + "scatter * hline" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "99110e67", + "metadata": {}, + "outputs": [], + "source": [ + "# NSE histogram\n", + "metrics_df.hvplot.bar(\n", + " x='site_id',\n", + " y='nse',\n", + " rot=45,\n", + " ylabel='Nash–Sutcliffe Efficiency',\n", + " title='NSE by Station',\n", + " height=400,\n", + " width=600,\n", + " bar_width=0.5\n", + ")\n" + ] + }, { "cell_type": "markdown", "id": "5c521c83", @@ -1099,6 +1149,30 @@ " " ] }, + { + "cell_type": "code", + "execution_count": null, + "id": "dd685949", + "metadata": {}, + "outputs": [], + "source": [ + "plot_utils.comparison_plots(obs_df, model_df, f'{my_site_code}', f'{my_site_code}', site_label=None)\n", + "\n", + "# Change the site code to see other Snotel Stations --> e.g., '688:CO:SNTL'\n", + "#plot_utils.comparison_plots(obs_df, model_df, '688:CO:SNTL', '688:CO:SNTL', site_label=None)" + ] + }, + { + "cell_type": "markdown", + "id": "965e400b", + "metadata": {}, + "source": [ + "
\n", + "

🧠 Reflect

\n", + "

You now have several performance metrics: Bias, Pearson Correlation, Spearman Correlation, NSE, and KGE. If you had to pick just one metric to summarize model performance, which would you choose, and why? As you review the results, compare the peak SWE magnitudes and the timing of snowmelt onset. Do you see any significant differences? Which dataset indicates an earlier melt?

\n", + "
" + ] + }, { "cell_type": "markdown", "id": "ce04cad1", @@ -1141,53 +1215,6 @@ "For some sites, Pearson and Spearman correlations are both close to 1, suggesting a strong relationship between observed and modeled SWE. As shown on the timeseries plot, this strong correlation alone does not indicate a \"good\" model. For example, it does not guarantee accurate timing of key events, such as peak SWE or melt onset. Let's compare these as well. The following code uses `report_max_dates_and_values` function to identify the peak SWE value and the date it occurs for both the observed (CCSS) and modeled (NWM) datasets. " ] }, - { - "cell_type": "markdown", - "id": "965e400b", - "metadata": {}, - "source": [ - "
\n", - "

🧠 Reflect

\n", - "

You now have several performance metrics: Bias, Pearson Correlation, Spearman Correlation, NSE, and KGE. If you had to pick just one metric to summarize model performance, which would you choose—and why? As you review the results, compare the peak flow amounts and the timing of snowmelt onset. Do you see any significant differences? Which dataset indicates an earlier melt?

\n", - "
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "99110e67", - "metadata": {}, - "outputs": [], - "source": [ - "metrics_df.hvplot.bar(\n", - " x='site_id',\n", - " y='nse',\n", - " rot=45,\n", - " ylabel='Nash–Sutcliffe Efficiency',\n", - " title='NSE by Station',\n", - " height=400,\n", - " width=600,\n", - " bar_width=0.5\n", - ")\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6d6b779b", - "metadata": {}, - "outputs": [], - "source": [ - "metrics_df.hvplot.scatter(\n", - " x='site_id',\n", - " y='bias',\n", - " size=100,\n", - " rot=45,\n", - " ylabel='Bias (m)',\n", - " title='Mean SWE Bias by Station'\n", - ")\n" - ] - }, { "cell_type": "markdown", "id": "dc3c56e9", @@ -1201,19 +1228,11 @@ "id": "310c309c", "metadata": {}, "source": [ + "One way to learn more about the model performance is to combine metrics that tell us about different aspects of model behavior—such as timing, variability, and magnitude—rather than relying on a single summary measure.\n", + "\n", "The Condon diagram separates model performance into quadrants based on two metrics: **Spearman’s rho** (shape/time agreement) and **relative bias** (magnitude error). The horizontal line at 0.5 distinguishes whether the model captures the temporal pattern well (above 0.5 = good shape), while the vertical line is traditionally placed at a relative bias of 1.0, which represents a 100% error. This means the model’s total error is as large as the observed signal itself. This threshold has a clear physical interpretation and is used in the original Condon framework to distinguish acceptable vs. large bias. 
" ] }, - { - "cell_type": "code", - "execution_count": null, - "id": "10a68672", - "metadata": {}, - "outputs": [], - "source": [ - "%reload_ext autoreload" - ] - }, { "cell_type": "code", "execution_count": null, @@ -1223,14 +1242,6 @@ "source": [ "plot_utils.plot_condon_diagram(metrics_df, variable=\"SWE\")" ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a2ae0058-3437-4ee0-8f0a-5bc1a289da37", - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { diff --git a/examples/parflow/parflow_swe_point_scale_evaluation.ipynb b/examples/parflow/parflow_swe_point_scale_evaluation.ipynb index 88d04c8..c99c12e 100644 --- a/examples/parflow/parflow_swe_point_scale_evaluation.ipynb +++ b/examples/parflow/parflow_swe_point_scale_evaluation.ipynb @@ -2,93 +2,26 @@ "cells": [ { "cell_type": "markdown", + "id": "70e9e22e", "metadata": {}, "source": [ "![NWM](../img/NWM.png)\n", "\n", - "# Use HydroData to Retrieve Modeled and Observed Snow Data for a Watershed of Interest with ParFlow-CONUS Outputs vs Observed Snow Water Equivalent (SWE) - Full Evaluation Workflow\n", + "# Use HydroData to Retrieve Modeled and Observed Snow Data for a Watershed of Interest \n", + "## ParFlow-CONUS Outputs vs Observed Snow Water Equivalent (SWE) \n", + "\n", "Authors: Irene Garousi-Nejad (igarousi@cuahsi.org), Danielle Tijerina-Kreuzer (dtijerina@cuahsi.org) \n", - "Last updated: Feb 2026" + "Last updated: April 2026" ] }, { "cell_type": "markdown", + "id": "e84e3989", "metadata": {}, "source": [ "#### Introduction: \n", - "This notebook demonstrates how to perform a point-scale analysis comparing modeled and observed SWE at selected SNOTEL sites. We focus on analyzing model performance both for **a single SNOTEL site** and **watershed-scale behavior for multiple stations**, with particular attention to the **magnitude and timing of peak SWE**. 
\n", - "\n", - "# FIX THIS: This notebook requires ParFlow-CONUS output, SNOTEL data, and metadata CSVs that are created in the `01_HydroData_collection.ipynb` notebook." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 1. Prepare the Python Environment" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Ensure that the `nwm_env` conda environment is selected as your Jupyter kernel. This environment should already be created if you followed the instructions under section \"Creating your HydroLearnEnv Virtual Environment\" in the `getting_started.md` file." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Import the libraries needed to run this notebook:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "import os\n", - "import sys\n", - "import glob\n", - "from pathlib import Path\n", - "\n", - "import holoviews as hv\n", - "import hvplot.pandas\n", - "import geopandas as gpd\n", - "import pandas as pd\n", - "\n", - "# Import the Evaluation library from the project root.\n", - "sys.path.append(str((Path.cwd().absolute() / \"../../../src\").resolve()))\n", - "from cssi_evaluation.snow import utils\n", - "from cssi_evaluation import evaluation_metrics\n", - "\n", - "# import notebook-specific utilities\n", - "from utils import nwm_utils\n", - "\n", - "hv.extension('bokeh')\n", - "\n", - "\n", - "%load_ext autoreload\n", - "%autoreload 2" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "60a74589", - "metadata": {}, - "outputs": [], - "source": [] - }, - { - "cell_type": "markdown", - "id": "19adce70", - "metadata": {}, - "source": [ - "# Get data from Hydrodata - from `dataCollectionHydrodata_parflow.ipynb` notebook and needs to be merged into this notebook" + "This notebook demonstrates how to perform a point-scale analysis comparing modeled and observed SWE at selected SNOTEL sites, 
implementing a full Model Evaluation Workflow. \n", + "We focus on analyzing model performance both for **a single SNOTEL site** and **watershed-scale behavior for multiple stations**, with particular attention to the **magnitude and timing of peak SWE**. " ] }, { @@ -106,7 +39,7 @@ "source": [ "### 1a. Python Environment \n", "\n", - "Ensure that the `nwm_env` conda environment is selected as your Jupyter kernel. This environment should already be created if you followed the instructions under section \"Creating your HydroLearnEnv Virtual Environment\" in the `getting_started.md` file." + "Ensure that the `cssi_evaluation` conda environment is selected as your Jupyter kernel. This environment should already be created if you followed the instructions under section \"Creating your HydroLearnEnv Virtual Environment\" in the `GettingStarted.md` file." ] }, { @@ -120,20 +53,20 @@ { "cell_type": "code", "execution_count": null, - "id": "0dfc3fb3", - "metadata": {}, + "id": "ce97d33e", + "metadata": { + "tags": [] + }, "outputs": [], "source": [ "import os\n", "import sys\n", - "\n", - "prefix = os.environ['CONDA_PREFIX']\n", - "os.environ['PROJ_LIB'] = os.path.join(prefix, 'share', 'proj')\n", - "\n", - "# add the src directory to the path so we can import evaluation modules\n", - "sys.path.append('../../src/')\n", - "\n", - "import sys\n", + "from pathlib import Path\n", + "import holoviews as hv\n", + "import hvplot.pandas\n", + "import hvplot.xarray\n", + "import plotly.io as pio\n", + "import plotly.express as px\n", "import pyproj\n", "import pandas as pd\n", "import numpy as np\n", @@ -144,14 +77,19 @@ "import matplotlib.dates as mdates\n", "import hf_hydrodata as hf\n", "import subsettools\n", - "import hvplot.xarray\n", "\n", "\n", - "from cssi_evaluation.utils import plot_utils\n", + "# Import the Evaluation library from the project root.\n", + "sys.path.append(str((Path.cwd().absolute() / \"../../src\").resolve()))\n", + "\n", + "from 
cssi_evaluation.variables import snow_utils\n", + "from cssi_evaluation.utils import metric_utils, evaluation_utils, plot_utils\n", + "\n", + "hv.extension('bokeh')\n", "\n", "\n", "%load_ext autoreload\n", - "%autoreload 2\n" + "%autoreload 2" ] }, { @@ -174,7 +112,7 @@ "# You need to register on https://hydrogen.princeton.edu/pin \n", "# and run the following with your registered information\n", "# before you can use the hydrodata utilities\n", - "hf.register_api_pin(\"dtt2@princeton.edu\", \"7837\")" + "hf.register_api_pin(\"your_email\", \"your_api_pin\")" ] }, { @@ -212,7 +150,7 @@ "id": "b8620cfc", "metadata": {}, "source": [ - "## 2. Set Paths" + "### 1d. Set Paths" ] }, { @@ -226,11 +164,11 @@ "StartDate = '2003-10-01'\n", "EndDate = '2005-09-30'\n", "\n", - "domain_data_path = 'examples/parflow/domain_data/' # path to the model domain data\n", + "domain_data_path = './domain_data/' # path to the model domain data\n", "\n", "# Path to save results (obs and mod stands for observation and modeled, respectively)\n", - "OBS_OutputFolder = './obs_outputs_PF' \n", - "MOD_OutputFolder = './mod_outputs_PF'" + "OBS_OutputFolder = './obs_outputs' \n", + "MOD_OutputFolder = './mod_outputs'" ] }, { @@ -238,7 +176,7 @@ "id": "feb58871", "metadata": {}, "source": [ - "## 3. Retrieve Observed Snow Data " + "## 2. Retrieve SNOTEL point observations and metadata from HydroData " ] }, { @@ -246,7 +184,7 @@ "id": "45ca2832", "metadata": {}, "source": [ - "### 3a. Define the watershed of interest\n", + "### 2a. Define the watershed of interest\n", "\n", "One of the simplest ways to gather data and model output from HydroData is by specifying a [Hydrologic Unit Code](https://www.usgs.gov/national-hydrography/watershed-boundary-dataset). 
Before we retrieve any hydrologic information, we need to indicate a HUC8 code and use it to gather snow water equivalent (SWE) observations from SNOTEL sites \n", "\n", @@ -261,174 +199,86 @@ "outputs": [], "source": [ "# ✏️ Specify HUC8 ID and Name for watershed of interest\n", - "huc_8_code = '14020001' # East-Taylor HUC-8\n", + "huc_8_code = '14010001' # East-Taylor HUC-8\n", "print(f'HUC-8 ID: {huc_8_code}')\n", "\n", - "huc_8_name = 'East-Taylor'\n", + "huc_8_name = 'Colorado-Headwaters'\n", "print(f'HUC-8 name: {huc_8_name}')" ] }, { "cell_type": "markdown", - "id": "5de02c3b", + "id": "84549c32", "metadata": {}, "source": [ - "Use the Subsettools function `define_huc_domain()` to get the actual CONUS1 indices associated with the East-Taylor HUC-O8. It returns a tuple `(imin, jmin, imax, jmax)` of grid indices that define a bounding box containing our region (or point) of interest (Note: (imin, jmin, imax, jmax) are the west, south, east and north boundaries of the box respectively) and a mask for that domain." + "### 2b. Explore the available SWE data in a watershed " ] }, { - "cell_type": "code", - "execution_count": null, - "id": "965bd6ea", + "cell_type": "markdown", + "id": "de83d6b6", "metadata": {}, - "outputs": [], "source": [ - "ij_bounds, mask = subsettools.define_huc_domain([huc_8_code], 'conus1')\n", - "\n", - "np.save(f'{domain_data_path}domainMask_{huc_8_name}_conus1.npy', mask)\n", - "\n", - "plt.imshow(mask, origin='lower')\n", - "print(ij_bounds)\n", - "print(mask.shape)" + "Explore what SWE data is available at SNOTEL sites within the HUC ID you specified that operated during WY2004 and WY2005." ] }, { "cell_type": "markdown", - "id": "61745fb2", + "id": "e088705e", "metadata": {}, "source": [ - "Using the domain mask and the i,j PF-CONUS1 indices, we use a hf_hydrodata function to find and save the associated grid cell center lat/lon pair for each grid cell in the domain. " + "
\n", + "

📖 Did you know?

\n", + "

The Snow Telemetry (SNOTEL) network, managed by the USDA Natural Resources Conservation Service (NRCS), monitors snowpack conditions across key watersheds in the western United States to support water supply forecasting and climate monitoring. SNOTEL sites are fully automated stations that continuously measure snow water equivalent (SWE), snow depth, precipitation, temperature, and other meteorological variables throughout the year. Unlike manual snow survey programs, SNOTEL provides high-temporal-resolution observations that enable near–real-time assessment of snowpack evolution and interannual variability. These data are widely used for operational forecasting, drought assessment, and long-term climate analysis.

\n", + "
" ] }, { "cell_type": "code", "execution_count": null, - "id": "1ff5c1d4", + "id": "d74eeccb", "metadata": {}, "outputs": [], "source": [ - "# Extract bounds\n", - "i_min, j_min, i_max, j_max = ij_bounds\n", - "mask_shape = mask.shape #shape of the subset rectangular domain\n", - "\n", - "# Create i/j index ranges\n", - "i_vals = np.arange(i_min, i_max)\n", - "j_vals = np.arange(j_min, j_max)\n", - "\n", - "# Create full 2D grid (note indexing order carefully)\n", - "jj, ii = np.meshgrid(j_vals, i_vals, indexing=\"ij\")" + "# Create a folder to save observations\n", + "Path(OBS_OutputFolder).mkdir(exist_ok=True)" ] }, { "cell_type": "markdown", - "id": "c6ae90bf", + "id": "fb365fbf", "metadata": {}, "source": [ - "Because the function `hf.to_latlon()` finds the coordinates at the lower left corner of a grid cell, we add 0.5 to each i,j index pair to find the **lat/lon at the grid cell center**." + "#### Retrieve the metadata for the SNOTEL stations in the watershed:" ] }, { "cell_type": "code", "execution_count": null, - "id": "13fbc81f", + "id": "1cb40a7c", "metadata": {}, "outputs": [], "source": [ - "# Compute grid cell centers\n", - "ii_center = ii + 0.5\n", - "jj_center = jj + 0.5\n", + "# Request site-level attributes for these sites\n", + "metadata_df = hf.get_point_metadata(dataset=\"snotel\", variable=\"swe\", temporal_resolution=\"daily\", aggregation=\"sod\",\n", + " date_start=StartDate, date_end=EndDate,\n", + " huc_id=[huc_8_code], grid='conus1')\n", "\n", - "# Convert to lat/lon (vectorized loop)\n", - "lat = np.zeros(mask_shape)\n", - "lon = np.zeros(mask_shape)\n", + "# save\n", + "metadata_df.to_csv(f'./{OBS_OutputFolder}/df_{huc_8_name}_{huc_8_code}_SNOTEL_metadata.csv', index=False)\n", "\n", - "for r in range(mask_shape[0]):\n", - " for c in range(mask_shape[1]):\n", - " lat[r, c], lon[r, c] = hf.to_latlon(\"conus1\",\n", - " ii_center[r, c],\n", - " jj_center[r, c])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": 
"aa20e59f", - "metadata": {}, - "outputs": [], - "source": [ - "# Save 2D arrays of Lat & Lon\n", - "np.save(f\"{domain_data_path}{huc_8_name}_{huc_8_code}_lat_2d.npy\", lat)\n", - "np.save(f\"{domain_data_path}{huc_8_name}_{huc_8_code}_lon_2d.npy\", lon)\n", - "\n", - "# Save the Lat & Lon in a df (as CSV) matched with the ParFlow-CONUS i,j indices \n", - "grid_df = pd.DataFrame({\n", - " \"i\": ii.ravel(),\n", - " \"j\": jj.ravel(),\n", - " \"lat\": lat.ravel(),\n", - " \"lon\": lon.ravel(),\n", - "})\n", - "grid_df.to_csv(f\"{domain_data_path}df_{huc_8_name}_{huc_8_code}_conus1_gridPoints_LatLon.csv\", index=False)\n", - "\n", - "# Save a shapefile of the watershed Lat & Lon\n", - "grid_gdf = gpd.GeoDataFrame(\n", - " grid_df,\n", - " geometry=gpd.points_from_xy(grid_df.lon, grid_df.lat),\n", - " crs=\"EPSG:4326\"\n", - ")\n", - "# Save the grid points / GeoDataFrame to a shapefile for later use\n", - "grid_gdf.to_file(f\"{domain_data_path}{huc_8_name}_{huc_8_code}_conus1_gridPoints_LatLon.shp\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "41e8d63a", - "metadata": {}, - "outputs": [], - "source": [ - "grid_df" - ] - }, - { - "cell_type": "markdown", - "id": "84549c32", - "metadata": {}, - "source": [ - "### 3b. Explore the available SWE data in a watershed " - ] - }, - { - "cell_type": "markdown", - "id": "e088705e", - "metadata": {}, - "source": [ - "
\n", - "

📖 Did you know?

\n", - "

The Snow Telemetry (SNOTEL) network, managed by the USDA Natural Resources Conservation Service (NRCS), monitors snowpack conditions across key watersheds in the western United States to support water supply forecasting and climate monitoring. SNOTEL sites are fully automated stations that continuously measure snow water equivalent (SWE), snow depth, precipitation, temperature, and other meteorological variables throughout the year. Unlike manual snow survey programs, SNOTEL provides high-temporal-resolution observations that enable near–real-time assessment of snowpack evolution and interannual variability. These data are widely used for operational forecasting, drought assessment, and long-term climate analysis.

\n", - "
" + "# View first five records\n", + "metadata_df.head(5)" ] }, { "cell_type": "markdown", - "id": "de83d6b6", - "metadata": {}, - "source": [ - "Explore what SWE data is available at sites within the HUC ID you specified that operated during WY2004 and WY2005. If you want to check other variables besides SWE, you can change the `variable` argument name. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3aa8210e", + "id": "d17b371a", "metadata": {}, - "outputs": [], "source": [ - "avail_df = hf.get_site_variables(variable = \"swe\",\n", - " huc_id = [huc_8_code], grid = 'conus1',\n", - " date_start = StartDate, date_end = EndDate)\n", + "The **metadata file** in HydroData is an important addition to the observations and it is recommended to always gather and save this for the observations you are using (particularly to support reproducibility within an open-science workflow). The saved file has useful attributes like site names, first and last date of available data, lat/lon, and the query URL. \n", "\n", - "# View first five records\n", - "avail_df.head(5)" + ">Additionally, the metadata contains **ParFlow-CONUS1 and ParFlow-CONUS2 `i,j` indices, which indicate the exact model domain grid cell the observation aligns with**. This is a useful HydroData feature that removes the need for users to manually match station latitude/longitude coordinates to the appropriate model grid cell, as this spatial mapping is handled directly within HydroData. We will use these indices below to extract PF-CONUS1 modeled SWE for each SNOTEL station in the section below. " ] }, { @@ -436,47 +286,47 @@ "id": "268915dc", "metadata": {}, "source": [ - "### 3c. Map the SNOTEL stations inside the HUC-08 watershed that have available data in the selected time range \n", - "To note here, we are using pre-loaded shape files for the East-Taylor HUC8, which are located in the `/domain_data/` directory." 
+ "#### Map the SNOTEL stations inside the HUC-08 watershed that have available data within the selected time range " ] }, { "cell_type": "code", "execution_count": null, - "id": "f5c95f67", + "id": "ffd57847", "metadata": {}, "outputs": [], "source": [ - "### Select station locations that fall within the HUC8 watershed\n", - "\n", - "# Path to the watershed shapefile that was just created\n", - "watershed = f'{domain_data_path}East-Taylor_14020001.shp'\n", - "watershed_gdf = gpd.read_file(watershed).to_crs(epsg=4326)\n", - "\n", - "# Create GeoDataFrame of all available stations\n", - "filtered_all_stations_gdf = gpd.GeoDataFrame(\n", - " avail_df,\n", - " geometry=gpd.points_from_xy(\n", - " avail_df.longitude,\n", - " avail_df.latitude\n", - " ),\n", - " crs=\"EPSG:4326\"\n", - ")\n", - "print(\"Sites CRS:\", filtered_all_stations_gdf.crs)\n", - "\n", - "# Combine watershed polygons into one geometry\n", - "watershed_union = watershed_gdf.geometry.unary_union\n", - "\n", - "# Filter stations that fall within the watershed\n", - "sites_in_watershed = filtered_all_stations_gdf[\n", - " filtered_all_stations_gdf.geometry.within(watershed_union)\n", - "].copy()\n", - "\n", - "sites_in_watershed.reset_index(drop=True, inplace=True)\n", + "# Let's define a helper plotting function to use throughout the notebook\n", + "def plot_sites_on_map(metadata_df, color_by_site_type=False):\n", + " \"\"\"\n", + " Plots site locations on an interactive map using Plotly.\n", + " \n", + " Parameters:\n", + " - metadata_df: DataFrame containing site metadata with 'latitude', 'longitude', 'site_name', and 'site_id' columns.\n", + " - color_by_site_type: Boolean indicating whether to color points by site type.\n", + " \"\"\"\n", + " pio.renderers.default = \"notebook\"\n", + "\n", + " center = {\n", + " \"lat\": metadata_df[\"latitude\"].mean(),\n", + " \"lon\": metadata_df[\"longitude\"].mean()\n", + " }\n", "\n", - "print(f\"Total sites in watershed: {len(sites_in_watershed)}\")\n", 
+ " fig = px.scatter_map(\n", + " metadata_df,\n", + " lat=\"latitude\",\n", + " lon=\"longitude\",\n", + " color=\"site_type\" if color_by_site_type else None,\n", + " hover_name=\"site_name\", # appears on hover\n", + " hover_data=[\"site_id\"], # additional info\n", + " center=center,\n", + " zoom=8,\n", + " height=600\n", + " )\n", "\n", - "\n" + " fig.update_layout(mapbox_style=\"carto-voyager\")\n", + " fig.update_traces(marker=dict(size=12))\n", + " return fig" ] }, { @@ -490,39 +340,11 @@ { "cell_type": "code", "execution_count": null, - "id": "a1e3bc39", - "metadata": {}, - "outputs": [], - "source": [ - "## TODO: REPLACE WITH CSSI_EVALUATION.PLOTS FUNCTIONS\n", - "\n", - "# this may take a moment to load, but it should pop up in a new window\n", - "m = plot_utils.plot_sites_within_domain(sites_in_watershed, watershed_gdf, zoom_start=9)\n", - "m" - ] - }, - { - "cell_type": "markdown", - "id": "354fc021", - "metadata": {}, - "source": [ - "## 4. Retrieve SNOTEL point observations and metadata from HydroData \n", - "Use the `hf.get_point_data()` function to retrieve daily, start-of-day SWE from SNOTEL sites:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d74eeccb", + "id": "02571354", "metadata": {}, "outputs": [], "source": [ - "# Create a folder to save observations\n", - "isExist = os.path.exists(OBS_OutputFolder)\n", - "if isExist == True:\n", - " exit\n", - "else:\n", - " os.mkdir(OBS_OutputFolder)" + "plot_sites_on_map(metadata_df, color_by_site_type=False)" ] }, { @@ -530,8 +352,9 @@ "id": "b1805ac2", "metadata": {}, "source": [ - "### 4a. Get HydroData Observed SWE\n", - "Gather the SNOTEL data for all stations within the watershed and save CSV:" + "\n", + "### 2c. 
Gather the SNOTEL data for all stations within the watershed \n", + "Here we use the HydroData `hf.get_point_data()` function to retrieve daily, start-of-day SWE from SNOTEL sites:" ] }, { @@ -542,57 +365,19 @@ "outputs": [], "source": [ "# Request point observations data\n", - "data_df = hf.get_point_data(dataset=\"snotel\", variable=\"swe\", temporal_resolution=\"daily\", aggregation=\"sod\",\n", + "obs_df = hf.get_point_data(dataset=\"snotel\", variable=\"swe\", temporal_resolution=\"daily\", aggregation=\"sod\",\n", " date_start=StartDate, date_end=EndDate,\n", " huc_id=[huc_8_code], grid='conus1')\n", " #polygon=watershed_bbox, polygon_crs=watershed_crs)\n", "\n", "# save\n", - "data_df.to_csv(f'./{OBS_OutputFolder}/df_{huc_8_name}_{huc_8_code}_SNOTEL.csv', index=False)\n", + "obs_df.to_csv(f'./{OBS_OutputFolder}/df_{huc_8_name}_{huc_8_code}_SNOTEL.csv', index=False)\n", "\n", "# Ensure date column is datetime\n", - "data_df[\"date\"] = pd.to_datetime(data_df[\"date\"])\n", + "obs_df[\"date\"] = pd.to_datetime(obs_df[\"date\"])\n", "\n", "# View first five records\n", - "data_df.head(5)" - ] - }, - { - "cell_type": "markdown", - "id": "fb365fbf", - "metadata": {}, - "source": [ - "### 4b. 
Get Metadata for HydroData Observed SWE\n", - "Also, retrieve the metadata for the same stations we retrieved SWE observations for:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1cb40a7c", - "metadata": {}, - "outputs": [], - "source": [ - "# Request site-level attributes for these sites\n", - "metadata_df = hf.get_point_metadata(dataset=\"snotel\", variable=\"swe\", temporal_resolution=\"daily\", aggregation=\"sod\",\n", - " date_start=StartDate, date_end=EndDate,\n", - " huc_id=['14020001'], grid='conus1')\n", - "\n", - "# save\n", - "metadata_df.to_csv(f'./{OBS_OutputFolder}/df_{huc_8_name}_{huc_8_code}_SNOTEL_metadata.csv', index=False)\n", - "\n", - "# View first five records\n", - "metadata_df.head(5)" - ] - }, - { - "cell_type": "markdown", - "id": "d17b371a", - "metadata": {}, - "source": [ - "The metadata file is an important addition to the observations and it is recommended to always gather and save this for the observations you are using (particularly to support reproducibility within an open-science workflow). The saved file has useful attributes like site names, first and last date of available data, lat/lon, and the query URL. \n", - "\n", - "Additionally, the metadata contains **ParFlow-CONUS1 and ParFlow-CONUS2 `i,j` indices, which indicate the exact model domain grid cell the observation aligns with**. This is a useful HydroData feature that removes the need for users to manually match station latitude/longitude coordinates to the appropriate model grid cell, as this spatial mapping is handled directly within HydroData. We will use these indices below to extract PF-CONUS1 modeled SWE for each SNOTEL station in the section below. " + "obs_df.head(5)" ] }, { @@ -600,7 +385,7 @@ "id": "0e50455e", "metadata": {}, "source": [ - "## 5. Retrieve ParFlow-CONUS1 Modeled Snow Data" + "## 3. 
Retrieve ParFlow-CONUS1 Modeled Snow Data" ] }, { @@ -611,11 +396,7 @@ "outputs": [], "source": [ "# Create a folder to save results\n", - "isExist = os.path.exists(MOD_OutputFolder)\n", - "if isExist == True:\n", - " exit\n", - "else:\n", - " os.mkdir(MOD_OutputFolder)" + "Path(MOD_OutputFolder).mkdir(exist_ok=True)" ] }, { @@ -625,7 +406,7 @@ "source": [ "The following section retrieves ParFlow-CONUS1 data for each SNOTEL site within our HUC-08 watershed. The code identifies the CONUS1 `i,j` indices associated with each SNOTEL site, indicated in the `metadata_df`. It then extracts the CONUS1 modeled SWE output for the site and the period of interest, returning the result as a DataFrame. To fairly compare with SNOTEL, which reports SWE once daily at the start of the local day, model output is aggregated by day, using the argment `\"temporal_resolution\": \"daily\"`. Finally, the processed data is saved as a CSV file for each site. \n", "\n", - "### 5a. ParFlow CONUS1 Model Dataset Information\n", + "### 3a. ParFlow CONUS1 Model Dataset Information\n", "We can print some information about the model output dataset by using the `hf.get_catalog_entry()` to get the CONUS1 model dataset metadata. " ] }, @@ -688,7 +469,7 @@ "id": "73a13787", "metadata": {}, "source": [ - "Now, grab the PF-CONUS1 modeled SWE from the SNOTEL site locations. Here we use the CONUS1 i and j indices from the `metadata_df` and grab the SWE from those grid cells. 
" + "Create a copy of the model dataframe (`model_df`) so we have the same data structure:" ] }, { @@ -698,18 +479,15 @@ "metadata": {}, "outputs": [], "source": [ - "# Copy data_df to model_df so we have the same timestamps and site_id structure\n", - "model_df = data_df.copy()\n", + "# Copy obs_df to model_df so we have the same timestamps and site_id structure\n", + "model_df = obs_df.copy()\n", "\n", "# Set all non-date columns to NaN to prepare for filling in model data\n", "non_date_cols = model_df.columns.difference([\"date\"])\n", "model_df[non_date_cols] = np.nan\n", "\n", - "# Rename site_id columns for PF outputs \n", - "model_df.columns = [\n", - " col if col == \"date\" else col.replace(\":SNTL\", \"\") + \":PFCONUS1\"\n", - " for col in model_df.columns\n", - "]" + "# Keep official SNOTEL site names for obs columns and use same for model\n", + "model_df[\"date\"] = pd.to_datetime(model_df[\"date\"])" ] }, { @@ -717,7 +495,7 @@ "id": "523bd35c", "metadata": {}, "source": [ - "Use the function `hf.get_gridded_data()` and PF-CONUS1 `i,j` indices to select the SWE output for the correct location and time period: " + "Now, retrieve the PF-CONUS1 modeled SWE from the the corresponding SNOTEL locations. 
Here we use the CONUS1 `i,j` indices from the `metadata_df` and grab the SWE from those grid cells using the function `hf.get_gridded_data()`:" ] }, { @@ -729,10 +507,12 @@ "source": [ "# Loop over each station in metadata_df\n", "for idx, row in metadata_df.iterrows():\n", - " site_id = row[\"site_id\"] # original SNTL site_id\n", - " col_name = site_id.replace(\":SNTL\", \"\") + \":PFCONUS1\" # corresponding column in model_df\n", + " site_id = row[\"site_id\"] \n", + " col_name = site_id \n", + "\n", " conus_i = int(row[\"conus1_i\"])\n", - " conus_j = int(row[\"conus1_j\"])\n", + " conus_j = int(row[\"conus1_j\"]) \n", + " \n", " \n", " # Build options dict for this station\n", " options = {\n", @@ -765,7 +545,7 @@ "id": "7464828b", "metadata": {}, "source": [ - "## 6. Quick plot sanity check \n", + "#### Quick sanity check plot \n", "Plot a simple timeseries of modeled and observed SWE to make sure our data retrieval was successful. " ] }, @@ -778,18 +558,21 @@ "source": [ "fig, ax = plt.subplots(figsize=(10, 4))\n", "\n", - "ax.plot(data_df[\"date\"], model_df[\"380:CO:PFCONUS1\"], label=\"Modeled\", linewidth=2)\n", + "for site_id in obs_df.columns:\n", + " if site_id == \"date\":\n", + " continue\n", "\n", - "ax.plot(data_df[\"date\"], data_df[\"380:CO:SNTL\"], label=\"Observed\", linewidth=2)\n", + " ax.plot(obs_df[\"date\"], obs_df[site_id], label=f\"{site_id} Obs\", linewidth=2)\n", + " ax.plot(obs_df[\"date\"], model_df[site_id], linestyle=\"--\", label=f\"{site_id} Mod\")\n", "\n", "ax.set_xlabel(\"Date\")\n", "ax.set_ylabel(\"SWE (mm)\")\n", "\n", - "# Date formatting for x-axis\n", + "# Date formatting\n", "ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))\n", "ax.xaxis.set_major_formatter(mdates.DateFormatter('%m-%Y'))\n", "\n", - "ax.legend(loc='upper left')\n", + "ax.legend(loc='upper left', fontsize=8)\n", "ax.grid(True, alpha=0.3)\n", "\n", "plt.tight_layout()" @@ -797,70 +580,7 @@ }, { "cell_type": "markdown", - "id": "da3df109", - 
"metadata": {}, - "source": [ - "# Start of comparison from old notebook" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 3. Spatial Mapping of the SNOTEL sites \n", - "Before evaluating model performance, we plot the GIS data associated with the records in the combined DataFrame. The map below shows the SNOTEL stations included in the evaluation dataset, along with the watershed boundary used for the model simulations. Hover over the pins to see the site names. \n", - "\n", - "We also print a table of the SNOTEL site metadata to help with the single site selection." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Path to the watershed shapefile\n", - "watershed = \"./domain_data/East-Taylor_14020001.shp\"\n", - "watershed_gdf = gpd.read_file(watershed).to_crs(epsg=4326)\n", - "\n", - "# Create GeoDataFrame of all available stations\n", - "filtered_all_stations_gdf = gpd.GeoDataFrame(\n", - " metadata_df,\n", - " geometry=gpd.points_from_xy(\n", - " metadata_df.longitude,\n", - " metadata_df.latitude\n", - " ),\n", - " crs=\"EPSG:4326\"\n", - ")\n", - "print(\"Sites CRS:\", filtered_all_stations_gdf.crs)\n", - "\n", - "# Combine watershed polygons into one geometry\n", - "watershed_union = watershed_gdf.geometry.unary_union\n", - "\n", - "# Filter stations that fall within the watershed\n", - "sites_in_watershed = filtered_all_stations_gdf[\n", - " filtered_all_stations_gdf.geometry.within(watershed_union)\n", - "].copy()\n", - "\n", - "sites_in_watershed.reset_index(drop=True, inplace=True)\n", - "\n", - "print(f\"Total sites in watershed: {len(sites_in_watershed)}\")\n", - "\n", - "m = nwm_utils.plot_sites_within_domain(sites_in_watershed, watershed_gdf, zoom_start=9)\n", - "m" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "sites_in_watershed" - ] - }, - { - "cell_type": "markdown", + "id": 
"f0c57cdc", "metadata": {}, "source": [ "## 4. Compare Modeled and Observed SWE Timeseries at a Single Site\n", @@ -882,9 +602,10 @@ }, { "cell_type": "markdown", + "id": "602560f1", "metadata": {}, "source": [ - "Review the sites within the watershed from the interactive map above and click on the markers to view the site name and code. Recall, we also printed out the site metadata for all sites within the watershed, which contains the 3-letter site codes.\n", + "Review the sites within the watershed from the interactive map above and click on the markers to view the site name and code. Recall, we also printed out the site metadata for all sites within the watershed, which contains the SNOTEL site codes.\n", "\n", "✏️ Once you’ve identified the site of interest, **enter its site code in the next code cell for `my_site_code`**: " ] @@ -892,28 +613,42 @@ { "cell_type": "code", "execution_count": null, + "id": "473b9326", "metadata": {}, "outputs": [], "source": [ "# choose a site of interest within the watershed\n", - "my_site_code = '380:CO:'\n", + "my_site_code = '335:CO:SNTL'\n", + "\n", + "# make sure date columns are datetime and set as index for easier plotting and metric calculations\n", + "obs_df[\"date\"] = pd.to_datetime(obs_df[\"date\"])\n", + "model_df[\"date\"] = pd.to_datetime(model_df[\"date\"])\n", "\n", - "############################ THIS BELOW DOESNT WORK BECAUSE CODE IS NOT COMPLETE\n", - "# filter to only that site\n", - "sites_in_watershed[sites_in_watershed['site_id']==my_site_code]" + "obs_df = obs_df.set_index(\"date\")\n", + "model_df = model_df.set_index(\"date\")" + ] + }, + { + "cell_type": "markdown", + "id": "7905a491", + "metadata": {}, + "source": [ + "Use the `comparison_plots` utility function to provide a quicky view of the data." 
] }, { "cell_type": "code", "execution_count": null, + "id": "1c1f5648", "metadata": {}, "outputs": [], "source": [ - "nwm_utils.comparison_plots(combined_df, f'{my_site_code}SNTL', f'{my_site_code}PFCONUS1')" + "plot_utils.comparison_plots(obs_df, model_df, f'{my_site_code}', f'{my_site_code}', site_label=None)" ] }, { "cell_type": "markdown", + "id": "dfc27ff3", "metadata": {}, "source": [ "To move beyond an overall summary of daily performance, we replot the modeled vs. observed SWE scatter while highlighting specific months with a distinct color. This gives us more information about the **seasonal model performance**. \n", @@ -925,6 +660,7 @@ }, { "cell_type": "markdown", + "id": "1354a01b", "metadata": {}, "source": [ "##### 📊 For this example, let's highlight the _early snow accumulation period_ of October - January:" @@ -933,18 +669,25 @@ { "cell_type": "code", "execution_count": null, + "id": "28f3e419", "metadata": {}, "outputs": [], "source": [ - "combined_df['month'] = combined_df.index.month\n", + "plot = plot_utils.plot_custom_scatter_SWE(\n", + " obs_df,\n", + " model_df,\n", + " f\"{my_site_code}\",\n", + " f\"{my_site_code}\",\n", + " site_label=my_site_code,\n", + " highlight_months=[10, 11, 12, 1],\n", + ")\n", "\n", - "plot = nwm_utils.plot_custom_scatter_SWE(combined_df, f'{my_site_code}SNTL', f'{my_site_code}PFCONUS1',\n", - " highlight_months=[10, 11, 12, 1])\n", - "plot " + "plot" ] }, { "cell_type": "markdown", + "id": "c831342f", "metadata": {}, "source": [ "
\n", @@ -956,6 +699,7 @@ }, { "cell_type": "markdown", + "id": "aba6c6fe", "metadata": {}, "source": [ "## 5. Peak SWE Evaluation at the Watershed Scale \n", @@ -969,7 +713,7 @@ "- Soil moisture recharge and groundwater contributions\n", "\n", "
\n", - " \n", + " \n", "
\n", "\n", "_Example daily SWE at a single site, showing two important periods in snow processes: accumulation (before peak) and ablation (after peak). The vertical line marks peak SWE._" @@ -977,6 +721,7 @@ }, { "cell_type": "markdown", + "id": "add14a79", "metadata": {}, "source": [ "### 5.1 Comparing Modeled and Observed Peak SWE at All Sites in the Watershed\n", @@ -990,28 +735,39 @@ { "cell_type": "code", "execution_count": null, + "id": "2e15a789", "metadata": {}, "outputs": [], "source": [ "# isolate the columns associated with observations and model predictions.\n", "# these will be inputs to our same-day comparison function.\n", - "obs_cols = sorted([col for col in combined_df.columns if col.endswith('SNTL')])\n", - "mod_cols = sorted([col for col in combined_df.columns if col.endswith('PFCONUS1')])" + "obs_swe_cols = obs_df.columns.tolist()\n", + "mod_swe_cols = model_df.columns.tolist()" + ] + }, + { + "cell_type": "markdown", + "id": "3854f098", + "metadata": {}, + "source": [ + "Use the `modeled_swe_at_observed_peak` utility function to perform the same-day SWE comparison." 
] }, { "cell_type": "code", "execution_count": null, + "id": "ed342496", "metadata": {}, "outputs": [], "source": [ - "# compute the same-day SWE comparison during the observed peak SWE for each of the observation and modeled sites.\n", - "df_observed_peak = utils.modeled_swe_at_observed_peak(combined_df, obs_cols, mod_cols)\n", + "# compute the same-day SWE comparison during the observed peak SWE for each of the observation and modeled sites\n", + "df_observed_peak = snow_utils.modeled_swe_at_observed_peak(obs_df, model_df)\n", "df_observed_peak" ] }, { "cell_type": "markdown", + "id": "7e0c6dea", "metadata": {}, "source": [ "##### 📊 Visualize the amount of SWE on **the day of observed peak SWE occurs** for both the model and observations at each station" @@ -1020,6 +776,7 @@ { "cell_type": "code", "execution_count": null, + "id": "297f8608", "metadata": {}, "outputs": [], "source": [ @@ -1060,6 +817,7 @@ }, { "cell_type": "markdown", + "id": "60d98757", "metadata": {}, "source": [ "#### 📋 Modeled vs Observed Peak SWE Comparison (timing & magnitude) \n", @@ -1069,16 +827,18 @@ { "cell_type": "code", "execution_count": null, + "id": "78907563", "metadata": {}, "outputs": [], "source": [ "# compute the different-day SWE comparison for each of the observed and modeled sites.\n", - "df_both_peak = utils.modeled_vs_observed_peak_swe(combined_df, obs_cols, mod_cols)\n", + "df_both_peak = snow_utils.modeled_vs_observed_peak_swe(obs_df, model_df)\n", "df_both_peak" ] }, { "cell_type": "markdown", + "id": "d5ec9e5d", "metadata": {}, "source": [ "##### 📊 Visualize the quantity of peak SWE for both the model and observations at each station" @@ -1087,6 +847,7 @@ { "cell_type": "code", "execution_count": null, + "id": "a93b9fa9", "metadata": {}, "outputs": [], "source": [ @@ -1125,6 +886,7 @@ }, { "cell_type": "markdown", + "id": "ed021831", "metadata": {}, "source": [ "### 5.2 Visualizing Model Error for Peak SWE\n", @@ -1139,6 +901,7 @@ { "cell_type": "code", 
"execution_count": null, + "id": "e46b82f8", "metadata": {}, "outputs": [], "source": [ @@ -1146,21 +909,14 @@ "df_both_peak['Peak_Date_Diff_Days'] = (df_both_peak['Modeled_Date'] - \n", " df_both_peak['Observed_Date']).dt.days\n", "df_both_peak['Peak_SWE_Diff'] = (df_both_peak['Modeled'] - \n", - " df_both_peak['Observed'])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "df02795e", - "metadata": {}, - "outputs": [], - "source": [ + " df_both_peak['Observed'])\n", + "\n", "df_both_peak" ] }, { "cell_type": "markdown", + "id": "b77c566e", "metadata": {}, "source": [ "##### 📊 Visualize the error between the modeled and observed peak SWE " @@ -1169,6 +925,7 @@ { "cell_type": "code", "execution_count": null, + "id": "29c68cf0", "metadata": {}, "outputs": [], "source": [ @@ -1206,6 +963,7 @@ }, { "cell_type": "markdown", + "id": "4f707f0c", "metadata": {}, "source": [ "The left panel shows the timing error (date difference) and the right panel the magnitude error (SWE difference). When we computed the difference in date and SWE quantity above, we took `modeled - observed` so: \n", @@ -1218,11 +976,12 @@ }, { "cell_type": "markdown", + "id": "8729d9ff", "metadata": {}, "source": [ "
\n", "

🧠 Reflect

\n", - "

Looking at the two plots, what could be some reasons for the model having simulated peak SWE both earlier and less than the observed peak SWE? Perhaps try changing the my_site_code from earlier in the notebook to rerun nwm_utils.comparison_plots() to see the timeseries for a different station to look at the peak magnitude and timing. \n", + "

Looking at the two plots, what could be some reasons for the model simulating peak SWE both earlier and lower than the observed peak SWE? Perhaps try changing my_site_code from earlier in the notebook and rerunning plot_utils.comparison_plots() to see the timeseries for a different station, comparing the peak magnitude and timing. \n", "\n", "
What happens if you change the year that is plotted?
✏️ Try modifying the bar plot code from bar1 = year1.hvplot.bar to bar1 = year2.hvplot.bar. Don't forget to change the title!

\n", "
" @@ -1230,6 +989,7 @@ }, { "cell_type": "markdown", + "id": "fdabdbd6", "metadata": {}, "source": [ "#### 📊 Next, we combine the timing and magnitude errors and plot them together for each station." @@ -1238,6 +998,7 @@ { "cell_type": "code", "execution_count": null, + "id": "66105b84", "metadata": {}, "outputs": [], "source": [ @@ -1260,11 +1021,12 @@ "vline = hv.VLine(0).opts(color='gray', line_dash='dashed')\n", "hline = hv.HLine(0).opts(color='gray', line_dash='dashed')\n", "\n", - "(scatter * vline * hline).opts(legend_position='top_left', show_grid=True)\n" + "(scatter * vline * hline).opts(legend_position='bottom_left', show_grid=True)\n" ] }, { "cell_type": "markdown", + "id": "285ec901", "metadata": {}, "source": [ "✏️ **Try changing how we view this plot.** \n", @@ -1274,313 +1036,292 @@ }, { "cell_type": "markdown", + "id": "f6a20df4", "metadata": {}, "source": [ - "## 6. Compute and Statistics and Error Metrics \n", - "The previous section visualized when and where modeled SWE differs from observations, both in terms of peak SWE timing and magnitude. However, visual inspection alone makes it difficult to compare performance across sites or to summarize model behavior in a consistent or quantifiable way. In this section, we compute commonly used statistical error metrics to quantify model performance, allowing us to objectively assess bias, error magnitude, and variability for sites within the watershed. 
\n", - "\n", - "Proposed outline (DTK, Jan 2026):\n", - "- Summary metrics at a station\n", - "- Summary metrics at all stations within the watershed\n", - "- Combined timing and magnitude for all stations within the watershed (Condon metric)\n", - "- Focus on timing: summary statistics for single station for accumulation & ablation periods (using the new wrapper: `nwm_utils.compute_stats_period()`)\n", - "- Melt period statistics" + "### 5.3 Visualizing Model Error for Melt Period\n", + "The function `compute_melt_period_obs_vs_model` summarizes melt behavior for both the observed and modeled SWE time series at each station and for each water year, allowing the two datasets to be compared side by side. For every station-year combination, it identifies the date of peak SWE, then defines the end of the melt period as the first date after that peak when SWE reaches zero and stays at zero for at least `min_zero_days` consecutive days. Using these two dates, it computes the melt period length in days and the average melt rate over the full melt period, expressed in meters per day. The final output is a table that reports these melt timing and melt rate statistics for both observations and model simulations, making it possible to evaluate whether the model melts snow too early or too late, too quickly or too slowly, and how well it reproduces the overall timing and pace of seasonal snow disappearance across sites and years.\n" ] }, { "cell_type": "code", "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "nwm_utils.compute_stats(combined_df, f'CCSS_{my_site_code}_swe_m', f'NWM_{my_site_code}_swe_m')" - ] - }, - { - "cell_type": "markdown", + "id": "b2636cad", "metadata": {}, + "outputs": [], "source": [ - "Pearson and Spearman correlations are both close to 1, suggesting a strong relationship between observed and modeled SWE. As shown on the timeseries plot, this strong correlation alone does not indicate a \"good\" model. 
For example, it does not guarantee accurate timing of key events, such as peak SWE or melt onset. Let's compare these as well. The following code uses `report_max_dates_and_values` function to identify the peak SWE value and the date it occurs for both the observed (CCSS) and modeled (NWM) datasets. " + "melt_df_stats = snow_utils.compute_melt_period_obs_vs_model(obs_df, model_df, min_zero_days=10)\n", + "melt_df_stats" ] }, { "cell_type": "markdown", + "id": "9cc72756", "metadata": {}, "source": [ - "
\n", - "

🧠 Reflect

\n", - "

You now have several performance metrics: Bias, Pearson Correlation, Spearman Correlation, NSE, and KGE. If you had to pick just one metric to summarize model performance, which would you choose—and why? As you review the results, compare the peak flow amounts and the timing of snowmelt onset. Do you see any significant differences? Which dataset indicates an earlier melt?

\n", - "
" + "From a snow hydrologist’s perspective, this table is very useful because it captures three core aspects of seasonal snow behavior: **peak timing and magnitude**, **melt-out timing**, and **average melt rate**. The timeline or dumbbell view is often the most intuitive for timing diagnostics, while the 1:1 scatter is the best overall summary of performance." ] }, { "cell_type": "code", "execution_count": null, - "metadata": { - "tags": [] - }, + "id": "a2b7829c", + "metadata": {}, "outputs": [], "source": [ - "summary_table = nwm_utils.report_max_dates_and_values(combined_df, f'CCSS_{my_site_code}_swe_m', f'NWM_{my_site_code}_swe_m')\n", - "summary_table" + "plot_utils.plot_scatter_melt_metrics(melt_df_stats)" ] }, { "cell_type": "markdown", + "id": "5359941d", "metadata": {}, "source": [ - "### Summary Metrics at Multiple Sites" + "While site-level comparisons provide useful detail, it is often more informative to evaluate model behavior at the **process level** to understand systematic tendencies across the watershed. In this step, we assess whether the model tends to melt snow earlier or later than observed by analyzing the difference in melt period length (model minus observed). Positive differences indicate that the model retains snow longer (later melt), while negative differences indicate earlier melt. The following visualization summarizes this behavior across all stations and water years, highlighting both the direction and magnitude of melt timing bias, and providing an overall assessment of whether the model systematically accelerates or delays snowmelt processes." 
] }, { "cell_type": "code", "execution_count": null, + "id": "1b7b75be", "metadata": {}, "outputs": [], "source": [ - "site_codes = ['DAN', 'HRS', 'KIB', 'PDS', 'SLI', 'TUM', 'WHW']\n", - "\n", - "rows = []\n", - "\n", - "for site in site_codes:\n", - " obs_col = f'CCSS_{site}_swe_m'\n", - " mod_col = f'NWM_{site}_swe_m'\n", - "\n", - " stats_table = nwm_utils.compute_stats(combined_df, obs_col, mod_col)\n", - "\n", - " rows.append({\n", - " 'Station': site,\n", - " 'Mean_Obs': stats_table.loc['observed', 'Mean'],\n", - " 'Mean_Mod': stats_table.loc['modeled', 'Mean'],\n", - " 'Bias_m': stats_table.loc['Bias (Modeled - Observed)', 'Mean'],\n", - " 'Pearson_r': stats_table.loc['Pearson Correlation', 'Mean'],\n", - " 'Spearman_r': stats_table.loc['Spearman Correlation', 'Mean'],\n", - " 'NSE': stats_table.loc['Nash-Sutcliffe Efficiency (NSE)', 'Mean'],\n", - " 'KGE': stats_table.loc['Kling-Gupta Efficiency (KGE)', 'Mean']\n", - " })\n", - "\n", - "stats_AllStations = pd.DataFrame(rows)\n", - "\n", - "print('All Stations Statistics Summary:')\n", - "stats_AllStations" + "# Process-level assessment\n", + "plot_utils.plot_melt_bias_summary(\n", + " melt_df_stats, # Pandas DataFrame containing site melt statistics\n", + " \"Melt_Period_Days_Obs\", # Name of the column corresponding with the observation data\n", + " \"Melt_Period_Days_Mod\", # Name of the column corresponding with the modeled data\n", + " \"Melt Period Length (Obs vs Model)\" # Title of the plot\n", + ")" ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", + "id": "809e568f", "metadata": {}, - "outputs": [], "source": [ - "stats_AllStations.hvplot.bar(\n", - " x='Station',\n", - " y='NSE',\n", - " rot=45,\n", - " ylabel='Nash–Sutcliffe Efficiency',\n", - " title='NSE by Station',\n", - " height=400,\n", - " width=600,\n", - " bar_width=0.5\n", - ")\n" + "## 6. 
Compute Statistics and Error Metrics \n", "The previous section visualized when and where modeled SWE differs from observations, both in terms of peak SWE timing and magnitude. However, visual inspection alone makes it difficult to compare performance across sites or to summarize model behavior in a consistent or quantifiable way. In this section, we compute commonly used statistical error metrics to quantify model performance, allowing us to objectively assess bias, error magnitude, and variability for sites within the watershed. " ] }, { "cell_type": "code", "execution_count": null, + "id": "7bff12bf", "metadata": {}, "outputs": [], "source": [ "# Check for mismatched site columns between obs and model data\n", "for col in obs_df.columns:\n", " if col not in model_df.columns:\n", " print(f\"{col} missing in model data\")" ] }, { "cell_type": "markdown", + "id": "64161df0", "metadata": {}, "source": [ "### 6.1 Compute Metrics" ] }, { "cell_type": "markdown", + "id": "21c8ca67", "metadata": {}, "source": [ "Using the modeled and observed dataframes, we can compute summary metrics. The following examples demonstrate how to use individual metrics of interest for a specific site, referenced as `my_site_code`." 
] }, { "cell_type": "code", "execution_count": null, + "id": "9d0f5c1d", "metadata": {}, "outputs": [], "source": [ - "abs_bias = evaluation_metrics.absolute_relative_bias(combined_df.CCSS_TUM_swe_m, combined_df.NWM_TUM_swe_m)\n", - "abs_bias" + "bias = metric_utils.bias(obs_df.loc[:, f'{my_site_code}'], model_df.loc[:, f'{my_site_code}'])\n", + "abs_bias = metric_utils.absolute_relative_bias(obs_df.loc[:, f'{my_site_code}'], model_df.loc[:, f'{my_site_code}'])\n", + "srho = metric_utils.spearman_rank(obs_df.loc[:, f'{my_site_code}'], model_df.loc[:, f'{my_site_code}'])\n", + "\n", + "print(f\"For site {my_site_code}: bias = {bias}, abs_bias = {abs_bias}, srho = {srho}\")" ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", + "id": "bf7a7432", "metadata": {}, - "outputs": [], "source": [ - "srho = evaluation_metrics.spearman_rank(combined_df.CCSS_TUM_swe_m, combined_df.NWM_TUM_swe_m)\n", - "srho" + "Apply it to all sites using the `calculate_metrics` capability of the evaluation framework." ] }, { "cell_type": "code", "execution_count": null, + "id": "1a6cc22b", "metadata": {}, "outputs": [], "source": [ - "evaluation_metrics.condon(abs_bias, srho)" + "metrics_df = evaluation_utils.calculate_metrics(\n", + " obs_df, # Pandas DataFrame containing the time series observations. \n", + " model_df, # Pandas DataFrame containing the time series model calculations.\n", + " metadata_df, # Pandas DataFrame containing location and site attribute information.\n", + " metrics_list=None, # List of metrics to calculate. None indicates that all metrics will be computed.\n", + " write_csv=False, # Indicates the the output should not be saved to CSV (default = False)\n", + " csv_path=None, # Indicates the the output should not be saved to CSV (default = None)\n", + ")\n", + "metrics_df" ] }, { "cell_type": "markdown", + "id": "0838c8f5", "metadata": {}, "source": [ - "
\n", - "

🧠 Reflect

\n", - "

\n", - " What is the modeled SWE on the date when the observed SWE reaches its peak?
\n", - " ✏️ Use the code snippet below to find the answer.\n", - "

\n", - "
\n",
-    "  \n",
-    "    # Find date of the peak SWE from observed data\n",
-    "    date_obs_max = combined_df['CCSS_HRS_swe_m'].idxmax()\n",
-    "\n",
-    "    # Get corresponding value of modeled data on that date\n",
-    "    value_mod_at_max_obs = combined_df.loc[date_obs_max, 'NWM_HRS_swe_m']\n",
-    "  
\n", - "
\n" + "Look at plots of summary statistics for each station. Here we look at Bias and NSE for each station in the watershed:" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": null, + "id": "d58febb4", "metadata": {}, + "outputs": [], "source": [ - "### Focus on Timing: Melt Period Metrics\n", - "Compare the average melt rate over the full melt period. " + "# Bias scatter\n", + "scatter = metrics_df.hvplot.scatter(\n", + " x='site_id',\n", + " y='bias',\n", + " size=100,\n", + " rot=45,\n", + " ylabel='Bias (m)',\n", + " title='Mean SWE Bias by Station'\n", + ")\n", + "\n", + "hline = hv.HLine(0).opts(color='black', line_dash='dashed', line_width=1)\n", + "\n", + "scatter * hline" ] }, { - "cell_type": "markdown", + "cell_type": "code", + "execution_count": null, + "id": "988fcb43", "metadata": {}, + "outputs": [], "source": [ - "The following function computes the melt period length by identifying the first date after the peak SWE when SWE drops to zero and remains at zero for at least (`min_zero_days`) consecutive days. This is used to define the end of the melt period. Finally, the function calculates the average melt rate, which represents the rate at which snow disappeared, expressed in meters per day, over the full melt period." + "# NSE histogram\n", + "metrics_df.hvplot.bar(\n", + " x='site_id',\n", + " y='nse',\n", + " rot=45,\n", + " ylabel='Nash–Sutcliffe Efficiency',\n", + " title='NSE by Station',\n", + " height=400,\n", + " width=600,\n", + " bar_width=0.5\n", + ")\n" ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", + "id": "d0ee45fb", "metadata": {}, - "outputs": [], "source": [ - "melt_stats_df = utils.compute_melt_period_statistics(combined_df)\n", - "melt_stats_df.head()\n", - "melt_stats_df" + "
\n", + "

🧠 Reflect

\n", + "

\n", + " If you recall from earlier, we plotted the timeseries of our selected station. Replot it below. Do the metrics make sense given the visual comparison between modeled and observed? For example, when you look at the timeseries, is the model consistently predicting SWE to be higher or lower than observations? Does this align with the Bias sign (+ or -)?\n", + "

\n", + "
" ] }, { "cell_type": "code", "execution_count": null, - "metadata": { - "tags": [] - }, + "id": "9fe75736", + "metadata": {}, "outputs": [], "source": [ - "observed_melt_period = nwm_utils.compute_melt_period(combined_df[f'CCSS_{my_site_code}_swe_m'])\n", - "observed_melt_period" + "plot_utils.comparison_plots(obs_df, model_df, f'{my_site_code}', f'{my_site_code}', site_label=None)\n", + "\n", + "# Change the site code to see other Snotel Stations --> e.g., '688:CO:SNTL'\n", + "#plot_utils.comparison_plots(obs_df, model_df, '688:CO:SNTL', '688:CO:SNTL', site_label=None)" ] }, { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], + "cell_type": "markdown", + "id": "a2d295d2", + "metadata": {}, "source": [ - "modeled_melt_period = nwm_utils.compute_melt_period(combined_df[f'NWM_{my_site_code}_swe_m'])\n", - "modeled_melt_period" + "
\n", + "

🧠 Reflect

\n", + "

You now have several performance metrics: Bias, Pearson Correlation, Spearman Correlation, NSE, and KGE. If you had to pick just one metric to summarize model performance, which would you choose—and why?

\n", + "
" ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", + "id": "ced951d0", "metadata": {}, - "outputs": [], "source": [ - "accum_months = [10, 11, 12, 1, 2, 3]\n", - "ablation_months = [4, 5, 6]\n", - "\n", - "accum_stats = nwm_utils.compute_stats_period(\n", - " combined_df,\n", - " f'CCSS_{my_site_code}_swe_m',\n", - " f'NWM_{my_site_code}_swe_m',\n", - " accum_months\n", - ")\n", - "\n", - "accum_stats" + "The metrics above are computed over the entire snow season, including both accumulation and ablation periods. If desired, you can subset the data and recompute the metrics to examine differences between these phases. For example, if we define the **ablation period** as April through June, the following code computes statistics for data within those months. This approach provides a more detailed understanding of how well the model represents snow accumulation versus melt processes and where performance differences may occur." ] }, { "cell_type": "code", "execution_count": null, + "id": "ba082688", "metadata": {}, "outputs": [], "source": [ + "ablation_months = [4, 5, 6] # 4:April, 5:May, 6:June. \n", + "\n", + "# Create subsets of the aligned dataframes that exclude the ablation months\n", + "obs_df_ablation = obs_df[obs_df.index.month.isin(ablation_months)]\n", + "model_df_ablation = model_df[model_df.index.month.isin(ablation_months)]\n", "\n", - "ablation_stats = nwm_utils.compute_stats_period(\n", - " combined_df,\n", - " f'CCSS_{my_site_code}_swe_m',\n", - " f'NWM_{my_site_code}_swe_m',\n", - " ablation_months\n", + "# Compute metrics for the ablation period\n", + "metrics_df_ablation = evaluation_utils.calculate_metrics(\n", + " obs_df_ablation,\n", + " model_df_ablation,\n", + " metadata_df,\n", + " metrics_list=None,\n", + " write_csv=False,\n", + " csv_path=None,\n", ")\n", "\n", - "ablation_stats" + "metrics_df_ablation" ] }, { "cell_type": "markdown", + "id": "aff0fac4", "metadata": {}, "source": [ - "
\n", - "

🧠 Reflect

\n", - "

\n", - " If you recall from earlier, we plotted the timeseries of out selected station. Replot it below. Do the metrics make sense given the visual comparison between modeled and observed? For example, when you look at the timeseries, is the model consistently predicting SWE to be higher or lower than observations? Does this align with the Bias sign (+ or -)?\n", - "

\n", - "
" + "### 6.2 Combining Magnitude (Absolute Relative Bias) and Timing (Spearman’s Rho) Using the Condon Metric " ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", + "id": "53396d9b", "metadata": {}, - "outputs": [], "source": [ - "nwm_utils.comparison_plots(combined_df, f'CCSS_{my_site_code}_swe_m', f'NWM_{my_site_code}_swe_m')\n" + "One way to learn more about the model performance is to combine metrics that tell us about different aspects of model behavior—such as timing, variability, and magnitude—rather than relying on a single summary measure.\n", + "\n", + "The Condon diagram separates model performance into quadrants based on two metrics: **Spearman’s rho** (shape/time agreement) and **relative bias** (magnitude error). The horizontal line at 0.5 distinguishes whether the model captures the temporal pattern well (above 0.5 = good shape), while the vertical line is traditionally placed at a relative bias of 1.0, which represents a 100% error. This means the model’s total error is as large as the observed signal itself. This threshold has a clear physical interpretation and is used in the original Condon framework to distinguish acceptable vs. large bias. " ] }, { "cell_type": "code", "execution_count": null, + "id": "fb9d46d7", "metadata": {}, "outputs": [], - "source": [] + "source": [ + "plot_utils.plot_condon_diagram(metrics_df, variable=\"SWE\")" + ] } ], "metadata": { diff --git a/src/cssi_evaluation/utils/evaluation_utils.py b/src/cssi_evaluation/utils/evaluation_utils.py index fabb7de..3e2ce96 100644 --- a/src/cssi_evaluation/utils/evaluation_utils.py +++ b/src/cssi_evaluation/utils/evaluation_utils.py @@ -118,10 +118,23 @@ def calculate_metrics( # Initialize empty metrics DataFrame to store calculated comparison metrics. 
    metrics_df = initialize_metrics_df(obs_metadata_df, metrics_list)
 
-    num_sites = obs_data_df.shape[1] - 1  # first column is 'date'
+    # Ensure datetime index for obs_data_df (without mutating the caller's DataFrame)
+    if "date" in obs_data_df.columns:
+        obs_data_df = obs_data_df.assign(date=pd.to_datetime(obs_data_df["date"]))
+        obs_data_df = obs_data_df.set_index("date")
+    else:
+        obs_data_df = obs_data_df.set_axis(pd.to_datetime(obs_data_df.index))
+
+    # Ensure datetime index for model_data_df (without mutating the caller's DataFrame)
+    if "date" in model_data_df.columns:
+        model_data_df = model_data_df.assign(date=pd.to_datetime(model_data_df["date"]))
+        model_data_df = model_data_df.set_index("date")
+    else:
+        model_data_df = model_data_df.set_axis(pd.to_datetime(model_data_df.index))
+
+    site_cols = obs_data_df.columns
 
-    for i in range(num_sites):
-        site_id = obs_data_df.columns[(i + 1)]
+    for site_id in site_cols:
         obs_data = obs_data_df.loc[:, [site_id]].to_numpy()
         model_data = model_data_df.loc[:, [site_id]].to_numpy()
 
diff --git a/src/cssi_evaluation/variables/snow_utils.py b/src/cssi_evaluation/variables/snow_utils.py
index ad0a1d9..109f694 100644
--- a/src/cssi_evaluation/variables/snow_utils.py
+++ b/src/cssi_evaluation/variables/snow_utils.py
@@ -82,94 +82,6 @@ def modeled_swe_at_observed_peak(obs_df: pd.DataFrame, model_df: pd.DataFrame) -
     return pd.concat(results)
 
 
-######### ORIGINAL CUAHSI FUNCTION BELOW:
-# def modeled_swe_at_observed_peak(
-#     df: pd.DataFrame, obs_swe_cols: list[str], mod_swe_cols: list[str]
-# ) -> pd.DataFrame:
-#     """
-#     Extract modeled SWE values on the dates of observed peak (maximum) SWE.
-
-#     This function evaluates model performance by comparing observed peak SWE
-#     to the modeled SWE on the same calendar date. For each station and water year,
-#     the date of maximum observed SWE is identified, and the modeled SWE value
-#     at that date is extracted.
-
-#     Parameters
-#     ==========
-#     df: pandas.DataFrame
-#         A pandas dataframe containing columns associated with modeled and observed SWE. The
-#         dataframe must have an datetime[64] index. 
-# obs_swe_cols: list[str] -# Names of the columns associated with observed SWE -# mod_swe_cols: list[str] -# Names of the columns associated with modeled SWE - -# Returns -# ======= -# df: pandas.DataFrame -# A dataframe containing observed max observed SWE and the modeled SWE at the index of the -# maximum observed. The format of the DataFrame will be: - -# Observed Modeled Water_Year Station -# -# -# ... - -# Example: - -# Observed Modeled Water_Year Station -# 2019-04-18 0.98044 1.0293 2019 CCSS_DAN_swe_m -# 2019-04-20 2.12090 1.3598 2019 CCSS_HRS_swe_m -# 2019-03-28 0.80264 0.6708 2019 CCSS_KIB_swe_m -# 2019-04-07 1.78562 0.9965 2019 CCSS_PDS_swe_m -# ... - -# """ - -# # compute water year if it doesn't already exist in the dataframe. -# # this is needed to properly align the same-day comparison -# if "Water_Year" not in df.columns: -# compute_water_year(df, inplace=True) - -# # check to make sure that the input columns are the same length. -# # Raise an exception if they aren't, because our computation will fail. -# if len(obs_swe_cols) != len(mod_swe_cols): -# raise Exception("Modeled and observed inputs must be the same length") - -# # make sure our column data is represented as float64, otherwise -# # the pandas operations below will fail. -# df = df.apply(pd.to_numeric, errors="coerce").astype("float64") # type: ignore[assignment] -# df["Water_Year"] = df["Water_Year"].astype(int) # keep wateryear an integer - -# # Loop over each pairwise grouping of obs and mod columns that -# # have been provided as inputs. Group data for these stations -# # by water year and determine when the maximum value occurs in -# # the observation series. Save this value along with the corresponding -# # mod value at the same time. -# dfs = [] -# for obs, mod in zip(obs_swe_cols, mod_swe_cols): - -# # get the data for the current obs and mod columns -# # but drop all NaN data that may exist. 
-# dat = df.dropna(subset=[obs, mod, "Water_Year"]).copy() - -# # if all data is NaN for the current obs, mod combination -# # just skip it. -# if dat.empty: -# print(f"Skipping ({obs}, {mod}) because all data is NaN") -# continue - -# idx = dat.groupby("Water_Year")[obs].idxmax() -# dat = dat.loc[idx, [obs, mod, "Water_Year"]].copy() - -# dat.rename(columns={obs: "Observed", mod: "Modeled"}, inplace=True) -# dat["Station"] = obs - -# dfs.append(dat) - -# # concatenate all dataframes together and return -# return pd.concat(dfs) - def modeled_vs_observed_peak_swe(obs_df: pd.DataFrame, model_df: pd.DataFrame) -> pd.DataFrame: """ @@ -245,106 +157,6 @@ def modeled_vs_observed_peak_swe(obs_df: pd.DataFrame, model_df: pd.DataFrame) - return pd.concat(results).sort_values(["Station", "Water_Year"]).reset_index(drop=True) -######### ORIGINAL CUAHSI FUNCTION BELOW: -# def modeled_vs_observed_peak_swe( -# df: pd.DataFrame, obs_swe_cols: list[str], mod_swe_cols: list[str] -# ) -> pd.DataFrame: -# """ -# Extract and compare modeled and observed peak (maximum) SWE values and their timing. - -# This function identifies the dates and magnitudes of peak SWE -# independently for both observed and modeled time series. For each station -# and water year, it extracts the maximum observed SWE and its occurrence date, -# as well as the maximum modeled SWE and its occurrence date. - -# Parameters -# ========== -# df: pandas.DataFrame -# A pandas dataframe containing columns associated with modeled and observed SWE. The -# dataframe must have an datetime[64] index. -# obs_swe_cols: list[str] -# Names of the columns associated with observed SWE -# mod_swe_cols: list[str] -# Names of the columns associated with modeled SWE - -# Returns -# ======= -# df: pandas.DataFrame -# A dataframe containing maximum observed and modeled SWE at their respective times of -# occurence. The format of the DataFrame will be: - -# Observed Observed_Date Modeled Modeled_Date Water_Year Station -# 0 -# 1 -# ... 
- -# Example: - -# Observed Observed_Date Modeled Modeled_Date Water_Year Station -# 0 0.98044 2019-04-18 1.0393 2019-04-10 2019 CCSS_DAN_swe_m -# 1 0.41910 2020-04-21 0.5206 2020-04-18 2020 CCSS_DAN_swe_m -# 2 2.12090 2019-04-20 1.5498 2019-04-03 2019 CCSS_HRS_swe_m -# 3 0.89662 2020-04-10 0.5745 2020-04-10 2020 CCSS_HRS_swe_m -# ... - - -# """ - -# # compute water year if it doesn't already exist in the dataframe. -# # this is needed to properly align the same-day comparison -# if "Water_Year" not in df.columns: -# compute_water_year(df, inplace=True) - -# # check to make sure that the input columns are the same length. -# # Raise an exception if they aren't, because our computation will fail. -# if len(obs_swe_cols) != len(mod_swe_cols): -# raise Exception("Modeled and observed inputs must be the same length") - -# # make sure our column data is represented as float64, otherwise -# # the pandas operations below will fail. -# df = df.apply(pd.to_numeric, errors="coerce").astype("float64") # type: ignore[assignment] -# df["Water_Year"] = df["Water_Year"].astype(int) # keep wateryear an integer - -# # Loop over each pairwise grouping of obs and mod columns that -# # have been provided as inputs. Group data for these stations -# # by water year and determine when the maximum value occurs in -# # both the observation and modeled series. Save these values -# # along with their corresponding times -# dfs = [] -# for obs, mod in zip(obs_swe_cols, mod_swe_cols): - -# # get the data for the current obs and mod columns -# # but drop all NaN data that may exist. -# dat = df.dropna(subset=[obs, mod, "Water_Year"]).copy() - -# # if all data is NaN for the current obs, mod combination -# # just skip it. 
-# if dat.empty: -# print(f"Skipping ({obs}, {mod}) because all data is NaN") -# continue - -# obs_idx = dat.groupby("Water_Year")[obs].idxmax() -# obs_dat = dat.loc[obs_idx, [obs, "Water_Year"]].copy() -# obs_dat = obs_dat.rename(columns={obs: "Observed"}) -# obs_dat["Observed_Date"] = obs_idx.values - -# mod_idx = dat.groupby("Water_Year")[mod].idxmax() -# mod_dat = dat.loc[mod_idx, [mod, "Water_Year"]].copy() -# mod_dat = mod_dat.rename(columns={mod: "Modeled"}) -# mod_dat["Modeled_Date"] = mod_idx.values - -# dfs.append( -# # combine the observation and modeled sub-dataframes into one -# # by joining them on Water_Year. Then add -# obs_dat.merge(mod_dat, on="Water_Year", how="outer").assign( -# # create a new "Station" column containing the value of the obs -# Station=obs -# ) -# ) - -# # concatenate all dataframes together and return -# return pd.concat(dfs).reset_index().drop("index", axis=1) - def compute_melt_period_single( swe_series: pd.Series, min_zero_days: int = 10 @@ -427,66 +239,6 @@ def compute_melt_period_all_sites( return pd.DataFrame(result) -# def compute_melt_period( -# swe_series: pd.Series, min_zero_days: int = 10 -# ) -> dict[str, Any]: -# """ -# computes the snow melt period for the input Series. - -# Parameters -# ========== -# swe_series: pandas.Series -# A pandas series containing SWE values indexed by datetime. -# min_zero_days: int -> 10 -# The minimum number of consecutive days with zero SWE to consider -# when determining the melt end date. 
- -# Returns -# ======= -# dict[str, Any] -# A dictionary containing melt period information with the following keys: -# peak_date, peak_swe_m, melt_end_date, melt_period_days, melt_rate_m/d - -# """ - -# peak_date = swe_series.idxmax() -# peak_swe = swe_series.max() - -# after_peak = swe_series.loc[peak_date:] - -# zero_streak = 0 -# melt_end_date = None - -# for date, value in after_peak.items(): -# if value == 0: -# zero_streak += 1 -# else: -# zero_streak = 0 - -# if zero_streak >= min_zero_days: -# melt_end_date = date -# break - -# if melt_end_date is None: -# raise ValueError( -# f"Could not find a period of at least {min_zero_days} consecutive zero SWE days after the peak." -# ) - -# melt_period_days = (melt_end_date - peak_date).days - -# # Compute melt rate, but handle the case where melt_period_days is zero to avoid division by zero -# if melt_period_days == 0: -# melt_rate = np.nan -# else: -# melt_rate = peak_swe / melt_period_days - -# return { -# "peak_date": peak_date, -# "peak_swe_m": peak_swe, -# "melt_end_date": melt_end_date, -# "melt_period_days": melt_period_days, -# "melt_rate_m/d": melt_rate, -# } def compute_melt_period_obs_vs_model( obs_df: pd.DataFrame, @@ -576,62 +328,6 @@ def compute_melt_period_obs_vs_model( return pd.DataFrame(result) -# def compute_melt_period_statistics( -# df: pd.DataFrame, min_zero_days: int = 10 -# ) -> pd.DataFrame: -# """ -# Computes melt period statistics for each station and water year in the input DataFrame. 
- -# Parameters -# ========== - -# Returns -# ======= -# pandas.DataFrame -# A pandas DataFrame containing melt period statistics with the following columns: -# Water_Year, Station, Peak_SWE_Date, Peak_SWE_m, Melt_End_Date, Melt_Period_Days, -# Melt_Rate_m_per_day - -# """ - -# # TODO: move ccss columns as an input parameter -# result = [] - -# # Identify CCSS SWE columns -# ccss_columns = [ -# col for col in df.columns if col.startswith("CCSS_") and col.endswith("_swe_m") -# ] - -# for wy, group in df.groupby("Water_Year"): -# for station_col in ccss_columns: - -# # TODO: refactore dropna handling similar to other functions -# # Clean series -# swe_series = pd.to_numeric(group[station_col], errors="coerce").dropna() - -# # Skip if insufficient data -# if swe_series.empty or swe_series.max() == 0: -# continue - -# try: -# # Compute melt period stats -# stats = compute_melt_period(swe_series, min_zero_days=min_zero_days) -# result.append( -# { -# "Water_Year": wy, -# "Station": station_col, -# "Peak_SWE_Date": stats["peak_date"], -# "Peak_SWE_m": stats["peak_swe_m"], -# "Melt_End_Date": stats["melt_end_date"], -# "Melt_Period_Days": stats["melt_period_days"], -# "Melt_Rate_m_per_day": stats["melt_rate_m/d"], -# } -# ) -# except ValueError: -# # If melt period cannot be determined, skip -# continue - -# return pd.DataFrame(result) def compute_snow_timing_grid(grid_swe_da, water_year, threshold=1.0, smooth_window=5): """
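The Condon-diagram thresholds described in the new notebook markdown cell (Spearman’s rho above 0.5 for good shape agreement, absolute relative bias below 1.0 for acceptable magnitude error) can be sketched as a small quadrant classifier. This is an illustrative sketch only: `classify_condon_quadrant` and the site values below are hypothetical and are not part of `plot_utils` or this diff.

```python
def classify_condon_quadrant(
    spearman_rho: float,
    abs_relative_bias: float,
    rho_threshold: float = 0.5,
    bias_threshold: float = 1.0,
) -> str:
    """Assign a site to a Condon-diagram quadrant.

    spearman_rho measures shape/timing agreement; abs_relative_bias measures
    magnitude error, where 1.0 corresponds to a 100% error (the total error is
    as large as the observed signal itself).
    """
    shape = "good shape" if spearman_rho > rho_threshold else "poor shape"
    bias = "acceptable bias" if abs_relative_bias < bias_threshold else "large bias"
    return f"{shape}, {bias}"


# Hypothetical per-site metric values (site codes are illustrative).
sites = {
    "CCSS_DAN_swe_m": (0.92, 0.25),
    "CCSS_HRS_swe_m": (0.85, 1.40),
    "CCSS_PDS_swe_m": (0.30, 0.60),
}
for site, (rho, bias) in sites.items():
    print(site, "->", classify_condon_quadrant(rho, bias))
```

A plotting routine such as `plot_condon_diagram` would typically draw the two threshold lines and scatter each site's (rho, bias) pair, so this classifier just makes the quadrant boundaries explicit.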