Summary
Build 15_optimization/038_optimization_on_histogram_surrogate.ipynb as a companion to the existing 030_Classification_Optimization.ipynb. The new notebook teaches a general numerical-methods pattern: when iterative parameter optimization is expensive over $N$ raw data points, replace the per-point sum in the cost function with a per-cell sum over a multi-dimensional histogram of the feature space, optimize on the surrogate, then verify the final result on the full dataset.
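In symbols, the substitution can be written as follows. The notation is an assumption introduced here for illustration and is not taken from the dissertation: $\ell$ is a per-point cost term, $n_{c,k}$ is the number of class-$c$ points that fall in histogram cell $k$, $\bar{x}_k$ is the center of cell $k$, and $K$ is the total number of cells.

$$
J(\theta) = \sum_{i=1}^{N} \ell(x_i, y_i; \theta)
\;\approx\;
\hat{J}(\theta) = \sum_{c} \sum_{k=1}^{K} n_{c,k}\, \ell(\bar{x}_k, c; \theta),
\qquad \sum_{c} \sum_{k=1}^{K} n_{c,k} = N
$$

Each evaluation of $\hat{J}$ costs $O(K)$ instead of $O(N)$, which is where the saving comes from when $K \ll N$. The approximation error comes from treating every point in a cell as if it sat at the cell center.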
The pattern was used in Ch. 6 of:
Lee, K. (2004). Longitudinal Driver Model and Collision Warning and Avoidance Algorithms Based on Human Driving Databases. Ph.D. dissertation, Mechanical Engineering, University of Michigan.
There, two 5D histograms reduced 3.4 M data points (~400 MB) to ~1.9 MB while approximating the confusion matrix used by fmincon with 1000 random initial conditions. The same pattern transplants cleanly to a 2D classification example for teaching.
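A minimal sketch of the 2D teaching version, assuming synthetic Gaussian data, a 40 x 40 grid, and a linear classifier. The data, the names `confusion_full` and `confusion_surrogate`, and the classifier parameters are all hypothetical. It builds one histogram per class on a shared grid and compares the confusion matrix computed from every raw point with the one approximated from cell centers and cell counts.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data in a 2D feature space (illustrative, not from the source).
n_per_class = 50_000
x0 = rng.normal(loc=[-1.0, 0.0], scale=1.0, size=(n_per_class, 2))  # class 0
x1 = rng.normal(loc=[1.0, 0.5], scale=1.0, size=(n_per_class, 2))   # class 1

# Shared bin edges spanning all the data, so no point falls outside the grid.
bins = 40
both = np.vstack([x0, x1])
edges = [np.linspace(both[:, d].min(), both[:, d].max(), bins + 1) for d in range(2)]

# One histogram per class: h[i, j] counts points in x-bin i and y-bin j.
h0, _, _ = np.histogram2d(x0[:, 0], x0[:, 1], bins=edges)
h1, _, _ = np.histogram2d(x1[:, 0], x1[:, 1], bins=edges)

# Cell centers, laid out to match the (x-bin, y-bin) indexing of h0 and h1.
cx, cy = [0.5 * (e[:-1] + e[1:]) for e in edges]
centers = np.stack(np.meshgrid(cx, cy, indexing="ij"), axis=-1)  # (bins, bins, 2)


def confusion_full(w, b):
    """Confusion matrix from every raw point: rows = true class, cols = predicted."""
    p0 = x0 @ w + b > 0
    p1 = x1 @ w + b > 0
    return np.array([[np.sum(~p0), np.sum(p0)], [np.sum(~p1), np.sum(p1)]])


def confusion_surrogate(w, b):
    """Same matrix, approximated by classifying each cell center once."""
    pred = centers @ w + b > 0  # (bins, bins) boolean mask
    return np.array([[h0[~pred].sum(), h0[pred].sum()],
                     [h1[~pred].sum(), h1[pred].sum()]])


w, b = np.array([1.0, 0.5]), 0.0  # hypothetical linear classifier
cm_full = confusion_full(w, b)
cm_sur = confusion_surrogate(w, b)
print(cm_full)
print(cm_sur)
print("raw bytes:", both.nbytes, "histogram bytes:", h0.nbytes + h1.nbytes)
```

The surrogate matrix always sums to the total point count because the grid spans the data. Its entries differ from the full-data matrix only through cells that straddle the decision boundary, so the gap should shrink as the grid is refined.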
Pedagogical arc
- ipywidgets slider on grid resolution
Scope decisions
Curricular hooks
- Bridges 20_probability/ histograms (per-class joint densities) and 15_optimization/ (optimization on a surrogate)
- Direct follow-on to 030_Classification_Optimization.ipynb
- Train-on-surrogate, verify-on-truth pattern is a general engineering-numerical-methods lesson reusable elsewhere
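The train-on-surrogate, verify-on-truth loop could look roughly like the sketch below. Everything in it is an assumption made for illustration: the synthetic data, the 30 x 30 grid, the sigmoid-smoothed misclassification cost, the (angle, offset) line parameterization, and the use of `scipy.optimize.minimize` with Nelder-Mead from 16 random starts as a stand-in for `fmincon` with many random initial conditions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(1)

# Synthetic two-class data (illustrative only).
n = 20_000
x0 = rng.normal([-1.0, 0.0], 1.0, size=(n, 2))  # class 0
x1 = rng.normal([1.0, 0.5], 1.0, size=(n, 2))   # class 1

# Per-class histograms on a shared grid spanning all the data.
both = np.vstack([x0, x1])
edges = [np.linspace(both[:, d].min(), both[:, d].max(), 31) for d in range(2)]
h0, _, _ = np.histogram2d(x0[:, 0], x0[:, 1], bins=edges)
h1, _, _ = np.histogram2d(x1[:, 0], x1[:, 1], bins=edges)
cx, cy = [0.5 * (e[:-1] + e[1:]) for e in edges]
gx, gy = np.meshgrid(cx, cy, indexing="ij")  # cell centers, same layout as h0/h1


def score(theta, x, y):
    """Signed distance to the line; positive means 'predict class 1'."""
    phi, c = theta
    return np.cos(phi) * x + np.sin(phi) * y - c


def surrogate_cost(theta, tau=0.2):
    """Smoothed misclassification rate, summed over cells instead of points."""
    p1 = expit(score(theta, gx, gy) / tau)  # soft 'predict class 1' per cell
    return (np.sum(h0 * p1) + np.sum(h1 * (1.0 - p1))) / (2 * n)


def true_error(theta):
    """Hard misclassification rate evaluated on every raw data point."""
    fp = np.sum(score(theta, x0[:, 0], x0[:, 1]) > 0)
    fn = np.sum(score(theta, x1[:, 0], x1[:, 1]) <= 0)
    return (fp + fn) / (2 * n)


# Train: multi-start optimization on the cheap surrogate only.
starts = np.column_stack([rng.uniform(-np.pi, np.pi, 16), rng.uniform(-2, 2, 16)])
runs = [minimize(surrogate_cost, s, method="Nelder-Mead") for s in starts]
best = min(runs, key=lambda r: r.fun)

# Verify: a single pass over the full dataset at the surrogate optimum.
print("surrogate cost at optimum:", best.fun)
print("true error at optimum:    ", true_error(best.x))
```

The raw points are touched once to build the histograms and once to verify, while every optimizer iteration runs on the grid. A large gap between the two printed numbers would suggest the grid is too coarse or the smoothing width `tau` too large for this problem.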
Out of scope (parked as future threads)
- … 039) once 038 lands.