Skip to content

Commit 0c16951

Browse files
pavelkomarovclaude
andcommitted
Add JOSS 2.0 paper draft
paper/paper.md and paper/paper.bib covering the updated package: seven method families, multidimensional support, variable step size, missing data, circular domains, robustdiff, tvrdiff, and hyperparameter optimization. References companion Taxonomy paper (arXiv:2512.09090). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent b38199f commit 0c16951

2 files changed

Lines changed: 253 additions & 0 deletions

File tree

paper/paper.bib

Lines changed: 92 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,92 @@
1+
@article{vanBreugel2022,
2+
doi = {10.21105/joss.04078},
3+
url = {https://doi.org/10.21105/joss.04078},
4+
year = {2022},
5+
publisher = {The Open Journal},
6+
volume = {7},
7+
number = {71},
8+
pages = {4078},
9+
author = {Floris van Breugel and Yuying Liu and Bingni W. Brunton and J. Nathan Kutz},
10+
title = {{PyNumDiff}: A {Python} package for numerical differentiation of noisy time-series data},
11+
journal = {Journal of Open Source Software}
12+
}
13+
14+
@misc{komarov2025,
15+
title = {A Taxonomy of Numerical Differentiation Methods},
16+
author = {Pavel Komarov and Floris van Breugel and J. Nathan Kutz},
17+
year = {2025},
18+
eprint = {2512.09090},
19+
archivePrefix = {arXiv},
20+
primaryClass = {math.NA},
21+
url = {https://arxiv.org/abs/2512.09090}
22+
}
23+
24+
@article{vanBreugel2020numerical,
25+
doi = {10.1109/ACCESS.2020.3034077},
26+
year = {2020},
27+
author = {Floris {van Breugel} and J. Nathan Kutz and Bingni W. Brunton},
28+
journal = {IEEE Access},
29+
title = {Numerical differentiation of noisy data: A unifying multi-objective optimization framework},
30+
volume = {8},
31+
pages = {196865--196877}
32+
}
33+
34+
@article{chartrand2011numerical,
35+
author = {Rick Chartrand},
36+
title = {Numerical differentiation of noisy, nonsmooth data},
37+
journal = {ISRN Applied Mathematics},
38+
year = {2011},
39+
volume = {2011},
40+
pages = {164564},
41+
doi = {10.5402/2011/164564}
42+
}
43+
44+
@article{brunton2016discovering,
45+
author = {Steven L. Brunton and Joshua L. Proctor and J. Nathan Kutz},
46+
title = {Discovering governing equations from data by sparse identification of nonlinear dynamical systems},
47+
journal = {Proceedings of the National Academy of Sciences},
48+
year = {2016},
49+
volume = {113},
50+
number = {15},
51+
pages = {3932--3937},
52+
doi = {10.1073/pnas.1517384113}
53+
}
54+
55+
@article{virtanen2020scipy,
56+
author = {Pauli Virtanen and Ralf Gommers and Travis E. Oliphant and others},
57+
title = {{SciPy} 1.0: Fundamental algorithms for scientific computing in {Python}},
58+
journal = {Nature Methods},
59+
year = {2020},
60+
volume = {17},
61+
pages = {261--272},
62+
doi = {10.1038/s41592-019-0686-2}
63+
}
64+
65+
@article{harris2020array,
66+
author = {Charles R. Harris and K. Jarrod Millman and St{\'{e}}fan J. {van der Walt} and others},
67+
title = {Array programming with {NumPy}},
68+
journal = {Nature},
69+
year = {2020},
70+
volume = {585},
71+
pages = {357--362},
72+
doi = {10.1038/s41586-020-2649-2}
73+
}
74+
75+
@article{diamond2016cvxpy,
76+
author = {Steven Diamond and Stephen Boyd},
77+
title = {{CVXPY}: A {Python}-embedded modeling language for convex optimization},
78+
journal = {Journal of Machine Learning Research},
79+
year = {2016},
80+
volume = {17},
81+
number = {83},
82+
pages = {1--5}
83+
}
84+
85+
@software{pynumdiff_zenodo,
86+
author = {Floris van Breugel and Yuying Liu and Bingni W. Brunton and J. Nathan Kutz},
87+
title = {{PyNumDiff}},
88+
year = {2022},
89+
publisher = {Zenodo},
90+
doi = {10.5281/zenodo.6374098},
91+
url = {https://doi.org/10.5281/zenodo.6374098}
92+
}

paper/paper.md

Lines changed: 161 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,161 @@
1+
---
2+
title: 'PyNumDiff: A Python Package for Numerical Differentiation of Noisy Data'
3+
tags:
4+
- Python
5+
- numerical differentiation
6+
- time series
7+
- denoising
8+
- dynamics
9+
- signal processing
10+
authors:
11+
- name: Pavel Komarov
12+
orcid: 0000-0000-0000-0000 # TODO: fill in
13+
corresponding: true
14+
affiliation: 1
15+
- name: Floris van Breugel
16+
orcid: 0000-0000-0000-0000 # TODO: fill in
17+
affiliation: 2
18+
- name: J. Nathan Kutz
19+
orcid: 0000-0003-0944-900X
20+
affiliation: 1
21+
affiliations:
22+
- name: Department of Applied Mathematics, University of Washington, USA
23+
index: 1
24+
- name: Department of Mechanical Engineering, University of Nevada, Reno, USA
25+
index: 2
26+
date: 1 April 2026
27+
bibliography: paper.bib
28+
---
29+
30+
# Summary
31+
32+
Computing derivatives from measured data is a foundational requirement across science and
33+
engineering. Whether fitting governing equations from experimental observations, designing
34+
control laws, or processing sensor streams, researchers routinely need derivative estimates
35+
from discrete, noisy measurements. The textbook approach — finite differencing — amplifies
36+
noise proportionally to $1/\Delta t$, making it unreliable for real data. Smoothing before
37+
differencing helps, but the choice of algorithm and its tuning parameters substantially
38+
affect the result, and no single choice is universally best.
39+
40+
PyNumDiff is an open-source Python package that addresses this by providing a broad suite of
41+
numerical differentiation methods for noisy data under a unified API. Seven families of
42+
algorithms are implemented: (1) prefiltering followed by finite difference calculation;
43+
(2) iterated finite differencing; (3) polynomial fitting; (4) basis function fitting;
44+
(5) total variation regularization; (6) Kalman smoothing and its outlier-robust variant;
45+
and (7) local approximation with linear models. All methods return a smoothed signal estimate
46+
and a derivative estimate as a matched pair `(x_hat, dxdt_hat)`. A companion paper
47+
[@komarov2025] provides a comprehensive theoretical taxonomy of these methods, benchmarks
48+
their relative performance, and guides method selection for different application scenarios.
49+
50+
51+
# Statement of Need
52+
53+
Estimating derivatives from noisy measurements arises throughout experimental science,
54+
data-driven modeling, system identification, and control. Standard finite differences amplify
55+
noise, and while smoothing helps, the field has produced a diverse ecosystem of specialized
56+
algorithms — each with different strengths regarding outlier robustness, computational cost,
57+
handling of irregular time steps, or treatment of missing data. Without a consolidated
58+
library, practitioners either default to the nearest available tool or implement bespoke
59+
solutions that are hard to compare or reproduce.
60+
61+
PyNumDiff fills this gap. Its unified interface lets users compare methods directly on the
62+
same data, exploit specialized capabilities, and select hyperparameters without requiring
63+
ground-truth derivatives. The package is particularly valuable in workflows that use
64+
derivatives as regression targets: for example, Sparse Identification of Nonlinear Dynamics
65+
(SINDy) [@brunton2016discovering] learns governing equations by regressing measured
66+
derivatives, making clean, reliable derivative estimates a prerequisite for accurate model
67+
identification.
68+
69+
70+
# State of the Field
71+
72+
Relevant Python tools exist, but none covers the full scope of PyNumDiff. `numpy.gradient`
73+
and `scipy.signal.savgol_filter` [@virtanen2020scipy] are widely available but address only
74+
a narrow slice of the method space. The `findiff` package provides high-order finite
75+
difference stencils for clean simulation data rather than noisy measurements. The standalone
76+
`TVRegDiff` code [@chartrand2011numerical] implements a single total variation regularization
77+
method; PyNumDiff includes and extends this. No existing Python package combines seven method
78+
families with a unified API, NaN and variable-step support, multidimensional capability,
79+
circular domain handling, and integrated hyperparameter optimization.
80+
81+
The original PyNumDiff [@vanBreugel2022] introduced four method families with basic parameter
82+
optimization. The present version substantially extends that foundation in scope, capability,
83+
and software quality.
84+
85+
86+
# Software Design
87+
88+
**Unified API.** All differentiation methods share the call signature
89+
90+
```python
91+
x_hat, dxdt_hat = method(x, dt_or_t, **params)
92+
```
93+
94+
where `x` is a NumPy array [@harris2020array] of measurements, `dt_or_t` is either a
95+
scalar step size or an array of sample locations (for variable step size), and keyword
96+
arguments configure the method. The two return values always match the shape of `x`. This
97+
uniformity makes method comparison straightforward and enables drop-in substitution.
98+
99+
**Multidimensional support.** An `axis` parameter controls which axis is differentiated,
100+
allowing all methods to operate on blocks of time series simultaneously. The implementation
101+
iterates over all non-time axes with `np.ndindex`, applying the algorithm independently to
102+
each vector, so a 2D array of shape `(N, D)` is handled without reshaping or looping in
103+
user code.
104+
105+
**Missing data and variable step size.** Several methods handle NaN-valued entries as missing
106+
observations and non-uniform sample spacing: `splinediff`, `polydiff`, `rbfdiff`, `rtsdiff`,
107+
and `robustdiff`. This supports realistic experimental scenarios where sensors skip samples or
108+
operate at irregular rates.
109+
110+
**Circular and wrapped domains.** `rtsdiff` accepts a `circular=True` flag for quantities
111+
like angles that live on a periodic domain. Innovation residuals are wrapped to $[-\pi, \pi]$
112+
before each Kalman update, and `x_hat` is returned wrapped to the same range. This avoids the
113+
erroneous large-magnitude spikes that naive smoothers produce when a signal crosses the
114+
$\pm\pi$ boundary. The wrapping is injected via an optional `innovation_fn` hook on the
115+
underlying `kalman_filter` primitive, keeping the filter itself general.
116+
117+
**Outlier robustness.** `robustdiff` solves a convex optimization problem that replaces the
118+
least-squares Kalman cost with Huber loss on both measurement and process residuals,
119+
dramatically reducing outlier influence. It uses `cvxpy` [@diamond2016cvxpy] as its
120+
optimization backend, available via `pip install pynumdiff[advanced]`. `tvrdiff` similarly
121+
minimizes a total variation penalty on the derivative, promoting piecewise-smooth solutions
122+
and tolerating abrupt changes; it supports regularization of the velocity, acceleration, or
123+
jerk (orders 1–3).
124+
125+
**Hyperparameter optimization.** Because every method has tuning parameters, PyNumDiff
126+
provides a multi-objective optimization framework in `pynumdiff.optimize` that minimizes a
127+
weighted sum of a smoothness penalty and a data fidelity term [@vanBreugel2020numerical].
128+
When ground-truth derivatives are available, the optimizer minimizes derivative error
129+
directly. The smoothness weight `tvgamma` can be initialized from the signal's estimated
130+
cutoff frequency $f_c$ via the empirical formula
131+
$$\texttt{tvgamma} = \exp(-1.6\ln f_c - 0.71\ln \Delta t - 5.1).$$
132+
133+
134+
# Research Impact
135+
136+
The original PyNumDiff paper [@vanBreugel2022] has accumulated over 60 citations since its
137+
2022 publication, and the package has been used in experimental biology to estimate flight
138+
kinematics from motion capture, in control engineering for observer design, and in
139+
data-driven dynamics identification. The companion Taxonomy paper [@komarov2025], submitted
140+
to the Journal of Computational Physics, provides the theoretical analysis underpinning the
141+
method collection and benchmarks all included methods across diverse test signals. The
142+
package is available on PyPI (`pip install pynumdiff`), documented at
143+
[pynumdiff.readthedocs.io](https://pynumdiff.readthedocs.io/master/), and archived on
144+
Zenodo [@pynumdiff_zenodo].
145+
146+
147+
# AI Usage Disclosure
148+
149+
The draft of this paper was prepared with assistance from Claude Sonnet 4.6 (Anthropic).
150+
This included integrating material from the repository, documentation, and related papers
151+
into a coherent draft. The authors reviewed and edited all content and take full
152+
responsibility for its accuracy.
153+
154+
155+
# Acknowledgements
156+
157+
The authors thank Yuying Liu and Bingni W. Brunton for their contributions to the original
158+
PyNumDiff package [@vanBreugel2022].
159+
160+
161+
# References

0 commit comments

Comments
 (0)