Commit 191f7fb

committed
remove green color
1 parent 4c8636a commit 191f7fb

3 files changed

Lines changed: 12 additions & 12 deletions

paper/analysis.tex

Lines changed: 10 additions & 10 deletions
@@ -75,7 +75,7 @@ \subsection{Methods}
7575

7676
For constant $\mtx{\Omega}$, the estimator~\refequ{equ:nystrompp-trace-estimator} coincides with the Nyström++ estimator from~\cite{persson-2022-improved-variants}, which is based on the Hutch++ estimator~\cite{meyer-2021-hutch-optimal}. In this situation (constant $\mtx{B}$), these estimators were both shown to achieve a relative $\varepsilon$-error with $\mathcal{O}(\varepsilon^{-1})$ matrix-vector products only, independent of the singular value decay of $\mtx{B}$.
7777

78-
\textcolor{green}{At this point, we acknowledge the existence of the XNysTrace estimator from \cite{epperly-2024-xtrace-making} which oftentimes outperforms the Nyström++ estimator. Moreover, a straight-forward extension of the XNysTrace estimator to the parameter-dependent setting seems to be in reach. However, what distinguishes \refalg{alg:nystrom-chebyshev-pp} for estimating spectral densities are two key observations --- using the cyclic invariance of the trace and the affine linear form of Chebyshev expansions (see \refsec{subsubsec:chebyshev-nystrom-implementation} for details) --- which it exploits to significantly speed up the computation. Unfortunately, we do not see a way of marrying these observations with the efficient implementation of XNysTrace described in \cite[Section 2.2]{epperly-2024-xtrace-making}, which limits its suitability for spectral density estimation.} \textcolor{green}{??? Maybe this would be more appropriate in the introduction? ???}
78+
At this point, we acknowledge the existence of the XNysTrace estimator from \cite{epperly-2024-xtrace-making} which oftentimes outperforms the Nyström++ estimator. Moreover, a straight-forward extension of the XNysTrace estimator to the parameter-dependent setting seems to be in reach. However, what distinguishes \refalg{alg:nystrom-chebyshev-pp} for estimating spectral densities are two key observations --- using the cyclic invariance of the trace and the affine linear form of Chebyshev expansions (see \refsec{subsubsec:chebyshev-nystrom-implementation} for details) --- which it exploits to significantly speed up the computation. Unfortunately, we do not see a way of marrying these observations with the efficient implementation of XNysTrace described in \cite[Section 2.2]{epperly-2024-xtrace-making}, which limits its suitability for spectral density estimation.
7979
%We can interpret this estimator as an interpolation between the trace of the Nyström approximation and the Girard-Hutchinson estimator.
8080

8181
%In the remainder of this section, we will derive upper bounds on the error of these estimators.
@@ -360,7 +360,7 @@ \subsection{Nyström++ estimator for parameter-dependent matrices}
360360
\label{qu:nystrompp-theorem-bound}
361361
\end{equation}
362362
holds with probability at least $1 - \gamma^{-n_{\mtx{\Omega}} / 4}$ for
363-
$c = \textcolor{green}{154}$. In particular, given $\varepsilon > 0$ and $\delta \in (0, 1)$, the bound $\int_{a}^{b} \lVert \mtx{B}(t) - \Nystr{\mtx{\Omega}}{\mtx{B}}(t) \rVert _F~\mathrm{d}t \leq \varepsilon \int_{a}^{b} \Trace(\mtx{B}(t))~\mathrm{d}t$ holds with probability at least $1-\delta$ if $n_{\mtx{\Omega}} = \mathcal{O}(\varepsilon^{-2} + \log(\delta^{-1}))$.
363+
$c = 154$. In particular, given $\varepsilon > 0$ and $\delta \in (0, 1)$, the bound $\int_{a}^{b} \lVert \mtx{B}(t) - \Nystr{\mtx{\Omega}}{\mtx{B}}(t) \rVert _F~\mathrm{d}t \leq \varepsilon \int_{a}^{b} \Trace(\mtx{B}(t))~\mathrm{d}t$ holds with probability at least $1-\delta$ if $n_{\mtx{\Omega}} = \mathcal{O}(\varepsilon^{-2} + \log(\delta^{-1}))$.
364364
\end{lemma}
365365

366366
%\todo{Proof idea: structural bound, then higher order moment bound to apply Markov's inequality.}
@@ -380,7 +380,7 @@ \subsection{Nyström++ estimator for parameter-dependent matrices}
380380
\end{equation*}
381381
we set
382382
$\mtx{\Omega}_1 := \mtx{U}_1^{\top} \mtx{\Omega} \in \mathbb{R}^{k \times n_{\mtx{\Omega}}}$ and $\mtx{\Omega}_2 := \mtx{U}_2^{\top} \mtx{\Omega} \in \mathbb{R}^{(n - k) \times n_{\mtx{\Omega}}}$, which are independent Gaussian random matrices.
383-
Applying Theorem B.1 from~\cite{persson-2023-randomized-lowrank} for $f(x) = x$, see also \textcolor{green}{proof of \cite[Corollary 8.2]{tropp-2023-randomized-algorithms}}, yields the bound
383+
Applying Theorem B.1 from~\cite{persson-2023-randomized-lowrank} for $f(x) = x$, see also proof of \cite[Corollary 8.2]{tropp-2023-randomized-algorithms}, yields the bound
384384
\begin{equation}
385385
\lVert \mtx{B}(t) - \Nystr{\mtx{\Omega}}{\mtx{B}}(t) \rVert _F
386386
\leq \lVert \mtx{\Lambda}_2 \rVert _F + \lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \mtx{\Omega}_2 \mtx{\Omega}_1^{\dagger} \rVert _{(4)}^2,
@@ -410,7 +410,7 @@ \subsection{Nyström++ estimator for parameter-dependent matrices}
410410
\[
411411
\mathbb{E}\left[ \big\| ( \mtx{\Omega}_1 \mtx{\Omega}_1^{\top} )^{-1} \big\|_2^{\sfrac{n_{\mtx{\Omega}}}{4}} \right]% \notag \\
412412
\leq
413-
\textcolor{green}{\left(1 + \frac{n_{\mtx{\Omega}}}{2}\right)}
413+
\left(1 + \frac{n_{\mtx{\Omega}}}{2}\right)
414414
\left( \frac{3}{4} n_{\mtx{\Omega}}\right)^{\sfrac{n_{\mtx{\Omega}}}{4}}
415415
\big( ( n_{\mtx{\Omega}} / 2 + 1)!\big)^{-\frac{n_{\mtx{\Omega}}}{2+n_{\mtx{\Omega}}}}.
416416
\]
@@ -422,7 +422,7 @@ \subsection{Nyström++ estimator for parameter-dependent matrices}
422422
\frac{e^4 n_{\mtx{\Omega}}}{(n_{\mtx{\Omega}} / 2+ 1)^2} \le
423423
\frac{3}{4} \frac{e^4}{n_{\mtx{\Omega}}},
424424
\end{equation}
425-
where we used \textcolor{green}{$(1 + m)^{\sfrac{1}{m}} \leq e$ and} $(m!)^{-\sfrac{1}{m}} \leq e/m$. Note that, in contrast
425+
where we used $(1 + m)^{\sfrac{1}{m}} \leq e$ and $(m!)^{-\sfrac{1}{m}} \leq e/m$. Note that, in contrast
426426
to the result of~\cite[Lemma B.3]{tropp-2023-randomized-algorithms}, this inequality is valid for arbitrarily large $n_{\mtx{\Omega}}$, at the expense of a slightly larger constant.
427427
%
428428
% To match the decay rate of the moments of the first term in \refequ{equ:nystrom-proof-persson-bonud}, we will need to ensure that this term also is of order $\mathcal{O}(\Trace(\mtx{B}) / \sqrt{k})$. We bound the moments of $\lVert \mtx{\Omega}_1^{\dagger} \rVert _2$ similarly to \cite[Lemma B.3]{tropp-2023-randomized-algorithms}, but without restricting $q$ to be smaller than $18$. The explicit integration of \cite[Equation B.7]{tropp-2023-randomized-algorithms} imposes the condition $n_{\mtx{\Omega}} - k - 2q \geq 0$. Both $n_{\mtx{\Omega}}$ and $k$ are integers and must be of the same order to ensure a decay of $\mathcal{O}(\Trace(\mtx{B}) / \sqrt{k})$. To avoid restricting the choice of $n_{\mtx{\Omega}}$ too much, we let it be even and set $k = n_{\mtx{\Omega}}/2$. The moment $q$ should be chosen as large as possible to ensure a fast decay of the failure probability, so we fix it to $q = n_{\mtx{\Omega}}/4$. Therefore, we get
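The two elementary inequalities invoked in this hunk, $(1 + m)^{1/m} \leq e$ and $(m!)^{-1/m} \leq e/m$, can be spot-checked numerically. The Python sketch below is not part of the commit; the function names and the range $m = 1, \dots, 200$ are illustrative assumptions, and a finite check like this is of course no substitute for the proof.

```python
import math


def first_inequality_holds(m):
    # (1 + m)^(1/m) <= e, which is equivalent to 1 + m <= e^m.
    return (1.0 + m) ** (1.0 / m) <= math.e


def second_inequality_holds(m):
    # (m!)^(-1/m) <= e/m; lgamma(m + 1) = log(m!) avoids overflow for large m.
    root = math.exp(-math.lgamma(m + 1) / m)
    return root <= math.e / m


# Hypothetical test range; any set of positive integers could be used instead.
checked = range(1, 201)
all_hold = all(
    first_inequality_holds(m) and second_inequality_holds(m) for m in checked
)
print(all_hold)
```

Working with `math.lgamma` rather than `math.factorial` keeps the computation in floating point throughout, so the same check can be extended to much larger $m$ if desired.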
@@ -443,7 +443,7 @@ \subsection{Nyström++ estimator for parameter-dependent matrices}
443443
The second factor in~\refequ{equ:nystrom-proof-processed-tail} is bounded using \reflem{lem:spectral-norm-moment} with $\mtx{A} = \mtx{\Lambda}_2^{\sfrac{1}{2}}$: % and $p = n_{\mtx{\Omega}}/2$:
444444
\[
445445
\mathbb{E}^{\sfrac{n_{\mtx{\Omega}}}{4}}\left[ \big\| \mtx{\Lambda}_2^{\sfrac{1}{2}} \mtx{\Omega}_2 \big\|_2^2 \right]
446-
\leq \textcolor{green}{\frac{5}{4}} n_{\mtx{\Omega}} \Big( \textcolor{green}{2} \big\| \mtx{\Lambda}_2^{\sfrac{1}{2}} \big\|_2^2 + \frac{1}{n_{\mtx{\Omega}}} \big\| \mtx{\Lambda}_2^{\sfrac{1}{2}} \big\|_F^2 \Big).
446+
\leq \frac{5}{4} n_{\mtx{\Omega}} \Big( 2 \big\| \mtx{\Lambda}_2^{\sfrac{1}{2}} \big\|_2^2 + \frac{1}{n_{\mtx{\Omega}}} \big\| \mtx{\Lambda}_2^{\sfrac{1}{2}} \big\|_F^2 \Big).
447447
% \label{equ:spectral-norm-bound-applied}
448448
\]
449449
%The $q$-th moment of the second term can be processed with standard matrix norm manipulations and the stochastic independence of $\mtx{\Omega}_1$ and $\mtx{\Omega}_2$ to
@@ -478,19 +478,19 @@ \subsection{Nyström++ estimator for parameter-dependent matrices}
478478
Inserting this inequality and~\refequ{equ:pinv-spectral-norm-bound} into~\refequ{equ:nystrom-proof-processed-tail} gives
479479
\begin{equation}
480480
\mathbb{E}^{\sfrac{n_{\mtx{\Omega}}}{4}}\left[ \lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \mtx{\Omega}_2 \mtx{\Omega}_1^{\dagger} \rVert _{(4)}^2 \right]
481-
\leq \textcolor{green}{\frac{15 e^4}{16}} \sqrt{n_{\mtx{\Omega}}} \Big( \textcolor{green}{2} \lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \rVert _2^2 + \frac{1}{n_{\mtx{\Omega}}} \lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \rVert _F^2 \Big).
481+
\leq \frac{15 e^4}{16} \sqrt{n_{\mtx{\Omega}}} \Big( 2 \lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \rVert _2^2 + \frac{1}{n_{\mtx{\Omega}}} \lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \rVert _F^2 \Big).
482482
%\leq \sqrt{k} \frac{e^4}{2} \frac{(k + n_{\mtx{\Omega}})(2p + n_{\mtx{\Omega}})}{(n_{\mtx{\Omega}} - k + 1)^2} \left( 3 \lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \rVert _2^2 + \frac{1}{n_{\mtx{\Omega}}} \lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \rVert _F^2 \right).
483483
\end{equation}
484484
Bounding $\lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \rVert _2^2 = \lambda_{k+1} \leq \Trace(\mtx{B})/k$ and $\lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \rVert _F^2 = \Trace(\mtx{\Lambda}_2) \le \Trace(\mtx{B})$ (recall that $k = n_{\mtx{\Omega}}/2$) yields
485485
\begin{equation}
486486
\mathbb{E}^{\sfrac{n_{\mtx{\Omega}}}{4}}\left[ \lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \mtx{\Omega}_2 \mtx{\Omega}_1^{\dagger} \rVert _{(4)}^2 \right]
487-
\leq \textcolor{green}{\frac{15 e^4}{16}} \sqrt{n_{\mtx{\Omega}}} \Big( \frac{2}{n_{\mtx{\Omega}}} \Trace(\mtx{B}) + \frac{1}{n_{\mtx{\Omega}}} \Trace(\mtx{B}) \Big)
488-
\leq \frac{\textcolor{green}{154}}{\sqrt{n_{\mtx{\Omega}}}} \Trace(\mtx{B}).
487+
\leq \frac{15 e^4}{16} \sqrt{n_{\mtx{\Omega}}} \Big( \frac{2}{n_{\mtx{\Omega}}} \Trace(\mtx{B}) + \frac{1}{n_{\mtx{\Omega}}} \Trace(\mtx{B}) \Big)
488+
\leq \frac{154}{\sqrt{n_{\mtx{\Omega}}}} \Trace(\mtx{B}).
489489
\label{equ:nystrom-proof-tail-bound}
490490
%\leq \sqrt{k} \frac{e^4}{2} \frac{(k + n_{\mtx{\Omega}})(2p + n_{\mtx{\Omega}})}{(n_{\mtx{\Omega}} - k + 1)^2} \left( 3 \lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \rVert _2^2 + \frac{1}{n_{\mtx{\Omega}}} \lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \rVert _F^2 \right).
491491
\end{equation}
492492

493-
Inserting~\refequ{equ:nystrom-proof-tail-bound} along with \refequ{equ:nystrom-proof-frobenius-trace} in \refequ{equ:nystrom-proof-persson-bonud}\textcolor{green}{, letting $c=154$,} and using the triangle inequality for $\mathbb{E}^{\sfrac{n_{\mtx{\Omega}}}{4}}[\cdot]$, we obtain
493+
Inserting~\refequ{equ:nystrom-proof-tail-bound} along with \refequ{equ:nystrom-proof-frobenius-trace} in \refequ{equ:nystrom-proof-persson-bonud}, letting $c=154$, and using the triangle inequality for $\mathbb{E}^{\sfrac{n_{\mtx{\Omega}}}{4}}[\cdot]$, we obtain
494494
\begin{equation} \label{eq:blubber}
495495
\mathbb{E}^{\sfrac{n_{\mtx{\Omega}}}{4}} \left[\lVert \mtx{B} - \Nystr{\mtx{\Omega}}{\mtx{B}} \rVert _F \right]
496496
\leq \mathbb{E}^{\sfrac{n_{\mtx{\Omega}}}{4}} \left[ \lVert \mtx{\Lambda}_2 \rVert _F \right] + \mathbb{E}^{\sfrac{n_{\mtx{\Omega}}}{4}} \left[ \lVert \mtx{\Lambda}_2^{\sfrac{1}{2}} \mtx{\Omega}_2 \mtx{\Omega}_1^{\dagger} \rVert _{(4)}^2 \right]
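The constant $c = 154$ in the hunk above comes from the last step of \refequ{equ:nystrom-proof-tail-bound}: the bracket $\frac{2}{n_{\mtx{\Omega}}} + \frac{1}{n_{\mtx{\Omega}}}$ contributes a factor $3$, so the coefficient of $\Trace(\mtx{B}) / \sqrt{n_{\mtx{\Omega}}}$ is $\frac{15 e^4}{16} \cdot 3 = \frac{45 e^4}{16}$. The short Python sketch below, which is an illustration and not part of the commit, checks only this final rounding step; the variable names are hypothetical.

```python
import math

# Constant multiplying sqrt(n_Omega) in the tail bound.
prefactor = 15 * math.e**4 / 16
# (2 / n_Omega + 1 / n_Omega) * Trace(B) = (3 / n_Omega) * Trace(B).
bracket = 2 + 1
# Resulting coefficient of Trace(B) / sqrt(n_Omega).
constant = prefactor * bracket

print(constant)
print(constant <= 154)
```

This only confirms that $\frac{45 e^4}{16}$ rounds up to $154$; it says nothing about the earlier steps of the proof.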

paper/intro.tex

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@ \section{Introduction}
1212

1313
For a \emph{constant} matrix $\mtx{B}$, one of the most popular trace estimators is the Girard-Hutchinson estimator \cite{girard-1989-fast-montecarlo, hutchinson-1990-stochastic-estimator} along with variance reduction techniques~\cite{gambhir-2017-deflation-method, saibaba-2017-randomized-matrixfree, lin-2017-randomized-estimation, meyer-2021-hutch-optimal, persson-2022-improved-variants, chen-2023-krylovaware-stochastic, epperly-2024-xtrace-making}. Suitable extensions to parameter-dependent matrices have been considered, e.g., in~\cite{lin-2017-randomized-estimation,chen-2023-krylovaware-stochastic}, but we are not aware of an analysis providing rigorous justification and insight of these extensions. In passing, we note that dynamic trace estimation~\cite{dharangutte-2024-dynamic-trace,woodruff-2024-optimal-query} is an efficient technique for subsequently estimating the traces of matrices $\mtx{B}(t_1), \dots, \mtx{B}(t_m)$ when the increments $\mtx{B}(t_{i+1}) - \mtx{B}(t_i)$ are relatively small in norm. The potential of dynamic trace estimation appears to be limited in our setting because $\mtx{B}(t)$ may change rapidly close to eigenvalues, with $g_{\sigma}$ approximating a Dirac delta function.
1414

15-
All methods considered in this work are based on the following simple idea: Apply an existing randomized trace estimator to $\Trace(\mtx{B}(t))$ with \emph{constant} random vectors, that is, the same randomization is used for each value of the parameter $t$. \textcolor{green}{For example, the Girard-Hutchinson estimator becomes $n_{\mtx{\Psi}}^{-1} \sum_{j=1}^{n_{\mtx{\Psi}}} \vct{\psi}_j^{\top} \mtx{B}(t) \vct{\psi}_j$ for $n_{\mtx{\Psi}}$ constant Gaussian random vectors $\vct{\psi}_1, \dots, \vct{\psi}_{n_{\mtx{\Psi}}}$.}
15+
All methods considered in this work are based on the following simple idea: Apply an existing randomized trace estimator to $\Trace(\mtx{B}(t))$ with \emph{constant} random vectors, that is, the same randomization is used for each value of the parameter $t$. For example, the Girard-Hutchinson estimator becomes $n_{\mtx{\Psi}}^{-1} \sum_{j=1}^{n_{\mtx{\Psi}}} \vct{\psi}_j^{\top} \mtx{B}(t) \vct{\psi}_j$ for $n_{\mtx{\Psi}}$ constant Gaussian random vectors $\vct{\psi}_1, \dots, \vct{\psi}_{n_{\mtx{\Psi}}}$.
1616
??? STOP HERE ???
1717

1818
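The idea stated in this hunk, namely reusing the same random vectors $\vct{\psi}_1, \dots, \vct{\psi}_{n_{\mtx{\Psi}}}$ for every parameter value $t$, can be illustrated with a minimal pure-Python sketch of the Girard-Hutchinson estimator. Everything below is an assumption made for illustration: the toy family `B_of_t`, the helper names, the sample count and the seed are hypothetical and do not come from the paper or its code.

```python
import random


def make_gaussian_vectors(n, n_psi, seed=0):
    # Draw n_psi standard Gaussian vectors of length n ONCE; they are then
    # reused for every parameter value t ("constant" random vectors).
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n_psi)]


def matvec(B, x):
    # Dense matrix-vector product for a matrix stored as a list of rows.
    return [sum(b_ij * x_j for b_ij, x_j in zip(row, x)) for row in B]


def girard_hutchinson(B, psis):
    # (1 / n_psi) * sum_j psi_j^T B psi_j
    total = 0.0
    for psi in psis:
        B_psi = matvec(B, psi)
        total += sum(p * q for p, q in zip(psi, B_psi))
    return total / len(psis)


def B_of_t(t):
    # Hypothetical symmetric toy family, affine in t, with trace 6 for all t.
    return [
        [2.0 + t, 0.5 * t, 0.0],
        [0.5 * t, 1.0, 0.0],
        [0.0, 0.0, 3.0 - t],
    ]


psis = make_gaussian_vectors(n=3, n_psi=200, seed=0)
estimates = {t: girard_hutchinson(B_of_t(t), psis) for t in (0.0, 0.5, 1.0)}
for t, value in estimates.items():
    print(t, value)
```

Because the vectors are fixed, the estimate inherits the dependence of $\mtx{B}(t)$ on $t$: for this affine toy family, the estimate at $t = 0.5$ is exactly the average of the estimates at $t = 0$ and $t = 1$, up to rounding. Fresh random vectors for each $t$ would not have this property.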

paper/paper.tex

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -40,7 +40,7 @@
4040
\textcolor{red}{TODO for Fabio:
4141
\begin{itemize}
4242
%\item Add author list. Matti first, rest (He/Kressner/Lam) in alphabetical order.
43-
\item The text style is a bit too generous but no need to change this now. \textcolor{green}{Ok! Let's talk about it at some point.}
43+
\item The text style is a bit too generous but no need to change this now.
4444
\end{itemize}
4545
}
4646
\color{blue}
