Iterative refinement for symmetric eigenvalue decomposition II: clustered eigenvalues (pdf)

Article PDF cannot be displayed. You can download it here:

https://link.springer.com/content/pdf/10.1007%2Fs13160-019-00348-4.pdf

Iterative refinement for symmetric eigenvalue decomposition II: clustered eigenvalues

Japan Journal of Industrial and Applied Mathematics July 2019, Volume 36, Issue 2, pp 435–459 | Cite as Iterative refinement for symmetric eigenvalue decomposition II: clustered eigenvalues AuthorsAuthors and affiliations Takeshi OgitaKensuke Aishima Open Access Original Paper First Online: 22 February 2019 390 Downloads Abstract We are concerned with accurate eigenvalue decomposition of a real symmetric matrix A. In the previous paper (Ogita and Aishima in Jpn J Ind Appl Math 35(3): 1007–1035, 2018), we proposed an efficient refinement algorithm for improving the accuracy of all eigenvectors, which converges quadratically if a sufficiently accurate initial guess is given. However, since the accuracy of eigenvectors depends on the eigenvalue gap, it is difficult to provide such an initial guess to the algorithm in the case where A has clustered eigenvalues. To overcome this problem, we propose a novel algorithm that can refine approximate eigenvectors corresponding to clustered eigenvalues on the basis of the algorithm proposed in the previous paper. Numerical results are presented showing excellent performance of the proposed algorithm in terms of convergence rate and overall computational cost and illustrating an application to a quantum materials simulation. KeywordsAccurate numerical algorithm Iterative refinement Symmetric eigenvalue decomposition Clustered eigenvalues This study was partially supported by CREST, JST and JSPS KAKENHI Grant numbers 16H03917, 25790096. Mathematics Subject Classification65F15 15A18 15A23 1 Introduction Let A be a real symmetric $n \times n$ matrix. Since solving a standard symmetric eigenvalue problem $Ax = \lambda x$, where $\lambda \in {\mathbb {R}}$ is an eigenvalue of A and $x \in {\mathbb {R}}^{n}$ is an eigenvector of A associated with $\lambda $, is ubiquitous in scientific computing, it is important to develop reliable numerical algorithms for calculating eigenvalues and eigenvectors accurately. Excellent overviews on the symmetric eigenvalue problem can be found in references [20, 23]. We are concerned with the eigenvalue decomposition of A such that $$\begin{aligned} A = {X}{D}{X}^{\mathrm {T}}, \end{aligned}$$ (1) where ${X}$ is an $n \times n$ orthogonal matrix whose ith columns are eigenvectors $x_{(i)}$ of A (called an eigenvector matrix) and ${D} = (d_{ij})$ is an $n \times n$ diagonal matrix whose diagonal elements are the corresponding eigenvalues $\lambda _{i} \in {\mathbb {R}}$, i.e., $d_{ii} = \lambda _{i}$ for $i = 1, \ldots , n$. Throughout the paper, we assume that $$\begin{aligned} \lambda _{1} \le \lambda _{2} \le \cdots \le \lambda _{n}, \end{aligned}$$ and the columns of ${X}$ are ordered correspondingly. We here collect notation used in this paper. Let I and O denote the identity matrix and the zero matrix of appropriate size, respectively. Unless otherwise specified, $\Vert \cdot \Vert $ means $\Vert \cdot \Vert _{2}$, which denotes the Euclidean norm for vectors and the spectral norm for matrices. For legibility, if necessary, we distinguish between the approximate quantities and the computed results, e.g., for some quantity $\alpha $ we write $\widetilde{\alpha }$ and $\widehat{\alpha }$ as an approximation of $\alpha $ and a computed result for $\alpha $, respectively. The accuracy of an approximate eigenvector depends on the gap between the corresponding eigenvalue and its nearest neighbor eigenvalue (cf., e.g., [20, Theorem 11.7.1]). For simplicity, suppose all eigenvalues of A are simple. Let $\widehat{X} \in {\mathbb {R}}^{n \times n}$ be an approximation of ${X}$. Let $z_{(i)}:=\widehat{x}_{(i)}/\Vert \widehat{x}_{(i)}\Vert $ for $i = 1, 2, \ldots , n$, where $\widehat{x}_{(i)}$ are the i-th columns of $\widehat{X}$. Moreover, for each i, suppose that the Ritz value $\mu _{i}:=z_{(i)}^{\mathrm {T}}Az_{(i)}$ is closer to $\lambda _{i}$ than to any other eigenvalues. Let $ gap (\mu _{i})$ denote the smallest difference between $\mu _{i}$ and any other eigenvalue, i.e., $ gap (\mu _{i}) := \min _{j \ne i}|\mu _{i}-\lambda _{j}|$. Then, it holds for all i that $$\begin{aligned} |\sin \theta (x_{(i)},z_{(i)})| \le \frac{\Vert Az_{(i)}-\mu _{i}z_{(i)}\Vert }{ gap (\mu _{i})}, \quad \theta (x_{(i)},z_{(i)}):=\mathrm{arc} \cos (|x_{(i)}^{\mathrm {T}}z_{(i)}|) . \end{aligned}$$ Suppose $\widehat{X}$ is obtained by some backward stable algorithm with the relative rounding error unit ${\mathbf {u}}$ in floating-point arithmetic. For example, ${\mathbf {u}}= 2^{-53}$ for IEEE 754 binary64. Then, there exists $\varDelta _{A}^{(i)}$ such that $$\begin{aligned} (A + \varDelta _{A}^{(i)})z_{(i)} = \mu _{i}z_{(i)}, \quad \Vert \varDelta _{A}^{(i)}\Vert = {\mathcal {O}}(\Vert A\Vert {\mathbf {u}}), \end{aligned}$$ which implies $\Vert Az_{(i)}-\mu _{i}z_{(i)}\Vert = {\mathcal {O}}(\Vert A\Vert {\mathbf {u}})$, and hence, for all i, $$\begin{aligned} |\sin \theta (x_{(i)},z_{(i)})| \le \alpha _{i}, \quad \alpha _{i} = {\mathcal {O}}\left( \frac{\Vert A\Vert {\mathbf {u}}}{ gap (\mu _{i})}\right) . \end{aligned}$$ (2) The smaller the eigenvalue gap, the worse the accuracy of a computed eigenvector. Therefore, refinement algorithms for eigenvectors are useful for obtaining highly accurate results. For example, highly accurate computations of a few or all eigenvectors are crucial for large-scale electronic structure calculations in material physics [24, 25], in which specific interior eigenvalues with associated eigenvectors need to be computed. On related work on refinement algorithms for symmetric eigenvalue decomposition, see the previous paper [17] for details. In [17], we proposed a refinement algorithm for the eigenvalue decomposition of A, which works not for an individual eigenvector but for all eigenvectors. Since the algorithm is based on Newton’s method, it converges quadratically, provided that an initial guess is sufficiently accurate. In practice, although the algorithm refines computed eigenvectors corresponding to sufficiently separated simple eigenvalues, it cannot refine computed eigenvectors corresponding to “nearly” multiple eigenvalues. This is because it is difficult for standard numerical algorithms in floating-point arithmetic to provide sufficiently accurate initial approximate eigenvectors corresponding to nearly multiple eigenvalues as shown in (2). The purpose of this paper is to remedy this problem, i.e., we aim to develop a refinement algorithm for the eigenvalue decomposition of a symmetric matrix with clustered eigenvalues. We briefly explain the idea of our proposed algorithm. We focus on the so-called $\sin \theta $ theorem by Davis–Kahan [5, Section 2] as follows. For an index set ${\mathcal {J}}$ with $|{\mathcal {J}}|=\ell < n$, let $X_{{\mathcal {J}}} \in {\mathbb {R}}^{n\times \ell }$ denote the eigenvector matrix comprising $x_{(j)}$ for all (...truncated)