Newton-Type Optimal Thresholding Algorithms for Sparse Optimization Problems (pdf)

https://link.springer.com/content/pdf/10.1007/s40305-021-00370-9.pdf

Journal of the Operations Research Society of China
https://doi.org/10.1007/s40305-021-00370-9

Newton-Type Optimal Thresholding Algorithms for Sparse Optimization Problems

Nan Meng¹ · Yun-Bin Zhao²

Received: 6 April 2021 / Revised: 8 September 2021 / Accepted: 9 September 2021
© The Author(s) 2022

Abstract

Sparse signals can possibly be reconstructed by an algorithm that merges a traditional nonlinear optimization method with a certain thresholding technique. Unlike existing thresholding methods, a novel thresholding technique referred to as optimal k-thresholding was recently proposed by Zhao (SIAM J Optim 30(1):31–55, 2020). This technique simultaneously performs the minimization of an error metric for the problem and the thresholding of the iterates generated by the classic gradient method. In this paper, we propose the so-called Newton-type optimal k-thresholding (NTOT) algorithm, which is motivated by the appreciable performance of both Newton-type methods and the optimal k-thresholding technique for signal recovery. The guaranteed performance (including convergence) of the proposed algorithms is shown in terms of suitable choices of the algorithmic parameters and the restricted isometry property (RIP) of the sensing matrix, which has been widely used in the analysis of compressive sensing algorithms. The simulation results based on synthetic signals indicate that the proposed algorithms are stable and efficient for signal recovery.

Keywords Compressed sensing · Sparse optimization · Newton-type methods · Optimal k-thresholding · Restricted isometry property

Mathematics Subject Classification 90C30 · 90C25 · 65F10 · 94A12 · 15A29

This paper is dedicated to the late Professor Duan Li in commemoration of his contributions to optimization, financial engineering, and risk management. The work was funded by the National Natural Science Foundation of China (No. 12071307).
1 School of Mathematics, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK
2 Shenzhen Research Institute of Big Data, Chinese University of Hong Kong, Shenzhen 518172, Guangdong, China

1 Introduction

The sparse optimization problem arises naturally in a wide range of practical scenarios such as compressed sensing [1–4], signal and image processing [5–7], pattern recognition [8], and wireless communications [9]. The typical problem of signal recovery via compressed sensing can be formulated as the following sparse optimization problem:

$$\min_{x} \left\{ \|y - Ax\|_2^2 : \|x\|_0 \le k \right\}, \tag{1}$$

where $k$ is a given integer reflecting the sparsity level of the target signal $x^*$, $A \in \mathbb{R}^{m \times n}$ is a measurement matrix with $m \ll n$, $\|x\|_0$ is the so-called $\ell_0$-norm counting the nonzeros of the vector $x$, and $y$ is the vector of acquired measurements of the signal $x^*$ to be recovered. The vector $y$ is usually represented as $y = Ax^* + \eta$, where $\eta$ denotes a noise vector. Developing effective algorithms for the model (1) is fundamentally important in signal recovery.

At the current stage of development, the main algorithms for solving sparse optimization problems can be categorized into several classes: convex optimization, heuristic algorithms, thresholding algorithms, and Bayesian methods. The typical convex optimization methods include $\ell_1$-minimization [10,11], reweighted $\ell_1$-minimization [12,13], and dual-density-based reweighted $\ell_1$-minimization [4,14,15]. The widely used heuristic algorithms include orthogonal matching pursuit (OMP) [16,17], subspace pursuit (SP) [18], and compressive sampling matching pursuit (CoSaMP) [19,20]. Depending on the thresholding strategy, the thresholding methods can be roughly classified as soft thresholding [21,22], hard thresholding (e.g., [23–27]), and the so-called optimal thresholding methods [28,29]. Hard thresholding is the simplest thresholding approach used to generate iterates satisfying the constraint of the problem (1).
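To make the measurement model and the problem (1) concrete, the following is a minimal sketch in Python with NumPy. It builds a synthetic instance $y = Ax^* + \eta$ and evaluates the objective and the sparsity constraint. The function names (`make_instance`, `objective`, `is_feasible`), the Gaussian measurement matrix scaled by $1/\sqrt{m}$, and the problem sizes are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def make_instance(m, n, k, noise_level=0.0, seed=0):
    """Build a synthetic instance y = A x* + eta (all choices are assumptions)."""
    rng = np.random.default_rng(seed)
    # Assumed: Gaussian measurement matrix with columns of roughly unit norm.
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    # k-sparse target signal with a random support.
    x_star = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x_star[support] = rng.standard_normal(k)
    # Noise vector eta; noise_level = 0 gives exact measurements.
    eta = noise_level * rng.standard_normal(m)
    return A, A @ x_star + eta, x_star

def objective(A, y, x):
    """Objective of problem (1): the squared residual ||y - A x||_2^2."""
    r = y - A @ x
    return float(r @ r)

def is_feasible(x, k):
    """Constraint of problem (1): x has at most k nonzero entries."""
    return bool(np.count_nonzero(x) <= k)

if __name__ == "__main__":
    A, y, x_star = make_instance(m=40, n=100, k=5, noise_level=0.01)
    print("objective at x*:", objective(A, y, x_star))
    print("x* feasible:", is_feasible(x_star, 5))
```

In the noiseless case the target signal is feasible and attains the objective value zero, which is why recovery algorithms are judged by how closely their iterates approach $x^*$.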
Throughout the paper, we use $H_k(\cdot)$ to denote the hard thresholding operator, which retains the $k$ largest-magnitude entries of a vector and zeroes out the others. The following iterative hard thresholding (IHT) scheme

$$x^{p+1} = H_k\left(x^p + \lambda A^T (y - Ax^p)\right),$$

where $\lambda > 0$ is a stepsize, was first studied in [23,30]. Incorporating a pursuit step (least-squares step) into IHT yields the hard thresholding pursuit (HTP) [26,31], and replacing $\lambda$ by an adaptive stepsize similar to the one used in traditional conjugate methods leads to the so-called normalized iterative hard thresholding (NIHT) algorithms in [24,32]. The theoretical performance of these algorithms can be analyzed in terms of the restricted isometry property (RIP) (see, e.g., [3,23,30]).

On the other hand, the search direction $A^T (y - Ax^p)$ of the above-mentioned algorithms is, up to a positive constant factor, the negative gradient of the objective function of the problem (1). Such a search direction can be replaced by another direction provided that it is a descent direction of the objective function. Thus, a Newton-type direction was studied in [27,33,34]. The following iterative method is proposed and referred to as Newton-step-based iterative hard thresholding (NSIHT) in [27]:

$$x^{p+1} = H_k\left(x^p + \lambda \left(A^T A + \epsilon I\right)^{-1} A^T (y - Ax^p)\right), \tag{2}$$

where $\epsilon > 0$ is a parameter and $\lambda > 0$ is the stepsize.

However, as pointed out in [28,29], the weakness of the hard thresholding operator $H_k(\cdot)$ is that, when applied to a non-sparse iterate generated by the classic gradient method, it may increase the objective value of (1) at the thresholded vector compared with the objective value at its unthresholded counterpart. As a result, applying the hard thresholding operator directly to a non-sparse or non-compressible vector in the course of an algorithm may lead to significant numerical oscillation and divergence of the algorithm.
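The IHT scheme and the NSIHT scheme (2) can be sketched in a few lines of Python with NumPy. This is a minimal illustration rather than the authors' implementation: the function names (`hard_threshold`, `iht_step`, `nsiht_step`), the stepsizes, the value of $\epsilon$, and the problem sizes in the demo are all assumptions. The Newton-type direction is obtained by solving a linear system instead of forming the inverse explicitly, which is a common numerical choice.

```python
import numpy as np

def hard_threshold(v, k):
    """H_k: keep the k largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    if k > 0:
        idx = np.argpartition(np.abs(v), -k)[-k:]
        out[idx] = v[idx]
    return out

def iht_step(A, y, x, k, lam):
    """One IHT iteration: x+ = H_k(x + lam * A^T (y - A x))."""
    return hard_threshold(x + lam * (A.T @ (y - A @ x)), k)

def nsiht_step(A, y, x, k, lam, eps):
    """One NSIHT iteration (2): x+ = H_k(x + lam * (A^T A + eps I)^{-1} A^T (y - A x))."""
    n = A.shape[1]
    g = A.T @ (y - A @ x)
    # Solve (A^T A + eps I) d = g rather than inverting the matrix.
    d = np.linalg.solve(A.T @ A + eps * np.eye(n), g)
    return hard_threshold(x + lam * d, k)

if __name__ == "__main__":
    # Assumed synthetic setup: Gaussian matrix, 5-sparse signal, no noise.
    rng = np.random.default_rng(0)
    m, n, k = 40, 100, 5
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    x_star = np.zeros(n)
    x_star[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
    y = A @ x_star

    lam_iht = 1.0 / np.linalg.norm(A, 2) ** 2  # assumed conservative stepsize
    x_iht = np.zeros(n)
    x_ns = np.zeros(n)
    for _ in range(50):
        x_iht = iht_step(A, y, x_iht, k, lam_iht)
        x_ns = nsiht_step(A, y, x_ns, k, lam=1.0, eps=1.0)
    print("IHT residual:  ", np.linalg.norm(y - A @ x_iht))
    print("NSIHT residual:", np.linalg.norm(y - A @ x_ns))
```

Both steps differ only in the search direction fed to $H_k(\cdot)$, so any weakness of hard thresholding on non-sparse vectors is inherited by both schemes.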
To overcome this drawback of the hard thresholding operator, Zhao [28] proposed an optimal k-thresholding technique which makes it possible to perform thresholding and objective-value reduction simultaneously. The optimal k-thresholding iterative scheme in [28] can be simply stated as

$$x^{p+1} = Z_k^{\#}\left(x^p + \lambda A^T (y - Ax^p)\right),$$

where $\lambda$ remains a stepsize, and $Z_k^{\#}(\cdot)$ is the so-called optimal k-thresholding operator. Given a vector $u$, the thresholded vector is $Z_k^{\#}(u) = u \otimes w^*$ (the Hadamard product of two vectors), where the vector $w^*$ is the optimal solution to the following quadratic 0-1 optimization problem:

$$w^* := \arg\min_{w} \left\{ \|y - A(u \otimes w)\|_2^2 : e^T w = k, \ w \in \{0,1\}^n \right\},$$

where $e = (1, \dots, 1)^T \in \mathbb{R}^n$ is the vector of ones, and $\{0,1\}^n$ denotes the set of $n$-dimensional 0-1 vectors. To avoid solving such a binary optimization problem, an alternative approach is to solve its convex relaxation which, as pointed out in [28,29], is the tightest convex relaxation of the above problem:

$$w := \arg\min_{w} \left\{ \|y - A(u \otimes w)\|_2^2 : e^T w = k, \ 0 \le w \le e \right\}. \tag{3}$$

Based on the convex relaxation of the operator $Z_k^{\#}(\cdot)$, efficient algorithms calle (...truncated)