Infimal Convolution Regularisation Functionals of BV and \(\varvec{\mathrm {L}}^{\varvec{p}}\) Spaces

Journal of Mathematical Imaging and Vision, Feb 2016

We study a general class of infimal convolution type regularisation functionals suitable for applications in image processing. These functionals incorporate a combination of the total variation seminorm and \(\mathrm {L}^{p}\) norms. A unified well-posedness analysis is presented and a detailed study of the one-dimensional model is performed, by computing exact solutions for the corresponding denoising problem and the case \(p=2\). Furthermore, the dependency of the regularisation properties of this infimal convolution approach to the choice of p is studied. It turns out that in the case \(p=2\) this regulariser is equivalent to the Huber-type variant of total variation regularisation. We provide numerical examples for image decomposition as well as for image denoising. We show that our model is capable of eliminating the staircasing effect, a well-known disadvantage of total variation regularisation. Moreover as p increases we obtain almost piecewise affine reconstructions, leading also to a better preservation of hat-like structures.

A PDF file should load here. If you do not see its contents the file may be temporarily unavailable at the journal website or you do not have a PDF plug-in installed and enabled in your browser.

Alternatively, you can download the file locally and open with any standalone PDF reader:

https://link.springer.com/content/pdf/10.1007%2Fs10851-015-0624-6.pdf

Infimal Convolution Regularisation Functionals of BV and \(\varvec{\mathrm {L}}^{\varvec{p}}\) Spaces

J Math Imaging Vis Infimal Convolution Regularisation Functionals of BV and L p Spaces Part I: The Finite p Case 0 1 2 3 4 Martin Burger 0 1 2 3 4 Konstantinos Papafitsoros 0 1 2 3 4 Evangelos Papoutsellis 0 1 2 3 4 Carola-Bibiane Schönlieb 0 1 2 3 4 0 Department of Applied Mathematics and Theoretical Physics, University of Cambridge , Cambridge , UK 1 Konstantinos Papafitsoros 2 Martin Burger 3 Evangelos Papoutsellis completed his undergraduate studies at the School of Applied Mathematics and Physical Sciences in National Technical University of Athens , obtaining a 5 year diploma in 4 Konstantinos Papafitsoros received a 5-year diploma degree from the School of Applied Mathematics and Physical Sciences, National Technical University of Athens, in 2007. He holds two M.Sc. degrees, one in Pure Mathematics from the University of Athens in 2009 and one in Mathematical Modelling and Scientific Computing from the University of Oxford in 2010. in 2014 from the University of Cambridge where he also stayed as an EPSRC Doctoral Prize fellow at the Department of Applied Mathematics and Theoretical Physics until We study a general class of infimal convolution type regularisation functionals suitable for applications in image processing. These functionals incorporate a combination of the total variation seminorm and L p norms. A unified well-posedness analysis is presented and a detailed study of the one-dimensional model is performed, by computing exact solutions for the corresponding denoising problem and the case p = 2. Furthermore, the dependency of the regularisation properties of this infimal convolution approach to the choice of p is studied. It turns out that in the case p = 2 this regulariser is equivalent to the Huber-type variant of total variation regularisation. We provide numerical examples for image decomposition as well as for image denoising. We show that our model is capable of eliminating the staircasing effect, a well-known disadvantage of total variation regularisation. Moreover as p increases we obtain almost piecewise affine reconstructions, leading also to a better preservation of hat-like structures. Total Variation; Infimal convolution; Denoising; Staircasing; L p norms; Image decomposition - 1 Institute for Computational and Applied Mathematics, University of Münster, Münster, Germany 2 Institute for Mathematics, Humboldt University of Berlin, Berlin, Germany 1 Introduction In this paper, we introduce a family of novel TV–L p infimal convolution type functionals with applications in image processing: TVLαp,β (u) := w∈iLnpf( ) α Du − w α, β > 0 and p > 1. M + β w Lp( ), (1.1) Here · M denotes the Radon norm of a measure. The functional (1.1) is suitable to be used as a regulariser in the context of variational non-smooth regularisation in imaging applications. We study the properties of (1.1), its regularising mechanism for different values of p and apply it successfully to image denoising. 1.1 Context After the introduction of the total variation (TV) for image reconstruction purposes [ 38 ], the use of non-smooth regularisers has become increasingly popular during the last decades (cf. [ 7 ]). They are typically used in the context of variational regularisation, where the reconstructed image is obtained as a solution of a minimisation problem of the type: 1 min u s f − T u sLs ( ) + (u). The regulariser is denoted here by . We assume that the data f , defined on an open, bounded and connected domain ⊂ R2, have been corrupted through a bounded, linear (1.2) operator T and additive (random) noise. Different values of s can be considered for the first term of (1.2), the fidelity term. For example, models incorporating an L2 fidelity term (resp. L1) have been shown to be efficient for the restoration of images corrupted by Gaussian noise (resp. impulse noise). Of course, other types of noise can also be considered and in those cases the form of the fidelity term is adjusted accordingly. Typically, one or more parameters within balance the strength of regularisation against the fidelity term in the minimisation (1.2). The advantage of using non-smooth regularisers is that the regularised images have sharp edges (discontinuities). For instance, it is a well-known fact that TV regularisation promotes piecewise constant reconstructions, thus preserving discontinuities. However, this also leads to blocky-like artefacts in the reconstructed image, an effect known as staircasing. Recall at this point that for two-dimensional images u ∈ L1( ), the definition of the total variation functional reads TV(u) := sup u divφ dx : φ ∈ Cc∞( , R2), φ ∞ ≤ 1 . (1.3) The total variation uses only first-order derivative information in the regularisation process. This can be seen from the fact that for TV(u) < ∞ the distributional derivative Du is a finite Radon measure and TV(u) = Du M. Moreover if u ∈ W1,1( ) then TV(u) = |∇u| dx , i.e. the total variation is the L1 norm of the gradient of u. Higherorder extensions of the total variation functional are widely explored in the literature e.g. [ 4,5,9,11,12,27,29,30,34 ]. The incorporation of second-order derivatives is shown to reduce or even eliminate the staircasing effect. The most successful regulariser of this kind is the second-order total generalised variation (TGV) introduced by Bredies et al. [ 5 ]. Its definition reads TGV2α,β (u) := w∈mBDin( ) α Du − w M +β E w M. (1.4) Here α, β are positive parameters and BD( ) is the space of functions of bounded deformation, i.e. the space of all L1( ) functions w, whose symmetrised distributional derivative E w is a finite Radon measure. This is a less regular space than the usual space of functions of bounded variation BV( ) for which the full gradient Du is required to be a finite Radon measure. Note that if the variable w in the definition (1.4) is forced to be the gradient of another function then we obtain the classical infimal convolution regulariser of Chambolle– Lions [ 9 ]. In that sense TGV can be seen as a particular instance of infimal convolution, optimally balancing first and second-order information. In the discrete formulation of TGV (as well as for TV) the Radon norm is interpreted as an L1 norm. The motivation for the current and the follow-up paper [ 8 ] is to explore the capabilities of L p norms within first-order regularisation functionals designed for image processing purposes. The use of L p norms for p > 1 has been exploited in different contexts—infinity and p-Laplacian (cf. e.g. [ 16 ] and [ 26 ] respectively). 1.2 Our Contribution Comparing the definition (1.1) with the definition of TGV in (1.4), we see that the Radon norm of the symmetrised gradient of w has been substituted by the L p norm of w, thus reducing the order of regularisation. Up to our knowledge, this is the first paper that provides a thorough analysis of TV–L p infimal convolution models (1.1) in this generality. We show that the minimisation in (1.1) is well-defined and that TVLαp,β (u) < ∞ if and only if TV(u) < ∞. Hence p TVLα,β regularised images belong to BV( ) as desired. In order to get more insight in the regularising mechanism p of the TVLα,β functional we provide a detailed and rigorous analysis of its one-dimensional version of the corresponding L2 fidelity denoising problem For the denoising problem (1.5) with p = 2 we also compute exact solutions for simple one-dimensional data. We show that the obtained solutions are piecewise smooth, in contrast to TV (piecewise constant) and TGV (piecewise affine) solutions. Moreover, we show that for p = 2, the 2-homogeneous analogue of the functional (1.1) TVL2α−,βhom (u) = min α Du − w w∈L2( ) β M + 2 w 2L2( ), (1.5) (1.6) is equivalent to a variant of Huber TV [ 24 ], with the functional (1.6) having a close connection to (1.1) itself. Huber total variation is a smooth approximation of total variation and even though it has been widely used in the imaging and inverse problems community, it has not been analysed adequately. Hence, as a by-product of our analysis, we compute exact solutions of the one-dimensional Huber TV denoising p problem. An analogous connection of the TVLα,β functional with a generalised Huber TV regularisation is also established for general p. We proceed with exhaustive numerical experiments focusing on (1.5). Our analysis is confirmed by the fact that the analytical results coincide with the numerical ones. Furthermore, we observe that even though a first-order regularisation functional is used, we are capable of eliminating the staircasing effect, similar to Huber TV. By using the Bregman iteration version of our method [ 32 ], we are also able to enhance the contrast of the reconstructed images, obtaining results very similar in quality to the TGV ones. We observe numerically that high values of p promote almost affine structures similar to second-order regularisation methods. We shed more light of this behaviour in the follow-up paper [ 8 ] where we study in depth the case p = ∞. Let us finally note that we also consider a modified version of the functional (1.1) where w is restricted to be a gradient of another function leading to the more classical infimal convolution setting. Even though, this modified model is not so successful in staircasing reduction, it is effective in decomposing an image into piecewise constant and smooth parts. 1.3 Organisation of the Paper After the introduction we proceed with the introduction of our model in Sect. 2. We prove the well-posedness of (1.1), we provide an equivalent definition and we prove its Lipschitz equivalence with the TV seminorm. We finish this section p with a well-posedness result of the corresponding TVLα,β regularisation problem using standard tools. In Sect. 3 we establish a link between the TVLαp,β functional and its p-homogeneous analogue (using the p-th power of · Lp( )). The p-homogeneous functional (for p = 2) is further shown to be equivalent to Huber total variation, while analogous results are obtained for p = 2. We study the corresponding one-dimensional model in Sect. 4 focusing on the L2 fidelity denoising case. More specifically, after deriving the optimality conditions using Fenchel–Rockafellar duality in Sect. 4.1, we explore the structure of solutions in Sect. 4.2. In Sect. 4.3 we compute exact solutions for the case p = 2, considering a simple step function as data. In Sect. 5 we present a variant of our model suitable for image decomposition purposes, i.e. geometric decomposition into piecewise constant and smooth structures. Section 6 focuses on numerical experiments. Confirmation of the obtained one-dimensional analytical results is done in Sect. 6.2, while two-dimensional denoising experiments are performed in Sect. 6.3 using the split Bregman method. There, we show that our approach can lead to elimination of the staircasing effect and we also show that by using a Bregmanised version we can also enhance the contrast, achieving results very close to TGV, a method considered state of the art in the context of variational regularisation. We finish the section with some image decomposition examples and we summarise our results in Sect. 7. In the appendix, we remind the reader of some basic facts from the theory of Radon measures and BV functions. p 2 Basic Properties of the TVLα,β Functional In this section, we introduce the TVLαp,β functional (1.1) as well as some of its main properties. For α, β > 0 and 1 < p ≤ ∞, we define TVLαp,β : L1( ) → R (where R := R ∪ {+∞}) as follows: TVLαp,β (u) := w∈mLipn( ) α Du − w M + β w Lp( ) . While in the present paper we mainly focus on the finite p case, the results of this section are stated and proved for p = ∞ as well, since the proofs are similar. The next proposition asserts that the minimisation in (1.1) is indeed well-defined. We omit the proof, which is based on standard coercivity and weak lower semicontinuity techniques: Proposition 2.1 Let u ∈ BV( ) with 1 < p ≤ ∞ and α, β > 0. Then the minimum in the definition (1.1) is attained. Another useful formulation of the definition (1.1) is the dual formulation: TVLαp,β (u) = sup u divφ dx : φ ∈ Cc1( ), φ ∞ ≤ α, φ Lq ( ) ≤ β , (2.1) where q denotes here the conjugate exponent of p, see (8.4). The following proposition shows that the two expressions coincide indeed. Recall first that for a functional F : X → R the effective domain is defined as dom F = {x ∈ X : F (x ) < ∞}, while the indicator and characteristic functions of A ⊆ X are defined as IA(x ) = 0, if x ∈ A, ∞, if x ∈/ A, and XA(x ) = 1, if x ∈ A, 0, if x ∈/ A, F (x ). respectively. As usual, we denote by ·, · the duality product of X and its dual X ∗. Finally, recall that the convex conjugate F ∗ : X ∗ → R of F is defined as F ∗(x ∗) = sup x ∗, x − x∈X Proposition 2.2 Let u ∈ BV( ) and 1 < p ≤ ∞ then Proof First notice that in (2.1), we can replace C 1( ) by c C 1( ), since C 1( ) = C 1( ) with the closure taken with 0 c 0 respect to the C 1 norm, i.e. φ = max φ ∞ , ∇φ ∞ . We define Then, we can rewrite (2.1) as TVLαp,β (u) = − inf φ∈X φ φLq∞( ≤) ≤αβ − u divφ dx We can establish the following relation inf {F1(φ) + F2(φ)} + wm∈iXn∗ F1∗(w) + F2∗(w) = 0, φ∈X i.e. the absence of duality gap between the primal and the dual problems [15, Chapter III], provided that the set is a closed vector space [ 2 ]. This is indeed true since on one hand we have and on the other hand, for every φ ∈ X , we can write φ = λ(λ−1φ − 0) with λ−1φ ∞ ≤ α and 0 ∈ dom F1. Since u ∈ BV( ), then TVLαp,β is finite, see also Proposition 2.4. Hence, F1∗(w) < ∞, F2∗(w) < ∞ and F1∗(−w) = w, φ = w, φ = β w Lp( ) , = − inf {F1(φ) + F2(φ)} . φ∈X sup φ∈C1( ) 0 φ Lq ( )≤β sup φ∈Lq ( ) φ Lq ( )≤β sup φ∈C01( ) φ ∞≤α where we have used the fact that, with a density argument, the function w : C01 → R can be extended in the whole Lq ( ) as a bounded, linear functional and using the Riesz representation theorem, we deduce that it is actually an L p function. Similarly, we have F2∗(w) = w, φ + u, divφ = α Du − w M . Thus the desired equality is proven. Remark 2.3 Note that using the dual formulation of TVLαp,β one can easily derive that the functional is lower semicontinuous with respect to the strong L1 topology since it is a pointwise supremum of continuous functions. The following lemma shows that the TVLαp,β functional is Lipschitz equivalent to the total variation. Proposition 2.4 Let u ∈ L1( ) and 1 < p ≤ ∞. Then TVLαp,β (u) < ∞ if and only if u ∈ BV( ). Moreover there exist constants C1 = α and C2 = (CC˜ )−1, where C = 1 max 1, | | q and C˜ = max α1 , β1 such that C2 Du M ≤ TVLαp,β (u) ≤ C1 Du M , for all u ∈ BV( ). (2.2) Finally in the special case where (2.3) (2.4) (2.5) (2.6) TVLαp,β (u) = α Du M , for all u ∈ BV( ). Proof Let u ∈ BV( ). Using the definition (1.1) we have that TVLαp,β (u) ≤ α Du − w M + β w Lp( ) , for every w ∈ L p( ). Setting w = 0 and C1 = α, we obtain TVLαp,β (u) ≤ C1 Du M . For the other direction, for any w ∈ L p( ) ⊂ L1( ), by the triangle inequality we get Du M ≤ α Du and thus minimising again over w and combining (2.5) we get (2.4). Notice that when (2.3) holds then the above proposition implies that w = 0 is an admissible solution to the definition of TVLαp,β (u), i.e. 0 ∈ argmin α Du − w w∈Lp( ) for all u ∈ BV( ). However, in general we cannot prove that this solution is unique. p Having shown the basic properties of the TVLα,β functional, we can use it as a regulariser for variational imaging problems of the type min u∈Ls ( )∩BV( ) s 1 f − T u sLs ( ) + TVLαp,β (u), s ≥ 1, where T : Ls ( ) → Ls ( ) is a bounded, linear operator and f ∈ Ls ( ). We conclude our analysis with existence and uniqueness results for the minimisation problem (2.7). Theorem 2.5 Let 1 < p ≤ ∞ and f ∈ Ls ( ). If T (X ) = 0 then there exists a solution u ∈ Ls ( ) ∩ BV( ) for the problem (2.7). If s > 1 and T is injective then the solution is unique. Proof The proof is a straightforward application of the direct method of calculus of variations. We simply take advantage of the inequality (2.2) and the compactness theorem in BV( ), see Appendix, along with the lower semicontinuity property of TVLαp,β . We also refer the reader to the corresponding proofs in [ 34,39 ]. Since we are mainly interested in studying the regularising properties of TVLαp,β , in what follows we focus on the case where s = 2 and T is the identity function (denoising task) where rigorous analysis can be carried out. From now on, we also focus on the case where p is finite, as the case p = ∞ is studied in the follow-up paper [ 8 ]. We thus define the following problem f − u 2L2( ) + TVLαp,β (u), 3 The p-Homogeneous Analogue and Relation to Huber TV Before we proceed to a detailed analysis of the onedimensional version of (P), in this section we consider its p-homogeneous analogue u∈BV( ) 21 f − u 2L2( ) + α Du − w M + βp w Lpp( ), 1 < p < ∞. min w∈Lp( ) (P p−hom ) We show in Proposition 3.2 that there is a strong connection between the models (P) and (P p−hom ). The reason for the introduction of (P p−hom ) is that, in certain cases, it is technically easier to derive exact solutions for (P p−hom ) rather than for (P) straightforwardly, see Sect. 4.3. Moreover, we can guarantee the uniqueness of the optimal w∗ in (P p−hom ), since (2.7) w∗ = argmin α Du − w w∈Lp( ) β M + p w Lpp( ) , and thus w∗ is unique as a minimiser of a strictly convex functional. The next proposition states that, unless f is a constant function then the optimal w∗ in (P p−hom ) cannot be zero but nonetheless converges to zero as β → ∞. In essence, this means that one cannot obtain TV type solutions with the p-homogeneous model. Proposition 3.1 Let 1 < p < ∞, f ∈ L2( ) and let (w∗, u∗) be an optimal solution pair of the p-homogeneous problem (P p−hom ). Then w∗ = 0 if and only if f is a constant function. For general data f , we have that w∗ → 0 in L p( ) when β → ∞. Proof It follows immediately that if f is constant then (0, f ) is the optimal pair for (P p−hom ). Suppose that (w∗, u∗) solve (P p−hom ). Notice that in this case we also have Suppose now that w∗ = 0. Then (3.1) becomes u∗ = argmin u∈BV( ) 2 u∗ = argmin u∈BV( ) 2 1 1 from which we take 1 1 2 f − u∗ 2L2( ) ≤ 2 ( f − u∗) − u 2L2( ) β + p 2 0 ≤ 2 β p + p ∇u Lpp( ) ⇔ ∇u Lpp( ). u 2L2( ) − ( f − u∗)u dx By dividing the last inequality by and taking the limit → 0 we have that ( f − u∗)u dx ≤ 0. By considering the analogous perturbations u∗ − u , we obtain similarly that ( f − u∗)u dx ≥ 0 and thus ( f − u∗)u dx = 0, ∀u ∈ Cc∞( ). Hence u∗ = f and by taking the optimality condition of (3.2) we get that 0 ∈ ∂ D(·) M ( f ), which implies that D f = 0, i.e. f is a constant function. For the last part of the proposition, (supposing f = 0), simply observe that for every u ∈ BV( ) and w ∈ L p( ) we have that and thus w∗ Lpp( ) → 0 when β → ∞. We further establish a connection between the 1-homogeneous (P) and the p-homogeneous model (P p−hom ): Proposition 3.2 Let 1 < p < ∞ and f ∈ L2( ) not a constant. A pair (w∗, u∗) is a solution of (P p−hom ) with parameters (α, β p−hom ) if and only if it is also a solution of (P) with parameters (α, β1−hom ) where β1−hom = β p−hom w∗ Lp−p(1 ). Proof Since f is not a constant by the previous proposition we have that w∗ = 0. Note that for an arbitrary function u ∈ BV( ): w∗ ∈ argmin α Du − w w∈Lp( ) M + 0 ∈ α∂ Du − · M (w∗) + β p−hom |w∗| p−2w∗ ⇔ 0 ∈ α∂ Du − · M (w∗) + β p−hom p w Lpp( ) ⇔ β1−hom w∗ Lp−p(1 ) |w∗| p−2w∗ ⇔ w∗ ∈ argmin α Du − w w∈Lp( ) M + β1−hom w Lp( ) . This means that w∗ is an admissible solution for both problems (P) and (P p−hom ), with the corresponding set of parameters (α, β1−hom ) and (α, β p−hom ), respectively. The fact that the same holds for u∗ as well, comes from the fact that in both problems we have 1 u∗ ∈ argmin u∈BV( ) 2 Finally, it turns out that for p = 2, problem (P p−hom ) is essentially equivalent to the widely used Huber total variation regularisation, [ 24 ]. In fact we can show that for 1 < p < ∞ (P p−hom ) is equivalent to a generalised Huber total variation regularisation, see also [ 23 ]. This is proved in the next proposition. Proposition 3.3 Let 1 < p < ∞ and consider the functional TVLαp,−βhom : BV( ) → R with TVLαp,−βhom (u) = w∈mLipn( ) α Du − w β M + p w Lpp( ). (3.3) Then TVLαp,−βhom (u) = ϕ p(∇u) dx + α|Ds u|( ), where ϕ p : Rd → R with ⎧⎪ α|x| − 1 − p 1 ϕp(x) = ⎨ ⎪⎩ βp |x|p, α1 , |x| ≥ λ p−1 |x| ≤ where λ := βα , and Ds u denotes the singular part of the measure Du, cf. Appendix. Proof We have TVLαp,−βhom (u) = w∈mLipn( ) α Du − w it is straightforwardly verified setting λ = β/α that the function belongs to L∞( ) ⊂ L p( ) and solves (3.4) with optimal value equal to α1 ϕ p(∇u) dx . Note that in the special case p = 2 we recover the classical Huber total variation regularisation since ϕ2(x ) = α|x | − 2αβ2 |x | ≥ βα , β2 |x |2, |x | ≤ βα , i.e. in that case we have quadratic penalisation for small gradients ( p–power penalisation for the general 1 < p < ∞) and linear penalisation for large gradients. For the reader’s convenience, in Fig. 1 we have plotted some of the functions ϕ p in order to illustrate how their form changes when their parameters vary. Note for instance in Fig. 1a how φ2 is converging to an absolute type function when β is getting large, i.e. approaching a total variation regularisation. This can also be seen from Proposition 3.1 where the optimal variable w is converging to 0 when β → ∞. On the other hand when p is getting large, Fig. 1b, small gradients are essentially not penalised at all, allowing the gradient to be almost constant, equal to its maximum value, leading to piecewise affine structures. We refer to some of the numerical examples in Sect. 6.2 and also the second part of this paper [ 8 ] where the case p = ∞ is examined in detail. 4 The One-Dimensional Case In order to get more insights into the structure of solutions of the problem (P), in this section we study its one-dimensional version. As above, we focus on the finite p case, i.e. 1 < p < ∞. The case p = ∞ leads to several additional complications and will be subject of a forthcoming paper [ 8 ]. For this section ⊂ R is an open and bounded interval, i.e. = (a, b). Our analysis follows closely the ones in [ 6 ] and [ 33 ] where the one dimensional L1–TGV and L2–TGV problems are studied, respectively. 4.1 Optimality Conditions In this section, we derive the optimality conditions for the one-dimensional problem (P). We initially start our analysis by defining the predual problem (P∗), proving existence and uniqueness for its solutions. We employ again the Fenchel–Rockafellar duality theory in order to find a connection between the solutions of the predual and primal problems. p=1.5 p=2 p=4 p=20 −01.5 β =6 − inf We define the predual problem (P∗) as 1 f φ dx + 2 where as always q is the conjugate exponent of p. Existence and uniqueness for the solutions of (P∗) can be verified by standard arguments: Proposition 4.1 For f ∈ L2( ), the predual problem (P∗) admits a unique solution in H1( ). 0 The next proposition justifies the term predual for the problem (P∗). Proposition 4.2 The dual problem of (P∗) is equivalent to the problem (P) in the sense that (w, u) is a solution of the dual of (P∗) if and only if (w, u) ∈ L p( ) × BV( ) and solves (P). Proof Observe that we can also write down the predual problem (P∗) using the following equivalent formulation: inf − (φ,ξ)∈X F1(φ, ξ ) + F2(K (φ, ξ )), where X = H1( ) × H1( ), Y = H1( ) × L2( ) and 0 0 0 K : X → Y , K (φ, ξ ) = (ξ − φ, ξ ), F1 : X → R, with F1(φ, ξ ) = I · Lq ( )≤β (φ) F2 : Y → R, with F2(φ, ψ ) = I{0}(φ) + f ψ dx + I{ · ∞≤α}(ξ ), 1 + 2 ψ 2dx . We denote the infimum in (P∗) by inf P∗. Then, it is immediate that − inf P∗ = − (φ,iξn)f∈X F1(φ, ξ ) + F2(K (φ, ξ )). The dual problem of (4.1), see [ 15 ], is defined as min (w,u)∈Y ∗ F1∗(−K (w, u)) + F2∗(w, u), where K here denotes the adjoint of K . Let (σ, τ ) be elements of X ∗ = H1( )∗ × H01( )∗. The convex conjugate of 0 F1 can be written as (4.1) (4.2) (4.3) F1∗(σ, τ ) = sup (φ,ξ)∈X φ Lq ( )≤β ξ ∞≤α sup σ, φ + α sup φ∈H1( ) 0 φ Lq ( )≤1 ξ ∈ξH∞01(≤1) σ, φ + τ, ξ τ, ξ . (4.4) (P∗) F1∗(σ, τ ) = β sup φ∈Cc∞( ) φ Lq ( )≤1 σ, φ + α sup ξ∈Cc∞( ) ξ ∞≤1 By standard density arguments and using the Riesz representation theorem we have = β σ Lp( ) + α τ M . (4.5) Moreover, we have −K (w, u), (φ, ξ ) = − (w, u), K (φ, ξ ) = − (w, u), (ξ − φ, ξ ) = − w, ξ + w, φ − u, ξ = Du − w, ξ + w, φ . Since F1∗(−K (w, u)) < ∞, F2∗(w, u) < ∞, we obtain that F1∗(−K (w, u)) = β w Lp( ) + α Du − w M , (4.6) and F2∗(w, u) = sup (φ,ψ)∈Y φ=0 w, φ + u, ψ − f, ψ We next verify that we have no duality gap between the two minimisation problems (P) and (P∗). The proof of the following proposition follows the proof of the corresponding proposition in [ 6 ]. We slightly modify it for our case. Proposition 4.3 Let F1, F2, K , X, Y be defined as in (4.2). Then λ(dom F2 − K (dom F1)) = Y λ≥0 and hence it is a closed vector space. Thus [ 2 ] min F1(φ, ξ ) + F2(K (φ, ξ )) (φ,ξ)∈X + (w,u)∈Y ∗ F1∗(−K (w, u)) + F2∗(w, u) = 0. min Proof Let (v, ψ ) ∈ Y and define ψ0(x ) = c1, where c1 = 1 | | ψ (x ) dx . (4.8) (4.9) (4.10) Now let ξ(x ) = ax (ψ0 − ψ )(y) d y. Since by construction, ξ = ψ0 − ψ ∈ L2( ) with ξ(a) = ξ(b) = 0, we have that ξ ∈ H1( ). Furthermore, let φ = −v + ξ ∈ H1( ) and 0 0 (φ, ξ ) ∈ X with (v, ψ ) = (ξ − φ, ψ0 − ξ ) = (0, ψ0) − (ξ − φ, ξ ) = (0, ψ0) − K (φ, ξ ). ∞ ≤ α, we can write Choosing appropriately λ > 0 such that λ−1φ Lq ( ) ≤ β, λ−1ξ (v, ψ ) = λ (0, λ−1ψ0) − K (λ−1φ, λ−1ξ ) , with dom F2 = {0} × L2( ) and dom F1 = {(φ, ξ ) : φ Lq ( ) ≤ β, ξ ∞ ≤ α}. Since (v, ψ ) ∈ Y were chosen arbitrarily, (4.8) holds. Since there is no duality gap, we can find a relationship between the solutions of (P∗) and (P) via the following optimality conditions. Theorem 4.4 (Optimality conditions) Let 1 < p ≤ ∞ and f ∈ L2( ). A pair (w, u) ∈ L p( ) × BV( ) is a solution of (P) if and only if there exists a function φ ∈ H1( ) such 0 that for every (φ, ξ ) and (w, u) that solve (P∗) and (P), respectively. Hence, for every (σ, τ ) ∈ X ∗, we have the following: F1∗(σ, τ ) ≥ F1∗(−K (w, u)) + (σ, τ ) + K (w, u), (φ, ξ ) ⇔ α τ M + β σ Lp( ) ≥ α Du − w M + β w Lp( ) + (σ, τ ) − (w, Du − w), (φ, ξ ) ⇔ α τ M + β σ Lp( ) ≥ α Du − w M + β w Lp( ) + (σ − w), φ + τ − (Du − w), ξ ⇔ α τ M ≥ α Du − w M + τ − (Du − w), ξ , ∀τ ∈ H01( )∗, β σ Lp( ) ≥ β w Lp( ) + σ − w, φ , ∀σ ∈ H01( )∗, α τ M ≥ α Du − w M + τ − (Du − w), ξ , ∀τ ∈ M( ), β σ Lp( ) ≥ β w Lp( ) + σ − w, φ , ∀σ ∈ L p( ), ξ ∈ α∂ · M (Du − w), φ ∈ β∂ · Lp( ) (w). (4.11) (4.12) (4.16) (4.17) Since ξ ∈ H1( ) ⊂ C0( ) in one dimension, we can make 0 use of the fact that ∂ · M(Du − w) ∩ C0( ) = Sgn(Du − w) ∩ C0( ), see (8.1), and write the expressions in (4.15) as ξ ∈ αSgn(Du − w), and ⎨⎧ φ ∈ {φ˜ ∈ Lq ( ) : φ˜ Lq ( ) ≤ β} if w = 0, ⎩ φ = β |ww|p(L−pp−2(w1)) if w = 0. Indeed, the L p norm is an one-homogeneous functional and thus its subdifferential reads ∂ · Lp( ) (w) = z ∈ L p( )∗ : z, w = ≤ σ Lp( ) , ∀σ ∈ L p( ) . w Lp( ) , z, σ Note that for w = 0, the above expression reduces to σ Lp( ) ≥ z, σ for all σ ∈ L p( ), which is valid for any z ∈ Lq ( ) with z Lq ( ) ≤ 1, i.e. the unit ball of Lq ( ). If w = 0 then the subdifferential is simply the Gâteaux derivative of the L p norm, i.e. ∂ · Lp( ) (w) = |ww|pLp−−p2(1w) . Finally, from (4.14) we have for every (wˆ , uˆ) ∈ Y ∗ F2∗(wˆ , uˆ) ≥ F2∗(w, u) + K (φ, ξ ), ((wˆ , uˆ) − (w, u) ⇔ 1 1 2 ( f −uˆ)2 dx ≥ 2 1 1 2 ( f − uˆ)2 dx ≥ 2 ( f − u)2 dx + (ξ −φ, ξ ), (wˆ −w, uˆ −u) ⇔ ( f −u)2 dx + (ξ −φ, wˆ −w + ξ , uˆ −u ⇔ ξ − φ, wˆ − w ≤ 0, ∀wˆ ∈ H01( )∗, 21 ( f − u)2d x + ξ , uˆ −u ≤ 21 ξ = φ, ξ ∈ ∂ 21 f − · L2( ) (u) = u − f. 2 ( f −uˆ)2d x, ∀uˆ ∈ L2( )∗ ⇔ Combining all the above results, we obtain the optimality conditions (4.11) and (4.12). Remark 4.5 We observe that if w = 0 then the conditions (4.11) coincide with the optimality conditions for the L2–TV minimisation problem (ROF) with parameter α, i.e. see also [ 37 ]. On the other hand when w = 0, the additional condition (4.12) depends on the value of p and as we will see later it allows a certain degree of smoothness in the final solution u. 4.2 Structure of the Solutions The optimality conditions (4.11) and (4.12) can help us explore the structure of the solutions for the problem (P) and how this structure is determined by the regularising parameters α, β and the value of p. We initially discuss the cases where the solution u of (P) is a solution of a corresponding ROF minimisation problem i.e. w = 0. Note that the following proposition holds for p = ∞ as well. Proposition 4.6 (ROF solutions) Let q be the conjugate exponent of p ∈ (1, ∞] as defined in (8.4). If then (0, u) is a solution pair for (P) where u solves the ROF minimisation problem (4.18). Proof The proof follows immediately from Proposition 2.4. Proposition 4.6 is valid for any dimension d ≥ 1. It provides a rough threshold for obtaining ROF type solutions in terms of the regularising parameters α, β and the image domain . However, the condition is not sharp in general since as we will see in the following sections we can obtain a sharper estimate for specific data f . The following proposition in the spirit of [ 6,33 ] gives more insight into the structure of solutions of (P). Proposition 4.7 Let f ∈ BV( ) and suppose that (w, u) ∈ L p( ) × BV( ) is a solution pair for (P) with p ∈ (1, ∞]. Suppose that u > f (or u < f ) on an open interval I ⊂ then (Du − w) I = 0, i.e. u = w on I and |Ds u|(I ) = 0. The above proposition is formulated rigorously via the use of good representatives of BV functions, see [ 1 ], but for the sake of simplicity we rather not get into the details here. Instead we refer the reader to [ 6,33 ] where the analogue propositions are shown for the TGV regularised solutions and whose proofs are similar to the one of Proposition 4.7. We now examine the case where the solution is constant in , which in fact coincides with the mean value f˜ of the data f : f˜ := argmin u constant 2 1 f − u 2L2( ) = 1 | | f dx . (4.19) Proposition 4.8 (Mean value solution) If the following conditions hold α ≥ f − f˜ L1( ), 1 β ≥ | | q f − f˜ L1( ), then the solution of (P) is constant and equal to f˜. Proof Clearly, if u is a constant solution of (P), then Du = 0 and from inequality (2.2) we get TVLαp,β (u) = 0. Hence, we have u = f˜. In general, in order to have u = f˜, from the optimality conditions (4.11) and (4.12), it suffices to find a function φ ∈ H1( ) such that 0 φ = f − f˜, φ ∞ ≤ α, φ Lq ( ) ≤ β. Letting φ (x ) = ax ( f (s)− f˜) ds, then obviously φ ∈ H1( ) 0 since φ (a) = φ (b) = 0 and |φ (x )| ≤ | f (s) − f˜| ds ≤ f − f˜ L1( ) < ∞. a x Therefore, φ ∞ ≤ Lq ( ) we obtain f − f˜ L1( ). Also, since L∞( ) ⊂ 1 φ Lq ( ) ≤ | | q φ 1 ∞ ≤ | | q f − f˜ L1( ). Hence, it suffices to choose α and β as in (4.20). In Fig. 2, we summarise our results so far. There, we have partitioned the set {α > 0, β > 0} into different areas that correspond to different types of solutions of the problem (P). The red area, arising from thresholds (4.20) corresponds to the choices of α and β that produce constant solutions while the blue area corresponds to ROF type solutions, according to threshold (2.3). Therefore, we can determine the area where the non-trivial solutions are obtained, i.e. w = 0, see purple region. Note that since the conditions (2.3) and (4.20) are not sharp, the blue/red and the purple areas are potentially larger or smaller, respectively than it is shown in Fig. 2. β |Ω|q1||f − f˜||L1(Ω) ROF 1 β = α|Ω|q u = f˜ TVLp w = 0 ||f − f˜||L1(Ω) α (4.20) Fig. 2 Characterisation of solutions of (P) for any data f : The blue/red areas correspond to the ROF type solutions (w = 0) and the purple area corresponds to the TVL p solutions (w = 0) for 1 < p < ∞. We note that the blue/red and purple areas are potentially larger or smaller, respectively as the conditions we have derived are not sharp (Color figure online) function. We define the step function in as = (−L , L), L > 0 We note that in this case we can derive conditions for every 1 < p ≤ ∞. We are initially interested in solutions that respect the discontinuity at x = 0 and are piecewise constant. From the optimality conditions (4.11)–(4.12), it suffices to find a function φ ∈ H1( ) which, apart from φ (−L) = 0 φ (L) = 0, it also satisfies φ ∞ ≤ α, φ (0) = α, and it is piecewise affine. It is easy to see that by setting φ (x ) = αL (L − |x |), the conditions (4.23) are satisfied and the solution u is piecewise constant. The first condition of (4.12) implies that φ Lq ( ) ≤ β ⇔ βα ≥ ( q2+L1 ) q1 and provides a necessary and sufficient condition that needs to be fulfilled in order for u to be piecewise constant, that is to say u(x ) = (4.22) (4.23) Propositions 4.6 and 4.8 tell us how the solutions u behave when β/α or one of the parameters α and β is large. The other limiting case is also of interest, i.e. when the parameters are small. The analogous questions have been examined in [ 35 ] for the TGV case in arbitrary dimension. There it is shown that whenever β → 0 while keeping α fixed or α → 0 while keeping β fixed, the corresponding TGV solutions converge to the data f strongly in L2. The same result holds for the TVL p regularisation. The proof from [ 35 ] can be straightforwardly adapted to our case. The following proposition reveals more information about the structure of solutions in the case w = 0. Proposition 4.9 (TVL p solutions) Let f ∈ BV( ) and suppose that (w, u) ∈ L p( ) × BV( ) is a solution pair for (P) with p ∈ (1, ∞) and w = 0. Suppose that u > f (or u < f ) on an open interval I ⊂ then the solution u of (P) is obtained by where C = w Lp−p(1 ) . −C (|u (x )| p−2u (x )) + u(x ) = f (x ), ∀x ∈ I β (4.21) Proof Since 1 < p < ∞, w = 0 using Proposition 4.7 and the second optimality condition of (4.12), we have that φ = β |u | p−2u w Lp−p(1 ) . β w Lp−p(1 ) . Hence, using (4.11) we obtain (4.21) where C = Let us make a few remarks regarding equation (4.21) which is in fact the p-Laplace equation. One cannot write down a priori the boundary conditions associated with this equation on an interval I where u > f (or u < f ) as it depends on the data and the type of solution we are looking for. For instance see (4.31) for the kind of boundary conditions that might arise when we are seeking a particular exact solution. A general statement about the solvability of the equation cannot be made either. If the equation coupled with the boundary conditions (that arise when looking for a specific solution u) has a solution then indeed u can possibly solve the minimisation problem. On the other hand, if the p-Laplace equation does not have a solution then the function u that imposed the corresponding boundary conditions cannot be a minimiser. For more details on the p-Laplace equation and its solvability we refer the reader to [ 28 ] and the references therein. 4.3 Exact Solutions of (P) for a Step Function In what follows we compute explicit solutions of the TVL p denoising model (P) for the case p = 2 for a simple data 1 q A special case of the ROF type solution is when u is constant, i.e. when u = f˜, the mean value of f . We define φ (x ) = h2 (L − |x |) and in that case we have that φ ∞ ≤ α ⇔ α ≥ h2L and φ Lq ( ) ≤ β ⇔ β ≥ h2 ( 2qL+q+11 ) q1 . This implies that h u = f˜ = 2 ⇔ h L α ≥ 2 h and β ≥ 2 Using now (4.24)–(4.25) we can draw the exact regions in the quadrant of {α > 0, β > 0} that correspond to these two types of solutions, see the left graph in Fig. 4 for the special case p = 2. Notice that in these regions w = 0. 4.3.2 TVL2 Type Solutions For simplicity reasons, we examine here only the case p = 2 with w = 0 in . However, we refer the reader to Sect. 6.2 where we compute numerically solutions for p = 2. Using Proposition 4.9, we observe that the solution is given by the following second-order differential equation: β2−hom −C u (x ) − u(x ) = f (x ), subject to C = β Even though we can tell that the solution of (4.26) has an exponential form, the fact that the constraint on C depends on the solution w, creates a difficult computation in order to recover u analytically. In order to overcome this obstacle, we consider the one-dimensional version of the 2-homogeneous analogue of (P) that was introduced in Sect. 3: min 1 u∈BV( ) 2 w∈L2( ) f − u 2L2( ) + α Du − w M + One can derive the optimality conditions for (4.27) similarly to Sect. 4.1. A pair (w, u) is a solution of (4.27) if and only if there exists a function φ ∈ H1( ) such that 0 In view of Proposition 3.2, in order to recover analytically the solutions of (P) for p = 2 and determine exactly the purple region in Fig. 2, it suffices to solve the equivalent model (4.27) in which w = 0 holds always. We may restrict our computations only on (−L , 0] ⊂ and due to symmetry the solution in (0, L) is given by u(x ) = h − u(−x ). The optimality condition (4.28) results to −u (x) + ku(x) = 0, where k2 = β1 and x ∈ (−L , 0]. Then, we get u(x ) = c1ekx + c2e−kx with φ (x ) = ck1 ekx − ck2 e−kx + c3 for all x ∈ (−L , 0]. Firstly, we examine solutions that are continuous which due to symmetry must satisfy u(0) = h2 . Since φ ∈ H01(−L , L), we have φ (−L) = 0 and also u (−L) = 0. Finally, we require that φ (0) < α. After some computations, we conclude that u(x) = c1ekx + c2e−kx if x ∈ (−L , 0], h − c1e−kx − c2ekx if x ∈ (0, L) ⇔ tanhk(k L) < 2hα , where c1 = c2e2k L , c2 = 2(e2kL +1) and k = √1β . h On the other hand, in order to get solutions that preserve the discontinuity at x = 0, we require the following: (4.29) (4.30) g(β) = 2hα Fig. 3 Characterisation of solutions of (4.27) for data f being a step function. The green region corresponds to solutions that preserve the discontinuity at x = 0, (4.32), while the blue region corresponds to continuous solutions, (4.30), both having an exponential form (Color figure online) L L α (4.31) (4.32) (4.33) φ (−L) = 0, u (−L) = 0, h u(0) < , φ (0) = α. 2 Then we get u(x) = c1ekx + c2e−kx if x ∈ (−L , 0], h − c1e−kx − c2ekx if x ∈ (0, L) ⇔ tanhk(k L) > 2hα , where c1 = c2e2k L , c2 = e2kαLk−1 and k = √1β . Notice that the conditions for α and β in (4.30) and (4.32) are supplementary and thus only these type of solutions can occur, see the quadrant of {α > 0, β > 0} as it presented in Fig. 3. Letting g(β) = √β tanh ( √Lβ ), if g(β) < 2hα then the solution is of the form (4.30), see the blue region in Fig. 3. On the other hand in the complementary green region we obtain the solution (4.32). For extreme cases where β → ∞, i.e. k → 0 we obtain tanhk(k L) → L, which means that there is an asymptote of g at α = h2L . Although, we know the form of the inverse function of the hyperbolic tangent, we cannot compute analytically the inverse f −1. However, we can obtain an approximation using a Taylor expansion which leads to L3 = L − 3β + O 1 β2 2α = h ⇔ L β tanh √β h L3 β = 3(h L − 2α) , hL . where α > 0 and α = 2 Finally, we would like to describe the solution on the limiting case β → ∞. Letting β → ∞ in (4.30), we have that c1, c2 → h2 and u(x ) → h2 for every x ∈ , which in fact is the mean value solution of (P). For the discontinuous solutions, we have that c1, c2 → 2αL and u(x ) → i.e. the limiting solution is (4.24). We also get that w(x ) = kc2 e2k L+kx − e−kx e2k L−kx − ekx if x ∈ (−L , 0], if x ∈ (0, L], with w L2( ) = c2k√2ek L (sinh(2k L) − 2k L) 21 and c2 is given either from (4.30) or (4.32). Then, in both cases we have w → 0 as k → 0. Observe that the product of β2−hom w L2( ) is bounded as β2−hom → ∞ for both types of solutions and in fact corresponds to the bounds found in (4.24) and (4.25). Indeed, since (4.34) → 2 L3 , as k → 0, 3 α β1−hom Fig. 4 Characterisation of solutions of (P) for p = 2 for data f being a step function. The type of solutions in the purple region of the left graph are exactly the solutions obtained for the 2-homogeneous problem (4.27), on the right graph (Color figure online) (4) A discontinuous piecewise exponential solution given in (4.32) (green region in Fig. 4). Furthermore, there is an one to one correspondence between the purple and the green/light blue regions in Fig. 4. 5 An Image Decomposition Approach In this section, we present another formulation of the problem (P), where we decompose an image into a BV part (piecewise constant) and a part that belongs to W 1, p( ) (smooth). Let 1 < p ≤ ∞ and ⊂ Rd and consider the following minimisation problem: min vu∈∈WB1V,p(( )) 1 L(u, v) := 2 f − u − v 2L2( ) + α Du M +β ∇v Lp( ) . (5.1) In this way, we can decompose our image into two components of different structure. The second term captures the piecewise constant structures in the image, whereas the third term captures the smoothness that depends on the value of p. In the one-dimensional setting, we can prove that the problems (P) and (5.1) are equivalent. Proposition 5.1 Let = (a, b) ⊂ R and 1 < p ≤ ∞. Then a pair (v∗, u∗) ∈ W1, p( ) × BV( ) is a solution of (5.1) if and only if (∇v∗, u∗ + v∗) ∈ L p( ) × BV( ) is a solution of (P). if α > h2L then hL while if α ≤ 2 β2−hom w L2( ) → α h β2−hom w L2( ) → 2 2L3 3 , as β2−hom → ∞, 2L 3 , as β2−hom → ∞. The last result is yet another verification of Proposition 3.2 and it shows that there is an one to one correspondence between β2−hom w L2( ) and β1−hom . The solutions that belong to the purple region of Fig. 4 are the same to the solutions that are shown in Fig. 3. In the next proposition we summarise the type of solutions for (P) for the step function: Proposition 4.10 There are four different types of solutions for the problem (P) with p = 2 taking the step function (4.22) as data: (1) A piecewise constant solution given in (4.24) (blue region in Fig. 4). (2) A constant solution, equal to the mean value of the data, given in (4.25) (brown region in Fig. 4). (3) A continuous exponential solution given in (4.30) (lightblue region in Fig. 4). Proof Let u = u + v then, we have the following f − u 2L2( ) + α Du − w M (w∗, u∗) ∈ argmin u∈BV( ) 2 v∈wW=1,∇pv( ) + β w Lp( ) . x a x a However, the constraints w = ∇v, v ∈ W1, p( ) can simply be substituted by w ∈ L p( ) since w ∈ L p( ) : ∃v ∈ W1, p( ), w = ∇v = L p( ). (5.2) Indeed, let w ∈ L p( ) ⊂ L1( ) for p ∈ (1, ∞) and define v(x ) = ax w(s) ds for x ∈ ⊂ R. Clearly, v = w a.e. and by Hölder’s inequality we have for every x ∈ (a, b) |v(x )| p = w(s) ds ≤ (x − a) p−1 p |w(s)| p ds < C < ∞. Thus v ∈ W1, p( ) for p ∈ (1, ∞). If p = ∞, suppose again w ∈ L∞( ) and let C > 0 be a constant such that |w(x )| ≤ C a.e. on . In that case we have |v(x )| ≤ ax |w(s)| ds ≤ C | | < ∞, i.e. v ∈ L∞( ) and hence v ∈ W1,∞( ) since v = w. Therefore, f − u 2L2( ) + α Du − w M (w∗, u∗) ∈ argmin u∈BV( ) 2 w∈Lp( ) + β w Lp( ) , 1 where u∗ = u∗ + v∗ and w∗ = ∇v∗. Even though for d = 1 it is true that every L p function can be written as a gradient, this is not true in higher dimensions. In fact, as we show in the following sections, this constraint is quite restrictive and for example the staircasing effect cannot be always eliminated in the denoising process, see for instance Fig. 20. The existence of minimisers of (5.1) is shown following again the same techniques as in Theorem 2.5. Moreover, due to the strict convexity of the fidelity term in (5.1), one can prove that the sum u + v ∈ BV( ) is unique for a solution (u, v) ∈ W1, p( ) × BV( ). This result reflects the uniqueness for the problem (P) for u. However one cannot show that the solutions (u, v) are unique in general. Yet, one can say something more about this issue. if (u1, v1), (u2, v2) are two minimisers of (5.1), then from the convexity of L(u, v) we have for 0 ≤ λ ≤ 1 L(λ(u1, v1) + (1 − λ)(u2, v2)) ≤ λL(u1, v1) + (1 − λ)L(u2, v2). Since (u1, v1), (u2, v2) are both minimisers, the above inequality is in fact an equality. Since u1 + v1 = u2 + v2, we obtain α D (λu1 + (1−λ)u2) M +β ∇(λv1 + (1 − λ)v2) Lp( ) = α(λ Du1 M + (1 − λ) Du2 M) + β(λ ∇v1 Lp( ) + (1 − λ) ∇v2 Lp( )). (5.3) If we assume that ∇(λv1 + (1 − λ)v2) Lp( ) < λ ∇v1 Lp( ) + (1 − λ) ∇v2 Lp( ) , then we contradict the equality on (5.3). Hence, the Minkowski inequality becomes an equality, i.e. ∇(λv1 + (1 − λ)v2) Lp( ) = λ ∇v1 Lp( ) + (1 − λ) ∇v2 Lp( ) , which is equivalent to the existence of μ ≥ 0 such that ∇v2 = μ∇v1. In other words, we have proved the following proposition which was also shown in [ 25 ] in a similar context: Proposition 5.2 Let (u1, v1), (u2, v2) be two minimisers of (5.1). Then u1 + v1 = u2 + v2, and there exists a μ ≥ 0 such that ∇v2 = μ∇v1. (5.4) (5.5) 6 Numerical Experiments In this section we present our numerical simulations for the problem (P). We begin with the one-dimensional case where we verify numerically the analytical solutions obtained in Sect. 4.3. Through examples we also investigate the type of structures that are promoted for different values of p. Finally, we proceed to the two-dimensional case where we focus on image denoising tasks and in particular on the elimination of the staircasing effect. We start by defining the discretised version of problem (P) min u∈Rn×m 2 . (6.3) min u∈Rn×m 2 w∈(Rn×m )2 z∈(Rn×m)2 1 (6.1) (6.2) We denote by ∇ = (∇1, ∇2) the discretised gradient with forward differences and zero Neumann boundary conditions defined as (∇1u)i, j = (∇2u)i, j = u(i + 1, j ) − u(i, j ) if 1 ≤ i < n, 1 ≤ j ≤ m, 0 if i = n, 1 ≤ j ≤ m, u(i, j + 1) − u(i, j ) if 1 ≤ i ≤ n, 1 ≤ j < m, 0 if 1 ≤ i ≤ n, j = m. The discrete version of the divergence operator is defined as the adjoint of discrete gradient. That is, for every w = (w1, w2) ∈ (Rn×m )2 and u ∈ Rn×m , we have −divw, u = w, ∇u with (divw)i, j = ⎧⎪⎨ ww11((ii,, jj )) − w1(i, j − 1) iiff 1j <= 1j ,<1m≤, i1≤≤ni, ≤ n, ⎪⎩ −w1(i, j − 1) if j = m, 1 ≤ i ≤ n, + ⎨⎪⎧ ww22((ii,, jj )) − w2(i − 1, j ) iiff i1 =< 1i,<1n≤, 1j ≤≤ mj,≤ m, ⎪⎩ −w2(i − 1, j ) if i = m, 1 ≤ j ≤ m. (6.4) We solve the minimisation problem (6.1) in two ways. The first one is by using the CVX optimisation package with MOSEK solver (interior point methods) [ 19 ]. This method is efficient for small–medium scale optimisation problems and thus it is a suitable choice in order to replicate onedimensional solutions. On the other hand, we prefer to solve large scale two-dimensional versions of (6.1) with the split Bregman method [ 18 ] which has been widely used for the fast solution of non-smooth minimisation problems. 6.1 Split Bregman for L2–TVLp In this section we describe how we adapt the split Bregman algorithm to our discrete model (6.1). We first transform the unconstrained problem (6.1) into a constrained one by setting z = ∇u − w: min u∈Rn×m 2 w∈(Rn×m )2 z∈(Rn×m)2 1 such that z = ∇u − w. f − u 22 + α z 1 + β w p , Replacing the constraint, using a Lagrange multiplier λ, we obtain the following unconstrained formulation: λ f − u 22+α z 1+β w p+2 z − ∇u +w 22 . (6.5) (6.6) w p (6.7) (6.8) (6.11) (6.12) (6.13) The Bregman iteration [ 32 ], that corresponds to the minimisation (6.6) leads to the following two-step algorithm: 1 (uk+1, zk+1, wk+1) = argmin u,z,w 2 + λ2 bk − z + ∇u − w 22 , bk+1 = bk + zk+1 − ∇uk+1 − wk+1. f −u 22 +α z 1 +β Since solving (6.7) at once is a difficult task, we employ a splitting technique and minimise alternatingly for u, z and w. This yields the split Bregman iteration for our method: uk+1 = argmin u∈Rn×m 2 1 λ f − u 22 + 2 zk+1 = argmin α z 1 + λ2 z∈(Rn×m)2 wk+1 = argmin β w∈(Rn×m )2 λ w p + 2 bk + zk − ∇u + w bk + z − ∇uk+1 + w bk + zk+1 −∇uk+1 +w bk+1 = bk + zk+1 − ∇uk+1 − wk+1. Next, we discuss how we solve each of the subproblems (6.9)–(6.11). The first-order optimality condition of (6.9) results into the following linear system: (I − λ ) u = f − λdiv(bk + zk − wk ) . $ %A& ' $ %c& ' Here A is a sparse, symmetric, positive definite and strictly diagonal dominant matrix, thus we can easily solve (6.13) with an iterative solver such as conjugate gradients or Gauss–Seidel. However, due to the zero Neumann boundary conditions, the matrix A can be efficiently diagonalised by the two-dimensional discrete cosine transform, A = Wnm DWnm , where here Wnm is the discrete cosine matrix and D = di ag(μ1, · · · , μnm ) is the diagonal matrix of the eigenvalues of A. In that case, A has a particular structure of a block symmetric Toeplitz-plus-Hankel matrix with Toeplitzplus-Hankel blocks and one can obtain the solution of (6.9) by three operations involving the two-dimensional discrete cosine transform [ 20 ] as follows: Firstly, we calculate the eigenvalues of A by multiplying (6.14) with e1 = (1, 0, · · · , 0) from both sides and using the fact that Wnm Wnm = Wnm Wnm = Inm , we get Di,i = [Wnm Ae1]i , i = 1, 2, · · · , nm. [Wnm e1]i Then, the solution of (6.9) is computed exactly by u = Wnm D−1Wnm c. The solution of the subproblem (6.10) is obtained in a closed form via the following shrinkage operator: zik+1 = shrink αλ (bik − ∇i uk+1 + wik ) $ %g&i ' := max α g 2 − λ gi . g 2 Finally, we discuss the solution of the subproblem (6.11). In the spirit of [ 40 ], we solve (6.11) by a fixed-point iteration scheme. Letting κ = βλ and η = −bk − zk+1 + ∇uk+1, the first-order optimality condition of (6.11) becomes κ |w| p−2w w pp−1 + w − η = 0. For given wk , we obtain wk+1 by the following fixed-point iteration wik+1 = ηi wk pp−1 κ|wk | p−2 + wk p−1 p , (6.14) (6.15) (6.16) (6.17) (6.18) (6.19) ηi . However, we have observed empirically that there is κ+1 no significant computational difference between these two methods. Since we do not solve all the subproblems (6.9)–(6.11) exactly in every iteration, we cannot guarantee convergence for our version of the split Bregman iteration. Moreover, convergence of the split Bregman algorithm when more than two splittings are performed have not been yet fully established, even though this has been an active field of research lately see for instance [ 13,17,31 ]. Let us note that the three subproblems in the split Bregman algorithm can be modified into two subproblems (inexact linearised ADMM) with a small cost in the speed of convergence, see for instance [ 14,21,22 ]. However in practice, the algorithm converges to the right solutions. This claim is supported by the study presented in Fig. 5 where the solutions of the split Bregman iteration are compared to the corresponding solutions obtained with the CVX package for which we have convergence guarantees. There, we have solved the TVL p minimisations that correspond to Figs. 13d–f and 14d, i.e. for p = 23 , 2, 3, 7 using both split Bregman and CVX. We plot the relative differences of the split Bregman iterates uk and the CVX solution uCVX until they are sufficiently close to each other, i.e. k u − uCVX 2 < tol. uCVX 2 (6.20) In all the plots of Fig. 5, we observe that the split Bregman iterates in practice converge to the CVX solution. Their relative difference becomes of the order 10−5 in around 100 iterations except for p = 7, Fig. 5d, where the error tolerance is still reached but only after approximately 10000 iterations. In Table 1, we compare the computational times of the split Bregman algorithm until (6.20) is satisfied and the computational times of CVX for the same examples as in Fig. 5. The implementations were done in MATLAB (2013) using 2.4 GHz Intel Core 2 Duo and 2 GB of memory. Notice that unless p = 2, second line in Table 1, CVX needs more than an hour to converge, in contrast to split Bregman where only a few seconds are required for small values of p. Note that the split Bregman algorithm is significantly slower for large values of p, e.g. p = 7, see fourth line in Table 1, mainly due to the fixed-point iteration in the subproblem (6.19). We would like to point out that the computational speed can be significantly reduced in the p = ∞ case, since the corresponding subproblem is solved exactly, see [ 8,36 ] and in the same time we can obtain similar results to the ones obtained for high values of p. under the convention that 0/0 = 0. We can also consider solving the p-homogenous analogue (P p−hom ), where for certain values of p, e.g. p = 2, we can solve exactly the corresponding version of (6.11), since in that case wik+1 = 6.2 One-Dimensional Results In this section, we present some numerical results in dimension one, i.e. m = 1, u ∈ Rn×1 and w ∈ Rn×1. Initially, we Iteration (a) 50 100 150 0 20 40 80 100 120 0 Fig. 5 Plots of the relative differences between the split Bregman iterates and the CVX solutions until (6.20) is true with tol = 10−5, for the examples in Figs. 13d–f and 14d. In all cases the split Bregman algorithm converges to the solutions given by CVX. a Relative residual error between the split Bregman iterates and the CVX solution for the example in Fig. 13d. b Relative residual error between the split Bregman iterates and the CVX solution for the example in Fig. 13e. c Relative residual error between the split Bregman iterates and the CVX solution for the example in Fig. 13f. d Relative residual error between the split Bregman iterates and the CVX solution for the example in Fig. 14d Table 1 Computational times of the split Bregman algorithm until (6.20) is true with tol = 10−5 and comparison to the computational times of CVX for the same examples as in Fig. 5 p = 1.5 p = 2 p = 3 p = 7 Split Bregman (s) compare our numerical solutions with the analytical ones, obtained in Sect. 4.3 for the step function. We set p = 2, h = 100, L = 1 and = (−1, 1) which is discretised into 2000 points. We first examine the cases where ROF solutions are obtained, i.e. the parameters α and β are selected according to the conditions (4.24) and (4.25). We have done that in Fig. 6 where we see that the analytical solutions coincide with the numerical ones. We proceed by computing the non-ROF solutions. The numerical solutions are solved using the 2-homogeneous analogue (4.27), since we have proved that the 1-homogeneous and p-homogeneous problems are equivalent modulo an appropriate rescaling of the parameter β, see Proposition 3.2. Indeed, as it is described in Fig. 4, in order to obtain solutions from the purple region, it suffices to seek for solutions of the 2-homogeneous (4.27). Recall also that these solutions are exactly the solutions obtained solving a Huber TV problem, see Proposition 3.3. The analytical solutions are given in (4.30) and (4.32) and are compared to the numerical ones in Fig. 7, where we observe that they coincide. We also verify the equivalence between the 1-homogeneous and 2-homogeneous problems where α is fixed and β is obtained from Proposition 3.2, see Fig. 7c. We continue our experiments for general values of p focusing on the structure of the solutions as p increases. In order to compare the solutions for different values p ∈ (1, ∞), we fix the parameter α and choose appropriate values of β. Since we are mainly interested in non-ROF step function TVL2 numerical TVL2 exact step function TVL2 numerical TVL2 exact step function TVL2 numerical TVL2 exact step function Fig. 6 Step function: comparison between numerical solutions of (P) and the corresponding analytical solutions obtained in Sect. 4.3. The parameters α and β are chosen so that conditions (4.24) and (4.25) are satisfied which result in ROF solutions. a Original data. b TVL2: α = 15, β = 500. c TVL2: α = 60, β = 1300 (Color figure online) 100 80 60 40 20 0 100 80 60 40 20 0 Fig. 7 Step function: comparison between numerical and analytical solutions obtained in Sect. 4.3, by solving the 2-homogeneous problem (4.27). The parameters α and β are chosen so that conditions (4.30) and (4.32) are satisfied which result in non-ROF solutions. The last plot shows the equivalence between the 1-homogeneous problem (P) and 2-homogeneous (4.27). a TVL2 : α = 20, β2−hom = 450. b TVL2 : α = 60, β2−hom = 450. c Equivalence of 1 and 2-homogeneous models: α = 15, β2−hom = 450, β1−hom = β2−hom w 2 (Color figure online) p = 1.3 p = 1.5 p = 2 p = 3 p = 4 p = 10 100 80 60 40 20 0 −1 −0.8 −0.6 −0.4 −0.2 0.2 0.4 0.6 0.8 1 −1 −0.8 −0.6 −0.4 −0.2 0.2 0.4 0.6 0.8 1 Fig. 8 Step function: the types of solutions for the problem (P) for different values of p. a TVL p discontinuous solutions for p = { 43 , 23 , 2, 3, 4, 10}. b TVL p continuous solutions for p = { 3 , 23 , 2, 3, 4, 10} (Color figure online) 4 solutions, we choose α and β so that they belong to the 1 purple region of Fig. 4, i.e. β < ( q2+L1 ) q α and β < h ( 2qL+q+11 ) q1 . We set p = { 43 , 3 , 2, 3, 4, 10} and in order to 2 2 get solutions that preserve the discontinuity we set β = {72, 140, 430, 1350, 2400, 6800} with fixed α = 20, see Fig. 8a. In order to obtain continuous solutions, we set α = 60 and β = {50, 110, 430, 1700, 3000, 9500}, see Fig. 8b. We observe that for p = 43 , the solution has a similar behaviour to p = 2, but with a steeper gradient at the discontinuity point and the solution becomes almost constant near the boundary of . On the other hand, as we increase p, the slope 0 (b) 60 40 20 0 100 80 60 40 20 0 TVL2 numerical TVL2 exact step function TVL2 1-homogenous TVL2 2-homogeneous step function TVL15 piecewise affine −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 Fig. 9 Piecewise affine data: TVL15 solution with α = 1, β = 620 (Color figure online) of the solution near the discontinuity point reduces and it becomes almost linear with a relative small constant part near the boundary. The linear structure of the solutions that appears for large p motivates us to examine the case of piecewise affine data f defined as 20 15 10 5 0 −5 8 x 10−3 0 200 400 600 800 (6.21) see Fig. 9. We set again = (−1, 1) and λ = 110 . As we observe, the reconstruction for p = 15 behaves almost linearly everywhere in except near the boundary. In the follow-up paper [ 8 ], where the case p = ∞ is examined in detail and exact solutions are computed for the data (6.21), the occurrence of this linear structure is justified also analytically. In the last part of this section, we present some numerical examples of the image decomposition approach presented in Sect. 5. We use as data a more complicated one-dimensional noiseless signal with piecewise constant, affine and quadratic components and solve the discretised version of (5.1) using CVX. In Fig. 10, we verify numerically the equivalence between (5.1) and (P ) for p = 2, i.e. we show that (∇v, u + v) corresponds to (w, u) where (v, u) and (w, u) are the solutions of (5.1) and (P ), respectively. We also compare the decomposed parts u, v for p = 43 and p = 10. In order to have a reasonable comparison on the correspond3.5 4 3 2 Fig. 12 Square with piecewise affine structures and its noisy version with σ = 0.01. a Square. b Noisy square: PSNR = 20.66 and SSIM = 0.1791 ing solutions, the parameters α, β are selected such that the residual f − u − v 2 is the same for both values of p. As we observe, the v decomposition with p = 43 exhibits some flatness compared to p = 2, compare Figs. 10b and 11a. On the other hand for p = 10, the v component consists again of almost affine structures, Fig. 11b. Notice, that in both cases the v components are continuous. This is expected since in dimension one, we have W1, p( ) ⊂ C ( ) for every 1 < p < ∞. 6.3 Two-Dimensional Results In this section we consider the two-dimensional case where u ∈ Rn×m , w ∈ (Rn×m )2 with m > 1 and is a rectangular image domain. We focus on image denoising tasks and on eliminating the staircasing effect for different values of p. We start with the image in Fig. 12, i.e. a square with piecewise affine structures. The image size is 200 × 200 pixels at a [ 0, 1 ] intensity range. The noisy image, Fig. 12b, is a corrupted version of the original image, Fig. 12a, with Gaussian noise of zero mean and standard deviation σ = 0.01. In Fig. 13, we present the best reconstructions results in terms of two quality measures, the classical Peak Signal to Noise Ratio (PSNR) and the Structural Similarity Index (SSIM), see [ 41 ] for the definition of the latter. In each case, the values of α and β are selected appropriately for optimal PSNR and SSIM. We use here the split Bregman algorithm as this is described in Sect. 6.1. Our stopping criterion is the relative residual error becoming less than 10−6, i.e. (6.22) uk+1 − uk For computational efficiency, we set λ = 10α when 1 < p < 4 and λ = 1000α when p ≥ 4 (empirical rule). Observe that the best reconstructions in terms of the PSNR have no visual difference for p = 23 , 2 and 3 and staircasing is present, Fig. 13a–c. This is one more indication that the PSNR—which is based on the squares of the pointwise differences between the ground truth and the reconstruction—does not correspond to the optimal visual results. The best reconstructions in terms of SSIM are visually better. They exhibit significantly reduced staircasing for p = 23 and p = 3 and is essentially absent in the case of p = 2, see Fig. 13d–f. We can also get a total staircasing elimination by setting higher values for the parameters α and β, as we show in Fig. 14. There, one observes that on one hand as we increase p, almost affine structures are promoted—see the middle row profiles in Fig. 14—and on the other hand these choices of α, β produce a serious loss of contrast that however can be easily treated via the Bregman iteration that we briefly discuss next. d TVL 23 : α = 0.3, β = 7.7, SSIM = 0.9669. e TVL2: α = 0.3, β = 34, SSIM = 0.9706. f TVL3: α = 0.3, β = 182, SSIM = 0.9709 0 20 40 60 80 100 120 140 160 180 200 (a) 3 Fig. 14 Staircasing elimination for p = 23 , 2, 3 and 7. High values of p promote almost affine structures. a TVL 2 : α = 1, β = 25, SSIM = 0.9391. Fig. 15 First row Best reconstructions in terms of SSIM for TV, TVL2 and TGV. Second row Best reconstructions in terms of SSIM for the Bregman iteration versions of TV, TVL2 and TGV. a TV: α = 0.2, SSIM = 0.9387. b TVL2: α = 1, β = 116, SSIM = 0.9433. c TGV: α = 0.12, β = 0.55, SSIM = 0.9861. d TV Bregman iteration: α = 1, SSIM = 0.9401, 4th iteration. e TVL2 Bregman iteration: α = 2, β = 220, SSIM = 0.9778, 4th iteration. f TGV Bregman iteration: α = 2, β = 10, SSIM = 0.9889, 8th iteration Contrast enhancement via Bregman iteration was introduced in [ 32 ], see also [ 3 ] for an application to higher-order models. It involves solving a modified version of the minimisation problem. Setting u0 = f , one solves for k = 1, 2, . . .: uk+1 = argmin u∈Rn×m 2 w∈(Rn×m )2 v˜k+1 = v˜k + f − uk+1. 1 f +v˜k −u 22 +α ∇u −w 1 +β w p, (6.23) Instead of solving (6.1) once for fixed α and β, we solve a sequence of similar problems adding back a noisy residual in each iteration. For stopping criteria regarding the Bregman iteration we refer to [ 32 ]. In Fig. 15 we present our best Bregman iteration results for p = 2 in terms of SSIM along with the corresponding TV and TGV results for which the Bregman iteration has also been employed for the sake of fair comparison. We also show the best SSIM results where no Bregman iteration is used. We solve the TGV minimisation using the Chambolle–Pock primal-dual method [ 10 ]. We notice that the Bregman iteration version of TVL2 leads to a significant contrast improvement, compare for instance Fig. 15b, e. In Fig. 16 Image with symmetric radial structures and its noisy version with σ = 0.01. a Circle. b Noisy circle: SSIM = 0.2457 fact, it can achieve a reconstruction which is visually close to the Bregman iteration version of TGV, compare Fig. 15e, f. We continue our demonstration with a radially symmetric image, see Fig. 16. As in the previous example, we can achieve staircasing-free reconstructions using TVL p regularisation for different values of p, see Fig. 17. In fact, as we increase p, we obtain results that preserve the spike in the centre of the circle, see the corresponding middle row slice in Fig. 17d. This provides us with another motivation to examine the p = ∞ case in [ 8 ]. The loss of contrast can 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 00 20 40 60 80 100 120 140 160 180 200 (b) 3 Fig. 17 Better preservation of spike-like structures for large values of p. a TVL 2 : α = 0.8, β = 17, SSIM = 0.8909. b TVL2: α = 0.8, β = 79, SSIM = 0.8998. c TVL3: α = 0.8, β = 405, SSIM = 0.9019. d TVL7: α = 0.8, β = 3700, SSIM = 0.9024 Fig. 18 Best reconstruction in terms of SSIM for the Bregman iteration versions of TVL2, TVL4, TVL7 and TGV. a TV Bregman iteration: α = 2, SSIM = 0.8912, 6th iteration. b TVL2 Bregman iteration: α = 5, β = 625, SSIM = 0.9718, 12th iteration. c TVL4 Bregman iteration: α = 5, β = 8000, SSIM=0.9802, 13th iteration. d TVL7 Bregman iteration: α = 3, β = 15000, SSIM = 0.9807, 9th iteration. e TGV Bregman iteration: α = 2, β = 10, SSIM = 0.9913, 8th iteration be again treated using the Bregman iteration (6.23). The best results of the latter in terms of SSIM are presented in Fig. 18, for p = 2, 4 and 7 and they are also compared to the corresponding Bregman iteration version of TGV. We observe that we can obtain reconstructions that are visually close to the TGV ones and in fact notice that for p = 7, the spike on the centre of the circle is better reconstructed compared to TGV, see also the surface plots in Fig. 19. Bregman iteration: central part zoom. g TVL7 Bregman iteration: central part zoom. h TGV Bregman iteration: central part zoom (Color figure online) Fig. 20 Comparison between the model (5.1) for p = 2 and TVL2: Staircasing cannot be always eliminated by using the decomposition approach (5.1). a Solution u + v of (5.1): p = 2, α = 0.8, β = 120, SSIM = 0.9268. b TVL2: α = 1, β = 116, SSIM = 0.9433. c Solution u + v of (5.1): p = 2, α = 0.8, β = 70, SSIM = 0.8994. d TVL2: α = 0.8, β = 79, SSIM = 0.8998 1.2 1 0.8 0.6 0.4 0.2 0 −0.2 u v u+v 0 20 40 60 80 100 120 140 160 180 200 (a) (b) (c) We conclude with numerical results for the image decomposition approach of Sect. 5 which we solve again using the split Bregman algorithm. Recall that in dimension two, the solutions of (5.1) are not necessarily the same with the ones of (P ). In fact, we observe that (5.1) cannot always eliminate the staircasing, see for instance Figure 20. Even though, as we have already seen, we can eliminate the staircasing both in the square and in the circle by applying TVL p regularisation, Fig. 20b, d, we cannot obtain equally satisfactory results by solving (5.1). While using the latter we can get rid of the staircasing in the circle, Fig. 20c, this is not possible for the square, Fig. 20a, where we observe—after extensive experimentation—that no values of α and β lead to a staircasing elimination. This is analogous to the difference between TGV and the TV–TV2 infimal convolution of Chambolle– Lions [ 9 ]. However, the strength of the formulation (5.1) lies on its ability to efficiently decompose an image into piecewise constant and smooth parts. We depict that in Fig. 21, where we show the components u and v of the result in Fig. 20c. 7 Conclusion We have introduced a novel first-order, one-homogeneous TV–L p infimal convolution type functional suitable for variational image regularisation. The TVL p functional constitutes a very general class of regularisation functionals exhibiting diverse smoothing properties for different choices of p. In the case p = 2 the well-known Huber TV regulariser is recovered. We studied the corresponding one-dimensional denoising problem focusing on the structure of its solutions. We computed exact solutions of this problem for the case p = 2 for simple one-dimensional data. Hence, as an additional novelty in our paper we presented exact solutions of the onedimensional Huber TV denoising problem. Numerical experiments for several values of p indicate that our model leads to an elimination of the staircasing effect. We show that we can further enhance our results by increasing the contrast via a Bregman iteration scheme and thus obtaining results of similar quality to those of TGV. Furthermore, as p increases the structure of the solutions changes from piecewise smooth to piecewise linear and the model, in contrast to TGV, is capable of preserving sharp spikes in the reconstruction. This observation motivates a more detailed study of the TVL p functional for large p and in particular for the case p = ∞. This concludes the first part of the study of the TVL p model for p < ∞. The second part [ 8 ], is devoted to the p = ∞ case. There we explore further, both in an analytical and an experimental level, the capability of the TVL∞ model to promote affine and spike-like structures in the reconstructed image and we discuss several applications. Acknowledgments The authors would like to thank the anonymous reviewers for their interesting comments and suggestions which especially motivated our more detailed discussion on the generalised Huber total variation functional. The authors acknowledge support of the Royal Society International Exchange Award No. IE110314. This work is further supported by the King Abdullah University for Science and Technology (KAUST) Award No. KUK-I1-007-43, the EPSRC first Grant No. EP/J009539/1 and the EPSRC Grant No. EP/M00483X/1. MB acknowledges further support by ERC via Grant EU FP 7-ERC Consolidator Grant 615216 LifeInverse. KP acknowledges the financial support of EPSRC and the Alexander von Humboldt Foundation while in UK and Germany, respectively. EP acknowledges support by Jesus College, Cambridge and Embiricos Trust Scholarship. Appendix: Radon Measures and Functions of Bounded Variation In what follows ⊂ Rd is an open, bounded set with Lipschitz boundary whose Lebesgue measure is denoted by | |. We denote by M( , Rd ) (and M( ) if d = 1) the space of finite Radon measures on . The total variation measure of μ ∈ M( , Rd ) is denoted by |μ|, while we denote the polar decomposition of μ by μ = sgn(μ)|μ|, where sgn(μ) = 1 |μ|-almost everywhere. Recall that the Radon norm of a Rd -valued distribution T on is defined as T M := sup u, φ : φ ∈ Cc∞( , Rd ), φ ∞ ≤ 1 . It can be shown that T M < ∞ if and only if T can be represented by a measure μ ∈ M( , Rd ) and in that case μ M = |μ|( ). Regarding the subdifferential of the Radon norm we have that it can be characterised, at least for C0 functions, as follows [ 6 ] ∂ · M (μ) ∩ C0( ) = Sgn(μ) ∩ C0( ), where here Sgn(μ) denotes the set-valued sign Sgn(μ) = v ∈ L∞( ) ∩ L∞( , |μ|) : v ∞ ≤ 1, v = sgn(μ), |μ| − a.e.} . A function u ∈ L1( ) is a function of bounded variation if its distributional derivative Du is representable by a finite Radon measure. We denote by BV( ), the space of functions of bounded variation which is a Banach space under the norm (8.1) (8.2) u BV( ) := u L1( ) + Du M . The term Du M = sup u divφ dx : φ ∈ Cc∞( , Rd ), φ ∞ ≤ 1 , is called the total variation of u, also commonly denoted by TV(u). From the Radon–Nikodym theorem, the measure μ can be decomposed into an absolutely continuous and a singular part with respect to the Lebesgue measure Ld , that is Du = Da u + Ds u. Here, Da u = ∇uLd , i.e. ∇u denotes the Radon–Nikodym derivative of Da u with respect to Ld . Note that we also have Du M = Da u M + Ds u M. When d = 1, ∇u is simply denoted by u . Finally, we recall that BV( ) → Lr ( ) for 1 ≤ r ≤ d −d1 (resp. 1 ≤ r < d −d1 ) with continuous embedding (resp. compact embedding) . Recall also the following basic inequality regarding inclusions of L p spaces 1 1 h L p1 ( ) ≤ | | p1 − p2 h L p2 ( ) , 1 ≤ p1 < p2 ≤ ∞. Unless otherwise stated q denotes the Hölder conjugate of the exponent p, i.e. q = p p−1 1 if p ∈ (1, ∞), if p = ∞. Carola-Bibiane Schönlieb graduated from the Institute for Mathematics, University of Salzburg (Austria) in 2004. From 2004 to 2005 she held a teaching position in Salzburg. She received her Ph.D. degree from the University of Cambridge (UK) in 2009. After one year of postdoctoral activity at the University of Gttingen (Germany), she became a Lecturer in Applied and Computational Analysis at the Department of Applied Mathematics and Theoretical Physics (DAMTP), University of Cambridge (UK) in 2010, promoted to Reader in 2015. There, she is head of the Cambridge Image Analysis group and since 2011 a fellow of Jesus College Cambridge. Her current research interests range from nonlinear partial differential equations to computational- and convex analysis, with applications in digital image processing and inverse problems. 1. Ambrosio , L. , Fusco , N. , Pallara , D. : Functions of Bounded Variation and Free Discontinuity Problems . Oxford Science Publications, Oxford ( 2000 ) 2. Attouch , H. , Brezis , H.: Duality for the sum of convex functions in general Banach spaces . North-Holland Mathematical Library 34 , 125 - 133 ( 1986 ) 3. Benning , M. , Brune , C. , Burger , M. , Müller , J.: Higher-order TV methods-enhancement via Bregman iteration . J. Sci. Comput . 54 ( 2-3 ), 269 - 310 ( 2013 ). doi:10.1007/s10915-012-9650-3 4. Bergounioux , M. , Piffet , L. : A second-order model for image denoising . Set-Valued Var. Anal . 18 ( 3-4 ), 277 - 306 ( 2010 ). doi:10. 1007/s11228-010-0156-6 5. Bredies , K. , Kunisch , K. , Pock , T. : Total generalized variation . SIAM J. Imaging Sci . 3 ( 3 ), 492 - 526 ( 2010 ). doi:10.1137/ 090769521 6. Bredies , K. , Kunisch , K. , Valkonen , T. : Properties of L1-TGV 2: the one-dimensional case . J. Math. Anal. Appl . 398 ( 1 ), 438 - 454 ( 2013 ). doi: 10 .1016/j.jmaa. 2012 . 08 .053 7. Burger , M. , Osher , S.: A guide to the TV zoo . Level Set and PDE Based Reconstruction Methods in Imaging , pp. 1 - 70 . Springer, Berlin ( 2013 ) 8. Burger , M. , Papafitsoros , K. , Papoutsellis , E. , Schönlieb , C.-B.: Infimal convolution regularisation functionals of BV and L p spaces. The case p = ∞ . Submitted ( 2015 ) 9. Chambolle , A. , Lions , P.L. : Image recovery via total variation minimization and related problems . Numerische Mathematik 76 , 167 - 188 ( 1997 ). doi: 10 .1007/s002110050258 10. Chambolle , A. , Pock , T. : A first-order primal-dual algorithm for convex problems with applications to imaging . J. Math. Imaging Vis . 40 ( 1 ), 120 - 145 ( 2011 ). doi:10.1007/s10851-010-0251-1 11. Chan , T. , Marquina , A. , Mulet , P. : High-order total variation-based image restoration . SIAM J. Sci. Comput . 22 ( 2 ), 503 - 516 ( 2001 ). doi: 10 .1137/S1064827598344169 12. Chan , T.F. , Esedoglu , S. , Park , F.E. : Image decomposition combining staircase reduction and texture extraction . J. Vis. Commun. Image Represent . 18 ( 6 ), 464 - 486 ( 2007 ). doi: 10 .1016/j.jvcir. 2006 . 12 .004 13. Chen , C. , He , B. , Ye , Y. , Yuan , X. : The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent , Math. Program. 1-23 ( 2014 ) 14. Yuan , X. , Dai , Y., Han , D. , Zhang, W.: A sequential updating scheme of Lagrange multiplier for separable convex programming , Math. Comput. , to appear ( 2013 ) 15. Ekeland , I. , Témam , R.: Convex Analysis and Variational Problems . SIAM, Philadelphia ( 1999 ) 16. Elion , C. , Vese , L. : An image decomposition model using the total variation and the infinity Laplacian . Proc. SPIE 6498 , 64980W64980W - 10 ( 2007 ). doi: 10 .1117/12.716079 17. Esser , E.: Applications of Lagrangian-based alternating direction methods and connections to split Bregman . CAM Rep . 9 , 31 ( 2009 ) 18. Goldstein , T. , Osher , S.: The split Bregman algorithm method for L1-regularized problems . SIAM J. Imaging Sci. 2 , 323 - 343 ( 2009 ). doi:10.1137/080725891 19. Grant , M. , Boyd , S. : CVX: Matlab software for disciplined convex programming , version 2 .1, ( 2014 ) 20. Hansen , P. : Discrete inverse problems . Soc. Ind. Appl . Math. ( 2010 ). doi:10.1137/1 .9780898718836 21. He , B. , Hou , L. , Yuan , X. : On full jacobian decomposition of the augmented lagrangian method for separable convex programming, preprint ( 2013 ) 22. He , B. , Tao , M. , Yuan , X. : Convergence rate and iteration complexity on the alternating direction method of multipliers with a substitution procedure for separable convex programming , Math. Oper. Res., under revision 2 ( 2012 ) 23. Hintermüller , M. , Stadler , G. : An infeasible primal-dual algorithm for total bounded variation-based inf-convolution-type image restoration . SIAM J. Sci. Comput . 28 ( 1 ), 1 - 23 ( 2006 ). doi:10.1137/ 040613263 24. Huber , P.J.: Robust regression: asymptotics, conjectures and monte carlo . Ann. Stat. 1 , 799 - 821 ( 1973 ). http://www.jstor.org/stable/ 2958283 25. Kim , Y. , Vese , L. : Image recovery using functions of bounded variation and Sobolev spaces of negative differentiability . Inverse Probl Imaging 3 , 43 - 68 ( 2009 ). doi: 10 .3934/ipi. 2009 . 3 . 43 26. Kuijper , A. : P-Laplacian driven image processing . IEEE Int. Conf. Image Process . 5 , 257 - 260 ( 2007 ). doi: 10 .1109/ICIP. 2007 . 4379814 27. Lefkimmiatis , S. , Bourquard , A. , Unser , M. : Hessian-based norm regularization for image restoration with biomedical applications . IEEE Trans. Image Process . 21 , 983 - 995 ( 2012 ). doi: 10 .1109/TIP. 2011 .2168232 28. Lindqvist , P.: Notes on the p-Laplace equation , University of Jyvaskyla ( 2006 ) 29. Lysaker , M. , Lundervold , A. , Tai , X.C. : Noise removal using fourth-order partial differential equation with applications to medical magnetic resonance images in space and time . IEEE Trans. Image Process . 12 ( 12 ), 1579 - 1590 ( 2003 ). doi: 10 .1109/TIP. 2003 . 819229 30. Lysaker , M. , Tai , X.C. : Iterative image restoration combining total variation minimization and a second-order functional . Int. J. Comput. Vis . 66 ( 1 ), 5 - 18 ( 2006 ). doi:10.1007/s11263-005-3219-7 31. Nien , H. , Fessler , J.A. : A convergence proof of the split Bregman method for regularized least-squares problems , arXiv: 1402 .4371, preprint ( 2014 ) 32. Osher , S. , Burger , M. , Goldfarb , D. , Xu , J. , Yin , W.: An iterative regularization method for total variation based image restoration . SIAM Multisc. Model. Simul . 4 , 460 - 489 ( 2005 ). doi:10.1137/ 040605412 33. Papafitsoros , K. , Bredies , K. : A study of the one dimensional total generalised variation regularisation problem . Inverse Probl. Imaging 9 , 511 - 550 ( 2015 ). doi: 10 .3934/ipi. 2015 . 9 . 511 34. Papafitsoros , K. , Schönlieb , C.B. : A combined first and second order variational approach for image reconstruction . J. Math. Imaging Vis . 48 ( 2 ), 308 - 338 ( 2014 ). doi:10.1007/s10851-013-0445-4 35. Papafitsoros , K. , Valkonen , T. : Asymptotic behaviour of total generalised variation . Scale Space and Variational Methods in Computer Vision 9087 , 702 - 714 ( 2015 ) 36. Papoutsellis , E.: First-order gradient regularisation methods for image restoration. reconstruction of tomographic images with thin structures and denoising piecewise affine images , Ph.D. thesis , University of Cambridge, Cambridge ( 2015 ) 37. Ring , W. : Structural properties of solutions to total variation regularisation problems . ESAIM Math. Model. Numer. Anal . 34 , 799 - 810 ( 2000 ). doi: 10 .1051/m2an: 2000104 38. Rudin , L. , Osher , S. , Fatemi , E.: Nonlinear total variation based noise removal algorithms . Phys. D: Nonlinear Phenom . 60 , 259 - 268 ( 1992 ). doi: 10 .1016/ 0167 - 2789 ( 92 ) 90242 -F 39. Vese , L. : A study in the BV space of a denoising-deblurring variational problem . Appl. Math. Optim . 44 ( 2 ), 131 - 161 ( 2001 ) 40. Vogel , C. , Oman , M. : Iterative methods for total variation denoising . SIAM J. Sci. Comput . 17 ( 1 ), 227 - 238 ( 1996 ). doi:10.1137/ 0917016 41. Wang , Z. , Bovik , A.C. , Sheikh , H.R. , Simoncelli , E.P. : Image quality assessment: from error visibility to structural similarity . IEEE Trans. Image Process . 13 , 600 - 612 ( 2004 ). doi: 10 .1109/TIP. 2003 . 819861


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007%2Fs10851-015-0624-6.pdf

Martin Burger, Konstantinos Papafitsoros, Evangelos Papoutsellis, Carola-Bibiane Schönlieb. Infimal Convolution Regularisation Functionals of BV and \(\varvec{\mathrm {L}}^{\varvec{p}}\) Spaces, Journal of Mathematical Imaging and Vision, 2016, 343-369, DOI: 10.1007/s10851-015-0624-6