Exponential ReLU Neural Network Approximation Rates for Point and Edge Singularities
Foundations of Computational Mathematics
https://doi.org/10.1007/s10208-022-09565-9
Carlo Marcati1,2 · Joost A. A. Opschoor1 · Philipp C. Petersen3 ·
Christoph Schwab1
Received: 23 October 2020 / Revised: 17 September 2021 / Accepted: 7 February 2022
© The Author(s) 2022
Abstract
In certain polytopal domains Ω, in space dimension d = 2, 3, we prove exponential
expressivity with stable ReLU Neural Networks (ReLU NNs) in H^1(Ω) for weighted
analytic function classes. These classes comprise in particular solution sets of source
and eigenvalue problems for elliptic PDEs with analytic data. Functions in these classes
are locally analytic on open subdomains D ⊂ Ω, but may exhibit isolated point
singularities in the interior of Ω or corner and edge singularities at the boundary ∂Ω.
The exponential approximation rates are shown to hold in space dimension d = 2 on
Lipschitz polygons with straight sides, and in space dimension d = 3 on Fichera-type
polyhedral domains with plane faces. The constructive proofs indicate that NN depth
and size increase poly-logarithmically with respect to the target NN approximation
accuracy ε > 0 in H^1(Ω). The results cover solution sets of linear, second-order
elliptic PDEs with analytic data and certain nonlinear elliptic eigenvalue problems with
analytic nonlinearities and singular, weighted analytic potentials as arise in electron
structure models. Here, the functions correspond to electron densities that exhibit
isolated point singularities at the nuclei.
Communicated by Endre Süli.
Corresponding author: Carlo Marcati
1 Seminar for Applied Mathematics, ETH Zürich, CH8092 Zürich, Switzerland
2 Present Address: Dipartimento di Matematica, Università degli Studi di Pavia, 27100 Pavia, Italy
3 Faculty of Mathematics and Research Network Data Science @ Uni Vienna, University of Vienna, 1090 Vienna, Austria
Keywords Neural networks · Finite element methods · Exponential convergence ·
Analytic regularity · Singularities · Electron structure
Mathematics Subject Classification 35Q40 · 41A25 · 41A46 · 65N30
1 Introduction
The application of deep neural networks (DNNs) as approximation architecture in
numerical solution methods of partial differential equations (PDEs), possibly on high-dimensional parameter and state spaces, has attracted increasing attention in recent years.
An incomplete list of recently proposed algorithmic approaches is [11, 45, 46, 52, 54]
and references therein. In these works, DNN-based approaches for the numerical
approximation of solutions of elliptic and parabolic boundary value problems are proposed. Two key ingredients in these approaches are: (a) use of DNNs as approximation
architecture for the numerical approximation of solutions (thus using DNNs in place
of, e.g., finite element, finite volume or finite difference methods), and (b) incorporation of a suitable weak form of the PDE of interest into the loss function of the DNN
training. For example, weak residuals, least squares or, for variational formulations
from continuum mechanics, total potential energies in variational principles [11] have
been proposed.
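As an illustration of ingredient (b), the following minimal sketch evaluates an energy-type loss for a one-hidden-layer ReLU network. Everything specific in it is an assumption made for illustration and is not taken from [11] or from the works cited above: the 1D model problem -u'' = f on (0, 1) with u(0) = u(1) = 0, the midpoint quadrature, the boundary penalty weight, and the helper names `net`, `net_dx`, `energy_loss` and `hat_params`.

```python
def relu(z):
    return max(z, 0.0)

def net(params, x):
    # One hidden layer: v(x) = sum_i c_i * relu(w_i * x + b_i).
    return sum(c * relu(w * x + b) for (w, b, c) in params)

def net_dx(params, x):
    # Derivative of the piecewise linear network away from its kinks.
    return sum(c * w for (w, b, c) in params if w * x + b > 0.0)

def energy_loss(params, f, n_quad=1000, penalty=100.0):
    # Total potential energy J(v) = int_0^1 (0.5 * v'(x)^2 - f(x) * v(x)) dx,
    # approximated by the midpoint rule, plus a penalty (an assumption of this
    # sketch) that enforces the boundary conditions v(0) = v(1) = 0.
    h = 1.0 / n_quad
    total = 0.0
    for k in range(n_quad):
        x = (k + 0.5) * h
        total += (0.5 * net_dx(params, x) ** 2 - f(x) * net(params, x)) * h
    boundary = net(params, 0.0) ** 2 + net(params, 1.0) ** 2
    return total + penalty * boundary

def hat_params(a):
    # Network realizing a times the hat function with peak value 0.5 at x = 0.5.
    return [(1.0, 0.0, a), (1.0, -0.5, -2.0 * a), (1.0, -1.0, a)]

f = lambda x: 1.0
for a in (0.1, 0.25, 0.5):
    print(a, energy_loss(hat_params(a), f))
```

For this toy family the energy is 0.5 a^2 - 0.25 a, so the loss is smallest at a = 0.25. A training algorithm would minimize the same quantity over all network parameters rather than over the single scale a.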
In the study of NNs as numerical methods for solving PDEs, three types
of errors are usually identified. After fixing an NN architecture and activation function, the
approximation error indicates how well the PDE solution can be approximated by
NNs with that architecture. An additional error is incurred when the NN must be
trained on only a finite amount of possibly corrupted data about the PDE solution.
This contribution to the overall error, which arises in particular where the given data are
uninformative, is the generalization error. It comes in addition to further errors
caused by the training algorithm, which can be called the optimization error. In this paper,
we study the approximation error of deep ReLU neural networks.
One condition for good performance of these computational approaches is that
the DNNs achieve a high rate of approximation uniformly over the solution set associated with the PDE under consideration. This is analogous to what has been found in
the mathematical convergence rate analysis of, e.g., finite element methods: convergence rate bounds are well known to be related, via stability and quasi-optimality, to
approximability of solution sets of PDEs from the finite element spaces under consideration. Since numerical solutions are (generally oblique) projections of the unknown
solution onto finite-dimensional subspaces, the convergence rates are naturally determined by approximation rates of the subspace families under consideration within the
regularity classes of the PDE. For elliptic boundary and eigenvalue problems, function
classes of (weighted) Sobolev or Besov type are well known to describe both solution
regularity and approximation rates.
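The link between quasi-optimality and approximability mentioned above can be made concrete with the classical Céa lemma. The notation below is an assumption introduced for illustration and does not appear in the text: a Hilbert space V, a bilinear form a with continuity constant C_a and coercivity constant α, a subspace V_h of V, and the Galerkin solution u_h in V_h.

```latex
% Assumed setting: a(u, v) = \ell(v) for all v \in V, and
% a(u_h, v_h) = \ell(v_h) for all v_h \in V_h \subset V, with
% |a(v, w)| \le C_a \|v\|_V \|w\|_V and a(v, v) \ge \alpha \|v\|_V^2.
\[
  \| u - u_h \|_V \;\le\; \frac{C_a}{\alpha} \, \inf_{v_h \in V_h} \| u - v_h \|_V .
\]
```

In this setting, any rate at which the best approximation error from V_h decays over the solution class carries over, up to the constant C_a/α, to the Galerkin error.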
For functions belonging to a smoothness space of finite differentiation order such as
continuously differentiable, Sobolev-regular, or Besov-regular functions on a bounded
domain, upper bounds for algebraic approximation rates by NNs were established for
example in [9, 10, 16, 32, 55, 57, 58]. Here, we mention only results that use
the ReLU activation function. Moreover, for PDEs, in particular in high-dimensional
domains, approximation rates of the solution that go beyond classical smoothness-based results were established in [5, 12, 26, 29, 51]. Again, we confine the list to
publications with approximation rates for NNs with the ReLU activation function
(referred to as ReLU NNs below).
In the present paper, we prove that exponential approximation rates are achieved by
deep ReLU NNs for weighted, analytic solution classes of linear and nonlinear elliptic
source and eigenvalue problems on polygonal and polyhedral domains. Mathematical
results on weighted analytic regularity [2, 6, 8, 17–20, 24, 35, 38, 39] imply that
these classes consist of functions that are analytic with possible corner, edge, and
corner-edge singularities.
In contrast to the previously mentioned approximation results for ReLU NNs, the
function class studied here is special in the sense that it admits extremely high regularity in most parts of the domain except for designated locations, i.e., the edges and
corners of a domain, where the regularity is assumed to be very low. An approximation scheme to realize the exponential approximation rates associated with analytic
regularity, therefore, hinges on a successful resolution of the singularities. We will see
that, in addition to emulating local polynomial approximation, the presented scheme
is strongly adapted to the potentially complex geometries of the underlying domains.
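To indicate how a ReLU NN can emulate polynomials at all, the following sketch implements a well-known sawtooth construction from the ReLU approximation literature (often attributed to Yarotsky) for the function x^2 on [0, 1]. This section does not state whether the proofs in this paper use exactly this variant, so the code should be read as an illustration only; the function names `hat`, `relu_square` and `max_error` are hypothetical.

```python
def relu(z):
    return max(z, 0.0)

def hat(x):
    # Hat function on [0, 1] with peak 1 at x = 0.5, written with three ReLUs.
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)

def relu_square(x, m):
    # f_m(x) = x - sum_{s=1}^{m} g_s(x) / 4^s, where g_s is the s-fold
    # composition of hat with itself. Each composition adds one layer,
    # so the depth grows linearly in m.
    g = x
    out = x
    for s in range(1, m + 1):
        g = hat(g)
        out -= g / 4.0 ** s
    return out

def max_error(m, n_grid=2001):
    # Maximum error against x^2 on a uniform grid of [0, 1].
    pts = [i / (n_grid - 1) for i in range(n_grid)]
    return max(abs(relu_square(x, m) - x * x) for x in pts)

for m in (1, 2, 4, 8):
    print(m, max_error(m))
```

The function f_m is the piecewise linear interpolant of x^2 on a uniform grid with 2^m intervals, so its maximum error is 4^(-(m+1)). The error therefore decays exponentially in the number of layers, which is the basic mechanism behind depth growing only logarithmically in the inverse accuracy for this building block.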
Our analysis provides, for the aforementioned functions, approximation errors in
Sobolev norms that decay exponentially in terms of (...truncated)