Theoretical guarantees for permutation-equivariant quantum neural networks

Feb 2024

Despite the great promise of quantum machine learning models, there are several challenges one must overcome before unlocking their full potential. For instance, models based on quantum neural networks (QNNs) can suffer from excessive local minima and barren plateaus in their training landscapes. Recently, the nascent field of geometric quantum machine learning (GQML) has emerged as a potential solution to some of those issues. The key insight of GQML is that one should design architectures, such as equivariant QNNs, encoding the symmetries of the problem at hand. Here, we focus on problems with permutation symmetry (i.e., symmetry group Sn), and show how to build Sn-equivariant QNNs. We provide an analytical study of their performance, proving that they do not suffer from barren plateaus, quickly reach overparametrization, and generalize well from small amounts of data. To verify our results, we perform numerical simulations for a graph state classification task. Our work provides theoretical guarantees for equivariant QNNs, thus indicating the power and potential of GQML.

www.nature.com/npjqi
ARTICLE OPEN
npj Quantum Information (2024) 10:12; https://doi.org/10.1038/s41534-024-00804-1

Louis Schatzki 1,2,6 ✉, Martín Larocca 3,4,6, Quynh T. Nguyen 3,5, Frédéric Sauvage 3 and M. Cerezo 1 ✉

INTRODUCTION

The study of symmetry formalizes the invariance of objects under some set of operations. A wealth of theory has gone into describing symmetries as mathematical entities through the concepts of groups and representations. While the analysis of symmetries in nature has greatly improved our understanding of the laws of physics, the study of symmetries in data has only recently gained momentum within the framework of learning theory.
In the past few years, classical machine learning practitioners realized that models tend to perform better when constrained to respect the underlying symmetries of the data. This has led to the blossoming field of geometric deep learning [1–5], where symmetries are incorporated as geometric priors into the learning architectures, improving trainability and generalization performance [6–13].

The tremendous success of geometric deep learning has recently inspired researchers to import these ideas to the realm of quantum machine learning (QML) [14–16]. QML is a new and exciting field at the intersection of classical machine learning and quantum computing. By running routines on quantum hardware, and thus exploiting the exponentially large dimension of the Hilbert space, the hope is that QML algorithms can outperform their classical counterparts when learning from data [17].

The infusion of ideas from geometric deep learning into QML has been termed ‘geometric quantum machine learning’ (GQML) [18–24]. GQML leverages the machinery of group and representation theory [25] to build quantum architectures that encode symmetry information about the problem at hand. For instance, when the model is parametrized through a quantum neural network (QNN) [16,26–28], GQML indicates that the layers of the QNN should be equivariant under the action of the symmetry group associated with the dataset. That is, applying a symmetry transformation to the input of the QNN layers should be the same as applying it to their output.

One of the main goals of GQML is to create architectures that solve, or at least significantly mitigate, some of the known issues of standard, non-symmetry-preserving QML models [16]. For instance, it has been shown that the optimization landscapes of generic QNNs can exhibit a large number of local minima [29–32], or be prone to the barren plateau phenomenon [33–45], whereby the loss function gradients vanish exponentially with the problem size.
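The equivariance condition described above (applying a symmetry transformation before a layer gives the same result as applying it after) can be checked numerically on a small example. The following is a minimal sketch, not the authors' construction: it assumes n = 3 qubits, a hypothetical permutation-symmetric generator H = X_1 + X_2 + X_3, and helper names (`op_on`, `perm_matrix`, `layer`) introduced only for illustration. Because such an H commutes with every qubit permutation, the layer exp(-iθH) does too, whereas a layer generated by X on a single qubit does not.

```python
import itertools
import numpy as np

n = 3  # number of qubits (illustrative assumption)
X = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def op_on(single, j):
    """Embed a single-qubit operator on qubit j of the n-qubit register."""
    out = np.array([[1.0 + 0j]])
    for k in range(n):
        out = np.kron(out, single if k == j else I2)
    return out

def perm_matrix(perm):
    """Unitary representing a qubit permutation: qubit k is moved to perm[k]."""
    dim = 2 ** n
    P = np.zeros((dim, dim), dtype=complex)
    for bits in itertools.product([0, 1], repeat=n):
        new_bits = [0] * n
        for k, b in enumerate(bits):
            new_bits[perm[k]] = b
        src = int("".join(map(str, bits)), 2)
        dst = int("".join(map(str, new_bits)), 2)
        P[dst, src] = 1.0
    return P

def layer(H, theta):
    """exp(-i * theta * H) via the eigendecomposition of the Hermitian H."""
    evals, evecs = np.linalg.eigh(H)
    return (evecs * np.exp(-1j * theta * evals)) @ evecs.conj().T

theta = 0.37
H_sym = sum(op_on(X, j) for j in range(n))  # symmetric under all permutations
U_sym = layer(H_sym, theta)
U_loc = layer(op_on(X, 0), theta)           # acts on qubit 0 only

perms = list(itertools.permutations(range(n)))
sym_ok = all(np.allclose(U_sym @ perm_matrix(p), perm_matrix(p) @ U_sym) for p in perms)
loc_ok = all(np.allclose(U_loc @ perm_matrix(p), perm_matrix(p) @ U_loc) for p in perms)
print(sym_ok, loc_ok)
```

Under these assumptions the symmetric layer passes the commutation check for all six permutations, while the single-qubit layer fails it, since swapping qubits moves its generator to a different qubit.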
Crucially, it is known that barren plateaus and excessive local minima are connected to the expressibility [30,32,37,43,46] of the QNN, so that problem-agnostic architectures are more likely to exhibit trainability issues. In this sense, it is expected that following the GQML program of baking symmetry directly into the algorithm will lead to models with sharp inductive biases that suitably limit their expressibility and search space.

In this work, we leverage the GQML toolbox to create models that are permutation invariant, i.e., models whose outputs remain invariant under the action of the symmetric group Sn (see Fig. 1). We focus on this particular symmetry as learning problems with permutation symmetries abound. Examples include learning over sets of elements [47,48], modeling relations between pairs (graphs) [49–54] or multiplets (hypergraphs) of entities [55–57], problems defined on grids (such as condensed matter systems) [58–61], molecular systems [62–64], evaluating genuine multipartite entanglement [65–68], or working with distributed quantum sensors [69–71].

Our first contribution is to provide guidelines to build unitary Sn-equivariant QNNs. We then derive rigorous theoretical guarantees for these architectures in terms of their trainability and generalization capabilities. Specifically, we prove that Sn-equivariant QNNs do not lead to barren plateaus, can be overparametrized with polynomially deep circuits, and generalize well with only a polynomial number of training points. We also identify problems (i.e., datasets) for which the model is trainable, as well as datasets leading to untrainability. All these appealing properties are also demonstrated in numerical simulations of a graph classification task. Our empirical results verify our theoretical ones, and even show that the pe (...truncated)

Fig. 1 GQML embeds geometric priors into a QML model. Incorporating prior knowledge through Sn-equivariance heavily restricts the search space of the model. We show that such inductive biases lead to models that do not exhibit barren plateaus, can be efficiently overparametrized, and require small amounts of data to generalize well.

1 Information Sciences, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
2 Electrical and Computer Engineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.
3 Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
4 Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
5 Harvard Quantum Initiative, Harvard University, Cambridge, MA 02138, USA.
6 These authors contributed equally: Louis Schatzki, Martín Larocca.

Published in partnership with The University of New South Wales

(...) will restrict to L-layered QNNs

$$\mathcal{U}_{\theta} = \mathcal{U}^{L}_{\theta_L} \circ \cdots \circ \mathcal{U}^{1}_{\theta_1}, \quad \text{where} \quad \mathcal{U}^{l}_{\theta_l}(\rho) = e^{-i\theta_l H_l} \rho\, e^{i\theta_l H_l}, \tag{1}$$

for some Hermitian generators $\{H_l\}$, so that $U(\theta) = \prod_{l=1}^{L} e^{-i\theta_l H_l}$. Moreover, we consider models that depend on a loss function of the form

$$\ell_{\theta}(\rho_i) = \mathrm{Tr}[\mathcal{U}_{\theta}(\rho_i) O], \tag{2}$$

where $O$ is a Hermitian observable. We quantify the training error via the so-called empirical loss, defined as

$$\widehat{\mathcal{L}}(\theta) = \sum_{i=1}^{M} c_i \ell_{\theta}(\rho_i). \tag{3}$$
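The L-layered QNN, the loss Tr[U_θ(ρ_i) O], and the empirical loss given as a weighted sum over M training states can be sketched numerically as follows. This is only an illustration under stated assumptions, not the authors' implementation: the sign convention exp(-iθH) ρ exp(+iθH) is assumed, and the two-qubit generators, the observable `O`, the training states, and the coefficients `coeffs` are all hypothetical choices made so the example is small enough to check by hand.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# Hypothetical Hermitian generators H_l on n = 2 qubits (illustration only).
generators = [
    np.kron(X, I2) + np.kron(I2, X),  # H_1 = X_1 + X_2
    np.kron(Z, Z),                    # H_2 = Z_1 Z_2
]
O = np.kron(Z, Z)  # hypothetical Hermitian observable

def layer_unitary(H, theta):
    """exp(-i * theta * H) via the eigendecomposition of the Hermitian H."""
    evals, evecs = np.linalg.eigh(H)
    return (evecs * np.exp(-1j * theta * evals)) @ evecs.conj().T

def qnn_unitary(thetas):
    """Product of the layer unitaries, with layer 1 applied first."""
    U = np.eye(generators[0].shape[0], dtype=complex)
    for H, theta in zip(generators, thetas):
        U = layer_unitary(H, theta) @ U
    return U

def loss(thetas, rho):
    """Single-state loss Tr[U rho U^dagger O]."""
    U = qnn_unitary(thetas)
    return float(np.real(np.trace(U @ rho @ U.conj().T @ O)))

def empirical_loss(thetas, states, coeffs):
    """Weighted sum of single-state losses over the training set."""
    return sum(c * loss(thetas, rho) for c, rho in zip(coeffs, states))

def basis_state(index, dim=4):
    """Density matrix of a computational basis state."""
    rho = np.zeros((dim, dim), dtype=complex)
    rho[index, index] = 1.0
    return rho

states = [basis_state(0), basis_state(1)]  # |00><00| and |01><01|
coeffs = [-0.5, 0.5]                       # hypothetical coefficients c_i

value_at_zero = empirical_loss([0.0, 0.0], states, coeffs)
value_generic = empirical_loss([0.3, 1.1], states, coeffs)
print(value_at_zero, value_generic)
```

At θ = (0, 0) the circuit is the identity, so the two single-state losses are the Z⊗Z expectation values +1 and -1, and the empirical loss evaluates to -1 with these coefficients. In practice the parameters would be passed to an optimizer, which is omitted here.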


This is a preview of a remote PDF: https://www.nature.com/articles/s41534-024-00804-1.pdf
Article home page: https://www.nature.com/articles/s41534-024-00804-1

Schatzki, Louis, Larocca, Martín, Nguyen, Quynh T., Sauvage, Frédéric, Cerezo, M. Theoretical guarantees for permutation-equivariant quantum neural networks, DOI: 10.1038/s41534-024-00804-1