Theoretical guarantees for permutation-equivariant quantum neural networks

https://www.nature.com/articles/s41534-024-00804-1.pdf

Louis Schatzki 1,2,6 ✉, Martín Larocca 3,4,6, Quynh T. Nguyen 3,5, Frédéric Sauvage 3 and M. Cerezo 1 ✉

npj Quantum Information (2024) 10:12; https://doi.org/10.1038/s41534-024-00804-1

Despite the great promise of quantum machine learning models, there are several challenges one must overcome before unlocking their full potential. For instance, models based on quantum neural networks (QNNs) can suffer from excessive local minima and barren plateaus in their training landscapes. Recently, the nascent field of geometric quantum machine learning (GQML) has emerged as a potential solution to some of those issues. The key insight of GQML is that one should design architectures, such as equivariant QNNs, encoding the symmetries of the problem at hand. Here, we focus on problems with permutation symmetry (i.e., symmetry group $S_n$), and show how to build $S_n$-equivariant QNNs. We provide an analytical study of their performance, proving that they do not suffer from barren plateaus, quickly reach overparametrization, and generalize well from small amounts of data. To verify our results, we perform numerical simulations for a graph state classification task. Our work provides theoretical guarantees for equivariant QNNs, thus indicating the power and potential of GQML.

INTRODUCTION

Symmetry studies and formalizes the invariance of objects under some set of operations. A wealth of theory has gone into describing symmetries as mathematical entities through the concept of groups and representations. While the analysis of symmetries in nature has greatly improved our understanding of the laws of physics, the study of symmetries in data has only recently gained momentum within the framework of learning theory.
In the past few years, classical machine learning practitioners realized that models tend to perform better when constrained to respect the underlying symmetries of the data. This has led to the blossoming field of geometric deep learning [1–5], where symmetries are incorporated as geometric priors into the learning architectures, improving trainability and generalization performance [6–13].

The tremendous success of geometric deep learning has recently inspired researchers to import these ideas to the realm of quantum machine learning (QML) [14–16]. QML is a new and exciting field at the intersection of classical machine learning and quantum computing. By running routines on quantum hardware, and thus exploiting the exponentially large dimension of the Hilbert space, the hope is that QML algorithms can outperform their classical counterparts when learning from data [17].

The infusion of ideas from geometric deep learning to QML has been termed ‘geometric quantum machine learning’ (GQML) [18–24]. GQML leverages the machinery of group and representation theory [25] to build quantum architectures that encode symmetry information about the problem at hand. For instance, when the model is parametrized through a quantum neural network (QNN) [16,26–28], GQML indicates that the layers of the QNN should be equivariant under the action of the symmetry group associated with the dataset. That is, applying a symmetry transformation to the input of the QNN layers should be the same as applying it to their output.

One of the main goals of GQML is to create architectures that solve, or at least significantly mitigate, some of the known issues of standard symmetry non-preserving QML models [16]. For instance, it has been shown that the optimization landscapes of generic QNNs can exhibit a large number of local minima [29–32], or be prone to the barren plateau phenomenon [33–45], whereby the loss function gradients vanish exponentially with the problem size.
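As a rough illustration of the equivariance requirement described above (acting with a symmetry before a layer should match acting with it after the layer), the following NumPy sketch checks whether a layer unitary of the form exp(-iθH) commutes with every qubit swap. Since transpositions generate the symmetric group, passing this check for all pairs implies commutation with every permutation. The generators used here (a sum of Pauli X over all qubits, and Pauli X on the first qubit only), the helper names, and the angle are assumptions chosen for illustration and are not taken from the text.

```python
import numpy as np

# Single-qubit Pauli X and identity.
X = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def kron_all(ops):
    """Tensor product of a list of single-qubit operators."""
    out = np.array([[1.0 + 0j]])
    for op in ops:
        out = np.kron(out, op)
    return out


def swap_matrix(n, a, b):
    """Permutation matrix exchanging qubits a and b of an n-qubit register."""
    dim = 2 ** n
    P = np.zeros((dim, dim), dtype=complex)
    for idx in range(dim):
        # Read the bits of the basis state, qubit 0 being the most significant.
        bits = [(idx >> (n - 1 - q)) & 1 for q in range(n)]
        bits[a], bits[b] = bits[b], bits[a]
        jdx = sum(bit << (n - 1 - q) for q, bit in enumerate(bits))
        P[jdx, idx] = 1.0
    return P


def layer_unitary(H, theta):
    """exp(-i * theta * H) for Hermitian H, via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(H)
    return evecs @ np.diag(np.exp(-1j * theta * evals)) @ evecs.conj().T


def is_equivariant(H, theta=0.37):
    """True if exp(-i * theta * H) commutes with every qubit transposition."""
    n = int(np.log2(H.shape[0]))
    U = layer_unitary(H, theta)
    for a in range(n):
        for b in range(a + 1, n):
            P = swap_matrix(n, a, b)
            if not np.allclose(U @ P, P @ U):
                return False
    return True


n = 3
# Assumed permutation-symmetric generator: Pauli X summed over all qubits.
H_sym = sum(kron_all([X if q == j else I2 for q in range(n)]) for j in range(n))
# Assumed counter-example: Pauli X acting on the first qubit only.
H_loc = kron_all([X] + [I2] * (n - 1))

print("symmetric generator equivariant:", is_equivariant(H_sym))
print("local generator equivariant:", is_equivariant(H_loc))
```

The symmetric generator passes the check, while the single-qubit generator fails it because swapping the first qubit with another one moves the rotation to a different qubit. The check is done by brute force on dense matrices, so it is only practical for a handful of qubits.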
Crucially, it is known that barren plateaus and excessive local minima are connected to the expressibility [30,32,37,43,46] of the QNN, so that problem-agnostic architectures are more likely to exhibit trainability issues. In this sense, it is expected that following the GQML program of baking symmetry directly into the algorithm will lead to models with sharp inductive biases that suitably limit their expressibility and search space.

In this work, we leverage the GQML toolbox to create models that are permutation invariant, i.e., models whose outputs remain invariant under the action of the symmetric group $S_n$ (see Fig. 1). We focus on this particular symmetry as learning problems with permutation symmetries abound. Examples include learning over sets of elements [47,48], modeling relations between pairs (graphs) [49–54] or multiplets (hypergraphs) of entities [55–57], problems defined on grids (such as condensed matter systems) [58–61], molecular systems [62–64], evaluating genuine multipartite entanglement [65–68], or working with distributed quantum sensors [69–71].

Fig. 1 GQML embeds geometric priors into a QML model. Incorporating prior knowledge through $S_n$-equivariance heavily restricts the search space of the model. We show that such inductive biases lead to models that do not exhibit barren plateaus, can be efficiently overparametrized, and require small amounts of data to generalize well.

1 Information Sciences, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
2 Electrical and Computer Engineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA.
3 Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
4 Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
5 Harvard Quantum Initiative, Harvard University, Cambridge, MA 02138, USA.
6 These authors contributed equally: Louis Schatzki, Martín Larocca.
Published in partnership with The University of New South Wales

... will restrict to $L$-layered QNNs

$$\mathcal{U}_{\theta} = \mathcal{U}^{L}_{\theta_L} \circ \cdots \circ \mathcal{U}^{1}_{\theta_1}, \quad \text{where} \quad \mathcal{U}^{l}_{\theta_l}(\rho) = e^{-i\theta_l H_l}\, \rho\, e^{i\theta_l H_l}, \tag{1}$$

for some Hermitian generators $\{H_l\}$, so that $U(\theta) = \prod_{l=1}^{L} e^{-i\theta_l H_l}$. Moreover, we consider models that depend on a loss function of the form

$$\ell_{\theta}(\rho_i) = \mathrm{Tr}[\mathcal{U}_{\theta}(\rho_i)\, O], \tag{2}$$

where $O$ is a Hermitian observable. We quantify the training error via the so-called empirical loss, which is defined as

$$\widehat{\mathcal{L}}(\theta) = \sum_{i=1}^{M} c_i\, \ell_{\theta}(\rho_i). \tag{3}$$

Our first contribution is to provide guidelines to build unitary $S_n$-equivariant QNNs. We then derive rigorous theoretical guarantees for these architectures in terms of their trainability and generalization capabilities. Specifically, we prove that $S_n$-equivariant QNNs do not lead to barren plateaus, can be overparametrized with polynomially deep circuits, and generalize well with only a polynomial number of training points. We also identify problems (i.e., datasets) for which the model is trainable, as well as datasets that lead to untrainability. All these appealing properties are also demonstrated in numerical simulations of a graph classification task. Our empirical results verify our theoretical ones, and even show that the pe (...truncated)