Equivariant analytical mapping of first principles Hamiltonians to accurate and transferable materials models
www.nature.com/npjcompumats
ARTICLE
OPEN
Liwei Zhang1, Berk Onat2, Geneviève Dusson3, Adam McSloy2, G. Anand4, Reinhard J. Maurer5, Christoph Ortner1 and
James R. Kermode2 ✉
We propose a scheme to construct predictive models for Hamiltonian matrices in atomic orbital representation from ab initio data
as a function of atomic and bond environments. The scheme goes beyond conventional tight binding descriptions as it represents
the ab initio model to full order, rather than in two-centre or three-centre approximations. We achieve this by introducing an
extension to the atomic cluster expansion (ACE) descriptor that represents Hamiltonian matrix blocks that transform equivariantly
with respect to the full rotation group. The approach produces analytical linear models for the Hamiltonian and overlap matrices.
Through an application to aluminium, we demonstrate that it is possible to train models from a handful of structures computed
with density functional theory, and apply them to produce accurate predictions for the electronic structure. The model generalises
well and is able to predict defects accurately from only bulk training data.
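The equivariance requirement and the linear form of the model can be written compactly. The following is a sketch in assumed notation that does not appear in the text above: Q is a rotation, R the atomic environment, D^ℓ(Q) the Wigner D-matrix for angular momentum ℓ, B_k equivariant basis functions and c_k fitted coefficients. The placement of conjugates depends on the convention chosen for the spherical harmonics.

```latex
% Equivariance of an off-diagonal Hamiltonian block coupling
% orbitals of angular momenta \ell_1 (atom I) and \ell_2 (atom J):
H_{IJ}^{\ell_1 \ell_2}(Q\mathbf{R})
  = D^{\ell_1}(Q)\, H_{IJ}^{\ell_1 \ell_2}(\mathbf{R})\, D^{\ell_2}(Q)^{\dagger}

% A linear model inherits this property if every basis function obeys it:
H_{IJ}^{\ell_1 \ell_2}(\mathbf{R})
  \approx \sum_{k} c_k\, B_k^{\ell_1 \ell_2}(\mathbf{R}),
\qquad
B_k^{\ell_1 \ell_2}(Q\mathbf{R})
  = D^{\ell_1}(Q)\, B_k^{\ell_1 \ell_2}(\mathbf{R})\, D^{\ell_2}(Q)^{\dagger}
```

Because the coefficients c_k enter linearly, fitting them to ab initio matrix blocks reduces to a linear least-squares problem under these assumptions.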
npj Computational Materials (2022) 8:158; https://doi.org/10.1038/s41524-022-00843-2
INTRODUCTION
The availability of accurate and highly efficient interatomic
potentials is crucial for the atomistic simulation of materials
phenomena with intrinsic length and time scales inaccessible to
first principles electronic structure theory. Examples in materials
science include failure processes such as crack propagation1 and
chemical dynamics at reactive surfaces2. The advent of machine-learning-based interatomic potentials (MLIPs) has meant that high-fidelity interatomic potentials based on Kohn–Sham density
functional theory (KS-DFT) and beyond have become much more
widely available3–5. Yet, the effort to generate MLIPs that are both
transferable and accurate is still significant and heavily depends on
the configurational space spanned by the underlying training data
set6. Very few MLIPs have been reported that are able to capture
different materials phases, surface terminations, and the effects of
complex defects on the stability and structure of the material5,7,8.
More importantly, MLIPs and conventional interatomic potentials fundamentally neglect explicit electronic degrees of freedom
of molecules and materials, thereby removing access to the
simulation of observables beyond structure and stability, such as
electric conductivity and optical response, which depend on the
electronic subsystem and electron–phonon coupling. While the
ability to predict optical and electronic properties is desirable, the
inclusion of electronic degrees of freedom will likely also benefit
the transferability of MLIPs.
For decades, semi-empirical and tight-binding (TB) models of
electronic structure have sought to combine the efficiency of
interatomic potentials with the explicit description of electrons. A
plethora of approaches based on two-centre and three-centre
integral approximations have led to established method frameworks such as the AM1 and PM3 methods9,10, the density
functional tight-binding (DFTB) method11,12, the Sankey–Niklewski
approach as implemented in the FIREBALL code13,14, and the xTB
approach15. Unfortunately, the rigid mathematical form of the
integral tabulations in most approaches means that TB parametrizations are limited in accuracy and often do not transfer beyond
the materials classes for which they were originally intended.
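To illustrate the rigid two-centre form discussed above, here is a minimal sketch of a Slater–Koster-style p-p block in a Cartesian (p_x, p_y, p_z) basis. It is an illustration only, not the parametrization of any of the codes cited: the function names and the numerical values of `v_sigma` and `v_pi` are hypothetical, and in a real tabulation these bond integrals would depend on the interatomic distance. The check at the end confirms that the block rotates with the bond vector, which is the equivariance property a more flexible model must also preserve.

```python
import numpy as np


def sk_pp_block(r, v_sigma, v_pi):
    """Two-centre p-p block for bond vector r in a Cartesian p basis.

    v_sigma and v_pi are hypothetical bond integrals; in practice they
    would be tabulated as functions of the bond length |r|.
    """
    r = np.asarray(r, dtype=float)
    u = r / np.linalg.norm(r)          # unit vector along the bond
    proj = np.outer(u, u)              # projector onto the bond direction
    return v_sigma * proj + v_pi * (np.eye(3) - proj)


def rotation_about_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


r = np.array([1.0, 2.0, 0.5])          # arbitrary bond vector
Q = rotation_about_z(0.7)              # arbitrary rotation

H = sk_pp_block(r, v_sigma=1.3, v_pi=-0.4)
H_rot = sk_pp_block(Q @ r, v_sigma=1.3, v_pi=-0.4)

# Equivariance: rotating the bond rotates the block as Q H Q^T.
equivariant = np.allclose(H_rot, Q @ H @ Q.T)
```

The block has only two free numbers per bond length, which is the source of both the efficiency and the limited flexibility of such two-centre tabulations.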
As ML methods make inroads across a diverse range of molecular
simulation workflows16, approaches beyond MLIPs are being
pursued that incorporate electronic properties. For molecules, Li
et al. have proposed a neural-network-based parametrization
pipeline for DFTB17, while Stoehr et al. have proposed deep tensor
neural networks (DTNNs) to construct beyond-pairwise repulsion
potentials18. Qiao et al. have shown that the use of symmetry-adapted atomic-orbital features can significantly improve transferability and prediction accuracy of molecular stability19.
In the realm of condensed phase materials, the automated
construction of tight-binding models from ab initio data has been a
topic of great interest as it can benefit high-throughput materials
screening studies20. Most commonly, electronic structure simulations
of materials are performed in non-atom-centred basis representations such as the pseudopotential plane wave framework, which is
not easily amenable to the construction of TB models. TB
Hamiltonians are typically constructed via transformation into a
maximally localised Wannier function representation21, which
provides a compact atom-centred basis representation with local
support22. It is also possible to fit Slater–Koster parameters directly to
DFT calculations in a data-driven fashion23,24. Materials simulations in
atom-centred orbital representations as provided by, for example,
the FHI-aims code25 are becoming more common, where Wannierization is not necessary and the basis representation provided by the
code is directly amenable to machine learning approaches based on
local representations of atomic neighbourhoods6. Examples of such
representations include Behler–Parrinello symmetry functions3,26, the
SOAP descriptor27 or the atomic cluster expansion28,29. First efforts towards
direct machine learning prediction of electronic structure have been
reported in the literature. For example, SchNOrb30 is a DTNN representation of molecular mean-field electronic structure Hamiltonians, which
1Department of Mathematics, University of British Columbia, 1984 Mathematics Road, Vancouver, BC V6T 1Z2, Canada. 2Warwick Centre for Predictive Modelling, School of
Engineering, University of Warwick, Coventry CV4 7AL, UK. 3Laboratoire de Mathématiques, UMR CNRS 6623, Université Bourgogne Franche-Comté, 16 route de Gray, 25030
Besançon, France. 4Department of Metallurgy and Materials Engineering, Indian Institute of Engineering Science and Technology-Shibpur, Howrah, WB, India. 5Department of
Chemistry, University of Warwick, Coventry CV4 7AL, UK. ✉email:
Published in partnership with the Shanghai Institute of Ceramics of the Chinese Academy of Sciences
has been used to predict Hamiltonians in local atomic orbital and
optimised effective minimal basis representations for organic
molecules including up to 13 heavy atoms30,31. Hegde and Bowen32
employed kernel ridge regression with a bispectrum representation33
for an analytical representation of a minimal basis DFT Hamiltonian
for bulk copper and diamond. Equivariant parameterisations for
molecular systems along similar lines to what we describe here have
been reported, learning either from the Hamiltonian34 or from
wavefunctions and electronic densities35. These works apply linear or
nonlinear equivariant models, respec (...truncated)