Sage Journals: Discover world-class research

Abstract

Crystal structures are naturally represented as graphs, making Graph Neural Networks (GNNs) a powerful tool for capturing complex atomic interactions and geometric relationships. This review summarizes recent advances in GNNs-based representation learning for crystal materials, with a specific focus on addressing the critical challenge of passive symmetry. We critically analyze existing frameworks by classifying them into asymmetric and symmetric paradigms, evaluating how they address periodic invariance and geometric completeness through graph construction and architectural design. We also compile key datasets and benchmarks to provide a systematic performance comparison of representative models. Finally, we discuss unresolved challenges, including the trade-off between architectural rigor and computational efficiency, modeling complex non-ideal systems, and predicting high-order tensorial properties, highlighting promising directions for future research.

Keywords

Graph neural networks, passive symmetry in crystals, crystal material representation learning, deep learning, computational materials science

1. Introduction

Accurate property prediction of crystalline materials is critical for industries such as semiconductors and renewable energy.^1–7 Since properties are determined by crystal structures, characterizing these structures effectively is a central research focus.^8–10 Conventional trial-and-error methods are often slow, costly, and labor-intensive,¹¹ limiting the efficient discovery of new materials. To overcome these issues, computational simulations have become essential.¹² Methods such as quantum mechanics-based density functional theory (DFT),^13,14 molecular dynamics simulations,^15,16 and finite element analysis^17,18 have demonstrated effectiveness in predicting the physical and chemical properties of materials and guiding experimental design. However, these simulation methods often struggle to model complex and dynamic material environments, and their high computational complexity, reaching up to O(n³) or even O(n⁷),¹⁹ limits their practicality in new materials development.

Recently, machine learning (ML) has advanced rapidly in materials science,^20–28 giving rise to material representation learning techniques.²⁹ Unlike traditional simulation-based approaches, ML-based methods can rapidly predict crystal properties by automatically learning latent features.^4,30–32 This reduces the reliance on domain experts while improving computational efficiency and flexibility.⁵ Moreover, these methods can capture complex nonlinear relationships in the high-dimensional materials data that are often beyond the reach of conventional techniques.³³ However, standard ML methods often fail to fully capture material microstructures, particularly regarding the passive symmetry of crystals.

To address this, GNNs have recently been introduced into materials science.^34–40 GNNs are well-suited for processing graph-structured data,⁴¹ and since chemical structures such as molecules and crystals can be naturally represented as graphs, they offer a natural framework for modeling atomic interactions. By learning representations of atoms (nodes) and bonds (edges), GNNs capture material patterns more effectively than traditional ML schemes.^3,42,43 Consequently, GNNs have become a powerful tool for property prediction. Yet, designing GNNs that strictly respect crystal passive symmetry remains a major challenge.⁴⁴ Capturing passive symmetry is fundamentally important because it ensures that the predicted material properties remain invariant under coordinate transformations (such as rotation and translation), thereby adhering to the intrinsic physical laws of the crystalline state. Without explicit symmetry constraints, models are forced to relearn these invariances from vast amounts of data, often resulting in data inefficiency and poor generalization to unseen structures. While recent solutions have made progress, balancing computational efficiency with rigorous symmetry handling remains an active area of research.

Alongside the advances in macroscopic property prediction, the domain of Machine Learning Interatomic Potentials (MLIPs) for molecular dynamics has been revolutionized by a distinct class of strictly equivariant architectures. Foundational frameworks, such as e3nn,^45–48 have catalyzed the emergence of state-of-the-art models, notably NequIP,^49,50 MACE,^51,52 and Allegro.^50,53,54 Unlike descriptor-based models that rely solely on scalar distances or angles, these architectures utilize irreducible representations (irreps) of the O(3) group and spherical harmonics to perform message passing directly on geometric tensors. While this tensor-based approach elegantly preserves directional information and achieves unprecedented local accuracy for microscopic forces and energies, it introduces substantial computational overhead due to expensive tensor products. Although some pioneering macroscopic frameworks have recently begun integrating similar tensor product operations to rigorously capture directional information and achieve geometric completeness, scaling these operations across infinite periodic graphs remains highly computationally demanding.

Therefore, rather than focusing on microscopic MLIPs, this review restricts its scope to computationally scalable GNNs frameworks designed for macroscopic properties. It analyzes the issue of passive symmetry in crystal materials and surveys various GNNs-based frameworks within this scope. We also compile key datasets and benchmarks to provide a systematic performance comparison of representative models. Finally, we discuss unresolved challenges and outline future directions to further enhance the symmetry-awareness and utility of GNNs in materials science.

2. Passive symmetry challenge

Accurately capturing passive symmetries is a central challenge in predicting crystalline material properties.²⁹ These symmetries fundamentally encompass SE(3) invariance, SO(3) equivariance, and periodic invariance. This review focuses on GNNs-based methods that learn from geometric inputs, such as atomic coordinates and interatomic angles. In this section, we introduce the three principal passive symmetries of crystals and further discuss the critical concept of geometric completeness, which addresses the distinction between rigid-body symmetry and reflection invariance (chirality).

2.1. SO(3) equivariance of crystal materials

The SO(3) group comprises all rotations in three-dimensional space. A unit cell is the smallest periodically repeating unit of a crystal. For a crystalline material, SO(3) equivariance implies a specific consistency constraint: if the crystal’s unit cell is rotated, any vector or tensor properties derived from it (such as interatomic forces or dielectric tensors) must rotate accordingly.⁵⁵

In materials representation learning, this property ensures that predictions depend only on the material’s structure, not on its orientation in the coordinate system. This is vital for modeling directional properties, such as band structures or mechanical deformations. By enforcing SO(3) equivariance, models avoid errors caused by arbitrary coordinate choices, thereby improving predictive accuracy. Figure 1 illustrates this concept, showing an original unit cell (a) and its SO(3) rotated counterpart (b), which corresponds to a 270° rotation around the yellow axis as indicated by the pink arrow.

Figure 1.

SE(3) and SO(3) operations of crystal materials.

We describe the structure of a crystal material using its lattice basis vectors and the unit cell they form, following the definition by Yan et al.^19,56 For a unit cell with N atoms, the crystalline material S formed by it can be expressed as S = (t, P, A). Here, t ∈ ℤ^N is the atomic type vector, whose n-th element t_n is the atomic number of the n-th atom in the unit cell and thereby indicates the type of each atom; P = [p₁, p₂, ..., p_n, ..., p_N] ∈ ℝ^(3×N) is the atomic coordinate matrix, whose n-th column p_n gives the position of the n-th atom in the unit cell; and A = [a₁, a₂, a₃] ∈ ℝ^(3×3) is the lattice basis vector matrix, whose three columns describe how the unit cell repeats periodically in three-dimensional space to form a crystal. The mathematical definition of the SO(3) equivariance of the crystalline material is as follows.

Definition 1

For a crystal material S = (t, P, A), where P = [p₁, p₂, ..., p_n, ..., p_N] ∈ ℝ^(3×N) and A = [a₁, a₂, a₃] ∈ ℝ^(3×3), let R ∈ SO(3) be a rotation matrix that rotates the positions of all atoms in the crystal material together with the lattice matrix A. Specifically, the rotation transformation of the crystal material is defined as:

RS = (t, RP, RA).    (1)

The SO(3) equivariance requirement states that after any rotation transformation, the representation of the crystal material and its corresponding physical and chemical properties should change correspondingly. That is, for any rotation matrix R ∈ SO(3), there is:

RΦ(t, P, A) = Φ(t, RP, RA),    (2)

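where Φ(·) is the mapping from the structure of the crystal material to its macroscopic properties. Equation (2) is written for a vector-valued property; a rank-2 tensor property, such as the dielectric tensor, transforms as RΦ(t, P, A)Rᵀ.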
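Equation (2) can be checked numerically on a toy example. The sketch below is illustrative only: `toy_vector_property` is a hypothetical stand-in for Φ (it is not a model from the literature, ignores the lattice, and is not periodic invariant), and the two-atom coordinates are made-up values. It only demonstrates what the equivariance test looks like in practice, using the 270° rotation mentioned for Figure 1.

```python
import numpy as np

def rotation_z(theta):
    """Rotation matrix about the z-axis (an element of SO(3))."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def toy_vector_property(t, P, A):
    """Hypothetical vector-valued Phi: type-weighted offsets from the cell centroid."""
    centered = P - P.mean(axis=1, keepdims=True)  # 3 x N
    return centered @ t.astype(float)             # shape (3,)

t = np.array([11, 17])                              # atomic numbers (assumed example)
P = np.array([[0.0, 2.8], [0.0, 0.3], [0.0, 0.1]])  # 3 x N atomic coordinates
A = 5.6 * np.eye(3)                                 # lattice basis vectors as columns
R = rotation_z(np.deg2rad(270))

lhs = R @ toy_vector_property(t, P, A)      # R Phi(t, P, A)
rhs = toy_vector_property(t, R @ P, R @ A)  # Phi(t, RP, RA)
equivariant = np.allclose(lhs, rhs)
```

The same pattern (transform the input, compare against the transformed output) can serve as a unit test for any candidate model in place of the toy function.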

2.2. SE(3) invariance of crystal materials

The Special Euclidean group, SE(3), consists of all translation and rotation operations in three-dimensional space. It is a subgroup of the Euclidean group, E(3), which includes translations, rotations, and reflections. The key distinction is that SE(3) excludes reflections. Furthermore, the SO(3) group (discussed in Section 2.1) is a subgroup of SE(3) containing only rotations. Due to its coverage of all translation and rotation transformations, SE(3) is crucial in domains like computer graphics,^57–59 materials science,^19,60–62 and robot kinematics.⁶³

SE(3) invariance dictates that rigid transformations (rotations and translations) of the unit cell must not alter the crystal’s intrinsic scalar properties. Quantities such as formation energy, band gap, and total energy must remain constant regardless of the crystal’s position or orientation in space.^19,61 Critically, strictly distinguishing SE(3) from E(3) is essential for geometric completeness. E(3) invariance inherently enforces reflection invariance. Consequently, models built on E(3) invariance cannot distinguish chiral structures from their mirror images. To address this, an ideal framework should strictly satisfy SE(3) invariance, handling rigid transformations, while maintaining sensitivity to chirality. In summary, SE(3) invariance ensures that scalar physical properties remain unaffected by rigid motions, whereas SO(3) equivariance ensures that vector or tensor properties transform consistently with the structure. Figure 1(b) and (c) illustrate the SE(3) operations of rotating the unit cell (specifically, a 270° rotation around the yellow axis indicated by the pink arrow) and translating it (along the direction indicated by the purple arrow), respectively.

The mathematical definition of the SE(3) invariance of crystal materials is as follows.

Definition 2

For a crystal material S = (t, P, A) and any operation g = (R, u) ∈ SE(3), where R ∈ SO(3) is a rotation matrix and u ∈ ℝ³ is a translation vector added to every column of P, the action of g on S is given by

gS = (t, RP + u, RA).    (3)

Then the mathematical definition of SE(3) invariance is:

∀ g = (R, u) ∈ SE(3):  Φ(S) = Φ(gS),  i.e.,  Φ(t, P, A) = Φ(t, RP + u, RA).    (4)
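As a rough numerical illustration of Equation (4), the sketch below applies a random rigid motion to a small structure and compares a scalar output before and after. `toy_scalar_property` is a hypothetical Φ built only from intra-cell distances and lattice vector lengths; it is an assumption made for illustration, not a physical model, and the coordinates are random.

```python
import numpy as np

def random_rotation(rng):
    """Draw an orthogonal matrix via QR and force det = +1 so it lies in SO(3)."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1.0
    return Q

def toy_scalar_property(t, P, A):
    """Hypothetical scalar Phi from pairwise distances and lattice vector lengths."""
    diff = P[:, :, None] - P[:, None, :]  # 3 x N x N displacement vectors
    dist = np.linalg.norm(diff, axis=0)   # N x N distances
    return float(dist.sum() + np.linalg.norm(A, axis=0).sum() + t.sum())

rng = np.random.default_rng(0)
t = np.array([11, 17])
P = rng.normal(size=(3, 2))
A = 5.6 * np.eye(3)
R = random_rotation(rng)
u = rng.normal(size=3)

phi_original = toy_scalar_property(t, P, A)
# g S = (t, RP + u, RA): u is broadcast onto every column of P
phi_moved = toy_scalar_property(t, R @ P + u[:, None], R @ A)
```

Because the toy function depends only on distances and lengths, the two values agree up to floating-point error; a function of raw coordinates would not pass this check.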

2.3. Periodic invariance of crystal materials

Periodic invariance implies that a crystal’s physical properties are independent of how its unit cell is defined. Since a crystal is an infinite periodic structure, there are infinite ways to select a repeating unit cell to represent it. While these selections change the numerical coordinates and lattice vectors, they describe the exact same physical material.^56,64 Therefore, a robust model must produce identical predictions regardless of this mathematical choice.

Formally, we define this principle as follows:

Definition 3

Periodic invariance requires that for any two valid representations (t, P, A) and (t*, P*, A*) of the same crystal, the following holds:

Φ(t, P, A) = Φ(t*, P*, A*),    (5)

where (t*, P*, A*) is obtained from a different choice of unit cell (for example, a shifted origin, a different lattice basis, or a supercell).

Ensuring this invariance is vital for practical applications. Without it, a model might predict contradictory properties for the same material simply because the input structure was defined differently. This consistency is essential not only for static property prediction but also for tasks involving dynamic transformations and stress analysis.
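A minimal sketch of Equation (5) for the simplest possible case, a one-atom simple cubic crystal, is given below. The same lattice is described by two different bases (the second obtained by multiplying with an integer matrix `M` of determinant 1, an assumed example), and the multiset of neighbor distances within a cutoff is compared. The helper `neighbor_distances` and the image range `n_images` are illustrative choices; the image range must be large enough to cover the cutoff for skewed cells.

```python
import itertools
import numpy as np

def neighbor_distances(P, A, radius, n_images=3):
    """Sorted distances from atom 0 to all periodic images within `radius`."""
    out = []
    for n in itertools.product(range(-n_images, n_images + 1), repeat=3):
        shift = A @ np.array(n, dtype=float)  # lattice translation n1*a1 + n2*a2 + n3*a3
        for j in range(P.shape[1]):
            d = np.linalg.norm(P[:, j] + shift - P[:, 0])
            if 1e-8 < d <= radius:            # skip the atom itself
                out.append(d)
    return np.sort(np.array(out))

A = np.eye(3)                                     # simple cubic cell, a = 1
M = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 1]])   # integer matrix with det = 1
A_star = A @ M                                    # skewed basis, same lattice
P = np.zeros((3, 1))                              # one atom at the origin

d_orig = neighbor_distances(P, A, radius=1.5)
d_star = neighbor_distances(P, A_star, radius=1.5)
same_environment = d_orig.shape == d_star.shape and np.allclose(d_orig, d_star)
```

Both descriptions yield the same 18 neighbors (6 at distance 1 and 12 at distance √2), which is the kind of representation-independent input a periodic-invariant model needs.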

Figure 2 illustrates this principle using a two-dimensional example. The colored boxes represent different valid ways of selecting a unit cell from the same atomic arrangement (colored circles). While the unit cell selection alters the mathematical description, the underlying crystal structure and its properties remain unchanged.

Figure 2.

Periodic transformation of crystal materials.

2.4. Geometric completeness and chirality

While SE(3) invariance, SO(3) equivariance, and periodic invariance form the fundamental passive symmetries of crystals, a critical issue arises when considering the completeness of the representation, particularly regarding chirality. This challenge stems from the distinction between the SE(3) group and the E(3) group.

Most conventional crystal descriptors, such as bond lengths and angles, are scalars. Since scalars do not change under reflection, models relying on them are inherently E(3)-invariant. While this ensures stability, it creates a flaw: the model cannot distinguish a crystal from its mirror image (enantiomer), as both generate identical graphs. This is problematic for chiral crystals, where left-handed and right-handed structures are distinct physical entities with potentially different properties.

Therefore, a rigorous representation learning framework must satisfy geometric completeness. This concept requires the model to be strictly SE(3)-invariant while remaining sensitive to reflection. In simple terms, geometric completeness ensures the mapping from structure to representation is unique: distinct structures must map to distinct graphs. This property allows the model to avoid the “false symmetry” of reflection invariance, enabling the accurate distinction of chiral materials.
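The difference between reflection-invariant scalars and a chirality-sensitive quantity can be seen on four points. In the sketch below (the point set is an assumed toy example, not a real crystal), all pairwise distances of a structure and its mirror image coincide, whereas a signed volume built from a triple product changes sign. The signed volume is only one possible pseudoscalar; it is not the specific descriptor used by any model discussed in this review.

```python
import numpy as np

def sorted_pair_distances(X):
    """All pairwise distances (reflection-invariant scalars); X is N x 3."""
    i, j = np.triu_indices(len(X), k=1)
    return np.sort(np.linalg.norm(X[i] - X[j], axis=1))

def signed_volume(X):
    """Triple product of three edge vectors from the first point; flips sign under reflection."""
    return float(np.dot(X[1] - X[0], np.cross(X[2] - X[0], X[3] - X[0])))

X = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
mirror = np.diag([1.0, 1.0, -1.0])  # reflection through the xy-plane (det = -1)
X_mirror = X @ mirror.T

distances_match = np.allclose(sorted_pair_distances(X), sorted_pair_distances(X_mirror))
volume, volume_mirror = signed_volume(X), signed_volume(X_mirror)
```

A model that sees only the distances cannot tell `X` from `X_mirror`; one that also sees a sign-changing quantity can.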

3. Framework for GNNs-based crystal material representation learning

Following the symmetry principles defined in Section 2, a robust representation learning framework must enforce SE(3) invariance, SO(3) equivariance, and periodic invariance. These constraints guide the design of the two core components of the framework: crystal graph construction and GNNs architecture design.⁶⁵

The crystal graph construction phase is the foundational step where molecules, crystal unit cells, and even complex polycrystalline systems are mapped into graph structures consisting of “atoms (nodes)” and “chemical bonds or spatial interactions (edges).” Crucially, this phase determines whether the passive symmetry constraints, especially periodic invariance, are rigorously preserved or compromised by implementation choices (e.g., neighbor selection strategies). Subsequently, the GNNs architecture must effectively encode these features while maintaining symmetry consistency between local and global representations.

Section 3 first presents the general prediction pipeline. It then critically examines graph construction strategies, focusing on the integration of passive symmetries. Finally, we classify and analyze representative Crystal GNNs architectures, comparing their design features and compliance with symmetry constraints.

3.1. A pipeline for GNNs-based crystal property prediction

Figure 3 illustrates the standard pipeline for GNNs-based crystal property prediction, comprising three key stages:

(i) Graph Construction (Input): The process begins by converting the crystal structure into a graph format. Atoms are treated as nodes (carrying features like element types), while interactions form edges. A major challenge here is encoding the infinite periodic nature of crystals into a finite graph. To ensure the input is physically valid, specific strategies and high-order geometric features are required to strictly adhere to the material’s passive symmetries.^66–71

(ii) Feature Learning (Processing): Next, the GNN learns structural representations through iterative message passing. By employing techniques such as graph convolutions⁷² or attention mechanisms,^73,74 the model captures complex interdependencies between atoms. This process effectively extracts microscopic structural features, enabling the model to learn the underlying physics from data.

(iii) Property Prediction (Output): Finally, the trained model outputs predictions, generally categorized into two levels: graph-level outputs for macroscopic properties (e.g., formation energy, band gap) and node-level outputs for atom-specific properties (e.g., forces, magnetic moments). These predictions provide essential insights for both material characterization and inverse design.

Figure 3.

A pipeline for GNNs-based crystal property prediction.

3.2. Evolution of crystal graph construction

Section 3.2 details the construction of crystal graphs in the pipeline. Crystal graph construction is a crucial step in translating physical entities into mathematical representations, and the quality of this construction significantly influences the performance of the crystal material representation learning framework. The fundamental elements of graph construction involve defining the nodes, edges, and the final mathematical representation of the graph. In molecular graph construction, the process is relatively straightforward: atoms serve as nodes, with their chemical properties as node features, and chemical bonds or spatial distances between atoms are represented as edge features.^75–79 In crystal graph construction, however, the unique periodicity of crystal materials, along with their invariance to SE(3) transformations and equivariance to SO(3) transformations, presents a core challenge: integrating these constraints and features into the graph construction process.

3.2.1. Basic connectivity: nearest-neighbor multi-edge graphs

Prior work shows an evolution in crystal-graph construction from implicit representations to explicit modeling of SE(3)/SO(3) symmetries and periodicity. A seminal example, CGCNN,³⁴ introduced a multi-edge graph scheme coupled with nearest-neighbor search to encode the infinite periodicity of crystalline materials. Atoms in a unit cell are treated as graph nodes, with each node implicitly representing the atom and its infinite periodic copies along all lattice directions. Interactions between atom pairs are defined by recording Euclidean distances subject to a nearest-neighbor or cutoff condition; because an atom can interact with several periodic copies of the same partner atom, a single pair of nodes may be connected by multiple edges.

Owing to its simplicity and efficiency, this multi-edge paradigm has been widely adopted in subsequent models, including MEGNet,⁸⁰ GATGNN,⁸¹ iCGCNN,⁸² and DeeperGATGNN.⁸³ However, it is critical to distinguish between the two primary neighbor selection strategies employed within this paradigm: radius-based cutoffs and fixed k-nearest neighbor (k-NN) search. Models like MEGNet typically employ a strict cutoff radius R_c, connecting all atoms within that distance. This strategy is deterministic and physically consistent. In contrast, standard implementations of CGCNN and GATGNN often rely on a fixed k-NN strategy for computational efficiency.

Notably, this implementation choice, truncating the neighbor list at a fixed count k, can violate periodic invariance in certain cases.⁵⁶ As shown in the left of Figure 4(a), the multi-edge construction with nearest neighbor search selects neighbors using a Voronoi face–sharing criterion. Each node connects to its neighbors via multiple edges, reflecting interactions with periodic copies of the same atomic species. Node features embed atomic properties (e.g., atomic number), while edge features encode geometric attributes (e.g., bond length). On the right of Figure 4(a), a counterexample is shown: nodes of the same color denote the same atomic species, and a central green atom is equidistant from four neighbors. Under a standard k-NN rule with fixed k, the set of selected neighbors becomes ambiguous because numerical tie-breaking depends on atomic indexing. Consequently, different unit cell choices yield different graph topologies. Such local non-uniqueness disrupts global periodic consistency, alters the mapping to physical properties, and violates Equation (5), thereby failing to satisfy periodic invariance.

Figure 4.

Diagram of different crystal graph construction methods. (a) Basic connectivity: nearest-neighbor multi-edge graphs; (b) beyond connectivity: encoding higher-order geometry; (c) incorporating physical priors: symmetry-aware graph construction; (d) beyond the unit cell: graphs for long-range periodicity.

Crucially, this limitation is an artifact of the specific implementation adopted by these models (i.e., truncating the neighbor list at a fixed count for computational efficiency) rather than a fundamental flaw in the graph representation itself. In contrast, schemes based on deterministic physical criteria, such as a strict cutoff radius without neighbor count truncation (as seen in MEGNet) or an ideal nearest-neighbor graph including all equidistant neighbors, have been proven to strictly satisfy periodic invariance.⁵⁶ Therefore, the violations observed in the aforementioned works arise from the use of arbitrary neighbor truncation strategies in high-symmetry environments, rather than from the multi-edge formalism itself. Nevertheless, since classical models like CGCNN and GATGNN default to these fixed-k implementations, they cannot guarantee strict periodic invariance.
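The tie-breaking problem can be reproduced in a few lines. The sketch below places a central atom at the origin with four equidistant neighbors and compares a fixed-k selection against a radius cutoff when the neighbor list is merely re-indexed. The function names and the tolerance are illustrative assumptions; this is not the neighbor search code of CGCNN, GATGNN, or MEGNet.

```python
import numpy as np

def knn_neighbors(center, neighbors, k):
    """Fixed-k selection; ties are broken by array order (stable sort)."""
    d = np.linalg.norm(neighbors - center, axis=1)
    idx = np.argsort(d, kind="stable")[:k]
    return {tuple(neighbors[i]) for i in idx}

def radius_neighbors(center, neighbors, r_c, tol=1e-8):
    """Radius cutoff; every neighbor within r_c is kept, so ties cannot be split."""
    d = np.linalg.norm(neighbors - center, axis=1)
    return {tuple(neighbors[i]) for i in np.flatnonzero(d <= r_c + tol)}

center = np.zeros(3)
nbrs = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [-1.0, 0.0, 0.0],
                 [0.0, -1.0, 0.0]])
reindexed = nbrs[::-1]  # same atoms, different indexing

knn_differs = knn_neighbors(center, nbrs, k=3) != knn_neighbors(center, reindexed, k=3)
radius_agrees = radius_neighbors(center, nbrs, 1.0) == radius_neighbors(center, reindexed, 1.0)
```

With k = 3, the selected neighbor set depends on the order in which the atoms happen to be listed, whereas the radius rule returns all four neighbors in both cases.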

3.2.2. Beyond connectivity: Encoding higher-order geometry

Relying exclusively on pairwise distance information (i.e., two-body interactions) introduces limitations in capturing complex local geometric environments. To capture higher-order interactions, more geometric information has been incorporated into the graph construction process. For instance, ALIGNN⁶⁶ simultaneously considers distance and bond angle (three-body interaction), introducing the concept of a “line graph”. In a line graph, the edges of the original graph are converted into nodes, with edges in the line graph representing the relationships between corresponding edges in the original graph. This approach seeks to capture more intricate geometric relationships. Similarly, to encode three-body interactions, M3GNet⁸⁴ and CHGNet⁸⁵ employ advanced graph construction strategies based on cutoff radii. M3GNet incorporates bond angle information into its multibody computation module, while CHGNet introduces a novel dual-graph structure, using a secondary graph with “chemical bonds” as nodes and “bond angles” as edges to specifically model three-body interactions. As shown in Figure 4(b), these schemes integrate bond angles into node or edge features as the principal higher-order descriptors. Such strategies have substantially improved the ability of crystal representation learning frameworks to capture complex internal structural information.

Nevertheless, the effectiveness of these high-order representations in preserving global symmetry strictly depends on the underlying neighbor selection strategy. While geometric descriptors such as bond angles and bond lengths are inherently invariant to rigid-body rotation and translation, specific implementation choices can still compromise periodic invariance. This is particularly evident in models like ALIGNN, which typically define the graph topology using the k-NN algorithm (e.g., k=12) before computing bond angles. As detailed in Section 3.2.1, k-NN strategies introduce topological ambiguities in highly symmetric crystals where neighbor selection depends on arbitrary atomic indexing. Consequently, in the case of ALIGNN, different unit cell selections can lead to topologically different graphs, and thus different sets of computed bond angles, violating strict periodic invariance (Equation (5)). In contrast, M3GNet and CHGNet utilize radius-based cutoffs, which theoretically preserve periodic invariance regarding connectivity, although they strike a different balance between computational cost and symmetry constraints than the fully symmetric frameworks discussed in Section 3.3.2.
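The line-graph idea can be sketched as follows: every bond of the original graph becomes a node, and two bonds that share an atom are linked by a line-graph edge carrying the bond angle. The function below is a simplified illustration for a finite set of atoms (periodic images are ignored, and the three-atom geometry is an assumed example); it is not the ALIGNN or CHGNet implementation.

```python
import itertools
import numpy as np

def line_graph_with_angles(positions, edges):
    """Return (bond_a, bond_b, angle_deg) for every pair of bonds sharing one atom."""
    triplets = []
    for (e1, (i, j)), (e2, (k, l)) in itertools.combinations(enumerate(edges), 2):
        shared = {i, j} & {k, l}
        if len(shared) != 1:
            continue                      # bonds do not meet at a single atom
        c = shared.pop()                  # central atom of the angle
        a = j if i == c else i            # far end of the first bond
        b = l if k == c else k            # far end of the second bond
        u, v = positions[a] - positions[c], positions[b] - positions[c]
        cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
        angle = float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
        triplets.append((e1, e2, angle))
    return triplets

positions = np.array([[0.0, 0.0, 0.0],   # central atom
                      [1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])
edges = [(0, 1), (0, 2)]                 # two bonds meeting at atom 0
triplets = line_graph_with_angles(positions, edges)
```

Because the angles are computed from whatever edge list is supplied, any ambiguity in the neighbor selection step propagates directly into the set of angles.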

3.2.3. Incorporating physical priors: Symmetry-aware graph construction

Matformer⁵⁶ is a framework that sets six self-connection edges on a global node, with edge weights encoding the lengths of the three lattice basis vectors and the angles between them. This enables the model to directly capture the macroscopic periodic pattern of the lattice. Additionally, this framework introduces two different graph construction strategies: one is a fully connected graph, and the other is an adaptive radius graph. In the latter strategy, the cutoff radius is dynamically determined by the distance of the k-th nearest neighbor (e.g., k=12). Crucially, Matformer includes all atoms within this adaptive radius, ensuring that equidistant neighbors are treated equally without arbitrary truncation. Both construction methods, combined with the lattice-encoding self-connections, have been designed to strictly satisfy Equation (5), ensuring the periodic invariance of crystal materials.
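The adaptive radius rule described above can be sketched as: take the distance of the k-th nearest neighbor as the cutoff, then keep every candidate within that cutoff. The helper below is a hypothetical illustration of this idea on a list of precomputed distances, with an assumed numerical tolerance `tol`; it is not taken from the Matformer code base.

```python
import numpy as np

def adaptive_radius_neighbors(dists, k, tol=1e-6):
    """Indices of all candidates within the k-th nearest distance (ties included)."""
    dists = np.asarray(dists, dtype=float)
    r_k = np.sort(dists)[min(k, len(dists)) - 1]   # distance of the k-th nearest neighbor
    return np.flatnonzero(dists <= r_k + tol), float(r_k)

# Four equidistant neighbors and one farther atom; k = 3 would split the tie under plain k-NN.
idx, r_k = adaptive_radius_neighbors([1.0, 1.0, 1.0, 1.0, 1.5], k=3)
```

Here the cutoff becomes 1.0 and all four equidistant neighbors are retained, so the result no longer depends on how the atoms are indexed.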

Building on this, ComFormer¹⁹ further enhances the handling of crystal symmetries through a sophisticated “passive symmetry treatment”, ensuring the model’s geometric completeness. ComFormer offers two graph construction options: an SE(3)-invariant graph (used by the iComFormer variant) based on two invariants, the bond-lattice vector angle and the bond length, and an SO(3)-equivariant graph (used by the eComFormer variant) based on the invariant bond length and the equivariant bond vector. These options offer customized geometric representations for predicting and optimizing various material properties. Since the bond-lattice vector angle and bond length do not change with the overall rotation and translation of the crystal graph, nor do they vary with the selection of different unit cells, the SE(3)-invariant graph satisfies Equation (4) and Equation (5). Crucially, by explicitly encoding the angles relative to the lattice basis vectors, the graph construction in ComFormer ensures a bijective mapping between the crystal structure and its graph representation. This rigorously solves the “many-to-one” problem, thereby achieving geometric completeness for both the SE(3)-invariant and SO(3)-equivariant graphs. While iComFormer relies on complete invariant scalars, eComFormer utilizes equivariant bond vectors to explicitly capture directional information, fully realizing the three passive symmetries of crystal materials with higher expressivity, which will be discussed in Section 3.3.2.

Two recent works, PerCNet⁸⁶ and GCPNet,⁸⁷ have adopted similar strategies to address the issue of geometric completeness. PerCNet introduces dihedral angles to address the “many-to-one” problem (a lack of geometric completeness) present in previous models (such as ALIGNN and GATGNN), where different crystal structures are mapped to the same crystal graph by the graph construction scheme. By resolving this problem, PerCNet achieves a geometrically complete representation. GCPNet, on the other hand, introduces the concept of “crystal pattern graphs”, which incorporate three elements: atomic types, bond types, and global descriptors, ensuring the passive symmetries of crystal materials from a global perspective. Figure 4(c) illustrates such symmetry-aware graph construction schemes that incorporate physical priors. These schemes utilize more geometric information of crystal materials, such as self-connection edges, the angles between bonds and lattice basis vectors, and dihedral angles. By combining this information with original node and edge features, these graph construction schemes can satisfy varying degrees of periodic invariance, SE(3) invariance, and SO(3) equivariance, while significantly improving geometric completeness.

3.2.4. Beyond the unit cell: Graphs for long-range periodicity

This section introduces graph construction strategies that focus on how to effectively handle and capture the long-range periodic interactions within crystals. These efforts leverage the “Ewald summation” concept from physics and computational materials science to address the long-range interactions arising from periodicity. For instance, PotNet⁸⁸ avoids the information loss caused by distance truncation by classifying the potential energy interactions between different atomic types in the lattice and introducing approximate functions to calculate the total sum of each type of interaction throughout the infinite lattice, thereby encoding the long-range periodic effects of the crystal into node representations. In contrast, Ewald-MP⁸⁹ more directly applies the decomposition idea of Ewald summation, decoupling the information transfer into two parts: short-range signals processed in real space and long-range signals efficiently modeled and filtered through Fourier transformation in reciprocal (frequency) space, achieving a clear separation and capture of periodic structural information. As shown in Figure 4(d), this scheme integrates potential energy terms into edge or node features, thereby enhancing the physical rationality of the graph construction strategy.

While the evolution of these schemes reflects ongoing efforts in constructing complete and physically rigorous crystal graph representations, it is crucial to distinguish between their theoretical validity and numerical implementation. Theoretically, methods based on Ewald summation are rigorously periodic and thus strictly satisfy Equation (5) (periodic invariance) by design, as they mathematically account for interactions across the infinite lattice. However, in practical implementations, the infinite summation series must inevitably be approximated via truncation or function fitting due to computational constraints. Consequently, the primary challenge for these methods lies not in a theoretical violation of symmetry principles, but rather in managing numerical stability and precision errors introduced by these necessary approximations.
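For reference, the classical Ewald decomposition for a Coulomb-type 1/r interaction is shown below, where α is a splitting parameter (an assumed symbol, not used elsewhere in this review). This is the textbook identity behind the real-space and reciprocal-space split; the specific functional forms used in PotNet and Ewald-MP may differ.

```latex
\frac{1}{r}
  = \underbrace{\frac{\operatorname{erfc}(\alpha r)}{r}}_{\text{short-range, summed in real space}}
  + \underbrace{\frac{\operatorname{erf}(\alpha r)}{r}}_{\text{smooth, summed in reciprocal space}}
```

The first term decays rapidly with r and can be truncated at a finite radius, while the second is smooth and its Fourier series converges quickly; the truncation of both sums is where the approximation errors discussed above enter.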

Furthermore, graph construction strategies are often optimized for specific tasks. For instance, CartNet,⁹⁰ which is specifically designed for predicting Atomic Displacement Parameters (ADP), adopts a radius-based graph construction strategy. It relies solely on spatial distances when building graphs and disregards chemical bond information, focusing more on the packing effects of atoms within the lattice. This approach offers significant computational efficiency advantages. Crucially, unlike models relying on fixed k-NN logic, CartNet’s radius-based approach avoids neighbor ambiguity, thereby strictly satisfying Equation (5) and preserving the periodic invariance of crystal materials. Additionally, this method incorporates SO(3) data augmentation into the constructed graphs to align with the physical characteristics of ADP, satisfying Equation (2).

Regardless of the final graph construction strategy adopted, the crystal graph ultimately composed of nodes and edges needs to be encoded into a machine-readable mathematical form. The core of this encoding consists of the adjacency matrix A (distinct from the lattice matrix A of Section 2) and the node feature matrix X. For crystal graphs, A encodes the connectivity between nodes, often including multi-edge information to represent periodic interactions and spatial distances, while X contains the initial physical and chemical properties of all nodes. These two matrices together constitute the input of GNNs, enabling the model to establish a mapping from microscopic atomic structures to macroscopic physical properties by learning the interaction patterns between atoms on this structured data.
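In practice, multi-edge crystal graphs are often stored as an edge list rather than a dense adjacency matrix, since a matrix entry cannot hold several edges between the same pair of nodes. The sketch below builds such arrays with a radius cutoff for a two-atom cubic cell. The function name, the array layout (`edge_index` of shape 2 × E), and the example structure are assumptions for illustration; the lattice matrix is called `lattice` to avoid a clash with the adjacency symbol.

```python
import itertools
import numpy as np

def build_multigraph(t, P, lattice, r_c, n_images=1):
    """Return node features X, edge_index (2 x E), and edge lengths (E,) with periodic multi-edges."""
    src, dst, length = [], [], []
    n_atoms = P.shape[1]
    for i in range(n_atoms):
        for j in range(n_atoms):
            for n in itertools.product(range(-n_images, n_images + 1), repeat=3):
                shift = lattice @ np.array(n, dtype=float)
                d = np.linalg.norm(P[:, j] + shift - P[:, i])
                if 1e-8 < d <= r_c:       # one edge per periodic image inside the cutoff
                    src.append(i)
                    dst.append(j)
                    length.append(d)
    X = t.reshape(-1, 1).astype(float)    # atomic number as the only node feature
    return X, np.array([src, dst]), np.array(length)

t = np.array([55, 17])                                     # assumed two-species example
P = np.array([[0.0, 0.5], [0.0, 0.5], [0.0, 0.5]])         # corner and body-center sites
lattice = np.eye(3)                                        # cubic cell, a = 1
X, edge_index, edge_length = build_multigraph(t, P, lattice, r_c=0.9)
```

Each atom is linked to the other atom by eight separate edges, one per periodic image at distance √3/2, which is exactly the multi-edge information a single adjacency entry would lose.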

3.3. GNNs architectures for learning on crystal graphs

The transformation of crystal graph representations into meaningful material properties relies on the design of Crystal GNNs. A key challenge in the architecture design of Crystal GNNs is how to accurately and completely capture the passive symmetry of crystal materials based on the input crystal graph. In this review, we classify Crystal GNNs into two categories: symmetric Crystal GNNs and asymmetric Crystal GNNs.

Crucially, this classification is not solely based on basic periodic invariance, but on a more rigorous criterion: whether the framework explicitly incorporates architectural designs to fully realize the passive symmetries. This entails strictly guaranteeing periodic invariance, while simultaneously achieving either geometric completeness (resolving the “many-to-one” ambiguity) or explicit global symmetry modeling (e.g., through lattice encoding or infinite summation), without relying on unstable local approximations.

3.3.1. Asymmetric crystal GNNs

Asymmetric Crystal GNNs primarily focus on characterizing local chemical environments but fail to strictly enforce the full spectrum of global symmetry constraints. These models generally fall into two sub-types based on their architectural choices.

The first sub-type includes foundational models focusing on local pairwise interactions. Early works such as CGCNN,³⁴ iCGCNN,⁸² MEGNet,⁸⁰ GATGNN,⁸¹ DeeperGATGNN,⁸³ and SchNet⁹¹ laid the foundation. These models share similar input crystal graphs but demonstrate distinct architectural choices in message passing. SchNet pioneered the use of continuous-filter convolutions, where interatomic distances are expanded using radial basis functions (RBF) and processed by a filter-generating network to map continuous spatial positions to convolutional weights. In contrast, CGCNN and GATGNN introduce more refined control mechanisms. CGCNN specifically employs a gated convolution architecture involving a sigmoid-softplus unit, which dynamically regulates the information flow between atoms to mimic chemical bonding environments. GATGNN further advances this by utilizing attention scores to weight neighbors based on their impact. Critically, the symmetry compliance of this group varies by implementation. Models like CGCNN and GATGNN typically employ a fixed k-NN strategy for efficiency. As discussed in Section 3.2.1, this introduces topological ambiguities in high-symmetry crystals, directly violating periodic invariance (Equation (5)). Conversely, models like MEGNet typically use a radius-based cutoff, which avoids strict periodicity violations. However, they are still classified as asymmetric here because they conceptually reduce the infinite crystal to a localized chemical network, missing the explicit encoding of macroscopic lattice parameters required for global structural awareness.
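The two mechanisms described above can be sketched in a few lines of NumPy. This is an illustrative reduction, not the reference code of either model: the Gaussian form of the RBF, the random weights, and the helper names (`gaussian_rbf`, `gated_conv`) are assumptions, and normalization layers and training are omitted.

```python
import numpy as np

def gaussian_rbf(distances, centers, gamma):
    """Expand scalar distances on Gaussian radial basis functions."""
    d = np.asarray(distances, dtype=float)[:, None]
    return np.exp(-gamma * (d - centers[None, :]) ** 2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softplus(x):
    return np.logaddexp(0.0, x)

def gated_conv(node_feats, edge_index, edge_feats, W_f, b_f, W_s, b_s):
    """One gated update: a sigmoid gate multiplies a softplus message."""
    src, dst = edge_index
    # Concatenate receiving atom, sending atom, and edge features.
    z = np.concatenate([node_feats[dst], node_feats[src], edge_feats], axis=1)
    messages = sigmoid(z @ W_f + b_f) * softplus(z @ W_s + b_s)
    out = node_feats.copy()
    np.add.at(out, dst, messages)  # sum messages per receiving atom (residual)
    return out

rng = np.random.default_rng(0)
node_feats = rng.normal(size=(2, 4))
edge_index = np.array([[0, 1], [1, 0]])  # row 0: sources, row 1: targets
centers = np.linspace(0.0, 5.0, 6)       # RBF centers in Å (hypothetical)
edge_feats = gaussian_rbf([3.0, 4.2], centers, gamma=2.0)

dim_z = 2 * node_feats.shape[1] + edge_feats.shape[1]
W_f = 0.1 * rng.normal(size=(dim_z, 4))
W_s = 0.1 * rng.normal(size=(dim_z, 4))
b_f, b_s = np.zeros(4), np.zeros(4)
updated = gated_conv(node_feats, edge_index, edge_feats, W_f, b_f, W_s, b_s)
```

Both operations consume only scalar distances, so their outputs are unchanged by rotations and translations. Whether the overall model is periodically invariant depends on how the edge list was built, not on these layers.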

The second sub-type includes models like ALIGNN,⁶⁶ M3GNet,⁸⁴ and CHGNet,⁸⁵ which integrate higher-order geometric information. These models employ advanced architectures to process edge features such as bond angles and bond lengths. ALIGNN integrates concepts from line graph GNNs, processing edge features with a line graph GNN before handling node features with an ordinary GNN to reason about crystal geometry from a third-order perspective. However, ALIGNN is classified as asymmetric because its graph construction relies on a fixed k-NN algorithm. This reliance causes the graph topology to change under periodic unit cell transformations, preventing it from achieving strict periodic invariance despite using invariant scalar features. In parallel, M3GNet and CHGNet utilize radius-based cutoffs and incorporate three-body interactions into their message passing (e.g., CHGNet’s dual-graph structure). While they technically satisfy periodic invariance regarding connectivity, these frameworks prioritize the capture of complex local atomic forces over geometric completeness. By not explicitly resolving the “many-to-one” mapping ambiguity, they remain susceptible to mapping distinct periodic structures to identical graph representations, placing them outside the strictly symmetric category.
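The third-order information used by these models can be illustrated by enumerating bond pairs that share an atom, which is what a line graph encodes. The sketch below is a generic illustration under assumed inputs (a dictionary of bond vectors per atom and a hypothetical simple-cubic environment), not ALIGNN's or CHGNet's data pipeline.

```python
import itertools
import numpy as np

def angle_triplets(bond_vectors):
    """Enumerate ordered bond pairs sharing a center atom, with their angle.

    `bond_vectors` maps an atom index to a list of
    (neighbor index, vector from the atom to that neighbor).
    Returns tuples (j, center, k, angle in radians).
    """
    triplets = []
    for center, bonds in bond_vectors.items():
        for (j, v1), (k, v2) in itertools.permutations(bonds, 2):
            cos_theta = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
            angle = float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
            triplets.append((j, center, k, angle))
    return triplets

# Hypothetical simple-cubic environment: one atom bonded to its six periodic
# images along +/-x, +/-y, +/-z (all images carry atom index 0).
directions = [np.array(v, dtype=float) for v in
              [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]]
bond_vectors = {0: [(0, 2.0 * v) for v in directions]}
triplets = angle_triplets(bond_vectors)
```

An atom with k bonds contributes k(k-1) ordered bond pairs, so the number of angle features grows roughly as nk² for n atoms. This is the source of the higher cost of line-graph models relative to purely pairwise message passing.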

3.3.2. Symmetric crystal GNNs

In contrast to conventional approaches, symmetric Crystal GNNs strictly guarantee periodic invariance and typically achieve either geometric completeness or explicit global symmetry modeling through rigorous architectural designs or physical formulations. These models can be broadly categorized into two streams: those based on geometric graph construction and those leveraging physics-informed global summation.

The first category focuses on geometric graph construction, unified by the explicit handling of crystal symmetries within their architectures. Matformer⁵⁶ exemplifies the integration of global periodicity into message passing. Unlike conventional models that rely solely on local neighbors, Matformer utilizes a specialized Transformer architecture with multi-head attention. Its key innovation lies in a multigraph construction with self-connecting edges that explicitly encode periodic lattice vectors (lattice lengths and angles). This design allows the network to intuitively capture macroscopic periodic patterns, ensuring strict periodic invariance and global structural awareness regardless of local connectivity details.

Taking geometric rigor further, ComFormer¹⁹ employs a Transformer-based architecture that alternates between node-level and edge-level Transformers to maximize expressivity while ensuring stability. Crucially, ComFormer addresses the “many-to-one” ambiguity, where distinct structures map to identical graphs, thereby achieving geometric completeness. Unlike ALIGNN’s unstable k-NN topology, ComFormer utilizes a deterministic lattice-based graph construction. Notably, while sharing identical rotation-invariant inputs (bond lengths and angles) with ALIGNN, iComFormer distinguishes itself through topological rigor. As mathematically detailed in Supplemental Material A.2, ALIGNN’s k-NN construction is susceptible to topological instability, whereas iComFormer’s lattice-based strategy strictly guarantees the periodic invariance required for a symmetric classification (proven in Supplemental Material A.3). Within this framework, iComFormer operates on SE(3)-invariant graphs using bond-lattice angles, while eComFormer advances this by utilizing SO(3)-equivariant vector representations. The use of bond vectors allows eComFormer to capture directional information, with node-level update layers designed to simultaneously satisfy SE(3) invariance and SO(3) equivariance. Following a similar pursuit of completeness, PerCNet⁸⁶ and GCPNet⁸⁷ utilize hybrid architectures combining convolutional and attention channels. PerCNet specifically introduces dihedral angles to resolve geometric ambiguities, while GCPNet employs “crystal pattern graphs” with global descriptors to enforce symmetry from a global perspective.

Beyond general property prediction, symmetry-aware designs are also tailored for specific physical quantities, as exemplified by CartNet.⁹⁰ Designed for predicting ADP, CartNet utilizes a radius-based graph construction to avoid neighbor ambiguity. However, because its architecture prioritizes computational efficiency by processing only pairwise spatial distances, it inherently lacks geometric completeness. Furthermore, while CartNet applies SO(3) data augmentation during training to help the model learn rotational equivariance, this augmentation does not retroactively grant the input distance graph the topological ability to resolve chirality. Nevertheless, CartNet is classified as a symmetric GNN under our rigorous framework because it incorporates explicit symmetry modeling at the output level to align with the tensor properties of ADP: uniquely, its Cholesky Head enforces strict physical constraints to ensure the output matrix is always mathematically valid (symmetric and positive semidefinite).
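The idea behind a Cholesky-style output head can be sketched as follows: six unconstrained network outputs fill a lower-triangular factor L, and the prediction is L Lᵀ, which is symmetric and positive semidefinite by construction. The function below is a generic illustration under that assumption; the softplus on the diagonal and the name `cholesky_head` are choices made here, and CartNet's actual layer may differ in detail.

```python
import numpy as np

def cholesky_head(raw_outputs):
    """Map six unconstrained numbers to a symmetric PSD 3x3 matrix."""
    raw = np.asarray(raw_outputs, dtype=float)
    L = np.zeros((3, 3))
    L[np.tril_indices(3)] = raw  # fill the lower triangle row by row
    diag = np.diag_indices(3)
    L[diag] = np.logaddexp(0.0, L[diag])  # softplus keeps the diagonal positive
    return L @ L.T

# Arbitrary raw outputs, including negative values.
U = cholesky_head([-3.0, 2.0, 0.5, -1.0, 4.0, -2.0])
eigenvalues = np.linalg.eigvalsh(U)
```

Because the constraint is built into the parameterization, no post-hoc projection or penalty term is needed to keep the predicted tensor physically valid.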

The second category, represented by PotNet⁸⁸ and Ewald-MP,⁸⁹ achieves symmetry not through geometric descriptors alone, but by explicitly modeling the infinite periodic lattice through physics-based summation strategies. Unlike local graph methods that suffer from truncation boundaries, these models theoretically capture the complete periodic interaction field. PotNet incorporates infinite potential energy summation directly into the message passing and aggregation process. Ewald-MP adopts the Ewald summation architecture, decomposing interactions into real-space and reciprocal-space components. It employs Fourier transformation to efficiently model long-range signals in frequency space. Crucially, while practical implementations involve numerical approximations, the underlying theoretical framework of these models is based on rigorously periodic physics equations, classifying them as symmetric architectures.
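The real-space and reciprocal-space decomposition rests on a classical identity for the Coulomb kernel, 1/r = erfc(αr)/r + erf(αr)/r. The short numerical check below illustrates this identity only; it is not the Ewald-MP or PotNet implementation, and the splitting parameter `alpha` is a hypothetical value.

```python
import math

def ewald_split(r, alpha):
    """Split 1/r into a short-range and a smooth long-range part."""
    short_range = math.erfc(alpha * r) / r  # decays quickly: summed in real space
    long_range = math.erf(alpha * r) / r    # smooth: summed in reciprocal space
    return short_range, long_range

alpha = 0.5  # hypothetical splitting parameter, in 1/Å
splits = {r: ewald_split(r, alpha) for r in (0.5, 2.0, 10.0)}
```

The short-range term becomes negligible beyond a few multiples of 1/α, so truncating it introduces a controlled error, while the long-range term is smooth enough to be represented by a small number of Fourier components. This is the sense in which the truncation affects numerical precision rather than the periodicity of the formulation.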

Furthermore, Crystal GNNs can be classified into three types based on their message passing mechanisms: convolutional, attention-based, and hybrid. Crystal GNNs such as CGCNN, iCGCNN, MEGNet, and ALIGNN can be categorized as convolutional type, while GATGNN, Matformer, and ComFormer can be classified as attention type. Finally, PerCNet and GCPNet, which employ both convolution and attention mechanisms, can be classified as hybrid type.

In summary, Table 1 outlines the methods employed by symmetric Crystal GNNs. The distinction reflects strict architectural compliance. Symmetric models ensure crystal symmetry by incorporating lattice-based or radius-based construction, explicit equivariant layers, or physics-informed global summation. In contrast, asymmetric models rely on standard neighbor aggregation strategies (like k-NN) or lack explicit global symmetry encoding.

Table 1.

Summary of methods employed in symmetric crystal GNNs.

Frameworks	Methods
Matformer⁵⁶	Adds six self-connecting edges and uses multi-head attention so that the model can perceive the periodic pattern of the crystal
PotNet⁸⁸	Encodes global periodic interactions by explicitly calculating the infinite potential energy summation for each atom
Ewald-MP⁸⁹	Applies Ewald summation to decompose message passing into real and reciprocal space components, strictly enforcing periodic boundary conditions
eComFormer¹⁹	Uses bond lengths as invariant features and bond vectors as equivariant features, with node-level equivariant update layers
iComFormer¹⁹	Uses two invariants, the bond length and the angle between the bond and the lattice basis vectors, and alternates node-level and edge-level interaction layers
PerCNet⁸⁶	Defines a series of dihedral angles according to fixed rules to resolve the “many-to-one” problem
CartNet⁹⁰	Applies SO(3) data augmentation
GCPNet⁸⁷	Constructs a crystal pattern graph and uses the GCAO operator to capture passive symmetry

3.4. A comparative analysis of framework paradigms

A GNNs-based crystal material representation learning framework comprises crystal graph construction and GNNs architecture design. Table 2 summarizes these features, where complexity is expressed in terms of n atoms and k average neighbors, for example O(nk).

Table 2.

Comparative analysis of framework features of different material representation learning frameworks.²⁹

Frameworks	SE(3) invariant	SO(3) equivariant	Periodic invariant	Geometric information order	Geometric completeness	Number of model parameters	Complexity
CGCNN³⁴	✓	✗	✗	2	✗	-	-
SchNet⁹¹	✓	✗	✗	2	✗	-	-
MEGNet⁸⁰	✓	✗	✗	2	✗	-	-
GATGNN⁸¹	✓	✗	✗	2	✗	-	-
ALIGNN⁶⁶	✓	✗	✗	3	✗	15.4 M	O(nk²)
Matformer⁵⁶	✓	✗	✓	2	✗	11.0 M	O(nk)
PotNet⁸⁸	✓	✗	✓	2	✗	6.7 M	-
eComFormer¹⁹	✓	✓	✓	2	✓	12.4 M	O(nk)
iComFormer¹⁹	✓	✗	✓	3	✓	5.0 M	O(nk)
PerCNet⁸⁶	✓	✗	✓	3	✓	12 M	-
CartNet⁹⁰	✓	✓	✓	2	✗	2.5 M	-
GCPNet⁸⁷	✓	✗	✓	2	✗	-	-

Table 2 highlights a fundamental divergence between asymmetric and symmetric paradigms regarding passive symmetry compliance. First, regarding periodic invariance: Conventional frameworks often prioritize efficiency or local expressivity over strict symmetry. For instance, CGCNN and ALIGNN rely on k-NN construction. As analyzed in Section 3.3, this introduces topological instability, leading to violations of periodic invariance. In contrast, symmetric frameworks (e.g., iComFormer, Matformer, PotNet) adopt deterministic strategies, such as lattice-based construction or global summations, to strictly guarantee the topological stability required by the crystalline state.

Second, regarding geometric completeness: While early frameworks satisfy basic SE(3) invariance, they often rely on scalar distances, implicitly imposing reflection invariance and failing to distinguish chiral structures. Advanced frameworks explicitly address this. Notably, eComFormer achieves geometric completeness alongside periodic invariance through vector-based message passing, while PerCNet incorporates chiral-sensitive descriptors (e.g., dihedral angles).
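The limitation of scalar distances can be checked directly on a four-atom fragment and its mirror image. The sketch below uses hypothetical coordinates and a standard signed dihedral formula; it illustrates the general argument and does not reproduce the descriptor of any specific framework.

```python
import itertools
import numpy as np

def pairwise_distances(points):
    """Sorted list of all interatomic distances."""
    return sorted(float(np.linalg.norm(p - q))
                  for p, q in itertools.combinations(points, 2))

def signed_dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) of the chain p0-p1-p2-p3."""
    b1, b2, b3 = p1 - p0, p2 - p1, p3 - p2
    n1, n2 = np.cross(b1, b2), np.cross(b2, b3)
    m1 = np.cross(n1, b2 / np.linalg.norm(b2))
    return float(np.arctan2(np.dot(m1, n2), np.dot(n1, n2)))

# Hypothetical four-atom fragment and its reflection through the xy-plane.
original = [np.array(p, dtype=float)
            for p in [(1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1)]]
mirror = [p * np.array([1.0, 1.0, -1.0]) for p in original]

same_distances = np.allclose(pairwise_distances(original), pairwise_distances(mirror))
dihedral_original = signed_dihedral(*original)
dihedral_mirror = signed_dihedral(*mirror)
```

The two fragments have identical distance sets, so a model that only sees distances cannot tell them apart, whereas the signed dihedral angle changes sign under reflection.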

Essentially, the comparison of ALIGNN, iComFormer, and PerCNet, all utilizing third-order geometric information (bond angles), reveals a key design principle: High-order geometric descriptors alone are insufficient for symmetry compliance; they must be grounded on a topologically rigorous graph structure. ALIGNN leverages bond angles on an unstable k-NN topology (asymmetric), whereas iComFormer integrates them within a strictly invariant framework (symmetric).

Ultimately, designing a robust crystal learning framework requires satisfying two levels of symmetry: (1) Strict periodic invariance to ensure validity across infinite lattice transformations, and (2) either geometric completeness to uniquely distinguish chemically distinct structures (including chiral enantiomers), or explicit global symmetry modeling to capture macroscopic lattice behaviors and physical constraints. Furthermore, task-specific constraints often drive architectural specialization, as seen in CartNet, which enforces physical validity (positive semidefiniteness) for ADP prediction. This trend indicates a shift from generic graph models toward physics-informed architectures where architectural rigor aligns strictly with material laws.

4. Datasets and related metrics

This section surveys the core datasets, key physical quantities, and evaluation metrics used to benchmark Crystal GNNs. We also analyze the comparative performance of leading frameworks on these standard tasks.

4.1. Core dataset

Currently, the data utilized in the research of crystal material representation learning frameworks using GNNs primarily comes from high-throughput calculations based on DFT. The following introduces three key datasets.

4.1.1. Materials project dataset

Initially, a widely used dataset was the MP-2018.6.1 subset⁸⁰ derived from the Materials Project database.⁹² This dataset, covering four fundamental regression tasks (formation energy, band gap, bulk modulus, and shear modulus), provided a common platform for evaluating the performance of early models.

(i) Element Distribution: Figure 5(a) illustrates the elemental distribution within the dataset. As expected, chemically reactive non-metals (e.g., O, F) and transition metals (e.g., Fe, Cu) appear frequently, reflecting their prevalence in stable oxides and alloys. Conversely, inert gases (He, Ne, Ar) are rare due to their low reactivity. This imbalance highlights that the dataset is biased towards chemically stable, naturally forming compounds rather than a uniform sampling of the periodic table.

(ii) Property Distribution: Figure 6 details the data distributions for the four benchmark tasks. The formation energy exhibits a bimodal distribution with predominantly negative values, indicating that the dataset is dominated by thermodynamically stable materials. Meanwhile, the band gap shows a strictly right-skewed distribution concentrated near zero, implying a prevalence of metals and narrow-bandgap semiconductors. Finally, both the bulk and shear moduli follow a log-normal-like pattern peaking between 1.5 and 2.0 log (GPa), reflecting the typical mechanical stiffness range of solid-state materials and their strong resistance to deformation.

Figure 5.

Heatmap of element distribution in the datasets. (a) Materials Project dataset; (b) JARVIS dataset.

Figure 6.

Distribution map of data for different tasks in the Materials Project dataset.

4.1.2. JARVIS dataset

As research advances, the need for broader and deeper data has led to the development of more comprehensive data platforms, with JARVIS emerging as a prominent representative. JARVIS provides a vast testing ground, with its task set far surpassing early benchmarks, including up to 29 regression tasks and 10 classification tasks.^66,93,94 This breadth allows researchers to more thoroughly evaluate the generalization ability of a model architecture. For example, in the JARVIS dataset, researchers can not only test the prediction of thermodynamic stability, such as formation energy, total energy, and distance to convex hull energy, but also evaluate the model’s ability to distinguish band gaps calculated by different DFT functionals (such as OPT and MBJ), reflecting the depth of its data and attention to computational details.

(i) Element Distribution: Figure 5(b) illustrates the element distribution in JARVIS. Generally, it mirrors the trend seen in the Materials Project: chemically reactive elements like oxygen and fluorine are frequent, while inert gases are rare. However, a distinction lies in the distribution of specific categories, such as rare earth and actinide elements, which varies due to the specific material systems targeted by the JARVIS consortium.

(ii) Property Distribution: Figure 7 details the property distributions, reflecting typical characteristics of inorganic materials. The formation energy predominantly falls within the negative range (approximately -4 to 0 eV/atom) with a unimodal distribution, indicating that the dataset focuses primarily on relatively stable candidate structures. Regarding electronic properties, both OPT and MBJ band gap calculations show a concentration near 0–1 eV, implying a prevalence of metals and narrow-bandgap semiconductors. Notably, the MBJ data exhibits a longer tail extending to tens of eV compared to OPT, effectively capturing the systematic functional corrections applied to wide-bandgap insulators. In terms of thermodynamic stability, the distance to convex hull displays a distinct “sharp peak and heavy tail” pattern: the peak at 0 eV signifies a high density of stable phases, while the tail represents a considerable number of metastable candidates, consistent with high-throughput screening data. Finally, the total energy shows a symmetrical, bell-shaped distribution centered in the negative range, primarily reflecting the statistical variance in elemental composition and stoichiometry.

Figure 7.

Distribution map of data for different tasks in the JARVIS dataset.

4.1.3. The Matbench dataset

To address the difficulty of comparing models across unstandardized tasks, Matbench⁹⁵ was introduced as a rigorous benchmarking suite. Unlike raw databases, Matbench is a curated collection of 13 tasks (10 regression, 3 classification) integrated from various sources. Its primary value lies in its standardized evaluation protocol. Instead of allowing random data splitting, Matbench provides fixed, nested cross-validation splits for every task. This design strictly minimizes “information leakage” and ensures that all models are evaluated under identical conditions. Consequently, despite containing fewer tasks than JARVIS, Matbench serves as a critical standard for ensuring fairness and reproducibility in comparative studies.
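The value of fixed splits can be illustrated with a deterministic fold generator: every model that calls it with the same arguments is evaluated on exactly the same held-out samples. The sketch below is a generic illustration; the fold count, the seed, and the function name `fixed_kfold` are assumptions made here and are not Matbench's actual protocol, which should be taken from the benchmark's own tooling.

```python
import numpy as np

def fixed_kfold(n_samples, n_folds=5, seed=0):
    """Deterministic K-fold split: the same inputs always give the same folds."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    folds = np.array_split(order, n_folds)
    # For each fold k, train on all other folds and test on fold k.
    return [(np.concatenate(folds[:k] + folds[k + 1:]), folds[k])
            for k in range(n_folds)]

splits = fixed_kfold(10, n_folds=5, seed=0)
splits_again = fixed_kfold(10, n_folds=5, seed=0)
```

Each sample appears in exactly one test fold, and no sample is shared between the training and test indices of the same fold, which is the property that prevents leakage between the two sets.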

Table 3 summarizes the scale and tasks of these core datasets. The Materials Project (69,239 samples) serves as a baseline for fundamental regression tasks like formation energy and mechanical moduli. JARVIS (55,722 samples) adds depth by including thermodynamic stability and high-precision electronic properties, such as band gaps calculated via different functionals. Finally, Matbench offers the largest scale and diversity, covering benchmarks ranging from 2D exfoliation energy to metallicity classification. Overall, the progression from Materials Project to Matbench reflects both an increase in data volume and an expansion in task complexity. This evolution allows researchers to rigorously test whether frameworks can accurately model the intricate structures and passive symmetries of crystal materials.

Table 3.

The data volume and common tasks of different datasets.

Datasets	Data volume	Common tasks
Materials Project^80,92	69239	Formation Energy; Bandgap; Shear Moduli; Bulk Moduli
JARVIS^66,93,94	55722	Formation Energy; Bandgap (OPT); Bandgap (MBJ); Ehull; Total Energy
Matbench⁹⁵	132752	Jdft2d; Formation Energy; Bandgap; Is_Metal

4.2. Key physical quantities and evaluation metrics

Model evaluation focuses on specific physical properties that determine a material’s practical value. Table 4 presents these quantities alongside their calculation formulas.

(i) Physical Quantities: From Stability to Functionality. In any computational screening, thermodynamic feasibility is the primary filter. Total Energy serves as the foundation for these evaluations, where E_i represents the energy of the i-th atom. Derived from this, Formation Energy acts as a critical “gatekeeper” to determine if experimental synthesis is feasible. It quantifies stability by comparing the compound’s total energy (E_compound) against the energies of its constituent elements (E_element,i), weighted by their molar quantities (n_i). A stricter stability criterion is provided by the Distance to Convex Hull (E_hull). This metric predicts the tendency for phase decomposition by measuring the energy difference between the compound (E_compound) and the lowest energy of known stable phases (E_min). Once thermodynamic stability is confirmed, the focus shifts to functional performance. For electronic applications like semiconductors, the Electronic Band Gap is the core parameter governing conductivity and optical behavior, defined as the energy difference between the conduction band (E_conduction band) and the valence band (E_valence band). Meanwhile, for structural applications, Bulk Modulus and Shear Modulus jointly characterize the material’s mechanical response. They reflect resistance to compression and shear, respectively, providing a comprehensive measure of stiffness and toughness. Consequently, an ideal framework must accurately predict this full spectrum, moving coherently from assessing “whether a material can exist” to determining “what functions it performs.”

(ii) Evaluation Metrics. To quantitatively assess model performance, standard metrics are employed (see Table 5). For regression tasks (predicting continuous physical quantities), MAE (Mean Absolute Error) is the most widely used metric due to its intuitive physical interpretation, calculated as the average absolute difference between the predicted value (ŷ_i) and the actual value (y_i) over n samples. Complementing this, MSE (Mean Squared Error) and RMSE (Root Mean Squared Error) emphasize larger deviations; RMSE, in particular, implies a heavier penalty for extreme errors, offering a stricter test of reliability.

Table 4.

Key physical quantities for evaluating model performance.

Physical quantities	Calculation formula
Total Energy	$E_{\text{total}} = \sum_{i} E_{i}$
Formation Energy	$E_{\text{form}} = E_{\text{compound}} - \sum_{i} n_{i} E_{\text{element}, i}$
Energy Above the Convex Hull	$E_{\text{hull}} = E_{\text{compound}} - E_{\min}$
Electronic Band Gap	$E_{\text{gap}} = E_{\text{conduction band}} - E_{\text{valence band}}$
Bulk Modulus	$K_{\text{bulk}} = \frac{\text{volumetric stress}}{\text{volumetric strain}}$
Shear Modulus	$G_{\text{shear}} = \frac{\text{shear stress}}{\text{shear strain}}$

Table 5.

Key metrics to evaluate model performance.

Evaluation metrics	Calculation formula
Mean Absolute Error (MAE)	$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_{i} - \hat{y}_{i} \right|$
Mean Squared Error (MSE)	$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}$
Root Mean Squared Error (RMSE)	$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}$
Accuracy	$\text{Accuracy} = \frac{n_{\text{correct}}}{n_{\text{total}}}$
Precision	$\text{Precision} = \frac{TP}{TP + FP}$
Recall	$\text{Recall} = \frac{TP}{TP + FN}$
F1 Score	$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$

For classification tasks (predicting material categories), Accuracy measures the overall proportion of correct predictions. However, given the class imbalance common in materials science, strictly relying on accuracy can be misleading. Therefore, Precision (the proportion of true positives among predicted positives), Recall (the proportion of true positives detected), and the F1 Score (the harmonic mean of Precision and Recall) are essential for a balanced evaluation. By systematically comparing models using these unified metrics, researchers can objectively quantify architectural advancements.
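The metrics in Table 5 can be written out directly. The sketch below uses small hypothetical label and prediction lists chosen to show two effects noted above: a single large regression error raises RMSE more than MAE, and an imbalanced classification task can show high accuracy together with low recall.

```python
import math

def regression_metrics(y_true, y_pred):
    """MAE, MSE and RMSE as defined in Table 5."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical regression example: three exact predictions and one outlier.
reg = regression_metrics([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 7.0])

# Hypothetical imbalanced classification: 3 positives among 8 samples,
# of which the model finds only one.
clf = classification_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0])
```

In the regression example the MAE is 1.0 while the RMSE is 2.0, reflecting the heavier penalty on the outlier. In the classification example the accuracy is 0.75 even though two of the three positives are missed, which the recall of 1/3 and the F1 score of 0.5 make visible.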

4.3. Performance comparison of mainstream frameworks

The preceding sections have detailed the innovations in GNNs architectures and highlighted key datasets. However, the practical outcomes of these works must ultimately be quantified through performance on standardized benchmarks. Section 4.3 provides a multi-dimensional performance evaluation. To ensure a rigorous horizontal comparison, the performance metrics cited in this section (Tables 6 and 7) are drawn from the original publications, in which models were evaluated using identical dataset splits, preprocessing protocols, and evaluation metrics. Standard deviations are reported where available in the source literature (e.g., CartNet); for other models, the reported scalar values represent the converged performance metrics under these standardized conditions.

Table 6.

The test MAE results of different frameworks on the JARVIS dataset.

Frameworks	Form. Energy (meV/atom)	Bandgap (OPT) (eV)	E_total (meV/atom)	Bandgap (MBJ) (eV)	Ehull (meV)
CGCNN³⁴	63	0.20	78	0.41	170
SchNet⁹¹	45	0.19	47	0.43	140
MEGNet⁸⁰	47	0.145	58	0.34	84
GATGNN⁸¹	47	0.170	56	0.51	120
ALIGNN⁶⁶	33.1	0.142	37	0.31	76
Matformer⁵⁶	32.5	0.137	35	0.30	64
PotNet⁸⁸	29.4	0.127	32	0.27	55
PerCNet⁸⁶	28.7	-	30.7	0.265	50.3
eComFormer¹⁹	28.4	0.124	32	0.28	44
iComFormer¹⁹	27.2	0.122	28.8	0.26	47
CartNet⁹⁰	27.05±0.007	0.115±0.003	26.58±0.58	0.253±0.005	43.9±0.36

Table 7.

The test MAE results of different frameworks on the Materials Project dataset.

Frameworks	Form. Energy (meV/atom)	Bandgap (eV)	Bulk moduli (log (GPa))	Shear moduli (log (GPa))
CGCNN³⁴	31	0.292	0.047	0.077
SchNet⁹¹	33	0.345	0.066	0.099
MEGNet⁸⁰	30	0.307	0.060	0.099
GATGNN⁸¹	33	0.280	0.045	0.075
ALIGNN⁶⁶	22	0.218	0.051	0.078
Matformer⁵⁶	21	0.211	0.043	0.073
PotNet⁸⁸	18.8	0.204	0.040	0.065
PerCNet⁸⁶	18.1	0.200	-	-
eComFormer¹⁹	18.16	0.202	0.0417	0.0729
iComFormer¹⁹	18.26	0.193	0.0380	0.0637
CartNet⁹⁰	17.47±0.38	0.191±0.003	0.033±0.00094	0.0637±0.0008

The data in Table 6 reveals a trajectory of performance improvement driven by architectural evolution and the refinement of graph construction strategies. Early models like CGCNN and SchNet exhibited relatively high MAEs, guiding subsequent optimizations. With the introduction of attention mechanisms and higher-order geometric information, frameworks such as GATGNN and ALIGNN achieved significant improvements. Recently, frameworks like CartNet have elevated performance to new heights. CartNet demonstrates superior performance across all five material property prediction tasks, with significantly lower prediction errors for formation energy, bandgap (OPT), and total energy compared to other models. This strongly demonstrates its exceptional generalization ability in capturing the “structure-property” relationship.

A noteworthy observation is that, although Transformer-based models such as Matformer and iComFormer perform slightly worse than CartNet, they still outperform earlier models with simpler crystal graph constructions that do not incorporate periodic patterns or equivariant information. This reveals an important insight: in crystal material property prediction, a framework tailored to the periodicity and passive symmetry of crystal structures possesses inherent advantages over general frameworks. This emphasizes that framework design should carefully consider inherent physical constraints rather than blindly pursuing complexity.

Table 7 displays the independent evaluation results on the Materials Project dataset. CartNet continues to deliver the best performance across most tasks, while the other models follow trends similar to those observed on JARVIS. Although absolute error values differ across the two datasets due to varying material types and structural complexities, the relative performance ranking remains largely consistent. This suggests that effective frameworks possess strong generalization capabilities across diverse data environments.

Finally, Table 8 highlights significant variations in computational costs among the evaluated models. As an earlier model using high-order geometric information, ALIGNN requires extended training times. In contrast, PotNet achieves the highest time efficiency through its optimized potential energy summation. Architecturally complex models like eComFormer and iComFormer show moderate training times per epoch but differentiate in convergence speed. It is important to note that while explicit runtime data for CartNet is not available for a direct standardized comparison in Table 8, its computational paradigm differs fundamentally from the aforementioned models. Unlike architectural approaches that incur cost via complex operations per step, augmentation-based methods like CartNet incur cost via increased effective dataset size. These discrepancies in the source of computational burden, algorithmic complexity versus data volume, prompt a deeper discussion on the cost-effectiveness of different symmetry-preserving strategies, which will be elaborated in the following section.

Table 8.

Training time per epoch, total training time, and total testing time for selected frameworks on JARVIS formation energy prediction.

Frameworks	Time/epoch(s)	Total training time(h)	Total testing time(s)
ALIGNN⁶⁶	327	27.3	156
Matformer⁵⁶	64	8.9	59
PotNet⁸⁸	42	5.8	31
eComFormer¹⁹	115	16.0	-
iComFormer¹⁹	78	15.2	-
PerCNet⁸⁶	-	19.37	115

4.4. Discussion: The trade-off between learning and enforcing symmetry

The performance analysis in Section 4.3 highlights a critical divergence in how modern frameworks address crystal symmetries. Although symmetric architectures such as ComFormer and augmentation-based approaches like CartNet both achieve high accuracy, they represent fundamentally different paradigms with distinct trade-offs regarding training efficiency, data efficiency, and architectural rigidity.

Architecturally invariant models, including ComFormer and Matformer, incorporate symmetry as a hard inductive bias directly into the neural network architecture. This approach yields high data efficiency because the model is mathematically constrained to be invariant, meaning it does not require rotated versions of a crystal to learn that they are identical. However, this rigorous enforcement incurs a specific computational cost. The complexity per forward pass is often higher due to the overhead of specialized geometric operations, such as SO(3) equivariant tensor products or complex edge updates. These models are therefore ideal when training data is scarce or when strict physical guarantees of exact invariance are required.

In contrast, the paradigm of learning invariance via data augmentation, exemplified by CartNet, achieves the best performance through a different mechanism. This strategy typically employs a computationally efficient backbone, such as a radius-based graph, trained on a dataset expanded by random SO(3) rotations. While the backbone itself might be lighter per step than an equivariant transformer, the model must learn approximate invariance from the data itself. Theoretically, achieving the same level of symmetry coverage as an invariant architecture requires a significantly larger effective dataset and potentially more training iterations. Therefore, the computational burden shifts from complex operators within the architecture to increased data volume and total training duration. A primary advantage of this paradigm is its flexibility, as it allows researchers to use simpler and more scalable backbone networks without designing complex equivariant layers. This flexibility offers a potential advantage for large-scale pre-training scenarios where data throughput is the primary concern.
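A single augmentation step of this kind can be sketched as follows. The rotation sampler, the random coordinates, and the diagonal ADP tensor are hypothetical; the sketch shows only the general pattern of rotating the inputs together with a tensor-valued target (U → R U Rᵀ) and is not CartNet's training pipeline.

```python
import numpy as np

def random_rotation(rng):
    """Sample a 3x3 rotation matrix (orthogonal, determinant +1)."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # remove the sign ambiguity of the QR factors
    if np.linalg.det(q) < 0:     # turn an improper rotation into a proper one
        q[:, 0] = -q[:, 0]
    return q

def distance_matrix(x):
    return np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)

rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 3))   # hypothetical Cartesian positions
adp = np.diag([0.01, 0.02, 0.03])  # hypothetical ADP tensor of one atom

R = random_rotation(rng)
coords_rot = coords @ R.T          # rotate the structure
adp_rot = R @ adp @ R.T            # rotate the tensor-valued target with it
```

Interatomic distances are unchanged by the rotation, so a purely distance-based input graph is identical before and after augmentation, while Cartesian inputs and the target tensor change. The model therefore has to learn the correct transformation behavior from many such rotated samples, which is where the additional data volume and training time discussed above come from.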

Ultimately, the choice between these paradigms depends on the specific constraints of the application. For scenarios characterized by data scarcity or a requirement for theoretical rigor, architecturally invariant symmetric GNNs offer a robust solution. Conversely, for benchmarks where maximizing prediction accuracy is the priority and computational resources permit extensive training on augmented data, the augmentation-based approach currently yields superior empirical results.

5. Conclusion and outlook

This review has systematically examined GNNs-based representation learning frameworks for crystalline materials, analyzing their architectural distinctions and comparative performance. By leveraging graph-structured modeling and innovative network designs, these frameworks overcome key limitations of traditional methods, particularly in capturing passive symmetry and higher-order geometric information within complex material systems. As a result, GNNs-based representation learning methods have emerged as highly promising tools in materials science, achieving notable success in predicting macroscopic material properties (such as band gap, formation energy, and total energy) and accelerating the discovery of novel materials.

Despite these advances, several critical challenges still hinder the broader adoption and performance of GNNs-based crystal models. First, the trade-off between symmetry enforcement and computational efficiency remains unresolved. As analyzed in Section 4.4, strictly symmetric models (e.g., ComFormer) rely on computationally expensive operations like tensor products or spherical harmonics to guarantee invariance, limiting their scalability to large-scale screening. Conversely, augmentation-based models (e.g., CartNet) sacrifice theoretical rigor for speed but suffer from low data efficiency. Second, current frameworks predominantly focus on perfect bulk crystals, neglecting the local symmetry breaking inherent in real-world materials (e.g., defects, grain boundaries). Third, most existing models are designed to output rotation-invariant scalars (e.g., energy) and cannot predict high-order tensorial properties, which are crucial for functional materials.

To address these issues, future research should prioritize the following technical directions:

(i) Lightweight Equivariance and Efficient Symmetry-Awareness. The next generation of Crystal GNNs must move beyond the binary choice of “heavy architectural constraints” versus “massive data augmentation.” Future work should explore lightweight equivariant mechanisms that approximate high-order geometric interactions while significantly reducing the computational overhead associated with traditional tensor products (such as those heavily utilized in microscopic irrep-based models like NequIP^49,50 and MACE,^51,52 as well as in rigorous macroscopic architectures like ComFormer¹⁹). A promising direction lies in geometric scalarization techniques,^76,79 which efficiently project vector features into rotation-invariant scalars, thereby bypassing expensive tensor contractions while preserving directional information. Additionally, exploring simplified equivariant message passing schemes, which decouple radial and angular components to streamline the update process, offers a path to scalable symmetry. The ultimate optimization target is to develop architectures that maintain the high “data efficiency” of symmetric models while approaching the “training speed” of standard scalar-based GNNs.

(ii) Modeling Symmetry Breaking in Non-Ideal Complex Systems. Existing symmetric GNNs rely heavily on global lattice periodicity. However, real-world applications often involve non-ideal structures such as point defects, disordered solid solutions, and heterostructure interfaces, where global translational symmetry is broken while local rotational symmetry persists. Future frameworks need to extend “Strict Periodic Invariance” to “Local Symmetry Awareness.” This involves developing adaptive graph construction strategies that can handle variable-sized neighborhoods and explicitly model the perturbation of local symmetry environments caused by defects, rather than treating them as noise within a perfect lattice.

(iii) From Scalar Predictions to Equivariant Tensor Properties. While current benchmarks focus on scalar properties (e.g., formation energy), many critical material functionalities, such as piezoelectricity, thermal conductivity, and dielectric response, are tensorial quantities that transform under rotation. There is a clear gap in the literature for GNNs capable of producing equivariant vector or tensor outputs. Future research should focus on designing readout heads that preserve the rank and equivariance of the internal feature representations, allowing models to directly predict 3×3 tensors or higher-order responses. This expansion is essential for applying GNNs to the discovery of optoelectronic and ferroelectric materials.

(iv) Physics-Informed Interpretability. To transition from “black-box” predictors to tools for physical discovery, models must offer greater interpretability. Future efforts should integrate physical laws not just as constraints, but as inductive biases in the learning process. For instance, aligning attention weights with interatomic force fields or integrating Hamiltonian-based priors into the message-passing update rules could ensure that the learned representations are physically meaningful. This would allow researchers to verify whether the model makes predictions based on genuine physical interactions or statistical artifacts.
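
Directions (i) and (iii) above rest on two elementary identities that can be checked numerically. The sketch below uses only the Python standard library and is not the scheme of any cited model: `scalarize` is a hypothetical stand-in for geometric scalarization (it forms all pairwise dot products of vector features), and `tensor_readout` is a hypothetical rank-2 readout head (a sum of outer products). The feature vectors and the test rotation are arbitrary values chosen for illustration.

```python
import math

def matvec(R, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def scalarize(vectors):
    """Direction (i): Gram matrix of dot products, unchanged by any rotation."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]

def tensor_readout(vectors):
    """Direction (iii): sum of outer products, a symmetric rank-2 tensor."""
    return [[sum(v[i] * v[j] for v in vectors) for j in range(3)]
            for i in range(3)]

def max_abs_diff(A, B):
    """Largest entrywise difference between two equally shaped matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Hypothetical per-atom vector features (e.g., aggregated bond directions).
vectors = [[1.0, 0.5, -0.2], [0.0, 2.0, 1.0], [-1.5, 0.3, 0.7]]

# A fixed test rotation: rotate about z by 0.7 rad, then about x by 1.1 rad.
c, s = math.cos(0.7), math.sin(0.7)
Rz = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
c, s = math.cos(1.1), math.sin(1.1)
Rx = [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]
R = matmul(Rx, Rz)
rotated = [matvec(R, v) for v in vectors]

# Scalarized features are identical before and after rotation (invariance).
invariance_error = max_abs_diff(scalarize(rotated), scalarize(vectors))

# The tensor readout of the rotated input equals R T R^T (equivariance).
expected = matmul(matmul(R, tensor_readout(vectors)), transpose(R))
equivariance_error = max_abs_diff(tensor_readout(rotated), expected)
```

Both errors are at the level of floating-point round-off. The scalarized features can be fed to an ordinary scalar network without losing invariance, while the outer-product readout yields a 3×3 output that rotates with the crystal, which is the behavior required of quantities such as a dielectric tensor. A sum of outer products is always symmetric, so a practical readout for general tensors would need additional terms.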

In summary, the field is transitioning from the initial phase of demonstrating feasibility to a new phase of rigorous physical alignment. Achieving fully efficient, interpretable, and generalizable material modeling will require continued interdisciplinary effort. With growing computational resources and richer material datasets, these methods are poised to significantly accelerate the intelligent design and discovery of new materials, opening up new frontiers in computational materials science.

Supplemental material

Supplemental material - Representation learning of crystal materials using Graph Neural Networks: Passive symmetry challenges and advances

Supplemental material for Representation learning of crystal materials using Graph Neural Networks: Passive symmetry challenges and advances by Jie Cui, Chenglei Han, Jiyao Liang, Lin Li and Fei Wang in Science Progress.

Footnotes

Acknowledgements

Shaorong Fang and Tianfu Wu from Information and Network Center of Xiamen University are acknowledged for the help with the GPU computing.

ORCID iD

Jie Cui

Author contributions

Jie Cui: Conceptualization, Data curation, Investigation, Visualization, Writing – original draft, Writing – review and editing. Chenglei Han: Writing – review and editing. Jiyao Liang: Writing – review and editing. Lin Li: Funding acquisition, Supervision, Writing – review and editing. Fei Wang: Funding acquisition, Supervision, Writing – review and editing.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partly supported by the National Key R&D Program of China (2019YFB2205001); the National Natural Science Foundation of China (62371407); the State Key Lab of Processors, Institute of Computing Technology, CAS - (CLQ202402); and the Shandong Province Natural Science Foundation, China (ZR2024MF073).

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

Data will be made available on request.

Supplemental material

Supplemental material for this article is available online.

References

1. Merchant, Batzner, Schoenholz, et al. Scaling deep learning for materials discovery. Nature 2023; 624: 80–85. https://doi.org/10.1038/s41586-023-06735-9

2. Dinic, Singh, Dong, et al. Applied machine learning for developing next‐generation functional materials. Advanced Functional Materials 2021; 31: 2104195. https://doi.org/10.1002/adfm.202104195

3. Shi, Zhou, Huang, et al. A review on the applications of graph neural networks in materials science at the atomic scale. Materials Genome Engineering Advances 2024; 2: e50. https://doi.org/10.1002/mgea.50

4. Ward, Agrawal, Choudhary, et al. A general-purpose machine learning framework for predicting properties of inorganic materials. npj Computational Materials 2016; 2: 1–7. https://doi.org/10.1038/npjcompumats.2016.28

5. Yao, Lum, Johnston, et al. Machine learning for a sustainable energy future. Nature Reviews Materials 2023; 8: 202–215. https://doi.org/10.1038/s41578-022-00490-5

6. Meredig, Agrawal, Kirklin, et al. Combinatorial screening for new materials in unconstrained composition space with machine learning. Physical Review B 2014; 89: 094104. https://doi.org/10.1103/physrevb.89.094104

7. Dobrzański. Significance of materials science for the future development of societies. Journal of Materials Processing Technology 2006; 175: 133–148. https://doi.org/10.1016/j.jmatprotec.2005.04.003

8. Wang, Deng, et al. Realization of a Z-classified chiral-symmetric higher-order topological insulator in a coupling-inverted acoustic crystal. Physical review letters 2023; 131: 157201. https://doi.org/10.1103/PhysRevLett.131.157201

9. Awad, Davies, Kitagawa, et al. Mechanical properties and peculiarities of molecular crystals. Chemical Society Reviews 2023; 52: 3098–3169. https://doi.org/10.1039/d2cs00481j

10. Oganov, Pickard, Zhu, et al. Structure prediction drives materials discovery. Nature Reviews Materials 2019; 4: 331–348. https://doi.org/10.1038/s41578-019-0101-8

11. Raccuglia, Elbert, Adler, et al. Machine-learning-assisted materials discovery using failed experiments. Nature 2016; 533: 73–76. https://doi.org/10.1038/nature17439

12. Maier, Stoewe, Sieg. Combinatorial and high‐throughput materials science. Angewandte chemie international edition 2007; 46: 6016–6067. https://doi.org/10.1002/anie.200603675

13. Kohn, Sham. Self-consistent equations including exchange and correlation effects. Physical review 1965; 140: A1133–A1138. https://doi.org/10.1103/physrev.140.a1133

14. Hamann, Schlüter, Chiang. Norm-conserving pseudopotentials. Physical review letters 1979; 43: 1494–1497. https://doi.org/10.1103/physrevlett.43.1494

15. Alder, Wainwright. Studies in molecular dynamics. I. General method. The Journal of Chemical Physics 1959; 31: 459–466. https://doi.org/10.1063/1.1730376

16. Plimpton. Fast parallel algorithms for short-range molecular dynamics. Journal of computational physics 1995; 117: 1–19. https://doi.org/10.1006/jcph.1995.1039

17. Clough. Original formulation of the finite element method. Finite elements in analysis and design 1990; 7: 89–101. https://doi.org/10.1016/0168-874x(90)90001-u

18. Clough. Early history of the finite element method from the view point of a pioneer. International journal for numerical methods in engineering 2004; 60: 283–287. https://doi.org/10.1002/nme.962

19. Yan, Qian, et al. Complete and Efficient Graph Transformers for Crystal Material Property Prediction. In: The Twelfth International Conference on Learning Representations 2024. OpenReview.net.

20. Wei, Chu, Sun, et al. Machine learning in materials science. InfoMat 2019; 1: 338–358. https://doi.org/10.1002/inf2.12028

21. Ramakrishna, Zhang, T-Y, W-C, et al. Materials informatics. Journal of Intelligent Manufacturing 2019; 30: 2307–2326. https://doi.org/10.1007/s10845-018-1392-0

22. Himanen, Geurts, Foster, et al. Data‐driven materials science: status, challenges, and perspectives. Advanced Science 2019; 6: 1900808. https://doi.org/10.1002/advs.201900808

23. Schleder, Padilha, Acosta, et al. From DFT to machine learning: recent approaches to materials science–a review. Journal of Physics: Materials 2019; 2: 032001. https://doi.org/10.1088/2515-7639/ab084b

24. Zheng. Methods, progresses, and opportunities of materials informatics. InfoMat 2023; 5: e12425. https://doi.org/10.1002/inf2.12425

25. Rodrigues, JJF, Florea, De Oliveira, et al. Big data and machine learning for materials science. Discover Materials 2021; 1: 12. https://doi.org/10.1007/s43939-021-00012-0

26. Sivan, Satheesh Kumar, Abdullah, et al. Advances in materials informatics: a review. Journal of Materials Science 2024; 59: 2602–2643. https://doi.org/10.1007/s10853-024-09379-w

27. Petkovic, Vieira, Dropka. Machine learning in crystal growth: A review of methods, data, and applications. Progress in Crystal Growth and Characterization of Materials 2025; 71: 100689. https://doi.org/10.1016/j.pcrysgrow.2025.100689

28. Sadeghian, Palevicius, Janusas. A comprehensive review of machine-learning approaches for crystal structure/property prediction. Crystals 2025; 15: 925. https://doi.org/10.3390/cryst15110925

29. Zhang, Wang, Helwig, et al. Artificial intelligence for science in quantum, atomistic, and continuum systems. Foundations and Trends® in Machine Learning 2025; 18: 385–912. https://doi.org/10.1561/2200000115

30. Ramprasad, Batra, Pilania, et al. Machine learning in materials informatics: recent applications and prospects. npj Computational Materials 2017; 3: 54. https://doi.org/10.1038/s41524-017-0056-5

31. Goodall, Lee. Predicting materials properties without crystal structure: deep representation learning from stoichiometry. Nature communications 2020; 11: 6280. https://doi.org/10.1038/s41467-020-19964-7

32. Peivaste, Ramezani, Alahyarizadeh, et al. Rapid and accurate predictions of perfect and defective material properties in atomistic simulation using the power of 3D CNN-based trained artificial neural networks. Scientific Reports 2024; 14: 36. https://doi.org/10.1038/s41598-023-50893-9

33. Choudhary, DeCost, Chen, et al. Recent advances and applications of deep learning methods in materials science. npj Computational Materials 2022; 8: 59. https://doi.org/10.1038/s41524-022-00734-6

34. Xie, Grossman. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Physical review letters 2018; 120: 145301. https://doi.org/10.1103/PhysRevLett.120.145301

35. Gong, Yan, Xie, et al. Examining graph neural networks for crystal structures: limitations and opportunities for capturing periodicity. Science Advances 2023; 9: eadi3245. https://doi.org/10.1126/sciadv.adi3245

36. Cheng, Zhang, Dong. A geometric-information-enhanced crystal graph network for predicting properties of materials. Communications Materials 2021; 2: 92. https://doi.org/10.1038/s43246-021-00194-3

37. Zhou, et al. Reinforce crystal material property prediction with comprehensive message passing via deep graph networks. Computational Materials Science 2024; 239: 112958. https://doi.org/10.1016/j.commatsci.2024.112958

38. Yang, Buehler. Linking atomic structural defects to mesoscale properties in crystalline solids using graph neural networks. Npj Computational Materials 2022; 8: 198. https://doi.org/10.1038/s41524-022-00879-4

39. Pandey, Stevanović, et al. Predicting energy and stability of known and hypothetical crystals using graph neural network. Patterns 2021; 2: 100361. https://doi.org/10.1016/j.patter.2021.100361

40. Dong, Feng, et al. SLI-GNN: a self-learning-input graph neural network for predicting crystal and molecular properties. The Journal of Physical Chemistry A 2023; 127: 5921–5929. https://doi.org/10.1021/acs.jpca.3c01558

41. Scarselli, Gori, Tsoi, et al. The graph neural network model. IEEE transactions on neural networks 2008; 20: 61–80. https://doi.org/10.1109/TNN.2008.2005605

42. Reiser, Neubert, Eberhard, et al. Graph neural networks for materials science and chemistry. Communications Materials 2022; 3: 93. https://doi.org/10.1038/s43246-022-00315-6

43. Qureshi, Bang, Doh. Deep learning in-depth analysis of crystal graph convolutional neural networks: A new era in materials discovery and its applications. Nanotechnology Reviews 2025; 14: 20250200. https://doi.org/10.1515/ntrev-2025-0200

44. Han, Cen, et al. A survey of geometric graph neural networks: Data structures, models and applications. Frontiers of Computer Science 2025; 19: 1911375. https://doi.org/10.1007/s11704-025-41426-w

45. Thomas, Smidt, Kearnes, et al. Tensor field networks: Rotation-and translation-equivariant neural networks for 3d point clouds. 2018; arXiv preprint arXiv:180208219.

46. Weiler, Geiger, Welling, et al. 3d steerable cnns: Learning rotationally equivariant features in volumetric data. In: Advances in Neural information processing systems 2018. Curran Associates.

47. Kondor, Lin, Trivedi. Clebsch–gordan nets: a fully fourier space spherical convolutional neural network. In: Advances in Neural Information Processing Systems 2018. Curran Associates.

48. Geiger, Smidt. e3nn: Euclidean neural networks. arXiv preprint arXiv:220709453 2022.

49. Batzner, Musaelian, Sun, et al. E (3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nature communications 2022; 13: 2453. https://doi.org/10.1038/s41467-022-29939-5

50. Tan, Descoteaux, Kotak, et al. High-performance training and inference for deep equivariant interatomic potentials. 2025; arXiv preprint arXiv:250416068.

51. Batatia, Kovacs, Simm, et al. MACE: Higher order equivariant message passing neural networks for fast and accurate force fields. In: Advances in neural information processing systems. Curran Associates, 2022, pp. 11423–11436.

52. Batatia, Batzner, Kovács, et al. The design space of E (3)-equivariant atom-centred interatomic potentials. Nature Machine Intelligence 2025; 7: 56–67. https://doi.org/10.1038/s42256-024-00956-x

53. Musaelian, Batzner, Johansson, et al. Learning local equivariant representations for large-scale atomistic dynamics. Nature Communications 2023; 14: 579. https://doi.org/10.1038/s41467-023-36329-y

54. Kozinsky, Musaelian, Johansson, et al. Scaling the leading accuracy of deep equivariant models to biomolecular simulations of realistic size. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. Association for Computing Machinery, 2023, pp. 1–12.

55. Liu, et al. Symmetry-informed geometric representation for molecules, proteins, and crystalline materials. In: Advances in neural information processing systems. Curran Associates, 2023, pp. 66084–66101.

56. Yan, Liu, Lin, et al. Periodic graph transformers for crystal material property prediction. In: Advances in Neural Information Processing Systems. Curran Associates, 2022, pp. 15066–15080.

57. Tie, et al. Leveraging se (3) equivariance for learning 3d geometric shape assembly. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. IEEE, 2023, pp. 14311–14320.

58. Lin, Song, Zhang, et al. SE (3)-equivariant point cloud-based place recognition. In: Conference on Robot Learning. PMLR, 2023, pp. 1520–1530.

59. Chatzipantazis, Pertigkiozoglou, Dobriban, et al. SE(3)-Equivariant Attention Networks for Shape Reconstruction in Function Space. In: The Eleventh International Conference on Learning Representations 2023. OpenReview.net.

60. Ito, Taniai, Igarashi, et al. Rethinking the role of frames for SE (3)-invariant crystal structure modeling. In: The Thirteenth International Conference on Learning Representations 2025. OpenReview.net.

61. Pakornchote, Ektarawong, Chotibut. StrainTensorNet: Predicting crystal structure elastic properties using SE (3)-equivariant graph neural networks. Physical Review Research 2023; 5: 043198. https://doi.org/10.1103/physrevresearch.5.043198

62. Unke, Bogojeski, Gastegger, et al. SE (3)-equivariant prediction of molecular wavefunctions and electronic densities. In: Advances in Neural Information Processing Systems 2021. Curran Associates, pp. 14434–14447.

63. Seo, Yoo, Chang, et al. SE (3)-equivariant Robot Learning and Control: A Tutorial Survey. International Journal of Control, Automation and Systems 2025; 23: 1271–1306. https://doi.org/10.1007/s12555-025-0193-4

64. Wang, Kong, Gregoire, et al. Conformal crystal graph transformer with robust encoding of periodic invariance. In: Proceedings of the AAAI Conference on Artificial Intelligence. AAAI, 2024, pp. 283–291.

65. Xiao, Tang, Liu. Integrating Materials Representations Into Feature Engineering in Machine Learning for Crystalline Materials: From Local to Global Chemistry‐Structure Information Coupling. Wiley Interdisciplinary Reviews: Computational Molecular Science 2025; 15: e70044. https://doi.org/10.1002/wcms.70044

66. Choudhary, DeCost. Atomistic line graph neural network for improved materials property predictions. npj Computational Materials 2021; 7: 185. https://doi.org/10.1038/s41524-021-00650-1

67. Huang, Xing, et al. Ada-gnn: Atom-distance-angle graph neural network for crystal material property prediction. 2024; arXiv preprint arXiv:240111768.

68. Wang, Hui, et al. DenseGNN: universal and scalable deeper graph neural networks for high-performance property prediction in crystals and molecules. npj Computational Materials 2024; 10: 292. https://doi.org/10.1038/s41524-024-01444-x

69. Dai, Demirel, Liang, et al. Graph neural networks for an accurate and interpretable prediction of the properties of polycrystalline materials. npj Computational Materials 2021; 7: 103. https://doi.org/10.1038/s41524-021-00574-w

70. Wang, Zhou, sheng Ren, et al. Local angle information propagation model based on dual scale for crystal property prediction. Computational Materials Science 2025; 250: 113740. https://doi.org/10.1016/j.commatsci.2025.113740

71. Feng, Tian. Improving Crystal Property Prediction from a Multiplex Graph Perspective. Journal of Chemical Information and Modeling 2024; 64: 7376–7385. https://doi.org/10.1021/acs.jcim.4c01200

72. Kipf, Welling. Semi-Supervised Classification with Graph Convolutional Networks. In: The Fifth International Conference on Learning Representations 2017. OpenReview.net.

73. Vaswani, Shazeer, Parmar, et al. Attention is all you need. In: Advances in neural information processing systems. Curran Associates, 2017, pp. 6000–6010.

74. Veličković, Cucurull, Casanova, et al. Graph attention networks. 2017; arXiv preprint arXiv:171010903.

75. Gasteiger, Groß, Günnemann. Directional message passing for molecular graphs. 2020; arXiv preprint arXiv:200303123.

76. Gasteiger, Becker, Günnemann. Gemnet: Universal directional graph neural networks for molecules. In: Advances in Neural Information Processing Systems. Curran Associates, 2021, pp. 6790–6802.

77. Liu, Wang, Liu, et al. Spherical message passing for 3d molecular graphs. In: The Tenth International Conference on Learning Representations 2022. OpenReview.net.

78. Wang, Liu, Lin, et al. ComENet: Towards complete and efficient message passing for 3D molecular graphs. In: Advances in Neural Information Processing Systems. Curran Associates, 2022, pp. 650–664.

79. Schütt, Unke, Gastegger. Equivariant message passing for the prediction of tensorial properties and molecular spectra. In: International conference on machine learning. PMLR, 2021, pp. 9377–9388.

80. Chen, Zuo, et al. Graph networks as a universal machine learning framework for molecules and crystals. Chemistry of Materials 2019; 31: 3564–3572. https://doi.org/10.1021/acs.chemmater.9b01294

81. Louis, S-Y, Zhao, Nasiri, et al. Graph convolutional neural networks with global attention for improved materials property prediction. Physical Chemistry Chemical Physics 2020; 22: 18141–18148. https://doi.org/10.1039/d0cp01474e

82. Park, Wolverton. Developing an improved crystal graph convolutional neural network framework for accelerated materials discovery. Physical Review Materials 2020; 4: 063801. https://doi.org/10.1103/physrevmaterials.4.063801

83. Omee, Louis, S-Y, et al. Scalable deeper graph neural networks for high-performance materials property prediction. Patterns 2022; 3: 100491. https://doi.org/10.1016/j.patter.2022.100491

84. Chen, Ong. A universal graph deep learning interatomic potential for the periodic table. Nature Computational Science 2022; 2: 718–728. https://doi.org/10.1038/s43588-022-00349-3

85. Deng, Zhong, Jun, et al. CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nature Machine Intelligence 2023; 5: 1031–1041. https://doi.org/10.1038/s42256-023-00716-3

86. Huang, Xing, et al. PerCNet: Periodic complete representation for crystal graphs. Neural Networks 2025; 181: 106841. https://doi.org/10.1016/j.neunet.2024.106841

87. Gao, Guo, X-W, et al. GCPNet: An interpretable Generic Crystal Pattern graph neural Network for predicting material properties. Neural Networks 2025; 188: 107466. https://doi.org/10.1016/j.neunet.2025.107466

88. Lin, Yan, Luo, et al. Efficient approximations of complete interatomic potentials for crystal property prediction. In: International conference on machine learning. PMLR, 2023, pp. 21260–21287.

89. Kosmala, Gasteiger, Gao, et al. Ewald-based long-range message passing for molecular graphs. In: International Conference on Machine Learning. PMLR, 2023, pp. 17544–17563.

90. Solé, Mosella-Montoro, Cardona, et al. A Cartesian encoding graph neural network for crystal structure property prediction: application to thermal ellipsoid estimation. Digital Discovery 2025; 4: 694–710. https://doi.org/10.1039/d4dd00352g

91. Schütt, Kindermans, P-J, Sauceda, FHE, et al. Schnet: A continuous-filter convolutional neural network for modeling quantum interactions. In: Advances in neural information processing systems. Curran Associates, 2017, pp. 992–1002.

92. Jain, Ong, Hautier, et al. Commentary: The Materials Project: A materials genome approach to accelerating materials innovation. APL materials 2013; 1: 011002. https://doi.org/10.1063/1.4812323

93. Choudhary, Garrity, Reid, et al. The joint automated repository for various integrated simulations (JARVIS) for data-driven materials design. npj computational materials 2020; 6: 173. https://doi.org/10.1038/s41524-020-00440-1

94. Wines, Gurunathan, Garrity, et al. Recent progress in the JARVIS infrastructure for next-generation data-driven materials design. Applied Physics Reviews 2023; 10: 041302. https://doi.org/10.1063/5.0159299

95. Dunn, Wang, Ganose, et al. Benchmarking materials property prediction methods: the Matbench test set and Automatminer reference algorithm. npj Computational Materials 2020; 6: 138. https://doi.org/10.1038/s41524-020-00406-3
