A critical assessment of bonding descriptors for predicting materials properties
Abstract
Most machine learning models for materials science rely on descriptors based on materials compositions and structures, even though the chemical bond has been proven to be a valuable concept for predicting materials properties. Over the years, various theoretical frameworks have been developed to characterize bonding in solid-state materials. However, integrating bonding information from these frameworks into machine learning pipelines at scale has been limited by the lack of a systematically generated and validated database. Recent advances in high-throughput bonding analysis workflows have addressed this issue, and our previously computed Quantum-Chemical Bonding Database for Solid-State Materials was extended to include approximately 13,000 materials. This database is then used to derive a new set of quantum-chemical bonding descriptors. A systematic assessment is performed using statistical significance tests to evaluate how the inclusion of these descriptors influences the performance of machine-learning models that otherwise rely solely on structure- and composition-derived features. Models are built to predict elastic, vibrational, and thermodynamic properties typically associated with chemical bonding in materials. The results demonstrate that incorporating quantum-chemical bonding descriptors not only improves predictive performance but also helps identify intuitive expressions for properties such as the projected force constant and lattice thermal conductivity via symbolic regression.
Source: arXiv:2602.12109v1 - http://arxiv.org/abs/2602.12109v1 PDF: https://arxiv.org/pdf/2602.12109v1 Original Link: http://arxiv.org/abs/2602.12109v1