Research Paper | Researchia:202607.02035

High-performance parallel implementation of high-order coupled-cluster theories

Yu Jin

Submitted: July 2, 2026
Subjects: Chemistry

Description / Details

High-order coupled-cluster theories with iterative triples (CCSDT), perturbative quadruples [CCSDT(Q)], and iterative quadruples (CCSDTQ) provide benchmark-quality correlation energies, but their steep computational scalings, $O(N^8)$, $O(N^9)$, and $O(N^{10})$, together with the large memory requirements of high-order amplitude tensors, have historically limited their application to small molecules. In this work, we develop efficient open-source implementations of spin-restricted CCSDT (RCCSDT), RCCSDT(Q), RCCSDTQ, and spin-unrestricted CCSDT (UCCSDT) within the PySCF package. The shared-memory implementation combines compact triangular storage of the highest-order amplitude tensors with the multithreaded tensor contraction backend pytblis, enabling efficient use of modern many-core CPU architectures. This design delivers near-ideal thread scaling up to 90 cores and achieves wall times shorter than or comparable to existing single-node implementations for representative benchmark molecules. We further extend RCCSDT, RCCSDT(Q), and RCCSDTQ to distributed-memory architectures using MPI-based algorithms. By distributing compact high-order amplitudes across MPI ranks and overlapping communication with computation through nonblocking data transfers, the distributed implementation achieves near-ideal strong scaling on up to 32 nodes, corresponding to approximately 3,000 CPU cores. These developments substantially extend the practical reach of canonical high-order CC theory, enabling CCSDT(Q) calculations with approximately 100 correlated electrons in 450 orbitals and CCSDTQ calculations with approximately 50 correlated electrons in 115 orbitals. Applications to ππ-stacked noncovalent dimers, the CO dissociation energy of Cr(CO)$_6$, and the Cope rearrangement of semibullvalene demonstrate that canonical high-order CC benchmarks are now feasible for chemically realistic molecular systems.
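The abstract does not describe the storage layout itself, so the following is only a toy sketch of the general idea behind compact triangular storage, not the paper's implementation. It assumes a fully symmetric three-index array and keeps only the entries with i <= j <= k, which cuts storage from n^3 to n(n+1)(n+2)/6 elements. Real triples amplitudes are symmetric under simultaneous permutation of occupied and virtual index pairs, which is more involved. The function names `packed_size`, `pack_symmetric`, and `unpack_symmetric` are hypothetical.

```python
import itertools
import numpy as np


def packed_size(n):
    """Number of index triples (i, j, k) with i <= j <= k < n."""
    return n * (n + 1) * (n + 2) // 6


def pack_symmetric(t):
    """Keep only the i <= j <= k entries of a fully symmetric (n, n, n) array."""
    n = t.shape[0]
    idx = np.array(list(itertools.combinations_with_replacement(range(n), 3)))
    return t[idx[:, 0], idx[:, 1], idx[:, 2]]


def unpack_symmetric(packed, n):
    """Rebuild the full array by copying each stored value to all index permutations."""
    full = np.empty((n, n, n), dtype=packed.dtype)
    triples = itertools.combinations_with_replacement(range(n), 3)
    for value, (i, j, k) in zip(packed, triples):
        for p in itertools.permutations((i, j, k)):
            full[p] = value
    return full


# Build a random fully symmetric test tensor by averaging over all axis permutations.
n = 6
rng = np.random.default_rng(0)
a = rng.standard_normal((n, n, n))
sym = sum(a.transpose(p) for p in itertools.permutations(range(3))) / 6

packed = pack_symmetric(sym)
restored = unpack_symmetric(packed, n)

print(packed.size, "stored elements instead of", sym.size)
print("round trip exact:", np.allclose(restored, sym))
```

For large n the packed form approaches one sixth of the full size. A production code would also contract directly on the packed data rather than unpacking the whole tensor, which this sketch does not attempt.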


Source: arXiv:2607.00981v1 (http://arxiv.org/abs/2607.00981v1)
PDF: https://arxiv.org/pdf/2607.00981v1
