Constructing the integrals' tensor is complicated6 and is the main reason for the poor scaling of electronic structure methods. The \(1s\) orbitals are the simplest case, and here two types of integrals are analytical (S, T) while the rest already lacks a closed-form solution (V, ERI):
Derivation strategy
We need to compute the overlap (S), kinetic energy (T), nuclear attraction (V), and four-center (ERI) integrals. Crucially, the product of two Gaussians at different centers is proportional to a Gaussian at a scaled center. This property, combined with the Laplacian of a Gaussian, readily yields S and T. The remaining two sets are more complex: we combine the Gaussians as before, then transform to reciprocal space where the delta distribution arises and simplifies the problem to this integration by reduction:
\begin{equation*} I(x) = \int_0^{\infty}{{{e^ {- a\,k^2 }\,\sin \left(k\,x\right)}\over{k}}\;dk} \sim \text{Erf}(x) \end{equation*}The following namespace exports the corresponding integral arrays. Extending the code to an arbitrary number of atoms implies mapping over an array of coordinates, as opposed to fusing them in the implementation.
Performance
The computation of the ERIs is expected to be the primary bottleneck, as there are N⋆4 of them—in our case, 16. The required tensors have a shape of 6¨↕4. As shown in the profile below, using an array-based strategy for the ERIs significantly improved their computational efficiency compared to the two-center integrals. For the latter, I increased the depth by grouping the tables (block matrices). The resulting code was significantly slower than replicating the elements to match each axis' length, like I do for the ERIs.
Morals: Never underestimate the power of vectorization and reshaping operations are often computationally trivial.
.png)

