Helonium's Hartree-Fock Program

4 months ago 16

Constructing the integrals' tensor is complicated6 and is the main reason for the poor scaling of electronic structure methods. The \(1s\) orbitals are the simplest case, and here two types of integrals are analytical (S, T) while the rest already lacks a closed-form solution (V, ERI):

S ← {a‿b𝕊𝕩: (1.5⋆˜π÷a+b) × ⋆-𝕩×a(×÷+)b} T ← {a‿b𝕊𝕩: f ← a(×÷+)b ⋄ ×´⟨1.5⋆˜π÷a+b, (3×f)-2×𝕩×טf, ⋆-𝕩×f⟩} V ← {a‿b‿z𝕊r‿s: ×´⟨-2×z×π÷a+b, E s×a+b, ⋆-r×a(×÷+)b⟩} ERI ← {a‿b‿c‿d𝕊r1‿r2‿r3‿r4: r5 ← -´⟨a‿b ⋄ c‿d⟩ (+´∘×÷+´∘⊣)¨ ⟨r1‿r2 ⋄ r3‿r4⟩ f1‿f2‿f3 ← a‿b ({(√∘+××)⋈(×÷+)}○(+´)∾<∘⋈○((×÷+)´)) c‿d ×´⟨f1÷˜2×π⋆5÷2, E f2×טr5, ⋆-+´f3×ט-´¨⟨r1‿r2, r3‿r4⟩⟩ }

Derivation strategy

We need to compute the overlap (S), kinetic energy (T), nuclear attraction (V), and four-center (ERI) integrals. Crucially, the product of two Gaussians at different centers is proportional to a Gaussian at a scaled center. This property, combined with the Laplacian of a Gaussian, readily yields S and T. The remaining two sets are more complex: we combine the Gaussians as before, then transform to reciprocal space where the delta distribution arises and simplifies the problem to this integration by reduction:

\begin{equation*} I(x) = \int_0^{\infty}{{{e^ {- a\,k^2 }\,\sin \left(k\,x\right)}\over{k}}\;dk} \sim \text{Erf}(x) \end{equation*}

The following namespace exports the corresponding integral arrays. Extending the code to an arbitrary number of atoms implies mapping over an array of coordinates, as opposed to fusing them in the implementation.

I ← {𝕊e1‿e2‿z1‿z2‿r: bs‿na‿nb ← (<∾·≢⊏∘>)⍉>STO¨ e1‿e2 M ← {∾‿×({2: {nb(⋈˜/∘⋈˜)⊸⊔𝕎⌜˜𝕩}; 4: {𝕎⌜⍟3˜𝕩}}𝕩)¨○⊢<∘∾˘bs} sm‿hcore ⇐ {e𝕊c: mst ← ⌽⊸≍∾⟜0טr r1‿r2 ← <˘⍉⁼> (r⊸-⊸⋈˜×⟜r÷+)⌜´ ⊏bs mv ← ט∘{[0‿2,3‿1]⊏({0‿𝕨¨𝕩}⟜𝕩¨𝕨)∾⋈⟜⍉r⋈¨𝕩}´¨⟨0‿r, r1⟩‿⟨r‿0, r2⟩ (⊑⋈·+´1⊸↓)+´∘⥊¨¨ c<⊸× ({e𝕏¨¨mst}¨S‿T) ∾ z1‿z2{e∾⟜𝕨⊸V¨¨𝕩}¨mv }´ M 2 erim ⇐ {e𝕊c: meri ← (c⊸×⊣ERI¨⊢/˜·<¨≢∘⊣÷≢∘⊢)⟜{0‿r⊏˜⚇1↕na¨↕=𝕩} e =⊸{+˝∘⥊⎉𝕨 (2×↕𝕨)⍉⁼(na‿nb⥊˜na×𝕨)⥊𝕩} meri }´ M 4 }

Performance

The computation of the ERIs is expected to be the primary bottleneck, as there are N⋆4 of them—in our case, 16. The required tensors have a shape of 6¨↕4. As shown in the profile below, using an array-based strategy for the ERIs significantly improved their computational efficiency compared to the two-center integrals. For the latter, I increased the depth by grouping the tables (block matrices). The resulting code was significantly slower than replicating the elements to match each axis' length, like I do for the ERIs.

)profile {𝕊: I∘System 1.4632}¨↕1e4
Got 38006 samples (self-hosted runtime1): 1067 samples (REPL): 36939 samples: 72│I ← {𝕊e1‿e2‿z1‿z2‿r: 68│ bs‿na‿nb ← (<∾·≢⊏∘>)⍉>STO¨ e1‿e2 2053│ M ← {∾‿×({2: {nb(⋈˜/∘⋈˜)⊸⊔𝕎⌜˜𝕩}; 4: {𝕎⌜⍟3˜𝕩}}𝕩)¨○⊢<∘∾˘bs} │ 245│ sm‿hcore ⇐ {e𝕊c: 75│ mst ← ⌽⊸≍∾⟜0טr 4181│ r1‿r2 ← <˘⍉⁼> (r⊸-⊸⋈˜×⟜r÷+)⌜´ ⊏bs 16277│ mv ← ט∘{[0‿2,3‿1]⊏({0‿𝕨¨𝕩}⟜𝕩¨𝕨)∾⋈⟜⍉r⋈¨𝕩}´¨⟨0‿r, r1⟩‿⟨r‿0, r2⟩ 8830│ (⊑⋈·+´1⊸↓)+´∘⥊¨¨ c<⊸× ({e𝕏¨¨mst}¨S‿T) ∾ z1‿z2{e∾⟜𝕨⊸V¨¨𝕩}¨mv 3708│ }´ M 2 │ 8│ erim ⇐ {e𝕊c: 1100│ meri ← (c⊸×⊣ERI¨⊢/˜·<¨≢∘⊣÷≢∘⊢)⟜{0‿r⊏˜⚇1↕na¨↕=𝕩} e 318│ =⊸{+˝∘⥊⎉𝕨 (2×↕𝕨)⍉⁼(na‿nb⥊˜na×𝕨)⥊𝕩} meri 4│ }´ M 4 │}

Morals: Never underestimate the power of vectorization and reshaping operations are often computationally trivial.

Read Entire Article