Skip to article frontmatterSkip to article content

Molecular quantum mechanics

In this chapter I briefly present the background required to understand the basic algorithms used to run ab-initio molecular dynamics (AIMD) simulations. The methods presented here makes it possible to simulate the dynamics of atoms and molecules from first principles, which is what “ab initio” means.

1A minimalistic introduction to quantum mechanics

Quantum mechanics is a theoretical framework that describes the behavior of matter and energy at the smallest scales. Unlike classical mechanics, where objects have definite positions and velocities, quantum mechanics introduces the concept of wave-particle duality, where particles exhibit both wave-like and particle-like properties.

The full information about the quantum state of a given quantum system is contained in a complex-valued function of the spatial and time coordinates r\vec{r} and tt, called the wavefunction, Φ(t,r)\Phi(t, \vec{r}). Max Born’s interpretation of the wavefunction provides a probabilistic view of quantum mechanics. According to Born’s rule, the probability of finding a particle at position r\vec{r} at time tt is proportional to Φ(t,r)2|\Phi(t, \vec{r})|^2. Thus, the wavefunction must be normalized, meaning that the total probability of finding the particle anywhere in space must equal 1:

VΦ(t,r)2dr=VΨ(r)2dr=1,\int_V |\Phi(t, \vec{r})|^2 d\vec{r} = \int_V |\Psi(\vec{r})|^2 d\vec{r} = 1,

where we used the factorisation Φ(t,r)=ϕ(t)Ψ(r)\Phi(t,\vec{r}) = \phi(t) \Psi(\vec{r}) which will be justified in a moment, and the integral is carried out over the system’s volume, VV. This probabilistic interpretation is one of the fundamental departures from classical mechanics, where particles have deterministic trajectories. Note that, in order for Eq. (1) to make sense, wave functions need to be “square-integrable” or, in other words, to belong to the Hilbert space L2(R3)L^2(\mathbb{R}^3). This means that it is also possible to define an inner product between any two wave functions ϕ and ψ:

ϕψ=ϕ(r)ψ(r)dr,\langle \phi | \psi \rangle = \int \phi^*(\vec r) \psi(\vec r) d\vec r,

where ϕ(r)\phi^*(\vec{r}) is the complex conjugate of ϕ(r)\phi(\vec r), and we also introduced the bra-ket notation.

In quantum mechanics, physical quantities (observables) such as energy, momentum, and position are represented by operators that act on the wavefunction. For example, the position operator is r^=r\hat{\vec{r}} = \vec{r}, and the momentum operator, is p^=i\hat{\vec{p}} = -i\hbar \nabla, where h/2π\hbar \equiv h / 2 \pi is the reduced Planck constant. The expectation value of an observable over a wavefunction Ψ, which is the average value measured in an experiment, is given by:

A^=ΨA^Ψ=Ψ(r)A^Ψ(r)dr,\langle \hat{A} \rangle = \langle \Psi | \hat A | \Psi \rangle = \int \Psi^*(\vec{r}) \hat{A} \Psi(\vec{r}) d\vec{r},

where A^\hat{A} is the operator corresponding to the observable.

The Hamiltonian operator, or simply Hamiltonian, of a system, H^\hat{H}, represents the total energy of the system (kinetic and potential) and determines its evolution. The Hamiltonian typically takes the form:

H^=p^22m+V(r)=22m2+V(r)\hat{H} = \frac{\hat p^2}{2m} + V(\vec{r}) = -\frac{\hbar^2}{2m} \nabla^2 + V(\vec{r})

where mm is the mass of the particle, p^=i\hat p = -i\hbar \nabla is the particle’s momentum, 2\nabla^2 is the Laplacian operator, representing the kinetic energy contribution, and V(r)V(\vec{r}) is the potential energy as a function of position. For a system of multiple particles, the Hamiltonian becomes more complex, accounting for the kinetic and potential energies of all particles as well as their interactions.

Given a Hamiltonian, the fundamental equation that determines how the wavefunction of a system evolves over time is the time-dependent Schrödinger equation:

idΦ(t,r)dt=H^Φ(t,r).i\hbar \frac{d \Phi(t, \vec{r})}{dt} = \hat H \Phi(t, \vec{r}).

If H^\hat H does not depend explicitly on time, the dependence on time and spatial variables of the wavefunction can be separated, i.e. Φ(t,r)=Ψ(r)ϕ(t)\Phi(t, \vec{r}) = \Psi(\vec{r}) \phi(t). The time-dependence of Eq. (5) can be explicitly integrated, yielding

ϕ(t)=eiEt/,\phi(t) = e^{-iEt / \hbar},

where EE is the expectation value of the Hamiltonian and will be further discussed below. The time-independent part of the Schrödinger equation is instead

H^Ψ(r)=EΨ(r).\hat{H} \Psi(\vec{r}) = E \Psi(\vec{r}).

This is a partial differential eigenvalue equation, where an operator acts on a function, called eigenfunction, and returns the same function multiplied by the eigenvalue associated to the eigenfunction. Here the eigenfunction is the wavefunction, and the eigenvalue is its energy, EE. Since the kinetic part of the Hamiltonian always contains 2\nabla^2 terms, the Schrödinger equation is a second-order differential equation.

1.1The many-body problem

For systems with multiple interacting particles (e.g., an atom, a molecule or larger objects), the Schrödinger equation becomes a many-body problem. The Hamiltonian for such a system includes terms for the kinetic energy of each particle, the potential energy due to external fields, and the interaction energy between the particles. For instance, for a system of NN charged particles, the many-body Hamiltonian can be written as:

H^=i=1N(22mii2+V(ri))+12iijqiqjrirj,\hat{H} = \sum_{i=1}^{N} \left( -\frac{\hbar^2}{2m_i} \nabla_i^2 + V(\vec{r}_i) \right) + \frac{1}{2} \sum_{i} \sum_{i \neq j} \frac{q_i q_j}{|\vec{r}_i - \vec{r}_j|},

where mim_i and qiq_i are the mass and charge of the ii-th particle, respectively. Here the first term represents the kinetic energy of the particles, the second term represents the potential energy due to an external field (e.g., from atomic nuclei if we are considering electrons), and the third term represents the Coulomb interaction. Here we set the Coulomb constant 1/4πϵ01/4\pi\epsilon_0 to 1 to simplify the notation.

Solving the Schrödinger equation for many-body systems exactly is extremely challenging, if not impossible, for all but the simplest cases, such as the hydrogen atom. Therefore, approximations and numerical methods are essential. This is where ab initio methods like Density Functional Theory (DFT) come into play, offering a way to handle the complexities of interacting electrons.

2The Born-Oppenheimer approximation

When dealing with molecules or solids, one faces a many-body problem involving both the nuclei and the electrons. The total Hamiltonian of a molecular system includes terms for the kinetic energy of the nuclei, the kinetic energy of the electrons, and the potential energy due to interactions between all particles (nuclei-nuclei, nuclei-electrons, and electron-electron interactions). This problem is, in general, analitically unsolvable due to the coupling between the electronic and nuclear degrees of freedom.

To make the problem tractable, we introduce the Born-Oppenheimer (BO) approximation. This approximation exploits the large mass difference between nuclei and electrons (3\approx 3 orders of magnitude) to decouple their motions, making it possible to simplify the Schrödinger equation and compute electronic structures for fixed nuclear configurations.

For a molecule composed of NN electrons of mass mem_e and MM nuclei, each of mass MIM_I, the total Hamiltonian can be written as:

H^=T^nuc+T^el+V^nuc-nuc+V^nuc-el+V^el-el,\hat{H} = \hat{T}_\text{nuc} + \hat{T}_\text{el} + \hat{V}_\text{nuc-nuc} + \hat{V}_\text{nuc-el} + \hat{V}_\text{el-el},

where T^nuc=I=1M22MII2\hat{T}_\text{nuc} = -\sum_{I=1}^{M} \frac{\hbar^2}{2M_I} \nabla_I^2 and T^el=i=1N22mei2\hat{T}_\text{el} = -\sum_{i=1}^{N} \frac{\hbar^2}{2m_e} \nabla_i^2 are the kinetic energy operators for the nuclei and electrons, respectively, while V^nuc-nuc\hat{V}_\text{nuc-nuc}, V^nuc-el\hat{V}_\text{nuc-el} and V^el-el\hat{V}_\text{el-el} represent the nucleus-nucleus, electron-nucleus, and electron-electron Coulomb interactions, respectively.

The total wavefunction Ψ({RI},{ri})\Psi(\{\vec{R}_I\}, \{\vec{r}_i\}) depends on both the positions of the nuclei {RI}\{\vec{R}_I\} and of the electrons {ri}\{\vec{r}_i\}. Solving for the full wavefunction is complicated because the nuclear and electronic motions are coupled through the potential terms. However, the mass disparity between the nucleus and the electrons suggests that, on the timescale of nuclear motion, the electrons can adjust almost instantaneously to any change in nuclear positions. This allows us to decouple the nuclear and electronic motions by assuming that the electronic wavefunction can be solved for a fixed configuration of nuclei.

We assume that the total wavefunction Ψ({RI},{ri})\Psi(\{\vec{R}_I\}, \{\vec{r}_i\}) can be factored into a product of a nuclear wavefunction χ({RI})\chi(\{\vec{R}_I\}) and an electronic wavefunction ψ({ri};{RI})\psi(\{\vec{r}_i\}; \{\vec{R}_I\}):

Ψ({RI},{ri})χ({RI})ψ({ri};{RI})\Psi(\{\vec{R}_I\}, \{\vec{r}_i\}) \approx \chi(\{\vec{R}_I\}) \psi(\{\vec{r}_i\}; \{\vec{R}_I\})

Here, the electronic wavefunction ψ({ri};{RI})\psi(\{\vec{r}_i\}; \{\vec{R}_I\}) is parameterized by the fixed nuclear positions {RI}\{\vec{R}_I\}, while χ({RI})\chi(\{\vec{R}_I\}) describes the motion of the nuclei.

2.1Solving the Schrödinger equation

With the nuclei fixed, we can solve the electronic Schrödinger equation for a given set of nuclear positions:

H^elψ({ri};{RI})=Eel({RI})ψ({ri};{RI}).\hat{H}_\text{el} \psi(\{\vec{r}_i\}; \{\vec{R}_I\}) = E_\text{el}(\{\vec{R}_I\}) \psi(\{\vec{r}_i\}; \{\vec{R}_I\}).

The electronic Hamiltonian H^el\hat{H}_\text{el} includes the kinetic energy of the electrons and the potential energy due to electron-electron and electron-nuclei interactions:

H^el=T^el+V^el-el+V^nuc-el.\hat{H}_\text{el} = \hat{T}_\text{el} + \hat{V}_\text{el-el} + \hat{V}_\text{nuc-el}.

The result of solving this equation is the electronic energy Eel({RI})E_\text{el}(\{\vec{R}_I\}), which depends parametrically on the nuclear positions. This energy forms a potential energy surface (PES) on which the nuclei move.

Once the electronic problem is solved, the nuclear dynamics can be described by an effective nuclear Hamiltonian, where the potential energy surface Eel({RI})E_\text{el}(\{\vec{R}_I\}) from the electronic calculation acts as the potential for the nuclei:

H^nucχ({RI})=(T^nuc+V^nuc-nuc+Eel({RI}))χ({RI})\hat{H}_\text{nuc} \chi(\{\vec{R}_I\}) = \left( \hat{T}_\text{nuc} + \hat{V}_\text{nuc-nuc} + E_\text{el}(\{\vec{R}_I\}) \right) \chi(\{\vec{R}_I\})

This equation governs the motion of the nuclei on the potential energy surface created by the electrons. In practice, depending on the temperature and energy scales, this motion can be treated either classically or quantum mechanically.

2.2Implications and limitations

The Born-Oppenheimer approximation is widely used in ab initio simulations because it allows the separation of electronic structure calculations from nuclear motion, significantly reducing the computational complexity. It underpins most of the computational chemistry methods, including Hartree-Fock theory, post-Hartree-Fock methods, and density functional theory (DFT). However, the approximation has some limitations. For instance, in systems where nuclear and electronic motions are strongly coupled (e.g., near conical intersections or in the presence of non-adiabatic effects, see box above), the Born-Oppenheimer approximation breaks down, and one must account for the simultaneous evolution of electrons and nuclei. Moreover, the approximation often treats nuclei as classical particles moving on a potential energy surface, which may neglect important quantum effects like tunneling or zero-point energy, particularly in light atoms like hydrogen.

3The Hohenberg-Kohn theorems

The Hohenberg-Kohn theorems form the foundation of Density Functional Theory (DFT), a widely used approach in ab initio simulations. DFT simplifies the quantum many-body problem by shifting the focus from the many-body wavefunction to the electron density n(r)n(\vec{r}). These theorems, established by Pierre Hohenberg and Walter Kohn in 1964, provide the theoretical justification for this approach, proving that all ground-state properties of a system are uniquely determined by its electron density. By leveraging this result, it is possible to reformulate quantum mechanics in terms of the electron density, which depends on only three spatial coordinates, rather than the many-body wavefunction, which depends on the coordinates of all electrons. This leads to a significant reduction in complexity and makes DFT one of the most efficient methods for electronic structure calculations in large systems.

Expliciting the terms of an NN-electron Hamiltonian (see Eqs. (8) and (12)) one obtains

H^el=ipi22me+12iije2rirj+iV(ri),\hat H_\text{el} = \sum_i \frac{p^2_i}{2 m_e} + \frac{1}{2} \sum_i \sum_{i \neq j} \frac{e^2}{|\vec r_i - \vec r_j|} + \sum_i V(\vec r_i),

where the terms can be grouped to yield H^el=F^+V^\hat H_\text{el} = \hat F + \hat V, where FF are the kinetic and electron-electron interaction energies, respectively, and VV is the “external” potential energy. The ground state Ψ0\Psi_0 is uniquely determined by NN and V({r})V(\{\vec r\}), and we normalise it so that Ψ0Ψ0=N\langle \Psi_0 | \Psi_0 \rangle = N. Its associated electron density is

n0(r)=Ψ0ρ(r)Ψ0=NΨ0(r,r2,,rN)2dr2drN,n_0(\vec r) = \langle \Psi_0 | \rho(\vec r) | \Psi_0 \rangle = N \int |\Psi_0(\vec r, \vec r_2, \ldots, \vec r_N)|^2 d\vec r_2 \ldots d\vec r_N,

where ρ(r)=iδ(rri)\rho(\vec r) = \sum_i \delta(\vec r - \vec r_i) is the particle density operator. By using this definition, we can write the total potential energy as

Ψ0V^Ψ0=iV(ri)Ψ0(r1,r2,,rN)2dr1dr2drN=NV(r)Ψ0(r,r2,,rN)2drdr2drN=n0(r)V(r)dr.\begin{aligned} \langle \Psi_0 | \hat V | \Psi_0 \rangle &= \int \sum_i V(\vec r_i) |\Psi_0(\vec r_1, \vec r_2, \ldots, \vec r_N)|^2 d\vec r_1 d\vec r_2 \ldots d\vec r_N \\ & = N \int V(\vec r) |\Psi_0(\vec r, \vec r_2, \ldots, \vec r_N)|^2 d\vec r d\vec r_2 \ldots d\vec r_N \\ & = \int n_0(\vec r) V(\vec r) d\vec r. \end{aligned}

This theorem implies that the many-body wavefunction Ψ(r1,r2,,rN)\Psi(\vec{r}_1, \vec{r}_2, \dots, \vec{r}_N) is no longer the central object in quantum mechanics. Instead, the electron density n(r)n(\vec{r}) can be used to determine all ground-state properties. Specifically, the ground-state energy E0E_0 is a functional of the electron density, E[n]E[n], and any other observable that depends on the external potential, such as forces on nuclei or response functions, is also determined by the electron density. Therefore, if one can find the correct ground-state electron density, one can in principle reconstruct the entire external potential and solve for all other properties of the system.

The second Hohenberg-Kohn theorem provides a practical way to find the ground-state electron density by minimizing the energy functional E[n]E[n]. This is the basis of modern DFT numerical calculations:

  1. Start with a trial electron density n(r)n(\vec{r}).
  2. Calculate the total energy functional E[n]E[n].
  3. Adjust the electron density to minimize the energy.
  4. The density that minimizes the energy is the true ground-state density, and the corresponding energy is the ground-state energy.

This approach bypasses the need to solve the many-body Schrödinger equation directly, instead focusing on finding the electron density that minimizes the energy functional. The challenge in DFT is to find an accurate expression for the universal functional F[n]=T[n]+U[n]F[n] = T[n] + U[n], as this functional is not known explicitly for interacting electrons. As we will now see, various approximations, such as the Local Density Approximation (LDA) and Generalized Gradient Approximation (GGA), have been developed to approximate this functional.

4The Kohn-Sham approximation

The Hohenberg-Kohn theorems laid the foundation for Density Functional Theory (DFT) by demonstrating that all ground-state properties of a many-electron system are determined by the electron density n(r)n(\vec{r}). However, the exact form of the universal energy functional E[n]E[n], which includes the kinetic energy and electron-electron interaction energy, is unknown for interacting electrons. This is the key challenge in practical DFT calculations.

The Kohn-Sham (KS) approximation, introduced by Walter Kohn and Lu Jeu Sham in 1965, provides a solution to this problem by mapping the interacting electron system onto an auxiliary system of non-interacting electrons that has the same ground-state electron density. This makes it possible to treat the complex interactions in a computationally efficient way while still capturing the essential physics of the many-body problem.

4.1The Kohn-Sham equations

The central idea of the Kohn-Sham approach is to introduce a fictitious system of non-interacting electrons that reproduces the exact ground-state electron density of the real, interacting system. For this non-interacting system, the kinetic energy can be computed exactly in terms of single-particle orbitals, which in this context are called the Kohn-Sham orbitals, ψi\psi_i. The electron density is also constructed from the Kohn-Sham orbitals as:

n(r)=i=1Nψi(r)2.n(\vec{r}) = \sum_{i=1}^{N} |\psi_i(\vec{r})|^2.

Instead of attempting to directly approximate the total energy functional E[n]E[n] for the interacting system, we decompose it into parts that can be computed exactly and parts that require approximations. The total energy functional in the Kohn-Sham formalism is written as:

E[n]=Ts[n]+V(r)n(r)dr+Uel-el[n]+Exc[n],E[n] = T_s[n] + \int V(\vec{r}) n(\vec{r}) \, d\vec{r} + U_\text{el-el}[n] + E_{\text{xc}}[n],

where Ts[n]T_s[n] is the kinetic energy of the non-interacting electrons, Uel-el[n]U_\text{el-el}[n] is the Coulomb energy, representing the classical electrostatic interaction between electrons, and Exc[n]E_{\text{xc}}[n] is the exchange-correlation energy, which includes all the many-body effects not captured by the other terms (e.g the electron-electron effective repulsion due to the Pauli exclusion principle).

To find the ground-state density, the Kohn-Sham approach involves solving a set of single-particle equations, known as the Kohn-Sham equations. These equations describe the motion of non-interacting electrons in an effective potential veff(r)v_{\text{eff}}(\vec{r}), and can be obtained by using the fact that the functional in Eq. (27) is minimised by the ground-state electron density. Therefore, δE/δnn(r)=n0(r)=0\delta E / \delta n|_{n(\vec r) = n_0(\vec r)} = 0. If we carry out the functional derivative with the constraint that the Kohn-Sham orbitals are orthonormal, we obtain (see Appendix B of Giustino (2014) for the full derivation)

(22m2+veff(r))ψi(r)=ϵiψi(r),\left( -\frac{\hbar^2}{2m} \nabla^2 + v_{\text{eff}}(\vec{r}) \right) \psi_i(\vec{r}) = \epsilon_i \psi_i(\vec{r}),

ϵi\epsilon_i is the single-particle energy corresponding to ψi\psi_i, and the effective potential, which includes the effects of the external potential, the Hartree potential, and the exchange-correlation potential, is given by

veff(r)=V(r)+n(r)rrdr+vxc(r)v_{\text{eff}}(\vec{r}) = V(\vec{r}) + \int \frac{n(\vec{r'})}{|\vec{r} - \vec{r'}|} d\vec{r'} + v_{\text{xc}}(\vec{r})

The Kohn-Sham equations must be solved self-consistently:

  1. Start with an initial guess for the electron density n(r)n(\vec{r}).
  2. Construct the effective potential veff(r)v_{\text{eff}}(\vec{r}).
  3. Solve the Kohn-Sham equations to obtain the orbitals ψi(r)\psi_i(\vec{r}).
  4. Calculate a new electron density from the orbitals.
  5. Repeat until the electron density converges.

4.2The exchange-correlation functional

The exchange-correlation functional Exc[n]E_{\text{xc}}[n] plays a crucial role in DFT, as it captures the many-body effects of electron exchange and correlation that are not accounted for in the other terms of the Kohn-Sham functional. This term is responsible for the accuracy of DFT calculations, and its exact form is unknown.

Several approximate forms for the exchange-correlation functional have been developed, each offering a different balance between accuracy and computational cost. Here I list the most common approximations.

4.2.1Local Density Approximation

In the Local Density Approximation (LDA), the exchange-correlation energy density at each point in space is assumed to depend only on the local electron density n(r)n(\vec{r}):

ExcLDA[n]=ϵxc(n(r))n(r)dr.E_{\text{xc}}^{\text{LDA}}[n] = \int \epsilon_{\text{xc}}(n(\vec{r})) n(\vec{r}) d\vec{r}.

The LDA is based on the exchange-correlation energy of a homogeneous electron gas. While it can be surprisingly accurate for systems with slowly varying densities (e.g., bulk solids), it fails, sometimes dramatically, in cases where the electron density varies rapidly, such as in molecules or surfaces.

4.2.2Generalized Gradient Approximation

The Generalized Gradient Approximation (GGA) improves upon the LDA by incorporating not only the local electron density but also its gradient n(r)\nabla n(\vec{r}):

ExcGGA[n]=ϵxc(n(r),n(r))n(r)dr.E_{\text{xc}}^{\text{GGA}}[n] = \int \epsilon_{\text{xc}}(n(\vec{r}), \nabla n(\vec{r})) n(\vec r) d\vec{r}.

GGA functionals, such as the Perdew-Burke-Ernzerhof functional, are more accurate for systems with significant density variations, such as molecules, surfaces, and low-dimensional materials. They are widely used in practical DFT calculations.

4.2.3Hybrid Functionals

Hybrid functionals go beyond the local and semi-local approximations by mixing a portion of exact exchange energy from Hartree-Fock theory with the exchange-correlation energy from DFT. The most well-known hybrid functional is B3LYP, which is popular in quantum chemistry for molecular systems. Hybrid functionals generally provide higher accuracy but at a significantly increased computational cost.

4.3Implications and limitations

The Kohn-Sham approximation is a powerful and practical method for implementing Density Functional Theory. By mapping the interacting electron problem onto a non-interacting system with the same electron density, it makes DFT calculations computationally tractable. In fact, DFT with the Kohn-Sham approximation can be applied to systems with hundreds or even thousands of atoms, from molecules to solids, and can provide a good balance between computational cost and accuracy, particularly with carefully chosen exchange-correlation functionals. Unfortunately, the choice of the functional greatly impacts the accuracy of the DFT calculations, and no single functional is universally accurate.

Moreover, for some systems (e.g., strongly correlated materials), DFT may fail to provide reliable results. For instance, the most approximate functionals (e.g., LDA and GGA) suffer from self-interaction errors, where an electron incorrectly interacts with itself. This can lead to inaccuracies, particularly in systems with localized states, such as transition metal oxides and open-shell molecules. Moreover, DFT is primarily a ground-state theory, and its application to excited states is limited. Time-dependent DFT extends DFT to excited states, but it also inherits the limitations of the exchange-correlation functionals used.

5The Hellmann-Feynman Theorem

In the sections above I have sketched a way of solving the electronic problem and obtain the ground-state electron density. However, in ab initio simulations we have to also evolve the positions of the nuclei. How do we connect n(r)n(r) to the forces acting on the nuclear degrees of freedom? The Hellmann-Feynman theorem provides a convenient and efficient way to compute these forces within the framework of quantum mechanics, without requiring the explicit calculation of the derivatives of the wavefunction.

The theorem states that when a system is in an eigenstate of its Hamiltonian, the forces acting on a nucleus can be calculated directly from the electron density and the external potential. This result greatly simplifies force calculations in Density Functional Theory (DFT) and other quantum mechanical methods, making it possible to efficiently perform simulations of atomic motion.

In the context of DFT, the Hellmann-Feynman theorem is particularly useful for calculating the forces on nuclei during molecular dynamics or geometry optimization. The force FI\vec{F}_I on nucleus II is given by the negative gradient of the total energy with respect to the position of that nucleus:

FI=RIEtotal({R})\vec{F}_I = -\nabla_{\vec{R}_I} E_{\text{total}}(\{\vec{R}\})

Consider the full Hamiltonian of a molecular system, Eq. (9): the only two terms that depend on the positions of nuclei {RI}\{\vec{R}_I\} are the nuclear-nuclear and nuclear-electron interactions. Using the Hellmann-Feynman theorem, these two contributions can be explicitly calculated:

FI=n(r)RIV(r;{R})drRIVnuc-nuc({R}),\vec{F}_I = -\int n(\vec{r}) \nabla_{\vec{R}_I} V(\vec{r}; \{\vec{R}\}) \, d\vec{r} - \nabla_{\vec{R}_I} V_\text{nuc-nuc}(\{\vec{R}\}),

where we used the expression of the electron-nucleus interaction in terms of the electron density n(r)n(r) (see Eq. (20)). This result shows that the force on a nucleus depends only on the electron density and the derivative of the external potential (e.g., the Coulomb potential from the other nuclei). Importantly, it does not require calculating the derivatives of the wavefunctions with respect to the nuclear positions, which is computationally demanding.

5.1Implications and limitations

The Hellmann-Feynman theorem is widely used in ab initio simulations for calculating forces on atoms, particularly in DFT and other quantum chemistry methods. The theorem simplifies force calculations by avoiding the need for wavefunction derivatives. This makes molecular dynamics and geometry optimization in DFT feasible for large systems. Interestingly, the forces calculated using the Hellmann-Feynman theorem are exact as long as the wavefunction is an exact eigenstate of the Hamiltonian. In practice, errors can arise if the wavefunction is not fully converged, but these can often be minimized with proper numerical techniques. Moreover, the Hellmann-Feynman theorem also applies to systems where the parameter λ represents something other than nuclear positions, such as the strength of an external field or the variation of a coupling constant, making it a versatile tool.

However, note that in some cases, additional corrections beyond the Hellmann-Feynman theorem are needed. For example, when using approximate wavefunctions or basis sets, so-called Pulay forces arise from using a non-complete basis functions that depend on the nuclear positions, contributing to the total force calculation in methods where the basis set is not fixed (e.g., Gaussian-type orbitals).

This can be demonstrated by using the following derivation (heavily inspired by this answer by Michael F. Herbst). Consider a non-complete basis {f}\{ f \}, which we use to express our wavefunction:

Ψ=icifi,\Psi = \sum_{i} c_i f_i,

where cic_i are constants chosen so that the resulting Ψ is a good approximation of the ground state, and it is associated to an energy EE. Here “good” means that each coefficient cic_i obeys the variational principle:

0=dEdci=2Ψ|H|dΨdci=2Ψ|H|fi0 = \frac{dE}{dc_i} = 2 \left\langle \Psi \middle| H \middle| \frac{d\Psi}{dc_i} \right\rangle = 2 \left\langle \Psi \middle| H \middle| f_i \right\rangle

Since we are interested in forces, instead of taking the derivative of the energy with respect to a generic parameter λ, we derive the energy with respect to the nuclear positions:

dEdR=Ψ|dHdR|Ψ+2Ψ|H|dΨdR.\frac{dE}{d\vec{R}}=\left\langle\Psi \middle|\frac{dH}{d\vec{R}} \middle| \Psi\right\rangle + 2 \left\langle \Psi \middle| H \middle| \frac{d\Psi}{d\vec{R}} \right\rangle.

As we have seen in Proof 3, the second (“Pulay”) term vanishes if Ψ is an eigenstate. Let’s see what happens in this case. Consider as an example the derivative with respect to R1R_1. Its Pulay term is

Ψ|H|(icidfidR1)\left\langle \Psi \middle| H \middle| \left( \sum_i c_i \frac{df_i}{dR_1} \right)\right\rangle

When does this term vanish? There are two possibilities:

  1. The derivatives dfi/dR1df_i/dR_1 are all zero, i.e. if the basis functions are independent of atomic positions, e.g. plane waves.
  2. If the derivatives df1dR1\frac{df_1}{dR_1} and df2dR1\frac{df_2}{dR_1} are themselves basis functions or can be exactly represented by the basis. In this case,
    dfidR1=jkijfi\frac{df_i}{dR_1} = \sum_j k_{ij} f_i
    for some values of kijk_{ij}. Now the Pulay term can be rewritten by leveraging Eq. (41) as
    Ψ|H|(icidfidR1)=Ψ|H|(i,jcikijfj)==i,jcikijΨHfj=0. \begin{aligned} \left\langle \Psi \middle| H \middle| \left( \sum_i c_i \frac{df_i}{dR_1} \right)\right\rangle & = \left\langle \Psi \middle| H \middle| \left( \sum_{i,j} c_i k_{ij} f_j \right)\right\rangle = \\ & = \sum_{i,j} c_i k_{ij} \langle \Psi | H | f_j \rangle = 0. \end{aligned}

6The Car-Parrinello method

The Car-Parrinello method, introduced by Roberto Car and Michele Parrinello in 1985, revolutionized the field of ab initio molecular dynamics (AIMD) by combining classical molecular dynamics with electronic structure calculations based on DFT. The key innovation of this method is its ability to simultaneously evolve both nuclear and electronic degrees of freedom within a unified framework.

As discussed earlier, in traditional Born-Oppenheimer molecular dynamics, the forces on the nuclei are obtained by solving the electronic structure problem for each configuration of the nuclei, possibly with DFT-like approaches. The nuclei are then moved according to classical equations of motion, and this process is repeated iteratively. Although this approach is accurate, it is computationally expensive because a self-consistent field (SCF) calculation must be performed at every MD time step to obtain the forces.

In contrast, the Car-Parrinello molecular dynamics (CPMD) method evolves both the nuclear positions and the electronic wavefunctions simultaneously. This avoids the need for repeated SCF calculations, making the method more efficient while retaining accuracy. The key idea is to treat the Kohn-Sham orbitals as fictitious dynamical variables governed by a classical-like equation of motion, allowing them to follow the nuclear motion without needing to re-optimize the electronic structure at each step.

6.1The equations of motion

The Car-Parrinello method is based on a Lagrangian formulation that includes both the nuclear and electronic degrees of freedom. The total Lagrangian LL for the system is given by:

L=12IMIR˙I2+μ2iψ˙iψ˙iEtot[{ψi},{RI}]+ijλij(ψiψjδij),L = \frac{1}{2} \sum_I M_I \dot{\vec{R}}_I^2 + \frac{\mu}{2} \sum_i \langle \dot{\psi}_i | \dot{\psi}_i \rangle - E_\text{tot}[\{\psi_i\}, \{\vec{R}_I\}] + \sum_{ij} \lambda_{ij} (\langle \psi_i | \psi_j \rangle - \delta_{ij}),

where RI\vec{R}_I are the nuclear positions, MIM_I are the nuclear masses, R˙I\dot{\vec{R}}_I are the nuclear velocities, ψi\psi_i are the Kohn-Sham orbitals, μ is a fictitious mass-like parameter associated with the electronic degrees of freedom, ψ˙i\dot{\psi}_i are the time derivatives (“velocities”) of the Kohn-Sham orbitals, and Etot[{ψi},{RI}]=E[{ψi},{RI}]+Enuc,nuc({RI})E_\text{tot}[\{\psi_i\}, \{\vec{R}_I\}] = E[\{\psi_i\}, \{\vec{R}_I\}] + E_\text{nuc,nuc}(\{\vec{R}_I\}) is the sum of the DFT (Kohn-Sham) energy functional and of the nuclear-nuclear interaction potential.

The first term in Eq. (46) represents the kinetic energy of the nuclei, the second term represents the kinetic energy of the fictitious electronic degrees of freedom, the third term Etot[{ψi},{RI}]E_\text{tot}[\{\psi_i\}, \{\vec{R}_I\}] is the total potential energy, which includes contributions from the nuclear-nuclear, nuclear-electronic, and electron-electron interactions, while the fourth term enforces the orthonormality of the Kohn-Sham orbitals.

From this Lagrangian, one can derive the equations of motion for both the nuclei and the electronic wavefunctions using the Euler-Lagrange formalism:

MIR¨I=Etot[{ψi},{RI}]RI+i,jλijRIψiψjμψ¨i(r)=δEtot[{ψi},{RI}]δψi(r)+jλijψj\begin{aligned} M_I \ddot{\vec{R}}_I & = -\frac{\partial E_\text{tot}[\{\psi_i\}, \{\vec{R}_I\}]}{\partial \vec{R}_I} + \sum_{i,j} \lambda_{ij} \frac{\partial}{\partial \vec R_I} \langle \psi_i | \psi_j \rangle\\ \mu \ddot{\psi}_i(\vec{r}) & = -\frac{\delta E_\text{tot}[\{\psi_i\}, \{\vec{R}_I\}]}{\delta \psi_i^*(\vec{r})} + \sum_{j} \lambda_{ij} \psi_j \end{aligned}

Here, the fictitious mass parameter μ controls the dynamics of the electronic wavefunctions. These equations ensure that the electronic wavefunctions follow the nuclear motion, remaining close to the instantaneous ground state of the electronic structure, without the need for explicit SCF iterations at each time step. Note that in the functional derivative that appears in the electronic equations of motions, only the terms of the non-interacting (Kohn-Sham) Hamiltonian survives, since the nuclear-nuclear interaction does not depend on the ψi\psi_i orbitals.

The constant of motion (i.e. the quantity that is conserved) corresponding to the time-independence of the Car-Parrinello Langrangian is

E=12IMIR˙I2+μ2iψ˙iψ˙i+Etot[{ψ},{R}]Ephys+Te.E = \frac{1}{2} \sum_I M_I \dot{\vec{R}}_I^2 + \frac{\mu}{2} \sum_i \langle \dot{\psi}_i | \dot{\psi}_i \rangle + E_\text{tot}[\{\psi\}, \{\vec{R}\}] \equiv E_\text{phys} + T_e.

As well put in an excellent review,

As the kinetic energy TeT_e of the electronic degrees of freedom will vary with time, so will EphysE_\text{phys}, the total energy of the relevant Born-Oppenheimer system. For a physically meaningful simulation, it is therefore necessary that TeT_e only performs bound oscillations around a constant small value. The interpretation of this state of the system is that the total CP dynamical system consists of two adiabatically decoupled subsystems, the cold electronic degrees of freedom and the nuclear degrees of freedom at the relevant physical temperature. The adiabatic separation will never be complete and it will be necessary to choose the mass parameter μ in a way that the two systems stay decoupled over the full range of the simulation. However, the choice of μ also directly affects the efficiency of the simulation, as the maximal time step of the numerical integration is proportional to μ\sqrt{\mu}. The choice of an optimal value of the free parameter μ in a CP simulation needs care and has been discussed at length [in the literature].

Hutter (2011)

Indeed, the success of the Car-Parrinello method relies on a careful choice of the fictitious mass parameter μ. This parameter determines the timescale on which the electronic wavefunctions evolve. To ensure that the electrons remain adiabatically “slaved” to the nuclei, μ must be small enough that the electronic motion remains much faster than the nuclear motion. However, μ cannot be too small, or the time step required for stable integration of the electronic equations of motion would become prohibitively small. Therefore, μ should be chosen such that the electronic degrees of freedom evolve rapidly enough to stay close to the instantaneous ground state, but not so fast that the computational efficiency is compromised (which is easier said than done).

6.2Implications and limitations

The Car-Parrinello method offers several advantages over traditional approaches (such as those based on the Born-Oppenheimer approximation). By evolving the electronic wavefunctions dynamically, CPMD avoids the need for repeated SCF calculations, leading to significant savings in computational cost, particularly for large systems. Moreover, the method retains the accuracy of DFT-based molecular dynamics since the electronic wavefunctions are kept close to the ground state throughout the simulation. Finally, the treatment of both nuclear and electronic degrees of freedom within a single Lagrangian formalism provides a natural and elegant way to simulate systems where electronic structure and nuclear motion are strongly coupled.

However, the introduction of the fictitious kinetic energy for the electronic wavefunctions means that energy is not strictly conserved over long simulation times. Although this can be controlled by adjusting the fictitious mass μ, it remains a source of error that requires careful monitoring. The need to integrate both the nuclear and electronic equations of motion imposes restrictions on the time step size, which can be smaller than in traditional Born-Oppenheimer molecular dynamics, depending on the choice of μ. The Car-Parrinello method assumes that the electronic wavefunctions remain close to the ground state during nuclear motion. However, in systems where non-adiabatic effects (such as electronic excitations) are important, more advanced methods may be required. Extensions of the Car-Parrinello method have been developed to address some of these limitations. For example, extended Lagrangian methods introduce additional degrees of freedom to improve energy conservation, while time-dependent DFT can be used to model systems where excited-state dynamics play a significant role.

7Running simulations

While I will not provide any example, here I mention some of the most used open-source software packages for performing ab initio molecular dynamics simulations:

Most of these packages can be used to simulate systems at varying (and even mixed) levels of details.

Footnotes
  1. In the derivation I drop the explicit dependence on λ for clarity.

References
  1. Giustino, F. (2014). Materials modelling using density functional theory: properties and predictions. Oxford University Press.
  2. Böttcher, L., & Herrmann, H. J. (2021). Computational Statistical Physics. Cambridge University Press.
  3. Hohenberg, P., & Kohn, W. (1964). Inhomogeneous Electron Gas. Physical Review, 136(3B), B864–B871. 10.1103/physrev.136.b864
  4. Kohn, W., & Sham, L. J. (1965). Self-Consistent Equations Including Exchange and Correlation Effects. Physical Review, 140(4A), A1133–A1138. 10.1103/physrev.140.a1133
  5. Perdew, J. P., Burke, K., & Ernzerhof, M. (1996). Generalized Gradient Approximation Made Simple. Physical Review Letters, 77(18), 3865–3868. 10.1103/physrevlett.77.3865