
Enhanced sampling

In biophysics, reactions (in the sense given above) are extremely important, and are often the focus of the enhanced sampling techniques discussed in the context of biomolecules. However, many of these same techniques were originally introduced in, or can be easily adapted to, other contexts[1].

1 Collective variables and reaction coordinates

The phase space of a many-body system is a high-dimensional manifold where each point corresponds to a microstate. As the system evolves in time, the collection of points in phase space it visits forms a trajectory that fully describes its evolution. In many cases it is useful to project the trajectory onto a space of lower dimensionality: the resulting quantities can be used to obtain coarse-grained models, but also to describe the essential features of a system in a way that is easier to analyse and visualise.

In the context of molecular simulations, any quantity that is used to describe a system in a coarse-grained fashion is a collective variable (CV), and can be written as a function of (all or a subset of) the microscopic degrees of freedom \dofs. In condensed-matter physics, collective variables are often called order parameters. By contrast, in the biophysics and biochemistry worlds the most commonly used term is reaction coordinate (RC), defined as a parameter or set of parameters that describes the progress of the reaction of interest, serving as a way to quantify and track the changes occurring within a system as it evolves along the reaction pathway. Since we will be focussing on reactions, in the following I will tend to refer to the coarse-grained variable(s) of interest as the reaction coordinate(s) (RC), but most of what I will say translates naturally to any other CV.

The primary purpose of a reaction coordinate is to provide a simplified description of the system’s thermodynamics, making it possible to monitor and analyse the progress of a reaction in terms of a single variable or a few variables: by using a reaction coordinate we reduce the complexity of a many-body system with many degrees of freedom and obtain a simplified description that can be used to investigate the reaction itself, effectively applying a dimensionality-reduction procedure. This simplification is essential for understanding the microscopic underpinnings of the reaction of interest. Defining a reaction coordinate makes it possible to draw diagrams such as the one shown in Figure 1, often called free-energy profiles or landscapes, in which the variation of the free energy along a particular reaction coordinate or collective variable is plotted.


Figure 1: A sketch of the free energy landscape of a system displaying two basins separated by a transition state.

The choice of the RC depends on the specific process being studied and it is not, in general, unique. It can be a simple geometric parameter such as bond length, bond angle, or dihedral angle, or a more complex collective variable that captures the overall structural changes in the system, such as the distance between two key functional groups, the position or coordination number of a particular atom, or the solvent-accessible surface area of a biomolecule. Figure 2 shows two “real-world” examples, one taken from simulations of the oxDNA coarse-grained model, and the other from atomistic simulations of proteins.


Figure 2: (a) Free energy profiles for three different duplexes of length 12 as a function of the number of complementary (native) base pairs of the two strands. The oxDNA simulations for each duplex were run at their respective melting temperatures, namely 48 °C, 73 °C, and 80 °C. Adapted from Šulc et al. (2012). (b) Free-energy surface of the dimerization process of the fibritin foldon domain (PDB code: 1RFO). The two CVs are the distance between the centres of mass of the monomers, R_{MM}, and the number of specific monomer-monomer contacts, N_D. Structures representative of the two free-energy basins N and C (gray) are compared with the dimer arrangement in the trimeric experimental structure (red and blue). Adapted from Barducci et al. (2013).

Once we have chosen an RC to characterise the reaction of interest (which is in general not an easy task, and an active area of research in itself), we can use it to describe the reaction. We can see this formally by computing the partially-integrated partition function (see e.g. Hénin et al. (2022)), obtained by integrating the Boltzmann factor over all the degrees of freedom at constant ξ, viz.:

Q(\xi) = \int_V e^{-\beta H(\dofs)} \delta(\xi - \xi(\dofs)) d\dofs,

where \delta(\cdot) is the Dirac delta distribution. Note that the same procedure has already been carried out in the context of coarse graining, where a partial integration over phase space made it possible to obtain a simplified description of the system, showing that the dimensionality-reduction process we apply is exactly the same in the two cases.

In turn, Q(\xi) can be used to obtain a free-energy profile (sometimes called a free-energy surface if ξ is multidimensional) such as the ones presented in Figure 1 and Figure 2, which is defined as

F(\xi) = -k_B T \ln (Q(\xi)).

Now consider an observable that can be written as a function of ξ. Its ensemble average can be formally written as

\langle O(\xi(\dofs)) \rangle = \frac{\int_V e^{-\beta H(\dofs)} O(\xi(\dofs)) d\dofs}{\int_V e^{-\beta H(\dofs)} d\dofs},

which can be simplified by using Eq. (1) to

\langle O(\xi) \rangle = \frac{\int_{\xi_{\rm min}}^{\xi_{\rm max}} O(\xi) Q(\xi) d\xi}{\int_{\xi_{\rm min}}^{\xi_{\rm max}} Q(\xi) d\xi} = \int_{\xi_{\rm min}}^{\xi_{\rm max}} O(\xi) P(\xi) d\xi,

where \xi_{\rm min} and \xi_{\rm max} correspond to the minimum and maximum values of ξ, and we have defined the marginal probability density

P(\xi) = \frac{Q(\xi)}{\int_{\xi_{\rm min}}^{\xi_{\rm max}} Q(\xi) d\xi} = \frac{Q(\xi)}{Q},

where we have used the partition function Q = \int_{\xi_{\rm min}}^{\xi_{\rm max}} Q(\xi) d\xi.
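To make the connection between sampled configurations and free-energy profiles concrete, here is a minimal sketch (in Python, using NumPy and invented sample data) of how F(ξ) is estimated in practice: the values of ξ visited along a trajectory are histogrammed to approximate P(ξ), and the logarithm is taken bin-wise. The double-Gaussian "trajectory" below is purely illustrative.

```python
import numpy as np

# Invented stand-in for the values of the RC sampled along an unbiased trajectory
rng = np.random.default_rng(0)
xi_samples = np.concatenate([rng.normal(-1.0, 0.3, 50000),
                             rng.normal(+1.0, 0.3, 25000)])

kBT = 1.0  # thermal energy in reduced units

# Histogram approximation of the marginal probability density P(xi)
hist, edges = np.histogram(xi_samples, bins=100, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# Free-energy profile, defined up to an additive constant; empty bins are skipped
mask = hist > 0
F = -kBT * np.log(hist[mask])
F -= F.min()  # conventional choice: set the lowest point of the profile to zero

for x, f in zip(centers[mask][::10], F[::10]):
    print(f"xi = {x:+.2f}   F(xi) = {f:5.2f} kBT")
```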

2 Reactions and rare events

Consider a system that can switch, possibly reversibly, between two macrostates, A and B. Here the term macrostate is used loosely to indicate ensembles of microstates where the system resides for times that are much larger than the microscopic characteristic time; in thermodynamic parlance, A and B, which are sometimes called basins, should be either metastable or equilibrium states, and are therefore separated by a free-energy barrier \Delta F_b larger than the thermal energy.

In this context the free-energy barrier[2] for going from A to B, \Delta F_b^{A \to B} = F_{\rm max} - F_A, is defined as the difference between the free energy of the transition state, F_{\rm max}, which is the highest free-energy point along the reaction pathway connecting A to B[3], and that of A, F_A. Note that \Delta F_b^{A \to B} controls not only the probability of jumping from A to B, but also the rate of the reaction, which is proportional to e^{-\beta \Delta F_b} (see e.g. Eyring (1935) and Evans & Polanyi (1935)). See Figure 1 for a graphical definition of these quantities.

It is often the case that what interests us is the reaction itself rather than the A and B states, which are often known (and possibly have been characterised) beforehand. In this case, a simulation started from one of the two states, say A, would remain in A for some time, then quickly jump to state B, where it would again reside for some time before switching basin once again, and so on. If the free-energy barrier between the two basins is large (compared to k_B T), the number of transitions from A to B and back will be very small. Therefore, using unbiased simulations[4] to sample the transition itself, for instance to evaluate the free-energy landscape as in Figure 1, requires a large computational effort, most of which is wasted in sampling uninteresting parts of the phase space.

In this part we will understand how the sampling of the transition from AA to BB can be enhanced by using advanced computational techniques collectively known as rare event sampling techniques, which are methods used also outside the context of molecular models (see Le Priol et al. (2024) for an interesting example on climate-change-related extreme events).

3 Umbrella Sampling

The first technique I will present is the venerable umbrella sampling (US), introduced in the Seventies by Torrie & Valleau (1977).

The basic idea behind umbrella sampling is to bias the system along a chosen reaction coordinate by adding a so-called biasing potential[5] that confines the system to different regions (also known as windows) along that coordinate. By running multiple simulations with biasing potentials centred on different points along the reaction coordinate, the entire range of interest can be sampled. After sufficient sampling is done in each window, the bias introduced by the additional potentials can be removed to obtain the unbiased free energy profile (or any other observable of interest) along the reaction coordinate.

A typical umbrella sampling simulation thus comprises several steps, which I will discuss separately.

3.1 Choosing the reaction coordinate

This is arguably the most important step, since choosing a sub-optimal RC can sometimes massively increase the required simulation time. Fortunately, most of the time the choice is either obvious (e.g. the concentration of the product in a chemical reaction), or dictated by the observable(s) of interest (see below for an example).

3.2 Selecting a biasing potential

The role of the biasing potential V^{\rm bias}(\xi) is to confine the system within a (usually rather narrow) region of the reaction coordinate. As such it must be a function of the reaction coordinate(s) only, without any explicit dependence on the microscopic degrees of freedom \dofs. The most common choice is a harmonic potential, whose shape gives the method its name and which usually takes the form

V^{\rm bias}(\xi) = \frac{1}{2} K (\xi - \bar\xi)^2,

where \bar\xi is the position of the minimum of the potential and K is the spring constant. Other choices are possible (see e.g. here or here for examples of biases that are not differentiable and can therefore only be used in Monte Carlo simulations).

3.3 Partitioning the reaction coordinate into windows

Next, we need to split the range of interest, [\xi_{\rm min}, \xi_{\rm max}], into windows. The most common strategy is to divide the reaction coordinate into equispaced windows centred on \xi_i, with i \in [1, N], where N is the total number of windows (and hence of independent simulations). The distance between two neighbouring windows, \Delta \xi_i = \xi_{i + 1} - \xi_i, which is often taken to be constant, should be chosen carefully: on the one hand it should be as large as possible to make N as small as possible; on the other hand, \Delta \xi should be small enough that adjacent windows overlap, preventing discontinuities in the free-energy profile and ensuring that neighbouring windows provide sufficient sampling for accurate reweighting. An often good-enough first estimate can be made by assuming that P(\xi_i) \approx P(\xi_{i + 1}) and then choosing a value of \Delta \xi for which the biasing potential evaluated at the midpoint separating two neighbouring windows, V^{\rm bias}(\Delta \xi / 2) = \frac{1}{2} K (\Delta \xi / 2)^2, is of the order of the thermal energy k_B T.
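As an illustration, the rule of thumb just described can be turned into a first guess for the window centres. This is only a sketch under the stated assumptions (harmonic biases of equal stiffness K, equispaced windows); the numbers are placeholders.

```python
import numpy as np

def window_centers(xi_min, xi_max, K, kBT=1.0):
    # Spacing such that the bias at the midpoint between neighbouring windows
    # is about kBT: 0.5 * K * (dxi / 2)**2 ~ kBT  =>  dxi ~ 2 * sqrt(2 kBT / K)
    dxi = 2.0 * np.sqrt(2.0 * kBT / K)
    n_windows = int(np.ceil((xi_max - xi_min) / dxi)) + 1
    return np.linspace(xi_min, xi_max, n_windows)

centers = window_centers(xi_min=0.0, xi_max=10.0, K=50.0)
print(f"{len(centers)} windows, spacing {centers[1] - centers[0]:.3f}")
```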

In practice, the number, size and spacing of the windows depend on the curvature of the free energy profile along the reaction coordinate, which is not known beforehand. Smaller windows may be needed in regions with steep gradients or large energy barriers, while larger windows may suffice in more gradually changing regions. Fortunately, given the independent nature of the simulations run in each window, the partitioning can be improved a posteriori: if one realises that the explored range is not sufficient, it can be extended by adding simulations with biasing potentials centred beyond \xi_{\rm min} and/or \xi_{\rm max}. Sampling can also be improved by adding simulations in regions of the RC where P(\xi) varies more rapidly.

At the end of this procedure, each window i will be assigned a biasing potential V^{\rm bias}_i(\xi).

3.4 Sampling

Molecular dynamics or Monte Carlo simulations are performed within each window, allowing the system to equilibrate and sample configurations consistent with the biasing potential. Ensuring that a given window has converged is not always straightforward, but convergence can be assessed with techniques such as block averaging.
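A minimal sketch of block averaging applied to a (synthetic, correlated) time series of the reaction coordinate sampled in one window: if the estimated error stops growing as the blocks get longer, the window can be considered converged. The AR(1) series below is just a stand-in for real simulation output.

```python
import numpy as np

def block_average(timeseries, n_blocks):
    """Mean and standard error of the mean estimated from block averages."""
    data = np.asarray(timeseries)
    usable = len(data) - len(data) % n_blocks     # drop the trailing remainder
    blocks = data[:usable].reshape(n_blocks, -1).mean(axis=1)
    return blocks.mean(), blocks.std(ddof=1) / np.sqrt(n_blocks)

# Synthetic correlated time series (AR(1) process) mimicking xi(t) in one window
rng = np.random.default_rng(1)
xi_t = np.empty(20000)
xi_t[0] = 3.0
for i in range(1, len(xi_t)):
    xi_t[i] = 3.0 + 0.95 * (xi_t[i - 1] - 3.0) + 0.1 * rng.normal()

# The error estimate should plateau once the block length exceeds the correlation time
for n_blocks in (200, 100, 50, 20, 10):
    mean, err = block_average(xi_t, n_blocks)
    print(f"{n_blocks:4d} blocks: <xi> = {mean:.3f} +/- {err:.4f}")
```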

3.5 Reweighting and combining the data

In the final step we gather the data from each window and combine them to calculate the unbiased quantities of interest. I will first show how to unbias the data from each window, and then how to join all the results together.

In analogy with Eq. (5) we can define a biased marginal probability density for the i-th window, P^b_i(\xi), as

P^b_i(\xi) = \frac{Q^b_i(\xi)}{\int_{\xi_{\rm min}}^{\xi_{\rm max}} Q^b_i(\xi) d\xi} = \frac{\int_V e^{-\beta (H(\dofs) + V^{\rm bias}_i(\xi))} \delta(\xi - \xi(\dofs)) d\dofs}{\int_V e^{-\beta (H(\dofs) + V^{\rm bias}_i(\xi))} d\dofs},

where the biased partially-integrated partition function Q^b_i(\xi) has also been defined. We note that the biasing factor e^{-\beta V^{\rm bias}_i(\xi)} depends only on ξ and therefore, since the integration is performed over all degrees of freedom but ξ, can be moved outside of the integral. If we do so and then multiply and divide by Q we obtain

\begin{aligned} P^b_i(\xi) & = e^{-\beta V^{\rm bias}_i(\xi)} \frac{\int_V e^{-\beta H(\dofs)} \delta(\xi - \xi(\dofs)) d\dofs}{\int_V e^{-\beta (H(\dofs) + V^{\rm bias}_i(\xi))} d\dofs} \frac{Q}{Q} = \\ & = e^{-\beta V^{\rm bias}_i(\xi)} \frac{\int_V e^{-\beta H(\dofs)} \delta(\xi - \xi(\dofs)) d\dofs}{Q} \frac{Q}{\int_V e^{-\beta (H(\dofs) + V^{\rm bias}_i(\xi))} d\dofs} = \\ & = e^{-\beta V^{\rm bias}_i(\xi)} P_i(\xi) \frac{1}{\left\langle e^{-\beta V^{\rm bias}_i(\xi)}\right\rangle}, \end{aligned}

where \langle \cdot \rangle represents an unbiased ensemble average and P_i(\xi) is the marginal probability density of the i-th window. Note that, being an ensemble average, \left\langle e^{-\beta V^{\rm bias}_i(\xi)}\right\rangle^{-1} does not depend on ξ, and therefore it is a (in general unknown) constant[6]. As a consequence, we can obtain the unbiased marginal probability density up to a multiplicative constant:

\Pc_i(\xi) = P^b_i(\xi) e^{\beta V^{\rm bias}_i(\xi)} \propto P_i(\xi),

where I use the symbol \Pc_i(\xi) in place of P_i(\xi), since the former is unnormalised and therefore not a proper probability density. This procedure is known as unbiasing, and it is a special case of histogram reweighting[7]. Applying Eq. (10) yields N functions \Pc_i(\xi) that are shifted relative to each other because of the unknown constants. The total \Pc(\xi) can be recovered by stitching together all the \Pc_i(\xi), using the regions of ξ-space where each pair of neighbouring windows overlaps significantly to find the unknown multiplicative constants[8]. There are several methods available to perform this task; here I will present two of them: a simple least-squares fit and the (much more powerful) WHAM.
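In code, the unbiasing step of Eq. (10) is a one-liner once the per-window histograms are available. The sketch below assumes harmonic biases and a common set of bins for all windows; everything else (stiffness, centres, histograms) is hypothetical input.

```python
import numpy as np

def unbias_window(bin_centers, biased_hist, xi_center, K, kBT=1.0):
    """Eq. (10): multiply the biased histogram of one window by exp(+beta V_bias).
    The result is proportional to the unbiased marginal density, up to the
    unknown window-dependent constant discussed in the text."""
    V_bias = 0.5 * K * (bin_centers - xi_center) ** 2
    return biased_hist * np.exp(V_bias / kBT)

# Hypothetical usage for a single window centred at xi = 2.0
bins = np.linspace(1.5, 2.5, 21)
centers = 0.5 * (bins[:-1] + bins[1:])
counts, _ = np.histogram(np.random.default_rng(2).normal(2.0, 0.1, 10000), bins=bins)
unnormalised_P = unbias_window(centers, counts.astype(float), xi_center=2.0, K=100.0)
```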

3.5.1 Least-squares method

Consider two windows i and j (with |j - i| = 1), whose unnormalised marginal probability densities overlap in a ξ-region \lbrace \xi_o \rbrace. We want to find the constant C_{ij} that, multiplying \Pc_{j,\zeta}, minimises the mean-squared error between the two overlapping portions of the histograms, defined as

{\rm MSE}_{ij} = \sum_{\zeta \in \lbrace \xi_o \rbrace} \left(\Pc_{i, \zeta} - C_{ij} \Pc_{j, \zeta} \right)^2.

Imposing \frac{d {\rm MSE}_{ij}}{d C_{ij}} = 0 we find

C_{ij} = \frac{\sum_{\zeta \in \lbrace \xi_o \rbrace} \Pc_{i, \zeta} \Pc_{j, \zeta}}{\sum_{\zeta \in \lbrace \xi_o \rbrace} \Pc_{j,\zeta}^2}.

In practice, with this method the data of the 0-th window are left unchanged, while all the subsequent windows are rescaled one after the other by repeatedly applying Eq. (12). Once the final histogram \Pc_\zeta is obtained, we can either normalise it (if we need a proper probability density) or use it to compute the associated free-energy profile

F_\zeta = -k_B T \ln \Pc_\zeta + {\rm const},

where I have made explicit the fact that classical free energies are always specified up to an additive constant, which can be chosen freely. If we are interested in a reaction between states A and B, it is common to set F(\xi_A) or F(\xi_B) to 0, while for potentials of mean force (see for instance the example below) it is customary to set \lim_{\xi \to \infty} F(\xi) = 0.
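A minimal sketch of the least-squares stitching described above, assuming all the unbiased (and unnormalised) window histograms are defined on the same set of bins, with zeros where a window has no counts; the choice of averaging overlapping windows at the end is a simple, non-unique way of merging them.

```python
import numpy as np

def stitch_least_squares(unbiased_hists, kBT=1.0):
    """Rescale each window onto the previous one with the constant of Eq. (12)
    (the 0-th window is left untouched), then merge and convert to a free energy."""
    scaled = [np.asarray(unbiased_hists[0], dtype=float)]
    for nxt in unbiased_hists[1:]:
        prev = scaled[-1]
        nxt = np.asarray(nxt, dtype=float)
        overlap = (prev > 0) & (nxt > 0)          # bins sampled by both windows
        C = np.sum(prev[overlap] * nxt[overlap]) / np.sum(nxt[overlap] ** 2)
        scaled.append(C * nxt)
    stacked = np.array(scaled)
    sampled = (stacked > 0).sum(axis=0)           # how many windows cover each bin
    merged = stacked.sum(axis=0) / np.maximum(sampled, 1)
    F = np.full_like(merged, np.nan)              # NaN marks bins never sampled
    F[merged > 0] = -kBT * np.log(merged[merged > 0])
    return F - np.nanmin(F)                       # free energy with the minimum set to zero
```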

3.5.2 The Weighted Histogram Analysis Method (WHAM)

WHAM is a widely used reweighting technique for combining data from multiple biased simulations to obtain an unbiased estimate of the free energy profile. The basic idea behind WHAM is to reweight the probability distributions obtained from each window simulation such that they are consistent with each other and with the unbiased distribution. The reweighting process involves applying a set of equations that account for the biasing potentials applied in each window and the overlap between adjacent windows.

The following derivation is inspired by Guillaume Bouvier’s “traditional” derivation, which in turn is based on the original one (Kumar et al. (1992)).

We start by defining some accessory quantities that make it possible to generalise the method beyond a specific umbrella sampling scheme to any case where one wants to join multiple overlapping histograms:

  - m_{i,\zeta} is the number of counts in the ζ-th bin of the (biased) histogram collected in simulation i, and M_i = \sum_\zeta m_{i,\zeta} is the total number of counts of that simulation, so that the biased probability of the bin is estimated as P^b_{i,\zeta} = m_{i,\zeta} / M_i;
  - u_{i,\zeta} = e^{-\beta V^{\rm bias}_i(\xi_\zeta)} is the bias factor of simulation i evaluated at the centre of bin ζ;
  - p^\circ_\zeta is the (unknown) unbiased probability of bin ζ, which is the quantity we wish to estimate;
  - f_i is the normalisation factor of simulation i, defined by the condition that the biased probabilities q^b_{i,\zeta} = u_{i,\zeta} p^\circ_\zeta f_i sum to one, i.e. f_i^{-1} = \sum_\zeta u_{i,\zeta} p^\circ_\zeta.

Using the above definitions, the best estimate for the unbiased probability of the ζ-th bin obtained from simulation i is

P_{i,\zeta} = \frac{P^b_{i,\zeta}}{u_{i,\zeta}f_i} = \frac{m_{i,\zeta}}{M_i u_{i,\zeta} f_i}.

We assume that p^\circ_\zeta can be written as a weighted sum of all the reweighted histograms, P_{i,\zeta}, as follows:

p^\circ_\zeta = \sum_i \omega_{i,\zeta} P_{i,\zeta},

where the set of weights \lbrace \omega_{i,\zeta} \rbrace is normalised, i.e. \sum_i \omega_{i,\zeta} = 1. The WHAM method boils down to choosing the weights so as to minimise the expected variance of p^\circ_\zeta, which is defined as

\Var{p^\circ_\zeta} = \left\langle \left( p^\circ_\zeta - \langle p^\circ_\zeta \rangle \right)^2 \right\rangle = \left\langle \left( \sum_i \omega_{i,\zeta} (P_{i,\zeta} - \langle P_{i,\zeta} \rangle) \right)^2 \right\rangle,

where the average is taken over all the windows that have sampled \xi_\zeta. Defining \delta P_{i,\zeta} \equiv P_{i,\zeta} - \langle P_{i,\zeta} \rangle we obtain

\begin{aligned} \Var{p^\circ_\zeta} & = \left\langle \left( \sum_i \omega_{i,\zeta} \delta P_{i,\zeta} \right)^2 \right\rangle =\\ & = \left\langle \sum_i \omega^2_{i,\zeta} \delta P^2_{i,\zeta} \right\rangle + 2 \left\langle \sum_{j < k} \omega_{j,\zeta} \omega_{k,\zeta} \delta P_{j,\zeta} \delta P_{k,\zeta} \right\rangle = \\ & = \sum_i \omega^2_{i,\zeta} \left\langle \delta P^2_{i,\zeta} \right\rangle + 2 \sum_{j < k} \omega_{j,\zeta} \omega_{k,\zeta} \left\langle \delta P_{j,\zeta} \delta P_{k,\zeta} \right\rangle
\end{aligned}

We note that \left\langle \delta P^2_{i,\zeta} \right\rangle = \Var(P_{i,\zeta}). Assuming that simulations j and k are uncorrelated[9], \left\langle \delta P_{j,\zeta} \delta P_{k,\zeta} \right\rangle = 0 and therefore

\Var{p^\circ_\zeta} = \sum_i \omega^2_{i,\zeta} \Var(P_{i,\zeta}).

We can minimise the variance with respect to the set of weights \lbrace \omega_{i,\zeta} \rbrace, subject to the constraint \sum_i \omega_{i,\zeta} = 1, by using a Lagrange multiplier \lambda_\zeta. The quantity to minimise is therefore

L \equiv \sum_i \omega^2_{i,\zeta} \Var(P_{i,\zeta}) + \lambda_\zeta \sum_i \omega_{i,\zeta},

whose derivative reads

\frac{\partial L}{\partial \omega_{j,\zeta}} = 2 \omega_{j,\zeta} \Var(P_{j,\zeta}) + \lambda_\zeta,

whence we obtain

\omega_{j,\zeta} = -\frac{\lambda_\zeta}{2 \Var(P_{j,\zeta})}.

Applying the normalisation constraint we find

\sum_j \omega_{j,\zeta} = - \sum_j \frac{\lambda_\zeta}{2 \Var(P_{j,\zeta})} = 1,

which can be solved for λζ\lambda_\zeta, yielding

\lambda_\zeta = -\frac{2}{\sum_j \Var(P_{j,\zeta})^{-1}}.

Using this latter relation in Eq. (23) gives the optimal value of the weight \omega_{j,\zeta}:

\omega_{j,\zeta} = \frac{\Var(P_{j,\zeta})^{-1}}{\sum_k \Var(P_{k,\zeta})^{-1}}.

We now need to write \Var(P_{k,\zeta}) in terms of the simulation output. Recalling that \Var(aX) = a^2 \Var(X) and using Eq. (16) we can write

\Var(P_{k,\zeta}) = \frac{\Var(m_{k,\zeta})}{M^2_k u^2_{k,\zeta} f^2_k}.

The variance of the original histogram counts m_{k,\zeta} can be approximated by first noting that the probability of having m counts in a specific histogram bin is given by the binomial distribution,

P(m) = \binom{M}{m} p^m (1 - p)^{M - m},

where p is the probability of the event associated with the bin and M is the total number of counts stored in the histogram. According to the Poisson limit theorem, in the limit of large M and small p[10] the binomial distribution can be approximated by the Poisson distribution

P(m) = e^{-Mp} \frac{(Mp)^m}{m!},

whose mean and variance are both equal to Mp. We connect this result with our derivation by noting that the probability p for bin ζ of simulation k is the biased probability q^b_{k,\zeta} = u_{k,\zeta} p^\circ_\zeta f_k, and therefore

\Var(m_{k,\zeta}) = M_k u_{k,\zeta} p^\circ_\zeta f_k.

Using this relation in Eq. (27) yields

\Var(P_{k,\zeta}) = \frac{p^\circ_\zeta}{M_k u_{k,\zeta} f_k},

which gives for the weights

\omega_{j,\zeta} = \frac{M_j u_{j,\zeta}f_j / p^\circ_\zeta}{\sum_k M_k u_{k,\zeta} f_k / p^\circ_\zeta} = \frac{M_j u_{j,\zeta}f_j}{\sum_k M_k u_{k,\zeta} f_k}.

Substituting these weights in Eq. (17) and using the fact that M_j u_{j,\zeta} f_j P_{j,\zeta} = m_{j,\zeta} (see Eq. (16)) we obtain the WHAM set of equations

p^\circ_\zeta = \frac{\sum_j M_j u_{j,\zeta}f_j P_{j,\zeta}}{\sum_k M_k u_{k,\zeta} f_k} = \frac{\sum_j m_{j,\zeta}}{\sum_k M_k u_{k,\zeta} f_k}.

Note that this is a system of \#_{\rm bins} non-linear equations, complemented by the \#_{\rm sims} equations (15) that define the f_i. This system of equations can be solved either with non-linear solvers (such as scipy's fsolve) or iteratively: the \lbrace f_i \rbrace are set to some initial values (e.g. all equal to 1) and used to evaluate the \lbrace p^\circ_\zeta \rbrace through Eq. (33), which in turn are used to update the \lbrace f_i \rbrace, and so on, repeating the process until convergence is achieved.
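The iterative solution can be sketched in a few lines of Python. The input format is an assumption made here for illustration: m[i, z] are the histogram counts of simulation i in bin z (all windows share the same bins), and u[i, z] = exp(-beta * V_i^bias(xi_z)) are the bias factors.

```python
import numpy as np

def wham(m, u, tol=1e-10, max_iter=100000):
    """Self-consistent solution of the WHAM equations: Eq. (33) for the bin
    probabilities together with the normalisation conditions defining the f_i."""
    m = np.asarray(m, dtype=float)
    u = np.asarray(u, dtype=float)
    M = m.sum(axis=1)              # total number of counts of each simulation
    numerator = m.sum(axis=0)      # sum_j m_{j,zeta}, constant during the iteration
    f = np.ones(m.shape[0])
    for _ in range(max_iter):
        denom = (M[:, None] * u * f[:, None]).sum(axis=0)   # sum_k M_k u_{k,zeta} f_k
        p = numerator / denom                                # Eq. (33)
        f_new = 1.0 / (u * p[None, :]).sum(axis=1)           # f_i^-1 = sum_zeta u_{i,zeta} p_zeta
        if np.max(np.abs(np.log(f_new / f))) < tol:          # convergence test on the f_i
            f = f_new
            break
        f = f_new
    return p / p.sum(), f
```

The unbiased probabilities returned by such a routine can then be turned into a free-energy profile exactly as in the least-squares case, by taking minus the logarithm bin by bin.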

3.6 A real-world example

As discussed in the chapter on coarse-grained force fields, the effective interaction between two objects composed of multiple interacting units (atoms, molecules or beads) can be estimated in the dilute limit (i.e. at low density) as

U_{\rm eff}(R) = -k_B T \ln g(R),

where R is the distance between the two objects (defined e.g. as the distance between their centres of mass) and g(R) is the associated radial distribution function. For complicated objects composed of many parts, estimating g(R) with sufficient accuracy by means of unbiased simulations requires an ungodly amount of computation time and is therefore unfeasible. Here I show an example where this issue has been overcome by using umbrella sampling.

Using the language introduced in this section, R is the reaction coordinate and g(R) = P(R) / 4 \pi R^2 is the observable of interest, where P(R) is the marginal probability density.

(a) A simulation snapshot showing the two chains (coloured differently).

(b) The biased radial distribution functions, with each colour corresponding to a different simulation.

(c) The unbiased radial distribution functions, with each colour corresponding to a different simulation.

(d) The reconstructed effective interaction.

Figure 3: The results of umbrella sampling simulations: the raw data are unbiased and then combined to yield the final free-energy profile. Here the reaction coordinate R is the distance between the centres of mass of two polymers.

Figure 3 shows the results of umbrella sampling simulations of a system composed of two polymer chains, where the chosen reaction coordinate is the distance between the two centres of mass, R, and the final output is the effective chain-chain interaction as a function of R. Figure 3a shows a snapshot of the two chains, Figure 3b shows the raw (biased) g^b_i(R) data for all the windows i, and Figure 3c contains the g^u_i(R), unbiased according to Eq. (10). Finally, Figure 3d contains the effective interaction, obtained with the WHAM method and shifted so that it vanishes at large distances.

4 Metadynamics

Umbrella sampling is a technique that is relatively simple to implement and works well in many real-world systems. However, it suffers from several drawbacks when applied to non-trivial systems: for instance, it scales badly with the number of collective variables, and requires a trial-and-error approach to determine the spring constants and window partitioning. Among the plethora of methods developed to overcome some (or all) of these limitations, here I will present one of the most successful and widely used ones: metadynamics, which was introduced in Laio & Parrinello (2002).

In metadynamics simulations, the system is pushed away from the regions of phase space where it would tend to reside by adding a history-dependent biasing potential. In this way, if the simulation is long enough, the system will diffuse over an almost flat biased free-energy profile. The original information (i.e. the free-energy profile of the unbiased system) can be reconstructed from the “learned” bias potential or from the histogram of the biased CVs, depending on the metadynamics flavour employed (more on this later).

In practice, in ordinary metadynamics, after the set of collective variables has been chosen, an unbiased simulation is started (often from a configuration that lies within a basin). Then, every fixed number of time steps, the bias potential is updated as follows:

V_{t + 1}^\text{bias}(\xi) = V_{t}^\text{bias}(\xi) + w \exp{\left( -\frac{(\xi_t - \xi)^2}{2 \sigma^2} \right)},

where t measures time by counting the number of times the bias potential has been updated, V_{t}^\text{bias}(\xi) is the bias potential at time t, \xi_t is the value of the CV(s) at time t, and w and σ are two parameters that set the height and width of the Gaussians used to progressively fill the wells of the free-energy profile. Leveraging Eq. (35), the bias potential at time t can be written as

V_{t}^\text{bias}(\xi) = w \sum_{t' < t} \exp{\left( -\frac{(\xi_{t'} - \xi)^2}{2 \sigma^2} \right)}.

Figure 4: Example of a metadynamics simulation in a one-dimensional model potential. The time t is measured by counting the number of Gaussians deposited. (Top) Time evolution of the collective variable s during the simulation. (Bottom) Representation of the progressive filling of the underlying (“true”) potential (thick line) by means of the Gaussians deposited along the trajectory. The sum of the underlying potential and of the metadynamics bias is shown at different times (thin lines). Taken from Barducci et al. (2011).

To see how the bias affects the exploration of the phase space of a simple system, consider the one-dimensional potential with three minima A, B, and C presented in the bottom panel of Figure 4. The system is initialised in the local minimum B, where it would most likely remain if sampled with unbiased simulations, since the barriers separating it from the other minima are larger than 10 \; k_BT. However, if metadynamics is used, the deposited Gaussians make the bias potential grow until, for t \approx 135, the system is pushed out of B and moves into A, which is the closest basin (in terms of barrier height). Here the accumulation of Gaussians starts anew, filling the underlying free-energy basin at t \approx 430. At this point, the system can freely diffuse in the region around A and B. Eventually (t \approx 810), the bias potential fills this region, and the system can also access the basin of C. Finally, at t \approx 1650 this minimum too is compensated by the bias, and the dynamics of the system becomes diffusive over the whole free-energy profile (see the top panel of Figure 4).

This simple example illustrates the main features of metadynamics:

  1. It accelerates the sampling of the phase space by pushing the system away from local free-energy minima.
  2. It makes it possible to efficiently explore reaction pathways, since the system tends to escape the minima by passing through the lowest free-energy saddle points.
  3. No a priori knowledge of the landscape is required, at variance with umbrella sampling, where the optimal choice of the bias stiffness K and of the number of windows N depends on the local steepness of the free energy and on the CV range of interest.
  4. The method is intrinsically parallelisable. Indeed, multiple simulations can be run to independently explore the phase space, and each simulation contributes to the bias potential, which is shared among all “walkers” (Raiteri et al. (2005)).
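To make the mechanism tangible, here is a self-contained toy implementation of ordinary metadynamics on a one-dimensional double well, evolved with overdamped Langevin dynamics; all the parameters (barrier height, Gaussian height and width, deposition rate) are made up for the sake of the example and are not taken from the figure above.

```python
import numpy as np

rng = np.random.default_rng(2)
kBT, dt = 1.0, 1e-3
w, sigma, deposit_every = 0.5, 0.1, 500     # Gaussian height/width and deposition stride

def F_true(x):                              # model double-well free energy (in kBT units)
    return 5.0 * (x**2 - 1.0)**2

def F_true_grad(x):
    return 20.0 * x * (x**2 - 1.0)

centers = []                                # positions of the deposited Gaussians

def bias(x):
    if not centers:
        return 0.0
    c = np.array(centers)
    return w * np.exp(-((x - c)**2) / (2 * sigma**2)).sum()

def bias_grad(x):
    if not centers:
        return 0.0
    c = np.array(centers)
    g = np.exp(-((x - c)**2) / (2 * sigma**2))
    return (-w * (x - c) / sigma**2 * g).sum()

x = -1.0                                    # start in the left basin
for step in range(100000):
    force = -(F_true_grad(x) + bias_grad(x))
    x += force * dt + np.sqrt(2 * kBT * dt) * rng.normal()
    if step % deposit_every == 0:
        centers.append(x)                   # Eq. (35): deposit a Gaussian at the current CV value

# After the filling transient, -V_bias tracks the underlying profile up to a constant (Eq. (37))
for xv in (0.0, 1.0):
    dF_true = F_true(xv) - F_true(-1.0)
    dF_meta = -(bias(xv) - bias(-1.0))
    print(f"x = {xv:+.1f}: true Delta F = {dF_true:5.2f} kBT, metadynamics estimate = {dF_meta:5.2f} kBT")
```

Because the deposition never stops, the estimate printed at the end oscillates around the true values by an amount controlled by w, which is precisely the issue discussed next.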

After a transient, the bias potential V^\text{bias} and the underlying free energy are formally connected by the following relationship:

\lim_{t \to \infty} V_t^\text{bias}(\xi) = -F(\xi) + C,

where C is an immaterial constant. In reality, the long-time V^\text{bias} oscillates around the true free energy rather than converging to it, since the deposition of Gaussians continues even when the dynamics has become diffusive. Moreover, it is not always obvious when the simulation should be stopped.

These two issues can be solved by well-tempered metadynamics, introduced in Barducci et al. (2008). With this variant, the height of the Gaussian deposited at a given point ξ decreases exponentially with the strength of the bias potential already present at ξ. In other words, Eq. (35) becomes

V_{t + 1}^\text{bias}(\xi) = V_{t}^\text{bias}(\xi) + w \exp{\left( -\frac{V_{t}^\text{bias}(\xi)}{k_B \Delta T} \right)} \exp{\left( -\frac{(\xi_t - \xi)^2}{2 \sigma^2} \right)},

where \Delta T, which has the dimensions of a temperature, controls the rate at which the height decreases, the height being roughly inversely proportional to the time spent by the simulation at the CV point where the Gaussian is to be deposited. This inverse relationship guarantees that the bias potential converges. However, in this case the relationship between the long-time bias potential and the free energy, which is given by Eq. (37) in ordinary metadynamics, becomes

\lim_{t \to \infty} V_t^\text{bias}(\xi) = -\frac{\Delta T}{T + \Delta T} F(\xi) + C,

which suggests that \Delta T has a double role: it decreases the height of the deposited Gaussians, therefore damping the sampling fluctuations, but it also controls the effective temperature at which the CV is sampled. Ordinary metadynamics is recovered in the \Delta T \to \infty limit, while the \Delta T \to 0 limit corresponds to ordinary MD simulations.
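In code, the only change with respect to ordinary metadynamics is the scaling of the Gaussian height before deposition. A sketch, assuming the bias is accumulated on a grid of CV values and that the product k_B ΔT is passed directly as an energy:

```python
import numpy as np

def deposit_well_tempered(V_bias, grid, xi_t, w0, sigma, kB_deltaT):
    """One well-tempered deposition step (Eq. (38)): the height of the new Gaussian
    is w0 * exp(-V_bias(xi_t) / (kB * DeltaT)), so it shrinks where the bias has
    already grown. V_bias is modified in place and also returned."""
    idx = np.argmin(np.abs(grid - xi_t))             # nearest grid point to xi_t
    height = w0 * np.exp(-V_bias[idx] / kB_deltaT)
    V_bias += height * np.exp(-((grid - xi_t) ** 2) / (2.0 * sigma ** 2))
    return V_bias
```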

In ordinary metadynamics, after an initial (transient) time t_\text{fill}, the free-energy minima are filled by the bias potential and the biased free-energy profile is essentially flat. However, V_t^\text{bias} does not converge to the free energy, since the process of Gaussian deposition never stops; the profile of the bias potential nevertheless retains, on average, the same shape. Therefore, the best estimator for the free-energy profile at time t is the time average of V_t^\text{bias} rather than its instantaneous value[11], i.e.:

F_t(\xi) = - \frac{1}{t - t_\text{fill}} \sum_{t' = t_\text{fill}}^t V_{t'}^\text{bias}(\xi).

The error associated with Eq. (40) can be estimated by using block analysis. With this technique, the equilibration part of the simulation is discarded, and the rest is split into N_b blocks. The standard error associated with the average bias potentials of the blocks is then computed as a function of N_b. If t - t_\text{fill} is sufficiently long, the error estimate will be approximately independent of the number of blocks, and can therefore be used to assess the statistical accuracy of the free-energy profile.


Figure 5: Schematic representation of error calculation in ordinary (upper) and well-tempered (lower) metadynamics. The initial part of the simulations (dashed profiles) should always be discarded, while block analysis is applied over the rest to estimate the final free-energy profile. For ordinary and well-tempered metadynamics, the block analysis is performed on the bias potentials and on the histograms of the biased CVs, respectively. Taken from the Supplementary Information of Bussi & Laio (2020).

This procedure is sketched in the top part of Figure 5.

In well-tempered metadynamics, the final free-energy profile can be extracted from the final bias potential by exploiting Eq. (39). However, a better way, which also makes it possible to estimate the bin-wise error, is to apply a block analysis (described above) to the histogram of the biased CV, H(\xi). With this analysis, Eq. (39) is replaced by[12]

F(\xi_\alpha) = - k_B T \log H(\xi_\alpha) - V^\text{bias}(\xi_\alpha),

where the fact that we are dealing with histograms is made explicit by the α subscript. The error on the free energy of bin α is the standard error associated with the histogram block average, converted to an error on the free energy which, following Bussi & Laio (2020), can be written as

\frac{1}{\sqrt{N_b}} \sqrt{\Var_\alpha\left[ \log \left( e^{\frac{V_{i + 1}^\text{bias}(\xi_\alpha)}{k_B \Delta T}} - e^{\frac{V_{i}^\text{bias}(\xi_\alpha)}{k_B \Delta T}} \right) \right]},

where i is the block index and the variance is computed across all blocks. This procedure is presented in the bottom part of Figure 5.

5 Thermodynamic & Hamiltonian integration

The rare-event methods discussed above make it possible to map the free-energy profile along one or more reaction coordinates, and to characterise the free-energy barriers and intermediates that separate the (meta)stable states. However, in many cases we are interested only in the free-energy difference between two states that can be connected by a reversible transformation (e.g., turning off interactions, changing the positions or nature of atoms or molecules, or changing thermodynamic parameters such as temperature or density).

Here I will give a brief introduction to numerical techniques that address this challenge by introducing a structured approach to systematically connect different states of a system. Both methods rest on the same idea: free energy differences can be obtained by “transforming” one state into another while carefully accounting for the changes encountered along the way. Instead of attempting to capture the full complexity of a system’s phase space in a single calculation, these methods break the problem into manageable steps, enabling precise and controlled exploration of the underlying thermodynamics.

5.1 Thermodynamic integration

Given two distinct states A and B differing in their thermodynamic conditions (temperature, pressure, etc.), the thermodynamic integration (TI) method makes it possible to compute their free-energy difference, F_B - F_A. The idea is to define a reversible path in the space of the thermodynamic variables that connects A and B[13], and then run simulations along this path to estimate the derivatives of the free energy, which can be integrated to yield the free-energy difference, since

F(x_2) - F(x_1) = \int_{x_1}^{x_2} \frac{\partial F}{\partial x} dx,

where x is the thermodynamic variable that varies along the path.


Figure 6: A (P-T) phase diagram featuring a gas-liquid phase transition ending in a critical point (marked with a diamond). The two points A and B are connected by three paths (blue dashed lines). Two paths (1 and 3) are reversible, but in path 3 both P and T vary simultaneously, making it less useful for TI applications. By contrast, path 2 crosses the phase transition, and is therefore not reversible.

Figure 6 shows examples of reversible and irreversible paths in a (P, T) phase diagram. Although paths 1 and 3 are both reversible and therefore, in principle, amenable to TI, in practice it is more convenient to change a single thermodynamic variable at a time, as done in path 1.

For one-component systems, common thermodynamic variables are the temperature T, pressure P, number of particles N, volume V, and density ρ, and a state is defined by two thermodynamic variables (often (P, T), (\rho, T) or (V, T)). The integration can follow three directions:

  - along temperature at constant volume (or density), using \left( \partial (\beta F) / \partial \beta \right)_{N,V} = \langle H \rangle = E, so that \beta_2 F(\beta_2) - \beta_1 F(\beta_1) = \int_{\beta_1}^{\beta_2} E(\beta) \, d\beta;
  - along volume (or density) at constant temperature, using \left( \partial F / \partial V \right)_{N,T} = -P, so that F(V_2) - F(V_1) = -\int_{V_1}^{V_2} P(V) \, dV or, in terms of the density, \frac{\beta F(\rho_2)}{N} - \frac{\beta F(\rho_1)}{N} = \int_{\rho_1}^{\rho_2} \frac{\beta P(\rho)}{\rho^2} d\rho;
  - along pressure at constant temperature, in which case it is the Gibbs free energy that is integrated, using \left( \partial G / \partial P \right)_{N,T} = V, so that G(P_2) - G(P_1) = \int_{P_1}^{P_2} V(P) \, dP.

5.2 Hamiltonian integration

With Hamiltonian integration (HI) it is possible to estimate the free-energy difference between systems that are in the same thermodynamic state but have different Hamiltonians, H_A and H_B. This is accomplished by introducing a new Hamiltonian H(\lambda) that uses a parameter λ to interpolate between H_A and H_B. The most common form used in HI is

H(\lambda) = H_A + \lambda (H_B - H_A),

so that H(0) = H_A and H(1) = H_B. The free energy of the system becomes a function of λ, viz.

F(N, V, T, \lambda) = -k_B T \log \left[ \frac{q^N}{N!} \int \exp(-\beta H(\lambda)) d\{ \vec r \} \right],

where q is the “molecular” partition function (which includes the momentum contributions). Differentiating Eq. (48) with respect to λ yields

\frac{\partial F(N, V, T, \lambda)}{\partial \lambda} = \left\langle \frac{\partial H(\lambda)}{\partial \lambda} \right\rangle_{N, V, T, \lambda},

which can be integrated to obtain the free-energy difference between A and B:

\begin{split} F_B(N, V, T) - F_A(N, V, T) & = F(N, V, T, 1) - F(N, V, T, 0) = \\ & = \int_0^1 \left\langle \frac{\partial H(\lambda)}{\partial \lambda} \right\rangle_{N, V, T, \lambda} d\lambda. \end{split}

Note that the above equation is formally identical if P is kept fixed instead of V, i.e. in the isobaric-isothermal ensemble.
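Numerically, both TI and HI reduce to a quadrature of ensemble averages measured at a handful of points along the path. A minimal sketch with invented numbers for the λ-integration above (the same trapezoidal scheme applies to the integrations in β, V or P listed in the TI section):

```python
import numpy as np

# Hypothetical HI output: <dH/dlambda> measured at a few lambda values
lambdas = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
dH_dlambda = np.array([-12.1, -9.8, -7.4, -5.1, -2.9])   # invented ensemble averages

# F_B - F_A = integral from 0 to 1 of <dH/dlambda>, here via the trapezoidal rule
delta_F = np.trapz(dH_dlambda, lambdas)
print(f"F_B - F_A = {delta_F:.2f}")
```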

5.3 A complete example

Advances in materials science and nanotechnology show that the specificity of the Watson-Crick mechanism offers numerous possibilities for realizing nano- and mesoscopic supramolecular constructs entirely made up of DNA. These can be used as testing grounds for theoretical predictions (e.g. Biffi et al. (2013)), to build novel materials and devices for technological applications (e.g. Liu et al. (2024)), but also as a biophysical tool (see e.g. Biffi et al. (2013) and Samanta et al. (2024)).

A simple but powerful framework for using DNA to build materials is provided by DNA nanostars, which are DNA constructs composed of a certain number of arms terminated by sticky ends, whose collective phase behavior has been investigated both experimentally and numerically. In Rovigatti et al. (2014), we used oxDNA to numerically demonstrate that the ground state[14] of tetravalent DNA nanostars in solution is a disordered state (a “liquid”) rather than a crystal, which is a rather surprising result.


Figure 7: (a) The sequence used to build tetravalent DNA nanostars, and a sketch of the secondary structure of a single nanostar. (b) Cartoon of the protocol employed to compute the free energy of the fluid phase of a system made of tetravalent DNA nanostars. The path connecting the initial and final states comprises thermodynamic integration parts (green dashed lines), and a Hamiltonian integration part (orange dotted line) that connects a system of tetramers with switched-off sticky ends to the fully interacting fluid. The snapshots show typical configurations at the marked state points. Here, DNA strands are colored according to the following scheme: DNA sequences interacting only through excluded volume are in gray, unbound sticky ends are in violet, and bound ones are in green. Adapted from Rovigatti et al. (2014).

The sequences and secondary structure of the strands composing the nanostars are shown in Figure 7(a), while the thermodynamic/Hamiltonian path used to evaluate the free energy of the liquid state is shown in Figure 7(b). The integration starts at high temperature and low density, where the system is well approximated by an ideal gas, for which we have an analytical expression for the free energy. Here we used a modified Hamiltonian in which the sticky ends that provide inter-nanostar bonding are disabled by setting λ = 0. We then used Eq. (44) to obtain the free energy of a high-T liquid at the target density, and HI to obtain the free energy of a high-T system with interacting sticky ends (λ = 1). Finally, we used Eq. (46) to evaluate the free energy of the system at low temperature, where essentially all inter-nanostar bonds are formed and the system is very close to its classical ground state.

In order to demonstrate that the liquid state has a lower free energy than the crystal, we used a particular HI technique, called the Einstein crystal method and described thoroughly in Vega et al. (2008), to evaluate the free energy of the latter. This method is a bit out of scope here, but if you are interested you can read our paper to find out what we did.

6 Alchemical free-energy calculations

In chemistry and biophysics, it is common to employ so-called alchemical free energy (AFE) calculations to quantify free-energy differences between two thermodynamic states. The basic idea is the same as that underlying Hamiltonian integration. Indeed, the term “alchemical” refers to the use of nonphysical transformations, akin to the transmutation of elements imagined by medieval alchemists, to connect two states by parametrising the transition through a coupling variable λ varying between 0 and 1. In biophysics, AFE calculations are applied to diverse problems, such as determining solvation free energies, calculating relative binding affinities, and studying the effects of mutations in biomolecules.


Figure 8: Illustration of a thermodynamic cycle for the relative binding free energy, \Delta \Delta G_\text{bind}, between two ligands (“Ligand 1” and “Ligand 2”). Taken from York (2023).

One common application is shown in Figure 8. In the figure, the green arrows represent the absolute binding free energy, \Delta G_\text{bind}, of each ligand (indicated by the superscript), which involves changing the environment from unbound in the aqueous phase to bound in a complex with the protein target. These quantities are experimentally measurable but are challenging to compute directly, as the change in environment can be considerably complicated. The red arrows represent alchemical transformations where Ligand 1 is mutated into a similar Ligand 2 in the same environment. These transformations are frequently more amenable to practical computations. The yellow circles in the figure indicate the region of each ligand that undergoes the most significant changes in the alchemical transformation and would likely be modelled using a “softcore potential” during the transformation.

Three principal methodologies underpin AFE calculations, each offering a distinct approach to determining free-energy differences. Here I introduce them very briefly, so that the reader is aware of their existence.

6.1 Hamiltonian integration

As discussed above, free-energy changes are obtained by integrating the thermodynamic derivative \partial U / \partial \lambda along the alchemical pathway. This approach requires simulations at several discrete λ points, at which the average derivative is computed; numerical integration over these points yields the total free-energy difference. HI is known for its robustness and straightforward implementation, but can be computationally intensive due to the need for dense sampling along the λ coordinate.

6.2 Free Energy Perturbation (FEP)

FEP directly estimates the free-energy difference between states based on the overlap of their phase spaces. The idea is to run simulations at N values of the coupling parameter, \{ \lambda_i \}, so that the total free-energy change is

\Delta F_{A \to B} = \sum_{i=1}^{N - 1} \Delta F_{\lambda_i \to \lambda_{i + 1}}.

The free-energy difference between neighbouring states is estimated using the exponential averaging of the energy difference (the so-called Zwanzig equation), where the average can be performed in the \lambda_i system, in the \lambda_{i+1} system, or in both, so that there are three possible ways of computing \Delta F_{\lambda_i \to \lambda_{i + 1}}:

\begin{aligned} \Delta F_{\lambda_i \to \lambda_{i + 1}} &= - k_B T \log \left\langle e^{-\beta \Delta H_{i, i+1}} \right\rangle_{\lambda_i}\\ \Delta F_{\lambda_i \to \lambda_{i + 1}} &= k_B T \log \left\langle e^{\beta \Delta H_{i, i+1}} \right\rangle_{\lambda_{i+1}}\\ \Delta F_{\lambda_i \to \lambda_{i + 1}} &= - k_B T \log \left[ \frac{\left\langle f(\beta(\Delta H_{i, i+1} - C)) \right\rangle_{\lambda_i}}{\left\langle f(\beta(C - \Delta H_{i, i+1})) \right\rangle_{\lambda_{i+1}}} \right] + C, \end{aligned}

where \Delta H_{i, i+1} = H(\lambda_{i + 1}) - H(\lambda_i), f(x) = 1 / (1 + e^x) is the Fermi function, and C is an arbitrary constant. In principle, FEP can calculate free-energy changes by sampling only the end states, A and B. However, if the configurations sampled in A poorly represent those relevant for B, the ensemble average \left\langle e^{-\beta \Delta H} \right\rangle_{A} becomes dominated by rare, poorly sampled configurations, leading to large statistical errors. Therefore, in practice, a number of intermediate λ windows is often required to ensure adequate phase-space overlap and convergence, making the computational cost of the method comparable to that of HI.
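A sketch of the forward Zwanzig estimator for a single pair of neighbouring λ values, written in a shifted (log-sum-exp) form to avoid overflow; the Gaussian "data" are invented, and for Gaussian-distributed ΔH the exact answer is the mean minus β times the variance over two, which makes for an easy check.

```python
import numpy as np

def fep_forward(delta_H, kBT=1.0):
    """Forward Zwanzig estimator: Delta F = -kBT * ln < exp(-beta Delta H) >_i,
    with Delta H = H(lambda_{i+1}) - H(lambda_i) evaluated on configurations
    sampled at lambda_i. Uses a shifted (log-sum-exp) form for stability."""
    x = np.asarray(delta_H) / kBT
    x0 = x.min()
    return kBT * x0 - kBT * np.log(np.mean(np.exp(-(x - x0))))

rng = np.random.default_rng(3)
dH = rng.normal(2.0, 1.0, 100000)     # invented energy differences, in kBT units
print(f"FEP estimate: {fep_forward(dH):.3f} kBT (Gaussian reference: 1.500 kBT)")
```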

6.3 Nonequilibrium Work (NEW) methods

Nonequilibrium Work (NEW) methods provide a distinct approach to alchemical free energy calculations, departing from traditional equilibrium-based techniques. These methods calculate free energy differences by performing rapid transformations between thermodynamic states, where the system is intentionally driven out of equilibrium. At the heart of these calculations lies the Jarzynski equality, a foundational equation in nonequilibrium statistical mechanics. The Jarzynski equality is expressed as:

e^{-\beta \Delta F} = \overline{e^{-\beta W}},

where W is the work done during the transformation, and the overline denotes an average over multiple realisations of the nonequilibrium process. Using the chain rule, the work can be written as

W = \int_0^\tau \frac{\partial H(\lambda)}{\partial \lambda} \frac{d \lambda}{dt} dt,

where τ is the duration of the transformation. The Jarzynski equality reveals that the free-energy difference between two states can be computed from an ensemble of nonequilibrium work measurements, even if the system does not equilibrate during the transformation.

In practice, the transformation is performed by smoothly varying the coupling parameter λ from 0 to 1 over a finite time τ, generating a trajectory along which the work W done on the system is recorded. A series of such trajectories is then used to compute the exponential average of W, providing an estimate of \Delta F.
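A sketch of the corresponding estimator, with an invented Gaussian work distribution: the Jarzynski average sits below the mean work, illustrating that the average work is only an upper bound to ΔF (for a Gaussian work distribution the exact result is the mean work minus β times the work variance over two).

```python
import numpy as np

def jarzynski_delta_F(work, kBT=1.0):
    """Jarzynski estimator: exp(-beta Delta F) = mean(exp(-beta W)),
    evaluated in shifted (log-sum-exp) form for numerical stability."""
    w = np.asarray(work) / kBT
    w0 = w.min()
    return kBT * w0 - kBT * np.log(np.mean(np.exp(-(w - w0))))

rng = np.random.default_rng(4)
W = rng.normal(5.0, 2.0, 50000)       # invented work values from fast switching, in kBT units
print(f"<W>               = {W.mean():.2f} kBT")
print(f"Jarzynski Delta F = {jarzynski_delta_F(W):.2f} kBT (Gaussian reference: 3.00 kBT)")
```

For broader work distributions or fewer trajectories the exponential average converges much more slowly, a point discussed right below.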

Unlike equilibrium methods that require exhaustive sampling across several λ windows, NEW simulations focus on generating multiple rapid trajectories, each starting from a well-equilibrated initial state. These trajectories can be run independently, allowing for significant parallelization and reduced wall-clock time. However, the accuracy of NEW methods depends on the careful sampling of initial conditions and the balance between transformation speed and system relaxation. Extremely fast transformations may lead to poor convergence, as large deviations from equilibrium can bias the results.

Footnotes
  1. To provide an example that you can already appreciate, the bottom-up effective interactions discussed in the coarse graining chapter can be computed with any of the techniques discussed in this chapter.

  2. Sometimes also called activation (free) energy.

  3. Note that, per this definition, \Delta F_b^{A \to B} \neq \Delta F_b^{B \to A} = F_{\rm max} - F_B.

  4. Unbiased simulations are “regular” simulations, where the system evolves according to the fundamental laws of physics (Newton’s laws of motion in this context), without any artificial bias or constraints imposed.

  5. The bias often takes the form of a harmonic potential whose shape, resembling an umbrella, gives the method its name.

  6. This constant will take different values in different windows, since it is an ensemble average of a window-dependent quantity, V^{\rm bias}_i(\xi).

  7. Under certain conditions, histograms computed with some parameters (e.g. temperature or chemical potential) can be reweighted to obtain the same quantity for some other (usually nearby) values of the same parameters, without having to run additional simulations.

  8. Note that the same procedure can be used to reconstruct the profile of any observable that depends solely on the RC, since Eq. (9) is valid for any O = O(\xi).

  9. This lack of correlation seems obvious, but there are computational techniques, such as parallel tempering, where this assumption may not hold.

  10. Both assumptions are reasonable for most real-world examples, since M should be large to have good statistics, and p should be small in a well-sampled, biased simulation, where many bins should have non-zero counts.

  11. As noted in Bussi & Laio (2020), this is similar to a regular observable in an unbiased simulation, which converges to some specific average value after a certain equilibration time, but whose instantaneous value does not have any macroscopic meaning.

  12. Here we reweight the histogram, as you would do with ordinary umbrella sampling histograms.

  13. By definition, the free-energy difference between two states does not depend on the path connecting them, as long as the transformation is reversible.

  14. Here ground state is to be understood classically, as the equilibrium state of the system for T \to 0.

References
  1. Šulc, P., Romano, F., Ouldridge, T. E., Rovigatti, L., Doye, J. P. K., & Louis, A. A. (2012). Sequence-dependent thermodynamics of a coarse-grained DNA model. The Journal of Chemical Physics, 137(13). 10.1063/1.4754132
  2. Barducci, A., Bonomi, M., Prakash, M. K., & Parrinello, M. (2013). Free-energy landscape of protein oligomerization from atomistic simulations. Proceedings of the National Academy of Sciences, 110(49). 10.1073/pnas.1320077110
  3. Hénin, J., Lelièvre, T., Shirts, M. R., Valsson, O., & Delemotte, L. (2022). Enhanced Sampling Methods for Molecular Dynamics Simulations [Article v1.0]. Living Journal of Computational Molecular Science, 4(1). 10.33011/livecoms.4.1.1583
  4. Eyring, H. (1935). The Activated Complex in Chemical Reactions. The Journal of Chemical Physics, 3(2), 107–115. 10.1063/1.1749604
  5. Evans, M. G., & Polanyi, M. (1935). Some applications of the transition state method to the calculation of reaction velocities, especially in solution. Transactions of the Faraday Society, 31, 875. 10.1039/tf9353100875
  6. Le Priol, C., Monteiro, J. M., & Bouchet, F. (2024). Using rare event algorithms to understand the statistics and dynamics of extreme heatwave seasons in South Asia. Environmental Research: Climate, 3(4), 045016. 10.1088/2752-5295/ad8027
  7. Torrie, G. M., & Valleau, J. P. (1977). Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. Journal of Computational Physics, 23(2), 187–199. 10.1016/0021-9991(77)90121-8
  8. Rovigatti, L., Gnan, N., Parola, A., & Zaccarelli, E. (2015). How soft repulsion enhances the depletion mechanism. Soft Matter, 11(4), 692–700. 10.1039/c4sm02218a
  9. Virnau, P., & Müller, M. (2004). Calculation of free energy through successive umbrella sampling. The Journal of Chemical Physics, 120(23), 10925–10930. 10.1063/1.1739216
  10. Bartels, C., & Karplus, M. (1997). Multidimensional adaptive umbrella sampling: Applications to main chain and side chain peptide conformations. Journal of Computational Chemistry, 18(12), 1450–1462. 10.1002/(sici)1096-987x(199709)18:12<1450::aid-jcc3>3.0.co;2-i
  11. Bartels, C., & Karplus, M. (1998). Probability Distributions for Complex Systems: Adaptive Umbrella Sampling of the Potential Energy. The Journal of Physical Chemistry B, 102(5), 865–880. 10.1021/jp972280j
  12. Kumar, S., Rosenberg, J. M., Bouzida, D., Swendsen, R. H., & Kollman, P. A. (1992). THE weighted histogram analysis method for free‐energy calculations on biomolecules. I. The method. Journal of Computational Chemistry, 13(8), 1011–1021. 10.1002/jcc.540130812
  13. Barducci, A., Bonomi, M., & Parrinello, M. (2011). Metadynamics. WIREs Computational Molecular Science, 1(5), 826–843. 10.1002/wcms.31
  14. Bussi, G., & Laio, A. (2020). Using metadynamics to explore complex free-energy landscapes. Nature Reviews Physics, 2(4), 200–212. 10.1038/s42254-020-0153-0
  15. Laio, A., & Parrinello, M. (2002). Escaping free-energy minima. Proceedings of the National Academy of Sciences, 99(20), 12562–12566. 10.1073/pnas.202427399