1 The main assignment
The SantaLucia (SL) model is the most famous (and widely-used) nearest-neighbour model for DNA. The mandatory part of this assignment is to write two codes, one that computes the melting temperature of a double strand given its sequence, and another that computes the melting temperature of the secondary structure of a single strand.
The following instructions apply to the first code:
- The code takes as input the two strands that make up the double strand and checks that they have the same length and are fully complementary. Nota Bene: this check is not required if you also decide to include the terms that account for mismatches.
- You can assume that the total strand concentration is M for simplicity.
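The input check for the first code can be sketched as follows. This is only a minimal illustration: the function name `duplex_is_valid` is hypothetical, and it assumes that both strands are passed written 5'→3', so that the second strand has to be read backwards to line it up with the first.

```python
# Watson-Crick partners for DNA.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def duplex_is_valid(top: str, bottom: str) -> bool:
    """Return True if the two strands form a fully complementary duplex.

    Assumption (not from the assignment text): both strands are given 5'->3'.
    """
    top, bottom = top.upper(), bottom.upper()
    if not top or len(top) != len(bottom):
        return False
    # Antiparallel pairing: base i of `top` faces base (n - 1 - i) of `bottom`.
    # Unknown letters give COMPLEMENT.get(x) == None, which never matches.
    return all(COMPLEMENT.get(x) == y for x, y in zip(top, reversed(bottom)))
```

If you prefer to pass the second strand written 3'→5', drop the `reversed` call; either convention works as long as it is stated and used consistently.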
The following instructions apply to the second code:
- The code takes as input the strand sequence and a secondary structure, specified with the dot-paren notation.
- The code should check that the secondary structure is compatible with the given sequence (e.g. check that they are of the same length, that the base pairs specified in the secondary structure are valid Watson-Crick pairs, and that each opening parenthesis has a closing partner).
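The compatibility checks listed above can be done in a single pass with a stack. The sketch below is one possible way to do it, not a prescribed solution: the function name `parse_dot_paren` is hypothetical, and it assumes a DNA alphabet and a structure made only of `.`, `(` and `)`.

```python
# Ordered (5' base, 3' base) pairs that count as Watson-Crick for DNA.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def parse_dot_paren(sequence: str, structure: str) -> list[tuple[int, int]]:
    """Return the sorted list of base pairs (i, j), or raise ValueError."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure have different lengths")
    open_positions = []  # indices of '(' still waiting for a partner
    pairs = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(j)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {j}")
            i = open_positions.pop()
            if (sequence[i], sequence[j]) not in WATSON_CRICK:
                raise ValueError(f"{sequence[i]}{i}-{sequence[j]}{j} is not a Watson-Crick pair")
            pairs.append((i, j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)
```

The returned pair list is also a convenient starting point for the energy evaluation, since consecutive pairs `(i, j)` and `(i + 1, j - 1)` identify the stacks.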
The two codes should
- Evaluate the enthalpic and entropic contributions to the given secondary structure, $\Delta H$ and $\Delta S$, according to the SantaLucia model.
- Apply the two-state model to predict the melting temperature of the secondary structure. Nota Bene: for complicated examples the main assumption behind the two-state model[1] will not hold, but for the sake of this assignment we will pretend that this is never the case.
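In the two-state model the melting temperature is the temperature at which half of the strands are in the given structure. For a single strand this gives $T_m = \Delta H / \Delta S$; for a duplex made of two different strands at equal concentration it gives $T_m = \Delta H / (\Delta S + R \ln(C_T / 4))$, where $C_T$ is the total strand concentration, and the factor 4 becomes 1 for a self-complementary strand. A minimal sketch follows; the function name and its arguments are hypothetical choices made for illustration.

```python
import math

R = 1.987  # gas constant in cal / (mol K)


def melting_temperature(dH_kcal, dS_cal, total_conc=None, self_complementary=False):
    """Two-state melting temperature in degrees Celsius.

    dH_kcal: enthalpy in kcal/mol; dS_cal: entropy in cal/(mol K).
    total_conc: total strand concentration in mol/L for a duplex,
                or None for a unimolecular structure such as a hairpin.
    """
    dH_cal = dH_kcal * 1000.0  # bring the enthalpy to the same units as dS
    if total_conc is None:
        tm_kelvin = dH_cal / dS_cal
    else:
        factor = 1.0 if self_complementary else 4.0
        tm_kelvin = dH_cal / (dS_cal + R * math.log(total_conc / factor))
    return tm_kelvin - 273.15
```

Note the unit mismatch between the two columns of the parameter table (kcal versus cal): forgetting the factor of 1000 is the most common source of absurd melting temperatures.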
2 Possible extensions
- For the double-strand code, add the possibility of choosing the strand concentrations.
- Add the possibility of visualising the melting curve of the given secondary structure (i.e. the yield of the secondary structure as a function of temperature).
- Add the possibility of choosing the NN model: you can, for instance, add support for one of the RNA models listed here.
- Add all the NN terms.
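For the melting-curve extension, the yield follows from the two-state equilibrium constant $K = \exp(-\Delta G / RT)$ with $\Delta G = \Delta H - T \Delta S$. The sketch below is one way to evaluate it and is not part of the assignment text: the function name is hypothetical, the duplex branch assumes two different strands at equal concentration, and $\Delta H$ and $\Delta S$ are taken to be independent of temperature.

```python
import math

R = 1.987e-3  # gas constant in kcal / (mol K)


def two_state_yield(dH_kcal, dS_cal, temp_celsius, total_conc=None):
    """Fraction of strands in the folded/bound state at a given temperature.

    total_conc: total strand concentration in mol/L for a duplex,
                or None for a unimolecular structure such as a hairpin.
    """
    T = temp_celsius + 273.15
    dG = dH_kcal - T * dS_cal / 1000.0  # kcal/mol
    K = math.exp(-dG / (R * T))
    if total_conc is None:
        # Unimolecular: K = yield / (1 - yield)
        return K / (1.0 + K)
    # Bimolecular: a * (1 - y)^2 = y with a = K * C_T / 2.
    # The root in [0, 1] is written in a form that avoids cancellation.
    a = K * total_conc / 2.0
    return 2.0 * a / ((2.0 * a + 1.0) + math.sqrt(4.0 * a + 1.0))
```

Evaluating this function on a range of temperatures and plotting the result gives the melting curve; the yield crosses 0.5 at the two-state melting temperature.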
3 Additional details
The table below contains the thermodynamic parameters of the model for the different base stacks, together with the initiation, terminal AT, and symmetry correction contributions. You will find the others, together with the explanation of how to use them, in SantaLucia & Hicks (2004).
Table 1: SL parameters for Watson-Crick pairs at 1 M
Term | $\Delta H$ (kcal / mol) | $\Delta S$ (cal / mol K) |
---|---|---|
AA/TT | −7.6 | −21.3 |
AT/TA | −7.2 | −20.4 |
TA/AT | −7.2 | −21.3 |
CA/GT | −8.5 | −22.7 |
GT/CA | −8.4 | −22.4 |
CT/GA | −7.8 | −21.0 |
GA/CT | −8.2 | −22.2 |
CG/GC | −10.6 | −27.2 |
GC/CG | −9.8 | −24.4 |
GG/CC | −8.0 | −19.9 |
Initiation | +0.2 | −5.7 |
Terminal AT penalty | +2.2 | +6.9 |
Symmetry correction | 0.0 | −1.4 |
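One possible way to use Table 1 in code is sketched below. The names are hypothetical, and the bookkeeping is an assumption based on the usual reading of the SL model: each stack is labelled by the 5'→3' dinucleotide of the top strand, a stack missing from the table is looked up through its reverse complement (the same stack read from the other strand), the terminal AT penalty is applied once per duplex end that closes with an A-T pair, and the symmetry correction is applied only to self-complementary sequences.

```python
# Table 1, keyed by the top-strand dinucleotide (5'->3'):
# (dH in kcal/mol, dS in cal/(mol K)).
NN = {
    "AA": (-7.6, -21.3), "AT": (-7.2, -20.4), "TA": (-7.2, -21.3),
    "CA": (-8.5, -22.7), "GT": (-8.4, -22.4), "CT": (-7.8, -21.0),
    "GA": (-8.2, -22.2), "CG": (-10.6, -27.2), "GC": (-9.8, -24.4),
    "GG": (-8.0, -19.9),
}
INITIATION = (0.2, -5.7)
TERMINAL_AT = (2.2, 6.9)
SYMMETRY = (0.0, -1.4)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def duplex_dH_dS(seq: str) -> tuple[float, float]:
    """Sum the SL terms for a fully complementary duplex, given its top strand 5'->3'."""
    dH, dS = INITIATION
    for i in range(len(seq) - 1):
        step = seq[i:i + 2]
        # e.g. TT is not in the table, but its reverse complement AA is.
        h, s = NN[step] if step in NN else NN[reverse_complement(step)]
        dH, dS = dH + h, dS + s
    for end in (seq[0], seq[-1]):
        if end in "AT":
            dH, dS = dH + TERMINAL_AT[0], dS + TERMINAL_AT[1]
    if seq == reverse_complement(seq):
        dH, dS = dH + SYMMETRY[0], dS + SYMMETRY[1]
    return dH, dS
```

Keying the dictionary by the top-strand dinucleotide keeps the lookup to ten entries, at the price of the reverse-complement fallback; storing all sixteen dinucleotides explicitly is an equally valid design.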
For the hairpin code, the free-energy cost of a loop, in kcal / mol, is given as a function of the loop length by the following table
Table 2: Turner parameters for hairpin loops
Loop length | | |
---|---|---|
3 | 3.20 | 1.30 |
4 | 3.60 | 4.80 |
5 | 4.00 | 3.60 |
6 | 4.40 | −2.90 |
7 | 4.60 | 1.30 |
8 | 4.70 | −2.90 |
9 | 4.80 | 5.00 |
>9 | 5.00 | |
Just to give you some numbers you can use to test your code, consider the following double strand, which forms a duplex made of 6 base pairs:
5'-CGTTGA-3'
3'-GCAACT-5'
Its SL free energy contributions are kcal/mol, cal/mol K, so that kcal/mol at 37°. Its melting temperature at M is C.
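As a hand check, the sums for this duplex can be written out from Table 1. The bookkeeping here is an assumption based on the usual reading of the SL model: the five stacks are CG/GC, GT/CA, TT/AA (the AA/TT entry read from the other strand), TG/AC (the CA/GT entry read from the other strand), and GA/CT; there is one initiation term and one terminal AT penalty for the A-T pair that closes one end; there is no symmetry correction because the strand is not self-complementary.

$$
\begin{aligned}
\Delta H &= (-10.6 - 8.4 - 7.6 - 8.5 - 8.2) + 0.2 + 2.2 = -40.9\ \text{kcal/mol},\\
\Delta S &= (-27.2 - 22.4 - 21.3 - 22.7 - 22.2) - 5.7 + 6.9 = -114.6\ \text{cal/mol K},\\
\Delta G(37\,^\circ\text{C}) &= \Delta H - T \Delta S = -40.9 + 310.15 \times 0.1146 \approx -5.36\ \text{kcal/mol}.
\end{aligned}
$$

The melting temperature additionally depends on the strand concentration, so it is not evaluated in this check.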
[1] That there are no stable or metastable intermediate states.
- SantaLucia, J., & Hicks, D. (2004). The Thermodynamics of DNA Structural Motifs. Annual Review of Biophysics and Biomolecular Structure, 33(1), 415–440. 10.1146/annurev.biophys.32.110601.141800