OAT scripting interface documentation
Each script has been broken down into two types of functions:
An importable module with the same name as the corresponding command line invocation which runs the entire script and returns the result as a Python object.
A series of sub-modules which implement individual steps of the overall process.
The first type of import can be used to compose multiple analysis types into a single script or Jupyter notebook. For example, to first align a trajectory, then decimate the resulting trajectory to contain only every 10th trajectory while skipping the first 200 confgiruations, each using 5 processes, you would run:
from oxDNA_analysis_tools.align import align
from oxDNA_analysis_tools.decimate import decimate
traj = 'traj.dat'
align_output = 'aligned.dat'
decimate_output = 'decimated.dat'
align(traj, align_output, ncpus=5)
decimate(align_output, decimate_output, ncpus=5, start=200, stride=10)
Align
Align a trajectory to a ref_conf and print the result to a file. |
|
Single-value decomposition-based alignment of configurations |
- oxDNA_analysis_tools.align.align(traj: str, outfile: str, ncpus: int = 1, indexes: List[int] = [], ref_conf: Configuration | None = None, center: bool = True)
Align a trajectory to a ref_conf and print the result to a file.
- Parameters:
traj (str) – The trajectory file name to align
outfile (str) – The file name to write the aligned trajectory to
ncpus (int) – (optional) How many cpus to parallelize the operation. default=1
indexes (List[int]) – (optional) IDs of a subset of particles to consider for the alignment. default=all
ref_conf (Configuration) – (optional) The configuration to align to. default=first conf
Writes the aligned configuration to outfile
- oxDNA_analysis_tools.align.svd_align(ref_coords: ndarray, coords: ndarray, indexes: ndarray, ref_center: ndarray = array([], dtype=float64)) Tuple[ndarray, ndarray, ndarray]
Single-value decomposition-based alignment of configurations
- Parameters:
ref_coords (numpy.ndarray) – Reference coordinates. Should be indexed before calling this function.
coords (numpy.ndarray) – np.array containing [coordinates, a1s, a3s] for the structure to align
indexes (np.ndarray[int]) – Indexes of the atoms to be aligned in the coords array
ref_center (numpy.ndarray) – (optional) The center of mass of the reference configuration. If not provided, it will be calculated (slightly slower for many confs).
- Returns:
A tuple of the aligned coordinates (coords, a1s, a3s) for the given chunk
- Return type:
Tuple[np.ndarray, np.ndarray, np.ndarray]
ANM parameterize
Computes the coarse-grained RMSF for a given trajectory. |
- oxDNA_analysis_tools.anm_parameterize.anm_parameterize(particles_array: ndarray, trajectory: str, ref_conf: Configuration) ndarray
Computes the coarse-grained RMSF for a given trajectory.
- Parameters:
particles_array (np.array) – An array containing the indices of the particles in each super particle.
trajectory (str) – The path to the trajectory to evaluate.
ref_conf (Configuration) – The reference configuration.
- Returns:
The RMSF for each super particle.
- Return type:
np.ndarray
Backbone flexibility
|
Calculate backbone flexibility of a trajectory. |
Bond analysis
Compare the bond occupancy of a trajectory with a designed structure |
- oxDNA_analysis_tools.bond_analysis.bond_analysis(traj_info: TrajInfo, top_info: TopInfo, pairs: Dict[int, int], inputfile: str, ncpus: int = 1) Tuple[ndarray, ndarray, ndarray, ndarray]
Compare the bond occupancy of a trajectory with a designed structure
- Parameters:
traj_info (TrajInfo) – Object containing the trajectory information
top_info (TopInfo) – Object containing the topology information
pairs (dict) – Designed pairs ({p1 : q1, p2 : q2})
inputfile (str) – The path to the input file used to run the simulation
ncpus (int) – (optional) number of cores to use
- Returns:
- Number of formed bonds among the specified nucleotides at each step in the simulationNumber of missbonds among specified nucleotides at each step in the simulationNumber of correct bonds among specified nucleotides at each step in the simulationPer-nucleotide correct bond occupancy
- Return type:
Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]
Centroid
Find the configuration in a trajectory closest to a provided reference configuration |
- oxDNA_analysis_tools.centroid.centroid(traj_info: TrajInfo, top_info: TopInfo, ref_conf: Configuration, indexes: List[int] = [], ncpus=1) Tuple[Configuration, float]
Find the configuration in a trajectory closest to a provided reference configuration
- Parameters:
traj_info (TrajInfo) – Object containing information about the trajectory
top_info (TopInfo) – Object containing information about the topology
ref_conf (Configuration) – Object containing the reference configuration
indexes (List[int]) – (optional) Indexes of the particles to be used for alignment
ncpus (int) – (optional) Number of CPUs to use for alignment
- Returns:
- The configuration with the lowest RMSD to the referenceThe RMSD from the centroid to the reference
- Return type:
Tuple[Configuration, float]
Clustering
Splits the trajectory into the clustered trajectories. |
|
Takes the output from DBSCAN and finds the centroid of each cluster. |
|
Use the DBSCAN algorithm to identify clusters of configurations based on a given order parameter. |
- oxDNA_analysis_tools.clustering.split_trajectory(traj_info, top_info, labs) None
Splits the trajectory into the clustered trajectories. Names each trajectory cluster_<n>.dat
- oxDNA_analysis_tools.clustering.get_centroid(points: ndarray, metric_name: str, labs: ndarray, traj_info: TrajInfo, top_info: TopInfo) List[int]
Takes the output from DBSCAN and finds the centroid of each cluster.
- Parameters:
- Returns:
The configuration ID for the centroid of each cluster
- Return type:
List[int]
- oxDNA_analysis_tools.clustering.perform_DBSCAN(traj_info: TrajInfo, top_info: TopInfo, op: ndarray, metric: str, eps: float, min_samples: int, op_names: List[str] = [], no_traj: bool = False, interactive_plot: bool = False, min_clusters: int = -1) ndarray
Use the DBSCAN algorithm to identify clusters of configurations based on a given order parameter.
- Parameters:
traj_info (TrajInfo) – Information about the trajectory
top_info (TopInfo) – Information about the topology
op (np.ndarray) – The order parameter(s) to use (shape = n_confs x n_op for metric=euclidean, n_confs x n_confs for metric=precomputed)
metric (str) – Either ‘euclidean’ or ‘precomputed’ for whether the distance needs to be calculated
eps (float) – The maximum distance between two points to be considered in the same neighborhood
min_samples (int) – The minimum number of points to be considered a neighborhood
no_traj (bool) – If True, skip splitting the trajectory (these are slow)
interactive_plot (bool) – If True, show plot interactivley instead of saving as an animation
min_clusters (int) – If less than min_clusters are found, return and don’t do further calculations
- Returns:
The label for each configuration
- Return type:
np.ndarray
Config
Checks if the dependencies are installed. |
|
Sets the number of confs to read at a time for analyses. |
|
Gets the current chunk size. |
- oxDNA_analysis_tools.config.check(to_check: List[str] = ['python', 'numpy', 'matplotlib', 'sklearn', 'oxpy'])
Checks if the dependencies are installed.
- Parameters:
to_check (List[str]) – list of package names to check
Contact map
Calculates the average contact map for a trajectory. |
Decimate
Reduce the number of configurations in a trajectory. |
- oxDNA_analysis_tools.decimate.decimate(traj: str, outfile: str, start: int, stop: int, stride: int)
Reduce the number of configurations in a trajectory.
- Parameters:
traj (str) – The trajectory file name to decimate
outfile (str) – The file name to write the decimates trajectory to
start (int) – (optional) Starting configuration for the new trajectory. Accepts negative indexes. default=0
stop (int) – (optional) Process up to this conf (exclusive). Accepts negative indexes.
stride (int) – (optional) Include only every stride-th conf. (default=10)
Writes the trajectory directly to outfile.
Deviations
Find the deviations of a trajectory from a mean configuration |
|
Create RMSF oxView overlay and RMSD plot |
- oxDNA_analysis_tools.deviations.deviations(traj_info: TrajInfo, top_info: TopInfo, mean_conf: Configuration, indexes: List[int] = [], ncpus: int = 1) Tuple[ndarray, ndarray]
Find the deviations of a trajectory from a mean configuration
- Parameters:
traj_info (TrajInfo) – Information about the trajectory
top_info (TopInfo) – Information about the topology
mean_conf (Configuration) – The mean configuration
indexes (List[int]) – (optional) List of indexes of the particles to be used for alignment
ncpus (int) – (optional) Number of CPUs to use for alignment
- Returns:
- Root mean squared deviation for each configuration in the trajectoryAverage fluctuation for each particle in the structure
- Return type:
Tuple[np.ndarray, np.ndarray]
- oxDNA_analysis_tools.deviations.output(RMSDs: ndarray, RMSFs: ndarray, outfile: str = 'devs.json', plot_name: str = 'rmsd.png', data_file: str = 'rmsd_op.json')
Create RMSF oxView overlay and RMSD plot
- Parameters:
RMSDs (np.array) – Root mean squared deviation for each configuration in the trajectory
RMSFs (np.array) – Average deviation for each particle in the structure
outfile (str) – (optional) Name of the oxView overlay file for the RMSF
plot_name (str) – (optional) Name of the RMSD plot
data_file (str) – (optional) Name of the oxView order parameter file for the RMSD
Distance
Calculates distance between two particles taking PBC into account |
|
Calculates all mutual distances between two sets of points taking PBC into account |
|
Compute the distance between two lists of particles |
- oxDNA_analysis_tools.distance.min_image(p1: ndarray, p2: ndarray, box: float) float
Calculates distance between two particles taking PBC into account
- oxDNA_analysis_tools.distance.vectorized_min_image(p1s: ndarray, p2s: ndarray, box: float) ndarray
Calculates all mutual distances between two sets of points taking PBC into account
- Paramters:
p1s (np.ndarray) : the first set of points (Nx3 array) p2s (np.ndarray) : the second set of points (Mx3 array) box (float) : The size of the box (assumes a cubic box)
- Returns:
the distances between the points (NxM array)
- Return type:
np.ndarray
- oxDNA_analysis_tools.distance.distance(traj_infos: List[TrajInfo], top_infos: List[TopInfo], p1ss: List[List[int]], p2ss: List[List[int]], ncpus: int = 1) List[List[float]]
Compute the distance between two lists of particles
- Parameters:
- Returns:
A list of distances for each trajectory
- Return type:
List[List[float]]
Dot-bracket to force
Converts a dot-bracket string to a list of paired nucleotides. |
|
Convert a dot-bracket string to oxDNA mutual traps |
- oxDNA_analysis_tools.db_to_force.parse_dot_bracket(input: str) ndarray
Converts a dot-bracket string to a list of paired nucleotides.
Accepts (), [], {}, and . characters, otherwise throws an error
- Parameters:
input (str) – A dot-bracket string
- Returns:
A list where each index corresponds to a nucleotide. Value is -1 is unpaired or another index if paired.
- Return type:
np.ndarray
Duplex angle plotter
Read in a duplex list file and return the angles between specified duplexes. |
- oxDNA_analysis_tools.duplex_angle_plotter.get_angle_between(files: List[str], p1s: List[List[int]], p2s: List[List[int]], invert_mask: List[bool]) Tuple[List[List[ndarray]], List[List[float]], List[List[float]], List[List[float]], List[List[float]]]
Read in a duplex list file and return the angles between specified duplexes.
- Parameters:
- Returns:
- The list of angles between the specified duplexes.The mean angle between each pair of duplexes.The median angle between each pair of duplexes.The standard deviation of the angle between each pair of duplexes.The percentage of confs that have the duplexes.
- Return type:
Tuple[List[List[ndarray]], List[List[float]], List[List[float]], List[List[float]], List[List[float]]]
Duplex finder
Defines a nucleic acid duplex structure |
|
Finds the duplexes in a structure |
|
Finds the duplexes in a trajectory |
- class oxDNA_analysis_tools.duplex_finder.Duplex(time: int, index: int, start1: int, end1: int, start2: int, end2: int, axis: ndarray, pos: ndarray)
Bases:
object
Defines a nucleic acid duplex structure
- Parameters:
index (int) – Unique identifier for the duplex
start1 (int) – Particle ID of the first particle in the first strand
end1 (int) – Particle ID of the last particle in the first strand
start2 (int) – Particle ID of the first particle in the complementary strand
end2 (int) – Particle ID of the last particle in the complementary strand
axis (np.array) – Normalized vector fit to the center of the duplex
pos (np.array) – start position in 3D space of the axis
- oxDNA_analysis_tools.duplex_finder.find_duplex(monomers: List[Monomer]) List[Duplex]
Finds the duplexes in a structure
- oxDNA_analysis_tools.duplex_finder.duplex_finder(traj_info: TrajInfo, top_info: TopInfo, inputfile: str, monomers: List[Monomer], ncpus=1) List[List[Duplex]]
Finds the duplexes in a trajectory
- Parameters:
- Returns:
List of lists of duplexes, one list for each step in the trajectory
- Return type:
List[List[Duplex]]
File info
Extract metadata about trajectory files |
- oxDNA_analysis_tools.file_info.file_info(trajectories: List) Dict
Extract metadata about trajectory files
- Parameters:
trajectories (List[str]) – Filepaths to the trajectories to analyze
- Returns:
Dictionary with name, number of particles, number of confs, filezie, starting step and ending step.
- Return type:
Dict
Forces to dot-bracket
Convert a force list to dot-bracket notation. |
Mean
Compute the mean structure of a trajectory. |
- oxDNA_analysis_tools.mean.mean(traj_info: TrajInfo, top_info: TopInfo, ref_conf: Configuration | None = None, indexes: List[int] = [], ncpus: int = 1) Configuration
Compute the mean structure of a trajectory.
- Parameters:
traj_info (TrajInfo) – Information about the trajectory
top_info (TopInfo) – Information about the topology
ref_conf (Configuration) – (optional) The reference configuration to align to. If None, a random configuraiton will be used.
indexes (List[int]) – (optional) The indexes of nucleotides included in the alignment. If None, all nucleotides will be aligned.
ncpus (int) – (optional) The number of CPUs to use. If None, 1 CPU will be used.
- Returns:
The mean structure of the trajectory.
- Return type:
Minify
Make a trajectory smaller by discarding some precision. |
Multidimensional scaling mean
|
Compute the mean configuration of a trajectory using MDS. |
|
Compute the deviations of the contact map from the mean. |
- oxDNA_analysis_tools.multidimensional_scaling_mean.multidimensional_scaling_mean(traj_info: TrajInfo, top_info: TopInfo, ncpus: int = 1) Tuple[Configuration, ndarray]
Compute the mean configuration of a trajectory using MDS.
- Parameters:
- Returns:
The average configuration and the distance matrix with all distances above the cutoff masked
- Return type:
Tuple[Configuration, np.ndarray]
- oxDNA_analysis_tools.multidimensional_scaling_mean.distance_deviations(traj_info: TrajInfo, top_info: TopInfo, masked_mean: ndarray, ncpus: int = 1) ndarray
Compute the deviations of the contact map from the mean.
- Parameters:
- Returns:
The per-particle deviation from the mean distance map
- Return type:
np.ndarray
Output bonds
Computes the potentials in a trajectory |
- oxDNA_analysis_tools.output_bonds.output_bonds(traj_info: TrajInfo, top_info: TopInfo, inputfile: str, visualize: bool = False, conversion_factor: float = 1, ncpus: int = 1)
Computes the potentials in a trajectory
- Parameters:
traj_info (TrajInfo) – Information about the trajectory.
top_info (TopInfo) – Information about the topology.
inputfile (str) – Path to the input file.
visualize (bool) – (optional) If True, the energies are saved as a mean-per-particle oxView file. If False, they are printed to the screen.
conversion_factor (float) – (optional) Conversion factor for the energies. 1 for oxDNA SU, 41.42 for pN nm.
ncpus (int) – (optional) Number of CPUs to use.
- Returns:
If visualize is True, the energies are saved as a mean-per-particle oxView file. If False, they are printed to the screen and None is returned.
- Return type:
np.ndarray or None
OxDNA to PDB
Convert an oxDNA file to a PDB file. |
- oxDNA_analysis_tools.oxDNA_PDB.oxDNA_PDB(conf: Configuration, system: System, out_basename: str, protein_pdb_files: List[str] = [], reverse: bool = False, hydrogen: bool = True, uniform_residue_names: bool = False, one_file_per_strand: bool = False, rmsf_file: str = '')
Convert an oxDNA file to a PDB file. Directly writes the file.
- Parameters:
top_file (str) – Filename of oxDNA topology
conf_file (str) – Filename of oxDNA configuration
out_basename (str) – Filename(-.pdb) to write out to
protein_pdb_files (List[str]) – Filenames of pdb files corresponding to proteins present in the oxDNA files. Default: []
reverse (bool) – Reverse nucleic acid strands from oxDNA files (If you want ‘correct’ PDB files from backwards oxDNA files). Default: False
hydrogen (bool) – Write hydrogens to output file (Probably false if, for example, exporting for GROMACS simulation). Default: True
uniform_residue_names (bool) – Don’t add ‘5’ and ‘3’ to the names of the terminal residues (True if, for example, exporting for GROMACS simulation). Default: False
one_strand_per_file (bool) – Split each strand into a separate PDB file (appends numbers to out_basename). Default: False
rmsf_file (str) – Write oxDNA (json-formatted from deviations) RMSFs into the b-factor field of the PDB file. Default: ‘’
Pairs to dot-bracket
Convert a dictionary of paired nucleotides to dot-bracket notation |
PDB to oxDNA
Convert a PDB string to an oxDNA Configuration/System |
Persistence length
Computes the persistence length of a bonded sequence of nucleotides. |
|
Returns a normalized vector pointing from base midpoint of the nucid-th base pair to the midpoint of the next base pair. |
|
Fits persistence length from tangent vector correlations |
- oxDNA_analysis_tools.persistence_length.persistence_length(traj_info: TrajInfo, inp_file: str, n1: int, n2: int, ncpus: int = 1) Tuple[float, ndarray]
Computes the persistence length of a bonded sequence of nucleotides.
Note that while n1 and n2 must be on the same strand, there can be multiple distinct duplexes within the strand.
Only paired nucleotides will be considered in the persistence length calculation
- Parameters:
- Returns:
Tuple containing the average contour length and correlations between each pair step
- Return type:
Tuple[float, np.ndarray]
- oxDNA_analysis_tools.persistence_length.get_r(conf, nucid: int, pair_dict: Dict) ndarray
Returns a normalized vector pointing from base midpoint of the nucid-th base pair to the midpoint of the next base pair.
- Parameters:
conf (oxpy.config_info) – The current configuration
nucid (int) – ID of the nucleotide in the first strand to compute the vector from
- Returns:
The vector pointing from the midpoint of nucid’s base pair to the +1 base pair.
- Return type:
np.ndarray
Principle component analysis
Single-value decomposition-based alignment of configurations |
|
Transforms each configuration in a trajectory into a point in principal component space |
|
Produces a heatmat plot of the covariance between every particle |
|
Performs a PCA on a trajectory |
- oxDNA_analysis_tools.pca.align_positions(centered_ref_coords: ndarray, coords: ndarray) ndarray
Single-value decomposition-based alignment of configurations
This one only considers positions, unlike the one in align which also handles the a vectors
- Parameters:
centered_ref_coords (np.array) – reference coordinates, centered on [0, 0, 0]
coords (np.array) – coordinates to be aligned
- Returns:
Aligned coordinates for the given conf
- Return type:
np.ndarray
- oxDNA_analysis_tools.pca.map_confs_to_pcs(ctx: ComputeContext_map, chunk_size: int, chunk_id: int)
Transforms each configuration in a trajectory into a point in principal component space
- Parameters:
ctx (ComputeContext_map) – A compute context which contains trajectory metadata, the align conf and the components
cunk_id (int) – The id of the current chunk
- Returns:
The positions of each frame of the trajectory in principal component space.
- Return type:
np.ndarray
- oxDNA_analysis_tools.pca.make_heatmap(covariance: ndarray)
Produces a heatmat plot of the covariance between every particle
- Parameters:
covariance (numpy.array) – The covariance matrix of particle positions
- Displays:
A matplotlib imshow plot of the covariance
- oxDNA_analysis_tools.pca.pca(traj_info: TrajInfo, top_info: TopInfo, mean_conf: Configuration, ncpus: int = 1) Tuple[ndarray, ndarray, ndarray]
Performs a PCA on a trajectory
- Parameters:
traj_info (TrajInfo) – Information about the trajectory
top_info (TopInfo) – Information about the topology
mean_conf (Configuration) – The mean structure of the trajectory
ncpus (int) – (optional) The number of CPUs to use for the computation
- Returns:
The structures mapped to coordinate space, the eigenvalues and the eigenvectors
- Return type:
Tuple[np.ndarray, np.ndarray, np.ndarray]
Subset trajectory
Splits a trajectory into multiple trajectories, each containing a subset of the particles in the original configuration. |
- oxDNA_analysis_tools.subset_trajectory.subset(traj_info: TrajInfo, top_info: TopInfo, system: System, indexes: List[List[int]], outfiles: List[str], ncpus=1)
Splits a trajectory into multiple trajectories, each containing a subset of the particles in the original configuration.
- Parameters:
traj_info (TrajInfo) – Information about the trajectory
top_info (TopInfo) – Information about the topology
system (System) – The system object describing the topology of the original structure.
indexes (List[List[int]]) – A list of lists of indexes. Each list corresponds to one set of particles to include in one of the output trajectories.
outfiles (List[str]) – A list of output file names. The number of elements in this list must match the number of elements in the indexes list.
ncpus (int) – (optional) The number of CPUs to use
The output trajectories will be written to the files specified in outfiles.
Superimpose
Superimposes one or more structures sharing a topology to a reference structure |
- oxDNA_analysis_tools.superimpose.superimpose(ref: Configuration, victims: List[str], indexes: ndarray = array([], shape=(1, 0), dtype=float64))
Superimposes one or more structures sharing a topology to a reference structure
- Parameters:
ref (Configuration) – the reference configuration to superimpose to
victims (List[Configuration]) – the configurations to superimpose on the reference
indexes (np.ndarray[int]) – the indexes of the particles to superimpose on the reference (default: all)
- Returns:
Aligned configurations
- Return type:
List[Configuration]