Utilities

The utils folder contains scripts and programs that provide additional functionality for working with oxDNA.

generate-sa.py

This script generates starting configurations for DNA strands in a simulation box. It supports single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) with blunt ends, and dsDNA with sticky ends.

Note

This script generates old-format topology files.

Usage

python generate-sa.py <box_size> <sequences_file>

<box_size>: The side length of the cubic simulation box.
<sequences_file>: A file containing the DNA sequences, one per line, with optional prefixes.

The script reads the sequences from the file and generates positions for the nucleotides, ensuring no overlaps occur. The input file format supports:

Plain sequences (e.g., ATATATA) for single strands.
DOUBLE <sequence> for double-stranded DNA with blunt ends.
STICKY <sticky_length> <top_sequence> <bottom_sticky_end_sequence> for double-stranded DNA with sticky ends of the specified length. As an example, STICKY 4 AAAAACGACG TTTT will generate a double strand with AAAA and TTTT sticky ends.

convert.py

This script converts oxDNA topology and configuration files between the old and new formats.

Usage

python convert.py <topology> <configuration> [-p prefix] [-i]

<topology>: The topology file to convert.
<configuration>: The configuration file to convert.
-p prefix: Optional prefix for output files (default: “converted_”).
-i: Write output files to the input directory instead of current directory.

The script automatically detects the format of the input files and converts them accordingly.

decompress_zstd.py

This script decompresses oxDNA files that have been compressed using Zstandard compression, which is done by either setting trajectory_compression = true in the main input file (for trajectories), or by setting compress = true in data_output_* observable sections. It requires the zstandard library.

Usage

python decompress_zstd.py <compressed_file> [output_file]

<compressed_file>: The compressed oxDNA file to decompress.
[output_file]: Optional output file name. If not provided, the output will be written to <compressed_file>.decompressed.

The script reads the custom header (magic bytes ‘OXD\x01’, version, compression level) and decompresses the Zstandard-compressed data stream.

Compressing trajectory files

Overview

Simulation trajectory files often become extremely large because they store:

floating‑point particle coordinates
simulation time values
simulation box vectors
energy values
particle state values
repeated structural data across frames

Standard trajectory formats store numbers as ASCII text, which is inefficient both in storage and parsing performance.

Here we provide two utility programs that can be used to compress and decompress oxDNA trajectory files using a custom format developed by Victor Slivinskis. This custom format, dubbed VTJ1, significantly reduces file size by combining several compression techniques specifically designed for simulation trajectories:

Fixed‑point quantization
Delta compression
Predictive coding
Variable‑length integer encoding
Global metadata storage

Because molecular simulations typically produce smooth particle motion between frames, these techniques achieve very high compression ratios while maintaining precise numerical reconstruction.

Important

The utils/PostSimulationCompression/README.md file contains additional details about the VTJ1 format.

Benchmarks

The benchmark figure below summarizes the encoder performance across several trajectory sizes.

_images/VTJ1_benchmark.png — The datasets used in the benchmark are summarized below.

Original Size (MB)

Frames

Particles

2

1200

3

809

200

15449

1553

300

19499

5103

1000

19499

40643

7175

21649

The benchmark compares:

Compression Ratio vs Precision: Lower precision (fewer digits) increases compression efficiency, while higher precision preserves more numerical detail.

Encoding Time per Frame: Encoding time is normalized by frame count, allowing fair comparison between datasets with different trajectory lengths.

These results demonstrate the tradeoff between precision, compression efficiency, and computational cost across a wide range of simulation sizes.

Compilation

The folder includes a `Makefile` that simplifies compilation of the encoder and decoder.

Run `make` to compile the two programs, `encode_trj` and `decode_trj`.

By default, compilation uses the GCC compiler and the following optimization flags:

`-O3`

`-march=native`

`-ffast-math`

`-funroll-loops`

`-std=c11`

These flags maximize performance for trajectory processing workloads. Edit the `Makefile` to customise compilation. To remove the compiled binaries run `make clean`

Usage

Encode a Trajectory

`./encode_trj input.dat output.bin n_cols scale_digits`

Parameters:

Parameter

Description

input.dat

plaintext trajectory

output.bin

compressed binary output

n_cols

number of columns per particle row

scale_digits

decimal precision retained

Example:

`./encode_trj traj.dat traj.bin 15 14`

Decode a Trajectory

`./decode_trj input.bin output.dat`

Example:

`./decode_trj traj.bin traj_reconstructed.dat`

Original Size (MB)	Frames	Particles
2	1200	3
809	200	15449
1553	300	19499
5103	1000	19499
40643	7175	21649

Parameter	Description
input.dat	plaintext trajectory
output.bin	compressed binary output
n_cols	number of columns per particle row
scale_digits	decimal precision retained

Utilities

generate-sa.py

Usage

convert.py

Usage

decompress_zstd.py

Usage

Compressing trajectory files

Overview

Benchmarks

Compilation

Usage

Encode a Trajectory

Decode a Trajectory