Baggianalysis¶

Baggianalysis is a library aimed at simplifying the analysis of particle-based simulations. It makes it easy to parse, convert and analyse trajectories generated by simulation codes in an agnostic way. It is written in C++ and provides Python bindings. It is modular and can be extended from C++ and Python.

Installation

A simple example¶

The following code imports the baggianalysis module, parses a single LAMMPS data file, initialises its topology and performs some computations:

import baggianalysis as ba
import numpy as np
import sys

if len(sys.argv) < 2:
    print("Usage is %s data_file" % sys.argv[0])
    sys.exit(1)

# initialise a parser for LAMMPS data files with atom_style "bond"
parser = ba.LAMMPSDataFileParser("bond")
# parse the system
system = parser.make_system(sys.argv[1])
# initialise the topology from the same file
topology = ba.Topology.make_topology_from_file(sys.argv[1], ba.parse_LAMMPS_topology)
# apply the topology to the system
topology.apply(system)

print("Number of molecules: %d" % len(system.molecules()))
# compute the centres of mass of all the molecules and store them in a list
coms = [mol.com() for mol in system.molecules()]
# print the centres of mass to the "coms.dat" file
np.savetxt("coms.dat", coms)

The library makes it straightforward to work with whole trajectories. Here is an example where we compute the centre of mass of the first molecule of the system averaged over a whole trajectory:

import baggianalysis as ba
import numpy as np
import sys

if len(sys.argv) < 3:
    print("Usage is %s topology_file dir" % sys.argv[0])
    sys.exit(1)
	
parser = ba.LAMMPSDataFileParser("bond")
trajectory = ba.FullTrajectory(parser)
# the first parameter is the directory where the trajectory is stored
# the second parameter is the pattern that will be used to match the filenames
# the third parameter is True if we want to sort the files, False otherwise 
trajectory.initialise_from_folder(sys.argv[2], "no_*", True)
topology = ba.Topology.make_topology_from_file(sys.argv[1], ba.parse_LAMMPS_topology)

com = np.array([0., 0., 0.])
for system in trajectory.frames:
    topology.apply(system)
    com += system.molecules()[0].com()
    
print("The average COM is: %f %f %f" % tuple(com / len(trajectory.frames)))

Note that baggianalysis provides a LazyTrajectory class that parses files one by one to avoid taking up too much memory. This can be useful when dealing with very large trajectories.
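The difference between the two loading strategies can be illustrated without the library. The following pure-Python sketch is only an analogy: `parse_frame`, `lazy_frames` and the one-line-per-frame text format are hypothetical stand-ins, not part of the baggianalysis API. It computes the same kind of average as the example above while keeping a single frame in memory at a time.

```python
import os
import tempfile


def parse_frame(path):
    """Hypothetical parser: each file holds one line with three coordinates."""
    with open(path) as handle:
        return [float(value) for value in handle.readline().split()]


def lazy_frames(paths):
    """Yield frames one by one, so only the current frame is kept in memory."""
    for path in paths:
        yield parse_frame(path)


def average_position(frames):
    """Average a stream of 3D positions without storing the whole stream."""
    total = [0.0, 0.0, 0.0]
    count = 0
    for frame in frames:
        total = [t + x for t, x in zip(total, frame)]
        count += 1
    if count == 0:
        raise ValueError("the trajectory contains no frames")
    return [t / count for t in total]


def write_demo_trajectory(folder, positions):
    """Write one small file per frame and return the sorted list of paths."""
    paths = []
    for index, position in enumerate(positions):
        path = os.path.join(folder, "no_%03d" % index)
        with open(path, "w") as handle:
            handle.write(" ".join(str(x) for x in position) + "\n")
        paths.append(path)
    return sorted(paths)


def demo():
    """Build a two-frame trajectory on disk and average it lazily."""
    with tempfile.TemporaryDirectory() as folder:
        paths = write_demo_trajectory(folder, [[0, 0, 0], [2, 4, 6]])
        return average_position(lazy_frames(paths))
```

Because `lazy_frames` is a generator, each file is opened only when the averaging loop asks for the next frame. Building a list of all parsed frames first would correspond to the "full" strategy.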

Features¶

Supports parsing of Gromacs, LAMMPS and oxDNA configurations and trajectories out of the box. See here for instructions about how to write custom parsers.
Configurations can be pre-filtered (by excluding some particles, or modifying others). See here for a list of filters.
Provides some common (and less common) observables (mean-squared displacement, radial distribution function, bond-order parameters, etc.). See here for the complete list of observables.

Main classes¶

Each particle is an instance of the Particle class.
Simulation snapshots are stored in System objects, whose attributes give access to the properties of the particles they contain.
Multiple systems (also called frames in this context) can be stored in a trajectory object.
The library provides a set of built-in observables that can be used to analyse both single systems and whole trajectories.
The Topology class can be used to manage the topology of a configuration. Topologies can be initialised in two ways:
- by hand, using the make_empty_topology() static method to create a new topology and then adding bonds one after the other with the add_bond method
- by using a helper function to parse the topology out of a file through the make_topology_from_file() static method. Baggianalysis comes with some ready-made functions that can be used to parse topologies.
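To make the role of a hand-built topology more concrete, here is a minimal pure-Python sketch of how a list of bonds can partition particles into molecules (connected components). It is an illustration under stated assumptions, not the library's implementation: `group_into_molecules` is a hypothetical helper, and particles are identified by plain integer indices.

```python
def group_into_molecules(n_particles, bonds):
    """Group particle indices into molecules, given (i, j) bonds.

    Two particles belong to the same molecule if a chain of bonds connects them.
    """
    parent = list(range(n_particles))

    def find(i):
        # follow parent links up to the root, compressing the path on the way
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in bonds:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    molecules = {}
    for i in range(n_particles):
        molecules.setdefault(find(i), []).append(i)
    # sort by the smallest particle index so the output order is deterministic
    return sorted(molecules.values(), key=lambda members: members[0])


# five particles: 0-1-2 form a chain, 3-4 form a dimer
molecules = group_into_molecules(5, [(0, 1), (1, 2), (3, 4)])
# label the groups with the mol_XXX naming scheme, indices starting from zero
names = ["mol_%d" % index for index in range(len(molecules))]
```

Adding bonds one after the other, as with the add_bond method, corresponds here to extending the `bonds` list before grouping.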

Logging¶

Several library methods and functions output some logging information, which by default is printed to the standard error. This behaviour can be altered by using the set_logging_mode() static method. Here are a few examples:

import baggianalysis as ba

ba.set_logging_mode(ba.STDERR) # this is the default

ba.set_logging_mode(ba.SILENT) # switch off logging

ba.set_logging_mode(ba.FILE)   # redirect logging to "ba_log.txt"

ba.set_logging_mode(ba.FILE, "my_log.txt")   # redirect logging to "my_log.txt"

Library API¶

Extending baggianalysis¶

Notes¶

By default, the core library is compiled dynamically. However, if Python bindings are enabled, the core library is compiled statically.
The timestep associated with a configuration must be an integer. If your preferred format stores it as a floating-point number, your parser will have to convert it to an integer. This is by design: the time of a configuration is used as a key in several maps throughout the code, and floating-point numbers make poor keys. Moreover, integers can be stored without any loss of precision, in contrast to floats.
Normal trajectories need not load all the frames at once. Trajectories that do are called “full trajectories”. Many observables do not require access to all frames at once, which means that frames can be parsed (and hence loaded) one by one when needed (lazy loading). This makes it possible to work on big trajectories without using too much memory.
Lists of 3D vectors are copied when accessed from the Python side. This means that their C++ counterparts (which are std::vectors) are not modified when append or similar Python methods are used.
Simple Python parsers can only be used to parse single systems or to initialise trajectories from file lists and folders. In order to do so, parsers should inherit from BaseParser and override the parse_file method, which takes a string as its only argument.
Molecules built by the Topology class are named mol_XXX, where XXX is an index that runs from zero to the number of molecules minus one.
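The reasoning behind the integer-timestep requirement noted above can be seen in plain Python. The sketch below is not library code: `time_to_step` and the value of `dt` are hypothetical, and rounding the ratio of time to integration step is just one conversion a custom parser might use.

```python
# floating-point keys: two ways of computing "the same" time may differ
frames_by_time = {0.1 + 0.2: "frame"}
float_key_found = 0.3 in frames_by_time  # False: 0.1 + 0.2 != 0.3 in binary floats


def time_to_step(time, dt):
    """Convert a floating-point time to an integer step count (hypothetical helper)."""
    return int(round(time / dt))


dt = 0.1  # assumed integration step, chosen only for illustration
frames_by_step = {time_to_step(0.1 + 0.2, dt): "frame"}
int_key_found = time_to_step(0.3, dt) in frames_by_step  # both times map to step 3
```

With integer keys, lookups depend only on the step count and not on how the floating-point time was computed.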