olsonbg/TraceHBonds

TraceHBonds

A program for finding strings of hydrogen-bonded atoms in a trajectory file generated by the Discover or LAMMPS molecular dynamics programs.

Installation

To compile the program

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=release ..
make
src/TraceHBonds

To generate the documentation as html, which will be available at docs/html/index.html, use

make docs

To cross-compile, use your toolchain cmake file as:

cmake -DCMAKE_BUILD_TYPE=release -DCMAKE_TOOLCHAIN_FILE=<Toolchain cmake file> ..

Usage

A typical command line would look like this when using CAR/MDF files:

TraceHBonds --input molecule.arc -p HBonds -s .dat -H h1o -A o2h -r 2.5 -a 90.0 --verbose --all

or this when using LAMMPS trajectory files:

TraceHBonds --input molecule.data --trajectory molecule.lammpstrj --molecule molecule.dat -p HBonds -s .dat -H h1o -A o2h -r 2.5 -a 90.0 --verbose --all

Below is a table of all options available from the command line. Long-form options are preceded by --, and short-form options by a single -. For options with both a long and a short form, either one may be used on the command line.

| Long form | Short form | Option type | Required? | Description |
|---|---|---|---|---|
| input | i | string | yes | The archive file generated from Discover (without .arc), or the LAMMPS data file with the extension. |
| trajectory | t | string | no | The trajectory file generated from LAMMPS; required when loading LAMMPS data. |
| molecule | m | string | no | The molecule file used for defining molecules in the LAMMPS data file; required when loading LAMMPS data. |
| outprefix | p | string | yes | All output will have this string as a prefix to the filenames. For example, to save data as HBonds1.dat, use -p HBonds as the prefix. |
| outsuffix | s | string | yes | All output will have this string as a suffix to the filenames. For example, to save data as HBonds1.dat, use -s .dat as the suffix. |
| rcutoff | r | real number | yes | Set the cutoff length, in angstroms, for the determination of a hydrogen bond (e.g. -r 2.5). |
| anglecutoff | a | real number | yes | Set the cutoff angle, in degrees, for the determination of a hydrogen bond (e.g. -a 90.0). |
| hydrogen | H | string | yes | Set the force field of donor hydrogens for hydrogen bonding (e.g. -H h1o). More than one force field may be used by specifying this option multiple times. NOTE: the short option is a capital 'H'. |
| acceptor | A | string | yes | Set the force field of acceptor atoms for hydrogen bonding. More than one force field may be used by specifying this option multiple times (e.g. -A o2h -A o1=). NOTE: the short option is a capital 'A'. |
| bins | b | integer | no | Minimum number of bins to show in histograms (e.g. -b 20). |
| povray | | | no | Output in POV-Ray format; relevant for --sizehist only. |
| json | | | no | Output in JSON format; relevant for --sizehist only. Useful for processing with Python, and with the Blender scripts in the blender/ directory. |
| jsonall | | | no | Save the chemical structure for all frames in JSON format. Useful for processing with Python, and with the Blender scripts in the blender/ directory. |
| incell | | | no | Apply PBC to all hydrogen bond chains (each chain will start inside the PBC cell). Relevant for --sizehist only. |
| verbose | | | no | Show verbose messages while running. |
| brief | | | no | Show brief messages while running. |
| lifetime | | | no | Calculate hydrogen bond lifetime correlations. |
| lengths | | | no | Save the length of all hydrogen bonds. |
| angles | | | no | Save the angle of all hydrogen bonds. |
| list | | | no | Save a list of all hydrogen bonds. |
| sizehist | | | no | Save hydrogen bond strings and histograms. |
| neighborhist | | | no | Save neighbor length lists. |
| all | | | no | Do all calculations and save all data. |
| help | h | | no | Show the help screen. |
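
Because -H and -A may be repeated, it can be convenient to assemble the command line in a script. The following is a minimal sketch, not part of TraceHBonds: the helper name build_command is hypothetical, and it only uses the short options documented in the table above, with the values from the typical CAR/MDF command line.

```python
import shlex


def build_command(input_name, prefix, suffix, hydrogens, acceptors,
                  rcutoff, anglecutoff, extra_flags=()):
    """Assemble a TraceHBonds argument list (hypothetical helper)."""
    cmd = ["TraceHBonds", "--input", input_name,
           "-p", prefix, "-s", suffix,
           "-r", str(rcutoff), "-a", str(anglecutoff)]
    # -H and -A may be given several times, once per force field type.
    for hydrogen in hydrogens:
        cmd += ["-H", hydrogen]
    for acceptor in acceptors:
        cmd += ["-A", acceptor]
    cmd += list(extra_flags)
    return cmd


cmd = build_command("molecule.arc", "HBonds", ".dat",
                    hydrogens=["h1o"], acceptors=["o2h", "o1="],
                    rcutoff=2.5, anglecutoff=90.0,
                    extra_flags=["--verbose", "--all"])
# Print a shell-ready string; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Keeping the arguments as a list avoids shell-quoting problems with force field names such as o1=.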

Input

The standard CAR/MDF and LAMMPS trajectory files are recognized. If compiled with support for LZMA, GZIP, and/or BZIP2, compressed files may be read and written.

Discover

The standard CAR/MDF files are read.

LAMMPS

Data file (--input)

The --input option, when loading a LAMMPS file, defines the name of the data file. Each line of the Masses section of the data file must end with a comment; the comment is used as the force field type for atom selection with the --hydrogen and --acceptor flags.

An example Masses section of a LAMMPS data file:

Masses

   1  15.999400 # o
   2  12.011150 # c2
   3  12.011150 # c
   4   1.007970 # h
   5  12.011150 # cPrime
   6  12.011150 # c3
   7  15.999400 # oPrime
   8  15.999400 # oh
   9   1.007970 # ho
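
To check which force field types a data file will expose to --hydrogen and --acceptor, the Masses section can be read with a few lines of Python. This is only a sketch under the assumption that the section looks like the example above; parse_masses is a hypothetical helper, not the parser TraceHBonds itself uses.

```python
def parse_masses(lines):
    """Map atom-type id -> (mass, force field label from the comment)."""
    types = {}
    in_masses = False
    for raw in lines:
        line = raw.strip()
        if not in_masses:
            # Look for the "Masses" section header.
            in_masses = line.split("#")[0].strip() == "Masses"
            continue
        if not line:
            if types:
                break  # blank line after the entries ends the section
            continue   # blank line between header and entries
        body, sep, comment = line.partition("#")
        fields = body.split()
        if len(fields) != 2 or not fields[0].isdigit():
            break  # reached the next section header
        if not sep or not comment.strip():
            raise ValueError(f"atom type {fields[0]} has no comment")
        types[int(fields[0])] = (float(fields[1]), comment.strip())
    return types


sample = """Masses

   1  15.999400 # o
   8  15.999400 # oh
   9   1.007970 # ho

Atoms
""".splitlines()

masses = parse_masses(sample)
print(sorted(label for _, label in masses.values()))
```

Raising an error on a missing comment mirrors the requirement that every type carry a label.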

Trajectory (--trajectory)

The --trajectory option defines the name of a LAMMPS trajectory file. The trajectory file must be saved by a dump custom or custom/gz command with atom attributes starting with 'id mol x y z'; any other atom attributes may follow.
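
LAMMPS text dump files list their per-atom columns on the "ITEM: ATOMS" header line, so the column requirement can be verified before a long run. The check below is a small sketch; the function name has_required_columns is hypothetical and is not part of TraceHBonds.

```python
REQUIRED = ["id", "mol", "x", "y", "z"]


def has_required_columns(lines):
    """Return True if the first ITEM: ATOMS header starts with id mol x y z."""
    for line in lines:
        if line.startswith("ITEM: ATOMS"):
            columns = line.split()[2:]  # drop "ITEM:" and "ATOMS"
            return columns[:len(REQUIRED)] == REQUIRED
    return False  # no atom header found at all


good = ["ITEM: TIMESTEP", "0", "ITEM: ATOMS id mol x y z vx vy vz"]
bad = ["ITEM: TIMESTEP", "0", "ITEM: ATOMS id type x y z"]
print(has_required_columns(good), has_required_columns(bad))
```

For a real file, pass an open file object (or gzip.open(..., "rt") for a custom/gz dump) instead of the sample lists.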

Molecule (--molecule)

The --molecule option is used when loading LAMMPS trajectory and data files, and loads a file that defines which atoms belong to which molecule. Although the LAMMPS data file has a Molecule-Id column for defining molecules, the msi2lmp script writes residue IDs there. This file is therefore required when using LAMMPS files, whether msi2lmp is used or not.

The example file below defines 4 molecules and their associated atoms. The atoms must be in sequence for each molecule.

# Molecule FirstAtom LastAtom
     1      1    248
     2    249    496
     3    497    744
     4    745    992
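
The molecule file is simple enough to read or sanity-check from a script. The sketch below assumes the three-column layout shown above, with lines starting with # treated as comments and the atom ranges taken as inclusive; both function names are hypothetical.

```python
def parse_molecule_file(lines):
    """Return a list of (molecule, first_atom, last_atom) tuples."""
    ranges = []
    for raw in lines:
        line = raw.split("#")[0].strip()  # strip comments such as the header
        if not line:
            continue
        molecule, first, last = (int(field) for field in line.split())
        ranges.append((molecule, first, last))
    return ranges


def molecule_of(atom_id, ranges):
    """Look up which molecule an atom id belongs to (None if unassigned)."""
    for molecule, first, last in ranges:
        if first <= atom_id <= last:
            return molecule
    return None


sample = """# Molecule FirstAtom LastAtom
     1      1    248
     2    249    496
     3    497    744
     4    745    992
""".splitlines()

ranges = parse_molecule_file(sample)
print(molecule_of(249, ranges), molecule_of(992, ranges))
```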

Output

Depending upon the command line options, many data files may be generated. Descriptions of all calculations are listed below.

Hydrogen bond strings (--sizehist)

The --sizehist option calculates hydrogen bonds, traces the hydrogen bonds into connected strings, and tabulates the sizes.

Files Created

  • <prefix>#<suffix>

Where # indicates the frame of the trajectory. For a trajectory containing 100 frames, 100 files will be generated.

Description of files

The files generated from the --sizehist option consist of two parts: the individual chains and their atoms, and histograms of chain sizes.

Individual Chains

As an example, a few lines from a data file follow, showing 3 chains. The output will look a little different if either the --povray or the --json option is used.

# Current Element : 589
# Atoms in Chain : 3
# Molecules : 2
# Unique forcefields : 3
# Times chain switched between Molecules (switching) : 1
# Periodic boundary conditions applied.
  21.6276   27.0261   44.9056 [O]  Molecule  O87  o2h
  21.3585   27.9405   44.8265 [H]  Molecule  H592  h1o
  20.1321   29.5776   43.8281 [O]  Molecule2  O1493  o1=
# Chain end-to-end distance: 3.147660


# Current Element : 590
# Atoms in Chain : 9
# Molecules : 2
# Unique forcefields : 3
# Times chain switched between Molecules (switching) : 1
# Periodic boundary conditions applied.
  24.6835   23.4992   41.1162 [O]  Molecule  O95  o2h
  24.2312   22.9546   41.6786 [H]  Molecule  H601  h1o
  23.0622   23.9573   43.4825 [O]  Molecule  O88  o2h
  22.4585   23.9183   44.2600 [H]  Molecule  H593  h1o
  21.5617   24.1262   45.5633 [O]  Molecule9  O8207  o2h
  21.6362   24.7840   46.3132 [H]  Molecule9  H8711  h1o
  22.6499   25.4540   47.5703 [O]  Molecule9  O8208  o2h
  23.5335   25.7281   47.6493 [H]  Molecule9  H8712  h1o
  25.0893   24.1132   46.6706 [O]  Molecule9  O8209  o1=
# Chain end-to-end distance: 5.602967


# Current Element : 591
# Atoms in Chain : 5
# Molecules : 2
# Unique forcefields : 2
# Times chain switched between Molecules (switching) : 1
# Periodic boundary conditions applied.
  28.8885   22.9984   40.5006 [O]  Molecule  O96  o2h
  29.6141   22.3548   40.3725 [H]  Molecule  H602  h1o
  30.7792   21.1779   38.8075 [O]  Molecule6  O5419  o2h
  31.6526   21.1038   39.2359 [H]  Molecule6  H5925  h1o
  33.3777   21.2051   38.4486 [O]  Molecule6  O5420  o2h
# Chain end-to-end distance: 5.251614
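
The chain blocks above can be read back for further analysis. The sketch below assumes each atom line has exactly seven whitespace-separated fields (x, y, z, element, molecule, atom name, force field type), as in the sample; parse_chains is a hypothetical helper. It recomputes the end-to-end distance of the first chain as the straight-line distance between its first and last atoms, which agrees with the reported value to within the rounding of the printed coordinates.

```python
import math


def parse_chains(lines):
    """Split --sizehist text output into chains, each a list of atom tuples."""
    chains = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("# Current Element"):
            chains.append([])  # a new chain block starts here
        elif line and not line.startswith("#") and chains:
            x, y, z, element, molecule, name, fftype = line.split()
            position = (float(x), float(y), float(z))
            chains[-1].append((position, element.strip("[]"),
                               molecule, name, fftype))
    return chains


sample = """# Current Element : 589
# Atoms in Chain : 3
  21.6276   27.0261   44.9056 [O]  Molecule  O87  o2h
  21.3585   27.9405   44.8265 [H]  Molecule  H592  h1o
  20.1321   29.5776   43.8281 [O]  Molecule2  O1493  o1=
# Chain end-to-end distance: 3.147660
""".splitlines()

chain = parse_chains(sample)[0]
end_to_end = math.dist(chain[0][0], chain[-1][0])
print(f"{end_to_end:.4f}")
```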

Chain Histograms

This histogram shows how many chains of a specific length there are in this frame. It also shows how many chains form closed loops (begin and end at the same hydrogen bond). This particular data had no closed loops.

# Atoms/HBonds |Count| (For all Chains, including Closed Loops)
#    3 /   1   |  175|**************************************************************
#    5 /   2   |   72|**************************
#    7 /   3   |   19|*******
#    9 /   4   |   16|******
#   11 /   5   |    7|**
#   13 /   6   |    1|
#   15 /   7   |    1|
#   17 /   8   |    1|
#
# Atoms/HBonds |Count| (For Closed Loops)
#

The next series of histograms shows how many times a hydrogen bond chain switches to another molecule, for each chain length. A switching of 0 means the chain was on a single molecule. A switching of 1 means it started on one molecule and ended on another. The switching number does not indicate how many molecules a chain is composed of, since a chain may switch back and forth between two molecules multiple times.

# Switching |Count| (For Chain length of 3)
#    0      |  107|**************************************************************
#    1      |   68|***************************************
#
# Switching |Count| (For Chain length of 5)
#    0      |   18|************************
#    1      |   46|**************************************************************
#    2      |    8|***********
#
# Switching |Count| (For Chain length of 7)
#    0      |    8|**************************************************************
#    1      |    6|***********************************************
#    2      |    5|***************************************

The following histograms show how many molecules a chain is composed of, for each chain length.

# Molecules |Count| (For Chain length of 3)
#    1      |  107|**************************************************************
#    2      |   68|***************************************
#
# Molecules |Count| (For Chain length of 5)
#    1      |   18|**********************
#    2      |   50|**************************************************************
#    3      |    4|*****
#
# Molecules |Count| (For Chain length of 7)
#    1      |    8|**************************************************************
#    2      |    8|**************************************************************
#    3      |    3|***********************
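
The difference between the switching count and the molecule count can be illustrated with a short sketch. It assumes a switch is counted whenever two consecutive atoms in a chain belong to different molecules, which is an interpretation of the description above rather than a statement about the program's internals; the function name is hypothetical.

```python
def switching_and_molecules(molecule_sequence):
    """Return (number of switches, number of distinct molecules) for a chain."""
    switches = sum(1 for current, following
                   in zip(molecule_sequence, molecule_sequence[1:])
                   if current != following)
    return switches, len(set(molecule_sequence))


# Molecule column of the 9-atom chain (element 590) in the sample output.
chain_590 = ["Molecule"] * 4 + ["Molecule9"] * 5
print(switching_and_molecules(chain_590))

# A made-up 5-atom chain that leaves molecule A and then returns to it.
back_and_forth = ["A", "A", "B", "B", "A"]
print(switching_and_molecules(back_and_forth))
```

The second case is why, for chains of length 5 in the sample histograms, 8 chains have a switching of 2 while only 4 chains span 3 molecules: the remaining 4 returned to the molecule they started on, giving 46 + 4 = 50 chains on 2 molecules.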

Generating Images

Using Blender

The program Blender can be used to convert the output of --sizehist --json to an image or even a movie of the image rotating. Here is an example image using the hbchain.py script in the blender directory (I need to add a legend):

Figure 1: Hydrogen bond strings, colored by chain length. Rendered with Blender

The image was created by Blender using

blender -P blender/hbchain.py -- -f output/HBonds1.json

where output/HBonds1.json is the output of TraceHBonds with the appropriate options.

For help with the Blender script, try

blender -P hbchain.py -- --help

Using POV-Ray

The program POV-Ray can be used to convert the output of --sizehist --povray to an image or even a movie of the image rotating. Here is an example image using the prettybox.pov script in the povray directory:

Figure 2: Hydrogen bond strings, colored by chain length. Rendered with POV-Ray.

The legend text was added with GIMP.

Neighbor Distance in chains (--neighborhist)

The --neighborhist option calculates the distance between non-hydrogen atoms in the hydrogen bond chains. For a chain consisting of only oxygen and hydrogen atoms, this would calculate the neighbor distances between oxygen atoms in the chain.
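
As a rough illustration of the calculation, the sketch below takes the non-hydrogen atoms of one chain, in chain order, and collects the distances between each atom and its Nth neighbor along the chain. The function name is hypothetical and the coordinates are the three oxygen atoms of a five-atom chain from the --sizehist sample output; this is not the program's own implementation.

```python
import math


def neighbor_distances(heavy_atoms):
    """Map N -> list of distances between atoms N steps apart in the chain."""
    result = {}
    for n in range(1, len(heavy_atoms)):
        result[n] = [math.dist(heavy_atoms[i], heavy_atoms[i + n])
                     for i in range(len(heavy_atoms) - n)]
    return result


oxygens = [
    (28.8885, 22.9984, 40.5006),  # O96
    (30.7792, 21.1779, 38.8075),  # O5419
    (33.3777, 21.2051, 38.4486),  # O5420
]

distances = neighbor_distances(oxygens)
for n, values in distances.items():
    print(n, [round(value, 4) for value in values])
```

A five-atom chain has three non-hydrogen atoms, so it contributes two first-neighbor distances and one second-neighbor distance. The second-neighbor distance here is also the chain's end-to-end distance.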

Files Created

  • <prefix>-NN-AllFrames<suffix>
  • <prefix>-NN-Combined<suffix>
  • <prefix>-NN-only<suffix>

All files contain tab delimited text.

Description of files

The data start with statistics for each individual frame in a trajectory (NN-AllFrames), then for all frames together (NN-Combined), and finally for all chain lengths combined into a single list of neighbor distances (NN-only).

NN-AllFrames

The table in this file consists of a Count, Average, and StdDev column for each frame of the trajectory, so the table can have many columns if a trajectory with many frames is used. It's not uncommon to read 100, 1000, or more frames. Sample output is shown below, in table format for easy viewing. Note: Only the first 2 of 1000 frames of a large trajectory file are shown, and only the first 13 of 27 atoms in chain for this particular sample trajectory file.

n.n. = Nearest neighbor, and f is the frame number.

| Atoms in chain | Nth n.n. | Count(f=0) | Average(f=0) | StdDev(f=0) | Count(f=1) | Average(f=1) | StdDev(f=1) |
|---|---|---|---|---|---|---|---|
| 3 | 1 | 175 | 2.76054 | 0.193364 | 181 | 2.74879 | 0.189819 |
| 5 | 1 | 144 | 2.75077 | 0.197673 | 154 | 2.76424 | 0.204004 |
| | 2 | 72 | 4.75806 | 0.635903 | 77 | 4.64895 | 0.732237 |
| 7 | 1 | 57 | 2.7841 | 0.201227 | 60 | 2.75164 | 0.22737 |
| | 2 | 38 | 5.01798 | 0.581104 | 40 | 4.78955 | 0.619429 |
| | 3 | 19 | 6.88486 | 0.948441 | 20 | 6.28423 | 1.4411 |
| 9 | 1 | 64 | 2.7419 | 0.195794 | 52 | 2.75299 | 0.217351 |
| | 2 | 48 | 4.70959 | 0.743358 | 39 | 4.64655 | 0.670621 |
| | 3 | 32 | 6.49354 | 1.32184 | 26 | 6.21771 | 1.27282 |
| | 4 | 16 | 8.07276 | 1.88673 | 13 | 7.66016 | 2.06277 |
| 11 | 1 | 35 | 2.69268 | 0.130133 | 25 | 2.69679 | 0.159238 |
| | 2 | 28 | 4.6182 | 0.587221 | 20 | 4.7146 | 0.696855 |
| | 3 | 21 | 6.43492 | 0.672441 | 15 | 6.60273 | 1.07617 |
| | 4 | 14 | 8.2164 | 1.09446 | 10 | 8.3314 | 1.70064 |
| | 5 | 7 | 9.65632 | 1.52155 | 5 | 9.95754 | 2.07065 |
| 13 | 1 | 6 | 2.76977 | 0.150354 | 12 | 2.78778 | 0.272727 |
| | 2 | 5 | 4.82386 | 0.386727 | 10 | 4.65379 | 0.546978 |
| | 3 | 4 | 6.49485 | 0.499925 | 8 | 6.08429 | 0.413477 |
| | 4 | 3 | 8.00401 | 0.466091 | 6 | 7.38176 | 1.13275 |
| | 5 | 2 | 9.38291 | 0 | 4 | 8.29956 | 2.52054 |
| | 6 | 1 | 9.95807 | 0 | 2 | 9.34289 | 0 |

NN-Combined

The table in this file combines data from all the frames into single Count, Average, and StdDev columns. Sample data is shown below, in table format for easy viewing. Note: Only the first 13 of 27 atoms in a chain are shown for this particular sample trajectory file.

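| Atoms in chain | Nth n.n. | Count | Average | StdDev |
|---|---|---|---|---|
| 3 | 1 | 22655 | 2.86888 | 0.159418 |
| 5 | 1 | 22230 | 2.86208 | 0.159733 |
| | 2 | 11115 | 4.68886 | 0.70739 |
| 7 | 1 | 19182 | 2.85872 | 0.159745 |
| | 2 | 12788 | 4.67906 | 0.703396 |
| | 3 | 6394 | 6.07321 | 1.31757 |
| 9 | 1 | 15888 | 2.85741 | 0.158921 |
| | 2 | 11916 | 4.68648 | 0.702248 |
| | 3 | 7944 | 6.08516 | 1.26822 |
| | 4 | 3972 | 7.19271 | 1.89908 |
| 11 | 1 | 12900 | 2.85904 | 0.159156 |
| | 2 | 10320 | 4.68125 | 0.705725 |
| | 3 | 7740 | 6.08836 | 1.27751 |
| | 4 | 5160 | 7.21112 | 1.88283 |
| | 5 | 2580 | 8.16827 | 2.45298 |
| 13 | 1 | 10452 | 2.85698 | 0.15854 |
| | 2 | 8710 | 4.67941 | 0.690471 |
| | 3 | 6968 | 6.06397 | 1.27943 |
| | 4 | 5226 | 7.13685 | 1.88829 |
| | 5 | 3484 | 8.06878 | 2.4206 |
| | 6 | 1742 | 8.9083 | 2.88292 |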

NN-only

This file combines all frames, as in NN-Combined, and also all values of the 'Atoms in chain' column, giving a complete nearest neighbor table. Sample output is shown below, in table format for easy viewing. Only the first 6 of 13 rows are shown for this particular sample trajectory file.

Nth n.n. Count Average StdDev
1 145342 2.85969 0.159263
2 92595 4.68331 0.700193
3 62503 6.08867 1.27689
4 43526 7.22511 1.85633
5 30943 8.2102 2.36345
6 22332 9.11149 2.79441

Hydrogen bond lengths (--lengths)

The --lengths option calculates the length of all hydrogen bonds (hydrogen-acceptor distance), in every frame.

File Created

  • <prefix>-lengths<suffix>

Description of file

Single column of data listing the hydrogen bond lengths in angstroms.

Hydrogen bond angles (--angles)

The --angles option calculates the angle of all hydrogen bonds, in every frame.

File Created

  • <prefix>-angles<suffix>

Description of file

Single column of data listing the hydrogen bond angles in degrees.

Hydrogen bond list (--list)

The --list option calculates the list of all hydrogen bonds, in every frame.

File Created

  • <prefix>-list<suffix>

Description of file

Seven (7) column data, in tab delimited format. The columns are:

  1. Frame number, starting with 0.
  2. Molecule of atom connected to donor hydrogen.
  3. Name of atom connected to donor hydrogen.
  4. Molecule of donor hydrogen atom.
  5. Name of donor hydrogen atom.
  6. Molecule of acceptor atom.
  7. Name of acceptor atom.
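
A file with this layout can be loaded for post-processing along the following lines. This is a sketch only: the field names in FIELDS are hypothetical labels for the seven columns described above, and the two sample records are illustrative (the atom names are borrowed from the --sizehist example, the frame numbers are made up).

```python
from collections import Counter

# Hypothetical names for the seven documented columns.
FIELDS = ("frame", "donor_molecule", "donor_atom",
          "hydrogen_molecule", "hydrogen_atom",
          "acceptor_molecule", "acceptor_atom")


def read_hbond_list(lines):
    """Parse tab-delimited --list output into a list of dicts."""
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and any comment lines
        parts = line.split("\t")
        if len(parts) != len(FIELDS):
            raise ValueError(f"expected 7 columns, got {len(parts)}")
        record = dict(zip(FIELDS, parts))
        record["frame"] = int(record["frame"])
        records.append(record)
    return records


sample = [
    "0\tMolecule\tO87\tMolecule\tH592\tMolecule2\tO1493",
    "0\tMolecule\tO96\tMolecule\tH602\tMolecule6\tO5419",
    "1\tMolecule\tO87\tMolecule\tH592\tMolecule2\tO1493",
]

records = read_hbond_list(sample)
bonds_per_frame = Counter(record["frame"] for record in records)
print(dict(bonds_per_frame))
```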

Hydrogen bond lifetime correlations (--lifetime)

The --lifetime option calculates the continuous and intermittent hydrogen bond lifetime autocorrelation.

File Created

  • <prefix>-lifetimes<suffix>

Description of file

Three column data, in tab delimited format. The columns are:

  1. Frame number, starting with 0.
  2. Continuous hydrogen bond lifetime correlation
  3. Intermittent hydrogen bond lifetime correlation
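
One common way to summarize such a correlation curve is to find the first frame at which it falls below 1/e. The sketch below reads the three documented columns and applies that threshold; the helper names are hypothetical, the 1/e criterion is a conventional choice rather than something TraceHBonds reports, and the sample values are synthetic, not program output.

```python
import math


def read_lifetimes(lines):
    """Parse rows of (frame, continuous, intermittent) correlations."""
    rows = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and any comment lines
        frame, continuous, intermittent = line.split()
        rows.append((int(frame), float(continuous), float(intermittent)))
    return rows


def first_frame_below(rows, column, threshold=1 / math.e):
    """Return the first frame whose value in `column` is below threshold."""
    for row in rows:
        if row[column] < threshold:
            return row[0]
    return None  # the curve never decays below the threshold


# Synthetic values for illustration only.
sample = ["0\t1.0\t1.0", "1\t0.5\t0.8", "2\t0.2\t0.6", "3\t0.1\t0.3"]

rows = read_lifetimes(sample)
print(first_frame_below(rows, column=1), first_frame_below(rows, column=2))
```

Multiplying the returned frame number by the time between saved frames gives a decay time in physical units.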
