(Note: These tutorials are meant to provide illustrative examples of how to use the AMBER software suite to carry out simulations that can be run on a simple workstation in a reasonable period of time. They do not necessarily provide the optimal choice of parameters or methods for the particular application area.)
Copyright Ross Walker 2015

Section 3: Running Minimization and Molecular Dynamics (in vacuo)

Updated for AMBER 15

This section will introduce sander and show how it can be used for minimization and molecular dynamics of our previously created DNA models. We will initially run our simulations on the in vacuo model and analyze the results before moving on to running simulations with implicit and explicit solvent. For this section of the tutorial we shall be using the in vacuo prmtop and inpcrd files we created previously (polyAT_vac.prmtop and polyAT_vac.rst7).

This section of the tutorial will consist of 3 stages:

Relaxing the system prior to MD
Molecular dynamics at constant temperature
Analyzing the results

3.1) Relaxing the System Prior to MD

In the previous section we used NAB to build us a starting structure. Since this "default" geometry may not correspond to the actual minima in the force field we are using and may also result in conflicts and overlaps with atoms in other residues, it is always a good idea to minimize the structure before commencing molecular dynamics. Failure to successfully minimize the structure may lead to instabilities when we run MD.

So, given the in vacuo prmtop (polyAT_vac.prmtop) and rst7 (polyAT_vac.rst7) files we created, we will now use sander to conduct a short minimization run. Since we just want to "fix up" the positions of the atoms in order to remove any bad contacts that may lead to unstable molecular dynamics we will run a short (500 steps) minimization. This will take us towards the closest local minima. Minimization with sander will only ever take you to the nearest minima, it cannot cross transition states to reach lower minima. This is fine for our purposes, however, since all we want to do is remove the largest strains in the system.

The basic usage for sander is as follows:

sander [-O] -i mdin -o mdout -p prmtop -c rst7 -r restrt
[-ref refc] [-x nc] [-v mdvel] [-e mden] [-inf mdinfo]

Arguments in []'s are optional
If an argument is not specified, the default name will be used.
-O overwrite all output files (the default behavior is to quit if any output files already exist)
-i the name of the input file (which describes the simulation options), mdin by default.
-o the name of the output file, mdout by default.
-p the parameter/topology file, prmtop by default.
-c the set of initial coordinates for this run, nc by default.
-r the final set of coordinates from this MD or minimization run, restrt by default.
-ref reference coordinates for positional restraints, if this option is specified in the input file, refc by default.
-x the molecular dynamics trajectory file (if running MD), nc by default.
-v the molecular dynamics velocities file (if running MD), mdvel by default.
-e a summary file of the energies (if running MD), mden by default.
-inf a summary file written every time energy information is printed in the output file for the current step of the minimization of MD, useful for checking on the progress of a simulation, mdinfo by default.

3.1.1) Building the mdin input file

Now that we have the prmtop and rst7 files from xleap, all we need to run sander is the mdin file which specifies the myriad of possible options for this run.

The run time input to control sander is via "namelist" variables (for more information see the manual) specified in the mdin file. For example:

Test run 1
&cntrl
IMIN = 1, NCYC = 250, MAXCYC = 500
/

In the absence of a variable specification in the input file, default values are chosen; every specified variable, except the last one, needs to be followed by a comma. The sander manual describes all of these inputs, for each of the possible namelists. The namelist used depends on the specification above, such as &cntrl or &ewald. At a minimum the &cntrl namelist must be specified. Also notice the space or empty first column before specification of the namelist control variable; this is necessary. It is also necessary to end each namelist with a forward slash /. Other namelists (such as the &ewald namelist) are optional. After the namelist some other information may be specified, such as "GROUP" input, which allows atom selections for restraints. Note that the input variable and namelists have changed somewhat from earlier versions of AMBER. Refer to the sander section of the manual for more information.

Next we will build a minimal input file for performing minimization on our DNA. In theory, it would probably be best to run a dual stage minimization where we initially use position restraints on all the heavy atoms so that in the first stage of minimization only the hydrogens that xleap added are minimized. Then in the second stage, we allow minimization of all atoms in the system. Since our system is fairly small and simplistic, it should be fine to skip the first stage and just minimize everything. An example of such a two-stage minimization approach will be given when we run simulations on our more complex solvated model in the next section.

To create our input file: To turn on minimization, we specify IMIN = 1. We want a fairly short minimization since we don't actually need to reach the minima, but just want to move away from any local maxima, so we select 500 steps of minimization by specifying MAXCYC = 500. Sander supports a number of different minimization algorithms, the most commonly used being steepest descent and conjugate gradient. The steepest descent algorithm is good for quickly removing the largest strains in the system but it also converges slowly when close to a minima. Here the conjugate gradient method is more efficient. The use of these two algorithms can be controlled using the NCYC flag. If NCYC < MAXCYC, sander will use the steepest descent algorithm for the first NCYC steps before switching to the conjugate gradient algorithm for the remaining (MAXCYC - NCYC) steps. In this case we will run an equal number of steps with each algorithm so we set NCYC = 250. Since sander assumes that the system is periodic by default we need to explicitly turn this off (NTB = 0). In this simulation we will be using a constant dielectric and not an implicit (or explicit) solvent model so we set IGB = 0 (no generalized born solvation model). This is the default so we don't strictly need to specify this, but I will include it here so that we can see what differences we have in the input file when we switch on implicit solvent later. We also need to choose a value for the non-bonded cut off. A larger cut off introduces less error in the non-bonded force evaluation but increases the computational complexity and thus calculation time. 12 angstroms would seem like a reasonable tradeoff for gas phase so that is what we will initially use (CUT = 12). Here is what the input file looks like:

polyA-polyT 10-mer: initial minimization prior to MD
&cntrl
imin   = 1,
maxcyc = 500,
ncyc   = 250,
ntb    = 0,
igb    = 0,
cut    = 12
/

So, now to run sander for minimization.

3.1.2) Running sander for the first time

To run sander, we simply execute the following:

$AMBERHOME/bin/sander -O -i polyAT_vac_init_min.in -o polyAT_vac_init_min.out -c polyAT_vac.rst7 -p polyAT_vac.prmtop -r polyAT_vac_init_min.ncrst

Input files: polyAT_vac_init_min.in, polyAT_vac.rst7, polyAT_vac.prmtop
Output files: polyAT_vac_init_min.out, polyAT_vac_init_min.ncrst

This should run very quickly. On a 1.4 GHz Intel Core i5 machine it takes about 2.2 seconds

Take a look at the output file produced during the minimization (polyAT_vac_init_min.out). You will see that the energy dropped considerably between the first and last steps:

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -1.4123E+03     1.9444E+01     9.4276E+01     C2'       634

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
    500      -2.3331E+03     6.1397E-01     3.8981E+00     C6        208

In spite of this, however, the structure did not change very much. This is because, as already mentioned, minimization will only find the nearest local minima. If you create PDB files for the start (polyAT_vac.rst7) and final structures (polyAT_vac_init_min.ncrst) and compare the root-mean-squared deviations (RMSd), you will see that the two structures differ by only about 0.5 angstroms (all atom).

Initial structure = Green, Minimized structure = Blue

3.1.3) Creating PDB files from the AMBER coordinate files

You may want to generate a new pdb file so you can look at the structure using the minimized coordinates from polyAT_vac_init_min.ncrst. A pdb file can be created from the parm topology and coordinates (rst7 or restrt) using the program ambpdb.

$AMBERHOME/bin/ambpdb -p polyAT_vac.prmtop -c polyAT_vac_init_min.ncrst > polyAT_vac_init_min.pdb

This will take the prmtop and rst7 file specified (in this case the final structure from the minimization, the .ncrst file) and create (on stdout) a pdb file. In this case we redirected stdout to the file - polyAT_vac_init_min.pdb.

A VMD screenshot of the generated PDB file.

In general, users should carefully inspect any starting structures and minimized structures; specifically check to make sure the hydrogens were placed where you thought they should be, histidines are in the correct protonation state, the terminal residues are properly terminated, stereochemistry is reasonable, etc. There is nothing worse than finding out after you've run a nanosecond of solvated dynamics that an H1' atom was on the wrong side and that you have simulated some strange anomer of DNA!

3.2) Running MD in-vacuo

In section 3.1 we used sander to minimize our system in order to remove any bad contacts introduced by the hydrogenation step in xleap. This led to the creation of the coordinate file named polyAT_vac_init_min.ncrst. For the next section of this tutorial we will use this coordinate file as the starting structure for our in-vacuo MD simulation. The following in-vacuo simulation is designed to give a flavor of how MD simulations are run. In general, one would not run an in-vacuo simulation unless the system was inherently gas phase. For liquid phase systems, such as our DNA, one would normally include solvent either explicitly or implicitly. This type of simulation will be covered in section 4. For the time being we will stick to gas phase simulations since they are simple and quick to run.

To make the calculations tractable for the purposes of this tutorial, we can only run short simulations of the order of 100 ps. Although these are "short" simulations (in terms of what is likely required to answer specific research questions), you will see that they are still rather costly, depending on what type of machine you using. However, to get a better idea of the issues involved and potential artifacts of the various models that will be used (for the simulation of DNA) it is probably useful to run through these examples, either by running them yourself or by just looking at the output files and trajectories supplied. You may also want to try some different examples from the two given below, such as a distance dependent dielectric constant, or see what happens if you increase the dielectric constant to, say, 80.0.

For this tutorial we will run two in-vacuo simulations and compare the results. The simulations we will run will be:

polyAT_vac_md1_12Acut.in: 12.0 angstrom long range cutoff, dielectric = 1

polyAT_vac_md1_nocut.in: no long range cutoff, dielectric = 1

To run a molecular dynamics simulation with sander, we need to turn off minimization (IMIN=0). Since we are running in-vacuo we also need to disable periodicity (NTB=0) and set IGB=0 since we are not using implicit solvent. For these two examples we will write information to the output file and trajectory coordinates file every 100 steps (NTPR=100, NTWX=100). For most simulations, writing to the coordinate file every 100 steps is too frequent because it consumes excess disk space and can impact performance. However, this simulation here will show some interesting behavior over a short time scale that I would like you to see and so we will use a value of 100 to ensure that this is suitably sampled. (For multiple nano-second simulations of stable systems a more suitable value for NTWX would be in the 1000 to 2000 range.)

The CUT variable specifies the cut off range for the long range non-bonded interactions. Here, two different values for the cutoff will be used, one run with a cut off of 12 angstroms (CUT = 12.0) and one run without a cutoff. To run without a cutoff we simply set CUT to be larger than the extent of the system (e.g. CUT = 999).

For temperature regulation we will use the Langevin thermostat (NTT=3) to maintain the temperature of our system at 300 K. This temperature control method uses Langevin dynamics with a collision frequency given by GAMMA_LN and is significantly more efficient at equilibrating the system temperature than the Berendsen temperature coupling scheme (NTT=1) that was recommended for older versions of AMBER. The biggest problem with the Berendsen method is that the algorithm simply ensures that the kinetic energy is appropriate for the desired temperature; it does nothing to ensure that the temperature is even over all parts of the molecule. This can lead to the phenomenon of hot solvent, cold solute. To avoid this, elaborate temperature scaling techniques for slowly heating the molecule over the course of the simulation were recommended. The Langevin system is much more efficient, however, at equilibrating the temperature and is now the recommended choice. The efficiency is such that if we have a reasonably good structure, which we do in this case, we can actually start the system at 300 K and avoid the need to slowly heat over, say, 20 ps from 0 K to room temperature. Use the Langevin temperature regulation scheme with care, however. While it allows efficient equilibration the system's temperature, it will alter the fast dynamics of your system. As such, if you are interested in things such as correlation functions, it is often better to equilibrate your system using NTT=3 and then once equilibrated, pure Newtonian dynamics (NTT=0). It should also be noted that the performance of NTT=3 is less than a NVT simulation using NTT=1 or a NVE simulation. Hence it can often be beneficial to heat in NVT with NTT=3, equilibrate in NPT with NTT=3 and then run production as either NVE (NTT=0) or as NVT with NTT=1 and a weak coupling constant of around 10 ps^-1. The exact choice, though, will depend on what you aim to learn from your simulation and so you should consult literature for previous examples and discussions.

For simplicity we will run this entire simulation with NTT=3 and GAMMA_LN=1. We shall also set the initial and final temperatures to 300 K (TEMPI=300.0, TEMP0=300.0) which will mean our system's temperature should remain around 300 K. In these two examples we will run a total of 100,000 steps each (NSTLIM=100000) with a 1 fs time step (DT=0.001) giving simulation lengths of 100 ps (100,000 x 1fs). The two input files are shown below:

polyAT_vac_md1_12Acut.in

10-mer DNA MD in-vacuo, 12 angstrom cut off
&cntrl
imin = 0, ntb = 0,
igb = 0, ntpr = 100, ntwx = 100,
ntt = 3, gamma_ln = 1.0,
tempi = 300.0, temp0 = 300.0
nstlim = 100000, dt = 0.001,
cut = 12.0
/

polyAT_vac_md1_nocut.in

10-mer DNA MD in-vacuo, infinite cut off
&cntrl
imin = 0, ntb = 0,
igb = 0, ntpr = 100, ntwx = 100,
ntt = 3, gamma_ln = 1.0,
tempi = 300.0, temp0 = 300.0
nstlim = 100000, dt = 0.001,
cut = 999
/

Caution: In the examples in this tutorial we do not change the value of the random seed used for the random number generator. This is controlled by the namelist variable ig. This is largely for issues of reproducibility of the results within a tutorial setting. However, when running production simulations, especially when using NTT=2 or 3 (Anderson or Langevin thermostats) it is essential that you change the random number seed from the default on EVERY MD restart. If you are using AMBER 10 (bugfix.26 or later) or AMBER 11 or later, you can do this automatically by setting ig=-1 in the cntrl namelist. For more details on the pitfalls of not doing this you should refer to the following publication:

Cerutti DS, Duke, B., et al., "A Vulnerability in Popular Molecular Dynamics Packages Concerning Langevin and Andersen Dynamics", JCTC, 2008, 4, 1669-1680

So, to run the two jobs we issue the following two commands. Note, if you have a dual (or more) processor machine then you can run both jobs simultaneously, otherwise you should run them sequentially (note we use the restrt file from the minimization as our starting structure):

$AMBERHOME/bin/sander -O -i polyAT_vac_md1_12Acut.in -o polyAT_vac_md1_12Acut.out -c polyAT_vac_init_min.ncrst -p polyAT_vac.prmtop -r polyAT_vac_md1_12Acut.ncrst -x polyAT_vac_md1_12Acut.nc

$AMBERHOME/bin/sander -O -i polyAT_vac_md1_nocut.in -o polyAT_vac_md1_nocut.out -c polyAT_vac_init_min.ncrst -p polyAT_vac.prmtop -r polyAT_vac_md1_nocut.ncrst -x polyAT_vac_md1_nocut.nc

These will take longer to run than the minimization, so it is probably a good time to go off and get a cup of coffee. The 12 angstrom cut off run takes about 400s (7 mins) on a single 1.4 GHz Intel Core i5. If it is taking too long on your machine, you can always reduce the simulation length by lowering NSTLIM to say 10,000 (10 ps). You can follow the progress of the job by following the output file with the following command:

tail -f polyAT_vac_md1_12Acut.out

Input files: polyAT_vac_md1_12Acut.in, polyAT_vac_md1_nocut.in, polyAT_vac_init_min.ncrst, polyAT_vac.prmtop

Output files: polyAT_vac_md1_12Acut.out, polyAT_vac_md1_nocut.out, polyAT_vac_md1_12Acut.ncrst, polyAT_vac_md1_nocut.ncrst, polyAT_vac_md1_12Acut.nc, polyAT_vac_md1_nocut.nc

Note that we now have an additional option, the -x option which specifies the name of the output file for the MD trajectory. This file will contain a snapshot of the entire system's Cartesian coordinates every NTWX (100) steps.

Please be aware that the nocut simulation will likely stop after around 26,400 steps. This is not a bug, it is a problem with simulating DNA in vacuo which will become clear when we visualize the trajectory files. More on this later.

3.3) Analyzing the results

So, now we've run the simulations what do we want to look at? First off, we want to look through the mdout files (polyAT_vac_md1_12Acut.out, polyAT_vac_md1_nocut.out). It is also a good idea to extract the various energies as a function of time to plot the total, kinetic, potential energies, etc. We also want to calculate RMSd vs. time, etc. To look at movies (which is much more fun :-) ) we can use a visualization program such as VMD to load up the "parm and crd" (i.e. AMBER prmtop and MD trajectory) and use the animation controls to view it-more on this later.

3.3.1) Extracting the energies, etc. from the mdout file

One can write an awk or sed script to pull out the energies, etc. These can be pulled out of the mden file if energy information was written out (which we didn't do) or directly from the mdout file. We will use the script process_mdout.perl that has been designed to process mdout files and create a series of files with the various different pieces of information in them. The program uses a default output filename so it is best to create a sub directory for each of your output files and move to there before running the script e.g.

mkdir polyAT_vac_md1_12Acut
cd polyAT_vac_md1_12Acut

process_mdout.perl ../polyAT_vac_md1_12Acut.out

You should repeat this process for the nocut output file, even though the simulation did not complete all 100,000 steps:

mkdir polyAT_vac_md1_nocut
cd polyAT_vac_md1_nocut
process_mdout.perl ../polyAT_vac_md1_nocut.out

This script takes a series of mdout files and will create a series, leading off with the prefix "summary." such as "summary.EPTOT", of output files. These files are just columns of the time vs. the value for each of the energy components. Tar archives of the summary files for each of the two MD runs above are available here (polyAT_vac_md1_12Acut.summary.tar.gz, polyAT_vac_md1_nocut.summary.tar.gz).

You can plot the summary files with your favorite graphing program. A useful program for just quickly looking at a plot of a file is xmgrace, a very simple plotting program, but ideal for our purpose. To plot the total potential energy, of both runs, as a function of time we run, assuming we are in our master simulation directory:

xmgrace ./polyAT_vac_md1_12Acut/summary.EPTOT ./polyAT_vac_md1_nocut/summary.EPTOT

This should give you a plot similar to the following:

Potential energies with and without interaction cutoffs: the black line represents the potential energy for the 12 angstrom cutoff simulation vs. time, while the red line represents the potential energy for the simulation without a cutoff vs. time.

As we can see, the 12 angstrom cutoff simulation seems to be fairly stable since the potential energy is fluctuating around a constant mean value. The simulation without a cutoff, however, is significantly different. To begin with the potential energy is some 3,000 kcal/mol higher than the 12 angstrom cutoff simulation. The potential energy is so large in fact that it is actually positive. The potential energy also decreases rapidly over the first 10 ps of simulation suggesting that a large structural change is occurring. The simulation then ends abruptly after 26.4ps with the following error:

 Frac coord min, max:    5.5997502608777254E-003   1.0000543187011288    
 The system has extended beyond the extent of the virtual box.
 Restarting sander will recalculate a new virtual box with 30 Angstroms extra on each side, if there is a
    restart file for this configuration.
 SANDER BOMB in subroutine Routine: map_coords (ew_force.f)
 Atom out of bounds. If a restart has been written,
 restarting should resolve the error

Why the difference in potential energy? Well, by looking at our two output files this can be tracked down to the difference in the electrostatic energy. For the first step we have:

12 angstom cutoff

 NSTEP =        0   TIME(PS) =       0.000  TEMP(K) =   297.60  PRESS =     0.0
 Etot   =     -1767.1574  EKtot   =       565.9592  EPtot      =     -2333.1166
 BOND   =        16.1950  ANGLE   =        91.3196  DIHED      =       392.1141
 1-4 NB =       162.9680  1-4 EEL =      -347.7440  VDWAALS    =      -349.1563
 EELEC  =     -2298.8130  EHBOND  =         0.0000  RESTRAINT  =         0.0000

No cutoff

 NSTEP =        0   TIME(PS) =       0.000  TEMP(K) =   297.60  PRESS =     0.0
 Etot   =      1885.1396  EKtot   =       565.9592  EPtot      =      1319.1804
 BOND   =        16.1950  ANGLE   =        91.3196  DIHED      =       392.1141
 1-4 NB =       162.9680  1-4 EEL =      -347.7440  VDWAALS    =      -351.8268
 EELEC  =      1356.1545  EHBOND  =         0.0000  RESTRAINT  =         0.0000

Notice how the value of EELEC is so much larger in the no cutoff simulation.

So, what has happened and how can we learn more? Well, first, let's look at another type of analysis we can perform on the results, in this case, analysis of the MD trajectory files (polyAT_vac_md1_12Acut.nc, polyAT_vac_md1_nocut.nc). From this we can hopefully undencrstand what is happening.

Calculating the RMSd vs. time

The next step in analyzing our results is to calculate the RMSd as a function of time using cpptraj (an analysis program provided with AmberTools). The precise parameter we will be calculating in this example is the mass weighted RMSd fit between each successive structure and the first structure of our trajectory. This is done by providing an input file to cpptraj containing a list of commands describing the relevant files and what we want it to do, etc.:

trajin polyAT_vac_md1_12Acut.nc
rms first mass out polyAT_vac_md1_12Acut.rms time 0.1

where trajin specifies the name of the trajectory file to process, rms first mass tells cpptraj that we want it to calculate a MASS weighted, RMS fit to the FIRST structure. out specifies the name of the file to write the results to and time 0.1 tells cpptraj that each frame of the coordinate file represents 0.1 ps. Since we have two simulations we have two input files, with different trajectory and output files in each, polyAT_vac_md1_12Acut.calc_rms, polyAT_vac_md1_nocut.calc_rms.

Run the two cpptraj jobs with the following commands:

$AMBERHOME/bin/cpptraj -p polyAT_vac.prmtop -i polyAT_vac_md1_12Acut.calc_rms

$AMBERHOME/bin/cpptraj -p polyAT_vac.prmtop -i polyAT_vac_md1_nocut.calc_rms

The output from each job will look something like this:

CPPTRAJ: Trajectory Analysis. V15.00
    ___  ___  ___  ___
     | \/ | \/ | \/ | 
    _|_/\_|_/\_|_/\_|_

	Reading 'polyAT_vac.prmtop' as Amber Topology
INPUT: Reading Input from file polyAT_vac_md1_12Acut.calc_rms
  [trajin polyAT_vac_md1_12Acut.nc]
	Reading 'polyAT_vac_md1_12Acut.nc' as Amber Trajectory
  [rms first mass out polyAT_vac_md1_12Acut.rms time 0.1]
    RMSD: (*), reference is finncncrst frame (*), with fitting, mass-weighted.
---------- RUN BEGIN -------------------------------------------------

PARAMETER FILES:
 0: 'polyAT_vac.prmtop', 638 atoms, 20 res, box: None, 2 mol, 1000 frames

INPUT TRAJECTORIES:
 0: 'polyAT_vac_md1_12Acut.nc' is an AMBER trajectory, Parm polyAT_vac.prmtop (reading 1000 of 1000)
  Coordinate processing will occur on 1000 frames.
TIME: Run Initialization took 0.0001 seconds.

BEGIN TRAJECTORY PROCESSING:
.....................................................
ACTION SETUP FOR PARM 'polyAT_vac.prmtop' (1 actions):
  0: [rms first mass out polyAT_vac_md1_12Acut.rms time 0.1]
	Target mask: [*](638)
	Reference mask: [*](638)
----- polyAT_vac_md1_12Acut.nc (1-1000, 1) -----
 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Complete.

Read 1000 frames and processed 1000 frames.
TIME: Trajectory processing: 0.1576 s
TIME: Avg. throughput= 6345.0166 frames / second.

ACTION OUTPUT:

DATASETS:
  1 data set:
	RMSD_00000 "RMSD_00000" (double, rms), size is 1000

DATAFILES:
  polyAT_vac_md1_12Acut.rms (Standard Data File):  RMSD_00000
---------- RUN END ---------------------------------------------------
TIME: Total execution time: 0.1940 seconds.
--------------------------------------------------------------------------------
To cite CPPTRAJ use:
Daniel R. Roe and Thomas E. Cheatham, III, "PTRAJ and CPPTRAJ: Software for
  Processing and Analysis of Molecular Dynamics Trajectory Data". J. Chem.
  Theory Comput., 2013, 9 (7), pp 3084-3095.

We should get two output files, one for each simulation (polyAT_vac_md1_12Acut.rms, polyAT_vac_md1_nocut.rms).

We can now plot the RMSd's vs time. Note, RMSd is in angstroms and time is in ps.

xmgrace polyAT_vac_md1_12Acut.rms polyAT_vac_md1_nocut.rms

This should give a plot similar to the following:

RMSD versus time: the simulation without a cutoff is shown in red.

And here we see the problem. While the 12 angstrom cutoff simulation has a largely constant RMSd around 2.2 angstroms the no cutoff simulation shows a steadily increasing RMSd which after 20 ps is already over 30 angstroms!!! This would suggest that something has gone wrong with the simulation since our system has essentially "blown up." But, is this the whole story? Or does it mask a larger problem with running a simulation in vacuum? Why does the use of a cutoff result in a stable trajectory while the use of no cutoff (which should in theory be more accurate) leads to a "blow up"?

3.3.3) Visualizing the trajectories with VMD

Let's think about this more carefully. Our DNA molecule consists of two chains held together by hydrogen bonds. Each chain has a net charge of -9 electrons. So, our chains have a large electrostatic repulsion as shown by the value of EELEC in the no cutoff simulation (EELEC = 1356.1545). Thus the two chains are repelling each other. For the system to be stable, the interaction between the chains, largely due to hydrogen bonding, must counteract this large electrostatic repulsion. Now, in the case where we used a 12 angstrom cutoff, all charges beyond the 12 angstroms distance were considered to have zero energy. Since the average distance between the negative charges on the DNA backbone due to the phosphate groups is 15 angstroms, the repulsion between the phosphate atoms on opposite chains was ignored when the 12 angstrom cutoff was employed. The electrostatic attraction due to hydrogen bonding between opposite bases is due to a much closer interaction, of the order of 2 angstroms. Thus while the main electrostatic repulsion interaction was excluded when we used the 12 angstrom cutoff, the main attractive force between the two chains was included so the chain held together and we obtained a stable trajectory. Let's take a look at it in vmd:

vmd

Open our 12 angstrom cutoff trajectory S. file. Select:

File -> New Molecule

Loading a molecule definition (sometimes called a parm, a top, or a prmtop) to VMD.

VMD supports multiple trajectory files for a single molecule. Thus a molecule is defined by loading the associated prmtop file. So, start by selecting Browse and finding the polyAT_vac.prmtop file. Then under the heading where it states "Determine file type:" select AMBER7 Parm. Note: VMD has two definitions for prmtop files, 'AMBER Parm' and 'AMBER7 Parm'. The 'AMBER Parm' refers to prmtop files for AMBER versions of 6 and lower. The prmtop format changed slightly between Amber 6 and Amber 7, largely to allow for larger molecules, so that the AMBER7 Parm option is required for AMBER v7 and later prmtop files.

Files of format "AMBER7 Parm" define the molecular topology and paramters.

Then hit 'Load'.

Next we need to choose which trajectory to load. In this case, we will load the 12 angstrom cutoff trajectory file polyAT_vac_md1_12Acut.nc (remember to gunzip the file before use). Select Browse, browse for the file and then under 'determine file type' you need to select "AMBER Coordinates". Note: if this were a periodic boundary simulation then we would select "AMBER Coordinates with Periodic Box" since the trajectory file would also contain information on the box size. In this situation, however, there is no box information since this is a non-periodic in vacuo simulation. When you hit "Load" again you should see all of the frames loaded into the main molecule window. 1,000 frames in total.

The VMD display window showing a DNA double helix.

We can now use the playback tools in the "VMD main" control panel to play our movie:

VMD trajectory viewing controls.

You should have a go at playing the video of the trajectory. Notice how the DNA holds its secondary structure. Though there is considerable movement in the extremities, the overall structure is preserved.

Is this the correct answer though? A stable trajectory doesn't necessarily mean it is correct. Is a strand of charged DNA in vacuum really likely to be stable? In solvent, such electrostatic repulsions are shielded by the solvent but in vacuum, there is no such shielding and there are no external forces to help hold the chains together. Lets take a look at the trajectory obtained from our no cutoff simulation:

If VMD is already open, close it - or if you know how, simply delete the existing molecule.

Run vmd (based on vmd 1.9.2):

vmd

Open our no cut off trajectory file. Select:

File -> New Molecule

Repeat the process as described above. Use the same polyAT_vac.prmtop file but this time select the no cut off trajectory file polyAT_vac_md1_nocut.nc. This time when you hit "Load" VMD should load the trajectory up until the point where the simulation crashed (264 frames of 0.1ps each = 26.4 ps). Have a look at the trajectory, the difference from the last simulation should be obvious. The instability of the DNA dimer is clear.

0 ps	2.5 ps	5 ps
7.5 ps	10 ps	12.5 ps
15 ps	17.5 ps	20 ps
22.5 ps

If we view the DNA dimer from the top during the movie of the MD trajectory it is slightly easier to see what is happening, the large repulsive charge on the two chains is causing them to uncoil and move away from each other:

0 ps	2.5 ps	5 ps
7.5 ps	10 ps	12.5 ps
15 ps	17.5 ps	20 ps
22.5 ps

Here we see that the two chains have, due to the large electrostatic repulsion forces, rapidly separated from each other and have started to drift apart. The simulation stopped when the distance between the two strands exceeded pre-determined parameters within the code.

So, which simulation is the correct one? Well, since this simulation was in vacuo and we had no neutralizing ions the conditions did not really represent laboratory conditions. Indeed in this "harsh" environment, with no clustered water or ions, it is likely that the DNA 10-mer is going to be unstable and so the behavior shown by our no cut off simulation is most likely the closest to reality.

The take home lesson here is that you should think very carefully about what you are simulating. Are you really simulating realistic conditions? How are the parameters you have chosen biasing your results? A cutoff can be a good way to increase the speed of a simulation, but you need to be aware that it can introduce very large artifacts into your simulation. So, think very carefully, and try out several scenarios before you try to reach firm conclusions.

One way to improve considerably on our in vacuo simulations is to make our physical model of DNA much closer to reality, i.e. include explicit neutralizing ions and also to include solvent effects, either implicitly within our model or via the use ofexplicit solvent. This is the subject of the next two stages of our tutorial here.

CLICK HERE TO GO TO SECTION 4