(Note: These tutorials are meant to provide illustrative examples of how to use the AMBER software suite to carry out simulations that can be run on a simple workstation in a reasonable period of time. They do not necessarily provide the optimal choice of parameters or methods for the particular application area.)
Copyright Ross Walker 2010

Section 3: Running Minimization and Molecular Dynamics (in vacuo)

This section will introduce sander and show how it can be used for minimization and molecular dynamics of our previously created DNA models. We will initially run our simulations on the in vacuo model and analyze the results before moving on to running simulations with implicit and explicit solvent. For this section of the tutorial we shall be using the in vacuo prmtop and inpcrd files we created previously (polyAT_vac.prmtop and polyAT_vac.inpcrd).

This section of the tutorial will consist of 3 stages:

  1. Relaxing the system prior to MD
  2. Molecular dynamics at constant temperature
  3. Analyzing the results

3.1) Relaxing the System Prior to MD

In the previous section we used NAB to build us a starting structure. Since this "default" geometry may not correspond to the actual minima in the force field we are using and may also result in conflicts and overlaps with atoms in other residues, it is always a good idea to minimize the structure before commencing molecular dynamics. Failure to successfully minimize the structure may lead to instabilities when we run MD.

So, given the in vacuo prmtop (polyAT_vac.prmtop) and inpcrd (polyAT_vac.inpcrd) files we created, we will now use sander to conduct a short minimization run. Since we just want to "fix up" the positions of the atoms in order to remove any bad contacts that may lead to unstable molecular dynamics we will run a short (500 steps) minimization. This will take us towards the closest local minima. Minimization with sander will only ever take you to the nearest minima, it cannot cross transition states to reach lower minima. This is fine for our purposes, however, since all we want to do is remove the largest strains in the system.

The basic usage for sander is as follows:

sander [-O] -i mdin -o mdout -p prmtop -c inpcrd -r restrt
[-ref refc] [-x mdcrd] [-v mdvel] [-e mden] [-inf mdinfo]

3.1.1) Building the mdin input file

Now that we have the prmtop and inpcrd files from xleap, all we need to run sander is the mdin file which specifies the myriad of possible options for this run.

The run time input to control sander is via "namelist" variables (for more info see the manual) specified in the mdin file. For example:

Test run 1
 &cntrl
     IMIN = 1, NCYC = 250, MAXCYC = 500

In the absence of a variable specification in the input file, default values are chosen; every specified variable, except the last one, needs to be followed by a comma. The sander manual describes all of these inputs, for each of the possible namelists. Which namelist is used depends on the specification above, such as &cntrl or &ewald. At a minimum the &cntrl namelist must be specified. Also notice the space or empty first column before specification of the namelist control variable; this is necessary. It is also necessary to end each namelist with a forward slash /. Other namelists (such as the &ewald namelist) are optional. After the namelist some other information may be specified, such as "GROUP" input, which allows atom selections for restraints. Note that the input variable and namelists have changed somewhat from earlier versions of AMBER. Refer to the sander section of the manual for more information.

Next we will build a minimal input file for performing minimization of our DNA. In theory it would probably be best to run a dual stage minimization where we initially use position restraints on all the heavy atoms so that in the first stage of minimization only the hydrogens that xleap added are minimized. Then in the second stage we allow minimization of all atoms in the system. Since our system is fairly small and simplistic it should be fine to skip the first stage and just minimize everything. An example of such a two stage minimization approach will be given when we run simulations on our more complex solvated model in the next section.

To create our input file: To turn on minimization, we specify IMIN = 1. We want a fairly short minimization since we don't actually need to reach the minima, just move away from any local maxima, so we select 500 steps of minimization by specifying MAXCYC = 500. Sander supports a number of different minimization algorithms, the most commonly used being steepest descent and conjugate gradient. The steepest descent algorithm is good for quickly removing the largest strains in the system but converges slowly when close to a minima. Here the conjugate gradient method is more efficient. The use of these two algorithms can be controlled using the NCYC flag. If NCYC < MAXCYC sander will use the steepest descent algorithm for the first NCYC steps before switching to the conjugate gradient algorithm for the remaining (MAXCYC - NCYC) steps. In this case we will run an equal number of steps with each algorithm so we set NCYC = 250. Since sander assumes that the system is periodic by default we need to explicitly turn this off (NTB = 0). In this simulation we will be using a constant dielectric and not an implicit (or explicit) solvent model so we set IGB = 0 (no generalized born solvation model), this is the default so we strictly don't need to specify this but I will include it here so that we can see what differences in the input file we have when we switch on implicit solvent later. We also need to choose a value for the non-bonded cut off. A larger cut off introduces less error in the non-bonded force evaluation but increases the computational complexity and thus calculation time. 12 angstroms would seem like a reasonable tradeoff for gas phase so that is what we will initially use (CUT = 12). Here is what the input file looks like:

polyA-polyT 10-mer: initial minimization prior to MD
 &cntrl
  imin   = 1,
  maxcyc = 500,
  ncyc   = 250,
  ntb    = 0,
  igb    = 0,
  cut    = 12
 /

So, now to run sander for minimization.

3.1.2) Running sander for the first time

To run sander, we simply execute the following:

$AMBERHOME/bin/sander -O -i polyAT_vac_init_min.in -o polyAT_vac_init_min.out -c polyAT_vac.inpcrd -p polyAT_vac.prmtop -r polyAT_vac_init_min.rst

Input files: polyAT_vac_init_min.in, polyAT_vac.inpcrd, polyAT_vac.prmtop
Output files: polyAT_vac_init_min.out, polyAT_vac_init_min.rst

This should run very quickly. On a 3GHz Pentium 4 machine it takes about 4.3 seconds

Take a look at the output file produced during the minimization (polyAT_vac_init_min.out). You will see that the energy dropped considerably between the first and last steps:

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1       9.2064E+01     4.9542E+01     6.8763E+02     O5'       321
   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
    500      -2.2929E+03     5.7810E-01     3.4109E+00     N1        362

In spite of this, however, the structure did not change very much. This is because, as already mentioned, minimization will only find the nearest local minima. If you create PDB files for the start (polyAT_vac.inpcrd) and final structures (polyAT_vac_init_min.rst) and compare the root-mean-squared deviations (RMSd), you will see that the two structures differ by only about 0.5 angstroms (all atom).


minimized DNA
Initial structure = Green, Minimized structure = Blue

3.1.3) Creating PDB files from the AMBER coordinate files

You may want to generate a new pdb file so you can look at the structure using the minimized coordinates from polyAT_vac_init_min.rst. A pdb file can be created from the parm topology and coordinates (inpcrd or restrt) using the program ambpdb.

$AMBERHOME/bin/ambpdb -p polyAT_vac.prmtop < polyAT_vac_init_min.rst > polyAT_vac_init_min.pdb

This will take the prmtop and inpcrd file specified (in this case the final structure from the minimization, the .rst file) and create (on stdout) a pdb file. In this case we redirected stdout to the file - polyAT_vac_init_min.pdb.


vmd snapshot
A VMD screenshot of the generated PDB file.

In general, users should carefully inspect any starting structures and minimized structures; specifically check to make sure the hydrogens were placed where you thought they should be, histidines are in the correct protonation state, the terminal residues are properly terminated, stereochemistry is reasonable, etc. There is nothing worse than finding out after you've run a nanosecond of  solvated dynamics that an H1' atom was on the wrong side and that you have simulated some strange anomer of DNA!

3.2) Running MD in-vacuo

In section 3.1 we used sander to minimize our system in order to remove any bad contacts introduced by the hydrogenation step in xleap. This led to the creation of the coordinate file named polyAT_vac_init_min.rst. For the next section of this tutorial we will use this coordinate file as the starting structure for our in-vacuo MD simulation. The following in-vacuo simulation is designed to give a flavor of how MD simulations are run. In general one would not run an in-vacuo simulation unless the system was inherently gas phase. For liquid phase systems, such as our DNA, one would normally include solvent either explicitly or implicitly. This type of simulation will be covered in section 4. For the time being we will stick to gas phase simulations since they are simple and quick to run.

To make the calculations tractable for the purposes of this tutorial, we can only run short simulations of the order of 100 ps. Although these are "short" simulations (in terms of what is likely required to answer specific research questions), you will see that they are still rather costly, depending on what type of machine you using. However, to get a better idea of the issues involved and potential artifacts of the various models that will be used (for the simulation of DNA) it is probably useful to run through these examples, either running them yourself or just looking at the output files and trajectories supplied. You may also want to try some different examples from the two given below, such as a distance dependent dielectric constant, or see what happens if you increase the dielectric constant to say 80.0.

For this tutorial we will run two in-vacuo simulations and compare the results. The simulations we will run will be:

  1. polyAT_vac_md1_12Acut.in: 12.0 angstrom long range cutoff, dielectric = 1
  2. polyAT_vac_md1_nocut.in:    no long range cutoff, dielectric = 1

To run a molecular dynamics simulation with sander, we need to turn off minimization (IMIN=0). Since we are running in-vacuo we also need to disable periodicity (NTB=0) and set IGB=0 since we are not using implicit solvent. For these two examples we will write information to the output file and trajectory coordinates file every 100 steps (NTPR=100, NTWX=100). For most simulations writing to the coordinate file every 100 steps is too frequently, it both consumes excess disk space and can impact performance, however, this simulation here will show some interesting behavior over a short time scale that I would like you to see and so we will use a value of 100 to ensure that this is suitably sampled. For multiple nano-second simulations of stable systems a more suitable value for NTWX would be in the 1000 to 2000 range. The CUT variable specifies the cut off range for the long range non-bonded interactions. Here two different values for the cutoff will be used, one run will be with a cut off of 12 angstroms (CUT = 12.0) and one run will be without a cutoff. To run without a cutoff we simply set CUT to be larger than the extent of the system (e.g. CUT = 999). For temperature regulation we will use the Langevin thermostat (NTT=3) to maintain the temperature of our system at 300 K. This temperature control method uses Langevin dynamics with a collision frequency given by GAMMA_LN. This temperature control method is significantly more efficient at equilibrating the system temperature than the Berendsen temperature coupling scheme (NTT=1) that was the recommended method for older versions of AMBER. The biggest problem with the Berendsen method is that the algorithm simply ensures that the kinetic energy is appropriate for the desired temperature; it does nothing to ensure that the temperature is even over all parts of the molecule. This can lead to the phenomenon of hot solvent, cold solute. To avoid this, elaborate temperature scaling techniques for slowly heating the molecule over the course of the simulation were recommended. The Langevin system is much more efficient, however, at equilibrating the temperature and is now the recommended choice. The efficiency is such that if we have a reasonably good structure, which we do in this case, we can actually start the system at 300 K and avoid the need to slowly heat over say 20 ps from 0 K to room temperature. Use the Langevin temperature regulation scheme with care, however, since while it will allow you to equilibrate the temperature of your system efficiently it will alter the fast dynamics of your system. As such if you are interested in things such as correlation functions then it is often better to equilibrate your system using ntt=3 and then, once equilibrated pure Newtonian dynamics (ntt=0). It should also be noted that the performance of NTT=3 is less than a NVT simulation using NTT=1 or a NVE simulation. Hence it can often be beneficial to heat in NVT with NTT=3, equilibrate in NPT with NTT=3 and then run production as either NVE (NTT=0) or as NVT with NTT=1 and a weak coupling constant of say 10 ps-1. The exact choice though will depend on what you aim to learn from your simulation and so you should consult the literature for previous examples and discussions.

For simplicity we will run this entire simulation with NTT=3 and GAMMA_LN=1. We shall also set the initial and final temperatures to 300 K (TEMPI=300.0, TEMP0=300.0) which will mean our system's temperature should remain around 300 K. In these two examples we will run a total of 100,000 steps each (NSTLIM=100000) with a 1 fs time step (DT=0.001) giving simulation lengths of 100 ps (100,000 x 1fs). The two input files are shown below:

polyAT_vac_md1_12Acut.in

10-mer DNA MD in-vacuo, 12 angstrom cut off
 &cntrl
  imin = 0, ntb = 0,
  igb = 0, ntpr = 100, ntwx = 100,
  ntt = 3, gamma_ln = 1.0,
  tempi = 300.0, temp0 = 300.0
  nstlim = 100000, dt = 0.001,
  cut = 12.0
 /
 

polyAT_vac_md1_nocut.in

10-mer DNA MD in-vacuo, infinite cut off
 &cntrl
  imin = 0, ntb = 0,
  igb = 0, ntpr = 100, ntwx = 100,
  ntt = 3, gamma_ln = 1.0,
  tempi = 300.0, temp0 = 300.0
  nstlim = 100000, dt = 0.001,
  cut = 999
 /
 

 

Caution: In the examples in this tutorial we do not change the value of the random seed used for the random number generator. This is controlled by the namelist variable ig. This is largely for issues of reproducibility of the results within a tutorial setting. However, when running production simulations, especially when using ntt=2 or 3 (Anderson or Langevin thermostats) it is essential that you change the random number seed from the default on EVERY MD restart. If you are using AMBER 10 (bugfix.26 or later) or AMBER 11 or later you can do this automatically by setting ig=-1 in the cntrl namelist. Otherwise you can specify a positive random number of your choosing for ig each time you restart a calculation. For more details on the pitfalls of not doing this you should refer to the following publication:

Cerutti DS, Duke, B., et al., "A Vulnerability in Popular Molecular Dynamics Packages Concerning Langevin and Andersen Dynamics", JCTC, 2008, 4, 1669-1680

So, to run the two jobs we issue the following two commands. Note, if you have a dual (or more) processor machine then you can run both jobs simultaneously, otherwise you should run them sequentially (note we use the restrt file from the minimization as our starting structure):

$AMBERHOME/bin/sander -O -i polyAT_vac_md1_12Acut.in -o polyAT_vac_md1_12Acut.out -c polyAT_vac_init_min.rst -p polyAT_vac.prmtop -r polyAT_vac_md1_12Acut.rst -x polyAT_vac_md1_12Acut.mdcrd

$AMBERHOME/bin/sander -O -i polyAT_vac_md1_nocut.in -o polyAT_vac_md1_nocut.out -c polyAT_vac_init_min.rst -p polyAT_vac.prmtop -r polyAT_vac_md1_nocut.rst -x polyAT_vac_md1_nocut.mdcrd

These will take considerably longer to run than the minimization, so it is probably a good time to go off and get a cup of coffee. The 12 angstrom cut off run takes about 900s (15 mins) on a single P4 3GHz machine, if it is taking too long on your machine you can always reduce the simulation length by lowering NSTLIM to say 10,000 (10 ps). You can follow the progress of the job by following the output file with the following command:

tail -f polyAT_vac_md1_12Acut.out

Input files: polyAT_vac_md1_12Acut.in, polyAT_vac_md1_nocut.in, polyAT_vac_init_min.rst, polyAT_vac.prmtop

Output files: polyAT_vac_md1_12Acut.out, polyAT_vac_md1_nocut.out, polyAT_vac_md1_12Acut.rst, polyAT_vac_md1_nocut.rst, polyAT_vac_md1_12Acut.mdcrd, polyAT_vac_md1_nocut.mdcrd

Warning: the mdcrd files are over 15MB in size - you can download gzip compressed (5.3MB) versions here (polyAT_vac_md1_12Acut.mdcrd.gz, polyAT_vac_md1_nocut.mdcrd.gz). You will need to unzip them before use. e.g gunzip polyAT_vac_md1_12Acut.mdcrd.gz.

Note that we now have an additional option, the -x option which specifies the name of the output file for the MD trajectory. This file will contain a snapshot of the entire system's Cartesian coordinates every NTWX (100) steps.

Please be aware that the nocut simulation will likely stop after around 22,900 steps. This is not a bug, it is a problem with simulating DNA in vacuo which will become clear when we visualize the trajectory files. More on this later.

3.3) Analyzing the results

So, now we've run the simulations what do we want to look at? First off, we may want to look through the mdout files (polyAT_vac_md1_12Acut.out, polyAT_vac_md1_nocut.out). It is also a good idea to extract the various energies as a function of time to plot total, kinetic, potential energies etc. We might also want to calculate RMSd vs. time, etc. To look at movies (which is much more fun :-) ) we can use a visualization program such as vmd to load up the "parm and crd" (i.e. AMBER prmtop and MD trajectory) and use the animation controls to view it, more on this later.

3.3.1) Extracting the energies, etc. from the mdout file

One can write an awk or sed script to pull out the energies, etc. These can be pulled out of the mden file if energy information was written out (which we didn't do) or directly from the mdout file. The following is a perl script (process_mdout.perl) that has been designed to process mdout files and create a series of files with the various different pieces of information in them. The program uses a default output filename so it is best to create a sub directory for each of your output files and move to there before running the script e.g.

mkdir polyAT_vac_md1_12Acut
cd polyAT_vac_md1_12Acut

perl process_mdout.perl ../polyAT_vac_md1_12Acut.out

You should repeat this process for the nocut output file, even though the simulation did not complete all 100,000 steps:

mkdir polyAT_vac_md1_nocut
cd polyAT_vac_md1_nocut
process_mdout.perl ../polyAT_vac_md1_nocut.out

This script takes a series of mdout files and will create a series, leading off with the prefix "summary." such as "summary.EPTOT", of output files. These files are just columns of the time vs. the value for each of the energy components. Tar archives of the summary files for each of the two MD runs above are available here (polyAT_vac_md1_12Acut.summary.tar.gz, polyAT_vac_md1_nocut.summary.tar.gz).

You can plot the summary files with your favorite graphing program. A useful program for just quickly looking at a plot of a file is xmgr which was installed on older versions of Linux or xmgrace which should be on newer installations (if it isn't then just simply use a plotting package of your preference). xmgrace is a very simple plotting program, but ideal for our purpose. To plot the total potential energy, of both runs, as a function of time we run, assuming we are in our master simulation directory:

xmgrace ./polyAT_vac_md1_12Acut/summary.EPTOT ./polyAT_vac_md1_nocut/summary.EPTOT

This should give you a plot similar to the following:


summed energies
Potential energies with and without interaction cutoffs: the black line represents the potential energy for the 12 angstrom cutoff simulation vs. time while the red line represents the potential energy for the simulation without a cutoff.

As we cansee the 12 angstrom cutoff simulation seems to be fairly stable, the potential energy is fluctuating around a constant mean value. The simulation without a cutoff, however, is significantly different. To begin with the potential energy is some 3,000 kcal/mol higher than the 12 angstrom cutoff simulation. The potential energy is so large in fact that it is actually positive. The potential energy also decreases rapidly over the first 10 ps of simulation suggesting that a large structural change is occurring. The simulation then ends abruptly after 21.9ps with the following error:

 Frac coord min, max:  -5.132892457340335E-005  0.962793346753183
 The system has extended beyond
     the extent of the virtual box.
 Restarting sander will recalculate
    a new virtual box with 30 Angstroms
    extra on each side, if there is a
    restart file for this configuration.
 SANDER BOMB in subroutine Routine: map_coords (ew_force.f)
 Atom out of bounds. If a restart has been written,
 restarting should resolve the error

Why the difference in potential energy? Well, by looking at our two output files this can be tracked down to the difference in the electrostatic energy. For the first step we have:

12 angstom cutoff

 NSTEP =        0   TIME(PS) =       0.000  TEMP(K) =   298.54  PRESS =     0.0
 Etot   =     -1726.9350  EKtot   =       565.9592  EPtot      =     -2292.8942
 BOND   =        16.0918  ANGLE   =        93.9850  DIHED      =       392.9956
 1-4 NB =       162.1781  1-4 EEL =      -345.0298  VDWAALS    =      -346.6892
 EELEC  =     -2266.4257  EHBOND  =         0.0000  RESTRAINT  =         0.0000

no cutoff

 NSTEP =        0   TIME(PS) =       0.000  TEMP(K) =   298.54  PRESS =     0.0
 Etot   =      1885.9795  EKtot   =       565.9592  EPtot      =      1320.0203
 BOND   =        16.0918  ANGLE   =        93.9850  DIHED      =       392.9956
 1-4 NB =       162.1781  1-4 EEL =      -345.0298  VDWAALS    =      -349.3700
 EELEC  =      1349.1697  EHBOND  =         0.0000  RESTRAINT  =         0.0000

Notice how the value of EELEC is so much larger in the no cutoff simulation.

So, what has happened and how can we learn more? Well, first of all let's look at another type of analysis we can perform on the results, in this case analysis of the MD trajectory files (polyAT_vac_md1_12Acut.mdcrd, polyAT_vac_md1_nocut.mdcrd). From this we can hopefully understand what is happening.

Calculating the RMSd vs. time

The next step in analyzing our results is to calculate the RMSd as a function of time using cpptraj (an analysis program provided with AmberTools). The precise parameter we will be calculating in this example is the mass weighted RMSd fit between each successive structure and the first structure of our trajectory. This is done by providing an input file to cpptraj containing a list of commands describing the relevant files and what we want it to do etc.:

trajin polyAT_vac_md1_12Acut.mdcrd
rms first mass out polyAT_vac_md1_12Acut.rms time 0.1

where trajin specifies the name of the trajectory file to process, rms first mass tells cpptraj that we want it to calculate a MASS weighted, RMS fit to the FIRST structure. out specifies the name of the file to write the results to and time 0.1 tells cpptraj that each frame of the coordinate file represents 0.1 ps. Since we have two simulations we have two input files, with different trajectory files and output files in each, polyAT_vac_md1_12Acut.calc_rms, polyAT_vac_md1_nocut.calc_rms.

Run the two cpptraj jobs with the following commands:

$AMBERHOME/bin/cpptraj -p polyAT_vac.prmtop -i polyAT_vac_md1_12Acut.calc_rms

$AMBERHOME/bin/cpptraj -p polyAT_vac.prmtop -i polyAT_vac_md1_nocut.calc_rms

The output from each job will look something like this:

CPPTRAJ: Trajectory Analysis. V14.00
    ___  ___  ___  ___
     | \/ | \/ | \/ | 
    _|_/\_|_/\_|_/\_|_
	Reading 'polyAT_vac.prmtop' as Amber Topology
INPUT: Reading Input from file polyAT_vac_md1_12Acut.calc_rms
  [trajin polyAT_vac_md1_12Acut.mdcrd]
	Reading 'polyAT_vac_md1_12Acut.mdcrd' as Amber Trajectory
  [rms first mass out polyAT_vac_md1_12Acut.rms time 0.1]
    RMSD: (*), reference is first frame (*), with fitting, mass-weighted.
---------- RUN BEGIN -------------------------------------------------

PARAMETER FILES:
 0: 'polyAT_vac.prmtop', 638 atoms, 20 res, box: None, 2 mol, 1000 frames

INPUT TRAJECTORIES:
 0: 'polyAT_vac_md1_12Acut.mdcrd' is an AMBER trajectory, Parm polyAT_vac.prmtop (reading 1000 of 1000)
  Coordinate processing will occur on 1000 frames.
TIME: Run Initialization took 0.0000 seconds.

BEGIN TRAJECTORY PROCESSING:
.....................................................
ACTION SETUP FOR PARM 'polyAT_vac.prmtop' (1 actions):
  0: [rms first mass out polyAT_vac_md1_12Acut.rms time 0.1]
	Mask [*] corresponds to 638 atoms.
	Mask [*] corresponds to 638 atoms.
----- polyAT_vac_md1_12Acut.mdcrd (1-1000, 1) -----
 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Complete.

Read 1000 frames and processed 1000 frames.
TIME: Trajectory processing: 0.1719 s
TIME: Avg. throughput= 5816.4559 frames / second.

ACTION OUTPUT:

DATASETS:
  1 data set:
	RMSD_00000 "RMSD_00000" (double, rms), size is 1000

DATAFILES:
  polyAT_vac_md1_12Acut.rms (Standard Data File):  RMSD_00000
---------- RUN END ---------------------------------------------------
TIME: Total execution time: 0.1932 seconds.
--------------------------------------------------------------------------------
To cite CPPTRAJ use:
Daniel R. Roe and Thomas E. Cheatham, III, "PTRAJ and CPPTRAJ: Software for
  Processing and Analysis of Molecular Dynamics Trajectory Data". J. Chem.
  Theory Comput., 2013, 9 (7), pp 3084-3095.

We should get two output files, one for each simulation (polyAT_vac_md1_12Acut.rms, polyAT_vac_md1_nocut.rms).

We can now plot the RMSd's vs time. Note, RMSd is in angstroms and time is in ps.

>xmgrace polyAT_vac_md1_12Acut.rms polyAT_vac_md1_nocut.rms

This should give a plot similar to the following:


RMSD plot
RMSD versus time: the simulation without a cutoff is shown in red.

And here we see the problem. While the 12 angstrom cutoff simulation has a largely constant RMSd around 2.2 angstroms the no cutoff simulation shows a steadily increasing RMSd which after 20 ps is already over 30 angstroms!!! This would suggest that something has gone wrong with the simulation since our system has essentially "blown up." But, is this the whole story? Or does it mask a larger problem with running a simulation in vacuum. Why does the use of a cutoff here result in a stable trajectory while no cutoff (which should in theory be more accurate) leads to a "blow up."

3.3.3) Visualizing the trajectories with VMD

Let's think about this more carefully. Our DNA molecule consists of two chains held together by hydrogen bonds. Each chain has a net charge of -9 electrons. So, our chains have a large electrostatic repulsion as shown by the value of EELEC in the no cutoff simulation (EELEC = 1349.1697). Thus the two chains are repelling each other. For the system to be stable the interaction between the chains, largely due to hydrogen bonding, must counteract this large electrostatic repulsion. Now, in the case where we used a 12 angstrom cutoff, all charges beyond 12 angstroms distance were considered to have zero energy. Since the average distance between the negative charges on the DNA backbone, due to the phosphate groups, is 15 angstroms so the repulsion between the phosphate atoms on opposite chains was ignored when the 12 angstrom cutoff was employed. The electrostatic attraction due to hydrogen bonding between opposite bases is due to a much closer interaction, of the order of 2 angstroms. Thus while the main electrostatic repulsion interaction was excluded when we used the 12 angstrom cutoff, the main attractive force between the two chains was included. Thus during the simulation the chain held together and we obtained a stable trajectory. Let's take a look at it in vmd:

vmd

Open our 12 angstrom cutoff trajectory file. Select:

File -> New Molecule

vmd dialog box
Loading a molecule definition (sometimes called a parm, a top or a prmtop) to VMD.

VMD supports multiple trajectory files for a single molecule. Thus a molecule is defined by loading the associated prmtop file. So, start by selecting Browse and finding the polyAT_vac.prmtop file. Then under the heading where it states "Determine file type:" select AMBER7 Parm. Note: VMD has two definitions for prmtop files, 'AMBER Parm' and 'AMBER7 Parm'. The 'AMBER Parm' refers to prmtop files for AMBER versions of 6 and lower. The prmtop format changed slightly between amber 6 and amber 7, largely to allow for larger molecules, so that the AMBER7 Parm option is required for AMBER v7 and later prmtop files.


vmd dialog box
Files of format "AMBER7 Parm" define the molecular topology and paramters.

Then hit 'Load'.

Next we need to choose which trajectory to load, in this case we will load the 12 angstrom cutoff trajectory file polyAT_vac_md1_12Acut.mdcrd (remember to gunzip the file before use). Select browse again, browse for the file and then under 'determine file type' you need to select "AMBER Coordinates". Note: if this were a periodic boundary simulation then we would select "AMBER Coordinates with Periodic Box" since the trajectory file would also contain information on the box size. In this situation, however, there is no box information since this is a non-periodic in vacuo simulation. When you hit "Load" again you should see all of the frames loaded into the main molecule window. 1,000 frames in total.


VMD screenshot
The VMD display window showing a DNA double helix.

We can now use the playback tools in the "VMD main" control panel to play our movie:


VMD viewer
VMD trajectory viewing controls.

You should have a go at playing the video of the trajectory. Notice how the DNA holds its secondary structure there is considerable movement in the extremities but the overall structure is preserved.

Is this the correct answer though? Just because a trajectory is stable doesn't necessarily mean it is correct. Is a strand of charged DNA in vacuum really likely to be stable? In solvent such electrostatic repulsions are shielded by the solvent but in vacuum there is no such shielding and no external forces to help hold the chains together. Lets take a look at the trajectory obtained from our no cutoff simulation:

If VMD is already open, close it - or if you know how simply delete the existing molecule.

Run vmd (based on vmd 1.8.4):

vmd

Open our no cut off trajectory file. Select:

File -> New Molecule

Repeat the process as described above. Use the same polyAT_vac.prmtop file but this time select the no cut off trajectory file polyAT_vac_md1_nocut.mdcrd. This time when you hit "Load" VMD should load the trajectory up until the point where the simulation crashed (229 frames of 0.1ps each = 22.9 ps). Have a look at the trajectory, the difference from the last simulation should be obvious. The instability of the DNA dimer is clear.


snapshot
0 ps
snapshot
2.5 ps
snapshot
5 ps
snapshot
7.5 ps
snapshot
10 ps
snapshot
12.5 ps
snapshot
15 ps
snapshot
17.5 ps
snapshot
20 ps
snapshot
22.5 ps
   

If we view the DNA dimer from the top during the movie of the MD trajectory it is slightly easier to see what is happening, the large repulsive charge on the two chains is causing them to uncoil and move away from each other:

snapshot
0 ps
snapshot
2.5 ps
snapshot
5 ps
snapshot
7.5 ps
snapshot
10 ps
snapshot
12.5 ps
snapshot
15 ps
snapshot
17.5 ps
snapshot
20 ps
snapshot
22.5 ps
   
Here we see that the two chains have, due to the large electrostatic repulsion forces, rapidly separated from each other and have started to drift apart. The simulation stopped when the distance between the two strands exceeded pre-determined parameters within the code.

So, which simulation is the correct one? Well, since this simulation was in vacuo and we had no neutralizing ions the conditions did not really represent laboratory conditions. Indeed in this "harsh" environment, with no clustered water or ions, it is likely that the DNA 10-mer is going to be unstable and so the behavior shown by our no cut off simulation is most likely the closest to reality.

The take home lesson here is that you should think very carefully about what you are simulating, Are you really simulating realistic conditions, how are the parameters you have chosen biasing your results? A cutoff can be a good way to increase the speed of a simulation, but you need to be aware that it can introduce very large artifacts into your simulation. So, think very carefully, and try out several scenarios before you try to reach firm conclusions.

One way to improve considerably on our in vacuo simulations is to make our physical model of DNA much closer to reality, i.e. include explicit neutralizing ions and also to include solvent effects, either implicitly within our model or via the use of explicit solvent. This is the subject of the next two stages of our tutorial here.

CLICK HERE TO GO TO SECTION 4

(Note: These tutorials are meant to provide illustrative examples of how to use the AMBER software suite to carry out simulations that can be run on a simple workstation in a reasonable period of time. They do not necessarily provide the optimal choice of parameters or methods for the particular application area.)
Copyright Ross Walker 2010