(Note: These tutorials are meant to provide illustrative examples of how to use the AMBER software suite to carry out simulations that can be run on a simple workstation in a reasonable period of time. They do not necessarily provide the optimal choice of parameters or methods for the particular application area.)
Copyright Ross Walker 2015

Section 5: Running Minimization and MD (in explicit solvent)

Updated for AMBER 15

Up to this point we have been running simulations based on our in vacuo 10-mer polyA-polyT DNA structure. With this structure we have run both in vacuo simulations and also generalized Born implicit solvent simulations. In the next part of this tutorial we are going to run simulations in explicit solvent with periodic boundaries. This significantly increases the complexity of the simulations, increasing both the time required to run the simulations and the way in which we set up the simulations and visualize the results.

5.1) Equilibrating and running MD with solvated poly(A)-poly(T)

The problems with bad van der Waal (non-bond) and electrostatic interactions in the initial structure that led to the need to minimize our in vacuo structure, are considerably magnified when we start to run simulations in explicit solvent. Our initial structure was a gas phase DNA molecule with no protons. xleap added these for us at pre-determined positions. We then added 18 sodium ions at positions of high negative electric potential around our DNA molecule in order to neutralize it. To this system we then added a truncated octahedral box of pre-equilibrated TIP3P water. This water has not felt the influence of the solute or charges and moreover, there may be gaps between the solvent and solute, and solvent and box edges. If we are not careful such holes can lead to the formation of "vacuum" bubbles and subsequently an instability in our molecular dynamics simulation. Thus we need to run careful minimization before slowly heating our system to 300 K. It is also a good idea to allow the water box to relax during a MD equilibration stage prior to running production MD, since we will be using periodic boundaries to keep the pressure constant and allow the volume of the box to change. This approach will allow the water to equilibrate around the solute and come to an equilibrium density. It is essential that we monitor this equilibrium phase in order to be certain our solvated system has reached equilibrium before we start obtaining results (production data) from our MD simulation.

This section will present one possible path towards equilibrating the solvated DNA structure such that it is ready for use in production MD. For this we will use periodic boundary simulations based on the particle mesh Ewald (PME) method. After equilibration has been demonstrated and is deemed to have been successful, we will set up a "production" MD run. Please note, there are many differing opinions on how one should equilibrate a simulation, herein is presented only one, based on my experience with performing molecular dynamics simulations on biological systems.

Note: Some of the calculations, particularly the equilibration phase, may take a very long time (several days) to run unless you have access to a computer cluster for parallel simulation. For the purposes of this simulation I will provide copies of all of the input files along with example output files run on multiple CPUs of a i7-2600. I will make it clear where the simulations are likely to take a long time such that you can just use the provided output files.

5.1.1) A brief introduction to periodic boundaries

A realistic model of a solution requires a very large number of solvent molecules to be included along with the solute. Simply placing the solute in a box of solvent is not sufficient, however. Though some solvent molecules will be at the boundary between solute and solvent and others will be within the bulk of the solvent, a large number will be at the edge of the solvent and the surrounding vacuum. This is obviously not a realistic picture of a bulk fluid. In order to prevent the outer solvent molecules from boiling off into space, and to allow a relatively small number of solvent molecules to reproduce the properties of the bulk, periodic boundary conditions are employed. In this method the particles being simulated are enclosed in a box which is then replicated in all three dimensions to give a periodic array, a two-dimensional representation of which is shown below. In this periodic array, a particle at position r represents an infinite set of particles at:

r + ix + jy + kz (i,j,k -> -inf, +inf)

where x, y and z are the vectors corresponding to the box edges. During the simulation only one of the particles is represented, but the effects are reproduced over all the image particles with each particle not only interacting with the other particles but also with their images in neighboring boxes. Particles that leave one side of the box re-enter from the opposite side as their image. In this way the total number of particles in the central box remains constant.

Upon initial inspection such a method would appear to be very computationally intensive, requiring the evaluation of an infinite number of interacting pairs. However, by employing a technique known as the Ewald sum, or its more modern equivalent the Particle Mesh Ewald (PME) method, it is possible to obtain the infinite electrostatics in a way that scales as NlnN time. This involves dividing the calculation up between a real space component and a reciprocal space component. A detailed discussion is beyond the scope of this tutorial (for more information, reference the manual), however, the take home message is that even though we use a cutoff when simulating periodic boundaries in sander, the use of PME (by default) means we are calculating all of the 'infinite' electrostatics and so are not actually truncating the electrostatics. The VDW interactions are still needed though which means we cannot make the cut-off too small. For production calculations, the ideal range is between 8 and 10 angstroms. A value of 8 is generally considered reasonable. Although running with a larger cut-off will not cause harm, it simply makes the calculation more expensive. However, one should never reduce the cut-off below 8 angstroms for periodic boundary (PME) calculations.

A two-dimensional array of boxes. As molecule 1 moves from the central box into box B it is replaced by it's image which moves from box F into the central box. This movement is replicated across all the boxes.

5.1.3) Minimization Stage 1 - Holding the solute fixed

Our minimization procedure for solvated DNA will consist of a two stage approach. In the first stage we will keep the DNA fixed and just minimize the positions of the water and ions. Then in the second stage we will minimize the entire system.

There are two options for keeping the DNA atoms fixed during the initial minimization: one method is to use the "BELLY" option in which all the selected atoms have the forces on them zeroed at each step. This results in what is essentially a partial minimization. This method used to be the favored method but it can, especially if used with constant pressure periodic boundary simulations, lead to instabilities and strange artifacts. As a result of this, it is no longer the suggested method. Instead we shall use "positional restraints" on each of the DNA atoms to keep them essentially fixed in the same position. Such restraints work by specifying a reference structure, in this case our starting structure, and then restraining the selected atoms to conform to this structure via the use of a harmonic potential. This can be visualized as attaching a spring to each of the solute atoms that connects it to its initial position. Moving the atom from this position results in a force which acts to restore it to the initial position. By varying the magnitude of the force constant, the effect can be increased or decreased. This can be especially useful with structures based on homology models which may be a long way from equilibrium and require a number of minimization stages/MD with the force constant being reduced each time.

Here is the input file we shall use for our initial minimization of solvent and ions:

polyAT_wat_min1.in

polyA-polyT 10-mer: initial minimization solvent + ions
 &cntrl
  imin   = 1,
  maxcyc = 1000,
  ncyc   = 500,
  ntb    = 1,
  ntr    = 1,
  cut    = 10.0
 /
Hold the DNA fixed
500.0
RES 1 20
END
END

The meaning of each of the terms are as follows:

IMIN = 1: Minimization is turned on (no MD)
MAXCYC = 1000: Conduct a total of 1,000 steps of minimization.
NCYC = 500: Initially do 500 steps of steepest descent minimization followed by 500 steps (MAXCYC - NCYC) steps of conjugate gradient minimization.
NTB = 1: Use constant volume periodic boundaries (PME is always "on" when NTB > 0).
CUT = 10.0: Use a cutoff of 10 angstroms. (This is probably overkill here, we could get away with 8.0 or 9.0 angstroms if we wanted but using a larger cutoff does not harm the accuracy of the results. It just makes the calculations slightly more costly.)
NTR = 1: Use position restraints based on the GROUP input given in the input file. GROUP input is described in detail in the appendix of the user's manual. In this example, we use a force constant of 500 kcal mol^-1 angstrom^-2 and restrain residues 1 through 20 (the DNA). This means that the water and counterions are free to move. The GROUP input is the last 5 lines of the input file:

Hold the DNA fixed
500.0
RES 1 20
END
END

Note that whenever you run using the GROUP option in the input you should carefully check the top of the output file to make sure you've selected as many atoms as you thought you did. (Note also that 500 kcal/mol-A**2 is a very large force constant, much larger than is needed. You can use this for minimization, but for dynamics, stick to much smaller values, say around 10.)

    ----- READING GROUP     1; TITLE:
 Hold the DNA fixed

     GROUP    1 HAS HARMONIC CONSTRAINTS   500.00000
 GRP    1 RES    1 TO    20
      Number of atoms in this group  =   638
    ----- END OF GROUP READ -----

We are now ready to run the minimization. Note that we have an extra option on the command line (-ref). This specifies the structure to which we want to restrain the atoms, in this case our initial structure in the rst7 file. We will be using the solvated prmtop and rst7 files we created at the beginning of this tutorial.

$AMBERHOME/bin/sander -O -i polyAT_wat_min1.in -o polyAT_wat_min1.out -p polyAT_wat.prmtop -c polyAT_wat.rst7 -r polyAT_wat_min1.ncrst -ref polyAT_wat.rst7

Input files: polyAT_wat_min1.in, polyAT_wat.prmtop, polyAT_wat.rst7
Output files: polyAT_wat_min1.out, polyAT_wat_min1.ncrst

This takes about 146 s (2 mins, 46 s) on a 1.4 GHz Intel Core i5.

Take a look at the output file from this job. You should notice that there is initially a rather high van der Waals (VDWAALS and 1-4 VDW) energy which represents bad contacts in both the water and DNA. You should also see that the electrostatic energy (EEL) drops very rapidly. This is due to optimization of the favorable water-water and water-DNA electrostatic interactions as the waters re-orientate themselves into a lower energy geometry. The RESTRAINT energy represents the effect of the harmonic restraints we have imposed. It rises rapidly but then levels off.

5.1.4) Minimization Stage 2 - Minimizing the entire system

Now we have minimized the water and ions the next stage of our minimization is to minimize the entire system. In this case we will run 2,500 steps of minimization without the restraints this time. Here is the input file:

polyAT_wat_min2.in

polyA-polyT 10-mer: initial minimization whole system
 &cntrl
  imin   = 1,
  maxcyc = 2500,
  ncyc   = 1000,
  ntb    = 1,
  ntr    = 0,
  cut    = 10.0
 /

Note: Choosing the number of minimization steps to run is a bit of a black art. Running too few can lead to instabilities when you start running MD. Running too many will not do any real harm since we will just get ever closer to the nearest local minima. It will, however, waste CPU time. Since minimization is generally very quick with respect to the molecular dynamics it is often a good idea to run more minimization steps than really necessary. Here, 2,500 should be more than enough.

We run this using the following command. Note that we use the ncrst file from the previous run, which contains the last structure from the first stage of minimization, as the input structure for stage 2 of our minimization. We also no longer need the -ref switch:

$AMBERHOME/bin/sander -O -i polyAT_wat_min2.in -o polyAT_wat_min2.out -p polyAT_wat.prmtop -c polyAT_wat_min1.ncrst -r polyAT_wat_min2.ncrst

Input files: polyAT_wat_min2.in, polyAT_wat.prmtop, polyAT_wat_min1.ncrst
Output files: polyAT_wat_min2.out, polyAT_wat_min2.ncrst

This takes approximately 352.0 s (5 min 52 s) on a 1.4 GHz Intel Core i5.

Lets take a quick look at the structure, in order to ensure it is still sensible, before proceeding:

$AMBERHOME/bin/ambpdb -p polyAT_wat.prmtop -c polyAT_wat.rst7 > polyAT_wat.pdb

$AMBERHOME/bin/ambpdb -p polyAT_wat.prmtop -c polyAT_wat_min2.ncrst > polyAT_wat_min2.pdb

We shall use VMD to visualize the 2 structures (polyAT_wat.pdb, polyAT_wat_min2.pdb) and compare the structure of the DNA at the beginning and after our minimization:

vmd
File -> New Molecule (load the first pdb polyAT_wat.pdb)

DNA in water.

We can make the DNA stand out from the water better:

Graphics -> Representations
Hit "Create Rep" This will create a second representation of our molecule
In the " Selected Atoms " box type: all not water
Then under "Drawing Method" select "NewRibbons"
Next select the representation that still has the style "Lines"
And change the drawing method for this to "Points".

And there you have it, a much clearer representation of our initial DNA structure in the truncated octahedral water box:

An attractive "ribbons" representation of the DNA that will ray-trace nicely.

We can now compare the starting structure and the minimized structure. This time we will instruct VMD not to draw the solvent. The quickest way to do this is to close VMD (File->Quit), and then re-load it. We will specify the first pdb file on the command line so we only need to load the second one manually:

vmd polyAT_wat.pdb

File -> New Molecule Load the polyAT_wat_min2.pdb file

Turn off the water to make it easier to compare the DNAs:

Select Graphics->Representations In the Selected Atoms box enter: all not water and click "Apply"
Now under the Selected Molecule drop down box at the top of the Graphical Representations window select molecule 0: polyAT_wat.pdb and repeat the process above.

A superposition of the starting and minimized structures.

You can see that the structure has changed slightly but there are no major problems so it would appear that the minimization was successful. Note: The non-bonded blue dots are the sodium ions.

5.1.5) Molecular Dynamics (heating) with restraints on the solute

Now that we have successfully minimized our system the next stage in our equilibration protocol is to allow our system to heat up from 0 K to 300 K. In order to ensure this happens without any wild fluctuations in our solute we will use a weak restraint, as we did in stage 1 of the minimization, on the solute (DNA) atoms. Prior to AMBER 8 the recommended method for maintaining temperature was to use the Berendsen thermostat (NTT=1). This method is not very efficient at ensuring the temperature remains even across the system and so one would typically have to use NMR restraints in order to ensure that the heating occurred very gradually over a timescale of about 20 ps. This was essential in order to avoid problems with hot solvent cold solute etc. AMBER now supports the new Langevin temperature equilibration scheme (NTT=3) which is significantly better at maintaining and equalizing the system temperature. As a result, the use of NMR restraints is no longer necessary. In fact it may even be possible to simply start our system at 300 K but we shall remain cautious and start from 0 K.

Our final aim is to run production dynamics at constant temperature and pressure since this more closely resembles laboratory conditions. However, at low temperatures, as our system will be for the first few picoseconds, the calculation of pressure is very inaccurate and so using constant pressure periodic boundaries in this situation can lead to problems. Using constant pressure with restraints can also cause problems so initially we will run 20 ps of MD at constant volume. Once our system has equilibrated over approximately 20 ps we will then switch off the restraints and change to constant pressure before running a further 100 ps of equilibration at 300 K.

Since these simulations are significantly more computationally expensive than the in vacuo and implicit solvent simulations, it is essential that we try to reduce the computational complexity as much as possible. One way to do this is to use triangulated water, that is water in which the angle between the hydrogens is kept fixed. One such model is the TIP3P water model with which we solvated our system. When using such a water model it is essential that the hydrogen atom motion of the water is also fixed since failure to do this can lead to very large inaccuracies in the calculation of the densities, etc. Since the hydrogen atom motion in our DNA is unlikely to effect its large scale dynamics, we can fix these hydrogens as well. One method of doing this is to use the SHAKE algorithm in which all bonds involving hydrogen are constrained (NTC=2). This method of removing hydrogen motion has the fortunate effect of removing the highest frequency oscillation in the system, that of the hydrogen vibrations. Since it is the highest frequency oscillation that determines the maximum size of our time step (1 fs for the in vacuo runs), by removing the hydrogen motion we safely increase our time step to 2 fs without introducing any instability into our MD simulation. This has the effect of allowing us to cover a given amount of phase space in half as much time since we need only 50,000 steps to cover 100 ps in time as opposed to the 100,000 we required with a 1 fs time step.

Note: If your system is very inhomogeneous, you may need to run short MD runs with smaller time steps, say 0.5 fs, before conducting long MD runs with a 2 fs time step. This approach allows the system to rapidly move out of an unfavorable geometry without wild oscillations in the energy.

So, stage 1, 20 ps of MD with weak positional restraints on the DNA. Here is the input file:

polyAT_wat_md1.in

polyA-polyT 10-mer: 20ps MD with res on DNA
 &cntrl
  imin   = 0,
  irest  = 0,
  ntx    = 1,
  ntb    = 1,
  cut    = 10.0,
  ntr    = 1,
  ntc    = 2,
  ntf    = 2,
  tempi  = 0.0,
  temp0  = 300.0,
  ntt    = 3,
  gamma_ln = 1.0,
  nstlim = 10000, dt = 0.002
  ntpr = 100, ntwx = 100, ntwr = 1000
 /
Keep DNA fixed with weak restraints
10.0
RES 1 20
END
END

As mentioned in Section 3, when running production simulations with ntt=2/3 you should always change the random seed (ig) between restarts. If you are using AMBER 10 (bugfix.26 or later) or AMBER 11 or later then you can achieve this automatically by setting ig=-1 in the ctrl namelist. We do not do this here in the tutorial for reproducibility but it is generally recommended that you do this in your calculations.

The meaning of each of the terms are as follows:

IMIN = 0: Minimization is turned off (run molecular dynamics)
IREST = 0, NTX = 1: We are generating random initial velocities from a Boltzmann distribution and only read in the coordinates from the rst7. In other words, this is the first stage of our molecular dynamics. Later we will change these values to indicate that we want to restart a molecular dynamics run from where we left off.
NTB = 1: Use constant volume periodic boundaries (PME is always "on" when NTB>0).
CUT = 10.0: Use a cutoff of 10 angstroms.
NTR = 1: Use position restraints based on the GROUP input given in the input file. In this case we will restrain the DNA with a force constant of 10 angstroms.
NTC = 2, NTF = 2: SHAKE should be turned on and used to constrain bonds involving hydrogen.
TEMPI = 0.0, TEMP0 = 300.0: We will start our simulation with a temperature, derived from the kinetic energy, of 0 K and we will allow it to heat up to 300 K. The system should be maintained, by adjusting the kinetic energy, as 300 K.
NTT = 3, GAMMA_LN = 1.0: The Langevin dynamics should be used to control the temperature using a collision frequency of 1.0 ps^-1.
NSTLIM = 10000, DT = 0.002: We are going to run a total of 10,000 molecular dynamics steps with a time step of 2 fs per step, possible since we are now using SHAKE, to give a total simulation time of 20 ps.
NTPR = 100, NTWX = 100, NTWR = 1000: Write to the output file (NTPR) every 100 steps (200 fs), to the trajectory file (NTWX) every 100 steps and write a restart file (NTWR), in case our job crashes and we want to restart it, every 1,000 steps.

We run this using the following command. Note, we use the ncrst file from the second stage of our minimization since this contains the final minimized structure. We also use this as the reference structure with which to restrain the DNA to:

$AMBERHOME/bin/sander -O -i polyAT_wat_md1.in -o polyAT_wat_md1.out -p polyAT_wat.prmtop -c polyAT_wat_min2.ncrst -r polyAT_wat_md1.ncrst -x polyAT_wat_md1.nc -ref polyAT_wat_min2.ncrst

Input files: polyAT_wat_md1.in, polyAT_wat.prmtop, polyAT_wat_min2.ncrst
Output files: polyAT_wat_md1.out, polyAT_wat_md1.ncrst, polyAT_wat_md1.nc [11.8 MB])

Note: This simulation may take a long time. In parallel on eight 3.6 GHz processors of a AMD FX-8150, it takes approximately 1777 s (29 mins 37 s). You can run this yourself if you want. Alternatively you can just use the output files provided in the links above.

5.1.6) Running MD Equilibration on the whole system

Now that we have successfully heated our system at constant volume with weak restraints on the DNA, the next stage is to switch to using constant pressure so that the density of our water can relax. At the same time we can safely remove the restraints on our DNA, because we are at 300 K .

We will run this equilibration for 100 ps to give our system plenty of time to relax.

polyAT_wat_md2.in

polyA-polyT 10-mer: 100ps MD
 &cntrl
  imin = 0, irest = 1, ntx = 7,
  ntb = 2, pres0 = 1.0, ntp = 1,
  taup = 2.0,
  cut = 10.0, ntr = 0,
  ntc = 2, ntf = 2,
  tempi = 300.0, temp0 = 300.0,
  ntt = 3, gamma_ln = 1.0,
  nstlim = 50000, dt = 0.002,
  ntpr = 100, ntwx = 100, ntwr = 1000
 /

The meaning of each of the terms are as follows:

IMIN = 0: Minimization is turned off (run molecular dynamics)
IREST = 1, NTX = 7: We want to restart our MD simulation where we left off after the 20 ps of constant volume simulation. IREST tells sander that we want to restart a simulation, so the time is not reset to zero but will start at 20 ps. Previously we have had NTX set at the default of 1 which meant only the coordinates were read from the restrt file. This time, however, we want to continue from where we finished so we set NTX = 7 which means the coordinates, velocities and box information will be read from a formatted (ASCII) restrt file.
NTB = 2, PRES0 = 1.0, NTP = 1, TAUP = 2.0: Use constant pressure periodic boundary with an average pressure of 1 atm (PRES0). Isotropic position scaling should be used to maintain the pressure (NTP=1) and a relaxation time of 2 ps should be used (TAUP=2.0).
CUT = 10.0: Use a cut off of 10 angstroms.
NTR = 0: We are no longer using positional restraints.
NTC = 2, NTF = 2: SHAKE should be turned on and used to constrain bonds involving hydrogen.
TEMPI = 300.0, TEMP0 = 300.0: Our system was already heated to 300 K during the first stage of MD so here it will start at 300 K and should be maintained at 300 K.
NTT = 3, GAMMA_LN = 1.0: The Langevin dynamics should be used to control the temperature using a collision frequency of 1.0 ps^-1.
NSTLIM = 50000, DT = 0.002: We are going to run a total of 50,000 molecular dynamics steps with a time step of 2 fs per step, possible since we are now using SHAKE, to give a total simulation time of 100 ps.
NTPR = 100, NTWX = 100, NTWR = 1000: Write to the output file (NTPR) every 100 steps (200 fs), to the trajectory file (NTWX) every 100 steps and write a restart file (NTWR), in case our job crashes and we want to restart it, every 1,000 steps.

We run this using the following command. Note, we use the ncrst file from the first stage of our MD since this contains the final MD frame, velocities and box information.

$AMBERHOME/bin/sander -O -i polyAT_wat_md2.in -o polyAT_wat_md2.out -p polyAT_wat.prmtop -c polyAT_wat_md1.ncrst -r polyAT_wat_md2.ncrst -x polyAT_wat_md2.nc

Input files: polyAT_wat_md2.in, polyAT_wat.prmtop, polyAT_wat_md1.ncrst
Output files: polyAT_wat_md2.out, polyAT_wat_md2.ncrst,(polyAT_wat_md2.nc [58.8 MB])

Note: This simulation may take considerable amount of time to run (about 5 times more than the 20 ps run above). On one 3.4 GHz processor of an Intel Core i7-2600 it takes approximately 6947 s (1 hr 55 min 10 s). You can run this yourself if you want or use the provided output files.

5.2) Analyzing the results to test the equilibration

Now that we have equilibrated our system it is essential that we check that equilibrium has successfully been achieved before moving on to running any production MD simulations in which we would hope to learn something new about our molecule.

There are a number of system properties that we should monitor to check the quality of our equilibrium. These include:

Potential, Kinetic and Total energy
Temperature
Pressure
Volume
Density
RMSd

The following steps will illustrate how to extract this data and plot it and also discuss what we expect to see for a successful equilibration.

5.2.1) Analyzing the output files

In this section we will look at the system properties that can be extracted from the data written to the mdout files during our 120 ps of MD simulation (polyAT_wat_md1.out, polyAT_wat_md2.out). This includes the potential, kinetic and total energies, the temperature, pressure, volume and density.

Lets start by extracting the various properties from our two output files. For this we will use the process_mdout Perl script that was introduced earlier. This script can be given any number of files to process and it will append the results to single summary files:

mkdir analysis
cd analysis
process_mdout.perl ../polyAT_wat_md1.out ../polyAT_wat_md2.out

This should give us a series of summary files in our analysis directory (polyAT_wat_md1+2.summary.tar.gz). Lets plot some of these to see if we have successfully reached an equilibrium. First of all the energies: potential, kinetic and total (= potential + kinetic).

xmgrace summary.EPTOT summary.EKTOT summary.ETOT

The energy of the system: the red line (which is positive) shows the kinetic energy, and the black line (which is negative) shows the potential energy. The green line shows the total.

The important points to note with this plot is that all of the energy terms increased during the first few ps, corresponding to our heating from 0 K to 300 K. The kinetic energy then remained constant for the remainder of the simulation implying that our temperature thermostat, which acts on the kinetic energy, was working correctly. The potential energy, and consequently the total energy since total energy = potential energy + kinetic energy, initially increased and then plateaued during the constant volume stage (0 to 20 ps) before decreasing as our system relaxed when we switched off the DNA restraints and moved to constant pressure (20 to 40 ps). The potential energy then leveled off and remained constant for the remainder of our simulation (40 to 120 ps) indicating that the relaxation was completed and that we reached an equilibrium.

Next we plot the temperature:

xmgrace summary.TEMP

The temperature of the system.

Here we see the expected behavior, our system temperature started at 0 K and then increased to 300 K over a period of about 5 ps. The temperature then remained more or less constant for the remainder of the simulation indicating the use of Langevin dynamics for temperature regulation was successful.

Next the pressure:

xmgrace summary.PRES

The pressure of the system.

The pressure plot is slightly different than the previous plots. For the first 20 ps the pressure is zero. This is to be expected since we were running a constant volume simulation in which the pressure wasn't evaluated. At 20 ps we switched to constant pressure, allowing the volume of our box to change, at which point the pressure drops sharply, becoming negative. The negative pressures correspond to a "force" acting to decrease the box size and the positive pressures to a "force" acting to increase the box size. The important point here is that while the pressure graph seems to show that the pressure fluctuated wildly during the simulation the mean pressure stabilized around 1 atm after about 50 ps of simulation. This is sufficient to indicate successful equilibration.

Next we will consider the volume. Unfortunately, we cannot simply plot the summary.VOLUME file since the first 20 ps do not contain any volume information because the volume of the box is not written to the output file during constant volume simulations. So, we need to remove the first 101 lines of the file. You can use your favorite text editor to do this - I like vi:

vi summary.VOLUME
d100 (removes the current (first) line plus the next 100 lines)
<esc>:wq summary.VOLUME.modified

Here's the modified file (summary.VOLUME.modified).

xmgrace summary.VOLUME.modified

The volume of the system.

Notice how the volume of our system (in angstroms³) initially decreases as our water box relaxes and reaches an equilibrated density, and thus volume. The smooth transitions in this plot followed by the oscillations about a mean value suggest our equilibration has been successful.

Finally, lets take a look at the density, which we expect to mirror the volume. We have the same problem with this summary file as we did with the volume since the density is not written to the output during constant volume simulations. Here is the modified file with the top 101 lines removed (summary.DENSITY.modified).

The density of the system.

This is as expected. Our system has equilibrated at a density of approximately 1.04 g cm^-3. This seems reasonable. The density of pure liquid water at 300 K is approximately 1.00 g cm^-3 so adding a 10-mer of DNA and the associated charges has increased the density by around 4 %.

5.2.2) Analyzing the trajectory

Now we have analyzed our output files and seen that the main system properties suggest that our equilibration has been successful, the next step is to consider the structures. Have they remained reasonable? One useful measure to consider is the root mean square deviation (RMSd) from the starting structure. We can use cpptraj to calculate this for us as a function of time. In this situation we are interested in the backbone of our DNA so we shall consider just the main backbone atoms, P, O3*, O5* C3*, C4* and C5* (P, O3', O5', C3', C4' and C5' in our prmtop). We will conduct a standard (not mass weighted) RMSd fit to the first structure of our MD (the final structure from the minimization). Note, the time is set to 0.2 this time around since although we wrote to the trajectory file every 100 steps, the time step was 2 fs - hence 100 steps = 0.2 ps. Here is the input file for cpptraj:

polyAT_wat_calc_backbone_rms.in

trajin polyAT_wat_md1.nc
trajin polyAT_wat_md2.nc
rms first out polyAT_wat_backbone.rms.in @P,O3',O5',C3',C4',C5' time 0.2

Note: in early versions of AMBER it was often necessary to re-image molecules back into the primary box before calculating RMSd fits. This is no longer necessary since by default AMBER tracks the original molecule, even if it moves out of the central box during the course of the simulation. In early versions of AMBER you would see the parts of the molecule that moved out of the central box re-appear at the opposite edge. Calculating the RMSd of such a structure would result in huge (>20 angstrom) jumps in the results. This is no longer a problem. However, it you want to calculate certain parameters, such as densities, you will need to re-image everything back into the central box. This is also necessary if you want movies of the trajectory to look "good" since otherwise it will look like your water molecules are flying off into space. In reality, the periodic boundary approach results in them being re-introduced on the opposite side of the box, this is just not shown in the trajectory file. How to re-image a trajectory file is covered in the next stage of the tutorial.

Run cpptraj to calculate the RMSd of the DNA backbone atoms (P, O3*, O5* C3*, C4* and C5*):

$AMBERHOME/bin/cpptraj -p polyAT_wat.prmtop -i polyAT_wat_calc_backbone_rms.in

Here is the output file (polyAT_wat_backbone.rms).

xmgrace polyAT_wat_backbone.rms

RMSd time series, for explicit solvent calculation.

Here we see that the RMSd of the DNA backbone atoms remained low for the first 20 ps, due to the restraints. Upon removing the restraints, the RMSd shot up as the DNA relaxed within the solvent. After that, our RMSd is fairly stable with no wild oscillations.

5.2.3) Viewing the trajectory

Next we can take a look at the trajectory using VMD. First of all we will do this without re-imaging back to the primary box. We will then use cpptraj to re-image our trajectory and then look at it again. So, first of all lets load both trajectory files into vmd:

vmd

Step 1 - load the prmtop file (polyAT_wat.prmtop) remembering to select "AMBER7 Parm"
Step 2 - load the two trajectory files (polyAT_wat_md1.nc, polyAT_wat_md2.nc), make sure you unzip them with gunzip first, in turn into the new molecule created when we loaded the prmtop file. Note this time we need to select AMBER Coordinates with Periodic Box as file type since our simulations are now periodic boundary runs with the box information stored in the trajectory file.

Now - rewind to the first frame and press play:

water

There you have it, a water droplet flying through space. Notice what happens to the water as we move through the simulation (see below).

0 ps

120 ps

It appears like the water box is breaking down and the waters are boiling off into space. This would indeed be true if we were not using periodic boundaries. However, as mentioned above, the atom coordinates written to the trajectory files are those of the original atom, even if it migrates outside of the original box and is replaced by one of its images. We can "fix" this by re-imaging our trajectory as we will now do.

5.2.4) Re-imaging the trajectory back into the primary box

Lets use cpptraj to re-image our trajectory files so that only the atoms in the primary box are shown. Here is the cpptraj input file:

polyAT_wat_reimage.ptraj

trajin polyAT_wat_md1.nc
trajin polyAT_wat_md2.nc
trajout polyAT_wat_md_reimaged.nc

center :1-20
image familiar
go

$AMBERHOME/bin/cpptraj -p polyAT_wat.prmtop -i polyAT_wat_reimage.ptraj

This will append the md2 trajectory file to the md1 file, change the trajectory such that the centre of geometry of residues 1 to 20 is placed at the origin and then image all of the coordinates outside of the primary box back to their images inside the box. The familiar keyword makes sure cpptraj writes out the box as a familiar truncated octahedron as opposed to the default, triclinic imaging. Finally the converted trajectory will be written to (polyAT_wat_md_reimaged.nc [51.4 MB]).

You can now open this trajectory file in VMD where you will find our flying water droplet is much more intact. Have a play with the representations in vmd, remove the water and look at just the DNA etc. Observe how the dynamics of the DNA molecule change after the first 20ps when the restraints are switched off. Also, observe how much more stable our DNA chain is now than it was in the in vacuo simulations.

5.3) Summary

You have now covered the techniques necessary to create an initial input structure for a small DNA 10-mer, modify the pdb file to make it compatible with LEaP and then load this into xleap. We have covered how to charge neutralize this structure, solvate it in water and then create the necessary input files for running MD using sander. What to do when you have residues that are not in the default AMBER templates will be covered in later tutorials.

You should now know how to use these input files to run simulations in vacuo, in implicit solvent and in explicit solvent. The concept of minimization and equilibration has been covered and finally you now know how to perform simple analysis of the results and view trajectory files.

The final stage of this tutorial, which is optional, is to use these newly learned techniques in a practical example. One might ask the question: What happens if I start the simulations using the structure for A-DNA? In the last stage of this tutorial we shall try just that.

CLICK HERE TO GO TO SECTION 6