(Note: These tutorials are meant to provide
illustrative examples of how to use the AMBER software suite to carry out
simulations that can be run on a simple workstation in a reasonable period of
time. They do not necessarily provide the optimal choice of parameters or
methods for the particular application area.)
Copyright Ross Walker 2010
6.6 Nudged Elastic Band (NEB) simulations - SECTION 4
By Christina Bergonzo, Carlos Simmerling & Ross Walker
4. Heating the System
For this we will use the multisander functionality, which requires a groupfile input. You can refer to section 3.9.5 of the REMD section in the AMBER 11 manual for details on the groupfile and multisander functionality.
We start by running the following script, which generates the working directory for this part of the NEB simulation, as well as the groupfile input for the initial heating of the system to biological temperature.
The first two lines create a directory called 1HEAT and cd to that directory. This is important since keeping all output from each step of the simulation is necessary to check the progress of the simulation, as well as to give coordinates to the next step in the simulation. In this tutorial, we will create 5 directories corresponding to each step in the NEB simulation, and beginning with the numbers 1 through 5. Each script will perform the same three tasks: create and cd to the working directory, create the groupfile input, and create the input file. Then, we will have to cd into each directory to run sander.MPI separately for each step in the simulation.
We are now ready to run the first stage of our NEB MD simulation, to carefully heat the system up. The first thing we should do is allow our system to warm up from 0 to 300K. We need to be very careful here though as our initial spring forces will be very large. Hence we do this stage with a small spring constant. Here is the input file we will use, we will run 20ps of MD with a 0.5fs time step (nstlim=40000, dt=0.0005). The half femtosecond time step is probably overkill but it will improve the chances of our simulation remaining stable early on. We will not be using shake to constrain hydrogens in this simulation since the interesting region of the pathway will involve changes in hydrogen bond lengths (ntc=1, ntf=1). This means that even once we have our system equilibrated we will be restricted to a time step of at a maximum 1 fs. Since the coordinates of the end points are fixed in Cartesian space in NEB and the intervening structures cannot move far due to the springs there is no need to recenter the coordinates every few hundred steps so we turn this off (NSCM=0). We will use generalised born for the implicit solvent with a sodium chloride salt concentration of 0.2M (igb=1, saltcon=0.2). The system is small enough here that I will avoid any issues with nonbonded cutoffs and truncated Born radii calculations by setting these to 999.0 angstroms (cut=999.0, rgbmax=999.0). This has the effect of turning off any nonbond cutoff.
A note on random seeds: In all of these calculations we are setting the random seed (ig) to be -1. This makes the code obtain the wallclock time in microseconds and use this to seed the random number generator. This way we will not end up using the same random number seed in multiple restarted simulations. Reusing the seed can lead to all sorts of correlation issues when running with langevin dynamics (ntt=3) because each time we restart the set of random forces used for each atom per step will be the same since the random number stream will be the same. The down side is that it means that if you run these calculations yourself you will get different trajectories to what is in this tutorial. You may even end up freezing out along a different low energy pathway to the one we calculate in this tutorial. This is NOT a problem and is part of the reason you should normally run multiple simulations in order to locate alternative pathways. You should just be aware of this when running the tutorial calculations yourself.
The NEB specific options are on the last line (ineb=1 [turn on neb], skmin=10, skmax=10 [spring constants]. We start here with relatively small spring constants of 10 KCal/mol/A2. We will increase this value later once the system has begun to equilibrate. New features for the Amber11 NEB implementation are the tgtfit and tgtrms masks. Read more about these masks in the manual 3.6, Targeted MD. Briefly, the tgtrms mask denotes which atoms we want to apply NEB forces to, and the tgtfit mask denotes which atoms we want to RMS fit the system on. Here, we are setting the tgtfit mask to rms fit to all residues of the alanine dipeptide, and we set the tgtrms mask to the backbone C, CA, and N atoms. Therefore, the NEB forces will only be applied to the backbone atoms of the dipeptide.
For temperature regulation we use the langevin thermostat (ntt=3) and we will use a high collision rate of 1000 ps-1 (gamma_ln=1000). This would be extremely high for a regular MD simulation where values of around 5.0 are more appropriate. However, in NEB we want to ensure that our system explores as much phase space as possible before we cool it back down to 0 K and so we use a higher collision rate to ensure rapid uniform heating, give the system some viscosity and improve the stability. Without the strong thermostat, the calculation would become unstable because the periodic motions involving the springs (and the truncation of the forces according to the path tangent) would blow the system apart. Thus it is essential to use large values of gamma_ln with NEB simulations.
The large value of gamma_ln would normally mean that our system would heat up to the value of temp0 very rapidly, within a few pico-seconds. This could cause instabilities in the simulation and so we will use NMR weight restraints (nmropt=1) to linearly increase the value of temp0 from 0.0 to 300.0 K over 35,000 steps of the 40000 step simulation (istep1=0,istep2=35000, value1=0.0, value2=300.0). This means that at step 0 the value of the target temperature (temp0) will be 0.0 K. At step 5,000 it will be 42.86 K, at 10,000 it will be 85.72 K, by 20,000 it will be 171.44 K and by 35,000 it will be at 300.0 K. It will then remain at 300.0 K until the end of our 40,000 step run. In this way our system should heat up in a controlled fashion.
After running the groupgen-initial.sh script, the resulting input file looks like this:
Alanine NEB initial MD with small K
&cntrl imin = 0, irest = 0, ntc=1, ntf=1, ntpr=500, ntwx=500, ntb = 0, cut = 999.0, rgbmax=999.0, igb = 1, saltcon=0.2, nstlim = 40000, nscm=0, dt = 0.0005, ig=-1, ntt = 3, gamma_ln=1000.0, tempi=0.0, temp0=300.0, tgtfitmask=":1,2,3", tgtrmsmask=":1,2,3@N,CA,C", ineb = 1,skmin = 10,skmax = 10, nmropt=1, / &wt type='TEMP0', istep1=0,istep2=35000, value1=0.0, value2=300.0 / &wt type='END' /
The resulting groupfile looks like this:
-O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.000 -o ./neb.out.000 -inf ./neb.info.000 -r ./neb.r.000 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.001 -o ./neb.out.001 -inf ./neb.info.001 -r ./neb.r.001 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.002 -o ./neb.out.002 -inf ./neb.info.002 -r ./neb.r.002 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.003 -o ./neb.out.003 -inf ./neb.info.003 -r ./neb.r.003 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.004 -o ./neb.out.004 -inf ./neb.info.004 -r ./neb.r.004 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.005 -o ./neb.out.005 -inf ./neb.info.005 -r ./neb.r.005 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.006 -o ./neb.out.006 -inf ./neb.info.006 -r ./neb.r.006 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.007 -o ./neb.out.007 -inf ./neb.info.007 -r ./neb.r.007 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.008 -o ./neb.out.008 -inf ./neb.info.008 -r ./neb.r.008 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.009 -o ./neb.out.009 -inf ./neb.info.009 -r ./neb.r.009 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.010 -o ./neb.out.010 -inf ./neb.info.010 -r ./neb.r.010 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.011 -o ./neb.out.011 -inf ./neb.info.011 -r ./neb.r.011 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.012 -o ./neb.out.012 -inf ./neb.info.012 -r ./neb.r.012 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.013 -o ./neb.out.013 -inf ./neb.info.013 -r ./neb.r.013 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.014 -o ./neb.out.014 -inf ./neb.info.014 -r ./neb.r.014 -O -p ../str1.prmtop -c ../str1.inpcrd -i ./1heat.in -x ./neb.x.015 -o ./neb.out.015 -inf ./neb.info.015 -r ./neb.r.015 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.016 -o ./neb.out.016 -inf ./neb.info.016 -r ./neb.r.016 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.017 -o ./neb.out.017 -inf ./neb.info.017 -r ./neb.r.017 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.018 -o ./neb.out.018 -inf ./neb.info.018 -r ./neb.r.018 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.019 -o ./neb.out.019 -inf ./neb.info.019 -r ./neb.r.019 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.020 -o ./neb.out.020 -inf ./neb.info.020 -r ./neb.r.020 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.021 -o ./neb.out.021 -inf ./neb.info.021 -r ./neb.r.021 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.022 -o ./neb.out.022 -inf ./neb.info.022 -r ./neb.r.022 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.023 -o ./neb.out.023 -inf ./neb.info.023 -r ./neb.r.023 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.024 -o ./neb.out.024 -inf ./neb.info.024 -r ./neb.r.024 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.025 -o ./neb.out.025 -inf ./neb.info.025 -r ./neb.r.025 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.026 -o ./neb.out.026 -inf ./neb.info.026 -r ./neb.r.026 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.027 -o ./neb.out.027 -inf ./neb.info.027 -r ./neb.r.027 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.028 -o ./neb.out.028 -inf ./neb.info.028 -r ./neb.r.028 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.029 -o ./neb.out.029 -inf ./neb.info.029 -r ./neb.r.029 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.030 -o ./neb.out.030 -inf ./neb.info.030 -r ./neb.r.030 -O -p ../str1.prmtop -c ../str2.inpcrd -i ./1heat.in -x ./neb.x.031 -o ./neb.out.031 -inf ./neb.info.031 -r ./neb.r.031
Each line is a sander command for each individual bead's MD simulation. Here we have 32 replicas, with the first 16 using copies of the str1.inpcrd and the second 16 using copies of the str2.inpcrd.
If we were also planning on including some potential intermediate structures we would also name these sequentially, adding them into the groupfile between the endpoint conformations. This would be done in the order in which each of the structures would appear along the transition path.
We can now run this MD using the AMBER v11 Sander.MPI executable. The number of cpus we specify for the mpirun command must be a multiple of the number of replicas, so in this case a minimum of 32. While you can run such a run on a single processor desktop it will generally be much more efficient to run such a simulation on at least 32 cores, either a 32 core desktop machine or over a parallel cluster. You can also specify more than 32 cpus if more are available. For example specifying 64 cpus will use 2 cpus per replica. The command line I used was as follows (this may need adjusting based on your mpi implementation, queuing system, cluster architecture etc.)
mpirun -np 32 $AMBERHOME/exe/sander.MPI -ng 32 -groupfile groupfile
Here the -ng 32 option specifies that there are 32 groups and the -groupfile specifies the name of the groupfile to be read and indicates to sander.MPI that we want to run in multisander mode.
Depending on the speed of your system this could take anywhere from a couple of minutes to an hour. On a dual-core Athlon this step should take about 7 minutes. You can check the progress by opening a separate terminal and tailing one of the mdout files. When it is complete you should have 32 rst files, 32 trajectory files and 32 mdout files.
A tar file of them is available here: 1HEAT.tar.gz
You should check each output file to make sure that it read the input file correctly, ran without problems, had the correct target temperature, etc. You should also visualize the mdcrd files to make sure there are no problems.
Take a look at one of the output files. It looks just like a normal MD except that you now have the energy of each of the replicates listed. This is the energy of each one as it moves along the potential energy surface. You also see the total energy of the replicates. The lowest energy pathway should be the one for which this sum is the smallest. Thus if we perform simulated annealing calculations and obtain several pathways we can compare the overall sum from each pathway. Take a quick look at the output file and check that the simulation looks stable, i.e. the temperature changes as we expect, slowly increasing to 300.0 K and then oscillating around this value. Any huge temperature jumps are signs of problems with our simulation, poor starting structure, poor choice of simulation options etc. You should also take a look at the replicate energies. These show the potential energy of each replica on the pathway. The most important thing to check at this stage is that on step 0 the first 16 have the same energy, that of the beginning of the pathway and the last 16 have a different but identical energy, that of the end of the pathway. If this is not the case then something is wrong with the groupfile and/or inpcrd files.
NSTEP = 0 TIME(PS) = 0.000 TEMP(K) = 0.00 PRESS = 0.0 Etot = -36.0518 EKtot = 0.0000 EPtot = -36.0518 BOND = 0.4957 ANGLE = 1.2919 DIHED = 8.9077 1-4 NB = 3.0084 1-4 EEL = 45.0565 VDWAALS = -1.8571 EELEC= -75.3077 EGB = -17.6469 RESTRAINT = 0.0000 NEB replicate breakdown: Energy for replicate 1 = -36.0518 Energy for replicate 2 = -36.0518 Energy for replicate 3 = -36.0518 Energy for replicate 4 = -36.0518 Energy for replicate 5 = -36.0518 Energy for replicate 6 = -36.0518 Energy for replicate 7 = -36.0518 Energy for replicate 8 = -36.0518 Energy for replicate 9 = -36.0518 Energy for replicate 10 = -36.0518 Energy for replicate 11 = -36.0518 Energy for replicate 12 = -36.0518 Energy for replicate 13 = -36.0518 Energy for replicate 14 = -36.0518 Energy for replicate 15 = -36.0518 Energy for replicate 16 = -36.0518 Energy for replicate 17 = -34.1210 Energy for replicate 18 = -34.1210 Energy for replicate 19 = -34.1210 Energy for replicate 20 = -34.1210 Energy for replicate 21 = -34.1210 Energy for replicate 22 = -34.1210 Energy for replicate 23 = -34.1210 Energy for replicate 24 = -34.1210 Energy for replicate 25 = -34.1210 Energy for replicate 26 = -34.1210 Energy for replicate 27 = -34.1210 Energy for replicate 28 = -34.1210 Energy for replicate 29 = -34.1210 Energy for replicate 30 = -34.1210 Energy for replicate 31 = -34.1210 Energy for replicate 32 = -34.1210 Total Energy of replicates = -1122.7648 ------------------------------------------------------------------------------
We have to do quite a bit of simulated annealing, however, to find the lowest pathway. We also need to increase the strength of our initial 'weak' spring constant. Before we move on though it is interesting and practical to look at the restart files, which we can combine into one trajectory file, in vmd. Here you will see the final structures along the path, after this initial heating phase. The script to combine these restart files into a trajectory using cpptraj is given below.
#!/bin/bash rm cpptraj.in top='../str1.prmtop' for ((i=0; i<=31; i++)); do ext=`printf "%03i" $i` echo "trajin ./neb.r.$ext" >> cpptraj.in done echo "trajout ./heat.restarts.mdcrd" >> cpptraj.in echo "go" >> cpptraj.in $AMBERHOME/exe/cpptraj $top cpptraj.in
Output file: heat.restarts.mdcrd.gz
Take a look at this in VMD using the str1.prmtop file. First, open VMD. Under File on the menu bar, select New Molecule. Browse to the str1.prmtop we have been using for this NEB simulation. Under the Determine file type menu, select AMBER7 Parm. Click Load. Now, we can load the trajectory we have just created using cpptraj on top of this parmtop. Under Browse select the heat.restarts.mdcrd trajectory we just created. Choose AMBER Coordinates under the Determine file type menu.
As you play the trajectory using the play button in the bottom-right corner of the VMD Main screen, notice how all the structures are still grouped around either one or the other starting structure. This is because our spring constant wasn't large enough to pull them over the hill between the two end points. It has removed some of the strain on the system however.
We are now ready to move onto the next stage of the simulation where we will conduct the simulated annealing and equilibration.