(Note: These tutorials are meant to provide illustrative
examples of how to use the AMBER software suite to carry out simulations that
can be run on a simple workstation in a reasonable period of time. They do not
necessarily provide the optimal choice of parameters or methods for the
particular application area.)
HTML Pages
Copyright Ross Walker 2008
Please note: This tutorial will only work with AMBER v10 or later.
1) Before you begin
This tutorial is intended to be a basic guideline for running Replica Exchange (REMD) simulations with AMBER v10. This tutorial assumes that you have experience running MD simulations with AMBER and know about the terminology and typical variables used for standard MD simulations. If you are NOT familiar with setting up a system, equilibration and running MD simulations please consult the more basic tutorials BEFORE attempting to run REMD simulations. You should also be familiar with the REMD theory. Make sure to read the paper by Sugita and Okamoto ("Replica-exchange molecular dynamics method for protein folding." Chemical Physics Letters 314(1-2): 141-151, 1999) and the relevant sections in the AMBER manual to learn about REMD.
2) Setting up the system
In this section we will build the system via Leap and run minimization via Sander. Here a brief description of the system and the procedure used to generate the topology and coordinate files.
For this tutorial we will build the peptide Ala10 in Leap. We will build the linear sequence, and we will use the ff99SB force field and the "mbondi2" radii that are appropriate for the igb=5 option in sander. Here is the input file for leap:
leap.in |
source leaprc.ff99SB set default PBradii mbondi2 m = sequence { ACE ALA ALA ALA ALA ALA ALA ALA ALA ALA ALA NHE } saveamberparm m ala10.prmtop ala10.inpcrd quit |
Have tleap read this file with the following command: "tleap -f leap.in"
This should produce ala10.prmtop and ala10.inpcrd.
Make sure to check the leap output and log file for any errors.
Next we will run minimization to relax the input structure before we run molecular dynamics. The following input file should give a minimized structure suitable for starting MD.
min.in |
energy minimization &cntrl imin=1, maxcyc=1000, ncyc=500, ntwr = 1000, ntpr = 100, cut = 999.0, rgbmax = 999.0, ntb = 0, igb = 5, saltcon = 0.0, / |
$AMBERHOME/exe/sander -O -i min.in -o min.out -p ala10.prmtop -c ala10.inpcrd -r min.rst
3) Determining the number of replicas and temperatures for each replica
Before starting a replica exchange simulation we need to determine the number of replicas needed and the temperatures for each replica for the simulated system. To obtain reasonable exchange probabilities we need to have overlaps between the potential energy distributions of neighboring replicas. Closer temperatures would increase this overlap and exchange probabilities but result in more replicas for the given temperature range. Finding an optimum temperature distribution is difficult and there are many ways to determine the temperatures which are beyond the scope of this tutorial. You are advised to consult the literature for example on calculating optimum temperatures. Here we will just use an example set of temperatures so you understand the mechanics of running REMD simulations.
To determine the number of replicas and temperature distribution one typically needs to determine the number of atoms in the system. This information can be found in both the topology and coordinate files. The first number in a coordinate (or restart) file shows the number of atoms.
Alternatively in the topology file the system description again starts with the number of atoms:
For both cases the number of atoms is highlighted with the blue arrow.
After determining the number of atoms (in this case 109) for the system one can determine reasonable numbers of replicas to use and a reasonable initial temperature distribution. Typically the number of replicas would be related to the square root of the number of atoms and the temperature distribution would be chosen to be a geometric progression. There is much discussion in the literature about this and you are advised to do a thorough literature search.
Usually REMD simulations are run for a temperature range between 270 - 600K. Depending on your system a different temperature range may be required, however, an in-depth discussion of this is beyond the scope of this tutorial. We always need to have an even number of replicas since exchanges are always attempted pair-wise. For the purposed of this tutorial we will use 8 replicas and we will use the following temperature distribution.
269.5 |
You should copy the list of temperatures produced and copy them into a new file called "temperatures.dat" which will be used by the setup scripts later on.
4) Generating chirality restraints
To enhance conformational sampling REMD runs typically need to go to very high temperatures (ca. 500-600K). At such temperatures unwanted rotations around the peptide bond might occur leading to non-physical chiralities. To prevent this we need to use chirality restraints on the backbone. We can use the "makeCHIR_RST" script that is provided with AMBER 10 in the exe directory to do this.
To use this script we need a pdb file for our system. We can obtain the pdb file through the ptraj or ambpdb programs available with AMBER or alternatively we could have had leap create one during the system setup stage.
> $AMBERHOME/exe/ambpdb -p ala10.prmtop < ala10.inpcrd > ala10.pdb
Once we have the pdb file we can generate the restraints file using the following command:
> $AMBERHOME/exe/makeCHIR_RST ala10.pdb ala10_chir.dat
Take a look at this file and make sure you understand what it is doing, you can find more information about distance, angle and dihedral restraints in the AMBER manual.
5) Equilibration
Before starting the REMD simulations we will run short simulations with each replica to equilibrate them to their individual temperatures. This step is not strictly necessary since sander can assign velocities but this step will show how sander can be used to perform multiple independent (non-exchanging) MD runs. Running all replicas using the same inpcrd can also cause instability since they will initially have the same energies and rapidly exchange across temperatures, perhaps causing more distortion in the initial coordinates than is desired. Since we need to independently equilibrate each replica we have two choices. We can either run each equilibration individually, so running eight separate MD simulations or we can use Multisander to perform the task in parallel. Multisander is a mode of running sander that simply runs a single parallel MPI sander job that is generating multiple MD simulations rather than a single simulation. This works very much like running REMD, so setting it up and running it is a good introduction to the usage for REMD. This type of simulation will also run efficiently across a parallel cluster if you have access to one.Generating the input files for multisander
Since we have 8 replicas with temperatures from 269.5 to 570.9 K we need 8 sets of input files for multisander. The script "setup_equilibrate_input.x" will generate the input files (mdin) for us. Take a look at this file and make sure you understand how it works. The purpose of this script is to read the temperatures from the temperatures.dat file and adjust temp0 of each mdin file to match. The script also ensures that the seed for the random number generator is different for each structure. This is very important, especially when running Langevin simulations. Without this all sorts of weird correlations can occur during the MD. This script also sets up the group file for us which will be discussed shortly.
(Note: when running multisander it is important that every file that sander writes to be defined with a unique name. If this is not done then performance can be very very poor. Hence we must specify unique names for mdinfo as well as mdout, rst and mdcrd.)
For this exercise we will run 200ps (nstlim=100000, dt=0.02) of simulation for each replica slowly heating them to their respective temperatures using a Langevin thermostat (ntt=3, gamma_ln=1.0). To read the chirality restraints we use "nmropt = 1" to the restraints on and "DISANG=ala10_chir.dat" defines the file to read the restraints from. Since all simulations will use the same restraints we do not need to duplicate this file. We start by making a single input file called mdin. This file is shown below. Note the " temp0=XXXXX" and "ig=RANDOM_NUMBER", these will be filled in automatically by the setup_equilibrate_input.x script.
equilibrate.mdin |
Equilibration &cntrl irest=0, ntx=1, nstlim=100000, dt=0.002, irest=0, ntt=3, gamma_ln=1.0, temp0=XXXXX, ig=RANDOM_NUMBER, ntc=2, ntf=2, nscm=1000, ntb=0, igb=5, cut=999.0, rgbmax=999.0, ntpr=500, ntwx=500, ntwr=100000, nmropt=1, / &wt TYPE='END' / DISANG=ala10_chir.dat |
Since we will run 8 independent simulations in one multisander run we need a "groupfile" to define the individual input and output files for each simulation. An explanation of the groupfile format can be found in the AMBER manual. Essentially it is just a set of nreplica lines specifying what we would normally have included on the command line for a regular sander run. The setup_equilibrate_input.x script will produce this for us automatically.
Once we have generated the generic mdin file we can then run the setup_equilibrate_input.x to produce the multiple input files and the groupfile.
> ./setup_equilibrate_input.x
This should produce the following output:
8 TEMPERATURE: 269.5 K ==> FILE: equilibrate.mdin.001 TEMPERATURE: 300.0 K ==> FILE: equilibrate.mdin.002 TEMPERATURE: 334.0 K ==> FILE: equilibrate.mdin.003 TEMPERATURE: 371.8 K ==> FILE: equilibrate.mdin.004 TEMPERATURE: 413.9 K ==> FILE: equilibrate.mdin.005 TEMPERATURE: 460.7 K ==> FILE: equilibrate.mdin.006 TEMPERATURE: 512.9 K ==> FILE: equilibrate.mdin.007 TEMPERATURE: 570.9 K ==> FILE: equilibrate.mdin.008 N REPLICAS = 8 Done. |
Along with a groupfile that should be as follows:
equilibrate.groupfile |
-O -rem 0 -i equilibrate.mdin.001 -o equilibrate.mdout.001 -c min.rst -r equilibrate.rst.001 -x equilibrate.mdcrd.001 -inf equilibra te.mdinfo.001 -p ala10.prmtop -O -rem 0 -i equilibrate.mdin.002 -o equilibrate.mdout.002 -c min.rst -r equilibrate.rst.002 -x equilibrate.mdcrd.002 -inf equilibra te.mdinfo.002 -p ala10.prmtop -O -rem 0 -i equilibrate.mdin.003 -o equilibrate.mdout.003 -c min.rst -r equilibrate.rst.003 -x equilibrate.mdcrd.003 -inf equilibra te.mdinfo.003 -p ala10.prmtop -O -rem 0 -i equilibrate.mdin.004 -o equilibrate.mdout.004 -c min.rst -r equilibrate.rst.004 -x equilibrate.mdcrd.004 -inf equilibra te.mdinfo.004 -p ala10.prmtop -O -rem 0 -i equilibrate.mdin.005 -o equilibrate.mdout.005 -c min.rst -r equilibrate.rst.005 -x equilibrate.mdcrd.005 -inf equilibra te.mdinfo.005 -p ala10.prmtop -O -rem 0 -i equilibrate.mdin.006 -o equilibrate.mdout.006 -c min.rst -r equilibrate.rst.006 -x equilibrate.mdcrd.006 -inf equilibra te.mdinfo.006 -p ala10.prmtop -O -rem 0 -i equilibrate.mdin.007 -o equilibrate.mdout.007 -c min.rst -r equilibrate.rst.007 -x equilibrate.mdcrd.007 -inf equilibra te.mdinfo.007 -p ala10.prmtop -O -rem 0 -i equilibrate.mdin.008 -o equilibrate.mdout.008 -c min.rst -r equilibrate.rst.008 -x equilibrate.mdcrd.008 -inf equilibra te.mdinfo.008 -p ala10.prmtop |
As you can see it resembles the regular sander command line options. The -rem 0 specifies that we are using multisander but that do not wish to do actual replica exchange. The script should also produce eight mdin files:
equilibrate.mdin.001 | equilibrate.mdin.002 | equilibrate.mdin.003 | equilibrate.mdin.004 |
equilibrate.mdin.005 | equilibrate.mdin.006 | equilibrate.mdin.007 | equilibrate.mdin.008 |
You should check that each mdin file produced has the correct temperature and that each contains a different value for the random number seed (ig=xxxx).
We are now ready to run the equilibration. When we want sander to run in multisander mode our command line options are somewhat different to a regular run. Firstly we have to use sander.MPI since multisander only runs in parallel. In addition the number of cpus we specify for the mpirun command must be a multiple of the number of replicas, so in this case a minimum of 8. While you can run such a run on a single processor desktop it will generally be much more efficient to run such a simulation on at least eight cores, either an 8 core desktop machine or over a parallel cluster. You can also specify more than eight cpus if more are available. For example specifying 16 cpus will use 2 cpus per replica. The command line I used was as follows (this may need adjusting based on your mpi implementation, queuing system, cluster architecture etc.)
mpirun -np 8 $AMBERHOME/exe/sander.MPI -ng 8 -groupfile equilibrate.groupfile
Here the -ng 8 option specifies that there are 8 groups and the -groupfile specifies the name of the groupfile to be read and indicates to sander.MPI that we want to run in multisander mode.
Depending on the speed of your system this could take anywhere from a couple of minutes to several hours. You can check the progress by opening a separate terminal and tailing one of the mdout files. When it is complete you should have 8 rst files, 8 mdcrd files and 8 mdout files. A tar of them is available here: equilibrate_output.tar.gz
You should check each output file to make sure that it read the restraints correctly, ran without problems, had the correct target temperature etc. You should also visualize the mdcrd files to make sure there are no problems.
Now we have equilibrated our initial structures we can move on to running the actual REMD simulations.
6) REMD Simulations
Running REMD is essentially the same as running multisander. During REMD each multisander job will periodically communicate and attempt an exchange. We need to use additional variables in the input files for each replica and the groupfile to be able to run REMD.
Input files for REMD
As with multisander we will need an input file for each replica. The input files will be the same except for the target temperature (temp0) and the random number seed (ig). There are also some other minor changes compared with non-REMD simulations. Here are the necessary changes and explanations:
- nstlim = 500 During REMD this variable determines the number of MD steps between each exchange attempt. In this case we set it to 500 steps which with a 2fs timestep gives us 1ps between exchange attempts.
- numexchg = 1000 This specifies the number of exchange attempts during the simulation. The total simulation length will be numexch * nstlim * dt which in this case yields 1ns of simulation.
Note that unlike normal MD, trajectory output can be written less frequently than nstlim steps. In other words we can set nstlim=500 and still use ntwx=1000. This allows us to write less frequently than once per exchange (this was not supported in AMBER 9 REMD). This also allows us to have very short exchange frequencies, say once every 10 steps without generating huge trajectory files.
We need to generate the input files again. First we create a generic mdin file:
remd.mdin |
Equilibration &cntrl irest=0, ntx=1, nstlim=500, dt=0.002, irest=0, ntt=3, gamma_ln=1.0, temp0=XXXXX, ig=RANDOM_NUMBER, ntc=2, ntf=2, nscm=1000, ntb=0, igb=5, cut=999.0, rgbmax=999.0, ntpr=100, ntwx=1000, ntwr=100000, nmropt=1, numexchg=1000, / &wt TYPE='END' / DISANG=ala10_chir.dat |
We then have a slightly modified version of the setup script above to build the multiple mdin files and the groupfile.
setup_remd_input.x |
#!/bin/bash -f if [ -f groupfile ]; then rm groupfile fi nrep=`wc temperatures.dat | awk '{print $1}'` echo $nrep count=0 for TEMP in `cat temperatures.dat` do let COUNT+=1 REP=`printf "%03d" $COUNT` echo "TEMPERATURE: $TEMP K ==> FILE: remd.mdin.$REP" sed "s/XXXXX/$TEMP/g" remd.mdin > temp sed "s/RANDOM_NUMBER/$RANDOM/g" temp > remd.mdin.$REP echo "-O -rem 1 -remlog rem.log -i remd.mdin.$REP -o remd.mdout.$REP -c equilibrate.rst.$REP -r remd.rst.$REP -x remd.mdcrd.$REP -inf remd.mdinfo.$REP -p ala10.prmtop" >> remd.groupfile rm -f temp done echo "#" >> groupfile echo "N REPLICAS = $nrep" echo " Done." |
Note that we have made some changes to the groupfile generation part of this script:
-
-rem 1 Run REMD.
-
-remlog rem.log Save the log file into "rem.log" to keep track of each replica.
-
We are using the restart files from the equilibration as the input coordinates for each replica.
Running this script creates the necessary input files:
> ./setup_remd_input.x
remd.mdin.001 | remd.mdin.002 | remd.mdin.003 | remd.mdin.004 |
remd.mdin.005 | remd.mdin.006 | remd.mdin.007 | remd.mdin.008 |
We are now ready to run the replica exchange MD run. We run this just as we did for the equilibration:
> mpirun -np 8 $AMBERHOME/exe/sander.MPI -ng 8 -groupfile remd.groupfile
Depending on the speed of your system this could take anywhere from a couple of minutes to several hours. You can check the progress by opening a separate terminal and tailing one of the mdout files. When it is complete you should have 8 rst files, 8 mdcrd files, 8 mdout files and a rem.log file. A tar of them is available here: remd_output.tar.gz
You should note that the output here are for REPLICAS, not for temperatures. This is a change from AMBER 9 and allows each replica MD to write to the same output files, rather than closing and re-opening the file on each exchange. This greatly improves performance, especially for short exchange frequencies. However, this means that we need to carry out extra processing on the output files in order to obtain data for ensembles sampled at a specific temperature. This processing is described below.
An additional file "rem.log" is also created which has summary information about each replica and exchanges. The log files looks like:
# Replica Exchange log file # numexchg is 1000 # REMD filenames: # remlog= rem.log # remtype= rem.type # Rep#, Velocity Scaling, T, Eptot, Temp0, NewTemp0, Success rate (i,i+1), ResStruct# # exchange 1 1 -1.00 0.00 25.08 269.50 269.50 0.00 -1 2 -1.00 0.00 16.21 300.00 300.00 0.00 -1 3 -1.00 0.00 49.19 334.00 334.00 0.00 -1 4 1.06 0.00 57.15 371.80 413.90 2.00 -1 5 0.95 0.00 66.86 413.90 371.80 0.00 -1 6 -1.00 0.00 60.58 460.70 460.70 0.00 -1 7 -1.00 0.00 67.98 512.90 512.90 0.00 -1 8 -1.00 0.00 93.67 570.90 570.90 0.00 -1 # exchange 2 1 -1.00 244.55 0.95 269.50 269.50 0.00 -1 2 -1.00 251.98 13.77 300.00 300.00 0.00 -1 3 1.06 241.54 31.16 334.00 371.80 1.00 -1 4 0.95 311.46 29.45 371.80 334.00 1.00 -1 5 1.06 297.63 39.26 413.90 460.70 1.00 -1 6 0.95 381.98 34.99 460.70 413.90 0.00 -1 7 -1.00 430.62 54.54 512.90 512.90 0.00 -1 8 -1.00 424.99 80.56 570.90 570.90 0.00 -1 # exchange 3 1 -1.00 238.63 11.17 269.50 269.50 0.00 -1 2 -1.00 291.79 18.81 300.00 300.00 0.00 -1 3 -1.00 325.13 23.99 334.00 334.00 0.67 -1 4 -1.00 371.31 41.80 371.80 371.80 0.67 -1 5 -1.00 375.01 54.78 413.90 413.90 0.67 -1 6 -1.00 364.77 64.48 460.70 460.70 0.00 -1 7 -1.00 449.13 81.14 512.90 512.90 0.00 -1 8 -1.00 599.45 92.80 570.90 570.90 0.00 -1 # exchange 4 1 -1.00 252.84 12.33 269.50 269.50 0.00 -1 2 -1.00 311.67 31.03 300.00 300.00 0.00 -1 3 -1.00 302.79 19.00 334.00 334.00 0.50 -1 4 -1.00 402.99 44.30 371.80 371.80 0.50 -1 5 1.06 398.00 54.78 413.90 460.70 1.00 -1 6 0.95 461.72 64.51 460.70 413.90 0.00 -1 7 -1.00 449.58 72.89 512.90 512.90 0.00 -1 8 -1.00 628.56 111.20 570.90 570.90 0.00 -1 |
The columns provide (in this order) the replica number (corresponding to the order of files listed in the groupfile), the velocity scaling factor if an exchange has occurred (-1 otherwise), the instantaneous system temperature (but note that the thermostat temperature is used for the exchange calculation), the potential energy, the thermostat bath temperature during the previous MD stage, the new bath temperature for the next MD stage, the overall running average success rate between the Temp0 temperature and the next higher temperature (0 to 10, but it can be greater than 1 at the beginning since we divide the # of successes by 0.5* the total attempts since each pair is attempted only every other exchange), and finally a last integer that is used only for reservoir REMD, and is -1 otherwise.
7) Analyzing REMD results
Mdout Files
The mdout file will have additional data in a REMD run that described the information being used for the exchange calculation. An example is shown below, where the normal sander energy output (specified by ntpr) is written, along with information about the replica number (REPNUM) and exchange attempt counter. This section is followed by additional information about the exchange attempt, such as the energy difference between replicas, the Metropolis probability and the random number that was generated for comparison. In this example, the exchange was successful.
=============================================================================== NSTEP = 500 TIME(PS) = 205.000 TEMP(K) = 287.31 PRESS = 0.0 Etot = 96.6216 EKtot = 77.6484 EPtot = 18.9731 BOND = 23.5875 ANGLE = 51.8775 DIHED = 84.7745 1-4 NB = 17.5760 1-4 EEL = 629.2285 VDWAALS = -18.4421 EELEC = -687.3208 EGB = -82.4913 RESTRAINT = 0.1832 EAMBER (non-restraint) = 18.7899 TEMP0 = 269.5000 REPNUM = 1 EXCHANGE# = 5 ------------------------------------------------------------------------------ NMR restraints: Bond = 0.000 Angle = 0.000 Torsion = 0.183 =============================================================================== REMD: myEptot= 18.9731 myTargetTemp= 269.50 mytemp= 287.31 ==========================REMD EXCHANGE CALCULATION========================== Exch= 6 RREMD= 0 Replica Temp= 269.50 Indx= 1 Rep#= 1 EPot= 18.97 Partner Temp= 300.00 Indx= 2 Rep#= 2 EPot= 19.93 Not controlling exchange. Rand= 0.274923E+00 MyScaling= 1.06 Success= T ========================END REMD EXCHANGE CALCULATION======================== REMD: checking to see if bath T has changed: 269.50->300.00 REMD: scaling velocities by 1.055 to match new bath T 300.000 | # of SOLUTE degrees of freedom (RNDFP): 272. | # of SOLVENT degrees of freedom (RNDFS): 0. | NDFMIN = 272. NUM_NOSHAKE = 0 CORRECTED RNDFP = 272. | TOTAL # of degrees of freedom (RNDF) = 272. |
Analyzing REMD trajectories
REMD will produce trajectories for each replica. Each trajectory represents the snapshots sampled in the continuous MD run, which changed in thermostat temperature during the run. Furthermore, this trajectory is not standard Amber format- it has additional information about the temperature at each snapshot so that ptraj can reconstruct the data for individual temperatures (or standard format trajectories for each replica). In the next section we provide scripts and explanations for each of these processes: extracting temperature trajectories in standard format from the entire set of REMD trajectories, and extracting a standard format replica trajectory from an individual REMD trajectory.
Here is an example of a replica format trajectory written by sander. Note the "REMD" lines. Instead of using ptraj to convert to a standard format replica trajectory, one could simply use "grep -v REMD" to remove the REM lines. However, if one wants to obtain temperature-based trajectories or data, then using ptraj to process the REMD format trajectories is required. The "REMD" line gives the following information: replica number, exchange number, MD step number, and thermostat temperature.
REMD 1 4 2000 269.500 4.115 25.625 -7.082 5.088 25.200 -6.837 5.449 24.747 -7.761 4.926 24.435 -6.078 5.977 26.261 -6.288 7.154 26.484 -6.594 5.485 26.960 -5.243 4.582 26.747 -4.844 6.317 28.031 -4.632 6.687 28.768 -5.344 5.345 28.815 -3.711 5.867 29.641 -3.227 4.515 29.203 -4.301 5.025 28.096 -2.958 7.520 27.594 -3.850 8.407 28.412 -3.559 7.513 26.336 -3.456 6.761 25.725 -3.739 8.736 25.654 -2.967 9.600 26.250 -3.263 8.729 25.502 -1.433 8.610 26.505 -1.023 7.878 24.891 -1.131 9.687 25.057 -1.165 8.913 24.288 -3.664 7.972 23.524 -3.695 10.115 24.052 -4.154 10.822 24.713 -3.867 10.388 22.890 -5.039 9.462 22.692 -5.579 11.441 23.331 -6.059 12.302 23.846 -5.632 11.821 22.459 -6.592 10.927 23.890 -6.841 10.650 21.615 -4.156 10.722 21.588 -2.896 10.939 20.520 -4.950 10.833 20.723 -5.933 11.205 19.203 -4.419 10.480 19.162 -3.606 11.004 18.105 -5.433 11.437 17.171 -5.075 9.948 17.892 -5.602 11.563 18.409 -6.318 12.652 19.093 -3.826 13.680 19.414 -4.445 12.888 18.699 -2.555 12.098 18.336 -2.042 14.318 18.608 -2.045 14.882 19.440 -2.467 14.297 18.746 -0.523 15.348 18.705 -0.239 13.856 19.711 -0.271 13.744 17.964 -0.003 15.080 17.340 -2.475 14.441 16.387 -2.918 16.425 17.367 -2.411 16.805 18.275 -2.185 17.266 16.394 -3.028 17.109 16.366 -4.107 18.656 16.752 -2.761 18.874 17.149 -1.769 19.399 15.993 -3.007 18.955 17.534 -3.459 17.015 14.967 -2.481 16.636 14.827 -1.306 17.252 13.910 -3.289 17.711 14.148 -4.156 16.967 12.515 -2.978 15.880 12.436 -2.956 17.435 11.759 -4.266 17.204 10.697 -4.187 17.015 12.244 -5.147 18.519 11.816 -4.366 17.611 12.092 -1.694 16.953 11.377 -0.936 18.900 12.398 -1.471 19.434 12.736 -2.259 19.663 12.239 -0.217 19.055 11.655 0.474 20.932 11.483 -0.647 20.724 10.769 -1.444 21.824 12.038 -0.938 21.282 10.963 0.245 19.785 13.621 0.542 20.875 13.975 0.968 18.624 14.306 0.805 17.810 14.041 0.269 18.620 15.479 1.739 19.469 16.136 1.553 17.397 16.330 1.455 17.506 17.269 1.999 17.341 16.447 0.373 16.510 15.815 1.824 18.781 15.092 3.230 18.626 13.945 3.612 19.210 15.971 4.095 19.435 16.883 3.724 19.343 15.881 5.539 18.743 15.108 6.019 20.826 15.523 5.765 21.290 16.498 5.917 20.800 14.993 6.717 21.265 14.962 4.940 19.116 17.267 6.188 18.864 18.323 5.605 19.189 17.409 7.529 19.504 16.636 8.097 19.014 18.339 7.883 REMD 1 6 3000 269.500 4.389 26.138 -9.505 5.070 25.434 -9.026 5.214 24.609 -9.724 4.602 25.014 -8.137 6.383 26.047 -8.737 7.442 25.556 -9.136 6.354 27.293 |
Extracting replica trajectories
The following script can be used with ptraj to read in the REMD trajectories in the modified format written by sander and then write out standard trajectories that can be used in other programs that read the AMBER format. The script looks like a normal ptraj script, with the exception that it loops over all replicas, and ptraj will automatically recognize that the trajin uses the modified REMD format. If analysis of data from the replica trajectories is desired, this can be done as well. In the example below, a distance is calculated, and each numbered output file will have the time sequence of this distance for the corresponding replica. The trajectory format only needs to be converted for use with external programs- ptraj can directly perform analysis on the modified REMD format and it is not necessary to write out a new trajectory file. This script assumes that ptraj is in your PATH.
create_replica_trajs.x |
#!/bin/csh set i = "1" while ($i < 8) set ext=`printf "%3.3i" $i` ptraj ala10.prmtop << EOF trajin remd.mdcrd.$ext trajout remd.reptraj.$ext nobox distance d1 out dist.rep.$ext :2@CA :8@CA go EOF @ i++ end wait |
Running this script will create the following files:
The output from running the above script should show that ptraj successfully recognized the new REMD format and is processing the data for only the single replica.
INPUT COORDINATE FILES File (remd.mdcrd.008) is an AMBER REMD (new format) trajectory, 0 files total (First index is 000), only frames from this file will be used (remdtraj not specified), each file has 500 sets OUTPUT COORDINATE FILE File (remd.reptraj.008) is an AMBER trajectory ACTIONS 1> DISTANCE: between the atoms selections center of mass will be saved to array named d1 Processing AMBER REMD trajectory (new format) file remd.mdcrd.008 Set 1 ................................................. Set 50 ................................................. Set 100 ................................................. Set 150 ................................................. Set 200 ................................................. Set 250 ................................................. Set 300 ................................................. Set 350 ................................................. Set 400 ................................................. Set 450 ................................................. Set 500 PTRAJ: Successfully read in 500 sets and processed 500 sets. |
Extracting temperature trajectories
A typical use of REMD is to obtain thermodynamic ensembles at temperatures of interest (e.g. 300 K). Thus the replica-based trajectory files need to be converted to sets of snapshots for each temperature. This is done in ptraj by specifying the first replica trajectory in the trajin command, along with a command that a specific temperature should be analyzed. No matter which temperature is desired, the first trajectory in the sequence should be used for the trajin command. Ptraj will then open all of the replica trajectories and extract only those frames corresponding to the temperature that the user requested. For this to function properly, ptraj currently requires that the files are numbered in sequence with a 3 digit extension (e.g. file.000 to file.007). An example is shown below. Note that �remdtraj� is used to signal ptraj that we want a specific temperature analyzed, and �remdtrajtemp� followed by the actual temperature tells ptraj which temperature to use. Note also that the requested temperature must match one of those found in the REMD lines in the modified format (which were set in the temperatures.dat file described above). In this case we will extract data for 300 K, write out all frames at 300 K to a new trajectory, and also perform a distance analysis. As with the replica trajectories, the trajout command is optional and only needed if one wishes to perform analysis outside of ptraj.
extract_300k_traj.x |
#!/bin/csh ptraj ala10.prmtop << EOF trajin remd.mdcrd.001 remdtraj remdtrajtemp 300.0 trajout remd.300K.mdcrd nobox distance d1 out dist.300K.dat :2@CA :8@CA go EOF |
This time the ptraj output should show that it is reading all of the REMD files and is processing the data for a single replica:
INPUT COORDINATE FILES File (remd.mdcrd.001) is an AMBER REMD (new format) trajectory, 8 files total (First index is 001), frames at 300.000000 K will be used, each file has 500 sets OUTPUT COORDINATE FILE File (remd.300K.mdcrd) is an AMBER trajectory ACTIONS 1> DISTANCE: between the atoms selections center of mass will be saved to array named d1 Processing AMBER REMD trajectory (new format) files remd.mdcrd.001 -> remd.mdcrd.008 Opening other REMD files. Opening remd.mdcrd.002 Opening remd.mdcrd.003 Opening remd.mdcrd.004 Opening remd.mdcrd.005 Opening remd.mdcrd.006 Opening remd.mdcrd.007 Opening remd.mdcrd.008 Done. Set 1 ................................................. Set 50 ................................................. Set 100 ................................................. Set 150 ................................................. Set 200 ................................................. Set 250 ................................................. Set 300 ................................................. Set 350 ................................................. Set 400 ................................................. Set 450 ................................................. Set 500 PTRAJ: Successfully read in 500 sets and processed 500 sets. |
This should produce the following output files: dist.300K.dat, remd.300K.mdcrd.
Feel free to visualize the trajectory file in a program of your choice.
It is also possible to automate this kind of analysis across all temperature. The following scrip will loop over all of the temperatures in the temperature.dat file and extract the temperature trajectories and run analysis:
analyze_all_temps.x |
#!/bin/bash TEMPLIST=`cat temperatures.dat` for T in $TEMPLIST ; do ptraj ala10.prmtop <<EOF trajin remd.mdcrd.001 remdtraj remdtrajtemp $T trajout remd.Ttraj.$T go EOF ptraj ala10.prmtop <<EOF trajin remd.Ttraj.$T distance d1 out dist.$T.dat :2@CA :8@CA go EOF done |
This will produce the following files:
8) Consideration for production calculations
Especially when using supercomputing centers where wallclock time for individual jobs are limited (and charged for), the jobs must be frequently restarted for long simulations. It is important to check the restart files before restarting the REMD run, especially if the job was killed before it was completed. You should check the restart file headers and make sure that a complete set exists- meaning that there is a restart file for each temperature. It can happen that the restart files will not all be from the same time index.
Hopefully this tutorial has provided you with the basics of how to setup, run and postprocess REMD simulations. You are recommended to check the literature for example applications of replica exchange MD.
(Note: These tutorials are meant to provide illustrative
examples of how to use the AMBER software suite to carry out simulations that
can be run on a simple workstation in a reasonable period of time. They do not
necessarily provide the optimal choice of parameters or methods for the
particular application area.)
HTML Pages
Copyright Ross Walker 2008