(Note: These tutorials are meant to provide illustrative examples of how to use the AMBER software suite to carry out simulations that can be run on a simple workstation in a reasonable period of time. They do not necessarily provide the optimal choice of parameters or methods for the particular application area.)
Copyright McGee, Miller, and Swails 2009

AMBER ADVANCED TUTORIALS
TUTORIAL 3 - SECTION 4

MMPBSA.py

Dwight McGee, Bill Miller III, and Jason Swails

1) Build the starting structure and run a simulation to obtain an equilibrated system.

The system we will model in this simulation is the complex between the human H-Ras protein and the Ras-binding domain of C-Raf1 (Ras-Raf), which is central to the signal transduction cascade. Here is a partially equilibrated, pre-prepared pdb file of the RAS-RAF complex.

ras-raf.pdb

This structure contains the ras and raf proteins and also a physiologically necessary GTP nucleotide as illustrated in the figure below:

For directions on how to build the starting structure and run a simulation to obtain an equilibrated system, please refer to Section 1 and Section 2.

The important files for calculating the binding free energy using MMPBSA.py are the topology files and the mdcrd file (ras-raf_top_mdcrd.tgz)

2) Calculate the binding free energy of Ras-Raf in parallel.

We will now calculate the interaction energy and solvation free energy for the complex, receptor and ligand and average the results to obtain an estimate of the binding free energy. Please note that we will not perform a calculation of the entropy contribution to binding in this part of the tutorial and so strictly speaking our result will not be a true free energy but could be used to compare against similar systems. See Section 5 for an example of using Normal Mode Analysis (Nmode) to calculate the entropy contribution for a system or uncomment out the last line in the general section to perform a Quasi-Harmonic entropy calculation using the ptraj module in AMBER.

We will carry out the binding energy calculation using both the MM-GBSA and the MM-PBSA methods in parallel for comparison with the results obtained from Section 3.1. MMPBSA.py.MPI parallelizes the calculation by assigning an equal number of frames to each thread (process). Thus, it operates most efficiently when the number of frames processed is a multiple of the number of threads started. However, this is not a requirement. If the number of frames is not a multiple of the number of threads, the "leftover" frames will be evenly distributed amongst a subset of the number of threads started. For example, running 50 frames on 3 threads will cause 2 threads to calculate 17 frames and the last thread to calculate only 16. Thus, the third thread will have to wait for the first two to finish their calculations before they can progress. For this reason, 5 threads, for instance, would be a wiser choice (as each thread takes 10 frames). However, the number of threads cannot exceed the number of frames processed or MMPBSA.py.MPI will terminate with an error message. The input file for MMPBSA.py.MPI is identical to the input file for MMPBSA.py:

mmpbsa.in
Input file for running PB and GB
&general
   endframe=50, verbose=1,
#  entropy=1,
/
&gb
  igb=2, saltcon=0.100
/
&pb
  istrng=0.100,
/

The input files for MMPBSA.py.MPI are designed to be similar to the setup of an mdin file used in the sander module of AMBER. The start of each namelist is designated by an ampersand (&) followed by the name of the namelist. Furthermore, a backslash (/) or '&end' can be used to end the namelist. For a complete list of all variables please see the Amber Manual. This input file is divided into three namelists: general, pb, and gb. The general namelist is designed to specify variables that are not specific to a particular part of the calculation, but instead to all parts. In this setup we have defined RAS to be the receptor and RAF to be the ligand. The 'endframe' variable sets what frame of the mdcrd to stop on. The '&gb' and '&pb' namelist markers let the script know to perform MM-GBSA and MM-PBSA calculations with the given values defined within those namelists.

mpirun -np 4 $AMBERHOME/bin/MMPBSA.py.MPI -O -i mmpbsa.in -o FINAL_RESULTS_MMPBSA.dat -sp ras-raf_solvated.prmtop -cp ras-raf.prmtop -rp ras.prmtop -lp raf.prmtop -y *.mdcrd > progress.log 2>&1

or by submitting a script to a queuing system, such as PBS with

qsub parallel.job

The script, parallel.job, would look something like this using a bash shell:

parallel.job
#!/bin/sh
#PBS -N rasraf_parallel
#PBS -o parallel.out
#PBS -e parallel.err
#PBS -m abe
#PBS -M email@address.com
#PBS -q brute
#PBS -l nodes=1:node:ppn=4
#PBS -l pmem=900mb

cd $PBS_O_WORKDIR

mpirun -np 4 MMPBSA.py.MPI -O -i mmpbsa.in -o FINAL_RESULTS_MMPBSA.dat -sp ras-raf_solvated.prmtop \
             -cp ras-raf.prmtop -rp ras.prmtop -lp raf.prmtop -y bigprod.mdcrd > progress.log 2>&1 

This will print the progress of the calculation to the file progress.log. All errors during the calculation will be printed to this file, as well (that is the purpose of 2>&1). Finally, timings will be printed once the calculation has completed showing the time taken during each step of the script.

progress.log
MMPBSA.py.MPI being run on 4 processors
ptraj found! Using /scr/arwen_3/swails/i686/amber11/exe/ptraj
sander found! Using /scr/arwen_3/swails/i686/amber11/exe/sander

Preparing trajectories with ptraj...
50 frames were read in and processed by ptraj for use in calculation.

Starting calculations

Starting gb calculation...

Starting pb calculation...


Calculations complete. Writing output file(s)...
Timing:
Processing Trajectories With Ptraj:       0.126 min.
Total GB Calculation Time (sander):       4.782 min.
Total PB Calculation Time (sander):      28.407 min.
Output File Writing Time:                 0.053 min.

Total Time Taken:                        33.379 min.


MMPBSA Finished. Thank you for using. Please send any bugs/suggestions/comments to mmpbsa.amber@gmail.com

The '-y *.mdcrd' on the command line tells the script to read in all files in the working directory that end in '.mdcrd' and use them as the trajectories to be analyzed.

With 'keep_files' set to its default value of 1, here are all the output files: Parallel_output.tgz.

The script creates three unsolvated mdcrd files (complex, receptor, and ligand) using ptraj that are the coordinates analyzed during the GB and PB calculations. The *.mdout files contain the energies for all frames specified. An average pdb file is created as an average structure if quasi-harmonic entropy calculations are performed with ptraj. All files created by MMPBSA.py.MPI should begin with the prefix '_MMPBSA_' except for the final output files, FINAL_RESULTS_MMPBSA.dat

FINAL_RESULTS_MMPBSA.dat
| Run on Sun Feb 14 19:10:43 EST 2010

|Input file:
|--------------------------------------------------------------
|Input file for running PB and GB in serial
|&general
|   endframe=50, verbose=1,
|   mpi_cmd='mpirun -np 3', nproc=3
|/
|&gb
|  igb=2, saltcon=0.100
|/
|&pb
|  istrng=0.100, 
|/
|--------------------------------------------------------------
|Solvated complex topology file:  ras-raf_solvated.prmtop
|Complex topology file:           ras-raf.prmtop
|Receptor topology file:          ras.prmtop
|Ligand topology file:            raf.prmtop
|Initial mdcrd(s):                bigprod.mdcrd
|
|Best guess for receptor mask:   ":1-166"
|Best guess for  ligand  mask:   ":167-242"

|Calculations performed using 50 frames.
|Poisson Boltzmann calculations performed using internal PBSA solver in sander.
|
|All units are reported in kcal/mole.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

GENERALIZED BORN:

Complex:
Energy Component            Average              Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                  -1863.7944               17.1704              2.4283
EEL                     -17200.7297               75.9366             10.7391
EGB                      -3249.6511               65.2075              9.2217
ESURF                       91.3565                1.3938              0.1971

G gas                   -19064.5240               77.8536             11.0102
G solv                   -3158.2946               65.2224              9.2238

TOTAL                   -22222.8186               51.0216              7.2155


Receptor:
Energy Component            Average              Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                  -1268.1888               14.2342              2.0130
EEL                     -11557.0773               71.7127             10.1417
EGB                      -2532.0669               57.7003              8.1600
ESURF                       64.2843                1.1143              0.1576


G gas                   -12825.2661               73.1118             10.3396
G solv                   -2467.7826               57.7110              8.1616

TOTAL                   -15293.0487               35.3527              4.9996


Ligand:
Energy Component            Average              Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                   -529.3090                9.4198              1.3322
EEL                      -4684.4720               36.1449              5.1117
EGB                      -1688.9631               26.5353              3.7527
ESURF                       37.0493                0.6185              0.0875


G gas                    -5213.7811               37.3522              5.2824
G solv                   -1651.9138               26.5425              3.7537

TOTAL                    -6865.6949               25.8878              3.6611


Differences (Complex - Receptor - Ligand):
Energy Component            Average              Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                    -66.2966                4.2751              0.6046
EEL                       -959.1803               34.9190              4.9383
EGB                        971.3789               33.0497              4.6739
ESURF                       -9.9770                0.3759              0.0532


DELTA G gas              -1025.4769               35.1797              4.9752
DELTA G solv               961.4018               33.0518              4.6742


 DELTA G binding =        -64.0750     +/-      6.3729                 0.9013
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

POISSON BOLTZMANN:

Complex:
Energy Component            Average              Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                  -1863.7944               17.1704              2.4283
EEL                     -17200.7297               75.9366             10.7391
EPB                      -3207.7160               66.4023              9.3907
ECAVITY                     67.8762                0.7818              0.1106

G gas                   -19064.5240             6061.1875            857.1813
G solv                   -3139.8399               66.4069              9.3914

TOTAL                    -7686.8660               52.5400              7.4303


Receptor:
Energy Component            Average              Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                  -1268.1888               14.2342              2.0130
EEL                     -11557.0773               71.7127             10.1417
EPB                      -2483.7242               56.4551              7.9840
ECAVITY                     47.1495                0.4737              0.0670

G gas                   -12825.2661             5345.3320            755.9441
G solv                   -2436.5747               56.4571              7.9842

TOTAL                    -5250.2060               38.5188              5.4474


Ligand:
Energy Component            Average              Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                   -529.3090                9.4198              1.3322
EEL                      -4684.4720               36.1449              5.1117
EPB                      -1670.4169               27.6694              3.9131
ECAVITY                     28.0328                0.4133              0.0584

G gas                    -5213.7811             1395.1865            197.3092
G solv                   -1642.3841               27.6725              3.9135

TOTAL                    -2350.3020               25.1197              3.5525


Differences (Complex - Receptor - Ligand):
Energy Component            Average              Std. Dev.   Std. Err. of Mean
-------------------------------------------------------------------------------
VDWAALS                    -66.2966                4.2751              0.6046
EEL                       -959.1803               34.9190              4.9383
EPB                        946.4251               34.5128              4.8808
ECAVITY                     -7.3062                0.3004              0.0425

DELTA G gas              -1025.4769             1237.6138            175.0250
DELTA G solv               939.1189               34.5141              4.8810



 DELTA G binding =        -86.3579     +/-      8.3264                 1.1775
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

WARNINGS:
igb=2 should be used with mbondi2 pbradii set. Yours are modified Bondi radii (mbondi)

The beginning of the statistics file includes the date/time, a copy of the input file, the best guess for the ligand and receptor masks, the files used by the script, the number of frames analyzed, and which PB solver (if any) was used. The rest of the statistics file includes all the average energies, standard deviations, and standard error of the mean for GB followed by PB. After each section, the ΔG of binding is given along with the error values. The meaning of the different terms in this file is as follows:

VDWAALS = van der Waals contribution from MM.
EEL = electrostatic energy as calculated by the MM force field.
EPB/EGB = the electrostatic contribution to the solvation free energy calculated by PB or GB respectively.
ECAVITY = nonpolar contribution to the solvation free energy calculated by an empirical model.
DELTA G binding = final estimated binding free energy calculated from the terms above. (kCal/mol)

Note that the total gas phase energy has not been reported because the values of the bonded terms for the receptor and ligand should exactly cancel those for the complex using the single trajectory approach. An error message will result if the energies do not cancel to within an allowed tolerance (0.001 kcal/mol).

One would typically expect to find an extremely favorable electrostatic energy and a unfavorable solvation free energy. This symbolises the energy that ones has to use to de-solvate the binding particles and to align their binding interfaces.

From the negative total binding free energy -86.36 kcal/mol we clearly see that this is a favorable protein-protein complex in pure water but keep in mind that the result does not equal the real binding free energy since we did not estimate the (unfavorable) entropy contribution to binding. Note that the GB approach gives a slightly lower binding energy but still suggests that this is a favorable bound state.


CLICK HERE TO GO TO SECTION 3.5

CLICK HERE TO GO TO MMPBSA.py HOME

CLICK HERE TO GO BACK TO THE INTRODUCTION


(Note: These tutorials are meant to provide illustrative examples of how to use the AMBER software suite to carry out simulations that can be run on a simple workstation in a reasonable period of time. They do not necessarily provide the optimal choice of parameters or methods for the particular application area.)
Copyright McGee, Miller, and Swails 2009