Amber Advanced Tutorials - Tutorial 3 - MMPBSA.py

(Note: These tutorials are meant to provide illustrative examples of how to use the AMBER software suite to carry out simulations that can be run on a simple workstation in a reasonable period of time. They do not necessarily provide the optimal choice of parameters or methods for the particular application area.)
Copyright McGee, Miller, and Swails 2009

1) Build the starting structure and run a simulation to obtain an equilibrated system.

The system we will model in this simulation is the protein-ligand complex between the Estrogen Receptor protein and Raloxifene ligand. Here is a pre-prepared pdb file of the complex.

This structure contains the Estrogen Receptor protein along with a ligand called Raloxifene that has been previously docked to the protein as illustrated in the figure below:

This system was constructed in a similar manner to the Ras-Raf system in Section 1. For directions on how to build the starting structure and run a simulation to obtain an equilibrated system, please refer to Section 1 and Section 2 Note that you must also use antechamber to get the correct parameters for raloxifene. See the Sustiva Tutorial (Basic 4) for detailed instructions.

The important files from the MD simuation for calculating the binding free energy using MMPBSA.py are the topology files and the mdcrd file (Est_Rec_top_mdcrd.tgz)

We will now calculate the normal modes for the complex, receptor and ligand and average the results to obtain an estimate of the binding entropy. Please note that the entropy contribution for a system can be calculated by uncommenting out the last line in the general section to perform a Quasi-Harmonic entropy calculation using the ptraj program in AmberTools.

We will carry out the normal mode calculation using the following input file for MMPBSA.py:

The input files for MMPBSA.py are designed to be similar to the setup of an mdin file used in the sander module of AMBER. The start of each section is designated by an ampersand (&) followed by the name of the section. Furthermore, a backslash (/) or '&end' can be used to end the section. For a complete list of all variables please see the Amber Manual. This input file is divided into two namelists: &general and &nmode. The &general namelist specifies variables that are not specific to any particular calculation, but instead to all calculations. In this setup the Estrogen Receptor is the receptor and Raloxifene is the ligand. The 'endframe' variable sets what frame of the mdcrd to stop on when making the dry mdcrd files using ptraj. The 'keep_files' variable allows the user to specify what files are removed at the end of the calculation. The nmode section is used to define the variables specific to the normal mode calculation. The 'nmstartframe' variable defines the frame from the dry mdcrd where normal mode analysis begins. The 'nmendframe' and 'nminterval' setup the end frame and interval for normal mode analysis of the dry mdcrd, respectively. Note that the nmstartframe/endframe/interval variables correspond to the 'trajectory' of snapshots extracted by startframe, endframe, and interval defined in &general. Thus, only a subset of these frames will be chosen for normal mode calculations (at most, each frame in _MMPBSA_complex.mdcrd)

nmode.job

#!/bin/sh
#PBS -N nmode
#PBS -o nmode.out
#PBS -e nmode.err
#PBS -m abe
#PBS -M email@domain.edu
#PBS -q brute
#PBS -l nodes=1:surg:ppn=3
#PBS -l pmem=1450mb

cd $PBS_O_WORKDIR

mpirun -np 3 MMPBSA.py.MPI -O -i mmpbsa_nm.in -o FINAL_RESULTS_MMPBSA.dat -sp 1err.solvated.prmtop -cp complex.prmtop \
             -rp receptor.prmtop -lp ligand.prmtop -y 1err_prod.mdcrd > progress.log

This will print the progress of the calculation to the file progress_nm.log. All errors during the calculation will be printed to this file, as well. Finally, timings will be printed once the calculation has completed showing the time taken during each step of the script. Running this calculation will take approximately 40 hours on 10 frames if run in serial. Note that MMPBSA.py.MPI can be used to run in parallel (see Section 3.4 for details about running this). This will divide the number of frames evenly among the number of threads started as described in Section 3.4. The size of your system significantly affects the computation time and required memory (RAM). Segmentation faults (segfaults) are usually caused by insufficient RAM in your system.

progress_nm.log

ptraj found! Using /share/local/lib/amber10/i686/bin/ptraj
sander found! Using /share/local/lib/amber10/i686/bin/sander (serial only!)
nmode found! Using /share/local/lib/amber10/i686/bin/nmode

Preparing trajectories with ptraj...
50 frames were read in and processed by ptraj for use in calculation.

Starting sander calls


Starting nmode calculations...
Timing:
Processing Trajectories With Ptraj:       0.240 min.
Total Harmonic nmode Calculation Time: 2363.906 min.
Output File Writing Time:                 0.018 min.

Total Time Taken:                      2364.165 min.


MMPBSA Finished. Thank you for using. Please send any bugs/suggestions/comments to mmpbsa.amber@gmail.com

The '-y *.mdcrd' on the command line tells the script to read in all files in the working directory that end in '.mdcrd' and use them as the trajectories to be analyzed.

The script creates three unsolvated mdcrd files (complex, receptor, and ligand) using ptraj that are the coordinates analyzed during the entropy calculations. Normal mode calculations are performed using PDB files for each snapshot, so 10 PDB files are created from the dry mdcrd file for the complex, receptor, and ligand. An average pdb file is created as an average structure for aligning the trajectory for a quasi-harmonic entropy calculation performed by ptraj if entropy == 1. All files created by MMPBSA.py should begin with the prefix '_MMPBSA_' except for the final output file: FINAL_RESULTS_MMPBSA.dat

FINAL_RESULTS_MMPBSA.dat

| Run on Thu Feb 11 12:44:26 EST 2010

|Input file:
|--------------------------------------------------------------
|Input file for running entropy calculations using NMode
|&general
|   endframe=50, keep_files=2,
|/
|&nmode
|   nmstartframe=1, nmendframe=50,
|   nminterval=5,
|/
|--------------------------------------------------------------
|Solvated complex topology file:  1err.solvated.prmtop
|Complex topology file:           complex.prmtop
|Receptor topology file:          receptor.prmtop
|Ligand topology file:            ligand.prmtop
|Initial mdcrd(s):                1err_prod.mdcrd
|
|Best guess for receptor mask:   ":1-240"
|Best guess for  ligand  mask:   ":241"
|Ligand residue name is "RAL"
|
|Calculations performed using 50 frames.
|Poisson Boltzmann calculations performed using internal PBSA solver in sander.
|
|All units are reported in kcal/mole.
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
ENTROPY RESULTS (HARMONIC APPROXIMATION) CALCULATED WITH NMODE:

Complex:
Entropy Term          Average        Std. Dev.
-----------------------------------------------------------
Translational:        16.9389           0.0000
Rotational:           17.3953           0.0038
Vibrational:        2784.7967           2.3982
Total:              2819.1307           2.3964


Receptor:
Entropy Term          Average        Std. Dev.
-----------------------------------------------------------
Translational:        16.9233           0.0000
Rotational:           17.3911           0.0045
Vibrational:        2755.5693           3.0352
Total:              2789.8840           3.0342


Ligand:
Entropy Term          Average        Std. Dev.
-----------------------------------------------------------
Translational:        13.2972           0.0000
Rotational:           11.4991           0.0496
Vibrational:          33.7549           0.0442
Total:                58.5511           0.0058


DELTA S total=       -30.0192 +/-       3.4064

NOTE: All entropy results have units kcal/mol. (Temperature has already been multiplied in as 300. K)
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

The beginning of the output file includes various details about the calculation. The rest of the output file includes the averages, standard deviations, and standard error of the mean for each of the translational, rotational, vibrational, and total entropy contributions. After those sections, the ΔS is given along with the standard deviation and std. error of the mean.

One would typically expect to find a negative ΔS value for a biological complex. This symbolizes the decrease in available microstates as the protein and ligand bind to make the complex. The decrease in available microstates mainly arises from the ligand being trapped and having limited mobility while being bound to the protein.

From the negative ΔS value -30.02 kcal/mol we clearly see that this is an entropically unfavorable protein-ligand complex in pure water but keep in mind that the result does not equal the real binding free energy since we did not estimate the (favorable) enthalpic contribution to binding.