pKa Calculations using Thermodynamic Integration

(Note: These tutorials are meant to provide illustrative examples of how to use the AMBER software suite to carry out simulations that can be run on a simple workstation in a reasonable period of time. They do not necessarily provide the optimal choice of parameters or methods for the particular application area.)
Copyright Ross Walker 2006

pKa Calculations using Thermodynamic Integration - SECTION 4

pKa Calculations Using Thermodynamic Integration

By Ross Walker & Mike Crowley

4) Run thermodynamic integration on the thioredoxin protein.

Before we actually start doing the thermodynamic integration runs we should quickly minimise our system to remove any bad contacts created from the hydrogenation in Leap. One thing to note however is that because we are doing thermodynamic integration there is the requirement that the coordinate files for the end points of the two states are identical. Thus we have at the moment a thio_ash.inpcrd and a thio_asp.incrd which while they have different names are identical. Now if we minimise both of these independently we will end up with two different structures which we won't be able to use for doing thermodynamic integration. However, as our aim here is just to clean up the structure before we start MD we can just run one of the states and then use the resulting structure for both the ASP and ASH starting structures. In the paper they don't mention the value of the nonbond cut off used and so we have to assume that they did not use a cut off. Typically in GB simulations one would want to avoid the use of a cut off if possible. If a cut off has to be used for computational expense reasons this should ideally be 18 angstroms or greater. We will also set the cut off for the radii calculation to be larger than the system as well.

thio_min.in

minimise thioredoxin
&cntrl
imin=1, maxcyc=500, ncyc=250,
igb=5, ntb=0, ntf=2, ntc=2,
cut=999.0, rgbmax=999.0,
ntpr=50
/

$AMBERHOME/exe/sander -O -i thio_ash_min.in -o thio_ash_min.out -p thio_ash.prmtop -c thio_ash.inpcrd \
-r thio_ash_min.rst

Output Files: thio_min.out, thio_ash_min.rst

The next step is to run the thermodynamic integration. We will do this in the same fashion as we did for the model system. In the paper they ran a heating phase and an equilibration phase with harmonic restraints to the crystal structure that they gradually reduced during the equilibration phase. Overall they ran some 6 ns or so of simulation. We can't afford to do this in this tutorial and so for our purposes we will just do a hot start at 300K and run the 5 lambda value simulations for a total of 1 ns each. We will then look at each simulation and likely discard the first 200ps or so as being equilibration. In a real calculation one would want to equilibrate much more carefully and also run a lot longer production run.

thio_step2.in

 thioredoxin ASH to ASP for pKa calculation
 &cntrl
   nstlim = 1000000, nscm=2000,
   ntx=5, irest=1, ntpr=1000,
   tempi=300.0, temp0=300.0, ntt=3, gamma_ln=5.0,
   ntb=0, igb=5, cut=999.0,
   dt=0.001, 
   ntc=2, ntf=2, 
   ntwr = 10000, ntwx=1000, ntave=1000
   icfe=1, clambda=0.11270,
 /

thio_groupfile.2

-O -i thio_step2.in -p thio_ash.prmtop -c thio_step1_ash.rst \
-o thio_step2_ash.out -inf ash.mdinfo -x thio_step2_ash.mdcrd \
-r thio_step2_ash.rst
-O -i thio_step2.in -p thio_asp.prmtop -c thio_step1_asp.rst \
-o thio_step2_asp.out -inf asp.mdinfo -x thio_step2_asp.mdcrd \
-r thio_step2_asp.rst

You should run the above calculation for all 5 values of lambda. You can script them. E.g.:

mpirun -v -machinefile $PBS_NODEFILE -np 128 /usr/local/apps/amber9/exe/sander.MPI \
                                             -ng 2 -groupfile thio_groupfile.1
mpirun -v -machinefile $PBS_NODEFILE -np 128 /usr/local/apps/amber9/exe/sander.MPI \
                                             -ng 2 -groupfile thio_groupfile.2
mpirun -v -machinefile $PBS_NODEFILE -np 128 /usr/local/apps/amber9/exe/sander.MPI
                                             -ng 2 -groupfile thio_groupfile.3
mpirun -v -machinefile $PBS_NODEFILE -np 128 /usr/local/apps/amber9/exe/sander.MPI
                                             -ng 2 -groupfile thio_groupfile.4
mpirun -v -machinefile $PBS_NODEFILE -np 128 /usr/local/apps/amber9/exe/sander.MPI
                                             -ng 2 -groupfile thio_groupfile.5

Note on 128 cpus of a myrinet connected 1.7GHz Itanium 2 cluster this calculation takes approximately 25 hours. Now perhaps you see why we only ran 1 ns. You can download a tar file of the output here: thio_calc_results.tar.gz.

Once again we need to check for convergence in the cumulative DV/DL value. We will discard the first 200ps of the data from each run as being equilibration. As before we only actually need points 2,3 and 4, you can obtain these by using the awk script from earlier:


	2_800_dvdl.dat 3_800_dvdl.dat 4_800_dvdl.dat

As can be seen from the above graphs, especially for the lambda = 0.5 and 0.872 cases we have not yet reached convergence for the cumulative DV/DL average. In fact the presence of multiple states, as noted in the paper, is obvious. Thus to obtain accurate results we would need to carry out significantly more equilibration. However, for the purposes of this tutorial we shall proceed with what we have.

Calculating deltaG(model) by Gaussian Quadrature

As for our model system we will now calculate the value of deltaG(model) by estimating:

based on the our results from above. That is we carry out Gaussian quadrature on:

The following table summarizes the data we require which is the final line of 2-4_800_dvdl.dat which is the cumulative average over each 800 step block. You should refer to page 155 of the Amber 9 manual for more information on the origin of the Lambda and W_i values:

Step	2	3	4
<DV/DL>	-49.8394	-53.6001	-62.1712
Lambda	0.1127	0.5000	0.8873
W_i	0.27777	0.44444	0.27777

Thus the value of the integral above is:

deltaA = (-49.8394 * 0.27777) + (-53.6001 * 0.44444) + (-62.1712 * 0.27777) = -55.0886 kcal / Mol

In this case deltaA = deltaG thus our value for the deltaG(protein) is -55.0886 kcal/Mol.

Calculation of deltadeltaG and pKa shift

Now we have our deltaG(model) and deltaG(protein) we can calculate deltadeltaG by refering to the thermodynamic cycle show at the beginning of this tutorial:

Thus:

deltadeltaG = -55.0886 - -59.8807 = 4.7921 kcal/mol

If you compare this to paper you will find that they calculated a deltadeltaG for ASP26 of 7.1 in implicit solvent. This is obviously larger than ours, probably because we didn't sample for long enough. However, it is interesting to note that the experimental value is 4.8 and so we are spot on. Although I think this is more through luck than judgement.

From this deltadeltaG we can calculate the pKa shift using the following equation (obviously just ignoring the pKa(model) term):

= (1/2.303kT)*4.7921

= 1/(2.303*0.001987*300)*4.7921

pKa shift (ASP26) = 3.49

Thus we calculate the pKa shift of ASP26 as caused by the thioredoxin protein matrix to be 3.49. This is essentially spot on with experiment. Although we were probably just lucky here.

RETURN TO INDEX