Copyright (C) Pengfei Li & Kenneth M. Merz Jr. 2015

Building bonded model with MCPB

Usage of MCPB is described below.

Usage:

MCPB [flags] [options]

options: -i script file -l log file

flags: -h help -f function list


As an example we go through the process of creating the metal ion force field for PDB ID: 1AMP. This is a structure of the Areamonas proteolytica aminopeptidase, in which there are two Zinc ions in the catalytic center. The entire parameterization process involves five distinct steps:

Procedures
A. PDB preparation
B. Build the sidechain model to obtain the force constant parameters
C. Use the large model to obtain atomic point charges
D. Transfer the xml param and lib files to AMBER frcmod and prep files
E. Create the prmtop and inpcrd files for the system

A. PDB preparation

1. Clean the PDB file: (effort: several minutes)

You can download the PDB file here (1AMP.pdb) or find it in the folder: $AMBERHOME/AmberTools/examples/mtkpp/MCPB/1AMP/data, In this step we delete the header and connectivity information, then rename residue 256 and 935 to HID and MOH respectively. The pdb file after these modifications was renamed 1AMP_OH_fixed.pdb. You can download it here (1AMP_OH_fixed.pdb) or find it in the data folder.

2. Generate the MCPB scripts: (effort: <1minute)

You can download the genMetalFF.sh here or find it in the folder: $AMBERHOME/AmberTools/examples/mtkpp/MCPB/

Next we run the following command to generate the scripts for use in the next steps.

sh genMetalFF.sh -n 1AMP_OH

Here the name of the metal site follows the -n flag. The following files are generated and placed in the working directory (here they are separated by usage):

A. General: 1AMP_OH.README; 1AMP_OH_settings.bcl; 1AMP_OH_addHs.bcl; 1AMP_OH_resNames.bcl.

B. Sidechain Model to calculate the Force Constant: 1AMP_OH_sidechain_fc_md.bcl; 1AMP_OH_sidechain_fc_sem.bcl.

C. Large model to do charge fitting: 1AMP_OH_large_respgen.bcl; 1AMP_OH_large_chg0.bcl; 1AMP_OH_large_chg1.bcl; 1AMP_OH_large_chg2.bcl; 1AMP_OH_large_chg3.bcl; getCharges.shcreateLibraries.sh.

D. From xml file to prep and frcmod file: 1AMP_OH_toAmberFormats_md.bcl; 1AMP_OH_toAmberFormats_sem.bcl.

3. Check the setting files: (effort: several minutes)

The setting file 1AMP_OH_settings.bcl will not be used directly but sourced from other files. It contains the force field information, which will be sourced in the parameterization step. Default is the AMBER FF94 model together with the Generalized Amber Force Field (GAFF). You can change it to other AMBER FFs if you like.

4. Add hydrogen atoms to the PDB file: (effort: several minutes)

In this step we will define the disulfide bonds, define the atom types, assign the connectivity, and afterwards add the hydrogen atoms to the 1AMP_OH_fixed.pdb file. We can run the command:

MCPB -i 1AMP_OH_addHs.bcl -l 1AMP_OH_addHs.bcl.log

We will get the following pdb file, 1AMP_OH_fixed_H.pdb which has hydrogen atoms on the heavy atoms.


B. Build the sidechain model to obtain the force constant parameters

5. Build Sidechain Model: (effort: < one hour)

Here we will build the sidechain model for the bond and angle force constant calculation. This file (1AMP_OH_sidechain.bcl) is not generated automatically but manually. You can check the notes in the file to understand the steps involved. Please make sure the charge and multiplicity is correct in the script (as well for the large model building script we will create later in this tutorial). All the charged groups we have in present model are two Zn2+ ions, one GLU group, two ASP groups and one OH- group. So the total charge is 2×(+2)-1-2×1-1=0, and the multiplicity is 1.

You can run the command:

MCPB -i 1AMP_OH_sidechain.bcl -l 1AMP_OH_sidechain.bcl.log

Then we will get the following files: 1AMP_OH.xml; 1AMP_OH_params.xml; 1AMP_OH_sidechain.sdf; 1AMP_OH_sidechain_opt.com; 1AMP_OH_sidechain_fc.com.

6. Gaussian based Side Chain Model Optimization and Frequency Calculation

The 1AMP_OH_sidechain_opt.com file was generated in step5. We now use it to perform a geometry optimization. This may be the most time-consuming step in the whole process. In order to be consistent with the naming scheme of the log file, which will be used in the subsequent steps, you can use the following command to rename the Gaussian files:

mv 1AMP_OH_sidechain_opt.com 1AMP_OH_sidechain_opt_md.com

sed -i 's/_opt/opt_md/g' 1AMP_OH_sidechain_opt_md.com

mv 1AMP_OH_sidechain_fc.com 1AMP_OH_sidechain_fc_md.com

sed -i 's/_fc/_fc_md/g' 1AMP_OH_sidechain_fc_md.com

Next you run the Gaussian job defined by 1AMP_OH_sidechain_opt_md.com to complete the geometry optimization. Here are two links about how to run Gaussian jobs: Gaussian03 and Gaussian09. From this run you will obtain the following files: 1AMP_OH_sidechain_opt_md.chk and 1AMP_OH_sidechain_opt_md.log files. The chk file will be used in the force constant calculation. Important: It is worthwhile to check all of your Gaussian results to insure you have a model you are comfortable with for the subsequent modeling. The current version of MCPB only supports the force constant calculation output file from Gaussian03 but not Gaussion09. After finishing the optimization job you need to run the gaussian job 1AMP_OH_sidechain_fc_md.com to perform the force constant calculation. Then you will get a new 1AMP_OH_sidechain_opt_md.chk and 1AMP_OH_sidechain_fc_md.log files.

(1) If you use the Z-matrix method, you can use the output log file (1AMP_OH_sidechain_fc_md.log) in a subsequent step.

(2) If you use the Seminario method, what you need to do is to transfer the binary chk file to fchk file by using the following command. The fchk file will be used in a later step.

formchk 1AMP_OH_sidechain_opt_md.chk 1AMP_OH_sidechain_opt_md.fchk

7. Generate final XML Param file:(effort: several minutes)

After finishing the force constant calculation, we can save the bond and angle force constant parameters into the XML Param files.

(1) If you use the Z-matrix method, you can use the following file and run the command:

MCPB -i 1AMP_OH_sidechain_fc_md.bcl -l 1AMP_OH_sidechain_fc_md.bcl.log

You will get the following xml file: 1AMP_OH_params_fc_md.xml.

(2) If you use the Seminario method, you can use the following file and run the command:

MCPB -i 1AMP_OH_sidechain_fc_sem.bcl -l 1AMP_OH_sidechain_fc_sem.bcl.log

You will get the following xml file: 1AMP_OH_params_fc_sem.xml.


C. Use the large model to obtain point charge parameters

8. Add the stand model to the Molecule Library: (effort: < one hour)

Before building the large model, we need to create a standard model in the xml library. It is used to determine if other pdb files contain a particular active site. It will also be useful in subsequent steps. The 1AMP_OH_addStdMol.bcl file is also built by hand. It is similar to the sidechain bcl file but is less labor intensive because there are no longer and capped fragments. If there are less than 5 residues between two ligating residues, the bridging residues between them could be represented by GLY residues while the whole chain are capped by ACE and NME as well.

After creating it, run the command:

MCPB -i 1AMP_OH_addStdMol.bcl -l 1AMP_OH_addStdMol.bcl.log

From this we get the following files: 1AMP_OH_stdMol.xml; 1AMP_OH_addstdMol.sdf.

9. Large Model Building:(effort: < one hour)

In the Large Model each residue will be capped with NME and ACE. If the two residues have exactly one residue between them, the extra residue will be included in the model building. Afterwards, we run the following command:

MCPB -i 1AMP_OH_large.bcl -l 1AMP_OH_large.bcl.log

From this we will get the pdb file and the Gaussian input file for the Merz-Kollman charge calculation: 1AMP_OH_large.pdb; 1AMP_OH_large_mk.com. Reminder: you need to change the keyword IOp(6/33=2) to IOp(6/50=1) in the input file if you use the Gaussian09-C.01 or more recent version.

10. Gaussian Calculation of Merz-Kollman charges:

Next run the Gaussian job defined in the 1AMP_OH_large_mk.com file. Here is the output file: 1AMP_OH_large_mk.log.

11. RESP charge fitting: (effort: several minutes)

In this step we can use the 1AMP_OH_large_respgen.bcl file to generate the two stage RESP charge input files.

MCPB -i 1AMP_OH_large_respgen.bcl -l 1AMP_OH_large_respgen.bcl.log

Afterwards, we perform a RESP charge fitting using the getCharges.sh script in which the espgen and resp in $AMBERHOME/bin/resp will be used. You can specify the charge model you want to use after the ./getCharges.sh command. (There are 0, 1, 2 and 3, which correspond to the ChgMod A, B, C and D respectively), herein we use ChgModA as an example:

sh ./getCharges.sh 0

This results in the following RESP charge output file: 1AMP_OH_large_mk0.resp2.chg.

12. Generate the final XML Lib file:(effort: several minutes)

After obtaining the 1AMP_mk0.resp2.chg file from the previous step, we use the input file 1AMP_OH_large_chg0.bcl to store the RESP charge parameter in the xml lib file.

MCPB -i 1AMP_OH_large_chg0.bcl -l 1AMP_OH_large_chg0.bcl.log

We will get the following xml file: 1AMP_OH_chg0.xml.


D. Transfer the xml param and lib files to AMBER frcmod and prep files

13. Create the AMBER prep and frcmod files:(effort: several minutes)

In this step, we will transfer the XML Param file (containing the force constant information) and Lib file (contains the charge information) to the AMBER frcmod and prep files, which will be used in the final steps creating the entire protein system. Care should be taken of the bcl files in this step, the default charge mod is ChgMod C (corresponds to 2), you can edit it to meet your own requirements. Here we use the ChgMod A (corresponds to 0) instead.

(1) If you are using the Z-matrix method to get the force constants, you run the command:

MCPB -i 1AMP_OH_toAmberFormats_md.bcl -l 1AMP_OH_toAmberFormats_md.bcl.log

You will get the following files: 1AMP_OH_md.frcmod; 1AMP_OH_chg0.prep; metals.frcmod.

(2) If you use the Seminario method to get the force constants, you run the command:

MCPB -i 1AMP_OH_toAmberFormats_sem.bcl -l 1AMP_OH_toAmberFormats_sem.bcl.log

You will get the following files: 1AMP_OH_sem.frcmod; 1AMP_OH_chg0.prep; metals.frcmod.


E. Create the topplogy and coordinate file for the system

14. Change the name of metal bound residues and assign bonding information in the leaprc file: (effort: several minutes)

In this step, we will use MCPB to change the metal bonding residues' names in the pdb file, and determine the bonding information of the system then save it to the leaprc file.

We run the command:

MCPB -i 1AMP_OH_resNames.bcl -l 1AMP_OH_resNames.bcl.log

We will get the following PDB and leaprc files: 1AMP_OH_mcpb.pdb; 1AMP_OH.leaprc.

15. Final modeling in tleap to generate the prmtop and inpcrd files:(effort: several minutes)

Here we need to make several modifications to the 1AMP_OH.leaprc file (obtained in the previous step) before the final modeling.

We manually modify the 1AMP_OH.leaprc from

source leaprc.ff94
frcmodMetals = loadamberparams metals.frcmod
frcmod = loadamberparams 1AMP_OH.frcmod
loadamberprep 1AMP_OH.prep
model = loadpdb 1AMP_OH_mcpb.pdb
bond model.223.SG model.227.SG
bond model.292.ZN model.117.OD2
bond model.292.ZN model.152.OE2
bond model.292.ZN model.256.NE2
bond model.292.ZN model.629.O
bond model.293.ZN model.97.NE2
bond model.293.ZN model.117.OD1
bond model.293.ZN model.179.OD1
bond model.293.ZN model.629.O
solvateoct model TIP3PBOX 8.0
saveamberparm model 1AMP_OH_mcpb.top 1AMP_OH_mcpb.crd
savePdb model 1AMP_OH_mcpb_h2o.pdb
quit

To (herein I used red to specify the parts which have been changed):

source oldff/leaprc.ff94
loadoff ions08.lib
frcmodMetals = loadamberparams metals.frcmod
frcmod = loadamberparams 1AMP_OH_md.frcmod
loadamberprep 1AMP_OH_chg0.prep
model = loadpdb 1AMP_OH_mcpb.pdb
bond model.223.SG model.227.SG
bond model.292.ZN model.117.OD2
bond model.292.ZN model.152.OE2
bond model.292.ZN model.256.NE2
bond model.292.ZN model.629.O
bond model.293.ZN model.97.NE2
bond model.293.ZN model.117.OD1
bond model.293.ZN model.179.OD1
bond model.293.ZN model.629.O
addions model Na+ 0
solvateoct model TIP3PBOX 8.0
loadamberparams frcmod.ionsjc_tip3p
saveamberparm model 1AMP_OH_mcpb.top 1AMP_OH_mcpb.crd
savePdb model 1AMP_OH_mcpb_h2o.pdb
quit

And then perform the command:

tleap -f 1AMP_OH.leaprc > 1AMP_OH_tleap.out

Yielding the final topology, coordinates and pdb files: 1AMP_OH_mcpb.top; 1AMP_OH_mcpb.crd; 1AMP_OH_mcpb_h2o.pdb.

Afterwards, we can start minimization and then your MD simulations!

Enjoy!