Generating Force Field Parameters with Paramfit
Formerly known as Parmfit Tutorial
By Robin M. Betz
TABLE OF CONTENTS
IntroductionSystem setup
Generating conformations
Conduct quantum calculations
Derive K
Define parameters to fit
Visualise structure quality
Conduct single molecule fit
Weight individual structures
Multiple molecule fits
Conduct quantum calculations
The next step is to calculate either the energy or forces on each conformer at a quantum level of theory. For this tutorial, we will perform a single-point energy calculation on each structure.
Paramfit can assist you in creating quantum input files if you are using Gaussian, GAMESS, or ADF. All options to Paramfit are specified in the job control file. The easiest way to create this file is to run:
$AMBERHOME/bin/paramfitIf no job control file is specified (-i argument), Paramfit defaults to running a wizard that aids in the creation of the job control file. Follow the prompts, and select that you want to create input files for a quantum program. The resulting job control file will look something like this, sans comments:
job_quantum.in
# Use the first 20 structures in the input trajectory format file NSTRUCTURES=20 COORDINATE_FORMAT=TRAJECTORY # We want to create quantum input files for the Gaussian package RUNTYPE=CREATE_INPUT QMFILEFORMAT=GAUSSIAN # Before the coordinates section in the input file, prepend the text in # gaussian.header, which will specify the run options QMHEADER=gaussian.header # Name the files quantum_inpts/quantum_A###.com QMFILEOUTSTART=quantum_inputs/quantum_A QMFILEOUTEND=.com
In order to specify the options for Gaussian, create the following header file. This file will appear at the top of each input file to Gaussian, and specifies calculation options. Refer to the Gaussian (or other quantum package) documentation for what to put in this file. Do not include a title line- Paramfit will write one for you.
gaussian.header%RWF=job.rwf %Chk=job.chk %NoSave %Mem=2GB #SP b3lyp/6-31G* geom=connectivity
This file specifies a single-point energy calculation, discarding the checkpoint and scratch files, with a 2GB memory limit. The options you specify in this file will depend on available computing power and desired level of theory.
The selection of a level of quantum theory and basis set is up to you. This tutorial uses B3LYP and the 6/31G* basis set so that the calculations will complete quickly. There is no right answer to this question, and careful thought needs to be given to avoid biasing the resultant parameters. Paramfit treats the quantum data as "truth," so your calculations should be as accurate as possible given your time and computational constraints!
The general syntax for running Paramfit for a single topology is:
paramfit -i <job control> -p <prmtop> -c <mdcrd> [-q <quantum>]Where <quantum> is a list of ab-initio quantum energies that we have yet to obtain. Invoke Paramfit using our input files:
paramfit -i job_quantum.in -p a.prmtop -c A_valid_structures.mdcrd \ > paramfit_quantum.outAnd molecular coordinates will be written to 20 Gaussian input files.
Output: (Yours may differ)
paramfit_quantum.out
Quantum inputs not shown here
Note in this output there are quite a few warnings about poor dihedral sampling, including for the C-N-CT-C and N-CT-C-N dihedrals that we are fitting. This means we need to take care that our input structures are high quality. A way of visualizing this will be shown on the next page.
NOTE: Gaussian requires connectivity information for each molecule at the end of the input file. Paramfit will write this information for you, but due to the way the AMBER topology format works, cannot tell bond order and assumes all bonds are single bonds. If there are double or triple bonds in your structure, take care that this section is correct. You may have to edit your files manually or create them yourself.
Now run Gaussian on each input file. It is important to keep the naming consistent, as the order of the retrieved energies needs to match the order of the structures in the coordinate file. A script such as the following will work:
#!/bin/bash mkdir quantum_outputs for i in `ls quantum_inputs/*A*.com`; do echo $i g09 < $i > ${i/inputs/outputs}.gout done
NOTE: This is the most time consuming step of the fitting process. This test system was created to be very small, but still took 2 minutes per structure of computing time on 4 3.40 GHz cores with 8 GB of memory. To speed up this step, you can decrease the level of theory or basis set size, or calculate on fewer structures. However, this risks the derivation of poor quality parameters, so pay careful thought to your choices for this step.
Process Quantum Output Files
Now that the quantum runs are complete, the final energy must be extracted from each file and placed into a quantum energy data file that contains the energy of each structure, one per line, in the same order as the coordinate file.
A script such as the following may be used to extract the energies. Note that the loop in the script goes through the output files in order.
#!/bin/bash for ((i=0;i<20;i++)); do echo $i grep "SCF Done" "quantum_outputs/quantum_A${i}.com.gout" | awk '{print $5}' \ >> quantum_A.dat doneOutput: (Yours may differ)
quantum_A.dat
NOTE: If you use this energy file instead of conducting quantum calculations yourself, make sure to also use the corresponding structure set, as the quantum energy is specific to each structure and the structure set is generated by a random process, so your will be different.