Stage 3: Fitting charges
The Restrained Electrostatic Potential (REsP)
fitting approach (see
here for a reference of particular
relevance) is an established means for placing partial charges on atoms at the
right intensities to mimic the electrostatic field outside of the molecular
surface. One minor difference between the fitting method in mdgx
and REsP is the use of harmonic, rather than hyperbolic, restraints to hold
charges to small values in the absence of strong data to the contrary. (This is
not because I don't like hyperbolic restraints--on the contrary, I plan to
implement them soon.) More importantly, unless explicitly told to use
restraints, mdgx is really just doing ElectroStatic
Potential fitting, or ESP--in theory, with a very large data set, no
restraints are needed. Both mdgx and IPolQ are designed to
function with bigger data sets that new computers make possible.
Another significant thing that the IPolQ procedure changes is the target:
rather than the common HF/6-31G* calculation, IPolQ takes higher-level QM
calculations with inclusion of a time-averaged solvent density, and fits two
sets of charges simultaneously with weak coupling between the two sets. REsP
isn't linked to HF/6-31G*, but the extension for simultaneously fitting two
charge sets is novel. (The motivation for this approach is described in the
publications: briefly, these methods are prone to under-determined solutions,
particularly when some atoms shield others deep within the molecular surface,
and we want to select two sets of charges that make similar choices for
under-determined parts of the system.)
Recall that in the previous stage we computed grids
IPolQgrd<conformation>.vacu and
IPolQgrd<conformation>.solv for six conformations. We will
now fit charges for the molecule. The following mdgx &fitq
input will fit charges, and like the &ipolq namelist in Stage
2 there are lots of parameter options to address details of a procedure that is
conceptually simple. The basic question is "what set of atomic partial
charges, if placed at the coordinates of atoms in the molecule, will best mimic
the electrostatic potential that the molecule projects beyond its surface?"
A simple least squares fitting procedure will give us charges, but for buried
atoms the partial charges could come out quite large. We therefore need some
mechanism of restraining the charges, which I touched on in the previous
paragraph. Aside from that, there's an issue of what region of space we
consider important: far away from the molecule, there'll be no trouble getting
the potential right as long as the overall dipole is correct, but very close to
the nucleus of any solute atoms a point monopole will do a very bad job of
getting the electrostatics of the distributed quantum electron density. Our
solution is to select points where solvent molecules could reasonably be
expected to explore in a simulation, as judged by the van-der Waals clashes
that an oxygen probe atom, with a tiny hydrogen attached, could approach until
incurring some modest energy penalty. Even if the oxygen "probe" can only get
so far, the attached hydrogen might be able to go about an Ångstrom
further in, so points that could be reached by the hydrogen "probe arm" should
be counted as well. Also important: don't take a big clump of points from just
one area, spread them out. That's what the following input says:
&files
-p Glycerol.top
-o fit.out
&end
&fitq
ipolq Conf1/IPolQgrd1.vacu Conf1/IPolQgrd1.solv
Glycerol.top 1.0
ipolq Conf2/IPolQgrd2.vacu Conf2/IPolQgrd2.solv
Glycerol.top 1.0
ipolq Conf4/IPolQgrd4.vacu Conf4/IPolQgrd4.solv
Glycerol.top 1.0
ipolq Conf5/IPolQgrd5.vacu Conf5/IPolQgrd5.solv
Glycerol.top 1.0
ipolq Conf7/IPolQgrd7.vacu Conf7/IPolQgrd7.solv
Glycerol.top 1.0
ipolq Conf8/IPolQgrd8.vacu Conf8/IPolQgrd8.solv
Glycerol.top 1.0
% General conditions for the REsP
pnrg 0.0
flim 0.39
nfpt 3750
MaxMemory 4GB
% Equivalencies on atoms go here
equalq '@C1,C3'
equalq '@O1,O3'
equalq '@H11,H12,H31,H32'
equalq '@HO1,HO3'
minq '@C2'
&end
The biggest block of input here is the section of ipolq lines.
These provide the names of each grid for a particular conformation of the
Glycerol.top system (note that the topology of the solute alone is given). The
number at the end of each line is the weight assigned to each conformation.
Use higher values to make one conformation more important in the fitting. Note
that the coordinates are read from the grid files, and must match the order of
atoms in the topology file.
In the input above, the keyword pnrg ("probe energy") tells
mdgx to include points if they are accessible within 0.9572Å
of points at which a TIP3P oxygen could be placed without incurring any
van-der Waals penalty (the TIP3P-related values are defaults, but can be
changed if the user wants--see the keywords parm / ProbeArm ,
psig / ProbeSig , and peps / ProbeEps ). The keyword
flim ("fitting limit") tells mdgx to search through
each electrostatic potential grid for viable points but exclude new
points from fitting if they are closer than 0.39Å to any of those
already selected (this will not be an issue with the coarse grids we are using
in this tutorial, but would demand that points be at least two spacings
apart in a calculation with 0.2Å grids). The MaxMemory
keyword in this namelist should not be confused with the eponymous keyword
in the &ipolq namelist: here, it tells mdgx
itself how much memory it can allocate for its matrix equations when solving
the IPolQ least-squares problems.
Another important aspect of the input is the equalq keyword.
This keyword may be given as many times as is necessary, each instance
specifying a group of partial charges that should have the same value.
mdgx will treat all partial charges named in each
ambmask string (see the Amber manual for a description of the
format) as a single variable in its fitting problem. The minq
keyword is what applies harmonic restraints to atoms (again, in the absence of
minq , mdgx is merely doing
ElectroStatic Potential fitting).
The collection of grid files for this exercise can be found
here. The output generated by
mdgx is here. In section 2.) of
that output the resulting charges are listed. The charges in the rightmost
column, IPolQ(pert) , are the ones we want (IPolQ charges expressed
as a perturbation of the vacuum phase charges). Currently, these should be
copied manually into the original library file
Glycerol.prepi , a result
which can be found here,
although there is a plan to have mdgx print its own updated
library files to save this tedium. In addition to the new file with the IPolQ
charges, we may want to fit bonded parameters to vacuum phase data, in which
case we would want to make a
second library file containing the
vacuum charges found in the second column of the mdgx output. In
the final stage, we'll discuss iterations of the IPolQ procedure to get
converged charge sets, but this set of vacuum phase charges will not
change.
In practice: Let's consider the case of fitting charges for two
molecules that should share the same charges on particular atoms. In this
case we would include more ipolq lines with each system
(separate topology files) and have more elaborate ambmask
strings to specify atoms in equalq masks spanning both systems.
This can work for any number of molecules, as long as your computer has
enough memory to handle the big matrices it might generate. By default,
mdgx will check each system's topology to know what total charge
it needs to place on each system, but there are options to override that, too.
Users can run $AMBERHOME/bin/mdgx -FITQ for a full list of
the options in charge fitting.
Proceed to Stage 4,
go back to Stage 2, or return to the
introduction.
|