GPU Accelerated Features of PMEMD in Amber
The GPU accelerated version of pmemd is implemented in Amber but not in AmberTools.
The code supports explicit solvent PME or IPS simulations in the NVE, NVT, and NPT
ensembles as well as implicit solvent
Generalized Born simulations. It has been designed to support as many of the standard PMEMD
features as possible. For a full list of features, see the pmemd section of the Amber manual.
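For orientation, a minimal explicit solvent NPT input for pmemd.cuda might look like the sketch below. The step counts, cutoff, and thermostat settings are illustrative only and should be adjusted for the system at hand.

    Explicit solvent NPT production (illustrative settings)
     &cntrl
       imin=0, irest=1, ntx=5,                    ! restart with coordinates and velocities
       nstlim=500000, dt=0.002,                   ! 1 ns with a 2 fs time step
       ntc=2, ntf=2,                              ! SHAKE on bonds involving hydrogen
       cut=8.0,                                   ! direct space cutoff in Angstroms
       ntb=2, ntp=1,                              ! constant pressure with isotropic scaling
       ntt=3, gamma_ln=2.0, temp0=300.0, ig=-1,   ! Langevin thermostat, random seed from clock
       ntpr=5000, ntwx=5000, ntwr=50000,          ! energy, trajectory and restart frequencies
     /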
Some Features
- Thermodynamic Integration, FEP and MBAR support
- Metropolis Monte Carlo constant pH
- All-atom PME continuous constant pH MD
- Constant redox MD
- REMD (T, H, pH, redox, coupled redox-pH, multi-dimensional, reservoir)
- Constant pressure REMD
- Expanded umbrella sampling support
- 12-6-4 LJ nonbonded potentials for metal ions
- Gaussian accelerated molecular dynamics
- Self-guided Langevin dynamics (SGLD)
- Middle thermostat scheme
- Gas phase simulations (through igb=6)
- External electric fields
- Support for the CHARMM VDW force switch
- Semi-Isotropic pressure scaling
- Enhanced NMR restraints and R^6 averaging support (except NOESY volume restraints)
Alchemical Free Energy Calculations
The free energy methods implemented in the Amber GPU code build on the efficient Amber GPU MD code base (pmemd.cuda). These methods include the thermodynamic integration (TI), free energy perturbation (FEP), and multistate Bennett acceptance ratio (MBAR) classes. See the Free Energy Tutorials for specific examples.
- Input flags to run a TI calculation on a GPU are the same as for the CPU version (see the example input after this list). Users need to:
- Set icfe=1 to enable the free energy calculation
- Define the perturbed regions in timask1 and timask2
- Set ifsc=1 to use the soft-core potentials
- Define the soft-core regions in scmask1 and scmask2
- Define the current alchemical progress variable lambda by setting clambda
- There is a CPU-version tutorial available, and users can run it with the GPU version without any modification to the input.
- FEP/MBAR: To generate additional output for subsequent FEP/MBAR analysis:
- Users first need to define the TI input flags as above
- Enable the FEP/MBAR output: ifmbar=1
- Define the number of MBAR states in mbar_states, e.g., mbar_states=11
- Specify the lambda value of each MBAR state, e.g., mbar_lambda = 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0
- Define the MBAR output interval, bar_intervall, e.g., bar_intervall=10 means Amber will output MBAR results every 10 MD steps
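Putting the TI and MBAR flags together, an input section for a single lambda window might look like the sketch below. The masks, lambda schedule, and output interval are illustrative, and the residue names LIG1 and LIG2 are hypothetical placeholders for the two end states.

    TI with FEP/MBAR output (illustrative values)
     &cntrl
       imin=0, irest=1, ntx=5, nstlim=250000, dt=0.002,
       ntb=2, ntp=1, ntt=3, gamma_ln=2.0, temp0=300.0, cut=8.0,
       icfe=1,                            ! enable the free energy calculation
       timask1=':LIG1', timask2=':LIG2',  ! perturbed regions (hypothetical residue names)
       ifsc=1,                            ! use soft-core potentials
       scmask1=':LIG1', scmask2=':LIG2',  ! soft-core regions
       clambda=0.5,                       ! current lambda window
       ifmbar=1,                          ! write FEP/MBAR output
       mbar_states=11,                    ! number of MBAR states
       mbar_lambda=0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0,
       bar_intervall=10,                  ! MBAR output every 10 MD steps
     /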
Replica-Exchange Molecular Dynamics
Amber is capable of performing temperature, Hamiltonian, redox, pH, coupled redox-pH, and reservoir replica
exchange simulations on the GPU. Multi-dimensional replica exchange simulations, where two or
more conditions are simulated at the same time, are supported as well. The details of the input
and control variables can be found in the Amber manual. The newly
implemented free energy methods in Amber can be performed in conjunction with Hamiltonian
replica exchange so that different windows can exchange their conformations. To enable such
calculations, users need to:
- Create input files for all lambda values.
- Define the Hamiltonian replica exchange input flags in each input file:
numexchg: the number of exchange attempts that will be performed between replica pairs
nstlim: the number of MD steps that will be performed between exchange attempts
- Define the Hamiltonian replica exchange group file (see the sketch after this list). Note that:
- In the group file, the entries must be sorted according to the lambda values
- Currently, the number of entries in the group file must be the same as the number of lambda windows.
- Currently, the number of lambda windows must be a multiple of the number of GPUs used, e.g., if there are 12 lambda windows, users need to allocate 1, 2, 3, 4, 6, or 12 GPUs, since individual lambda windows cannot span multiple GPUs but one GPU can run multiple windows provided sufficient GPU memory is available.
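As a sketch (all file names are hypothetical), a group file for 12 lambda windows contains one entry per replica, sorted by clambda, and the run is launched with one MPI task per replica; in this sketch -rem 3 selects Hamiltonian replica exchange:

    # remd.groupfile: one entry per lambda window, sorted by clambda
    -O -i ti_000.mdin -o ti_000.mdout -p system.prmtop -c ti_000.rst7 -r ti_000.ncrst -x ti_000.nc
    -O -i ti_001.mdin -o ti_001.mdout -p system.prmtop -c ti_001.rst7 -r ti_001.ncrst -x ti_001.nc
    # (one line for each remaining window, through ti_011)

    # launch, e.g., 12 replicas shared across 4 GPUs
    mpirun -np 12 pmemd.cuda.MPI -ng 12 -groupfile remd.groupfile -rem 3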
Constant pH Molecular Dynamics
Constant pH molecular dynamics simulations can be run with the Generalized Born implicit solvent
model and with explicit solvent, as described in the Amber manual and the online tutorial.
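As an illustrative sketch, a Generalized Born constant pH input might look like the following; the values are not prescriptive, and a cpin file describing the titratable residues must also be supplied on the command line (the -cpin flag):

    Constant pH MD in GB implicit solvent (illustrative values)
     &cntrl
       imin=0, irest=1, ntx=5,
       nstlim=500000, dt=0.002, ntc=2, ntf=2,
       igb=2, saltcon=0.1, cut=999.0,   ! GB implicit solvent, effectively no cutoff
       ntt=3, gamma_ln=5.0, temp0=300.0, ig=-1,
       icnstph=1,                       ! constant pH with discrete protonation states
       ntcnstph=100,                    ! attempt protonation state changes every 100 steps
       solvph=7.0,                      ! solvent pH
       ntpr=5000, ntwx=5000,
     /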
Features not Supported on GPUs
The following options are NOT supported (after Amber18):
- ibelly != 0: Simulations using belly-style constraints are not supported.
- igb != 0 & cut < system size: GPU accelerated implicit solvent GB simulations do not support a cutoff.
- nmropt > 1: Support is not currently available for nmropt > 1. In addition, for nmropt = 1, only features that do not change the underlying force field parameters are supported. For example, umbrella sampling restraints are supported, as are simulated annealing functions such as variation of temp0 with simulation step. However, varying the VDW parameters with step is NOT supported.
- nrespa != 1: No multiple time stepping is supported.
- vlimit != -1: For performance reasons, the vlimit function is not implemented on GPUs.
- es_cutoff != vdw_cutoff: Independent cutoffs for electrostatics and van der Waals are not supported on GPUs (although this may be coming).
- order != 4: A PME interpolation order of 4 is the only option supported. Currently we do not see an advantage in making a tradeoff between mapping work and FFT reduction, nor an advantage in trading direct space electrostatics for reciprocal space work.
- imin = 1 (in parallel): Minimization is only supported in the serial GPU code, and it is wise to use the double-precision form of the code at that. Highly strained systems may need to be minimized using the CPU code.
- emil_do_calc != 0: EMIL is not supported on GPUs.
- iemap > 0: EMAP restraints are not supported on GPUs.
- icfe > 0 & imin > 0: Minimization is not supported for TI/MBAR on GPUs.
pmemd vs. sander
For the supported functionality, the input required and output produced in PMEMD are intended to replicate sander. The agreement goes as far as the limits of machine roundoff differences for the CPU code, which performs essentially
all of its arithmetic in 64-bit precision. Likewise, the GPU code offers a double-precision variant for quality assurance during code testing and after installation, but perfect agreement with CPU results is not guaranteed in
cases where the GPU and CPU must generate their own random number sequences with different routines. The production GPU code, which performs most of its arithmetic in 32-bit precision, will necessarily diverge from
the CPU code, but maintains a high degree of numerical reproducibility thanks to fixed-precision accumulation of forces and energies. pmemd simply runs more rapidly, scales better in parallel using MPI, can make use of
NVIDIA GPUs and Intel Xeon Phis for acceleration, and uses less resident memory than the more general sander engine. Dynamic memory allocation is used so memory configuration is not required. Benchmark data is available
on the Amber website, ambermd.org. Given the improvements in performance in both serial and parallel as well as the incredible performance offered by GPU acceleration, it is advisable to always use pmemd in place of sander
if the simulation requirements are within the functionality envelope provided by pmemd.
Minor changes in the output after Amber16
There are some minor differences in the output format in Amber versions after Amber16. For example, the Ewald error estimate
is NOT calculated when running on a GPU. We have updated the Amber outputs and test cases
to reflect this fact--Amber16 and earlier versions printed the CPU-based Ewald error estimate,
but this was a meaningless report. The error estimate coming out of the CPU pertains to the
error in the spline approximation of the Ewald direct space force and energy, a spline-based
approximation to terms based on erfc(). In Amber the GPU also uses a spline-based
approximation to obtain the Ewald direct space force between particles but the splines are
in fact more accurate than analytic computation in 32-bit floating point arithmetic due
to the way we tweak the coefficients when fashioning them on the CPU for use by the kernels.
We do not calculate the error due to this process, but rest assured the direct sum tolerance
and aliasing effects on the grid are much worse for the numerics than the spline will be.