AMBER 11 NVIDIA GPU ACCELERATION SUPPORT
NEWS: Major Update Released August 18th 2011 (v2.2 [Amber 11 Bugfix.17]) Doubles Performance for PME Calculations on GPUs.
Benchmarks
Benchmark timings by Mike Wu and Ross Walker.
Machine Specs
Machine 1
- CPU = Dual x Quad Core Intel E5462 2.80 GHz
- Motherboard = SuperMicro X7DWA-N
- MPICH2.0 - 1.2.1p1
- MKL 10.1.1.019
- Ifort 11.1.069
- GPU = Tesla C1060 / Tesla C2050 / GTX295
- nvcc v3.2
- NVIDIA Driver Linux 64 - 260.19.14
Machine 2
- CPU = Dual x Hex Core Intel X5670 2.93 GHz
- Motherboard = SuperMicro X8DTG-DF
- MVAPICH2.0 - 1.5 (With GPU Direct v1.0)
- MKL 10.1.1.019
- Ifort 11.1.069
- GPU = GTX295 (768MB) / GTX580 (1.5GB) / Tesla C(M)2070 (6.0 GB) / Tesla M2090 (6.0 GB)
- nvcc v3.2
- NVIDIA Driver Linux 64 - 256.47
(Configuration has 1 GPU per node + 1 Mellanox QDR IB in x16 slot or 2 GPU per node + 1 Mellanox QDR IB in x4 slot)
Code Base = AMBER 11 Release + Bugfixes 1 to 17 - GPU code v2.2 (Aug 18th 2011)
Precision Model = SPDP (GPU), Double Precision (CPU)
Benchmarks were run with ECC turned OFF on C2050/C2070/M2090 cards. If you see approximately 10% less performance than the numbers here, run the following as root, repeating with -g x for each GPU ID:
nvidia-smi -g 0 --ecc-config=0
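The per-GPU repetition described above can be scripted. The following is a minimal sketch, assuming the GPU IDs on the node are known in advance. It uses only the `-g` and `--ecc-config=0` flags quoted above; the function name `ecc_off_commands` and the example ID list are illustrative. The commands are only printed, not executed.

```python
def ecc_off_commands(gpu_ids):
    """Return one 'nvidia-smi -g <id> --ecc-config=0' argument list per GPU ID."""
    return [
        ["nvidia-smi", "-g", str(gpu_id), "--ecc-config=0"]
        for gpu_id in gpu_ids
    ]


if __name__ == "__main__":
    # Hypothetical node with two GPUs (IDs 0 and 1); adjust to your hardware.
    for cmd in ecc_off_commands([0, 1]):
        # Print the command line so it can be reviewed before running as root.
        print(" ".join(cmd))
```

Each printed line can then be run as root on the node in question.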
Segfaults in Parallel: If you find that runs across multiple nodes (i.e. using the InfiniBand adapter) segfault almost immediately, this is most likely caused by GPU Direct v2 (CUDA v4.0) not being properly supported by your hardware and driver installation. In most cases, setting the following environment variable on all nodes (put it in your .bashrc) will fix the problem:
export CUDA_NIC_INTEROP=1
List of Benchmarks
Implicit Solvent (GB)
- TRPCage = 304 atoms
- Myoglobin = 2,492 atoms
- Nucleosome = 25,095 atoms
Explicit Solvent (PME)
- DHFR NVE = 23,558 atoms
- DHFR NPT = 23,558 atoms
- FactorIX NVE = 90,906 atoms
- FactorIX NPT = 90,906 atoms
- Cellulose NVE = 408,609 atoms
- Cellulose NPT = 408,609 atoms
You can download a tar file containing the input files for all these benchmarks here (96.0 MB).
Implicit Solvent GB Benchmarks
1) TRPCage = 304 atoms
&cntrl
imin=0,irest=1,ntx=5,
nstlim=100000,dt=0.002,ntb=0,
ntf=2,ntc=2,tol=0.000001, ntpr=1000, ntwx=1000,
ntwr=50000, cut=9999.0, rgbmax=15.0,
igb=1,ntt=0,nscm=0,
/
2) Myoglobin = 2,492 atoms
&cntrl
imin=0,irest=1,ntx=5,
nstlim=10000,dt=0.002,ntb=0,
ntf=2,ntc=2,tol=0.000001, ntpr=1000, ntwx=1000,
ntwr=50000, cut=9999.0, rgbmax=15.0,
igb=1,ntt=0,nscm=0,
/
3) Nucleosome = 25,095 atoms
&cntrl
imin=0,irest=1,ntx=5,
nstlim=1000,dt=0.002,ntb=0,
ntf=2,ntc=2,tol=0.000001, ntpr=100, ntwx=100,
ntwr=50000, cut=9999.0, rgbmax=15.0,
igb=1,ntt=0,nscm=0,
/
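Each input fixes the simulated time through nstlim (number of steps) and dt (time step in ps), so a wall-clock timing converts directly to throughput. The following is a minimal sketch of that arithmetic for the three GB inputs above. The function names and the 10-second wall time are illustrative assumptions, not measured values.

```python
def simulated_ps(nstlim, dt_ps):
    """Simulated time in picoseconds: number of steps times the time step."""
    return nstlim * dt_ps


def ns_per_day(nstlim, dt_ps, wall_seconds):
    """Throughput in ns/day for a run that took wall_seconds of wall-clock time."""
    simulated_ns = simulated_ps(nstlim, dt_ps) / 1000.0
    return simulated_ns * 86400.0 / wall_seconds


if __name__ == "__main__":
    # nstlim values taken from the GB inputs above; dt=0.002 ps in all three.
    gb_runs = {"TRPCage": 100000, "Myoglobin": 10000, "Nucleosome": 1000}
    for name, nstlim in gb_runs.items():
        print(f"{name}: {simulated_ps(nstlim, 0.002):.0f} ps simulated")

    # Hypothetical wall time of 10 s for the TRPCage run (not a measurement).
    print(f"{ns_per_day(100000, 0.002, 10.0):.1f} ns/day")
```

The same conversion applies to the PME inputs, which all use nstlim=10000 and dt=0.002.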
Explicit Solvent PME Benchmarks
1) DHFR NVE = 23,558 atoms
Typical Production MD NVE with GOOD energy conservation.
&cntrl
ntx=5, irest=1,
ntc=2, ntf=2, tol=0.000001,
nstlim=10000,
ntpr=1000, ntwx=1000,
ntwr=10000,
dt=0.002, cut=8.,
ntt=0, ntb=1, ntp=0,
ioutfm=1,
/
&ewald
dsum_tol=0.000001,
/
2) DHFR NPT = 23,558 atoms
Typical Production MD NPT
&cntrl
ntx=5, irest=1,
ntc=2, ntf=2,
nstlim=10000,
ntpr=1000, ntwx=1000,
ntwr=10000,
dt=0.002, cut=8.,
ntt=1, tautp=10.0,
temp0=300.0,
ntb=2, ntp=1, taup=10.0,
ioutfm=1,
/
3) FactorIX NVE = 90,906 atoms
Typical Production MD NVE with GOOD energy conservation.
&cntrl
ntx=5, irest=1,
ntc=2, ntf=2, tol=0.000001,
nstlim=10000,
ntpr=1000, ntwx=1000,
ntwr=10000,
dt=0.002, cut=8.,
ntt=0, ntb=1, ntp=0,
ioutfm=1,
/
&ewald
dsum_tol=0.000001,nfft1=128,nfft2=64,nfft3=64,
/
4) FactorIX NPT = 90,906 atoms
Typical Production MD NPT
&cntrl
ntx=5, irest=1,
ntc=2, ntf=2,
nstlim=10000,
ntpr=1000, ntwx=1000,
ntwr=10000,
dt=0.002, cut=8.,
ntt=1, tautp=10.0,
temp0=300.0,
ntb=2, ntp=1, taup=10.0,
ioutfm=1,
/
5) Cellulose NVE = 408,609 atoms
Typical Production MD NVE with GOOD energy conservation.
&cntrl
ntx=5, irest=1,
ntc=2, ntf=2, tol=0.000001,
nstlim=10000,
ntpr=1000, ntwx=1000,
ntwr=10000,
dt=0.002, cut=8.,
ntt=0, ntb=1, ntp=0,
ioutfm=1,
/
&ewald
dsum_tol=0.000001,
/
6) Cellulose NPT = 408,609 atoms
Typical Production MD NPT
&cntrl
ntx=5, irest=1,
ntc=2, ntf=2,
nstlim=10000,
ntpr=1000, ntwx=1000,
ntwr=10000,
dt=0.002, cut=8.,
ntt=1, tautp=10.0,
temp0=300.0,
ntb=2, ntp=1, taup=10.0,
ioutfm=1,
/
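The NVE and NPT inputs above differ in only a handful of settings, which is easier to see programmatically. The following is a minimal sketch that reads the flat key=value form used in these inputs and reports the differences. It is not a general Fortran namelist parser, values are kept as strings, and the names `parse_namelists` and `diff_settings` are illustrative.

```python
import re

DHFR_NVE = """
&cntrl
ntx=5, irest=1,
ntc=2, ntf=2, tol=0.000001,
nstlim=10000,
ntpr=1000, ntwx=1000,
ntwr=10000,
dt=0.002, cut=8.,
ntt=0, ntb=1, ntp=0,
ioutfm=1,
/
&ewald
dsum_tol=0.000001,
/
"""

DHFR_NPT = """
&cntrl
ntx=5, irest=1,
ntc=2, ntf=2,
nstlim=10000,
ntpr=1000, ntwx=1000,
ntwr=10000,
dt=0.002, cut=8.,
ntt=1, tautp=10.0,
temp0=300.0,
ntb=2, ntp=1, taup=10.0,
ioutfm=1,
/
"""


def parse_namelists(text):
    """Parse '&name key=value, ... /' groups into {name: {key: value}}."""
    groups = {}
    # Each group runs from '&name' to the first '/' (no values here contain '/').
    for match in re.finditer(r"&(\w+)(.*?)/", text, re.S):
        name, body = match.group(1), match.group(2)
        groups[name] = dict(re.findall(r"(\w+)\s*=\s*([^,\s]+)", body))
    return groups


def diff_settings(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    nve = parse_namelists(DHFR_NVE)
    npt = parse_namelists(DHFR_NPT)
    # A value of None means the key is absent from that input.
    for key, (v_nve, v_npt) in diff_settings(nve["cntrl"], npt["cntrl"]).items():
        print(f"{key}: NVE={v_nve} NPT={v_npt}")
```

Keys reported with None on one side are simply not set in that input, for example tol and the &ewald group in the NPT runs.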