PMEMD runs on many models of NVIDIA GPUs

Jump to: Building Your Own System | AMBER Certified Hardware Solutions

GPU-accelerated PMEMD has been implemented using CUDA and thus will only run on NVIDIA GPUs at present. The code uses a custom-designed hybrid single / double / fixed precision model, termed SPFP, which requires a minimum hardware revision of 3.0 (Kepler, e.g. GTX-680). For price and performance reasons we currently recommend the GeForce cards over the more expensive Tesla or Quadro variants. There are no issues, accuracy or otherwise, with running AMBER on GeForce GPUs.
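
If you are not sure which hardware revision (compute capability) a given card reports, the short stand-alone sketch below queries the CUDA runtime for each visible device. It is a hypothetical helper, not part of AMBER, and assumes a working CUDA toolkit and driver (compile with something like "nvcc check_sm.cu -o check_sm"):

    // check_sm.cu -- hypothetical stand-alone helper (not an AMBER tool).
    // Lists each visible NVIDIA GPU with its compute capability so it can be
    // compared against the SM 3.0 minimum quoted above.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int count = 0;
        cudaError_t err = cudaGetDeviceCount(&count);
        if (err != cudaSuccess || count == 0) {
            std::printf("No CUDA-capable GPU detected (%s)\n", cudaGetErrorString(err));
            return 1;
        }
        for (int i = 0; i < count; ++i) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, i);
            std::printf("GPU %d: %s  SM %d.%d  %.1f GiB\n", i, prop.name,
                        prop.major, prop.minor,
                        prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
        }
        return 0;
    }

Any device reporting SM 3.0 or higher meets the minimum hardware requirement; the supported-card list below is grouped by exactly this number.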

As of January 2023, the following cards are supported by Amber22 and AmberTools23. Other similar models are likely to work, but have not been tested.

  • Hardware Version 9.0 (Hopper)
    • H100
  • Hardware Version 8.9 (Ada Lovelace)
    • RTX-4070TI
    • RTX-4090
  • Hardware Version 8.0 (Ampere)
    • RTX-3060
    • RTX-3070
    • RTX-3080
    • RTX-3090
    • RTX-A6000
    • RTX-A5000
    • A100
  • Hardware Version 7.5 (Turing)
    • RTX-2080Ti
    • RTX-2080
    • RTX-2070
    • Quadro RTX6000/5000/4000
  • Hardware Version 7.0 (Volta V100)
    • Titan-V
    • V100
    • Quadro GV100
  • Hardware Version 6.1 (Pascal GP102/104)
    • Titan-XP [aka Pascal Titan-X]
    • GTX-1080TI / 1080 / 1070 / 1060
    • Quadro P6000 / P5000
    • P4 / P40
  • Hardware Version 6.0 (Pascal P100/DGX-1)
    • Quadro GP100 (with optional NVLink)
    • P100 12GB / P100 16GB / DGX-1
  • Hardware Version 5.0 / 5.5 (Maxwell)
    • M4, M40, M60
    • GTX-Titan-X
    • GTX970 / 980 / 980Ti
    • Quadro cards supporting SM5.0 or 5.5
  • Hardware Version 3.0 / 3.5 (Kepler I / Kepler II)
    • Tesla K20 / K20X / K40 / K80
    • Tesla K10 / K8
    • GTX-Titan / GTX-Titan-Black / GTX-Titan-Z
    • GTX770 / 780
    • GTX670 / 680 / 690
    • Quadro cards supporting SM3.0 or 3.5

QUICK hardware note: because QUICK can also be used as a standalone QM program outside of AmberTools23, a full list of its compatible hardware can be found here:
https://quick-docs.readthedocs.io/en/23.8.0/installation-guide.html#compatible-compilers-and-hardware.

Amber tries to support all CUDA SDK versions up to 11.x. In the past we recommended CUDA 9.1 or 9.2 for the best speed of the resulting executables, but that recommendation needs to be revisited. Here are the minimum requirements for the different tiers of hardware (a quick way to check the driver and toolkit versions you actually have installed is sketched after this list):

  • Ampere (SM_80) based cards require CUDA 11.0 or later.
  • Turing (SM_75) based cards require CUDA 9.2 or later.
  • Volta (SM_70) based cards require CUDA 9.0 or later.
  • Pascal (SM_61) based cards require CUDA 8.0 or later.
  • GTX-1080 cards require NVIDIA Driver version >= 367.27 for reliable numerical results.
  • GTX-Titan and GTX-780 cards require NVIDIA Driver version >= 319.60 for correct numerical results.
  • GTX-780Ti cards require a modified Bios from Exxact Corp to give correct numerical results.
  • GTX-Titan-Black Edition cards require NVIDIA Driver version >= 337.09 or 331.79 or later for correct numerical results.
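
To confirm that an installed driver and toolkit meet the minimums above, the CUDA runtime can be queried directly. The sketch below is a hypothetical helper, not an AMBER tool; it assumes a working CUDA installation and decodes the version numbers the runtime API reports (encoded as 1000*major + 10*minor):

    // cuda_version.cu -- hypothetical stand-alone helper (not an AMBER tool).
    // Prints the highest CUDA version the installed driver supports and the
    // CUDA runtime version this binary was built against.
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int driver = 0, runtime = 0;
        cudaDriverGetVersion(&driver);    // e.g. 11040 means CUDA 11.4
        cudaRuntimeGetVersion(&runtime);
        std::printf("Driver supports CUDA %d.%d; runtime built against CUDA %d.%d\n",
                    driver / 1000, (driver % 1000) / 10,
                    runtime / 1000, (runtime % 1000) / 10);
        return 0;
    }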

Other cards not listed here may also be supported as long as they implement the Hardware Revision 3.0, 3.5, 5.0, 5.5, 6.0, 6.1, 7.0, 7.5, 8.0, 8.9, or 9.0 specifications.

Note that you should ensure that all GPUs on which you plan to run PMEMD are connected to PCI-E 2.0 x16 lane slots or better. If this is not the case, you will likely see degraded performance, although the effect is lessened in serial runs if you write to the mdout or mdcrd files infrequently (e.g. every 2000 steps or so). Scaling over multiple GPUs within a single node is no longer really feasible for PME calculations, since interconnect performance has not kept pace with improvements in individual GPU performance. However, it is still possible to get good multi-GPU scaling for implicit solvent calculations larger than 2500 atoms if all GPUs are in x16 or better slots and can communicate via peer to peer (i.e. are connected to the same physical processor socket).
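
Whether the GPUs in a node can actually communicate peer to peer can be checked from the CUDA runtime before attempting a multi-GPU implicit solvent run. The sketch below is a hypothetical stand-alone check, not part of AMBER, and assumes a working CUDA installation:

    // p2p_check.cu -- hypothetical stand-alone helper (not an AMBER tool).
    // Reports, for every ordered pair of visible GPUs, whether peer-to-peer
    // access is possible (typically only for GPUs on the same PCI-E root/socket).
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int count = 0;
        cudaGetDeviceCount(&count);
        for (int i = 0; i < count; ++i) {
            for (int j = 0; j < count; ++j) {
                if (i == j) continue;
                int can = 0;
                cudaDeviceCanAccessPeer(&can, i, j);
                std::printf("GPU %d -> GPU %d : peer access %s\n",
                            i, j, can ? "supported" : "NOT supported");
            }
        }
        return 0;
    }

If a pair reports "NOT supported", traffic between those GPUs has to be staged through host memory, which is why the same-socket requirement above matters.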

For a detailed write-up on PCI-E layouts in modern hardware and the variations in peer-to-peer support, see: [Exploring the complexities of PCI-E connectivity]. It is also possible to run over multiple nodes, although you are unlikely to see any performance benefit, so this is not recommended except for loosely coupled runs such as REMD. The main advantage of AMBER's approach to GPU implementation over other implementations such as NAMD and Gromacs is that it is possible to run multiple single-GPU runs on a single node with little or no slowdown. For example, a node with four RTX-2080Ti GPUs can run four individual AMBER DHFR 4fs NVE calculations at the same time without slowdown, providing an aggregate throughput in excess of 2500 ns/day.

Building Your Own System

By Ross Walker (ross __at__ rosswalker.co.uk)

If you are happy putting together your own machines from individual components, you can build unbelievably fast AMBER GPU machines for very little money. Your main considerations are a suitable motherboard, a processor with at least one core per GPU, and a power supply beefy enough to run everything. Simple 2-GPU systems can be built for around $4200 INCLUDING THE GPUS! Below is a recommended shopping list of parts for building reliable, high-performing AMBER GPU machines; this machine runs the DHFR NVE HMR 4fs benchmark at over 1000 ns/day using just one of the GPUs. The system as specced, with a 1600W power supply, can support up to 2 GPUs. I do not recommend going beyond 2 GPUs in such a configuration: you can physically fit 4, but preventing overheating is difficult. If you want more than 2 GPUs in a system, I recommend reaching out to Exxact Corp or AdvancedHPC, both of whom have experience with very high-end GPU workstations and servers optimally specced for running AMBER.

Amazon

Prices current as of Dec 2020

1 x Corsair Carbide 275R Mid-Tower Gaming Case ~ $79.99
1 x EVGA Supernova 1600 P2 80 Plus Platinum Rated 1600-Watt Power Supply ~ $481.00
1 x Asus Prime TRX40-PRO AMD 3rd Gen Ryzen Threadripper Motherboard ~ $399.29
1 x Samsung 860 EVO 2TB 2.5 Inch SATA III Internal SSD ~ $199.99
1 x Crucial 32GB Kit (8GBx4) DDR4 2666 MT/s (PC4-21300) ~ $154.99
1 x AMD Ryzen Threadripper 3960X 24-Core CPU ~ $1427.62
1 x EVGA CLC 280 Liquid/Water CPU Cooler ~ $99.99
2 x NVIDIA RTX 3080 GPUs ~ $699.99 each

Total Price: ~ $4242.85 for 1 machine [2 GPUs] (as of Dec 2020)

AMBER Certified Hardware Solutions

By Ross Walker (ross _at_ rosswalker.co.uk)

In order to make AMBER GPU computing as simple and cost-effective as possible, I have worked with Exxact Corporation (Nick Chen nick@exxactcorp.com) and AdvancedHPC (Joe Lipman joe.lipman@advancedhpc.com) to provide a number of pre-configured, fully warranted [even with GeForce cards] and optimized turn-key desktop, server and cluster solutions specifically designed for running AMBER simulations. These systems come with AMBER preinstalled and include extensive GPU validation and benchmark reports. I personally verify the installation and performance of every AMBER Certified machine from these vendors. Full information and the most up-to-date configurations are available on Exxact's AMBER MD Workstation and AdvancedHPC's AMBER Optimized GPU system pages. If you are based in Europe or the UK, I highly recommend LinuxVixion (Juan Herrero juanjimenez@linuxvixion.com), who also offer preinstalled and validated AMBER GPU systems.

If you would like additional information or advice on optimal systems for a given budget please feel free to email me at ross__at__rosswalker.co.uk.

Disclosure: Exxact contributes to funding AMBER GPU development and research.

"How's that for maxed out?"

Last modified: Sep 7, 2023