AMBER 14 NVIDIA GPU
ACCELERATION SUPPORT


Recommended Hardware

by Ross Walker.


The AMBER GPU implementation has been designed to work on a broad range of hardware. Essentially the only thing you need is an NVIDIA GPU supporting hardware revision 2.0, 3.0, 3.5, 5.0 or later. However, there are some considerations when it comes to maximizing performance, both in serial and in parallel.

This page has been designed to help you select suitable hardware configurations for running AMBER. If you want a simple, hassle-free solution to GPU computing with AMBER we recommend the Exxact AMBER Certified Solutions described below. These have been developed jointly between Exxact and the AMBER developers specifically to make GPU MD simulations as fast and hassle free as possible. They are shipped with AMBER 14 preinstalled and, as of Dec 20th 2015, come with a specially optimized version of AMBER that was developed in collaboration with Ross Walker, Exxact and NVIDIA. These optimized systems provide ~15% better performance than comparable hardware with bitwise identical results. This program has recently been extended to Exxact Certified Solutions for a range of life sciences applications and also to Machine Learning and Deep Neural Networks.

Serial

In serial the performance for each independent AMBER GPU job is, assuming NTPR, NTWX etc. are large enough, mostly independent of the underlying CPU, motherboard chipset, the PCI-E bandwidth and the number of GPUs per node. At a minimum you need at least one free CPU core per GPU. If you are building small desktops to run serial calculations then multiple GPUs per node will be the most cost effective. Ideally you should still try to keep the GPUs on x16 PCI-E slots and make sure your power supply is sufficient to power all the GPUs under full load, and that you have adequate cooling.
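As a rough illustration of the "one independent job per GPU" pattern, the Python sketch below builds one launch command per GPU. It is not taken from this page: it assumes the standard CUDA environment variable CUDA_VISIBLE_DEVICES is used to pin each job to a single GPU, that the GPU executable is pmemd.cuda, and that input and output files follow a hypothetical <name>.in / <name>.prmtop / <name>.inpcrd naming scheme. The function name serial_launch_plan is likewise made up for this example. Nothing is executed; the function only returns the environment and argument list for each job.

```python
def serial_launch_plan(job_names, gpu_ids, free_cpu_cores):
    """Assign one independent job to each GPU and return (env, argv) pairs.

    Nothing is run here; each pair could later be handed to subprocess.Popen.
    File names are a hypothetical convention, not something AMBER requires.
    """
    if len(job_names) > len(gpu_ids):
        raise ValueError("each independent job needs its own GPU")
    if free_cpu_cores < len(job_names):
        raise ValueError("need at least one free CPU core per GPU job")

    plan = []
    for name, gpu in zip(job_names, gpu_ids):
        # Assumption: restrict the job to one GPU via CUDA_VISIBLE_DEVICES.
        env = {"CUDA_VISIBLE_DEVICES": str(gpu)}
        argv = [
            "pmemd.cuda", "-O",
            "-i", f"{name}.in",       # MD control input (NTPR, NTWX etc. live here)
            "-o", f"{name}.out",      # text output
            "-p", f"{name}.prmtop",   # topology
            "-c", f"{name}.inpcrd",   # starting coordinates
            "-r", f"{name}.rst",      # restart file
            "-x", f"{name}.nc",       # trajectory
        ]
        plan.append((env, argv))
    return plan


if __name__ == "__main__":
    for env, argv in serial_launch_plan(["run_a", "run_b"], [0, 1], free_cpu_cores=4):
        print(env, " ".join(argv))
```

To actually start the jobs, each pair could be passed to subprocess.Popen(argv, env={**os.environ, **env}); that step is left out so the sketch can be inspected without AMBER installed.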

Parallel

If you want to use multiple GPUs in parallel, rather than one simulation per GPU, then considerations change to the available bandwidth in the node (attempting to run across multiple nodes is not recommended except for Replica Exchange based simulations). With AMBER 14 the ideal specification for performance is 2 or 4 GPUs per node (8 GPUs is supported but needs server class hardware and 240V power) all in PCI-E Gen 3 x16 slots (or better). AMBER 14 uses peer to peer communication to provide optimum multi-GPU scaling. At the time of writing no standard motherboards exist that support more than two way peer to peer at full x16 speed (but we have a new unique custom-built system from Exxact that supports 4 and 8-way P2P simulations - see below). In most 4 GPU systems including the standard Exxact 4 GPU systems you are thus limited to the following combinations of runs [4 x 1 GPU; or 2 x 2 GPU; or 2 x 1 GPU + 1 x 2 GPU]. With an Exxact traditional 8 GPU system you would be limited to the following combinations [8 x 1 GPU; or 4 x 2 GPU; or 6 x 1 GPU + 1 x 2 GPU; or 4 x 1 GPU + 2 x 2 GPU; or 2 x 1 GPU + 3 x 2 GPU]. The new Exxact 8 GPU Peer to Peer solution will add 2 x 4 GPU (and 1 x 8 GPU in AMBER 16) to those options.
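The run combinations listed above follow a simple pattern: when peer to peer is only available within fixed pairs of GPUs, each pair either runs one 2 GPU job or two 1 GPU jobs. The Python sketch below enumerates those layouts for a node with an even number of GPUs. The function names are hypothetical, and the sketch deliberately ignores the 4 and 8-way peer to peer systems, which add further options.

```python
def run_combinations(n_gpus):
    """Return (single_gpu_jobs, dual_gpu_jobs) layouts that use all n_gpus GPUs.

    Assumes peer to peer works only within fixed pairs of GPUs, so a
    parallel job can span at most 2 GPUs.
    """
    if n_gpus < 0 or n_gpus % 2 != 0:
        raise ValueError("expected a non-negative, even number of GPUs")
    return [(n_gpus - 2 * pairs, pairs) for pairs in range(n_gpus // 2 + 1)]


def describe(layout):
    """Format a layout in the '2 x 1 GPU + 1 x 2 GPU' style used on this page."""
    singles, duals = layout
    parts = []
    if singles:
        parts.append(f"{singles} x 1 GPU")
    if duals:
        parts.append(f"{duals} x 2 GPU")
    return " + ".join(parts)


if __name__ == "__main__":
    for n in (4, 8):
        print(n, "GPUs:", "; ".join(describe(c) for c in run_combinations(n)))
```

For 4 GPUs this yields the three layouts given above, and for 8 GPUs the five layouts given for the traditional 8 GPU system.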

Pre-configured AMBER Certified Optimal Solutions

In order to make AMBER GPU computing as simple and cost effective as possible we have teamed up with Exxact Corporation to provide a number of pre-configured, fully warranted [even with GeForce cards] and optimized turn-key desktop and cluster solutions specifically designed for running AMBER simulations. These are discussed in more detail below, with the most up to date configurations available on Exxact's AMBER MD Workstation page. Recent work with Exxact has also extended this to optimized solutions for a wide range of life science applications and GPU accelerated convolutional neural networks for machine learning.
 

AMBER Certification

One of the biggest challenges with GPU computing is knowing what the optimal configuration is. If you go to a tier one vendor such as Dell you will likely end up paying a large amount for a sub-optimal machine. To make things as simple as possible we designed the AMBER certification process which is currently offered as sole source by Exxact Corporation. This certification involves offering turn-key solutions that conform to the following:

  1. Technical and sales personnel trained by AMBER developers and familiar with AMBER requirements.
  2. Vendor staff have direct link to AMBER developers for technical support and troubleshooting.
  3. All hardware specifications, and custom requests, approved and tested by AMBER developers.
  4. AMBER developer approved installation, configuration and testing including latest updates.
  5. Example submission scripts and pre-configured batch queuing systems (clusters) and automatic AMBER update scripts.
  6. Fully configured serial, parallel and GPU AMBER computing environment for all users.
  7. 24 hour individual GPU burn-in and full numerical validation using AMBER developer designed GPU test suite.
  8. Comprehensive benchmark report and performance validation for all GPUs.
  9. Full vendor 3 year warranty on all components (including GeForce and Tesla GPUs).
  10. All systems are verified personally by an AMBER developer before shipping.
  11. All AMBER v14 systems shipped after Dec 20th 2015 come AMBER v16 ready with free support for upgrading to AMBER v16 provided upon release [AMBER v16 license required].
  12. All AMBER v14 systems shipped after Dec 20th 2015 include a specially optimized version of AMBER 14 that provides ~15% better performance than that available on comparable hardware from other vendors (see benchmarks for comparison).

The goal of this program is to make it simple to purchase optimal, reliable and cost-effective AMBER GPU computing solutions without the need for an understanding of GPU or CPU hardware. If you know how to run simulations with AMBER then you will be able to run simulations immediately after powering up an AMBER certified system without any required configuration or installation procedures. Support can also be provided for equipment requests in proposals. Due to the success of this program it has recently been extended to a range of life science applications in the form of the Exxact Life Sciences Certified GPU Computing Program.

 

Exxact AMBER Certified MD Workstation and SimCluster

The main driving force behind the AMBER GPU development has always been to bring supercomputer like performance to individual desktops at a price that is appropriate for the widest range of researchers possible. The motivation is maximizing the amount and quality of the science that can be done rather than chasing artificially large grand challenge problems with massive supercomputers. Think of it as Molecular Dynamics for the 99%.

To make it as simple as possible for AMBER users to purchase optimal workstations and small clusters for running GPU AMBER (and regular CPU AMBER simulations as well) we have teamed up with Exxact Corporation to design a series of machines that provide, in our opinion, the optimum price performance ratio within three specific categories:

  1. Individual workstations in the $2000 to $8000 range. These use GeForce gaming cards but, in our experience and that of a large number of users, provide excellent reliability and unparalleled performance.

  2. Individual high end workstations in the $5000 to $15000 range. These machines use either GeForce cards or the professional Tesla boards (K40, K80 etc) and provide very high GPU densities (up to 8 GPUs in a single box).

  3. Small clusters. These can be custom built for just about any price range and can accommodate either the Tesla boards (K40, K80 etc) or, in the case of Exxact, can feature GeForce cards providing stunning performance for extremely reasonable prices.

The following are three example machine configurations, designed in collaboration with Exxact. These are the machines that were used to obtain the benchmarks shown on the AMBER GPU benchmark page. They come as AMBER certified platforms, which includes AMBER 14 fully installed, tested and configured (AMBER 14 license required), and, as of Dec 20th 2015, include a specially optimized version of AMBER 14 that provides ~15% better performance than is achievable on generic vendor hardware. They carry full 3 year warranties (even when configured with GeForce cards), are 'burnt in' for a minimum of 24 hours using AMBER 14 to verify performance, reliability and numerical correctness of all hardware (this is what makes it possible to offer reliable GeForce solutions) and can be customized as desired. Systems purchased after Dec 20th 2015 include the optimized version of AMBER 14, come pre-configured to support AMBER v16 upon release, and include free AMBER v16 installation support.

It is also possible to order these machines configured for a range of life sciences applications in addition to AMBER. For more details please contact Ross Walker (ross _at_ rosswalker.co.uk) or Mike Chen (mchen@exxactcorp.com) mentioning that you are interested in GPU computing solutions for running AMBER.

Exxact - AMBER Certified Workstations

AMBER Certified Entry-Level Workstation
Ideal for Graduate Students

• 1x Intel Core i7-4930K CPU
• 1 or 2 x NVIDIA GTX 980 or 980TI GPUs
• 32 GB system memory
• AMBER14 preinstalled, tested and optimized
• CentOS 6 or 7
• 3 year warranty
• Price: ~$3999

AMBER Certified Mid-Level Workstation
Ideal for Researchers

• 2x Intel Xeon E5-2620 v3 CPUs
• 2 to 4 x NVIDIA GTX 980, 980TI or Titan-X or K40/K80/M40/M60
• 64 GB system memory
• AMBER14 preinstalled, tested and optimized
• CentOS 6 or 7
• 3 year warranty
• Price: ~$5999

AMBER Certified High-End Workstation
Maximum Performance

• 2x Intel Xeon E5-2640 v3 CPUs
• 4x NVIDIA GTX 980, 980TI or Titan-X or K40/K80/M40/M60
• 64 GB system memory
• AMBER14 preinstalled, tested and optimized
• CentOS 6 or 7
• 3 year warranty
• Price: ~$7999

Exxact AMBER Certified Rack Mount Nodes

2U x 4 GPU AMBER Certified Rack Mount Node
High Density - 4 GPUs per node - supports peer to peer between pairs of GPUs, run 4 x 1 GPU or 2 x 2 GPU per node.

• 2x Intel Xeon E5-2600 v3 Haswell CPUs
• Up to 4 x NVIDIA GTX980 / 980TI / Titan-X or Tesla K40/K80/M40/M60 GPUs
• Up to 1TB system memory
• AMBER14 preinstalled, tested and optimized
• Onboard QDR or FDR InfiniBand available as option
• CentOS 6 or 7
• 3 year warranty
• Price: contact for price

(NEW) 4U x 8 GPU AMBER Certified Rack Mount Node
Cost Effective - Supports 8 GPUs per node - with new PLX switched peer to peer between pairs of GPUs - you can run 8 x 1 GPU, 4 x 2 GPU or any combination in between.

• 2x Intel Xeon E5-2600 v3 Haswell CPUs
• Up to 8 x NVIDIA GTX980 / 980TI / Titan-X or K40/K80/M40/M60 GPUs
• Up to 768GB system memory
• AMBER14 preinstalled, tested and optimized
• Onboard QDR or FDR InfiniBand available as option
• CentOS 6 or 7
• 3 year warranty
• Price: contact for price

These machines can be customized to fit a specific budget; please contact me (ross _at_ rosswalker.co.uk) or Mike Chen at Exxact (mchen@exxactcorp.com) for help and advice. Single socket GTX970 solutions are also available for prices starting around $2200. Whisper quiet (24 dB) dual GTX970 and GTX980 desktop machines are also available.

Exxact is family owned, has been in business for over 30 years, and is GSA Compliant. As the only supplier of AMBER Certified GPU Workstations and Clusters they can provide sole source justification statements as needed, as well as support in providing quotes and hardware details for proposals. These desktops and clusters have been sold to numerous universities, pharmaceutical companies, biotech companies and national labs worldwide. If you would like contact details for existing customers please contact ross@rosswalker.co.uk.

Exxact AMBER Certified Mid-Level workstation performance. GPU AMBER can be run in two modes, either using GPUs in parallel to run a single MD calculation (A) or using each GPU in serial to run independent MD calculations (B). The latter is what separates AMBER from other codes such as NAMD and Gromacs which rely on both the CPU and GPU for computation and thus do not support mode (B) efficiently.

Clusters

Clusters tend to be custom designed and should be configured for the specific user's needs and budget. Exxact staff have been trained by AMBER developers on the best way to design custom clusters for running AMBER (and other life sciences applications), both GPU and CPU. Many configurations are possible including 2U and 4U nodes with either 1 to 8 GTX-980/Titan-X or 1 to 8 K20(X)/K40/K80/M40 GPUs. Both the 2U and 4U configurations have been fully certified and tested with Tesla and GeForce GPUs and carry full warranties, with next-business-day onsite service available if needed. The following page provides a typical cluster example:


Exxact 12U -  5 node x 16 GTX-Titan-X cluster
(Also available as a 5 node - 10U or 20U configuration)

For more details or to obtain a custom quote please contact Mike Chen (mchen@exxactcorp.com) at Exxact Corp or Ross Walker (ross@rosswalker.co.uk) at SDSC.

Exxact Custom 8 GPU Peer to Peer Solutions

Through an engineering collaboration with Exxact we are pleased to be able to offer unique peer to peer optimized systems as AMBER Certified Peer to Peer GPU MD solutions. These systems can be configured with up to 8 GPUs (16 with dual GPU boards) in a single system image with peer to peer communication supported across all GPUs. These systems were used to obtain the 4 GPU timings provided on the benchmark page (and the 8 GPU Cellulose GB timings). While they have a price premium over the regular Exxact systems they provide the extra flexibility of being able to run individual calculations across 1, 2 or 4 GPUs if desired and with AMBER 16 we hope to offer support for individual PME runs across all 8 GPUs.

Exxact AMBER Certified Peer to Peer Solutions

5 - 8 GPU 5.5U Rack Mount System with 8 way PCIe Gen 3 x16 P2P support
Starting at ~$30K [8 GPUs]

• Single Intel Xeon E5-2620 v3 CPU
• 64 - 768GB Memory
• Up to 60TB HD space
• Eight NVIDIA Titan-X GPUs (or K40/K80/M40)
• Exxact custom 8 way P2P PCIe communication system
• 5.5U (Titan-X) or 5U (K40/K80) Rack Mount Server Chassis
• AMBER14 preinstalled, tested and optimized
• CentOS 6 or 7
• 3 year warranty
• Contact for customization

These machines can be customized to fit a specific budget; please contact me (ross _at_ rosswalker.co.uk) or Mike Chen at Exxact (mchen@exxactcorp.com) for help and advice.


Free Test Drives

If you are interested in measuring the performance of AMBER running your own simulation on the machines shown above then we encourage you to take advantage of the free test drive program we have put together with Exxact. Test accounts on these machines are available free of charge in blocks of 24 hours for you to try things out for yourself. Please see the following page for AMBER MD Workstation Test Drive signup details.


Building your own System

If you are happy putting together your own machines from individual components then you can build unbelievably fast AMBER GPU machines for very little money. Your main considerations are a suitable motherboard, a processor with at least 1 core per GPU and a power supply beefy enough to run everything. Simple 2 or 3 GPU systems can be built for around $3500 INCLUDING THE GPUS! Here's a recommended shopping list of parts for building reliable high performing AMBER GPU machines. This machine runs the DHFR NVE HMR 4fs benchmark at over 250ns/day using just one of the GPUs! The system as specced can support up to 3 GPUs with a 1600W power supply (you can actually fit 4 in, but I have seen issues with overheating with all 4 GPUs in use, and there is limited clearance for the 4th GPU). With 3 GPUs you can run three calculations all at the same time (one on each GPU) without impacting performance. For ideal multi-GPU performance you should configure it with 2 GPUs which, due to the PCI-E switch on the motherboard, support peer to peer 2 x GPU runs.

Amazon
Prices current as of Jan 3rd 2016

1 x Antec P280 Black ATX Mid Tower Computer Case ~ $109.99 each

1 x EVGA Supernova P2 80 Plus Platinum Rated 1600-Watt Modular ATX Power Supply ~ $399.99 each
(1200W version is fine if you plan to only have 2 GPUs)

1 x ASUS ATX DDR4 3000 (o.c.) LGA 2011 Motherboards X99-E WS ~ $657.35 each

1 x Seagate Barracuda XT 3 TB HDD SATA 6 Gb/s 7200 RPM ST33000651AS ~ $119.95 each

2 x Crucial 16GB Kit (8GBx2) DDR4 2133 MT/s (PC4-17000) CT2K8G4DFD8213 ~ $69.99 each

1 x Cooler Master Hyper 212 EVO - CPU Cooler with 120mm PWM Fan (RR-212E-20PK-R2) ~ $16.00 each

1 x Intel Core i7-5930K Haswell-E 6-Core 3.5GHz LGA 2011-v3 140W BX80648I75930K ~ $587.99 each

3 x EVGA GTX980 4GB GDDR5 256bit GPU (04G-P4-2982-KR) ~ $559.99 each

Total Price: ~ $3711.22 for 1 machine [3 GPUs] (as of Jan 2016)
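The total above can be reproduced from the listed quantities and unit prices. The short Python sketch below does the arithmetic with exact decimals. The part labels are abbreviated, the prices are simply those quoted on this page as of Jan 2016, and the function name total_cost is made up for the example.

```python
from decimal import Decimal

# (abbreviated part label, quantity, unit price in USD as listed above)
PARTS = [
    ("Antec P280 case", 1, "109.99"),
    ("EVGA Supernova P2 1600W power supply", 1, "399.99"),
    ("ASUS X99-E WS motherboard", 1, "657.35"),
    ("Seagate Barracuda XT 3 TB HDD", 1, "119.95"),
    ("Crucial 16GB DDR4 kit", 2, "69.99"),
    ("Cooler Master Hyper 212 EVO", 1, "16.00"),
    ("Intel Core i7-5930K", 1, "587.99"),
    ("EVGA GTX980 4GB", 3, "559.99"),
]


def total_cost(parts):
    """Sum quantity * unit price using Decimal to avoid float rounding."""
    return sum((Decimal(price) * qty for _, qty, price in parts), Decimal("0"))


if __name__ == "__main__":
    print(f"Total: ${total_cost(PARTS)}")
```

Changing the GPU quantity in PARTS gives the corresponding figure for a 2 GPU build; the power supply entry would also change if the 1200W version were chosen, and its price is not listed here.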


Disclosure: Exxact contribute to funding AMBER GPU development and research.