by Ross Walker.
The AMBER GPU implementation has been designed to work on a broad range of hardware. Essentially the only thing you need is an NVIDIA GPU supporting hardware revision 2.0, 3.0, 3.5, 5.0 or later. However, there are some considerations when it comes to maximizing performance, both in serial and in parallel.
This page has been designed to help you select suitable hardware configurations for running AMBER. If you want a simple, hassle-free solution to GPU computing with AMBER, we recommend the Exxact AMBER Certified Solutions described below. These have been developed jointly by Exxact and the AMBER developers specifically to make GPU MD simulations as fast and hassle-free as possible. They are shipped with AMBER 14 preinstalled and, as of Dec 20th 2015, come with a specially optimized version of AMBER that was developed in collaboration with Ross Walker, Exxact and NVIDIA. These optimized systems provide ~15% better performance than comparable hardware with bitwise identical results. This program has recently been extended to Exxact Certified Solutions for a range of life sciences applications and also to Machine Learning and Deep Neural Networks.
In serial, the performance of each independent AMBER GPU job is, assuming NTPR, NTWX etc. are large enough, mostly independent of the underlying CPU, the motherboard chipset, the PCI-E bandwidth and the number of GPUs per node. You need at least one free CPU core per GPU. If you are building small desktops to run serial calculations then multiple GPUs per node will be the most cost effective. Ideally you should still try to keep the GPUs in x16 PCI-E slots, make sure your power supply is sufficient to power all the GPUs under full load, and make sure you have adequate cooling.
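As a rough illustration of running one independent job per GPU, the sketch below builds one command per device and pins each to a single GPU through the `CUDA_VISIBLE_DEVICES` environment variable. It only constructs the commands and does not launch anything. The input and output file names (`md.in`, `system.prmtop`, `system.inpcrd`, `run_gpuN.*`) and the helper name `build_serial_jobs` are hypothetical and should be adapted to your own setup.

```python
import os


def build_serial_jobs(gpu_ids, executable="pmemd.cuda"):
    """Return one (env, argv) pair per GPU, each pinned to a single device."""
    jobs = []
    for gpu in gpu_ids:
        env = dict(os.environ)
        # Expose exactly one device to the process that will use this env.
        env["CUDA_VISIBLE_DEVICES"] = str(gpu)
        prefix = f"run_gpu{gpu}"  # hypothetical per-job output prefix
        argv = [
            executable, "-O",
            "-i", "md.in",            # hypothetical input file names
            "-p", "system.prmtop",
            "-c", "system.inpcrd",
            "-o", f"{prefix}.out",
            "-r", f"{prefix}.rst",
            "-x", f"{prefix}.nc",
        ]
        jobs.append((env, argv))
    return jobs


if __name__ == "__main__":
    for env, argv in build_serial_jobs([0, 1]):
        print(f"CUDA_VISIBLE_DEVICES={env['CUDA_VISIBLE_DEVICES']}", " ".join(argv))
```

Each pair could then be passed to `subprocess.Popen(argv, env=env)`, one process per GPU, provided there is a free CPU core for each.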
If you want to use multiple GPUs in parallel, rather than one simulation per GPU, then considerations change to the available bandwidth in the node (attempting to run across multiple nodes is not recommended except for Replica Exchange based simulations). With AMBER 14 the ideal specification for performance is 2 or 4 GPUs per node (8 GPUs is supported but needs server class hardware and 240V power) all in PCI-E Gen 3 x16 slots (or better). AMBER 14 uses peer to peer communication to provide optimum multi-GPU scaling. At the time of writing no standard motherboards exist that support more than two way peer to peer at full x16 speed (but we have a new unique custom-built system from Exxact that supports 4 and 8-way P2P simulations - see below). In most 4 GPU systems including the standard Exxact 4 GPU systems you are thus limited to the following combinations of runs [4 x 1 GPU; or 2 x 2 GPU; or 2 x 1 GPU + 1 x 2 GPU]. With an Exxact traditional 8 GPU system you would be limited to the following combinations [8 x 1 GPU; or 4 x 2 GPU; or 6 x 1 GPU + 1 x 2 GPU; or 4 x 1 GPU + 2 x 2 GPU; or 2 x 1 GPU + 3 x 2 GPU]. The new Exxact 8 GPU Peer to Peer solution will add 2 x 4 GPU (and 1 x 8 GPU in AMBER 16) to those options.
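The run combinations listed above follow from a simple rule, sketched here under the assumption that the GPUs are wired as independent peer to peer pairs and that every GPU is kept busy: each pair either hosts two single-GPU runs or one 2-GPU run. The function names are illustrative only.

```python
def run_combinations(n_gpus):
    """List (single_gpu_runs, two_gpu_runs) for GPUs wired as P2P pairs."""
    if n_gpus % 2:
        raise ValueError("this sketch assumes an even number of GPUs")
    n_pairs = n_gpus // 2
    combos = []
    for paired in range(n_pairs + 1):
        # 'paired' pairs run one 2-GPU job each; the rest run two 1-GPU jobs.
        singles = 2 * (n_pairs - paired)
        combos.append((singles, paired))
    return combos


def describe(combo):
    """Format a combination in the notation used in the text."""
    singles, doubles = combo
    parts = []
    if singles:
        parts.append(f"{singles} x 1 GPU")
    if doubles:
        parts.append(f"{doubles} x 2 GPU")
    return " + ".join(parts)


if __name__ == "__main__":
    for n in (4, 8):
        print(n, "GPUs:", "; ".join(describe(c) for c in run_combinations(n)))
```

For 4 GPUs this gives the three combinations quoted above, and for 8 GPUs the five. The 4-way and 8-way options of the peer to peer systems are outside this pairwise model.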
Pre-configured AMBER Certified Optimal Solutions
In order to make AMBER GPU computing as simple and cost effective as possible we have teamed up with Exxact Corporation to provide a number of pre-configured, fully warranted [even with GeForce cards] and optimized turn-key desktop and cluster solutions specifically designed for running AMBER simulations. These are discussed in more detail below, with the most up to date configurations available on Exxact's AMBER MD Workstation page. Recent work with Exxact has also extended this to optimized solutions for a wide range of life science applications and GPU accelerated convolutional neural networks for machine learning.
One of the biggest challenges with GPU computing is knowing what the optimal configuration is. If you go to a tier one vendor such as Dell you will likely end up paying a large amount for a sub-optimal machine. To make things as simple as possible we designed the AMBER certification process which is currently offered as sole source by Exxact Corporation. This certification involves offering turn-key solutions that conform to the following:
The goal of this program is to make it simple to purchase optimal, reliable and cost effective AMBER GPU computing solutions without needing an understanding of GPU or CPU hardware. If you know how to run simulations with AMBER then you will be able to run simulations immediately after powering up an AMBER certified system, with no configuration or installation required. Support can also be provided for equipment requests in proposals. Due to the success of this program it has recently been extended to a range of life science applications in the form of the Exxact Life Sciences Certified GPU Computing Program.
The following are three example machine configurations, designed in collaboration with Exxact. These are the machines that were used to obtain the benchmarks shown on the AMBER GPU benchmark page. They come as AMBER certified platforms, which includes AMBER 14 fully installed, tested and configured (AMBER 14 license required) and, as of Dec 20th 2015, a specially optimized version of AMBER 14 that provides ~15% better performance than is achievable on generic vendor hardware. They carry full 3 year warranties (even when configured with GeForce cards), are 'burnt in' for a minimum of 24 hours using AMBER 14 to verify performance, reliability and numerical correctness of all hardware (this is what makes it possible to offer reliable GeForce solutions) and can be customized as desired. Systems purchased after Dec 20th 2015 include the optimized version of AMBER 14, come pre-configured to support AMBER v16 upon release and include free AMBER v16 installation support.
It is also possible to order these machines configured for a
range of life sciences applications in addition to AMBER. For more details
please contact Ross Walker (ross _at_ rosswalker.co.uk) or Mike Chen (email@example.com)
mentioning that you are interested in GPU computing solutions for
Exxact - AMBER Certified Workstations
Exxact AMBER Certified Rack Mount Nodes
These machines can be customized to fit a specific budget. Please contact me (ross _at_ rosswalker.co.uk) or Mike Chen at Exxact (firstname.lastname@example.org) for help and advice. Single socket GTX970 solutions are also available at prices starting around $2200. Whisper quiet (24 dB) dual GTX970 and GTX980 desktop machines are also available.
Exxact is family owned, has been in business for over 30 years, and is GSA Compliant. As the only supplier of AMBER Certified GPU Workstations and Clusters they can provide sole source justification statements as needed, as well as support in providing quotes and hardware details for proposals. These desktops and clusters have been sold to numerous universities, pharmaceutical companies, biotech companies and national labs worldwide. If you would like contact details for existing customers please contact email@example.com.
Clusters tend to be custom designed and should be configured for the specific user's needs and budget. Exxact staff have been trained by AMBER developers on the best way to design custom clusters for running AMBER (and other life sciences applications), both GPU and CPU. Many configurations are possible, including 2U and 4U nodes with either 1 to 8 GTX-980/Titan-X or 1 to 8 K20(X)/K40/K80/M40 GPUs. Both the 2U and 4U configurations have been fully certified and tested with Tesla and GeForce GPUs and carry full warranties, with next business day onsite service available if needed. The following page provides a typical cluster example:
Through an engineering collaboration with Exxact we are pleased to be able to offer unique peer to peer optimized systems as AMBER Certified Peer to Peer GPU MD solutions. These systems can be configured with up to 8 GPUs (16 with dual GPU boards) in a single system image with peer to peer communication supported across all GPUs. These systems were used to obtain the 4 GPU timings provided on the benchmark page (and the 8 GPU Cellulose GB timings). While they have a price premium over the regular Exxact systems they provide the extra flexibility of being able to run individual calculations across 1, 2 or 4 GPUs if desired and with AMBER 16 we hope to offer support for individual PME runs across all 8 GPUs.
Exxact AMBER Certified Peer to Peer Solutions
These machines can be customized to fit a specific budget. Please contact me (ross _at_ rosswalker.co.uk) or Mike Chen at Exxact (firstname.lastname@example.org) for help and advice.
If you are interested in measuring the performance of AMBER running your own simulation on the machines shown above then we encourage you to take advantage of the free test drive program we have put together with Exxact. Test accounts on these machines are available free of charge in blocks of 24 hours for you to try things out for yourself. Please see the following page for AMBER MD Workstation Test Drive signup details.
If you are happy putting together your own machines from individual components then you can build unbelievably fast AMBER GPU machines for very little money. Your main considerations are a suitable motherboard, a processor with at least 1 core per GPU and a power supply beefy enough to run everything. Simple 2 or 3 GPU systems can be built for around $3500 INCLUDING THE GPUS! Here's a recommended shopping list of parts for building reliable high performing AMBER GPU machines. This machine runs the DHFR NVE HMR 4fs benchmark at over 250ns/day using just one of the GPUs! The system as specced can support up to 3 GPUs with a 1600W power supply (you can actually fit 4 in, but I have seen issues with overheating with all 4 GPUs in use, and there is limited clearance for the 4th GPU). With 3 GPUs you can run three calculations at the same time (one on each GPU) without impacting performance. For ideal multi-GPU performance you should configure it with 2 GPUs which will, due to the PCI-E switch on the motherboard, support peer to peer 2xGPU runs.
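To put the benchmark figure in more concrete terms, the short sketch below converts a throughput in ns/day and a timestep in fs into MD steps per second, and estimates the wall-clock days needed for a target simulation length. It is plain unit arithmetic; the 1000 ns target is an illustrative assumption and not a figure from this page.

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000


def steps_per_second(ns_per_day, timestep_fs):
    """MD steps per wall-clock second for a given throughput and timestep."""
    steps_per_day = ns_per_day * FS_PER_NS / timestep_fs
    return steps_per_day / SECONDS_PER_DAY


def days_for(target_ns, ns_per_day):
    """Wall-clock days needed to simulate target_ns at a steady ns/day rate."""
    return target_ns / ns_per_day


if __name__ == "__main__":
    # 250 ns/day with a 4 fs timestep, as quoted for the DHFR benchmark.
    print(f"{steps_per_second(250, 4):.0f} steps/s")
    # Hypothetical 1000 ns (1 microsecond) target at the same rate.
    print(f"{days_for(1000, 250):.1f} days")
```

At 250 ns/day and 4 fs per step this works out to roughly 720 steps per second. Since the quoted figure is "over 250ns/day", it should be read as a lower bound.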
Disclosure: Exxact contribute to funding AMBER GPU development and research.