by Ross Walker.
The AMBER GPU implementation has been designed to work on a broad range of hardware. Essentially the only thing you need is an NVIDIA GPU supporting hardware revision 2.0, 3.0, 3.5, 5.0 or later. However, there are some considerations when it comes to maximizing performance, both in serial and in parallel.
This page has been designed to help you select suitable hardware configurations for running AMBER. If you want a simple, hassle-free solution to GPU computing with AMBER, we recommend the Exxact AMBER Certified Solutions described below. These have been developed jointly by Exxact and the AMBER developers specifically to make GPU MD simulations as fast and hassle-free as possible. They are shipped with AMBER 16 preinstalled (AMBER 16 license required) and include full GPU validation, a 3-year warranty on all components and a full benchmark report. This program has recently been extended to Exxact Certified Solutions for a range of life sciences applications, and also to Machine Learning and Deep Neural Networks.
In serial, the performance of each independent AMBER GPU job is, assuming NTPR, NTWX, etc. are large enough, mostly independent of the underlying CPU, motherboard chipset, PCI-E bandwidth and number of GPUs per node. At a minimum you need at least one free CPU core per GPU. If you are building small desktops to run serial calculations then multiple GPUs per node will be the most cost effective. Ideally you should still try to keep the GPUs on x16 PCI-E slots, make sure your power supply is sufficient to power all the GPUs under full load, and ensure that you have adequate cooling.
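As a rough illustration of running one independent simulation per GPU, the sketch below pairs each job with its own device by setting the standard CUDA environment variable `CUDA_VISIBLE_DEVICES`. The helper name `single_gpu_jobs` and the file naming scheme (`jobA.in`, `jobA.prmtop`, and so on) are assumptions made for this example and are not part of AMBER. The function only builds the commands; it does not run them.

```python
def single_gpu_jobs(job_names, gpu_ids):
    """Pair each independent job with its own GPU.

    Returns a list of (env, command) tuples. Nothing is executed here.
    File names are hypothetical and derived from the job name.
    """
    if len(job_names) > len(gpu_ids):
        raise ValueError("more jobs than GPUs: run at most one job per GPU")

    launches = []
    for name, gpu in zip(job_names, gpu_ids):
        # Restrict this job to a single device so jobs do not share a GPU.
        env = {"CUDA_VISIBLE_DEVICES": str(gpu)}
        cmd = [
            "pmemd.cuda", "-O",
            "-i", f"{name}.in",       # MD control input
            "-p", f"{name}.prmtop",   # topology
            "-c", f"{name}.inpcrd",   # starting coordinates
            "-o", f"{name}.out",      # output log
            "-r", f"{name}.rst",      # restart file
            "-x", f"{name}.nc",       # trajectory
        ]
        launches.append((env, cmd))
    return launches


for env, cmd in single_gpu_jobs(["jobA", "jobB"], [0, 1]):
    print(env, " ".join(cmd))
```

To actually start a job, each pair could be passed to `subprocess.Popen(cmd, env={**os.environ, **env})`, keeping one free CPU core available per running job.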
If you want to use multiple GPUs in parallel, rather than one simulation per GPU, then the key consideration becomes the available bandwidth within the node (attempting to run across multiple nodes is not recommended except for Replica Exchange based simulations). With AMBER 16 the ideal specification for performance is 2 or 4 GPUs per node (8 GPUs is supported but needs server class hardware and 240V power), all in PCI-E Gen 3 x16 slots (or better). AMBER 16 uses peer to peer communication to provide optimum multi-GPU scaling. At the time of writing no standard motherboards exist that support more than two way peer to peer at full x16 speed (but we have a new unique custom-built system from Exxact that supports 4 and 8-way P2P simulations - see below). In most 4 GPU systems, including the standard Exxact 4 GPU systems, you are thus limited to the following combinations of runs [4 x 1 GPU; or 2 x 2 GPU; or 2 x 1 GPU + 1 x 2 GPU]. With a traditional 8 GPU system you would be limited to the following combinations [8 x 1 GPU; or 4 x 2 GPU; or 6 x 1 GPU + 1 x 2 GPU; or 4 x 1 GPU + 2 x 2 GPU; or 2 x 1 GPU + 3 x 2 GPU]. The new 8 and 10 GPU Peer to Peer solutions will add 2 x 4 GPU (and 1 x 8 GPU in AMBER 16 for large GB calculations) to those options.
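The run combinations listed above follow a simple pattern, which can be sketched as below. This is an illustration only: it assumes the GPUs are grouped into two-way peer to peer pairs, that a 2 GPU run occupies one whole pair, and that every GPU in the node is in use. The function name `run_combinations` is made up for this example.

```python
def run_combinations(n_gpus, p2p_group=2):
    """Enumerate ways to fill a node with 1-GPU and p2p_group-GPU runs.

    Returns a list of (n_single_gpu_runs, n_multi_gpu_runs) tuples.
    Assumes GPUs come in peer to peer groups of size p2p_group and that a
    multi-GPU run must use one complete group.
    """
    if n_gpus % p2p_group != 0:
        raise ValueError("n_gpus must be a multiple of the peer to peer group size")

    n_groups = n_gpus // p2p_group
    combos = []
    for n_multi in range(n_groups + 1):
        # Every group not used by a multi-GPU run hosts single-GPU jobs.
        n_single = n_gpus - n_multi * p2p_group
        combos.append((n_single, n_multi))
    return combos


for n_single, n_multi in run_combinations(4):
    print(f"{n_single} x 1 GPU + {n_multi} x 2 GPU")
```

Calling `run_combinations(8)` gives the five options quoted for a traditional 8 GPU system.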
Pre-configured AMBER Certified Optimal Solutions
In order to make AMBER GPU computing as simple and cost effective as possible we have teamed up with Exxact Corporation to provide a number of pre-configured, fully warranted [even with GeForce cards] and optimized turn-key desktop and cluster solutions specifically designed for running AMBER simulations. These are discussed in more detail below, with the most up-to-date configurations available on Exxact's AMBER MD Workstation page. Recent work with Exxact has also extended this to optimized solutions for a wide range of life science applications, GPU-accelerated Cryo-EM with Relion, and GPU-accelerated NVIDIA Digits Dev Boxes for machine learning.
One of the biggest challenges with GPU computing is knowing what the optimal configuration is. If you go to a tier one vendor such as Dell you will likely end up paying a large amount for a sub-optimal machine. To make things as simple as possible we designed the AMBER certification process which is currently offered as sole source by Exxact Corporation. This certification involves offering turn-key solutions that conform to the following:
The goal of this program is to make it simple to purchase optimum, reliable and cost-effective AMBER GPU computing solutions without the need for an understanding of GPU or CPU hardware. If you know how to run simulations with AMBER then you will be able to run simulations immediately after powering up an AMBER certified system, without any configuration or installation procedures. Support can also be provided for equipment requests in proposals, with text describing optimum hardware-software co-design available as needed. Due to the success of this program it has recently been extended to a range of life science applications in the form of the Exxact Life Sciences Certified GPU Computing Program.
The following are three example machine configurations, co-designed in collaboration with Exxact. These are the machines that were used to obtain the benchmarks shown on the AMBER GPU benchmark page. They come as AMBER certified platforms, which includes AMBER 16 fully installed, tested and configured (AMBER 16 license required). They carry full 3 year warranties (even when configured with GeForce cards), are 'burnt in' for a minimum of 24 hours using AMBER 16 to verify performance, reliability and numerical correctness of all hardware (this is what makes it possible to offer reliable GeForce solutions) and can be customized as desired.
It is also possible to order these machines configured for a
range of life sciences applications in addition to AMBER. For more details
please contact Ross Walker (ross _at_ rosswalker.co.uk) or Mike Chen (firstname.lastname@example.org)
mentioning that you are interested in GPU computing solutions for
Exxact - AMBER Certified Workstations
Exxact AMBER Certified Rack Mount Nodes
These machines can be customized to fit a specific budget. Please contact me (ross _at_ rosswalker.co.uk) or Mike Chen at Exxact (email@example.com) for help and advice. Single socket GTX1070 solutions are also available at prices starting around $2000. Whisper quiet (24 dB) dual GTX1070 and GTX1080 desktop machines are also available.
Exxact is family owned, has been in business for over 30 years, and is GSA Compliant. As the only supplier of AMBER Certified GPU Workstations and Clusters, they can provide sole source justification statements as needed, as well as support in providing quotes and hardware details for proposals. These desktops and clusters have been sold to numerous universities, pharmaceutical companies, biotech companies and national labs worldwide. By purchasing from Exxact you are also helping to support future AMBER development through the lab of Prof. Ross Walker. If you would like contact details for existing customers please contact firstname.lastname@example.org.
Clusters tend to be custom designed and should be configured for the specific user's needs and budget. Exxact staff have been trained by AMBER developers on the best way to design custom clusters for running AMBER (and other life sciences applications), both GPU and CPU. Many configurations are possible including 2U and 4U nodes with either 1 to 8 GTX-1080TI/Titan-X [Pascal] or 1 to 8 K20(X)/K40/K80/M40/M60/P40/P100/V100 GPUs. Both the 2U and 4U configurations have been fully certified and tested with Tesla and GeForce GPUs and carry full warranties with next business day onsite being available if needed. The following page provides a typical cluster example:
Through an engineering collaboration with Exxact we are pleased to be able to offer unique peer to peer optimized systems as AMBER Certified Peer to Peer GPU MD solutions. These systems can be configured with up to 10 GPUs in a single system image with peer to peer communication supported across all GPUs. These systems were used to obtain the 4 GPU timings provided on the benchmark page (and the 8 GPU Cellulose GB timings). While they have a price premium over the regular systems they provide the extra flexibility of being able to run individual calculations across 1, 2 or 4 GPUs if desired.
If you are interested in measuring the performance of AMBER running your own simulation on the machines shown above then we encourage you to take advantage of the free test drive program we have put together with Exxact. Test accounts on these machines are available free of charge in blocks of 24 hours for you to try things out for yourself. Please see the following page for AMBER MD Workstation Test Drive signup details.
If you are happy putting together your own machines from individual components then you can build unbelievably fast AMBER GPU machines for very little money. Your main considerations are a suitable motherboard, a processor with at least 1 core per GPU and a power supply beefy enough to run everything. Simple 2 or 3 GPU systems can be built for around $3500 INCLUDING THE GPUS! Here's a recommended shopping list of parts for building reliable, high performing AMBER GPU machines. This machine runs the DHFR NVE HMR 4fs benchmark at over 400ns/day using just one of the GPUs! The system as specced can support up to 3 GPUs with a 1600W power supply (you can actually fit 4 in, but I have seen issues with overheating with all 4 GPUs in use, and there is limited clearance for the 4th GPU). With 3 GPUs you can run three calculations at the same time (one on each GPU) without impacting performance. For ideal multi-GPU performance you should configure it with 2 GPUs which will, due to the PCI-E switch on the motherboard, support peer to peer 2xGPU runs.
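The two sizing rules above (at least one CPU core per GPU, and a power supply large enough for everything under full load) can be written down as a quick sanity check. The sketch below is illustrative only: the function name `check_build`, the 80% power supply headroom factor and the per-component wattages in the example call are all assumptions chosen for the demonstration, not measured or recommended values. Use the figures from your own component specifications.

```python
def check_build(n_gpus, cpu_cores, psu_watts, gpu_watts, other_watts,
                headroom=0.8):
    """Return a list of problems found in a planned build (empty if none).

    gpu_watts   : assumed full-load draw of one GPU
    other_watts : assumed draw of CPU, motherboard, drives and fans
    headroom    : assumed fraction of the PSU rating to treat as usable
    """
    problems = []

    # Rule 1: at least one CPU core per GPU.
    if cpu_cores < n_gpus:
        problems.append(
            f"{cpu_cores} CPU cores for {n_gpus} GPUs: need one core per GPU"
        )

    # Rule 2: the power supply must cover all GPUs under full load.
    load = n_gpus * gpu_watts + other_watts
    usable = psu_watts * headroom
    if load > usable:
        problems.append(
            f"estimated load {load}W exceeds usable PSU capacity {usable:.0f}W"
        )

    return problems


# Hypothetical numbers for a 3 GPU build on a 1600W supply.
print(check_build(n_gpus=3, cpu_cores=6, psu_watts=1600,
                  gpu_watts=250, other_watts=300))
```

An empty list means neither rule was violated under the stated assumptions. It says nothing about cooling or physical clearance, which still need to be checked by hand.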
Disclosure: Exxact contribute to funding AMBER GPU development and research.