Amber 10 Benchmarks

By Ross Walker (SDSC)

| JAC Benchmark (PME, 23.5K atoms) | Factor IX Benchmark (PME, 91K atoms) | Cellulose Benchmark (PME, 408K atoms) |
| Download Benchmark Suite Input Files |

Notes:

Machine Description

  1. Dual x Quad Workstation - 2 x Intel E5462 CPU at 2.80 GHz, 1600MHz FSB, 24MB Cache, 4 x 2GB PC2-6400 FB ECC DIMMS, SuperMicro X7DWA-N Motherboard, RHEL4 AS
    Intel Ifort v10.1.018, MKL v10.0.5.025, mpich2-1.0.7 - PMEMD config.h.
     
  2. Dual x Quad Workstation - 2 x Intel E5430 CPU at 2.66 GHz, 1333MHz FSB, 24MB Cache, 4 x 2GB PC2-5300 FB ECC DIMMS, SuperMicro X7DWA-N Motherboard, RHEL4 AS
    Intel Ifort v10.1.018, MKL v10.0.5.025, mpich2-1.0.7 - PMEMD config.h.
     
  3. TACC Ranger - NSF Supercomputer at University of Texas at Austin, Sun Constellation Cluster, 579 TFlops Peak, 3,936 Nodes, 4 x Quad Core AMD Opteron Barcelona at 2.3 GHz, CentOS 4.7
    SDR Infiniband interconnect.
    Default environment: PGF90 v7.2-5, mvapich v1.0.1, tacc_affinity, ibrun - PMEMD config.h
     
  4. NICS Kraken - NSF Supercomputer at University of Tennessee - Cray XT4, Quad core AMD nodes at 2.3GHz, Cray Compute Node Linux.
    Cray SeaStar2 interconnect.
    Default environment: ftn (PGF90 v7.1.6), xt-asyncpe v1.0.- PMEMD config.h
     

Joint Amber/Charmm DHFR Benchmark (JAC)

1) JAC Original Benchmark (Note: This does NOT do any I/O and so is not very representative of real calculations)
 

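| MPI Threads | 2 x Intel E5462 Time (s) | 2 x Intel E5462 ns/day | 2 x Intel E5430 Time (s) | 2 x Intel E5430 ns/day |
|---|---|---|---|---|
| 2 | 85.97 | 1.01 | 90.41 | 0.96 |
| 4 | 46.43 | 1.86 | 50.20 | 1.72 |
| 6 | 34.94 | 2.47 | 37.42 | 2.31 |
| 8 | 29.30 | 2.95 | 31.55 | 2.74 |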
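
The ns/day column appears to follow directly from the wall-clock time. The sketch below reproduces the E5462 column under the assumption that the Original JAC run covers 1 ps of simulated time; that run length is inferred from the tabulated numbers, not stated on this page, and `ns_per_day` and `JAC_ORIGINAL_PS` are illustrative names.

```python
SECONDS_PER_DAY = 86_400

# Assumption: 1 ps simulated per run, inferred from the table (not stated in the text).
JAC_ORIGINAL_PS = 1.0


def ns_per_day(wall_seconds, simulated_ps):
    """Throughput in ns/day for a run that simulated `simulated_ps` picoseconds."""
    runs_per_day = SECONDS_PER_DAY / wall_seconds
    return runs_per_day * simulated_ps / 1000.0  # ps -> ns


# Wall-clock times for the 2 x Intel E5462 workstation, keyed by MPI threads.
e5462_times = {2: 85.97, 4: 46.43, 6: 34.94, 8: 29.30}

for threads, seconds in e5462_times.items():
    print(threads, round(ns_per_day(seconds, JAC_ORIGINAL_PS), 2))
```

The same function would apply to the production benchmarks with a different assumed run length.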


 

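Machines: TACC Ranger (16 cores [4 CPUs] total per node) and NICS Kraken (4 cores [1 CPU] total per node). Column labels give the cores used per node (PPN).

| MPI Threads | Ranger 16 PPN Time (s) | Ranger 16 PPN ns/day | Ranger 8 PPN Time (s) | Ranger 8 PPN ns/day | Ranger 4 PPN Time (s) | Ranger 4 PPN ns/day | Kraken 4 PPN Time (s) | Kraken 4 PPN ns/day | Kraken 2 PPN Time (s) | Kraken 2 PPN ns/day | Kraken 1 PPN Time (s) | Kraken 1 PPN ns/day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | | | | | 73.58 | 1.17 | 81.46 | 1.06 | 72.58 | 1.19 | 70.01 | 1.19 |
| 8 | | | 39.81 | 2.17 | 38.50 | 2.24 | 43.17 | 2.00 | 38.48 | 2.25 | 36.37 | 2.25 |
| 16 | 23.53 | 3.67 | 21.85 | 3.95 | 21.41 | 4.04 | 23.51 | 3.68 | 21.05 | 4.10 | 20.18 | 4.10 |
| 32 | 13.43 | 6.43 | 13.09 | 6.60 | 11.77 | 7.34 | 13.77 | 6.27 | 12.05 | 7.17 | 11.28 | 7.17 |
| 48 | 11.00 | 7.85 | 10.06 | 8.59 | 8.81 | 9.81 | 12.60 | 6.86 | 9.17 | 9.42 | 8.38 | 9.42 |
| 64 | 9.98 | 8.66 | 8.44 | 10.24 | 6.90 | 12.52 | 12.23 | 7.06 | 7.57 | 11.41 | 6.70 | 11.41 |
| 96 | 10.27 | 8.41 | 7.30 | 11.84 | 6.02 | 14.35 | 9.13 | 9.46 | 6.09 | 14.19 | 5.01 | 14.19 |
| 128 | 10.58 | 8.17 | 7.03 | 12.29 | 5.33 | 16.21 | 8.44 | 10.24 | 5.46 | 15.82 | 4.30 | 15.82 |
| 160 | 15.29 | 5.65 | 10.06 | 8.59 | 5.45 | 15.85 | 8.44 | 10.24 | 6.01 | 14.37 | 4.07 | 14.38 |
| 192 | 20.72 | 4.17 | 13.43 | 6.43 | 5.51 | 15.68 | 8.44 | 10.24 | 6.15 | 14.05 | 3.79 | 14.05 |
| 224 | 23.34 | 3.70 | 11.81 | 7.32 | 8.56 | 10.09 | 18.27 | 4.73 | 5.92 | 14.59 | 3.81 | 14.59 |
| 256 | 29.19 | 2.96 | 13.65 | 6.33 | 8.45 | 10.22 | 18.46 | 4.68 | 5.58 | 15.48 | 4.03 | 15.48 |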
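
Wall time stops improving well before 256 MPI threads in most columns. As a small illustration, the sketch below picks the MPI thread count with the shortest wall time for two columns of this table (Ranger at 16 PPN and Kraken at 1 PPN); the helper name `fastest` is hypothetical.

```python
# Wall-clock times (s) keyed by MPI thread count, copied from the table above.
ranger_16ppn = {16: 23.53, 32: 13.43, 48: 11.00, 64: 9.98, 96: 10.27,
                128: 10.58, 160: 15.29, 192: 20.72, 224: 23.34, 256: 29.19}
kraken_1ppn = {4: 70.01, 8: 36.37, 16: 20.18, 32: 11.28, 48: 8.38, 64: 6.70,
               96: 5.01, 128: 4.30, 160: 4.07, 192: 3.79, 224: 3.81, 256: 4.03}


def fastest(times):
    """Return (mpi_threads, seconds) for the entry with the shortest wall time."""
    return min(times.items(), key=lambda item: item[1])


for name, column in [("Ranger 16 PPN", ranger_16ppn), ("Kraken 1 PPN", kraken_1ppn)]:
    threads, seconds = fastest(column)
    print(f"{name}: fastest at {threads} MPI threads ({seconds} s)")
```

Adding more MPI threads beyond the reported count lengthens the run for that configuration.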


 

2) JAC Production NVE Benchmark (This is a more representative set of benchmarks for real world calculations)

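| MPI Threads | 2 x Intel E5462 Time (s) | 2 x Intel E5462 ns/day | 2 x Intel E5430 Time (s) | 2 x Intel E5430 ns/day |
|---|---|---|---|---|
| 2 | 747.31 | 2.31 | 806.41 | 2.14 |
| 4 | 399.96 | 4.32 | 431.03 | 4.01 |
| 6 | 301.54 | 5.73 | 330.41 | 5.23 |
| 8 | 258.03 | 6.70 | 283.85 | 6.09 |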
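
One way to read this table is as parallel efficiency relative to the smallest run. The sketch below computes speedup and efficiency for the E5462 column, taking the 2-thread run as the baseline because no single-thread time is reported; the function name and the choice of baseline are assumptions made for illustration.

```python
# Wall-clock times (s) for the 2 x Intel E5462 workstation, keyed by MPI threads.
e5462_nve = {2: 747.31, 4: 399.96, 6: 301.54, 8: 258.03}


def efficiency(times, baseline_threads):
    """Return {threads: (speedup, efficiency)} relative to the baseline run."""
    base_time = times[baseline_threads]
    result = {}
    for threads, seconds in times.items():
        speedup = base_time / seconds
        ideal = threads / baseline_threads  # perfect scaling from the baseline
        result[threads] = (speedup, speedup / ideal)
    return result


for threads, (speedup, eff) in efficiency(e5462_nve, baseline_threads=2).items():
    print(f"{threads} threads: speedup {speedup:.2f}x, efficiency {eff:.0%}")
```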


 

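Machines: TACC Ranger (16 cores [4 CPUs] total per node) and NICS Kraken (4 cores [1 CPU] total per node). Column labels give the cores used per node (PPN).

| MPI Threads | Ranger 16 PPN Time (s) | Ranger 16 PPN ns/day | Ranger 8 PPN Time (s) | Ranger 8 PPN ns/day | Ranger 4 PPN Time (s) | Ranger 4 PPN ns/day | Kraken 4 PPN Time (s) | Kraken 4 PPN ns/day | Kraken 2 PPN Time (s) | Kraken 2 PPN ns/day | Kraken 1 PPN Time (s) | Kraken 1 PPN ns/day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | | | | | 631.96 | 2.73 | 709.65 | 2.44 | 623.97 | 2.77 | 598.56 | 2.89 |
| 8 | | | 344.82 | 5.01 | 334.53 | 5.17 | 380.03 | 4.55 | 335.06 | 5.16 | 315.25 | 5.48 |
| 16 | 208.59 | 8.28 | 190.52 | 9.07 | 184.84 | 9.35 | 210.32 | 8.22 | 183.22 | 9.43 | 173.55 | 9.96 |
| 32 | 122.16 | 14.15 | 117.95 | 14.65 | 103.64 | 16.67 | 128.06 | 13.49 | 106.70 | 16.19 | 98.78 | 17.49 |
| 48 | 100.73 | 17.15 | 92.93 | 18.59 | 78.41 | 22.04 | 112.29 | 15.39 | 83.00 | 20.82 | 74.27 | 23.27 |
| 64 | 94.85 | 18.22 | 77.92 | 22.18 | 63.55 | 27.19 | 105.64 | 16.36 | 69.31 | 24.93 | 59.07 | 29.25 |
| 96 | 83.07 | 20.80 | 64.07 | 26.97 | 53.86 | 32.08 | 77.92 | 22.18 | 54.04 | 31.98 | 44.66 | 38.69 |
| 128 | 115.00 | 15.03 | 59.92 | 28.83 | 51.86 | 33.32 | 71.38 | 24.21 | 50.18 | 34.44 | 39.82 | 43.40 |
| 160 | 218.96 | 7.89 | 101.20 | 17.08 | 52.67 | 32.81 | 71.86 | 24.05 | 49.96 | 34.59 | 38.08 | 45.38 |
| 192 | 296.56 | 5.83 | 153.34 | 11.27 | 110.58 | 15.63 | 180.26 | 9.59 | 52.35 | 33.01 | 37.56 | 46.01 |
| 224 | 339.54 | 5.09 | 163.04 | 10.60 | 126.22 | 13.69 | 204.17 | 8.46 | 51.18 | 33.76 | 38.52 | 44.86 |
| 256 | 395.78 | 4.37 | 205.89 | 8.39 | 123.53 | 13.99 | 218.67 | 7.90 | 50.46 | 34.24 | 41.22 | 41.92 |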


 

3) JAC Production NVT Benchmark (This is a more representative set of benchmarks for real world calculations)

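| MPI Threads | 2 x Intel E5462 Time (s) | 2 x Intel E5462 ns/day | 2 x Intel E5430 Time (s) | 2 x Intel E5430 ns/day |
|---|---|---|---|---|
| 2 | 748.24 | 2.31 | 800.21 | 2.16 |
| 4 | 400.72 | 4.31 | 430.97 | 4.01 |
| 6 | 300.74 | 5.75 | 330.40 | 5.23 |
| 8 | 258.44 | 6.69 | 287.90 | 6.00 |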


 

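Machines: TACC Ranger (16 cores [4 CPUs] total per node) and NICS Kraken (4 cores [1 CPU] total per node). Column labels give the cores used per node (PPN).

| MPI Threads | Ranger 16 PPN Time (s) | Ranger 16 PPN ns/day | Ranger 8 PPN Time (s) | Ranger 8 PPN ns/day | Ranger 4 PPN Time (s) | Ranger 4 PPN ns/day | Kraken 4 PPN Time (s) | Kraken 4 PPN ns/day | Kraken 2 PPN Time (s) | Kraken 2 PPN ns/day | Kraken 1 PPN Time (s) | Kraken 1 PPN ns/day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | | | | | 630.75 | 2.74 | 710.28 | 2.43 | 623.75 | 2.77 | 598.85 | 2.89 |
| 8 | | | 344.36 | 5.02 | 334.30 | 5.17 | 380.24 | 4.54 | 335.03 | 5.16 | 315.18 | 5.48 |
| 16 | 207.77 | 8.32 | 190.43 | 9.07 | 184.71 | 9.36 | 209.80 | 8.24 | 183.06 | 9.43 | 173.47 | 9.96 |
| 32 | 122.24 | 14.14 | 120.00 | 14.40 | 103.62 | 16.68 | 127.27 | 13.58 | 106.79 | 16.18 | 98.80 | 17.49 |
| 48 | 100.91 | 17.12 | 93.28 | 18.52 | 80.32 | 21.51 | 112.13 | 15.41 | 83.13 | 20.79 | 74.23 | 23.28 |
| 64 | 88.20 | 19.59 | 78.06 | 22.14 | 63.32 | 27.29 | 109.97 | 15.71 | 69.59 | 24.83 | 59.16 | 29.21 |
| 96 | 83.23 | 20.76 | 62.59 | 27.61 | 53.75 | 32.15 | 78.78 | 21.93 | 54.34 | 31.80 | 44.62 | 38.73 |
| 128 | 103.00 | 16.78 | 60.30 | 28.66 | 50.44 | 34.26 | 75.86 | 22.78 | 49.33 | 35.03 | 39.86 | 43.35 |
| 160 | 198.49 | 8.71 | 102.43 | 16.87 | 51.29 | 33.69 | 78.17 | 22.11 | 50.34 | 34.33 | 37.91 | 45.58 |
| 192 | 299.54 | 5.77 | 127.76 | 13.53 | 161.93 | 10.67 | 187.96 | 9.19 | 52.70 | 32.79 | 37.55 | 46.02 |
| 224 | 349.73 | 4.94 | 163.87 | 10.54 | 134.74 | 12.82 | 202.54 | 8.53 | 51.05 | 33.85 | 37.98 | 45.50 |
| 256 | 410.23 | 4.21 | 202.00 | 8.55 | 131.54 | 13.14 | 219.46 | 7.87 | 49.18 | 35.14 | 40.60 | 42.56 |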


 

4) JAC Production NPT Benchmark (This is a more representative set of benchmarks for real world calculations)

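| MPI Threads | 2 x Intel E5462 Time (s) | 2 x Intel E5462 ns/day | 2 x Intel E5430 Time (s) | 2 x Intel E5430 ns/day |
|---|---|---|---|---|
| 2 | 890.17 | 1.94 | 936.72 | 1.84 |
| 4 | 465.90 | 3.71 | 502.05 | 3.44 |
| 6 | 352.14 | 4.91 | 378.85 | 4.56 |
| 8 | 300.55 | 5.75 | 332.72 | 5.19 |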


 

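Machines: TACC Ranger (16 cores [4 CPUs] total per node) and NICS Kraken (4 cores [1 CPU] total per node). Column labels give the cores used per node (PPN).

| MPI Threads | Ranger 16 PPN Time (s) | Ranger 16 PPN ns/day | Ranger 8 PPN Time (s) | Ranger 8 PPN ns/day | Ranger 4 PPN Time (s) | Ranger 4 PPN ns/day | Kraken 4 PPN Time (s) | Kraken 4 PPN ns/day | Kraken 2 PPN Time (s) | Kraken 2 PPN ns/day | Kraken 1 PPN Time (s) | Kraken 1 PPN ns/day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | | | | | 706.03 | 2.45 | 779.33 | 2.22 | 690.94 | 2.50 | 661.92 | 2.61 |
| 8 | | | 385.49 | 4.48 | 370.65 | 4.66 | 419.98 | 4.11 | 367.37 | 4.70 | 345.44 | 5.00 |
| 16 | 229.04 | 7.54 | 210.41 | 8.21 | 204.04 | 8.47 | 228.71 | 7.56 | 201.97 | 8.56 | 190.21 | 9.08 |
| 32 | 135.54 | 12.75 | 133.81 | 12.91 | 115.30 | 14.99 | 141.51 | 12.21 | 116.76 | 14.80 | 107.04 | 16.14 |
| 48 | 139.80 | 12.36 | 112.54 | 15.35 | 90.62 | 19.07 | 157.22 | 10.99 | 97.83 | 17.66 | 82.80 | 20.87 |
| 64 | 128.90 | 13.41 | 98.68 | 17.51 | 74.64 | 23.15 | 132.49 | 13.04 | 82.94 | 20.83 | 67.47 | 25.61 |
| 96 | 110.89 | 15.58 | 82.19 | 21.02 | 108.60 | 15.91 | 99.94 | 17.29 | 71.23 | 24.26 | 54.02 | 31.99 |
| 128 | 140.40 | 12.31 | 78.41 | 22.04 | 68.18 | 25.34 | 91.86 | 18.81 | 65.18 | 26.51 | 47.10 | 36.69 |
| 160 | 251.58 | 6.87 | 96.52 | 17.90 | 66.95 | 25.81 | 103.49 | 16.70 | 65.89 | 26.23 | 47.41 | 36.44 |
| 192 | 322.88 | 5.35 | 147.21 | 11.74 | 126.10 | 13.70 | 227.68 | 7.59 | 71.99 | 24.00 | 48.17 | 35.87 |
| 224 | 392.01 | 4.41 | 192.56 | 8.97 | 145.43 | 11.88 | 228.32 | 7.57 | 63.05 | 27.41 | 48.50 | 35.63 |
| 256 | 463.67 | 3.73 | 275.23 | 6.28 | 164.78 | 10.49 | 260.84 | 6.62 | 66.83 | 25.86 | 53.63 | 32.22 |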


 

Factor IX Benchmark

1) FactorIX Original Benchmark (Note: This does NOT do any I/O and so is not very representative of real calculations)

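| MPI Threads | 2 x Intel E5462 Time (s) | 2 x Intel E5462 ns/day | 2 x Intel E5430 Time (s) | 2 x Intel E5430 ns/day |
|---|---|---|---|---|
| 2 | 148.85 | 0.44 | 162.85 | 0.40 |
| 4 | 80.50 | 0.80 | 87.22 | 0.74 |
| 6 | 61.07 | 1.06 | 65.62 | 0.99 |
| 8 | 51.83 | 1.25 | 58.82 | 1.10 |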


 

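Machines: TACC Ranger (16 cores [4 CPUs] total per node) and NICS Kraken (4 cores [1 CPU] total per node). Column labels give the cores used per node (PPN).

| MPI Threads | Ranger 16 PPN Time (s) | Ranger 16 PPN ns/day | Ranger 8 PPN Time (s) | Ranger 8 PPN ns/day | Ranger 4 PPN Time (s) | Ranger 4 PPN ns/day | Kraken 4 PPN Time (s) | Kraken 4 PPN ns/day | Kraken 2 PPN Time (s) | Kraken 2 PPN ns/day | Kraken 1 PPN Time (s) | Kraken 1 PPN ns/day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | | | | | 123.4 | 0.53 | 141.54 | 0.46 | 120.55 | 0.54 | 113.69 | 0.57 |
| 8 | | | 65.96 | 0.98 | 62.19 | 1.04 | 75.03 | 0.86 | 63.43 | 1.02 | 58.92 | 1.10 |
| 16 | 39.19 | 1.65 | 34.56 | 1.88 | 33.26 | 1.95 | 40.98 | 1.58 | 33.69 | 1.92 | 31.04 | 2.09 |
| 32 | 22.96 | 2.82 | 21.45 | 3.02 | 19.58 | 3.31 | 23.74 | 2.73 | 20.37 | 3.18 | 18.64 | 3.48 |
| 48 | 17.33 | 3.74 | 15.59 | 4.16 | 14.37 | 4.51 | 17.71 | 3.66 | 14.65 | 4.42 | 13.67 | 4.74 |
| 64 | 14.60 | 4.44 | 12.51 | 5.18 | 11.58 | 5.60 | 14.48 | 4.48 | 11.88 | 5.45 | 11.13 | 5.82 |
| 96 | 11.94 | 5.43 | 9.85 | 6.58 | 9.33 | 6.95 | 10.96 | 5.91 | 9.29 | 6.98 | 8.60 | 7.53 |
| 128 | 8.57 | 7.56 | 8.69 | 7.46 | 8.57 | 7.56 | 9.71 | 6.67 | 7.72 | 8.40 | 7.05 | 9.19 |
| 160 | 11.03 | 5.87 | 9.54 | 6.79 | 11.03 | 5.87 | 8.91 | 7.27 | 6.76 | 9.59 | 6.31 | 10.27 |
| 192 | 12.82 | 5.05 | 8.19 | 7.91 | 6.64 | 9.76 | 9.00 | 7.20 | 6.67 | 9.72 | 5.58 | 11.61 |
| 224 | 12.67 | 5.11 | 7.92 | 8.18 | 7.87 | 8.23 | 9.23 | 7.02 | 6.05 | 10.71 | 5.29 | 12.25 |
| 256 | 18.91 | 3.43 | 15.16 | 4.27 | 9.01 | 7.19 | 9.21 | 7.04 | 5.79 | 11.19 | 4.94 | 13.12 |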


 

2) FactorIX Production NVE Benchmark (This is a more representative set of benchmarks for real world calculations)

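| MPI Threads | 2 x Intel E5462 Time (s) | 2 x Intel E5462 ns/day | 2 x Intel E5430 Time (s) | 2 x Intel E5430 ns/day |
|---|---|---|---|---|
| 2 | 3177.60 | 0.54 | 3439.07 | 0.50 |
| 4 | 1639.81 | 1.05 | 1778.25 | 0.97 |
| 6 | 1248.67 | 1.38 | 1337.97 | 1.29 |
| 8 | 1068.24 | 1.62 | 1202.00 | 1.44 |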


 
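Machines: TACC Ranger (16 cores [4 CPUs] total per node) and NICS Kraken (4 cores [1 CPU] total per node). Column labels give the cores used per node (PPN).

| MPI Threads | Ranger 16 PPN Time (s) | Ranger 16 PPN ns/day | Ranger 8 PPN Time (s) | Ranger 8 PPN ns/day | Ranger 4 PPN Time (s) | Ranger 4 PPN ns/day | Kraken 4 PPN Time (s) | Kraken 4 PPN ns/day | Kraken 2 PPN Time (s) | Kraken 2 PPN ns/day | Kraken 1 PPN Time (s) | Kraken 1 PPN ns/day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | | | | | 2513.11 | 0.69 | 2871.92 | 0.60 | 2456.23 | 0.70 | 2315.32 | 0.75 |
| 8 | | | 1358.07 | 1.27 | 1277.24 | 1.35 | 1518.01 | 1.14 | 1283.87 | 1.35 | 1196.18 | 1.44 |
| 16 | 803.03 | 2.15 | 711.84 | 2.43 | 682.58 | 2.53 | 825.99 | 2.09 | 676.99 | 2.55 | 622.89 | 2.77 |
| 32 | 449.93 | 3.84 | 410.56 | 4.21 | 381.73 | 4.53 | 450.72 | 3.83 | 381.76 | 4.53 | 354.88 | 4.87 |
| 48 | 332.25 | 5.20 | 303.29 | 5.70 | 280.56 | 6.16 | 331.18 | 5.22 | 274.63 | 6.29 | 254.52 | 6.79 |
| 64 | 278.84 | 6.20 | 239.98 | 7.20 | 223.34 | 7.74 | 269.50 | 6.41 | 220.00 | 7.85 | 201.00 | 8.60 |
| 96 | 221.38 | 7.81 | 193.75 | 8.92 | 185.46 | 9.32 | 209.10 | 8.26 | 172.86 | 10.00 | 158.19 | 10.92 |
| 128 | 187.35 | 9.22 | 163.97 | 10.54 | 168.38 | 10.26 | 185.43 | 9.32 | 140.47 | 12.30 | 126.78 | 13.63 |
| 160 | 208.02 | 8.31 | 184.75 | 9.35 | 224.67 | 7.69 | 167.66 | 10.31 | 120.91 | 14.29 | 111.68 | 15.47 |
| 192 | 193.45 | 8.93 | 197.40 | 8.75 | 131.35 | 13.16 | 164.13 | 10.53 | 116.44 | 14.84 | 98.47 | 17.55 |
| 224 | 276.10 | 6.26 | 141.21 | 12.24 | 142.12 | 12.16 | 162.70 | 10.62 | 108.81 | 15.88 | 90.67 | 19.06 |
| 256 | 373.53 | 4.62 | 273.37 | 6.32 | 162.96 | 10.60 | 283.44 | 6.10 | 101.82 | 16.97 | 85.89 | 20.12 |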


 

3) FactorIX Production NVT Benchmark (This is a more representative set of benchmarks for real world calculations)

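| MPI Threads | 2 x Intel E5462 Time (s) | 2 x Intel E5462 ns/day | 2 x Intel E5430 Time (s) | 2 x Intel E5430 ns/day |
|---|---|---|---|---|
| 2 | 3185.58 | 0.54 | 3426.52 | 0.50 |
| 4 | 1635.16 | 1.06 | 1781.79 | 0.97 |
| 6 | 1238.26 | 1.40 | 1340.79 | 1.29 |
| 8 | 1070.38 | 1.61 | 1203.77 | 1.44 |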


 
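Machines: TACC Ranger (16 cores [4 CPUs] total per node) and NICS Kraken (4 cores [1 CPU] total per node). Column labels give the cores used per node (PPN).

| MPI Threads | Ranger 16 PPN Time (s) | Ranger 16 PPN ns/day | Ranger 8 PPN Time (s) | Ranger 8 PPN ns/day | Ranger 4 PPN Time (s) | Ranger 4 PPN ns/day | Kraken 4 PPN Time (s) | Kraken 4 PPN ns/day | Kraken 2 PPN Time (s) | Kraken 2 PPN ns/day | Kraken 1 PPN Time (s) | Kraken 1 PPN ns/day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | | | | | | | 2887.60 | 0.60 | 2463.15 | 0.70 | 2317.55 | 0.75 |
| 8 | | | 1354.41 | 1.28 | 1283.24 | 1.35 | 1521.06 | 1.14 | 1284.32 | 1.35 | 1196.40 | 1.44 |
| 16 | 802.48 | 2.15 | 707.68 | 2.44 | 682.35 | 2.53 | 828.53 | 2.09 | 677.02 | 2.55 | 623.74 | 2.77 |
| 32 | 453.42 | 3.81 | 419.37 | 4.12 | 382.10 | 4.52 | 453.10 | 3.81 | 381.62 | 4.53 | 355.04 | 4.87 |
| 48 | 334.17 | 5.17 | 300.78 | 5.74 | 279.40 | 6.18 | 332.13 | 5.20 | 274.55 | 6.29 | 254.40 | 6.79 |
| 64 | 279.54 | 6.18 | 237.20 | 7.28 | 222.02 | 7.78 | 269.63 | 6.41 | 219.83 | 7.86 | 200.95 | 8.60 |
| 96 | 232.25 | 7.44 | 191.75 | 9.01 | 186.64 | 9.26 | 211.47 | 8.17 | 172.90 | 9.99 | 158.42 | 10.91 |
| 128 | 187.85 | 9.20 | 163.07 | 10.60 | 220.63 | 7.83 | 183.32 | 9.43 | 140.04 | 12.34 | 126.52 | 13.66 |
| 160 | 202.52 | 8.53 | 173.60 | 9.95 | 226.43 | 7.63 | 168.61 | 10.25 | 120.54 | 14.34 | 111.98 | 15.43 |
| 192 | 212.82 | 8.12 | 153.10 | 11.29 | 135.04 | 12.80 | 151.68 | 10.69 | 117.14 | 14.75 | 100.73 | 17.15 |
| 224 | 285.67 | 6.05 | 140.24 | 12.32 | 144.06 | 12.00 | 167.16 | 10.34 | 108.03 | 16.00 | 91.11 | 18.97 |
| 256 | 358.11 | 4.83 | 175.57 | 9.84 | 149.74 | 11.54 | 282.17 | 6.12 | 101.41 | 17.04 | 85.47 | 20.22 |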


 

4) FactorIX Production NPT Benchmark (This is a more representative set of benchmarks for real world calculations)

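| MPI Threads | 2 x Intel E5462 Time (s) | 2 x Intel E5462 ns/day | 2 x Intel E5430 Time (s) | 2 x Intel E5430 ns/day |
|---|---|---|---|---|
| 2 | 3580.80 | 0.48 | 3956.05 | 0.44 |
| 4 | 1927.24 | 0.90 | 2097.06 | 0.82 |
| 6 | 1438.97 | 1.20 | 1566.16 | 1.10 |
| 8 | 1238.03 | 1.40 | 1387.17 | 1.25 |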


 

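Machines: TACC Ranger (16 cores [4 CPUs] total per node) and NICS Kraken (4 cores [1 CPU] total per node). Column labels give the cores used per node (PPN).

| MPI Threads | Ranger 16 PPN Time (s) | Ranger 16 PPN ns/day | Ranger 8 PPN Time (s) | Ranger 8 PPN ns/day | Ranger 4 PPN Time (s) | Ranger 4 PPN ns/day | Kraken 4 PPN Time (s) | Kraken 4 PPN ns/day | Kraken 2 PPN Time (s) | Kraken 2 PPN ns/day | Kraken 1 PPN Time (s) | Kraken 1 PPN ns/day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | | | | | | | 3214.33 | 0.54 | 2749.48 | 0.62 | 2599.85 | 0.66 |
| 8 | | | 1538.94 | 1.12 | 1455.83 | 1.19 | 1700.43 | 1.02 | 1441.15 | 1.20 | 1338.27 | 1.29 |
| 16 | 898.35 | 1.92 | 812.53 | 2.13 | 778.15 | 2.22 | 918.63 | 1.88 | 754.85 | 2.29 | 695.66 | 2.48 |
| 32 | 499.43 | 3.46 | 457.31 | 3.78 | 424.73 | 4.07 | 496.25 | 3.48 | 422.62 | 4.09 | 390.88 | 4.42 |
| 48 | 375.71 | 4.60 | 339.85 | 5.08 | 313.63 | 5.51 | 371.02 | 4.66 | 304.67 | 5.67 | 280.71 | 6.16 |
| 64 | 330.60 | 5.23 | 278.59 | 6.20 | 245.62 | 7.04 | 327.55 | 5.28 | 250.82 | 6.89 | 222.00 | 7.78 |
| 96 | 265.22 | 6.52 | 230.23 | 7.51 | 260.18 | 6.64 | 255.11 | 6.77 | 192.41 | 8.98 | 171.95 | 10.05 |
| 128 | 237.30 | 7.28 | 194.44 | 8.89 | 278.31 | 6.21 | 225.88 | 7.65 | 169.14 | 10.22 | 141.18 | 12.24 |
| 160 | 226.70 | 7.62 | 206.26 | 8.38 | 243.86 | 7.09 | 209.52 | 8.25 | 144.03 | 12.00 | 125.59 | 13.76 |
| 192 | 263.97 | 6.55 | 232.17 | 7.44 | 236.03 | 7.32 | 206.19 | 8.38 | 144.86 | 11.93 | 113.84 | 15.18 |
| 224 | 312.97 | 5.52 | 180.02 | 9.60 | 179.71 | 9.62 | 187.08 | 9.24 | 136.91 | 12.62 | 105.03 | 16.45 |
| 256 | 442.07 | 3.91 | 223.35 | 7.74 | 230.27 | 7.50 | 309.84 | 5.58 | 139.12 | 12.42 | 104.70 | 16.50 |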


 

Cellulose Benchmark

1) Cellulose Production NVE Benchmark (This is a more representative set of benchmarks for real world calculations)

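| MPI Threads | 2 x Intel E5462 Time (s) | 2 x Intel E5462 ns/day | 2 x Intel E5430 Time (s) | 2 x Intel E5430 ns/day |
|---|---|---|---|---|
| 2 | 15771.57 | 0.11 | 16647.33 | 0.10 |
| 4 | 7959.31 | 0.22 | 8693.45 | 0.20 |
| 6 | 5783.61 | 0.30 | 6377.07 | 0.27 |
| 8 | 4871.55 | 0.35 | 5513.70 | 0.31 |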


 

Machines: TACC Ranger (16 cores [4 CPUs] total per node) and NICS Kraken (4 cores [1 CPU] total per node). Column labels give the cores used per node (PPN).

| MPI Threads | Ranger 16 PPN Time (s) | Ranger 16 PPN ns/day | Ranger 8 PPN Time (s) | Ranger 8 PPN ns/day | Ranger 4 PPN Time (s) | Ranger 4 PPN ns/day | Kraken 4 PPN Time (s) | Kraken 4 PPN ns/day | Kraken 2 PPN Time (s) | Kraken 2 PPN ns/day | Kraken 1 PPN Time (s) | Kraken 1 PPN ns/day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 32 | 2248.52 | 0.77 | 1966.03 | 0.88 | 1832.49 | 0.94 | 2101.94 | 0.82 | 1727.80 | 1.00 | 1588.09 | 1.09 |
| 48 | 1521.62 | 1.14 | 1373.30 | 1.26 | 1289.22 | 1.34 | 1409.15 | 1.23 | 1229.35 | 1.41 | 1154.30 | 1.50 |
| 64 | 1264.50 | 1.37 | 1074.22 | 1.61 | 1024.68 | 1.69 | 1142.50 | 1.51 | 966.79 | 1.79 | 898.56 | 1.92 |
| 96 | 948.61 | 1.82 | 802.48 | 2.15 | 813.33 | 2.12 | 852.17 | 2.03 | 705.59 | 2.45 | 650.26 | 2.66 |
| 128 | 809.01 | 2.14 | 658.40 | 2.62 | 657.84 | 2.63 | 702.70 | 2.46 | 572.54 | 3.02 | 527.90 | 3.27 |
| 160 | 719.74 | 2.40 | 667.48 | 2.59 | 607.94 | 2.84 | 606.16 | 2.85 | 500.68 | 3.45 | 458.65 | 3.77 |
| 192 | 631.09 | 2.74 | 600.00 | 2.88 | 558.45 | 3.09 | 553.81 | 3.12 | 444.56 | 3.89 | 398.82 | 4.33 |
| 224 | 592.72 | 2.92 | 501.98 | 3.44 | 517.47 | 3.34 | 516.62 | 3.34 | 406.76 | 4.25 | 357.14 | 4.84 |
| 256 | 575.21 | 3.00 | 610.78 | 2.83 | 453.37 | 3.81 | 496.33 | 3.48 | 379.38 | 4.55 | 334.27 | 5.17 |
| 288 | 627.66 | 2.75 | 468.74 | 3.69 | 488.67 | 3.54 | 474.74 | 3.64 | 363.60 | 4.75 | 310.97 | 5.56 |
| 352 | 873.93 | 1.98 | 598.90 | 2.89 | 621.51 | 2.78 | 467.28 | 3.70 | 336.79 | 5.13 | 282.35 | 6.12 |
| 384 | 1012.00 | 1.71 | 700.00 | 2.47 | 688.91 | 2.51 | 462.31 | 3.74 | 319.18 | 5.41 | 274.75 | 6.29 |
| 512 | 1308.00 | 1.32 | 863.00 | 2.00 | 738.00 | 2.34 | 634.75 | 2.72 | 301.45 | 5.73 | 263.05 | 6.57 |


 

2) Cellulose Production NVT Benchmark (This is a more representative set of benchmarks for real world calculations)

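| MPI Threads | 2 x Intel E5462 Time (s) | 2 x Intel E5462 ns/day | 2 x Intel E5430 Time (s) | 2 x Intel E5430 ns/day |
|---|---|---|---|---|
| 2 | 15296.54 | 0.11 | 16748.45 | 0.10 |
| 4 | 7940.23 | 0.22 | 8706.83 | 0.20 |
| 6 | 5779.13 | 0.30 | 6352.30 | 0.27 |
| 8 | 4875.60 | 0.35 | 5467.51 | 0.32 |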


 

Machines: TACC Ranger (16 cores [4 CPUs] total per node) and NICS Kraken (4 cores [1 CPU] total per node). Column labels give the cores used per node (PPN).

| MPI Threads | Ranger 16 PPN Time (s) | Ranger 16 PPN ns/day | Ranger 8 PPN Time (s) | Ranger 8 PPN ns/day | Ranger 4 PPN Time (s) | Ranger 4 PPN ns/day | Kraken 4 PPN Time (s) | Kraken 4 PPN ns/day | Kraken 2 PPN Time (s) | Kraken 2 PPN ns/day | Kraken 1 PPN Time (s) | Kraken 1 PPN ns/day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 32 | 2226.98 | 0.78 | 1959.50 | 0.88 | 1842.20 | 0.94 | 2108.10 | 0.82 | 1728.16 | 1.00 | 1588.81 | 1.09 |
| 48 | 1512.63 | 1.14 | 1373.38 | 1.26 | 1289.00 | 1.34 | 1413.22 | 1.22 | 1229.73 | 1.41 | 1156.08 | 1.49 |
| 64 | 1250.06 | 1.38 | 1075.79 | 1.61 | 1031.48 | 1.68 | 1148.43 | 1.50 | 967.03 | 1.79 | 898.80 | 1.92 |
| 96 | 943.18 | 1.83 | 807.31 | 2.14 | 762.49 | 2.27 | 852.41 | 2.02 | 705.14 | 2.45 | 649.88 | 2.66 |
| 128 | 807.28 | 2.14 | 656.34 | 2.63 | 654.62 | 2.64 | 799.73 | 2.47 | 574.36 | 3.01 | 527.71 | 3.27 |
| 160 | 701.97 | 2.46 | 692.75 | 2.49 | 595.80 | 2.90 | 609.62 | 2.83 | 500.12 | 3.46 | 458.59 | 3.77 |
| 192 | 657.80 | 2.63 | 628.13 | 2.75 | 614.00 | 2.81 | 559.32 | 3.09 | 445.06 | 3.88 | 398.49 | 4.34 |
| 224 | 622.31 | 2.78 | 502.66 | 3.44 | 521.78 | 3.31 | 515.53 | 3.35 | 405.64 | 4.26 | 357.15 | 4.84 |
| 256 | 563.73 | 3.07 | 493.25 | 3.50 | 452.36 | 3.81 | 493.47 | 3.50 | 385.02 | 4.49 | 333.73 | 5.18 |
| 288 | 738.06 | 2.34 | 479.96 | 3.60 | 507.52 | 3.40 | 480.58 | 3.60 | 361.05 | 4.79 | 310.12 | 5.57 |
| 352 | 878.35 | 1.97 | 505.52 | 3.42 | 602.38 | 2.87 | 467.62 | 3.70 | 336.80 | 5.13 | 284.15 | 6.08 |
| 384 | 907.14 | 1.90 | 827.11 | 2.09 | 299.72 | 2.47 | 458.05 | 3.77 | 321.15 | 5.38 | 273.93 | 6.31 |
| 512 | 978.69 | 1.77 | 829.91 | 2.08 | 708.11 | 2.44 | 626.95 | 2.76 | 301.72 | 5.73 | 268.94 | 6.43 |
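
Because each ns/day value should follow from its wall time, the pairs in a row can be cross-checked against each other. The sketch below does this for the 384-thread row of this table, assuming each production run covers 20 ps of simulated time; that figure is inferred from the tabulated numbers and is not stated on this page, and the function name and tolerance are illustrative. A flagged pair only shows that the two numbers disagree, not which of them is mistyped.

```python
SECONDS_PER_DAY = 86_400

# Assumption: 20 ps simulated per production run, inferred from the tables.
ASSUMED_RUN_PS = 20.0


def inconsistent_pairs(pairs, run_ps=ASSUMED_RUN_PS, tolerance=0.01):
    """Return (seconds, listed, recomputed) for pairs whose ns/day disagrees with the time."""
    flagged = []
    for seconds, listed in pairs:
        recomputed = SECONDS_PER_DAY / seconds * run_ps / 1000.0
        if abs(recomputed - listed) > tolerance:
            flagged.append((seconds, listed, round(recomputed, 2)))
    return flagged


# (Time (s), ns/day) pairs for 384 MPI threads, in table column order.
row_384 = [(907.14, 1.90), (827.11, 2.09), (299.72, 2.47),
           (458.05, 3.77), (321.15, 5.38), (273.93, 6.31)]

for seconds, listed, recomputed in inconsistent_pairs(row_384):
    print(f"time {seconds} s lists {listed} ns/day, but implies {recomputed} ns/day")
```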


 

3) Cellulose Production NPT Benchmark (This is a more representative set of benchmarks for real world calculations)

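| MPI Threads | 2 x Intel E5462 Time (s) | 2 x Intel E5462 ns/day | 2 x Intel E5430 Time (s) | 2 x Intel E5430 ns/day |
|---|---|---|---|---|
| 2 | 17650.65 | 0.10 | 18952.54 | 0.10 |
| 4 | 9373.97 | 0.18 | 10233.36 | 0.17 |
| 6 | 6680.28 | 0.26 | 7300.58 | 0.24 |
| 8 | 5424.18 | 0.32 | 6060.63 | 0.29 |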


 

Machines: TACC Ranger (16 cores [4 CPUs] total per node) and NICS Kraken (4 cores [1 CPU] total per node). Column labels give the cores used per node (PPN).

| MPI Threads | Ranger 16 PPN Time (s) | Ranger 16 PPN ns/day | Ranger 8 PPN Time (s) | Ranger 8 PPN ns/day | Ranger 4 PPN Time (s) | Ranger 4 PPN ns/day | Kraken 4 PPN Time (s) | Kraken 4 PPN ns/day | Kraken 2 PPN Time (s) | Kraken 2 PPN ns/day | Kraken 1 PPN Time (s) | Kraken 1 PPN ns/day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 32 | 2346.01 | 0.74 | 2201.00 | 0.79 | 1987.31 | 0.87 | 2314.27 | 0.75 | 1863.30 | 0.93 | 1693.97 | 1.02 |
| 48 | 1644.30 | 1.05 | 1537.18 | 1.12 | 1415.94 | 1.22 | 1534.66 | 1.13 | 1312.57 | 1.32 | 1219.81 | 1.42 |
| 64 | 1455.11 | 1.19 | 1218.95 | 1.42 | 1121.51 | 1.54 | 1252.25 | 1.38 | 1044.51 | 1.65 | 946.98 | 1.82 |
| 96 | 1218.26 | 1.42 | 982.85 | 1.76 | 947.34 | 1.82 | 978.50 | 1.77 | 785.88 | 2.20 | 686.19 | 2.52 |
| 128 | 1124.55 | 1.54 | 858.28 | 2.01 | 805.00 | 2.15 | 825.98 | 2.09 | 650.23 | 2.66 | 551.15 | 3.14 |
| 160 | 997.74 | 1.73 | 800.27 | 2.16 | 680.70 | 2.54 | 725.41 | 2.38 | 576.24 | 3.00 | 482.26 | 3.58 |
| 192 | 919.02 | 1.88 | 711.02 | 2.43 | 729.13 | 2.37 | 671.29 | 2.57 | 519.73 | 3.32 | 430.33 | 4.02 |
| 224 | 848.28 | 2.04 | 625.58 | 2.76 | 560.95 | 3.08 | 626.96 | 2.76 | 471.09 | 3.67 | 384.50 | 4.49 |
| 256 | 972.74 | 1.78 | 633.18 | 2.73 | 568.55 | 3.04 | 604.34 | 2.86 | 455.10 | 3.80 | 366.65 | 4.71 |
| 288 | 1129.23 | 1.53 | 720.01 | 2.40 | 828.45 | 2.09 | 615.86 | 2.81 | 441.65 | 3.91 | 350.31 | 4.93 |
| 352 | 1166.52 | 1.48 | 736.43 | 2.35 | 812.14 | 2.13 | 594.70 | 2.91 | 410.36 | 4.21 | 314.60 | 5.49 |
| 384 | 1162.30 | 1.49 | 802.11 | 2.15 | 838.97 | 2.06 | 580.26 | 2.98 | 404.94 | 4.27 | 307.12 | 5.63 |
| 512 | 1298.03 | 1.33 | 927.27 | 1.86 | 901.13 | 1.92 | 726.20 | 2.38 | 375.43 | 4.60 | 303.31 | 5.70 |
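
Running fewer MPI threads per node raises throughput but occupies more nodes for the same thread count. The sketch below works this out for the Kraken columns at 256 MPI threads in this table, taking the node count to be MPI threads divided by PPN; the helper name `nodes_used` is hypothetical, and charging by whole nodes is an assumption, not something this page states.

```python
def nodes_used(mpi_threads, ppn):
    """Nodes occupied when `ppn` MPI threads are placed on each node."""
    return mpi_threads // ppn


# Kraken, Cellulose NPT, 256 MPI threads: PPN -> ns/day, copied from the table above.
kraken_npt_256 = {4: 2.86, 2: 3.80, 1: 4.71}

for ppn, rate in kraken_npt_256.items():
    nodes = nodes_used(256, ppn)
    print(f"{ppn} PPN: {nodes} nodes, {rate} ns/day, {rate / nodes:.4f} ns/day per node")
```

On these numbers, 1 PPN gives the highest ns/day, while 4 PPN gives the most ns/day per node occupied.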
