Benchmarking spatial decomposition performance in CHARMM
Data
First number is the timing in seconds (lower is better)
Second number, in parentheses (), is the speedup from the previous number of CPUs
Third number, in angle brackets <>, is the speedup over the single-CPU timing
OS is GNU/Linux-2.6.11-gentoo-r6, Gentoo distribution
CHARMM is c32a2 modified for spatially distributed parallel computation
pref.dat was used
AMD Opteron (x86_64): 2 x Opteron 242 (1.6 GHz) per box, Gigabit Ethernet, inexpensive 24-port switch
| CPUs | spatial | standard | NAMD |
| | 276483 atoms | 276483 atoms | 276483 atoms |
| 1 | 1034.1 | 1049.2 | 1189.0 |
| 2 | 547.3(1.89)<1.89> | 552.5(1.90)<1.90> | 651.37(1.83)<1.83> |
| 4 | 283.9(1.93)<3.64> | 301.9(1.83)<3.48> | 338.38(1.92)<3.22> |
| 8 | 158.6(1.79)<6.52> | 176.3(1.71)<5.95> | 168.84(2.00)<6.45> |
| 16 | 90.0(1.76)<11.49> | 115.0(1.53)<9.12> | 86.2(1.96)<12.63> |
| 32 | 53.8(1.67)<19.22> | 84.1(1.37)<12.48> | 45.0(1.92)<24.2> |
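The two speedup figures attached to each timing can be recomputed from the raw timings alone. The following is a minimal sketch in Python, assuming both speedups are plain ratios of timings: the value in () divides the timing at the previous CPU count by the current one, and the value in <> divides the single-CPU timing by the current one. The function name `speedups` is illustrative and is not part of CHARMM or of the original benchmark scripts; the timings are copied from the spatial column.

```python
# Timings in seconds, copied from the "spatial" column of the table.
timings = {1: 1034.1, 2: 547.3, 4: 283.9, 8: 158.6, 16: 90.0, 32: 53.8}


def speedups(timings):
    """Return {cpus: (step_speedup, total_speedup)} for every CPU count
    except the smallest one.

    step_speedup  = timing at previous CPU count / timing at this count
    total_speedup = timing at smallest CPU count / timing at this count
    """
    cpus = sorted(timings)
    base = timings[cpus[0]]  # single-CPU timing
    result = {}
    for prev, cur in zip(cpus, cpus[1:]):
        result[cur] = (timings[prev] / timings[cur], base / timings[cur])
    return result


# Print rows in the same "time(step)<total>" layout used in the table.
for n, (step, total) in speedups(timings).items():
    print(f"{n:>2} CPUs: {timings[n]:.1f}({step:.2f})<{total:.2f}>")
```

Swapping in the timings from another column applies the same calculation to that column.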
Notes:
NAMD: version 2.5, binary taken from NAMD_2.5_Linux-i686-TCP.tar.gz
Compile options:
- gcc-x86_64-3.4.3: g77 -O3 -mcmodel=medium -ffast-math -fno-backslash -fugly-complex -fno-globals -Wno-globals
See also other benchmarks.
To repeat the timings in the standard column, follow the instructions on this page.
Milan Hodoscek
Last modified: Fri July 15 09:47:27 CEST 2005