Benchmarking spatial decomposition performance in CHARMM
Data
First number is the timing in seconds (lower is better).
Second number, in parentheses (), is the speedup over the previous number of CPUs.
Number in brackets <> is the speedup over the single-CPU timing.
OS is GNU/Linux-2.6.11-gentoo-r6, Gentoo distribution.
CHARMM is c32a2 modified for spatially distributed parallel computation.
pref.dat was used.
AMD Opteron (x86_64): 2 x Opteron 242 (1.6 GHz) per box, gigE, inexpensive 24-port switch
| CPUs | spatial           | standard          | NAMD               |
|      | 276483 atoms      | 276483 atoms      | 276483 atoms       |
| 1    | 1034.1            | 1049.2            | 1189.0             |
| 2    | 547.3(1.89)<1.89> | 552.5(1.90)<1.90> | 651.37(1.83)<1.83> |
| 4    | 283.9(1.93)<3.64> | 301.9(1.83)<3.48> | 338.38(1.92)<3.22> |
| 8    | 158.6(1.79)<6.52> | 176.3(1.71)<5.95> | 168.84(2.00)<6.45> |
| 16   | 90.0(1.76)<11.49> | 115.0(1.53)<9.12> | 86.2(1.96)<12.63>  |
| 32   | 53.8(1.67)<19.22> | 84.1(1.37)<12.48> | 45.0(1.92)<24.2>   |
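The () and <> figures can be recomputed from the raw timings. The following is a minimal Python sketch, not part of the original benchmark scripts; the names `TIMINGS`, `speedups`, and `format_cell` are hypothetical, and the timings are copied from the spatial and standard columns of the table above.

```python
# Timings in seconds, copied from the table above and keyed by CPU count.
# The dictionary layout and all names here are illustrative assumptions.
TIMINGS = {
    "spatial": {1: 1034.1, 2: 547.3, 4: 283.9, 8: 158.6, 16: 90.0, 32: 53.8},
    "standard": {1: 1049.2, 2: 552.5, 4: 301.9, 8: 176.3, 16: 115.0, 32: 84.1},
}


def speedups(timings):
    """Return {cpus: (step, total)} for every CPU count above the smallest.

    step  = timing at the previous CPU count / timing at this CPU count
    total = timing at the smallest CPU count / timing at this CPU count
    Both values are rounded to two decimals, as in the table.
    """
    cpus = sorted(timings)
    base = timings[cpus[0]]
    result = {}
    for prev, cur in zip(cpus, cpus[1:]):
        step = timings[prev] / timings[cur]
        total = base / timings[cur]
        result[cur] = (round(step, 2), round(total, 2))
    return result


def format_cell(seconds, step, total):
    """Format one table cell in the seconds(step)<total> notation."""
    return f"{seconds}({step:.2f})<{total:.2f}>"


if __name__ == "__main__":
    for name, timings in TIMINGS.items():
        for cpus, (step, total) in speedups(timings).items():
            print(name, cpus, format_cell(timings[cpus], step, total))
```

Applied to the NAMD column with the printed single-CPU timing of 1189.0 s, the same calculation reproduces the () values but gives larger <> values than those printed (for example 3.51 rather than 3.22 at 4 CPUs). The printed <> values are consistent with a single-CPU timing of roughly 1089 s, so one of the two figures may contain a typo.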
Notes:
NAMD: version 2.5, binary taken from NAMD_2.5_Linux-i686-TCP.tar.gz
Compile options:
- gcc-x86_64-3.4.3: g77 -O3 -mcmodel=medium -ffast-math -fno-backslash -fugly-complex -fno-globals -Wno-globals
See also other benchmarks.
In order to repeat the timings in the standard column, follow the instructions on this page.
Milan Hodoscek
Last modified: Fri July 15 09:47:27 CEST 2005