Benchmarking hierarchical network architecture with CHARMM
Data
Legend:
- First number is the timing in seconds (lower is better).
- Second number, in parentheses (), is the speedup over the previous number of CPUs.
- Third number, in angle brackets <>, is the speedup over the single-CPU timing.
Setup:
- OS is GNU/Linux-2.6.11-gentoo-r6, Gentoo distribution.
- CHARMM is unmodified c32a2.
- pref.dat was used.
- AMD Opteron (x86_64): 2 x Opteron 242 (1.6 GHz) per box, gigE, inexpensive 24-port switch.
| CPUs | sequential order | hierarchical order |
| | 276483 atoms | 276483 atoms |
| 1 | 1049.2 | 1049.2 |
| 2 | 552.5(1.90)<1.90> | 552.5(1.90)<1.90> |
| 4 | 313.2(1.76)<3.35> | 301.9(1.83)<3.48> |
| 8 | 198.6(1.58)<5.28> | 176.3(1.71)<5.95> |
| 16 | 140.7(1.41)<7.46> | 115.0(1.53)<9.12> |
| 32 | 111.6(1.26)<9.40> | 84.1(1.37)<12.48> |
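The two speedup figures in each cell follow directly from the timings. The sketch below is a minimal illustration of that arithmetic, not part of the original benchmark scripts; the names `speedups` and `format_cell` are hypothetical helpers introduced here, and the timings are copied from the table above.

```python
# Timings in seconds copied from the table above, keyed by CPU count.
sequential = {1: 1049.2, 2: 552.5, 4: 313.2, 8: 198.6, 16: 140.7, 32: 111.6}
hierarchical = {1: 1049.2, 2: 552.5, 4: 301.9, 8: 176.3, 16: 115.0, 32: 84.1}


def speedups(timings):
    """Return {cpus: (step_speedup, total_speedup)} for each CPU count above the smallest.

    step_speedup  = time at previous CPU count / time at this CPU count
    total_speedup = time at smallest CPU count / time at this CPU count
    """
    counts = sorted(timings)
    base = timings[counts[0]]
    result = {}
    for prev, cur in zip(counts, counts[1:]):
        result[cur] = (timings[prev] / timings[cur], base / timings[cur])
    return result


def format_cell(seconds, step, total):
    """Format one table cell in the page's notation: time(step)<total>."""
    return f"{seconds:.1f}({step:.2f})<{total:.2f}>"


if __name__ == "__main__":
    for name, timings in (("sequential", sequential), ("hierarchical", hierarchical)):
        for cpus, (step, total) in speedups(timings).items():
            print(f"{name:>12} {cpus:>2} CPUs: {format_cell(timings[cpus], step, total)}")
```

Run as a script, this prints one line per CPU count for each ordering, in the same `time(step)<total>` notation the table uses.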
Notes:
For details see this reference: Borstnik U, Hodoscek M, Janezic D, 'Improving the performance of molecular dynamics simulations on parallel clusters', J. Chem. Inf. Comput. Sci., 44 (2), 2004, pp. 359-364.
Compile options:
- gcc-x86_64-3.4.3: g77 -O3 -mcmodel=medium -ffast-math -fno-backslash -fugly-complex -fno-globals -Wno-globals
See also other tables: Spatial decomposition, More performance benchmarks.
Milan Hodoscek
Last modified: Fri July 15 09:47:27 CEST 2005