Greetings, CHARMM community.
I've recently tried the free academic CHARMM version to see whether I could easily set up simulations with the CHARMM Drude-2017 DNA FF. That wasn't too hard (mostly by copying test scripts and avoiding the manual as much as possible), but I am now having trouble with performance. Standard CPU-only runs (which I would probably even prefer, because the supercomputers I currently use don't have any GPUs) stop scaling very quickly, at about 48-96 cores, with a result of something like 0.7 ns/day. The CPUs in this case had 48 cores each, which could be the reason, but other simulation packages never had problems with that.
I didn't put much effort into testing each and every parameter; I only tried switching the neighbor list building algorithm between BYCBim and BYCUim, which didn't make much difference in the end (BYCBim was somewhat faster). In the timing statistics, the 96-core run shows an almost doubled "other" time. The increase is not reflected in any of the individual "other" entries; only the resulting total is much longer.
I couldn't get DOMDEC, BLADE or GRAPE to work with the vv2 integrator (I haven't tried OpenMM yet). The DOMDEC and BLADE manuals clearly state that they are not yet compatible with it, but GRAPE for some reason simply crashes any simulation I try to run (even with non-polarizable force fields) right after the line stating something like "GPU has * atoms". I've tried CUDA versions 11.8 and 12.0, and with both of them DOMDEC and BLADE run properly within their restrictions (BLADE actually manages to deliver full GPU performance). OpenMPI is built GPU-aware, if that matters.
The vv2 integrator is the only one that can run a simulation with several thermostats, right?
I would appreciate any help with both CPU acceleration and fixing GRAPE issues, as I am currently unable to progress with the simulations.
Thanks for your time.
Here's the input script I use to test performance, if it's of any help:
stream toppar_defs_2019.str
read sequence card
20
GUA GUA GUA GUA GUA GUA GUA GUA GUA GUA
GUA GUA GUA GUA GUA GUA GUA GUA GUA GUA
generate dna1 first 5ter last 3ter setup warn drude dmass 0.4 HYPE HORD 4 KHYP 40000 RHYP 0.2
patch deo3 dna1 20 setup warn !special patch for 3-terminal deoxy residue
set i 1
label loop1
patch deox dna1 @i
incr i by 1
if i le 19 goto loop1
read sequence card
20
CYT CYT CYT CYT CYT CYT CYT CYT CYT CYT
CYT CYT CYT CYT CYT CYT CYT CYT CYT CYT
generate dna2 first 5ter last 3ter setup warn drude dmass 0.4 HYPE HORD 4 KHYP 40000 RHYP 0.2
patch deo3 dna2 20 setup warn !special patch for 3-terminal deoxy residue
set i 1
label loop2
patch deox dna2 @i
incr i by 1
if i le 19 goto loop2
autogenerate angles dihedrals !Use of AUTOGENERATE is essential
read sequ SOD 91
generate SOD drude dmass 0.4 HYPE HORD 4 KHYP 40000 RHYP 0.2
read sequ CLA 53
generate CLA drude dmass 0.4 HYPE HORD 4 KHYP 40000 RHYP 0.2
read sequ swm4 19705
generate WAT noang nodih drude dmass 0.4 HYPE HORD 4 KHYP 40000 RHYP 0.2
crystal define cubic 84.6013299894322 84.6013299894322 84.6013299894322 90.0 90.0 90.0
read coor dynr curr form name restarti.dat
crystal build cutoff 17.0
image byres sele all end
shake bonh tol 1.0e-8 para NOFAST SELECT .NOT. TYPE D* END SELECT .NOT. TYPE D* END
ENERGY ATOM VATOM VSHIFT -
CTOFNB 13.0 ctonnb 11.0 CUTNB 15.0 CUTIM 17 -
EWALD SPLINE KAPPA 0.3 BYCBim cdie eps 1. QCOR 0 -
PMEWALD ORDER 6 FFTX 72 FFTY 72 FFTZ 72 -
INBFRQ -1 IMGFRQ -1 IHBFRQ 0 !grape 11
DrudeHardWall L_WALL 0.2
open write unit 234 card name restart.dat
open read unit 345 card name restarti.dat
time now
TPCONTROL NTHER 2 NHGAM 5.0 NHGAMD 10.0 -
THER 1 TREF 300.00 LANG SELECT .NOT. TYPE D* END -
THER 2 TREF 1.00 LANG SELECT TYPE D* END BARO PREF 1.00 BTAU 0.2
dyna vv2 restart nstep 10000 timestep 0.001 -
iprfrq 0 ihtfrq 0 ieqfrq 0 iunwri 234 isvfrq 50000 -
iuncrd -1 nsavc 0 iunrea 345 iunvel -1 kunit -1 -
nprint 1000 nsavv 0 ihbfrq 0 IUNXYZ -1 NSAVX 0 ntrfrq 1000
time diff
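In case it helps with comparing numbers: the script above runs nstep 10000 at timestep 0.001 ps, so the elapsed wall time reported by "time diff" can be converted to ns/day and to a parallel efficiency between two core counts. Below is a minimal Python sketch of that arithmetic. The function names are my own, and the elapsed times in the usage part are hypothetical placeholders, not measured values; the real ones have to be read from the log.

```python
SECONDS_PER_DAY = 86400.0


def ns_per_day(nstep, timestep_ps, elapsed_s):
    """Simulated nanoseconds per day of wall time.

    nstep and timestep_ps correspond to the `nstep` and `timestep`
    keywords of the dyna command; elapsed_s is the wall time in seconds.
    """
    simulated_ns = nstep * timestep_ps / 1000.0  # ps -> ns
    return simulated_ns * SECONDS_PER_DAY / elapsed_s


def parallel_efficiency(rate_ref, cores_ref, rate, cores):
    """Speedup relative to the reference run, divided by the ideal speedup."""
    speedup = rate / rate_ref
    ideal = cores / cores_ref
    return speedup / ideal


if __name__ == "__main__":
    # Hypothetical elapsed times (seconds), for illustration only.
    rate_48 = ns_per_day(10000, 0.001, 1300.0)
    rate_96 = ns_per_day(10000, 0.001, 1250.0)
    print(f"48 cores: {rate_48:.2f} ns/day")
    print(f"96 cores: {rate_96:.2f} ns/day")
    print(f"efficiency 48 -> 96: {parallel_efficiency(rate_48, 48, rate_96, 96):.2f}")
```

An efficiency close to 0.5 when doubling the core count would mean the second CPU adds essentially nothing, which is roughly what I am seeing.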