C39b1 and dual-gpu, high temps and energies
#34977 05/11/15 03:45 AM
Joined: Nov 2006
Posts: 15
Hi,
We have been running C37b2 with OpenMM quite happily for a while and decided to go dual-GPU, so we just got our copy of C39b1.

Alas, running dual-GPU (OpenMM) results in extremely high energies and temperatures. Running on one GPU seems okay at first, but after a few hours the simulation just crashes with no error messages in the output file or on standard error.

So to summarize, with the same input files:
C39b1 OpenMM dual-GPU - very high temps/energies
C39b1 OpenMM single-GPU - runs for short jobs, crashes after a few hours on longer ones
C37b2 OpenMM single-GPU - runs fine
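
To pin down when the temperature first blows up in a run like the dual-GPU one, a rough Python sketch such as the following could scan the dynamics output. It assumes the log prints `DYNA>` lines with step, time, total energy, kinetic energy, potential energy and temperature in that order, which this thread does not confirm. The function name `find_hot_steps` and the 1000 K threshold are made up for illustration.

```python
import re

# Assumed layout of a dynamics energy line (an assumption, not taken from this thread):
# DYNA>  step  time  TOTEner  TOTKe  ENERgy  TEMPerature
DYNA_LINE = re.compile(r"^\s*DYNA>\s+(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)")


def find_hot_steps(lines, max_temp=1000.0):
    """Return (step, temperature) pairs whose temperature exceeds max_temp.

    Fields that cannot be parsed as numbers (for example overflow asterisks)
    are reported with a temperature of infinity; NaN values are also flagged.
    """
    flagged = []
    for line in lines:
        match = DYNA_LINE.match(line)
        if not match:
            continue  # header lines and other output are skipped
        step = int(match.group(1))
        try:
            temp = float(match.group(6))
        except ValueError:
            temp = float("inf")
        # "not temp <= max_temp" is also true for NaN
        if not temp <= max_temp:
            flagged.append((step, temp))
    return flagged
```

With a hypothetical output file name, `find_hot_steps(open("dyn.out"))` would list the steps worth comparing between the single-GPU and dual-GPU runs.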

Does anyone have any patches? Since it was me who persuaded my boss to buy the dual-GPU boxes for CHARMM, I'm really hoping to get this resolved.
Thanks
Jon

Re: C39b1 and dual-gpu, high temps and energies
Jon_Wright #34980 05/11/15 08:32 PM
Joined: Sep 2003
Posts: 8,532
Likes: 2
rmv
The CHARMM/OpenMM interface has been undergoing continuous development, as has OpenMM itself.

I've found that CHARMM/OpenMM has dependencies on the CHARMM version and the OpenMM version, and that optimal pairings are (alas) not listed in the CHARMM documentation. OpenMM also has CUDA dependencies, but these are clearly listed on the OpenMM site.

With c39b1, I ended up building with OpenMM 5.2.0 and CUDA 5.5, and was able to run the test cases, but I did not pursue it much further (e.g. dual GPUs). I have instead been focusing on DOMDEC usage, which I regard as superior.
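
Since working pairings are not listed in the documentation, one option is to keep a small local record of combinations that have been reported to work. The sketch below is only an illustration: it holds the single pairing mentioned above, and the names `REPORTED_PAIRINGS` and `reported_pairing` are hypothetical.

```python
# Pairings reported in this thread only; this is not an official compatibility list.
REPORTED_PAIRINGS = {
    "c39b1": {
        "openmm": "5.2.0",
        "cuda": "5.5",
        "note": "test cases ran; dual-GPU use not pursued",
    },
}


def reported_pairing(charmm_version):
    """Return the reported OpenMM/CUDA pairing for a CHARMM version, or None."""
    return REPORTED_PAIRINGS.get(charmm_version.lower())
```

A lookup such as `reported_pairing("C39b1")` returns the entry above, and any version without a reported pairing gives `None`.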

I had experienced and reported (see below) similar problems with dual-GPU usage about a year ago for c39a2, the development version which became c39b1.

Bug tracking is gradually moving to www.charmm.org/redmine


Rick Venable
computational chemist

Re: C39b1 and dual-gpu, high temps and energies
Jon_Wright #34983 05/12/15 08:16 AM
Joined: Nov 2006
Posts: 15
Rick,
I checked Redmine and saw there were some patches for the issues I'm having, but I can't figure out how to get at them. Is there any patch system for CHARMM for fixing released versions that have bugs?

Re: C39b1 and dual-gpu, high temps and energies
Jon_Wright #34988 05/12/15 04:15 PM
Joined: Sep 2003
Posts: 8,532
Likes: 2
rmv
Most of those patches go into alpha versions for testing before they make it into a beta release version. Unfortunately, there is no well-organized system for handling patches for beta versions. This has been recognized as a shortcoming, and the process of testing and patching beta releases has recently begun to receive more attention from some of the developers.

With OpenMM, there have been so many changes that it may be difficult to separate the ones that would fix c39b1 without introducing dependencies on other code which may not be ready for beta release.


Last edited by rmv; 05/14/15 11:19 PM. Reason: corrected version

Rick Venable
computational chemist

Re: C39b1 and dual-gpu, high temps and energies
Jon_Wright #34996 05/14/15 11:21 PM
Joined: Sep 2003
Posts: 8,532
Likes: 2
rmv
As far as I can tell, the OpenMM problems may not have been fixed in c39b2; also, the c40a2 development release had a severe memory leak.

There is some hope that c40b1 (due out in August 2015) may fix a number of GPU-related problems that have afflicted both c39b1 and c39b2.


Rick Venable
computational chemist


