OP | Forum Member | Joined: Aug 2009 | Posts: 139
Hi all,
We purchased CHARMM c41b2 and had it installed on our GPU nodes. What command line should I use to run a job in parallel on all the cores of one GPU node? Currently I run:
charmm -i step.inp > step.out
Thank you.
lqz

Forum Member | Joined: Sep 2003 | Posts: 4,883 | Likes: 12
If you use CHARMM/OpenMM there is no point in using more than one CPU core, since all dynamics code is executed on the GPU; see openmm.doc.
If, for some reason, you do want to use multiple cores in parallel, it is done as for any parallel CHARMM run; see the sketch below.
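As a rough sketch, assuming an MPI-enabled build (the launcher, core count, and file names are placeholders for whatever your cluster uses):

mpirun -np 8 charmm -i step.inp > step.out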
Lennart Nilsson, Karolinska Institutet, Stockholm, Sweden

Forum Member | Joined: Sep 2003 | Posts: 8,658 | Likes: 26
Note that domdec_gpu has very different requirements from OpenMM, both at compile time and at run time, especially for specifying core usage.
Rick Venable, computational chemist

OP | Forum Member | Joined: Aug 2009 | Posts: 139
In fact, the software was installed by our ITS staff. When I request half of a GPU node (14 cores) and run the job with
mpirun -n 14 charmm -i step.inp > step.out
the output file contains the startup banner repeated 14 times, for example:
1 Chemistry at HARvard Macromolecular Mechanics
  (CHARMM) - Developmental Version 41b2   February 15, 2017
  Copyright(c) 1984-2014 President and Fellows of Harvard College
  All Rights Reserved
  Current operating system: Linux-3.10.0-693.11.6.el7.x86_64(x86_64)@nod
  Created on 8/24/18 at 22:33:26 by user: lzhang
Maximum number of ATOMS: 360720, and RESidues: 120240
1 Chemistry at HARvard Macromolecular Mechanics
  (CHARMM) - Developmental Version 41b2   February 15, 2017
  ...
with the banner lines repeated, and sometimes interleaved, once per MPI rank.
On the other hand, if I run it without mpirun, it is not fast at all. I am wondering what is wrong and what I should do. Thank you.

Forum Member | Joined: Sep 2003 | Posts: 4,883 | Likes: 12
You have to ask your ITS staff to install CHARMM with MPI support in order to run it in parallel:
install.com gnu M
If you want to use the GPU, they should also include OpenMM (see the instructions in openmm.doc):
install.com gnu M openmm
If you want to use DOMDEC, the installation procedure is outlined in domdec.doc; DOMDEC GPU is an alternative to OpenMM for using GPUs.
Lennart Nilsson, Karolinska Institutet, Stockholm, Sweden

Forum Member | Joined: Sep 2003 | Posts: 8,658 | Likes: 26
Note that any MPI library used (OpenMPI, MVAPICH2, etc.) must be built with Fortran90 support enabled to work with CHARMM; this is not a default option for most MPI installations.
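For OpenMPI, for example, a build with Fortran support enabled might look roughly like the following (a sketch only; the prefix path is a placeholder, the flag applies to OpenMPI versions that support --enable-mpi-fortran, and other MPI libraries have their own equivalent options):

./configure FC=gfortran --enable-mpi-fortran=all --prefix=/path/to/openmpi
make && make install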
Include the PREF command in a test run; if the CHARMM executable was built with parallel support, the keyword listing it produces will include MPI and PARALLEL.
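A minimal sketch of such a test input (nothing beyond the PREF command itself is needed; the title text is arbitrary):

* check compile-time keywords
*
pref
stop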
Rick Venable, computational chemist

Forum Member | Joined: Aug 2018 | Posts: 2
Hey, folks. Previously-mentioned ITS staff here (background in mechanical engineering, not molecular dynamics). I made two GPU builds of c41b2 for my user about a year ago, one with OpenMM, another with DOMDEC. The OpenMM build shows the following linked libraries (from ldd output):
gpu/c41b2/bin/charmm:
linux-vdso.so.1 => (0x00002aaaaaaab000)
libOpenMMCharmm.so => /cm/shared/apps/charmm/gpu/c41b2/lib/openmm_plugins/libOpenMMCharmm.so (0x00002aaaaaaaf000)
libOpenMMGBSW.so => /cm/shared/apps/charmm/gpu/c41b2/lib/openmm_plugins/libOpenMMGBSW.so (0x00002aaaaacbc000)
libOpenMM.so => /cm/shared/apps/openmm/7.1.1/lib/libOpenMM.so (0x00002aaaaaecd000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002aaaab41b000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00002aaaab620000)
libgfortran.so.3 => /lib64/libgfortran.so.3 (0x00002aaaab927000)
libm.so.6 => /lib64/libm.so.6 (0x00002aaaabc49000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002aaaabf4c000)
libquadmath.so.0 => /lib64/libquadmath.so.0 (0x00002aaaac162000)
libc.so.6 => /lib64/libc.so.6 (0x00002aaaac39e000)
librt.so.1 => /lib64/librt.so.1 (0x00002aaaac76c000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002aaaac974000)
/lib64/ld-linux-x86-64.so.2 (0x0000555555554000)
The DOMDEC build shows:
domdec/c41b2/bin/charmm:
linux-vdso.so.1 => (0x00002aaaaaaab000)
libcudart.so.8.0 => /cm/shared/apps/cuda80/toolkit/8.0.61/lib64/libcudart.so.8.0 (0x00002aaaaaaaf000)
libnvToolsExt.so.1 => /cm/shared/apps/cuda80/toolkit/8.0.61/lib64/libnvToolsExt.so.1 (0x00002aaaaad15000)
libcufft.so.8.0 => /cm/shared/apps/cuda80/toolkit/8.0.61/lib64/libcufft.so.8.0 (0x00002aaaaaf1e000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00002aaab3d87000)
libmpi_usempi.so.5 => /cm/shared/apps/openmpi/gcc/64/1.10.3/lib64/libmpi_usempi.so.5 (0x00002aaab408f000)
libmpi_mpifh.so.12 => /cm/shared/apps/openmpi/gcc/64/1.10.3/lib64/libmpi_mpifh.so.12 (0x00002aaab4292000)
libmpi.so.12 => /cm/shared/apps/openmpi/gcc/64/1.10.3/lib64/libmpi.so.12 (0x00002aaab44e5000)
libgfortran.so.3 => /lib64/libgfortran.so.3 (0x00002aaab47b8000)
libm.so.6 => /lib64/libm.so.6 (0x00002aaab4ada000)
libgomp.so.1 => /lib64/libgomp.so.1 (0x00002aaab4ddc000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002aaab5003000)
libquadmath.so.0 => /lib64/libquadmath.so.0 (0x00002aaab5219000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002aaab5455000)
libc.so.6 => /lib64/libc.so.6 (0x00002aaab5672000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002aaab5a3f000)
librt.so.1 => /lib64/librt.so.1 (0x00002aaab5c43000)
/lib64/ld-linux-x86-64.so.2 (0x0000555555554000)
libopen-rte.so.12 => /cm/shared/apps/openmpi/gcc/64/1.10.3/lib64/libopen-rte.so.12 (0x00002aaab5e4c000)
libopen-pal.so.13 => /cm/shared/apps/openmpi/gcc/64/1.10.3/lib64/libopen-pal.so.13 (0x00002aaab60c5000)
libnuma.so.1 => /lib64/libnuma.so.1 (0x00002aaab639b000)
libutil.so.1 => /lib64/libutil.so.1 (0x00002aaab65a8000)
I assume the references to the various OpenMM shared libraries indicate that the build has the correct support; I am not sure whether it should also reference CUDA libraries directly, as the DOMDEC build does. If someone could check whether the ldd output of their known-working OpenMM or DOMDEC builds is comparable to mine, I'd appreciate it. Am I also correct in assuming that, once we have a correctly compiled CHARMM build for OpenMM or DOMDEC, the input files will need to be modified to include the relevant OMM or DOMD commands, unlike NAMD, where the same input file is used for GPU and non-GPU runs?
Mike Renfro, HPC Systems Administrator, Information Technology Services, Tennessee Tech University

Forum Member | Joined: Sep 2003 | Posts: 4,883 | Likes: 12
If the build was made with the appropriate flags (e.g., openmm) and a charmm executable was created, the build was probably OK. There are test cases for both OpenMM and DOMDEC in ~charmm/test/c*test/domdec*.inp (or omm*.inp).
Both OpenMM and DOMDEC require slight modifications of the CHARMM input file; see the sketch below.
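As a rough sketch, using the OMM and DOMDEC keywords described in openmm.doc and domdec.doc (all other options shown are placeholders for whatever the existing CPU input already uses):

! OpenMM: add the OMM keyword to the energy or dynamics command
energy omm
dyna leap start nstep 10000 timestep 0.001 omm

! DOMDEC: select the DOMDEC engine via the energy command instead
energy domdec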
Lennart Nilsson, Karolinska Institutet, Stockholm, Sweden

Forum Member | Joined: Sep 2003 | Posts: 8,658 | Likes: 26
OpenMM, DOMDEC, and DOMDEC_GPU each require different input files and different command lines. The documentation files openmm.doc and domdec.doc provide some information (but probably need to be updated). Some of the run-time issues vary depending on the compiler, the MPI library, and the queuing system on the cluster.
OpenMM does not require MPI; it is best for using one or more GPUs on a single host.
DOMDEC uses multiple compute cores on multiple hosts over a high-speed transport such as InfiniBand, with each MPI subtask assigned to a core.
DOMDEC_GPU is a hybrid method that supports multiple GPUs on multiple hosts (again over e.g. InfiniBand) and works best with one or two GPUs per host and at least 8 CPU cores per GPU. Each MPI task manages a GPU and multiple OpenMP threads, ideally one per core. Typical launch commands are sketched below.
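As a rough sketch (the launcher, process counts, thread counts, and input file names are placeholders; the right values depend on the hardware and queuing system):

# OpenMM: a single process, no mpirun needed
charmm -i omm_run.inp > omm_run.out

# DOMDEC: one MPI task per CPU core
mpirun -np 28 charmm -i domdec_run.inp > domdec_run.out

# DOMDEC_GPU: one MPI task per GPU, several OpenMP threads per task
export OMP_NUM_THREADS=8
mpirun -np 2 charmm -i domdec_gpu_run.inp > domdec_gpu_run.out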
Update: ldd for my c41b2 OpenMM build does not reference CUDA either.
Last edited by rmv; 08/27/18 07:47 PM.
Rick Venable, computational chemist

Forum Member | Joined: Aug 2018 | Posts: 2
Thanks. Here is partial output from running omm_device.inp with the OpenMM build:
get_PlatformDefaults> Finding default platform values
get_PlatformDefaults> Default values found: platform=CUDA precision=single deviceid=0
Setup_OpenMM: Initializing OpenMM context
CHARMM> Using OpenMM functionality for electrostatic and vdW interactions exclusively
CHARMM> Using OpenMM nb routines.
Informational: OpenMM Using No Cutoff (cut>=990).
CHARMM> OpenMM switching function selection.
CHARMM> Electrostatic options: switch=F fswitch=F fshift=F PME/Ewald=F RxnFld=F
CHARMM> van der Waals options: vswitch=F vfswitch=F OpenMM vdW switch=F
Init_context: Using OpenMM platform CUDA
CudaDeviceIndex = 0
CudaCompiler = /usr/local/cuda/bin/nvcc
CudaPrecision = single
and I did see my charmm process in nvidia-smi. For DOMDEC using domdec_drude.inp, the output included:
Number of CUDA devices found 1
Using CUDA driver version 9000
Using CUDA runtime version 8000
Node 0 uses CUDA device 0 Tesla K80 with CUDA_ARCH 350
and
Intel CPU | Using CUDA version of non-bonded force loops and SSE elsewhere
Initializing DOMDEC with NDIR = 1 1 1
Number of threads per MPI node = 1
Dynamic Load Balancing disabled
and I also saw the charmm process (very briefly) in nvidia-smi. I was running this on 1 CPU core and 1 GPU device. I assume this indicates that the basic CHARMM install is OK and should run other input files that have been modified to use the OpenMM/DOMDEC options. I also assume that there is still no reason to reserve more than 1 CPU per GPU for an OpenMM job. Are there any other criteria for deciding whether to use OMM or DOMDEC? If OpenMM meets all typical requirements, I'm perfectly happy to remove the DOMDEC version entirely. (Edited now that I see rmv's additions.)
Last edited by Mike Renfro; 08/27/18 07:44 PM.
Mike Renfro, HPC Systems Administrator, Information Technology Services, Tennessee Tech University