#1159 04/01/04 01:24 AM
Joined: Oct 2003
Posts: 39
Forum Member
OP Offline
Hi, charmmers,

When I use mpiexec to run parallel CHARMM (29b2) on a Linux cluster, I get the following error output, whereas mpirun works without problems.

The command looks like this:
mpiexec -n # {charmm}
1
Copyright(c) 1984-2001 President and Fellows of Harvard College
All Rights Reserved
Current operating system: Linux-2.4.20-24.9smp(i686)
Created on 3/31/ 4 at 20:15:50

Maximum number of ATOMS: 60120, and RESidues: 72000
Current HEAP size: 10240000, and STACK size: 2000000

RDTITL> No title read.

***** LEVEL 1 WARNING FROM <RDTITL> *****
***** Title expected.
******************************************
BOMLEV ( 0) IS NOT REACHED. WRNLEV IS 5


Parallel load balance (sec.):
Node Eext Eint Wait Comm List Integ Total
0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

$$$$$$ New timer profile Local node$$$$$

Total time 0.01695 Other: 0.00000

Last edited by zhangxd68; 04/01/04 01:28 AM.
Joined: Nov 2003
Posts: 200
Forum Member
Offline
I never heard of mpiexec. Is it a site-dependent thing?
We normally use mpirun for mpich, LAM, SCALI, myrinet.

The error you are seeing is caused by an empty input to charmm.
You normally need to redirect standard input to charmm from a file.

Check with your administrator about how to run mpi jobs with standard input.

Mike


Physical mail: Dr. Michael F. Crowley National Renewable Energy Laboratory, MS 3323 1617 Cole Blvd. Golden, CO 80401
Joined: Oct 2003
Posts: 39
Forum Member
OP Offline
hi, Mike

Thanks for your reply.

Could you tell me how to redirect standard input? Thanks again.

Xiaodong

Joined: Feb 2004
Posts: 147
Forum Member
Offline
@crowley: mpiexec is used to start MPICH jobs under *PBS (OpenPBS, PBS Pro, Torque).

@zhangxd68: try this way:

mpiexec charmm \< input \>\& output

which escapes all redirection characters. The "charmm" command itself will be
executed in a subshell and this subshell will receive the redirection characters
unescaped.

Normally, mpiexec is distributed with a patch to be applied to the *PBS source
which allows redirection of stdio; however, if the site admin is not willing or able
to recompile *PBS (for example, when only PBS Pro binaries are available), the
mpiexec builtin stdio redirection doesn't work.
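
The escaping mechanism described above can be tried without a cluster. This is only a sketch: `launch` is a hypothetical stand-in that hands its arguments to a subshell (the behaviour described for mpiexec here), and `cat` stands in for charmm. The file names are made up for the illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for a launcher that passes its arguments to a subshell.
# This is NOT the real mpiexec; it only mimics the behaviour described above.
launch() {
    sh -c "$*"
}

workdir=$(mktemp -d)
printf '* TEST\n*\nstop\n' > "$workdir/input"

# Escaped "<" and ">" are not interpreted by the calling shell. They reach the
# subshell as plain text and are applied there, to the launched program
# (cat plays the role of charmm in this sketch).
launch cat \< "$workdir/input" \> "$workdir/output"

cat "$workdir/output"
```

Without the backslashes, the calling shell would attach the redirections to the launcher itself, which is the situation that leaves charmm with an empty standard input. Note that the original command uses the csh form `\>\&`; the sketch uses plain `\>` so that it runs under any POSIX sh.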

bogdan #1163 04/09/04 04:07 PM
Joined: Oct 2003
Posts: 39
Forum Member
OP Offline
hi, bogdan

Thanks a lot.

Joined: Oct 2003
Posts: 39
Forum Member
OP Offline
hi, bogdan

Thanks a lot.

With mpiexec (mpiexec -n 2 charmm), the run seems to be non-parallel. The output is:
RDTITL> * TEST
RDTITL> *

CHARMM>

CHARMM> set n ?numnode
RDCMND substituted energy or value "?NUMNODE" to "1"
Parameter: N <- "1"

CHARMM> stop

With mpirun (mpirun -np 2 -machinefile 2cpu charmm), it is:
Processing passed argument "-p4pg"
Processing passed argument "-p4wd"
RDTITL> * TEST
RDTITL> *

CHARMM>

CHARMM> set n ?numnode
RDCMND substituted energy or value "?NUMNODE" to "2"
Parameter: N <- "2"

CHARMM> stop
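
To compare the two launchers quickly, the process count that CHARMM reports can be pulled out of the output file. The following is a minimal sketch: the log lines are copied from the mpirun output above, and the temporary file name and the `numnode` variable are only illustrative.

```shell
#!/bin/sh
# Write a sample log (lines copied from the output above) to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
 CHARMM> set n ?numnode
 RDCMND substituted energy or value "?NUMNODE" to "2"
 Parameter: N <- "2"
EOF

# Extract the value that CHARMM substituted for ?NUMNODE.
numnode=$(sed -n 's/.*"?NUMNODE" to "\([0-9][0-9]*\)".*/\1/p' "$log")
echo "CHARMM reports $numnode process(es)"
```

A value of 1 under mpiexec and 2 under mpirun, as in the two outputs above, means the mpiexec run started CHARMM without the parallel environment it expects.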


bogdan #1165 04/12/04 05:29 PM
Joined: Nov 2003
Posts: 200
Forum Member
Offline
Thanks for this mpiexec info.
We use PBS on many platforms, and on all of them we have success with
mpirun: SGI Altix, many flavors of PC clusters with many flavors of MPI,
and prun on the Alpha cluster at PSC.

Am I right in understanding that mpiexec is something that comes with PBS?
If so, then is it linked to a specific MPI? Is it something the PBS administrator does? That would mean that the executable must be compiled and linked with whatever MPI the administrator chose for mpiexec, is that right?

Thanks again for the information.
Mike


Joined: Feb 2004
Posts: 147
Forum Member
Offline
I don't know what is different in our setups, but here I do run it like I mentioned.
And the output is:

CHARMM> set n ?numnode
RDCMND substituted energy or value "?NUMNODE" to "4"
Parameter: N <- "4"

when I start it on 4 CPUs (in PBS syntax I asked for nodes=2:ppn=2).

One thing that I noticed just now is that you pass the "-n" parameter to mpiexec. Why?
We usually don't use it as mpiexec by default starts as many CHARMM
processes as CPUs allocated by *PBS to the job.
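
To check how many CPUs *PBS actually allocated, one can count the lines of the node file, which OpenPBS/Torque expose through the PBS_NODEFILE environment variable (one line per allocated CPU). The sketch below simulates that file for a nodes=2:ppn=2 request so it runs outside a job; the host names node1 and node2 are made up.

```shell
#!/bin/sh
# Simulated node file for nodes=2:ppn=2. Inside a real job, *PBS sets
# PBS_NODEFILE itself; the host names here are hypothetical.
PBS_NODEFILE=$(mktemp)
printf 'node1\nnode1\nnode2\nnode2\n' > "$PBS_NODEFILE"

# One line per allocated CPU, so the line count is the number of processes
# a launcher that follows the allocation would be expected to start.
ncpus=$(wc -l < "$PBS_NODEFILE" | tr -d ' ')
echo "CPUs allocated to the job: $ncpus"
```

If this number matches what CHARMM substitutes for ?numnode, the launcher is following the allocation.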

Joined: Feb 2004
Posts: 147
Forum Member
Offline
mpiexec is an independently developed utility for *PBS. You can find it at:

http://www.osc.edu/~pw/mpiexec/

The main reason for using it (at least in our context) is that it allows reliable killing
of all processes of a job with qdel, and reliable cleanup when one CHARMM
process (usually node 0) dies. The remote processes are not started through
rsh/ssh, but through a PBS-internal mechanism that can keep track
of all of them. If rsh/ssh were used, it would be up to the MPI library/environment
to do all this cleanup and signal propagation, and in many cases that fails.

Since version 7.0, the LAM-MPI library has also shipped a similar module with
its standard distribution. It is called the "tm" boot module and can be
selected with (although when it is present it is usually tried first by default):

lamboot -ssi boot tm

(It has to be enabled at the "./configure" step during installation.)

Joined: Feb 2004
Posts: 147
Forum Member
Offline
Sorry, I forgot to answer the other questions...

In reply to:

If so, then is it linked to a specific MPI?

It is specific to MPICH, but it isn't linked against any MPI library; it just knows
how to start an MPICH job. It asks *PBS to start a number of processes on the
nodes allocated to the job, and it builds the same command line for them as
MPICH's mpirun Perl script would.

In reply to:

Is it something the PBS administrator does?

Normally, yes, but that's not necessary if the PBS libraries are available to the users.
mpiexec needs to be linked to the PBS libraries, so it may need to
be recompiled/relinked every time the PBS distribution is updated. The feature
of *PBS that it uses (the TM API) is considered stable, so it should not stop
working after minor version updates.
