#35051 - 06/03/15 09:18 PM MD w. DOMDEC; revised .inp & .csh scripts
rmv

This is a revised version of an earlier post; it contains DOMDEC input examples and other changes that better reflect current usage.

The code blocks which follow represent a system for running MD in discrete chunks, producing a series of sequentially numbered files. The csh script, dyn.csh, has been used extensively on Unix systems and Linux clusters for years, and some of the concepts date back to VAX DCL scripts for running CHARMM. This particular example is for a Linux cluster with a PBS queueing system, and uses a cover script for running CHARMM which handles the runtime setup, especially for MPI-based executables. The output files all have generic names initially, and they are numbered only if dyn.csh determines that the run was successful, based on a few simple tests.
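
The cover script itself is site specific and not shown here; as a rough sketch of the idea (the argument order, executable path, and mpirun invocation below are assumptions for illustration, not the actual site script), it might look something like--

Code:
#!/bin/csh
# hypothetical cover script; usage: c39b2 <build> <mpi> <ncores> <input> [CHARMM args...]
set build = $argv[1]    # e.g. ddw for the domdec build
set mpi   = $argv[2]    # e.g. impi for Intel MPI
set ncore = $argv[3]    # total number of MPI ranks
set inp   = $argv[4]    # CHARMM input file
shift; shift; shift; shift
# set up the MPI runtime for $mpi here, then launch CHARMM;
# any remaining args (e.g. D:name) are passed to CHARMM on the command line
mpirun -np $ncore /v/apps/charmm/exec/$build/charmm $argv < $inp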

The numbering is controlled by a file named next.seqno, which is first created and then updated by dyn.csh for successful runs. In the absence of this file, dyn.csh assumes a new simulation is being started: it runs the MD startup script, uses sequence number 1 for numbering the output, and puts the number 2 into the next.seqno file. When next.seqno is present, the MD continuation script is run each time, extending the dynamics. The optional last.seqno file can be used to set a controlled stopping point.
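
For example, to stop the chain cleanly after run 50 has been numbered and moved, create last.seqno in the simulation subdir (the value 50 is just an illustration)--

Code:
# request a controlled stop after run 50 completes
echo 50 > last.seqno
# check which run number the next successful run will get
cat next.seqno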

Another key step for a successful run is that dyn.res is copied to dyn.rea, the restart file to be read by the next run.

The dyn.csh script is set up to run 2 repetitions, and then attempts to resubmit itself by using ssh to connect to the cluster head node and run a small csh script which invokes 'qsub'. That script, lobos.csh in this case, contains simply--

#!/bin/csh
qsub -N $cwd:t dyn.csh


and is used to start the chain. In the event that ssh (or rsh) to the head node doesn't work at your site, you can increase the number of repetitions in dyn.csh and resubmit manually. The '$cwd:t' construct uses the current directory name as the PBS job name.
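
To start the chain for a new simulation, log in to the head node, cd to the simulation subdir, and run the small script; the head node name m1a is from this example, and the directory path below is hypothetical--

Code:
ssh m1a
cd /path/to/simdir
./lobos.csh     # i.e. qsub -N $cwd:t dyn.csh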

For a failed run, dyn.out is renamed to dyn.err.D.H, where D and H are the date and time in numeric form (MMDD.HHMM). An email is also sent (assuming one can ssh to a mail host), and no new job is submitted.
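
The timestamped error files accumulate in the simulation subdir and can be inspected directly; e.g. (the timestamp shown is illustrative)--

Code:
ls dyn.err.*
# e.g.  dyn.err.0604.1215   (MMDD.HHMM)
tail -64 dyn.err.0604.1215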

The scripting system assumes a separate subdir for each simulation.

The code blocks which follow represent 3 principal files, and 3 modular project files--
  • dyn.csh; the csh script which runs CHARMM and evaluates success
  • dynstrt.inp; MD startup, run if next.seqno does not exist
  • dyn.inp; continue MD; this file is run repeatedly
  • rtfprm.str; reads the topology and parameters for the project
  • psfcrd.str; reads the PSF and COOR files for the simulation starting point; also required for CRYSTAL BUILD
  • cryst.str; establish initial unit cell size, and define neighboring cells

The latter 3 are the most project-specific; their use simplifies subsequent analysis, and makes the other scripts more portable for re-use with related or even different projects.
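
Putting it all together, a per-simulation subdir might look like this after two successful runs (the names follow the conventions above; the PSF and CRD file names are those of the example project shown later)--

Code:
dyn.csh  dynstrt.inp  dyn.inp  lobos.csh
rtfprm.str  psfcrd.str  cryst.str
pei58wi.psf  pei58wi.crd
dyn.rea  next.seqno
Crd/dyn1.crd.gz  Out/dyn1.out.gz  Res/dyn1.res.gz  Trj/dyn1.dcd
Crd/dyn2.crd.gz  Out/dyn2.out.gz  Res/dyn2.res.gz  Trj/dyn2.dcd


At this point next.seqno contains 3.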

First is the dyn.csh script called by lobos.csh; this does most of the work, including re-submitting itself upon completion.

Code:
#!/bin/csh
#PBS -j oe -m a -r n
#PBS -l nodes=12:ppn=12:core12:iband,walltime=12:00:00

# disabled directive for GPU nodes
#### PBS -l nodes=6:ppn=12:gpu1:fdrib,walltime=4:00:00

# ASSUMPTION (1): output files are named dyn.res dyn.dcd dyn.out
# ASSUMPTION (2): previous restart file read as  dyn.rea
#

cd $PBS_O_WORKDIR

if ( ! -d Res ) mkdir Res
if ( ! -d Out ) mkdir Out
if ( ! -d Crd ) mkdir Crd
if ( ! -d Trj ) mkdir Trj

# special executable for domdec_gpu (disabled)
#set chm = "/v/apps/bin/c40a2 ddg impi 6"

# standard executable for domdec; request 144 cores, 12 nodes ea. w. 12 cores
set chm = "/v/apps/bin/c39b2 ddw impi 144"

@ nrun = 2
set d = $cwd:t
# repetitions
@ krun = 1
while ( $krun <= $nrun )

# determine start or restart based on next.seqno file
if ( -e next.seqno ) then
 $chm dyn.inp D:$d > dyn.out
else
 $chm dynstrt.inp D:$d > dyn.out
endif

set okay = true
# TEST FOR EXISTENCE, THEN NONZERO LENGTH OF OUTPUT FILES
if ( -e dyn.res && -e dyn.dcd ) then
 @ res = `wc dyn.res | awk '{print $1}'`
 @ tsz = `ls -s dyn.dcd | awk '{print $1}'`
 @ nrm = `grep ' NORMAL TERMINATION ' dyn.out | wc -l`
 if ( $res > 100 && $tsz > 0 && $nrm == 1 ) then
# SUCCESSFUL RUN; COPY RESTART FILE
  cp dyn.res dyn.rea
# DETERMINE RUN NUMBER
  if ( -e next.seqno ) then
   @ i = `cat next.seqno` 
  else
   @ i = 1
  endif
# NUMBER AND MOVE THE OUTPUT FILES, COMPRESS TEXT FILES
  mv dyn.out Out/dyn$i.out
  mv dyn.crd Crd/dyn$i.crd
  mv dyn.dcd Trj/dyn$i.dcd
  mv dyn.res Res/dyn$i.res
  gzip -f Out/dyn$i.out Res/dyn$i.res Crd/dyn$i.crd
# CONDITIONAL END CHECK BASED ON OPTIONAL last.seqno 
  if ( -e last.seqno ) then
    @ l = `cat last.seqno`
    if ( $i == $l ) then
      @ i += 1
      echo $i > next.seqno 
      exit
    endif
  endif
# INCREMENT next.seqno
  @ i += 1
  echo $i > next.seqno
 else
# ZERO LENGTH FILE(S)
  set okay = false
 endif
else
# FILE DOESN'T EXIST
 set okay = false
endif

# TEST FOR CHARMM RUN FAILED; CREATE .ERR FILE WITH TIMESTAMP
if ( $okay == true ) then
# SUBMIT THE NEXT JOB; SSH LOGIN TO CLUSTER HEAD NODE
 if ( $krun == $nrun ) ssh m1a "cd $cwd; ./lobos.csh"
 @ krun += 1
else
 set ts = `date +%m%d.%H%M`
# BUILD EMAIL ERROR MESSAGE ABOUT FAILURE
 date > msg.tmp
 echo $cwd >> msg.tmp
 head -64 dyn.out >> msg.tmp
 tail -64 dyn.out >> msg.tmp
# RENAME dyn.out TO dyn.err
 mv dyn.out dyn.err.$ts
# SEND EMAIL FROM CLUSTER HEAD NODE
 ssh m1a "cd $cwd; mail -s '$PBS_JOBID $ts' user@somedomain.net  < msg.tmp"
 exit(201)
endif

end


The next script is the CHARMM input to start the dynamics, which heats the system from ca. 100 K below the target T value and applies damping to the pressure piston (PGAMMA 50.) to slow down any changes in the unit cell size(s) during heating. The remainder of the run does thermal equilibration via periodic velocity assignment (IASORS 1).

Code:
* @D startup equil
*

stream rtfprm.str
stream psfcrd.str
faster on
stream cryst.str

update cutim 14. imgfrq -1 cutnb 14. inbfrq -1 -
  bycb atom vatom vfswitch ctofnb 12. ctonnb 10. -
  ewald pme spline order 6 fftx @FX ffty @FX fftz @FX kappa 0.32 

! setup for 144 cores; 100 for subdomains (5x5x4), remaining 44 for PME
energy domdec ndir 5 5 4 

! alternate setup for GPU tests w. 6 GPU nodes
! energy domdec gpu on ndir 3 2 1

shake bonh param fast

open unit 11 write card name dyn.res   ! restart file
open unit 12 write file name dyn.dcd   ! coordinate trajectory (DCD)
prnlev 3 node 0
dyna start cpt timestep 0.001 nstep 1000000 nprint 1000 -
  pcons pint pmass 1000. pgamma 50. pref 1. -
  ihtfrq 5000 ieqfrq 5000 ntrfrq 500 iprfrq 5000 -
  firstt 198. finalt 298. teminc 5. -
  iuncrd 12 nsavc 1000 iunwri 11 iasors 1 echeck 500.

write coor card name dyn.crd
* @D ; a=b=c ?XTLA ; e= ?ENER
*

stop


The next script continues MD from a previous run, and is used repeatedly to extend the simulation to the desired length. Besides opening dyn.rea, the DYNA command is noticeably different from the startup run, with changes on every line, as well as an additional line for the thermostat (HOOVER).

Code:
* hoover thermostat; npt @D
*

stream rtfprm.str
stream psfcrd.str
faster on
stream cryst.str

update cutim 14. imgfrq -1 cutnb 14. inbfrq -1 -
  bycb atom vatom vfswitch ctofnb 12. ctonnb 10. -
  ewald pme spline order 6 fftx @FX ffty @FX fftz @FX kappa 0.32 

energy domdec ndir 5 5 4 ! cpu only
!energy domdec gpu on

shake bonh param fast

open unit 10 read  card name dyn.rea   ! restart file from the previous run
open unit 11 write card name dyn.res   ! new restart file
open unit 12 write file name dyn.dcd   ! coordinate trajectory (DCD)
prnlev 3 node 0
dyna restart cpt timestep 0.001 nstep 1000000 nprint 1000 -
  pcons pint pmass 1000. pgamma 0. pref 1. -
  hoover tmass 4000. reft 298.0 -
  ihtfrq 0 ieqfrq 0 ntrfrq 1000 iprfrq 5000 -
  firstt 298. finalt 298. tstruct 298. -
  iuncrd 12 nsavc 1000 iunwri 11 iunrea 10 iasors 1

write coor card name dyn.crd
* @D NPT; a=b=c ?XTLA ; e= ?ENER
*

stop


Finally, here are the 3 modular subfiles rtfprm.str, psfcrd.str, and cryst.str--

Code:
* RTF and PARAM file setup; NHLBI, U MD parameters for PEI polymer
* based on CGenFF via www.paramchem.org server
*

read rtf card name /v/apps/charmm/topparc36/top_all36_cgenff.rtf
read para flex card name /v/apps/charmm/topparc36/par_all36_cgenff.prm
read rtf card append name "/u/rvenable/RvProj/PChandran/toppar2/pei-nhlbi-umd.rtf"
read para flex card append name "/u/rvenable/RvProj/PChandran/toppar2/pei-nhlbi-umd.prm"
stream /v/apps/charmm/topparc36/toppar_water_ions.str

return

Code:
* psf and coor file
*

read psf card name pei58wi.psf
read coor card name pei58wi.crd

return

Code:
* setup periodic boundary
*

calc edge = 114.
crystal define cubic @EDGE @EDGE @EDGE 90. 90. 90.
crystal build noper 0
image byres sele .not. segid P end
image byseg sele segid P end
set fx 120               ! for the PME FFT grid (@FX)

return
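
As noted above, keeping the project-specific pieces in these stream files also shortens later analysis inputs; a minimal sketch (the unit number and the choice of run 1 are arbitrary)--

Code:
* example analysis input reusing the modular project files
*

stream rtfprm.str
stream psfcrd.str

! open the first numbered trajectory and report its contents
open unit 51 read file name Trj/dyn1.dcd
traj query unit 51

stop
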
_________________________
Rick Venable
computational chemist


#35058 - 06/04/15 12:16 PM Re: MD w. DOMDEC; revised .inp & .csh scripts [Re: rmv]
rmv

For anisotropic systems with planar interfaces, such as lipid bilayers, the principal difference is mostly in the cryst.str setup file. The lattice type must be changed to TETRAGONAL, which keeps the a=b constraint but allows c (a.k.a. z) to change independently in response to the normal pressure.

Code:
* setup simulation cell via crystal for POPC bilayer
*
 
set z  64.7356           ! initial c (z) cell dimension
set ab 148.1286          ! expanded size, 3*@XY

crystal define tetragonal @AB @AB @Z 90. 90. 90.
crystal build noper 0
! IMAGE CENTERING BY RESIDUE
image byres
! for FFT
set fx 144
set fz 64
 
return
_________________________
Rick Venable
computational chemist

