#21242 - 06/13/09 01:02 PM UNIVersal coord reader examples
rmv

The UNIVersal coordinate reader in CHARMM can be used to read data from any file which
  • uses a fixed field width format, i.e. column aligned
  • has one atom per line, with x,y,z coordinates
  • contains additional fields to identify the atom and molecule

These two examples illustrate the use of a custom format derived from a file type exported by the CCDB 'quest' programme, and an explicit setup for a PDB file with chain IDs.

The first example is for beta-glucose, with the import process set up for the CSFF sugar force field (FF). The .cor file from CCDB quest was hand-edited to produce:
Code:
* AUTH=S.S.C.Chu,G.A.Jeffrey // *CODE=107(Acta Crystallogr.,Sect.B)
* VOLU= 24 // *PAGE= 830 // *YEAR=1968 // *SPAC=P212121 // *RFAC=0
* TEMP=295 // *CELA=9.205 // *CELB=12.640 // *CELC=6.654
* ALPH=90 // *BETA=90 // *GAMM=90 // *REFC=GLUCSE01 // *COMP=beta
G    GLC  1    C1           0.18226   2.04010   3.06749
G    GLC  1    C2          -0.72351   0.90629   2.59639
G    GLC  1    C3          -0.07456   0.07458   1.50314
G    GLC  1    C4           0.44736   0.96570   0.39924
G    GLC  1    C5           1.36142   2.04010   0.98878
G    GLC  1    C6           1.93213   3.01464  -0.01797
G    GLC  1    O1          -0.56243   2.88066   3.87596
G    GLC  1    O2          -1.03004   0.07584   3.71759
G    GLC  1    O3          -1.05581  -0.81781   0.96150
G    GLC  1    O4           1.19389   0.21867  -0.54895
G    GLC  1    O5           0.58544   2.80608   1.92500
G    GLC  1    O6           0.91314   3.71110  -0.71797

This file was designed to be read by the format description in this stream file:
Code:
* RV CUSTOM CCDB INPUT FORMAT; .COR + PREPEND SEGID RESN RESID
* use * as char 1 for title records in input file
*

!*AUTH=G.M.Brown,H.A.Levy // *CODE=107(Acta Crystallogr.,Sect.B)
!         1         2         3         4         5         6
!123456789012345678901234567890123456789012345678901234567890
!G    GLC  1    C1           3.45122   8.92105  -0.37867

read univ card
* ccb custom format
*
unknown
titl   1  1 *
segid  1  4
resn   6  4
resid 11  4
ires  11  4
type  16  4
x     26 10
y     36 10
z     46 10
end

return
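
To make the column arithmetic concrete, here is a rough Python sketch (not CHARMM input) that applies the same start/width pairs to one line of the glucose file. The names FIELDS, slice_field and parse_line are made up for this illustration, and the IRES entry is left out because it points at the same columns as RESID. It only mimics the slicing; it makes no claim about how CHARMM itself parses the file.

```python
# Field table copied from the UNIV definition above: name -> (start, width).
# Start columns are 1-based, as in the stream file.
FIELDS = {
    "segid": (1, 4),
    "resn": (6, 4),
    "resid": (11, 4),
    "type": (16, 4),
    "x": (26, 10),
    "y": (36, 10),
    "z": (46, 10),
}


def slice_field(line, start, width):
    """Return the text found in a 1-based, fixed-width column range."""
    return line[start - 1:start - 1 + width]


def parse_line(line):
    """Split one coordinate line into a dict; title lines ('*') give None."""
    if line.startswith("*"):
        return None
    rec = {}
    for name, (start, width) in FIELDS.items():
        text = slice_field(line, start, width).strip()
        # Coordinates become floats; identifiers stay as stripped strings.
        rec[name] = float(text) if name in ("x", "y", "z") else text
    return rec


line = "G    GLC  1    C1           0.18226   2.04010   3.06749"
atom = parse_line(line)
print(atom)
```

Because every field is located by column position alone, a line that is shifted by even one character would be misread, which is why the ruler comment in the stream file is worth keeping next to the definition.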

Finally, the following input script does the import specifically for CSFF. A separate script was used to import for a different FF, mostly to handle residue name variations between force fields; the same coordinate file and UNIVersal format definition were used in each case.
Code:
* make glucose unit cell via coor oper; must run twice
* first time to identify 0,0,0 transforms, second to apply them
*

open unit 10 read card name  "~/RvProj/CarbParm/CSFF/CSFF_top-merge.inp"
read rtf card unit 10
close unit 10

open unit 11 read card name  "~/RvProj/CarbParm/CSFF/CSFF_parm_orig.inp"
read param card unit 11
close unit 11

read sequ aglc 1
gener g setup warn
patch beta g 1
rename resn glc sele resn aglc end

! CUSTOM UNIV SETUP
stream "../CcdbXtal/ccdbuniv.str"

! EDITED FROM ORIG .cor FILE; ADD H ATOMS
open unit 3 read card name "../CcdbXtal/glucse01.mcr"
read coor univ unit 3
close unit 3
rename resn aglc sele resn glc end
title copy
hbuild

open unit 2 write card name bgluc0.crd
write coor card unit 2
* glucse01 initial coords from hbuild
*
stop


The following post contains the second example.

_________________________
Rick Venable
computational chemist


#21243 - 06/13/09 01:30 PM Re: UNIVersal coord reader examples [Re: rmv]
This example was used to import lipid bilayer coordinates from a collaborator, in a PDB format which included a single letter chain ID for each segment. By defining the PSF in advance to make the segment names match the chain ID, the PDB file can be read without splitting. (It is the READ SEQUence PDB command which really needs separate PDB files.) The PDB file looked like:
Code:
CRYST1   48.000   48.000   66.744  90.00  90.00  90.00 P 1           1
ATOM      1  N  ADPPCL   1      -8.355  -4.041  21.343  1.00  0.00      L    N
ATOM      2  C13ADPPCL   1      -8.577  -4.975  22.557  1.00  0.00      L    C
ATOM      3 H13AADPPCL   1      -8.571  -4.417  23.482  1.00  0.00      L    H
ATOM      4 H13BADPPCL   1      -7.835  -5.710  22.833  1.00  0.00      L    H
ATOM      5 H13CADPPCL   1      -9.543  -5.430  22.397  1.00  0.00      L    H
ATOM      6  C14ADPPCL   1      -9.604  -3.376  20.995  1.00  0.00      L    C
ATOM      7 H14AADPPCL   1      -9.726  -2.551  21.682  1.00  0.00      L    H
ATOM      8 H14BADPPCL   1     -10.447  -4.048  21.046  1.00  0.00      L    H
ATOM      9 H14CADPPCL   1      -9.631  -2.841  20.057  1.00  0.00      L    H
ATOM     10  C15ADPPCL   1      -7.274  -3.118  21.676  1.00  0.00      L    C
 :
 :
ATOM   9354 C315ADPPCL  72       9.893 -15.250  -0.998  1.00  0.00      L    C
ATOM   9355 H15XADPPCL  72      10.461 -16.091  -0.546  1.00  0.00      L    H
ATOM   9356 H15YADPPCL  72       8.794 -15.416  -1.014  1.00  0.00      L    H
ATOM   9357 C316ADPPCL  72      10.110 -13.947  -0.234  1.00  0.00      L    C
ATOM   9358 H16XADPPCL  72       9.553 -13.974   0.727  1.00  0.00      L    H
ATOM   9359 H16YADPPCL  72       9.695 -13.050  -0.740  1.00  0.00      L    H
ATOM   9360 H16ZADPPCL  72      11.211 -13.899  -0.097  1.00  0.00      L    H
ATOM   9361  OH2BTIP3W   1      21.131 -14.223  20.640  1.00  0.00      W    O
ATOM   9362  H1 BTIP3W   1      21.865 -14.195  20.025  1.00  0.00      W    H
ATOM   9363  H2 BTIP3W   1      20.495 -13.603  20.282  1.00  0.00      W    H
ATOM   9364  OH2BTIP3W   2      18.594  12.455  33.225  1.00  0.00      W    O
ATOM   9365  H1 BTIP3W   2      18.046  12.784  32.512  1.00  0.00      W    H
ATOM   9366  H2 BTIP3W   2      18.880  13.242  33.689  1.00  0.00      W    H

Here is the UNIVersal format used to read this file:
Code:
* define custom format for the lipid PDB files
*

read univ
* custom pdb
*
unknown
iseq  7 5
type 13 4
resn 18 4
segi 22 1
resi 23 4
x    31 8
y    39 8
z    47 8
w    61 6
titl  1 4 CRYS
excl  1 3 END
end

return
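
As a cross-check on the column choices, the following Python sketch (an illustration only, not CHARMM input) applies the same field table to a few lines from the PDB file. It assumes my reading of the definition: a line with CRYS in columns 1-4 is treated as a title line, and a line with END in columns 1-3 is skipped. The names FIELDS, TITLE_RULE, EXCLUDE_RULE, matches and read_atoms are invented for the example. Note that column 17 (the A/B character in front of the residue name) is not covered by any field, so it is simply ignored.

```python
# Field table copied from the UNIV definition above: name -> (start, width),
# with 1-based start columns.
FIELDS = {
    "iseq": (7, 5),
    "type": (13, 4),
    "resn": (18, 4),
    "segid": (22, 1),
    "resid": (23, 4),
    "x": (31, 8),
    "y": (39, 8),
    "z": (47, 8),
    "w": (61, 6),
}
TITLE_RULE = (1, 4, "CRYS")    # assumed meaning: treat the line as a title
EXCLUDE_RULE = (1, 3, "END")   # assumed meaning: skip the line


def matches(line, rule):
    """True if the text of the rule sits in the rule's column range."""
    start, width, text = rule
    return line[start - 1:start - 1 + width] == text


def read_atoms(lines):
    """Return (title_lines, atom_records) for an iterable of text lines."""
    title, atoms = [], []
    for line in lines:
        if matches(line, TITLE_RULE):
            title.append(line)
            continue
        if matches(line, EXCLUDE_RULE):
            continue
        rec = {}
        for name, (start, width) in FIELDS.items():
            text = line[start - 1:start - 1 + width].strip()
            if name in ("x", "y", "z", "w"):
                rec[name] = float(text)
            elif name == "iseq":
                rec[name] = int(text)
            else:
                rec[name] = text
        atoms.append(rec)
    return title, atoms


sample = [
    "CRYST1   48.000   48.000   66.744  90.00  90.00  90.00 P 1           1",
    "ATOM      1  N  ADPPCL   1      -8.355  -4.041  21.343  1.00  0.00      L    N",
    "ATOM   9361  OH2BTIP3W   1      21.131 -14.223  20.640  1.00  0.00      W    O",
    "END",
]
title, atoms = read_atoms(sample)
for rec in atoms:
    print(rec["segid"], rec["resid"], rec["resn"], rec["type"], rec["x"], rec["y"], rec["z"])
```

The single-character SEGID field is what lets the one-letter chain ID line up with the segment names L and W chosen when the PSF is generated.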

Finally, this script imports the coordinates from the PDB file and writes a COOR CARD file for subsequent use with CHARMM. The RESID option is used because the file does not contain absolute residue numbers, which are called RESNO (coordinate format) and IRES (UNIVersal format) in io.doc, and IRES in select.doc. The SEGID (chain ID) and RESID, together with the atom name, are enough to uniquely identify each atom in the PSF.
Code:
* import c27r coords
*

read rtf  card name protlpd27.rtf
read para card name protlpd27r.prm

read sequ dppc 72
gener L setup warn first none last none
read sequ tip3 2189
gener W first none last none noang nodihe

stream univfmt.str

read coor univ resid name dppc-c27r.pdb

write coor card name dppc-c27r.crd
* dppc c27r 1 microsec frame
*
stop
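
The idea behind matching on identifiers rather than on position can be sketched in Python as below. This is only an illustration of why a (SEGID, RESID, atom name) key is sufficient; it is not a description of CHARMM's internal algorithm, and psf_atoms, records and assign_coords are hypothetical names with toy data taken from the PDB excerpt in this post.

```python
# Hypothetical miniature "PSF": atoms in generation order, each identified by
# (segid, resid, atom name). A real PSF holds far more per-atom data.
psf_atoms = [
    ("L", "1", "N"),
    ("L", "1", "C13"),
    ("W", "1", "OH2"),
    ("W", "1", "H1"),
]

# Records as a fixed-width reader might return them, in file order,
# which does not have to match the PSF order.
records = [
    {"segid": "W", "resid": "1", "type": "OH2", "xyz": (21.131, -14.223, 20.640)},
    {"segid": "L", "resid": "1", "type": "N", "xyz": (-8.355, -4.041, 21.343)},
    {"segid": "L", "resid": "1", "type": "C13", "xyz": (-8.577, -4.975, 22.557)},
]


def assign_coords(psf_atoms, records):
    """Place each record's coordinates at the index of the matching PSF atom."""
    index = {key: i for i, key in enumerate(psf_atoms)}
    if len(index) != len(psf_atoms):
        raise ValueError("(segid, resid, type) does not identify atoms uniquely")
    coords = [None] * len(psf_atoms)
    for rec in records:
        key = (rec["segid"], rec["resid"], rec["type"])
        if key in index:  # records with no PSF counterpart are ignored here
            coords[index[key]] = rec["xyz"]
    return coords


coords = assign_coords(psf_atoms, records)
missing = [psf_atoms[i] for i, c in enumerate(coords) if c is None]
print(coords)
print("atoms without coordinates:", missing)
```

In this toy case the water H1 atom is left without coordinates, which is the kind of gap worth checking for after any import.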


