Previous Thread
Next Thread
Print Thread
Page 1 of 3 1 2 3
#4116 11/09/04 09:08 PM
Joined: Sep 2003
Posts: 8,658
Likes: 26
rmv Online Content OP
Forum Member
OP Online Content
Forum Member
Joined: Sep 2003
Posts: 8,658
Likes: 26
I've posted two input scripts I used recently to read in an enzyme system and the pertussis toxin 1PRT in .pdb (NOT .ent) format; CHARMM isn't programmed (yet) to read the new CIF format, so coords must be the in the "classic" .pdb format. I've done this a fair number of times, and have developed a fairly robust (but not automated) procedure which uses a script like the two posted, along with some advance preparation of the .pdb format file(s).

Note that while it may be possible to use an alternate procedure and get a CHARMM PSF and coords, my procedure has the following advantages:

+ the sequence is read directly from the PDB coord file
+ original residue numbering in the PDB file is retained
+ temperature B factors are read into WMAIN

I've included a fair number of comments in these advanced usage examples, which should explain the purpose and rationale for the commands used. A few additional remarks are in order, because of the additional steps, and residue numbering can create some confusion.

First, some examination and hand editing of the PDB file is needed:

+ change HIS residue name to e.g. HSE
+ create a new file for each subunit, or ANY gap in residue nos.
+ for multi-model NMR files, create new files for each model
+ examine subunit terminii for residue numbering, blocking groups
Optional:
+ change atom name CD1 to CD for residue ILE
+ for std CO2- terminii, change atom names for O,OXT to OT1,OT2
+ other edits may be needed for e.g. acetyl on Nterm, amide Cterm

I've indicated the last 3 as optional, because one can use IC BUILD to
place a few missing heavy atoms, e.g. at the terminii. For detailed
simulation work, you should probably revisit the issue of HIS
protonation states.

A key concept is that anytime residue N+1 does NOT follow residue N, a
new file must be created; the most common examples are gaps in numbering
due to disordered loops, or abrupt changes for the beginning of a new
subunit for more complex proteins.

Another important concept for this is the distinction between RESNO and
RESID within CHARMM; the PDB residue number RESID is treated as an
arbitrary label, while RESNO is the absolute integer residue number
within the PSF regardless of RESID. This way, for a second subunit, the
first RESID may be 1, but RESNO might be e.g. 224, assuming the first
subunit had 223 residues. This allows preserving the identifiers used
by the molecular biologists and crystallographers. However, in order
for this to work, the OFFSET keyword is needed with the COOR READ
command, and the value is chosen such that RESID+OFFSET = RESNO. For
the above example, the OFFSET would be 223 when reading the second
subunit (even if the last RESID of the first subunit is not 223).

To understand this fully, look at the scripts in conjunction with the
enzyme (1IRK) or pertussis toxin .pdb file (1PRT).


Rick Venable
computational chemist

Joined: Sep 2003
Posts: 8,658
Likes: 26
rmv Online Content OP
Forum Member
OP Online Content
Forum Member
Joined: Sep 2003
Posts: 8,658
Likes: 26

* build catalytic complex of IRK from PDB coords
* file split for CHARMM import
* IRK == Insulin Receptor Kinase fragment; ANP changed to ATP
*

bomblev -1

! STD FILES
open unit 1 read card name -
/usr/local/charmm/c29b2/toppar/top_all27_prot_na.rtf
read rtf card unit 1
close unit 1
open unit 1 read card name -
/usr/local/charmm/c29b2/toppar/par_all27_prot_na.prm
read param card unit 1
close unit 1

! ALTERNATE PDB INPUT METHOD; put B-factors in WMAIN
read univ
* modified pdb
*
pdb
segid 22 1
ires 23 4
w 61 6
end

! PROTEIN FRAGMENT, SEGID A; GET SEQU AND COORDS FROM FILE
open unit 3 read card name irk.pdb
read sequ pdb unit 3
rewind unit 3
gener A setup warn
read coor univ unit 3 offset -980
close unit 3

! ATTEMPT TO PLACE ANY MISSING HEAVY ATOMS
ic purge
ic param
ic fill preserve
ic build
define test sele ( .not. type H* ) .and. ( .not. init ) show end

! REBUILD ALL H ATOM COORDS
coor init sele type H* end
hbuild sele type H* end
define test sele .not. init show end

! SUBSTRATE FRAGMENT, SEGID B; GET SEQU AND COORDS FROM FILE
open unit 3 read card name iri.pdb
read sequ pdb unit 3
rewind unit 3
gener B setup warn first ace last ct3
read coor univ unit 3 offset 296
close unit 3

! ATTEMPT TO PLACE ANY MISSING HEAVY ATOMS
ic purge
ic param
ic fill preserve
ic build
define test sele segid B .and. ( .not. type H* ) .and. ( .not. init ) show end

! REBUILD ALL H ATOM COORDS
coor init sele segid B .and. type H* end
hbuild sele segid B .and. type H* end
define test sele segid B .and. .not. init show end

! ATP
read sequ atp 1
gener N setup warn first none last none
open unit 3 read card name iratp.pdb
read coor univ unit 3 offset 309
close unit 3

! PLACE ANY MISSING HEAVY ATOMS; REBUILD ALL H ATOM COORDS
ic purge
ic param
ic fill preserve
ic build
define test sele segid N .and. ( .not. type H* ) .and. ( .not. init ) show end
coor init sele segid N .and. type H* end
hbuild sele segid N .and. type H* end
define test sele segid N .and. .not. init show end

! MG++ IONS
read sequ mg 2
gener M setup warn first none last none
open unit 3 read card name irmg.pdb
read coor univ unit 3 offset 310
close unit 3

! XTAL WATER
read sequ tip3 202
gener W noang nodihe
open unit 3 read card name irwat.pdb
read coor univ unit 3 offset 312
close unit 3
hbuild sele segid W .and. type H* end

energy

open unit 4 write card name irki.psf
write psf card unit 4
* insulin receptor kinase w. peptide, atp, mg++, water
*

open unit 4 write card name irki.crd
write coor card unit 4
* insulin receptor kinase w. peptide, atp, mg++, water; e = ?ENER
*

stop

Joined: Sep 2003
Posts: 8,658
Likes: 26
rmv Online Content OP
Forum Member
OP Online Content
Forum Member
Joined: Sep 2003
Posts: 8,658
Likes: 26

* pertussis toxin from PDB; must read each subunit separately
* one of two molecules from the asymmetric unit of 1PRT
*

! RTF AND PARAM FILES
open unit 2 read card name /v/rhea1/charmm/c27b3/toppar/top_all22_prot.inp
read rtf card unit 2
close unit 2

open unit 2 read card name /v/rhea1/charmm/c27b3/toppar/par_all22_prot.inp
read param card unit 2
close unit 2

! SETUP UNIV COORD READER TO READ PDB FORMAT, PUT B FACTOR IN WMAIN
read univ
* modified pdb setup
*
pdb
w 61 6
end

! SUBUNIT A HAS A TEN RESIDUE GAP (DISORDERED LOOP); READ AS 2 FILES
! USE SETUP OPTION OF GENERATE TO CREATE IC TABLE
open unit 10 read card name 1prt-a.pdb
read sequ pdb unit 10
gener A setup warn
rewind unit 10
open unit 11 read card name 1prt-a2.pdb
read sequ pdb unit 11
gener A2 setup warn
rewind unit 11
join A A2

! SUBUNITS B-F; READ RESIDUE SEQUENCE AND REWIND FILE
open unit 12 read card name 1prt-b.pdb
read sequ pdb unit 12
gener B setup warn
rewind unit 12

open unit 13 read card name 1prt-c.pdb
read sequ pdb unit 13
gener C setup warn
rewind unit 13

open unit 14 read card name 1prt-d.pdb
read sequ pdb unit 14
gener D setup warn
rewind unit 14

open unit 15 read card name 1prt-e.pdb
read sequ pdb unit 15
gener E setup warn
rewind unit 15

open unit 16 read card name 1prt-f.pdb
read sequ pdb unit 16
gener F setup warn
rewind unit 16

! CONVERT ISOLATED CYS TO DISULFIDE LINK
patch disu A 41 A 201 setup sort warn
patch disu B 23 B 87 setup sort warn
patch disu B 120 B 134 setup sort warn
patch disu B 192 B 199 setup sort warn
patch disu C 23 C 87 setup sort warn
patch disu C 120 C 134 setup sort warn
patch disu C 192 C 199 setup sort warn
patch disu D 31 D 51 setup sort warn
patch disu D 103 D 109 setup sort warn
patch disu E 31 E 51 setup sort warn
patch disu E 103 E 109 setup sort warn
patch disu F 27 F 41 setup sort warn
patch disu F 92 F 98 setup sort warn

! SAVE THE COMPLETED PSF
open unit 1 write card name pttox.psf
write psf card unit 1
* the six subunits (A-F) of pertussus toxin 1PRT; initial CHARMM psf
*

! OFFSETS REQUIRED TO PRESERVE PDB RESIDS
! OFFSET IS ADDED TO RESID TO GET CHARMM RESNO (ABSOLUTE RESIDUE NO.)
read coor univ unit 10 offset -1
close unit 10
read coor univ unit 11 offset -11
close unit 11
read coor univ unit 12 offset 221
close unit 12
read coor univ unit 13 offset 417
close unit 13
read coor univ unit 14 offset 616
close unit 14
read coor univ unit 15 offset 726
close unit 15
read coor univ unit 16 offset 835
close unit 16

ic purge ! CLEANUP IC TABLE
ic param ! GET MISSING BONDS AND ANGLES FROM PARAMETER FILE
ic build ! PLACE ANY MISSING COORDS, E.G. TERMINAL O ON CO2-

! CHECK FOR MISSING HEAVY ATOM COORDS
define test sele ( .not. type H* ) .and. ( .not. init ) show end

! SAVE THE IC TABLE FILLED WITH XTAL DATA
ic fill
open unit 1 write card name pttox.ic
write ic card unit 1
* the six subunits of pertussus toxin; initial CHARMM ic table
*
! USE HBUILD TO REBUILD H ATOMS; SPINS METHYLS, ETC. TO LOCAL MINIMUM
coor init sele type H* end
hbuild sele type H* end

! CHECK FOR ANY MISSING COORDS
define test sele .not. init show end

! SAVE THE COMPLETED CHARMM VERSION OF THE PDB XTAL COORDS
open unit 1 write card name pttox.crd
write coor card unit 1
* six subunits (A-F) of pertussus toxin 1PRT; initial CHARMM coords
*

stop

Joined: Nov 2004
Posts: 8
S
Forum Member
Offline
Forum Member
S
Joined: Nov 2004
Posts: 8
Hi Rick (or anyone else who may be able to help)
I have tried to repeat your example for the protein 1PRT and am encountering the following error message in the output file:
: No angle parameters for 1 ( HA CT2 H )
: No angle parameters for 2 ( HA HA CT1 )
: No angle parameters for 3 ( CT2 HA CT1 )
: No angle parameters for 4 ( CT1 CT2 HB )
: No angle parameters for 5 ( CT1 OC HB )
: No angle parameters for 6 ( O NH3 CT1 )
: No angle parameters for 7 ( C HC CT1 )
: No angle parameters for 8 ( C HC CT1 )
: No angle parameters for 9 ( H HC CT1 )
: No angle parameters for 10 ( NH1 HC CT1 )
: No angle parameters for 11 ( NH1 HC CT1 )
: No angle parameters for 12 ( NH1 HC CT1 )
This continues for about 500 angle parameters.
This message follows the "gener F setup warn" command.

The error seems to be a result of the number of amino acid residues used. For example, if I reduce the number of residues in 1prt-f.pdb to 60, the same error message results. But if I reduce the number of residues to 59 or fewer, no error occurs.
Is there some sort of cutoff value for the maximum number of residues that CHARMM can handle?

I have also tried applying this example to a different protein and the same error occurs.

Any help would be greatly appreciated.

Thanks!

Joined: Sep 2003
Posts: 8,658
Likes: 26
rmv Online Content OP
Forum Member
OP Online Content
Forum Member
Joined: Sep 2003
Posts: 8,658
Likes: 26
You need to use a 'large' CHARMM executable for this example; there are limits to the number of atoms, bonds, residues, etc. and the 'medium' size won't work. The CHARMM output header gives the max number of atoms and residues; 'medium' is ca. 25K atoms, 'large' is ca 60K atoms.

I'd expect a lot of other problems applying the script to some other protein; the OFFSET values for each PDB fragment are highly unique to each situation.


Rick Venable
computational chemist

Joined: May 2004
Posts: 4
X
Forum Member
Offline
Forum Member
X
Joined: May 2004
Posts: 4
I have a fairly simple case concerning the application of offset keyword. The protein i'm working with the the E.coli DAM and it was crystallized with 4 missing segments including both terminal regions.

I have successfully applied offsets and psf file. However, I've noticed that RESID for each fragment begins with 1, instead of the anticipated crystal RESID.

I'd appreciate any input, and if possible, the files pttox.psf and pttox.crd.

Thanks

--------------
This is a portion of my input:

! Generate the first DAM in the DAM-DNA-DAM complex
open unit 10 read form name @DIR/@id1.pdb.crd
read sequence coor unit 10
generate DAMA first none last none setup warn
rewind unit 10
read coor card unit 10 offset -2

open unit 10 read form name @DIR/@id2.pdb.crd
read sequence coor unit 10
generate A2 first none last none setup warn
rewind unit 10
read coor card unit 10 offset -13

join DAMA A2

open unit 10 read form name @DIR/@id3.pdb.crd
read sequence coor unit 10
generate A3 first none last none setup warn
rewind unit 10
read coor card unit 10 offset -26

join DAMA A3

ic param
ic build
hbuild sele type H* end


open unit 20 write form name @id7_a.psf
write psf card unit 20
* CHARMm ic built from EcoDam A
* TOP: @1
* PAR: @2

And a portion of the PSF file:

2985 DAMA 185 ALA CA 22 0.700000E-01 12.0110 0
2986 DAMA 185 ALA HA 6 0.900000E-01 1.00800 0
2987 DAMA 185 ALA CB 24 -0.270000 12.0110 0
2988 DAMA 185 ALA HB1 3 0.900000E-01 1.00800 0
2989 DAMA 185 ALA HB2 3 0.900000E-01 1.00800 0
2990 DAMA 185 ALA HB3 3 0.900000E-01 1.00800 0
2991 DAMA 185 ALA C 20 0.510000 12.0110 0
2992 DAMA 185 ALA O 70 -0.510000 15.9990 0
2993 DAMA 1 ALA N 54 -0.470000 14.0070 0
2994 DAMA 1 ALA HN 1 0.310000 1.00800 0
2995 DAMA 1 ALA CA 22 0.700000E-01 12.0110 0
2996 DAMA 1 ALA HA 6 0.900000E-01 1.00800 0
2997 DAMA 1 ALA CB 24 -0.270000 12.0110 0
2998 DAMA 1 ALA HB1 3 0.900000E-01 1.00800 0
2999 DAMA 1 ALA HB2 3 0.900000E-01 1.00800 0
3000 DAMA 1 ALA HB3 3 0.900000E-01 1.00800 0
3001 DAMA 1 ALA C 20 0.510000 12.0110 0
3002 DAMA 1 ALA O 70 -0.510000 15.9990 0
3003 DAMA 2 SER N 54 -0.470000 14.0070 0
3004 DAMA 2 SER HN 1 0.310000 1.00800 0
3005 DAMA 2 SER CA 22 0.700000E-01 12.0110 0

Joined: Sep 2003
Posts: 8,658
Likes: 26
rmv Online Content OP
Forum Member
OP Online Content
Forum Member
Joined: Sep 2003
Posts: 8,658
Likes: 26
The usage of OFFSET in the 'pttox' example is tied to the PDB file used as input (1PRT), and how residues are numbered in PDB files.

Your script is reading CHARMM .crd files, so my PDB input example may not be applicable.

Re-read the explanation to make sure you understand the difference between RESNO and RESID; see also io.doc


Rick Venable
computational chemist

Joined: May 2004
Posts: 4
X
Forum Member
Offline
Forum Member
X
Joined: May 2004
Posts: 4
It's working fine now that I am reading the sequence and coords directly from the pdb file.

Thanks

Joined: Apr 2004
Posts: 40
wdy Offline
Forum Member
Offline
Forum Member
Joined: Apr 2004
Posts: 40
Hi Rick, I'm running a complex PDB input and the result is an output similar to Steph's output,

: No angle parameters for 1 ( HA CT2 H )
: No angle parameters for 2 ( HA HA CT1 )
: No angle parameters for 3 ( CT2 HA CT1 )
: No angle parameters for 4 ( CT1 CT2 HB )
: No angle parameters for 5 ( CT1 OC HB )
: No angle parameters for 6 ( O NH3 CT1 )
: No angle parameters for 7 ( C HC CT1 )
: No angle parameters for 8 ( C HC CT1 )
: No angle parameters for 9 ( H HC CT1 )
: No angle parameters for 10 ( NH1 HC CT1 )
: No angle parameters for 11 ( NH1 HC CT1 )
: No angle parameters for 12 ( NH1 HC CT1 )
And so on... I have on my system 14468 atoms, and 884 residues. So, the problem is not about the max number of atoms and residues (I've got the 'medium' Charmm). My question is: ¿Which is the max number of bonds? ¿Bonds cutoff could be the problem?

I am building a tetramer, and when I build it with only three subunits (Atom Number:10848, Residue Number:663 )the input works fine.

Thanks in advance for any comments.


Wendy González Bioinformatics and Molecular Simulation Centre University of Talca Chile
Joined: Sep 2003
Posts: 8,658
Likes: 26
rmv Online Content OP
Forum Member
OP Online Content
Forum Member
Joined: Sep 2003
Posts: 8,658
Likes: 26
Check the output from all the GENERATE commands-- I suspect some default patches were applied when they should not have been. You may need to add FIRST NONE LAST NONE to the GENERATE commands in order to override the default behavior.


Rick Venable
computational chemist

Page 1 of 3 1 2 3

Moderated by  lennart, rmv 

Link Copied to Clipboard
Powered by UBB.threads™ PHP Forum Software 7.7.5
(Release build 20201027)
Responsive Width:

PHP: 7.3.31-1~deb10u5 Page Time: 0.022s Queries: 35 (0.016s) Memory: 0.7969 MB (Peak: 0.9048 MB) Data Comp: Off Server Time: 2023-09-30 03:17:35 UTC
Valid HTML 5 and Valid CSS