*FILENAME: gen-prot.inp
*PURPOSE: generate psf for small protein in vacuum (BPTI), with xtal waters
* use pdb file to get sequence and apply patches for disulfides
*AUTHOR: Lennart Nilsson, Karolinska Institutet, October 2003
! Unix variable CHM_HOME has to point to CHARMM installation directory
! The coordinates are taken from EDITED versions of 4pti.pdb
! REFERENCE: HBUILD procedure to place hydrogens
! Brunger, A. T., and Karplus, M. (1988). Polar Hydrogen Positions in Proteins:
! Empirical Energy Placement and Neutron Diffraction Comparison. Proteins 4, 148-156.

! Get definitions of amino acids and standard parameters
read rtf card name $CHM_HOME/toppar/top_all22_prot.inp
read param card name $CHM_HOME/toppar/par_all22_prot.inp

! Get aa sequence from pdb file. Note that the pdb file can only
! contain ONE peptide chain, so it has to be edited if there is
! more in the file.
! File 4pti-a.pdb is 4pti.pdb with lines 544-603 removed
! when reading sequence from pdb file the file has to be explicitly
! opened and then closed again
open unit 21 read form name 4pti-a.pdb
read sequence pdb unit 21
close unit 21

! link everything together in the peptide chain
! the setup keyword is used so that we may use data from the
! Internal Coordinate tables in the RTF
! It is also important to use the same segment id (4pti) as in the pdb file...
gene 4pti setup

! put in the three disulfide bridges which we know are present in BPTI
! CYS 5 - 55, 14 - 38, and 30 - 51
patch disu 4pti 5 4pti 55
patch disu 4pti 14 4pti 38
patch disu 4pti 30 4pti 51

read coor pdb name 4pti-a.pdb ! use file with just protein (see below

! check to see if we have it all (except the hydrogens)
print coor select .not. hydrogen .and. .not. initialized end

! there is a nomenclature problem with some c-terminal and ile atoms
! either the pdb file can be edited or we can just use the IC information
! or we can rename them here to follow PDB naming (see rename-2pdb.str)
! to place these atoms
ic param ! fill in some missing bond and angle information from parameter file
ic build ! try to place missing atoms
! this is an x-ray data set so there are no coordinates for hydrogens
! and the hbuild command is a little smarter than "ic build" in placing hydrogens
hbuild select hydrogen end
! see if we can get an energy here
! there were also 60 x-tal waters in the original file
! let us now try to include these as well

read sequence tip3 60
gene xwat noangle nodihedral ! the water model has no angles or torsions (rigid)

! easiest here is to read the water coordinates from a separate file
! due to problems with different names for atoms, residues and segid
! used in the pdb file; we therefore also have to rename atoms and residues...

rename atom O sele atom xwat * OH2 end
rename resn HOH sele segid xwat end

! since the water residue numbering in the pdb file starts at 101, not 1
! as in our PSF we need to rename these anyhow, and the segid too..

rename segid bpti sele segid 4pti end
rename segid 4pti sele segid xwat end
set i 101
set j 1
label loop
rename resi @i sele atom 4pti @j * end
incr i by 1
incr j by 1
if j le 60 goto loop
!note that when reading CHARMM formatted coordinate file (read coor card..)
! the default is to use the residue number, residue name and atom name information
! but not the RESId and SEGId fields; keyword RESI makes use of this info. The rightmost
! column in the PDB file is interpreted as the SEGId
! File 4pti-wat.pdb is a file with just those lines 544-603 that wereremoved
! to make 4pti-a.pdb from 4pti.pdb
! Note that not all PDB files contain the PDB ID in the column after the B-factor
! and that you may then have to put it in with a text editor

read coor pdb name 4pti-wat.pdb resi
! and rename back..
rename segi XWAT sele segid 4pti end
rename atom OH2 sele atom xwat * O end
rename resn TIP3 sele segid xwat end

hbuild sele hydrogen end ! put in hydrogen coordinates


! see if the disulfides we put in make sense. EXCLUsions needed to also get
! information about the bonded sulphurs in the disulfides.

coor dist sele resn cys .and. type sg end -
sele resn cys .and. type sg end exclusions cut 5.0

! save the PSF and coordinates for later use
write psf card name 4pti.psf
* PSF from top_all22_prot for 4pti w/ 3 disulfides
write coor card name 4pti.crd
* 4pti with 60 xtal waters. hydrogens added by CHARMM HBUILD

Lennart Nilsson
Karolinska Institutet
Stockholm, Sweden