The SASA implicit solvation model
Characteristics of the SASA model
The SASA model is a fast implicit solvation model that is useful to
simulate structured peptides and miniprotein motifs [1]. The polar and
non-polar contributions of each atom to the free energy of solvation are
assumed to be proportional to their solvent accessible surface areas.
The SASA model uses only two solvation parameters (constants of
proportionality) and approximates the solvent accessible surface area of
each solute atom with a simple analytical function that is easily
derivable. The electrostatic screening between solute charges is
accounted for by using a distance dependent dielectric function and by
neutralizing the formal charges (Asp, Glu, Arg, Lys, and the termini) as
in the EEF1 model [2].
The SASA model has been successfully applied to peptides, removing the
major artifacts of in vacuo simulations and reproducing reversible
folding [3]. Benchmarks indicate that a simulation with SASA is only
about 50% slower than an in vacuo simulation.
Theoretical aspects
We assume - as is usually done - that the potential energy of the system
consisting of the solute and the solvent can be decomposed in three
parts: the intra-solute potential energy U(X), the intra-solvent
potential energy V(Y), and the interaction potential energy of the solute
and the solvent W(X,Y), where X denotes the degrees of freedom of the
solute and Y the degrees of freedom of the solvent. Integrating out all
solvent degrees of freedom one gets the potential of mean force W(X),
also called the effective energy. It can be written as the sum of the
intra-solute potential energy U(X) and the free energy of solvation, or
mean solvation term, DW(X) that describes all solvent induced effects.
This is a rigorous result from statistical mechanics. For more details
consult, for instance, the review [4].
Any implicit solvation model based on the solvent accessible surface
area loosely relies on the idea that the major contribution to the free
energy of solvation of the solute is determined by the surface of the
solute that is accessible to the solvent, and by the screening effect of
the solvent. Following this idea, it is assumed that the mean solvation
term is a sum of atomic contributions proportional to the solvent
accessible surface area of each atom, plus an energy term that accounts
for the screening. The constants of proportionality are the solvation
parameters.
Technical aspects of the SASA model
There are three important aspects any implicit solvation model that is
based on the solvent accessible surface area has to take care of: (A)
how to calculate the atomic solvent accessible surface areas, (B) how to
account for the screening of solute charges, and (C) how many solvation
parameters to use.
(A) Calculation of the solvent accessible surface area
In the SASA model the solvent accessible surface area is approximated by
a probabilistic approach. For details please refer to [5]. The base is a
simple formula for the probability to hit the accessible surface of atom
i if N atoms are present by choosing randomly a spot on the solvation
shell of atom i. However, this formula is only true under the assumption
that all the atoms are distributed randomly. This is of course not the
case, due to the fact alone that the van der Waals spheres must not
penetrate each other. Now instead of elaborating in sophisticated
probability calculations the pragmatic approach of [6] is adapted where
the original formula is parameterized by including two sets of
parameters: a set of atom type parameters and a set of connectivity
parameters. The atom type parameters depend on the atom type and correct
for systematic errors primarily due to hybridization. The connectivity
parameters distinguish bound atoms from more distant ones. Also, the
radii used to calculate the approximated solvent accessible surface were
optimized for this purpose. (These radii are used only for the
calculation of the approximated area and they are different from the
radii used for the CHARMM van der Waals energy term.) The optimal values
for these three sets of parameters were taken from [6]. For proteins
subsets of the sets given in [6] are sufficient. The SASA model uses the
same subsets that are taken in [7], where 14 different atom types are
used.
The only internal degree of freedom considered by this approximation is
the interatomic distance. Therefore the major defect is the absence of a
better correlation between exact and approximated areas upon changes in
internal coordinates like dihedral angles.
(B) Screening of solute charges
For the screening of solute charges the SASA model uses a distant
dependent screening function, eps(r)=2r, and neutralizes the charged
groups of polar amino acids in exactly the same way as it is done in the
EEF1 model proposed by Lazaridis and Karplus [2].
(C) Solvation parameters
By default, the SASA model uses only two non-vanishing solvation
parameters: one for hydrophobic and one for hydrophilic groups. The
solvation of explicit hydrogen atoms is neglected. However, this can
be changed by the user (see below). The solvation parameters were
adjusted to the EEF1-modified CHARMM 19 polar hydrogen parameter set
by Philippe Ferrara in a trial and error approach. The criterion was to
minimize the root mean square deviation from the native state for six
small proteins by performing molecular dynamics simulations of 1ns at
300K [1]. For this parameterization process the SASA standard setup was
used (apart from the solvation parameters that were of course varied).
See below for the SASA standard setup.
Implementation in CHARMM
To every CHARMM atom type the SASA model assigns a solvation parameter
(zero by default for explicit hydrogen atoms and non-zero for
hydrophobic and hydrophilic groups), a radius optimized for the
approximation of the solvent accessible surface area and an atom type
parameter. Additionally, a connectivity parameter is assigned to every
pair depending on whether the two atoms of the pair form a 1-2 or a more
distant pair. These parameters are called the SASA parameters.
When initializing SASA, i.e., by invoking the SASA command, all the SASA
parameters are printed to the CHARMM output file. A value of -999.000
means that this specific parameter hasn't yet been determined for SASA
and therefore the corresponding CHARMM atom type can't be used by
default for the SASA calculations - the user would have to assign a
meaningful value. (Currently, SASA does not support the following CHARMM
atom types by default: HA, HT, LP, CT, CM, NP, OH2, OM, OT, OS, and FE.)
All SASA parameters can be changed by the user. For the corresponding
syntax, see below.
To evaluate the approximative formula for the solvent accessible surface
area, all pairs within the cutoff are required. Most, but not all of
these pairs, are included in the nonbond pair list in CHARMM. The
missing pairs are stored in a new pair list, the SASA pair list. More
precisely the SASA pair list contains all pairs that are excluded from
the nonbond pair list and that don't belong to the fixed exclusions (as
given in the topology file) in the case of a positive nonbond exclusion
mode and that are within the cutoff. This list assures correct operation
of the SASA model for any nonbond exclusion mode (-5 to 5). This means
that for any nonbond exclusion mode from 1 to 5, the SASA energy and its
derivatives are identical. The same is true for any nonbond exclusion
mode from -5 and 0. Of course there is a difference between the nonbond
exclusion modes 0 and 1 corresponding to the fact that the fixed
exclusions are treated differently: For an exclusion mode from -5 to 0
they are included in the nonbond pair list, opposed to an exclusion mode
from 1 to 5 where they are excluded from the nonbond pair list.
Range and limitations
The SASA model has been applied to structured peptides, see for instance
[3]. However, you currently can't expect it to work for large proteins
mainly for two reasons. Firstly, it has been parameterized for small
proteins [1] and secondly, the dielectric function does not take
different environments into account, i.e., it does not distinguish
whether or not the interacting partial charges are buried or on the
protein surface.
Caveat
The most distant pairs that can appear in the SASA pair list are 1-4
pairs. Since the 1-2, 1-3, and 1-4 pairs are usually very close in
space, there is no check for the pairs in the SASA pair list whether or
not these pairs are within the nonbonding cutoff. Hence, if one runs a
simulation with a cutoff so short that 1-4 pairs are outside the cutoff,
the SASA mean solvation energy and its derivatives are no longer
consistent with the cutoff. See also below for the SASA standard setup.
Please note that the radii used for the calculation of the approximated
solvent accessible surface area are labeled 'van der Waals radii' in the
SASA output of the CHARMM output file. This is consistent with the terms
used in [6]. However, these radii are different from and do not replace
in any way the CHARMM default van der Waals radii used to calculate the
van der Waals energy term.
Two additional files are needed to use SASA (taken from EEF1):
(1) toph19_eef1.inp : This is a modification of toph19.inp where ionic
side chains and termini are neutralized and contains
an extra parameter type (CR).
(2) param19_eef1.inp: This is a modification of param19.inp which includes
the extra parameter type (CR).
These files can be found in test/data/.
* Menu:
* Syntax:: Syntax of the SASA command
* References:: Useful references
* Example:: Input file
Syntax of the SASA command
There is only one SASA command:
SASA atom-selection [S<number> <real>] [R<number> <real>] [P<number> <real>] [fcon <real>] [ncon <real>] [surf] [infx]
atom-selection:== (see *note select:(select.doc).)
<number>:== number corresponding to a CHARMM atom type from param19.inp
or param19_eef1.inp according to the following list:
number = 001 for H
number = 002 for HC
number = 003 for HA
number = 004 for HT
number = 005 for LP
number = 006 for CT
number = 007 for C
number = 008 for CH1E
number = 009 for CH2E
number = 010 for CH3E
number = 011 for CR1E
number = 012 for CM
number = 013 for N
number = 014 for NR
number = 015 for NP
number = 016 for NH1
number = 017 for NH2
number = 018 for NH3
number = 019 for NC2
number = 020 for O
number = 021 for OC
number = 022 for OH1
number = 023 for OH2
number = 024 for OM
number = 025 for OT
number = 026 for OS
number = 027 for S
number = 028 for SH1E
number = 029 for FE
number = 030 for CR
The SASA command sets up the SASA model for a simulation. All the values
have to be given on one command line. Invoking the SASA command a second
time reinitializes all values either to the default or to the user
specified values.
atom-selection This determines the atoms to be used for the SASA
calculations. All atoms that are not included in
this selection are treated by SASA as if not
existent. Their solvation free energies are not
considered, i.e., are set to zero, and the decrease
in the solvent accessible surface areas of the
selected atoms due to the not selected ones is also
not taken into account.
S<number> <real> Changes the solvation parameter of the CHARMM atom
type corresponding to <number> (see list above) from
the default value to <real>.
R<number> <real> Changes the radius used by SASA (for the calculation
of the approximated solvent accessible surface areas)
of the CHARMM atom type corresponding to <number>
(see list above) from the default value to <real>.
P<number> <real> Changes the atom type parameter of the CHARMM atom
type corresponding to <number> (see list above) from
the default value to <real>.
fcon <real> Changes the connectivity parameter for 1-2 pairs from
the default value to <real>.
ncon <real> Changes the connectivity parameter for more distant
than 1-2 pairs from the default value to <real>.
surf The approximated atomic solvent accessible surface
areas are stored in WMAIN.
infx Includes the fixed exclusion pairs in the SASA pair
list. By default, the fixed exclusion atoms are not
considered in the SASA surface calculations for
historical reasons. This means that, for instance,
for the CG of the residue PHE (see the topology
file), not all its neighbors are considered to
calculate its accessible surface. To include all
neighbors use the keyword infx, but note that SASA
was not parameterized with this option.
Please also have a look at the following examples. The SASA standard
setup looks like this:
nbond nbxmod 5 atom rdiel shift vatom vdistance vshift -
cutnb 8.0 ctofnb 7.5 ctonnb 6.5 eps 2.0 e14fac 0.4 wmin 1.5
sasa selection (.not. hydrogen) end
This command sets up SASA in its standard form. If you would like to
change the solvation parameters, for instance, use the following syntax:
nbond nbxmod 5 atom rdiel shift vatom vdistance vshift -
cutnb 8.0 ctofnb 7.5 ctonnb 6.5 eps 2.0 e14fac 0.4 wmin 1.5
sasa selection (.not. hydrogen) end -
S013 -0.07 S014 -0.07 S016 -0.07 S017 -0.07 S018 -0.07 S019 -0.07 S020 -0.07 S021 -0.07 S022 -0.07 -
S007 0.014 S008 0.014 S009 0.014 S010 0.014 S011 0.014 S027 0.014 S028 0.014 S030 0.014
This command sets up SASA and changes the solvation parameters of the
hydrophilic groups from the default of -0.06 to -0.07 and of the
hydrophobic groups from the default of 0.012 to 0.014. Note that you
have to specify every group separately. (Do not use these values. They
were chosen arbitrarily just for the sake of illustration of the
syntax.)
The nonbond options are the default nonbond options from the default
param19.inp file with the exception of the eps value that is set to 2
instead of 1 and rdiel is used instead of cdiel. The default solvation
parameters are -0.06 for hydrophilic groups (N, NR, NH1, NH2, NH3, NC2,
O, OC, and OH1) and 0.012 for hydrophobic groups (C, CH1E, CH2E, CH3E,
CR1E, S, and SH1E) and 0.0 for explicit hydrogen atoms. Please note that
the param19_eef1.inp file has different default nonbond options
(especially the cutoffs are different) that were not used to
parameterize SASA and therefore should not be used or used only with
care in simulations with SASA since consistency is lost.
A second set of parameters has been tested: -0.144 for hydrophilic
groups, 0.024 for hydrophobic groups and 0.0 for explicit hydrogen atoms
with eps(r)=r. It is not the default. It seems to work better for large
proteins but no results have been published up to date (June 2004).
If you select all atoms for SASA instead of excluding the (explicit)
hydrogens, all atoms will be considered for calculating the solvent
accessible surface area of each atom. However, since the solvation
parameter for hydrogen atoms is zero by default, the solvation energy of
the hydrogen atoms is also zero by default. If you insist on including
the solvation energy due to the solute hydrogen atoms, you have to
assign a non-zero solvation parameter to the hydrogen atoms by yourself,
using 'S001 <real> S002 <real>' in the CHARMM input file. Be aware that
this is not the default.
If you run a simulation with several molecules (so that you have more
than one segment identifier), make sure that you invoke the SASA command
after the generation of the last segment since any molecule generated
after the last use of the SASA command is not included in the SASA
calculations. You don't have to invoke the SASA command after every
generation of a segment, it is sufficient to use the SASA command once
after the generation of the last segment.
The SASA solvation energy is stored in the variable 'SASL'. Use '?SASL'
in the CHARMM input file to access the value.
References
[1] Ferrara, P.; Apostolakis, J.; Caflisch, A.; Evaluation of a Fast
Implicit Solvent Model for Molecular Dynamics Simulations; Proteins
2002; 46; 24-33.
[2] Lazaridis, T.; Karplus. M.; Effective Energy Function for Proteins
in Solution; Proteins 1999; 288; 477-487.
[3] Ferrara, P.; Caflisch, A.; Folding Simulations of a Three-Stranded
Antiparallel Beta-Sheet Peptide; Proc. Natl. Acad. Sci. USA; 2000;
97; 10780.
[4] Roux, B.; Simonson, T.; Implicit Solvent Models; Biophysical
Chemistry; 78; 1999; 1-20.
[5] Wodak, S. J.; Janin, J.; Analytical Approximation to the Solvent
Accessible Surface Area of Proteins; Proc. Natl. Acad. Sci. USA;
1980; 77; 1736.
[6] Hasel, W.; Hendrickson, T. F.; Clark, S. W.; A Rapid Approximation
to the Solvent Accessible Surface Area of Atoms; Tetrahedron
Computer Methodology 1988; Vol. 1; No. 2; 103-116.
[7] Fraternali, F.; van Gunsteren, W. F.; An Efficient Mean Solvation
Force Model for Use in Molecular Dynamics Simulations of Proteins
in Aqueous Solution; J. Mol. Biol. 1996; 256; 939.
Examples
Check test/c29test/sasa.inp for a more complex example with
minimization, equilibration and dynamics. Here is a short version:
* Example input file for the SASA implicit solvation model.
*
! --- Begin generation procedure ---
open read card name toph19_eef1.inp unit 30
read rtf card unit 30
close unit 30
open read card name param19_eef1.inp unit 30
read parameter card unit 30
close unit 30
open read card name filename.crd unit 30
read sequence coor unit 30
close unit 30
generate main warn setup
! --- End generation procedure ---
! --- Begin reading coordinates ---
open read card name filename.crd unit 30
read coordinate card unit 30
close unit 30
! --- End reading coordinates ---
! --- Begin setting up SASA ---
! Use the SASA standard setup.
nbond nbxmod 5 atom rdiel shift vatom vdistance vshift -
cutnb 8.0 ctofnb 7.5 ctonnb 6.5 eps 2.0 e14fac 0.4 wmin 1.5
sasa selection (.not. hydrogen) end
! --- End setting up SASA ---
! --- Begin minimization ---
minimize sd nstep 300 nprint 20 tolgrad 0.1
minimize conj nstep 200 nprint 20 tolgrad 0.1
! --- End minimization ---
stop
CHARMM Documentation / chmgr@tammy.harvard.edu