Overview of NMR refinement using SANDER.

We find the sander module to be a flexible way of incorporating a variety of restraints into a optimization procedure that includes energy minimization and dynamical simulated annealing. The "standard" sorts of NMR restraints, derived from NOE and J-coupling data, can be entered in a way very similar to that of programs like DISGEO, DIANA or X-PLOR; an aliasing syntax allows for definitions of pseudo-atoms, connections with peak numbers in spectra, and the use of "ambiguous" constraints from incompletely-assigned spectra. More "advanced" features include the use of time-averaged constraints, use of multiple copies (LES) in conjunction with NMR refinement, and direct refinement against NOESY intensities, paramagnetic and diamagnetic chemical shifts, or residual dipolar couplings. In addition, a key strength of the program is its ability to carry out the refinements (usually near the final stages) using an explicit-solvent representation that incorporates force fields and simulation protocols that are known to give pretty accurate results in many cases for unconstrained simulations; this ability should improve predictions in regions of low constraint density and should help reduce the number of places where the force field and the NMR constraints are in "competition" with one another.

Since there is no generally-accepted "recipe" for obtaining solution structures from NMR data, the comments below are intended to provide a guide to some commonly-used procedures. Generally speaking, the programs that need to be run to obtain NMR structures can be divided into three parts:

  1. front-end modules, which interact with NMR databases that provide information about assignments, chemical shifts, coupling constants, NOESY intensities, and so on. We have tried to make the general format of the input straightforward enough so that it could be interfaced to a variety of programs. At TSRI, we generally use the FELIX and NMRView codes, but the principles should be similar for other ways of keeping track of a database of NMR spectral information. As the flow-chart on the next page indicates, there are only a few files that need to be created for NMR restraints; these are indicated by the solid rectangles. The primary distance and torsion angle files have a fairly simple format that is largely compatible with the DIANA programs; if one wishes to use information from ambiguous or overlapped peaks, there is an addtional "MAP" file that makes a translation from peak identifiers to ambiguous (or partial) assignments. Finally, there are some specialized (but still pretty straightforward) file formats for chemical shift or residual dipolar coupling restraints.
  2. restrained molecular dynamics, which is at the heart of the conformational searching procedures. This is the part that sander itself handles.
  3. back-end routines that do things like compare families of structures, generate statistics, simulate spectra, and the like. For many purposes, such as visualization, or the running of procheck-NMR, the "interface" to such programs is just the set of pdb-format files that contain the family of structures to be analyzed. These general-purpose structure analysis programs are available in many locations, are are not discussed here. The principal sander-specific tool is sviol, which prepares tables and statisitics of energies, restraint violations, and the like.

Preparing restraint files for Sander


linkthick=1.5

# second row:

PEAK: circle radius 0.4 "NOESY" "peak-list" right arrow 0.75 from PEAK.e "various" above "calibrations" below

DIST: box "7/8 column" "distance" "bounds" arrow 0.75 from DIST.e "makeDIST_RST" above

RDIS: ellipse "distance" "restraints"

arrow 0.75 from RDIS.e "makeVOL_RST" above

NOEEXP: ellipse "volume" "restraints"

# top row:

move up from PEAK.n RCS: ellipse "chemical" "shift" "restraints"

move up from DIST.n CS: circle radius 0.4 "chemical" "shifts" right arrow 0.75 from CS.e "analyze" above "ambiguities" below

MAP: box "MAP" "file"

move right 0.75 from MAP.e DMAP: ellipse "default" "MAP file"

arrow from CS.w to RCS.e "SHIFTS or" above "FANTASIAN" below

# third row:

move down from PEAK.s SPECTRUM: circle radius 0.4 "spectrum"

move right 0.75 from SPECTRUM.e COUPLING: box "5 column" "coupling" "constants" right arrow 0.75 from COUPLING.e "makeANG_RST" above

RANG: ellipse "J-coupling" "restraints"

move right 0.75 from RANG.e DISANG: ellipse "DISANG" "file"

# fourth row:

move down from COUPLING.s CHIR: ellipse "default" "chirality" "inform." right arrow 0.75 from CHIR.e "makeCHIR_RST" above

RCHIR: ellipse "chirality" "restraints"

arrow from RDIS.se to DISANG.nw arrow from RANG.e to DISANG.w arrow from RCHIR.ne to DISANG.sw arrow from MAP.s to RDIS.n arrow from DMAP.w to MAP.e arrow from SPECTRUM.n to PEAK.s arrow from SPECTRUM.e to COUPLING.w arrow from PEAK.ne to CS.sw spline from SPECTRUM.s down 1.6 then right 1.1 ->

# fifth row:

move down from CHIR.s DIP: box "direct" "dipolar" "couplings" right arrow 0.75 from DIP.e "makeDIP_RST" above

RDIP: ellipse "alignment" "restraints"


Notation: circles represent logical information, whose format might differ from one project to the next; solid rectangles are in a specific format (largely compatible with DIANA and other programs), and are intended to be read and edited by the user; ellipses are specific to Sander, and are generally not intended to be read or edited manually. The conversion of NOESY volumes to distance bounds can be carried out by a variety of programs such as mardigras or xpk2bound that are not included with Amber. Similarly, the analysis and partial assignment of ambiguous or overlapped peaks is a separate task; at TSRI, these are typically carried out using the programs xpkasgn and filter.pl.

Once the restraint files are made, here is a flow-chart of the general way in which sander refinements are carried out:

Running NMR refinements in sander


linkthick=1.5

PDB: circle radius 0.4 "generic" "PDB-file" right arrow 0.75 from PDB.e "protonate" above

APDB: box "Amber" "PDB-file" right arrow 0.75 from APDB.e "LEaP" above

PRMTOP: box "prmtop &" "prmcrd" "files"

# second row:

move down from APDB.s MDIN: box "sander" "control" "file" right arrow 0.75 from MDIN.e

SANDER: ellipse "SANDER" move right 0.75 from SANDER.e RST: box "NMR" "restraints"

# third row:

move down from MDIN.s OUTPUTS: ellipse "sander" "output" left arrow 0.75 from OUTPUTS.w "sviol" above

VIOL: circle radius 0.4 "violation" "statistics"

move down from SANDER.s COORDS: ellipse "Amber" "coordinates" right arrow 0.75 from COORDS.e "ambpdb" above

PDBOUT: box "output" "pdb-files"

arrow from PRMTOP.s to SANDER.n arrow from SANDER.sw to OUTPUTS.n arrow from SANDER.s to COORDS.n arrow from RST.w to SANDER.e


Once families of PDB-format structures are available, there is a wide variety of tools available elsewhere to perform comparisons and analyses.

The basic ideas of this scheme owe a lot to the general experience of the nmr community over the past decade. Some good papers to look at are:

  1. "Computational methods for determining protein structures from NMR data," by G.P. Gippert, P.F. Yip, P.E. Wright and D.A. Case, Biochem. Pharm. 40, 15-22 (1990).
  2. "Determination of high resolution NMR structures of proteins", by D.A. Case and P.E. Wright, in NMR in Proteins, G.M. Clore and A.M. Gronenborn, eds. (New York: McMillan, 1993), pp. 53-91.
  3. D.A. Case, H.J. Dyson and P.E. Wright. Use of chemical shifts and coupling constants in nuclear magnetic resonance structural studies on peptides and proteins. Methods in Enzymology 239, 392-416 (1994).
  4. R. Brüschweiler and D.A. Case. Characterization of biomolecular structure and dynamics by NMR cross relaxation. Prog. NMR Spectr. 26, 27-58 (1994).
  5. D.A. Case. NMR spectral simulation and structure refinement. In Encyclopedia of Computational Chemistry, P.v.R. Schleyer, N.L. Allinger, T. Clark, J. Gasteiger, P.A. Kollman and H.F. Schaefer,III, eds. (Chichester: John Wiley, 1998), pp. 1866-1876.
  6. D.A. Case. The use of chemical shifts and their anisotropies in biomolecular structure determination. Curr. Opin. Struct. Biol. 8, 624-630 (1998).

    These papers outline procedures in the Scripps group, from which a lot of the NMR parts of SANDER are derived. They are by no means the only way to proceed. We hope that the flexibility incorporated into SANDER will encourage folks to experiment with refinement protocols.

Preparing distance restraints: makeDIST_RST.

The makeDIST_RST program converts a simplified description of distance bounds into a detailed input for sander. A variety of input and output filenames may be specified on the command line:


input:
-noe 7-col file of upper distance bounds, OR
-ual
8-col file of upper and lower bounds, OR
-vol
7-col file of NOESY volumes


-pdb
Brookhaven format file
-map
MAP file (default:map.DG-AMBER)
-les
LES atom mappings, made by addles


output:
-dgm
DGEOM95 restraint format
-rst
SANDER restraint format
-svf
Sander Volume Format


other options:
-help
(gives you this explanation, overrides other parameters)
-report
(gives you short runtime diagnostic output)
-nocorr
(do not correct upper bound for r**-6 averaging)
-altdis
(use alternative form for the distance restraints)


The 7/8 column distance bound file is essentially that used by the DIANA or DISGEO programs. It consists of one-line per restraint, which would typically look like the following:

23 ALA HA 52 VAL H 3.8 # comments go here
The first three columns identify the first proton, the next three the second proton, and the seventh column gives the upper bound. Only the first three letters of the residue name are used, so that DIANA files that contain residues like "ASP-" will be correctly interpreted. An alternate, 8-column, format has both upper and lower bounds. Comments typically identify the spectrum and peak-number or other identification that allow cross-referencing back to the appropriate spectrum.

If all peaks involved just single protons, and were fully assigned, this is all that one would need. In general, though, some peaks (especially methyl groups or fast-rotating aromatic rings) represent contributions from more than one proton, and many other peaks may not be fully assigned. Sander handles both of these situations in the same way, through the notion of an "ambiguous" peak, that may correspond to several assignments. These peaks are given two types of special names in the 7/8-column format file:

  1. Commonly-occuring ambiguities, like the lack of stereospecific assignments to two methylene protons, are given names defined in the default MAP file. These names, also more-or-less consistent with DIANA, are like the names of "pseudo-atoms" that have long been used to identify such partially assigned peaks, e.g. "QB" refers to the (HB2,HB3) combination in most residues, and "MG1" in valine refers collective to the three methyl protons at position CG1, etc.
  2. There are generally also molecule-specific ambiguities, arising from potential overlap in a NOESY spectrum. Here, the users assigns a unique name to each such ambiguity or overlap, and prepares a list of the potential assignments. The names are arbitrary, but might be constructed, for example, from the chemical shifts that identify the peak, e.g. "p_2.52" might identify the set of protons that could contribute to a peak at 2.52 ppm. The chemical shift list can be used to prepare a list of potential assignments, and these lists can often be pruned by comparison to approximate or initial structures.

The default and molecule-specific MAP files are combined into a single file, which is used, along with the 7-column restraint file, the the program makeDIST_RST to construct the actual sander input files. You should consult the help file for makeDIST_RST for more information. For example, here are some lines added to the MAP file for a recent TSRI refinement:

AMBIG n2:68 = HE 86 HZ 86
AMBIG n2:72 = HE 24 HD 24 HZ 24
AMBIG n2:73 = HN 81 HZ 13 HE 13 HD 13 HZ 24
AMBIG n2:78 = HN 76 HZ 13 HE 13 HZ 24
AMBIG n2:83 = HN 96 HN 97 HD 97 HD 91
AMBIG n2:86 = HD1 66 HZ2 66
AMBIG n2:87 = HN 71 HH2 66 HZ3 66 HD1 66

Here the spectrum name and peak number were used to construct a label for each ambiguous peak. Then, an entry in the restraint file might look like this:

123 GLY HN 0 AMB n2:68 5.5
indicating a 5.5 Å upper bound between the amide proton of Gly 123 and a second proton, which might be either the HE or HZ protons of residue 86. (The "zero" residue number just serves as a placeholder, so that there will be the same number of columns as for non-ambiguous restraints.) If it is possible that the ambigous list might not be exhaustive (e.g. if some protons have not been assigned), it is safest to set ialtd=1, which will allow "mistakes" to be present in the constraint list. On the other hand, if you want to be sure that every violation is "active", set ialtd=0.

If the -les flag is set, the program will prepare distance restraints for multiple copies (LES) simulations. In this case, the input pdb file is one without LES copies, i.e. with just a single copy of the molecule. The "lesfile" specified by this flag is created by the addles program, and contains a mapping from original atom numbers into the copy numbers used in the multiple-copies simulation.

Distance constraints defined here cannot span two different copy sets, i.e., there cannot be one atom of a constraint torsion that is in one multiple copy set, where the other atom from the same restraint is in another copy set. It is OK to have some atoms with single copies, and others with multiple copies in the same restraint.

Preparing torsion angle restraints: makeANG_RST

There are fewer "standards" for representing coupling constant information. We have followed the DIANA convention in the program makeANG_RST. This program takes as input a five-column torsion angle constraint file along with an AMBER pdb file of the molecule. It creates as output (to standard out) a list of constraints in RST format that is readable by AMBER.

Usage: makeANG_RST -help
makeANG_RST -pdb ambpdb_file [-con constraint] [-lib libfile]
[-les lesfile ]

The input torsion angle constraint file can be read from standard in or from a file specified by the -con option on the command line. The input constraint file should look something like this:

1 GUA PPA 111.5 144.0
2 CYT EPSILN 20.9 100.0
2 CYT PPA 115.9 134.2
3 THY ALPHA 20.4 35.6
4 ADE GAMMA 54.7 78.8
5 GLY PHI 30.5 60.3
6 ALA CHI 20.0 50.0
....
Lines beginning with "#" are ignored. The first column is the residue number, the second is the residue name (three letter code, or as defined in your own personal torsion library file). Only the first three letters of the residue name are used, so that DIANA files that contain residues like "ASP-" will be correctly interpreted. Third is the angle name (taken from the torsion library described below).The fourth column contains the lower bound, and the fifth column the upper bound. Additional material on the line is (presently) ignored.

Note: It is assumed that the lower bound and the upper bound define a region of allowed conformation on the unit circle that is swept out in a clockwise direction from . If the number in the lb column is greater than the the number in the ub column, 360o will successively be subtracted from the lb until . This preserves the clockwise definition of the allowed conformation space, while also making the number that specifies the lower bound less than the number that specifies the upper bound, as is required by AMBER. If this occurs, a warning message will be printed to stderr to notify the user that the data has been modified.

The angles that one can constrain in this manner are defined in the library file that can be optionally specified on the command line with the -lib flag, or the default library "tordef.lib" (written by Garry P. Gippert) will be used. If you wish to specify your own nomenclature, or add angles that are not already defined in the default file, you should make a copy of this file and modify it to suit your needs. The general format for an entry in the library is:

LEU PSI N CA C N+
where the first column is the residue name, the second column is the angle name that will appear in the input file when specifying this angle, and the last four columns are the atom names that define the torsion angle. When a torsion angle contains atom(s) from a preceding or succeeding residue in the structure, a "-" or "+" is appended to those atom names in the library, thereby specifying that this is the case. In the example above, the atoms that define PSI for LEU residues are the N, CA, and C atoms of that same LEU and the N atom of the residue after that LEU in the primary structure. Note that the order of atoms in the definition is important and should reflect that the torsion angle rotates about the two central atoms as well as the fact that the four atoms are bonded in the order that is specified in the definition.

If the first letter of the second field is "J", this torsion is assumed to be a J-coupling constraint. In that case, three additional floats are read at the end of the line, giving the A,B and C coefficients for the Karplus relation for this torsion. For example:

ALA JHNA H N CA HA 9.5 -1.4 0.3
will set up a J-coupling restraint for the HN-HA 3-bond coupling, assuming a Karplus relation with A,B, C as 9.5, -1.4 and 0.3. (These particular values are from Brüschweiler and Case, JACS 116: 11199 (1994).)

This program also supports pseudorotation phase angle constraints for prolines and nucleic acid sugars; each of these will generate restraints for the 5 component angles which correspond to the lb and ub values of the input psuedorotation constraint. In the torsion library, a pseudorotation definition looks like:

PSEUDO CYT PPA NU0 NU1 NU2 NU3 NU4
CYT NU0 C4' O4' C1' C2'
CYT NU1 O4' C1' C2' C3'
CYT NU2 C1' C2' C3' C4'
CYT NU3 C2' C3' C4' O4'
CYT NU4 C3' C4' O4' C1'
The first line describes that a PSEUDOrotation angle is to be defined for CYT that is called PPA and is made up of the four angles NU0-NU4. Then the definition for NU0-NU4 should also apper in the file in the same format as the example given above for LEU PSI.

PPA stands for Pseudorotation Phase Angle and is the angle that should appear in the input constraint file when using pseudorotation constraints. The program then uses the definition of that PPA angle in the library file to look for the 5 other angles (NU0-NU4 in this case) which it then generates restraints for. PPA for proline residues is included in the standard library as well as for the DNA nucleotides.

If the -les flag is set, the program will prepare torsion angle restraints for multiple copies (LES) simulations. In this case, the input pdb file is one without LES copies, i.e. with just a single copy of the molecule. The "lesfile" specified by this flag is created by the addles program, and contains a mapping from original atom numbers into the copy numbers used in the multiple-copies simulation.

Torsion angle constraints defined here cannot span two different copy sets, i.e., there cannot be some atoms of a particular torsion that are in one multiple copy set, and other atoms from the same torsion that are in other copy sets. It is OK to have some atoms with single copies, and others with multiple copies in the same torsion. The program will create as many duplicate torsions as there are copies.

A good alternative to interpreting J-coupling constants in terms of torsion angle restraints is to refine directly against the coupling constants themselves, using a appropriate Karplus relation. See the discussion of the variable RJCOEF, above.

Chirality restraints: makeCHIR_RST

Usage: makeCHIR_RST

We also find it useful to add chirality constraints and trans-peptide constraints (where appropriate) to prevent chirality inversions or peptide bond flips during the high-temperature portions of simulated annealing runs. The program makeCHIR_RST will create these constraints. Note that you may have to edit the output of this program to change trans peptide constraints to cis, as appropriate.

NOESY volume restraints: makeVOL_RST

Refinement directly against measured NOESY volume restraints is also possible in sander. The makeVOL_RST is used to prepare detailed input for this.

Usage: makeVOL_RST [-c cutoff] [-p ] [-v ]


Defaults: cutoff is 5. Ang., is INPDB, is PEAKS

The input files are as follows:

  1. the volume-file has one line per peak (free format); the first two columns of each line give the Amber atom numbers for the protons involved in the peaks. (Usual convention: use the last atom number for methyl or aromatic delta and epsilon protons.) The remaining columns give the intensities at the various mixing times.
    The *first line* of the volume-file should have the following (free format): nmix, (emix(i),i=1,nmix), where nmix is the number of mixing times and emix gives the mixing times (in seconds).
    This file would generally be created by running makeDIST_RST using the "-svf" (sander volume format) option. See the instructions for makeDIST_RST for more information.
  2. the pdb-file is a Brookhaven-format file (made by "anal" or "ambpdb",) giving atom names and coordinates. Atom numbers in this file must correspond to those used in the volume-file, and in the Amber runs themselves.

On output, stdout contains a file that, with some minimal hand editing, should serve as input to SANDER. More explicitly, it creates groups and &noeexp namelists that inlcude many of the variables you will need. Any missing variables should be added by hand. [I do this since it seems easier to hand-edit the output file than to arrange for a lot of pass-throughs from this program into the output file.]

During refinement, measured NOESY volumes are included in penalty functions that depend upon where is the experimentally measured value, and I is the value corresponding the current conformation; the functional form of the penalty depends upon the ipnlty variable. Careful experimentation will undoubtedly be required for each data set to define a reasonable penalty function. Simply weighting each observed peak equally (with the default values of awt and arange) is almost certainly a bad idea, since this effectively gives too much influence to the strong peaks at the expense of longer-range information.

Direct dipolar coupling restraints: makeDIP_RST

For simulations with residual dipolar coupling restraints, makeDIP_RST is a simple code to prepare the input file. Type makeDIP_RST -help to obtain a more detailed description of the usage. For now, this code only handles backbone NH and CaH data. The header specifying values for various parameters needs to be manually added to the output of makeDIP_RST.

Use of residual dipolar coupling restraints is new both for AMBER and for the general NMR community. Refinement against these data should be carried out with care, and the optimal values for the force constant, penalty function, and initial guesses for the alignment tensor components are still under investigation. Here are some suggestions from the experiences so far:

  1. Beware of overfitting the dipolar coupling data in the expense of AMBER force field energy. These dipolar coupling data are very sensitive to tiny changes in the structure. It is often possible to drastically improve the fitting by making small distortions in the backbone angles. We recommend inclusion of explicit angle restraints to enforce ideal backbone geometry, especially for those residues that have corresponding residual dipolar coupling data.
  2. The initial values for the cartesian components of the alignment tensor can influence the final structure and alignment if the structure is not fixed (ibelly = 0). For a fixed structure (ibelly = 1), these values do not matter. Therefore, the current "best" strategy is to fit the experimental data to the fixed starting structure, and use the alignment tensor[s] obtained from this fitting as the initial guesses for further refinement.
  3. AMBER is capable of simultaneously fitting more than one set of alignment data. This allows the use of individually obtained datasets with different alignment tensors. However, if the different sets of data have equal directions of alignment but different magnitudes, using an overall scaling factor for these data with a single alignment tensor could greatly reduce the number of fitting parameters.
  4. Because the dipolar coupling splittings depend on the square root of the order parameters (0 S2 \(<=1), these order parameters describing internal motion of individual residues are often neglected (N. Tjandra and A. Bax, Science 278, 1111-1113, 1997). However, the square root of a small number can still be noticeably smaller than 1, so this may introduce undesirable errors in the calculations.

Getting summaries of NMR violations

If you specify LISTOUT=POUT when running sander, the output file will contain a lot of detailed information about the remaining restraint violations at the end of the run. When running a family of structures, it can be useful to process these ouput files with sviol, which takes a list of sander output files on the command line, and sends a summary of energies and violations to STDOUT. If you have more than 20 or so structures to analyze, the output from sviol becomes unwieldy. In this case you may also wish to use sviol2, which prints out somewhat less detailed information, but which can be used on larger families of structures. The senergy script gives a more detailed view of force-field energies from a series of structures. (We thank the TSRI NMR community for helping to put these scripts together, and for providing many useful suggestions.)

Modifying the force field

In carrying out NMR refinement simulations, it is common to modify the standard force field. The ifstrp variable allows one to treat nonbonded interactions as soft repulsions with no electrostatic contributions. Since bonds and angles are kept close to ideal values by the force constants inherent in the standard force field, and since the intrinsic dihedral barriers for single bonds are also quite small, this provides a "generic" or simplified representation of the allowed conformational space that may appeal to some users.

Other users will wish to use force fields that incorporate more of what we know about relative conformational energies, i.e. a more elaborate force field. Even here, though, some modifications may be advisable. For example, modifications we have found useful for peptides/proteins include increasing the torsional force constant for the peptide bond (to reduce the tendency of restained simulations to produce badly distorted bonds) and a reduction in the net charge of charged side chains (to compensate in part for neglect of explicit solvent). These changes are incorporated into a database and a frcmod file in the src/nmr_aux/forcefield subdirectory; see the README file in that directory.

Time-averaged restraints.

The model of the previous sections involves the "simgle-average-structure" idea, and tries to fit all constraints to a single model, with minimal deviations. A generalization of this model treats distance constraints arising from from NOE crosspeaks (for example) as being the average distance determined from a trajectory, rather than as the single distance derived from an average structure. Time-averaged bonds and angles are calculated as

where

= time-averaged value of the internal coordinate (distance or angle)
t
= the current time
= the exponential decay constant
= the value of the internal coordinate at time t'
i
= average is over internals to the inverse of i. Usually i = 3 or 6 for NOE distances, and -1 (linear averaging) for angles and torsions.
C
= a normalization integral.

Time-averaged torsions are calculated as

where is the torsion, and and are calculated using the equation above with or substituted for r(t').

Forces for time-averaged restraints can be calculated either of two ways. This option is chosen with the DISAVI / ANGAVI / TORAVI commands (Section 1). In the first (the default),

(and analogously for y and z). The forces then correspond to the standard flat-bottomed well functional form, with the instantaneous value of the internal replaced by the time-averaged value. For example, when ,

and similarly for other ranges of .

When the second option for calculating forces is chosen (IINC = 1 on a DISAVI, ANGAVI or TORAVI card), forces are calculated as

For example, when ,

Integration of this equation does not give Equation (4), but rather a non-intuitive expression for the energy (although one that still forces the bond to the target range). The reason that it may sometimes be preferable to use this second option is that the term , which occurs in the exact expression [Eq. (3)], varies as . When i=3, this means the forces can be varying with the fourth power the distance, which can possibly lead to very large transient forces and instabilities in the molecular dynamics trajectory. [Note that this will not be the case when linear scaling is performed, i.e. when i=-1, as is generally the case for valence and torsion angles. Thus, for linear scaling, the default (exact) force calculation should be used].

It should be noted that forces calculated using Equation (5) are not conservative forces, and would cause the system to gradually heat up, if no velocity rescaling were performed. The temperature coupling algorithm should act to maintain the average temperature near the target value. At any rate, this heating tendency should not be a problem in simulations, such as fitting NMR data, where MD is being used to sample conformational space rather than to extract thermodynamic data.

This section has described the methods of time-averaged restraints. For more discussion, the interested user is strongly urged to consult recent studies:

  1. "Time Averaged Nuclear Overhauser Effect Distance Restraints Applied to Tendamistat" by A.E. Torda, R.M. Scheek & W.F. van Gunsteren (1989) J. Mol. Biol. 214, 223-235; and
  2. "Are Time-Averaged Restraints Necessary for NMR Refinement: A Model Study for DNA" by D.A. Pearlman and P.A. Kollman (1991) J. Mol. Biol. 220, 457-479.
  3. "Structure refinement using time-averaged J-coupling constant restraints", by A.E. Torda, R.M. Brunne, T. Huber, H. Kessler and W.F. van Gunsteren, (1993) J. Biomol. NMR 3, 55-66.
  4. "How well do time-averaged J-coupling restraints work?", by D.A. Pearlman (1994). J. Biomol. NMR 4, 279-299.
  5. "How is an NMR structure best defined? An analysis of molecular dynamics distance-based approaches", by D.A. Pearlman (1994) J. Biomol. NMR 4, 1-16.  

Multiple copies refinement using LES

NMR restraints can be made compatible with the multiple copies (LES) facility; see the following chapter for more information about LES. To use NMR constraints with LES, you need to do two things:

  1. Add a line like "file wnmr name=(lesnmr) wovr" to you input to addles. The filename (lesnmr in this example) may be whatever you wish. This will cause addles to output an additional file that is needed at the next step.
  2. Add "-les lesnmr" to the command line arguments to makeDIST_RST. This will read in the file created by addles containing information about the copies. All NMR restraints will then be interpreted as "ambiguous" restraints, so that if any of the copies satisifies the restraint, the penalty goes to zero.

Note that although this scheme has worked well on small peptide test cases, we have yet not used it extensively for larger problems. This should be treated as an experimental option, and users should use caution in applying or interpreting the results.

Some sample input files

The next few pages contain excerpts from some sample NMR refinement files used at TSRI. The first example just sets up a simple (but often effective) simulated annealing run. You may have to adjust the length, temperature maximum, etc. somewhat to fit your problem, but these values work well for many "ordinary" NMR problems.

+---------------------------------------------------------------------------+
|		   1.  Simulated annealing NMR refinement		    |
+---------------------------------------------------------------------------+
|

| | 15ps simulated annealing protocol | | &cntrl | | nstlim=15000, ntt=1, (time limit, temp. control) | | scee=1.2, (scee must be set - 1-4 scale factor) | | ntpr=500, pencut=0.1, (control of printout) | | ipnlty=1, nmropt=1, (NMR penalty function options) | | vlimit=10, (prevent bad temp. jumps) | | ntb=0, (non-periodic simulation) | | igb=3, (don't use pme, do use r dielectric) | | &end | |# | |# Simple simulated annealing algorithm: | |# | |# from steps 0 to 1000: raise target temperature 10->1200K | |# from steps 1000 to 3000: leave at 1200K | |# from steps 3000 to 15000: re-cool to low temperatures | |# | | &wt type='TEMP0', istep1=0,istep2=1000,value1=10., | | value2=1200., &end | | &wt type='TEMP0', istep1=1001, istep2=3000, value1=1200., | | value2=1200.0, &end | | &wt type='TEMP0', istep1=3001, istep2=15000, value1=0., | | value2=0.0, &end | |# | |# Strength of temperature coupling: | |# steps 0 to 3000: tight coupling for heating and equilibration | |# steps 3000 to 11000: slow cooling phase | |# steps 11000 to 13000: somewhat faster cooling | |# steps 13000 to 15000: fast cooling, like a minimization | |# | | &wt type='TAUTP', istep1=0,istep2=3000,value1=0.2, | | value2=0.2, &end | | &wt type='TAUTP', istep1=3001,istep2=11000,value1=4.0, | | value2=2.0, &end | | &wt type='TAUTP', istep1=11001,istep2=13000,value1=1.0, | | value2=1.0, &end | | &wt type='TAUTP', istep1=13001,istep2=14000,value1=0.5, | | value2=0.5, &end | | &wt type='TAUTP', istep1=14001,istep2=15000,value1=0.05, | | value2=0.05, &end | | | | (continued on next page) | +---------------------------------------------------------------------------+

+---------------------------------------------------------------+
|      1.  Simulated annealing NMR refinement (continued)	|
+---------------------------------------------------------------+
|#								|
|#  "Ramp up" the restraints over the first 3000 steps:		|
|#								|
| &wt type='REST', istep1=0,istep2=3000,value1=0.1,		|
|	     value2=1.0,  &end					|
| &wt type='REST', istep1=3001,istep2=15000,value1=1.0,		|
|	     value2=1.0,  &end					|
|								|
| &wt type='END'  &end						|
|LISTOUT=POUT		       (get restraint violation list)	|
|DISANG=RST.f		       (file containing NMR restraints) |
+---------------------------------------------------------------+

The next example just shows some parts of the actual RST file that sander would read. This file would ordinarily not be made or edited by hand; rather, run the programs makeDIST_RST, makeANG_RST and makeCHIR_RST, combining the three outputs together to construct the RST file.

+--------------------------------------------------------------------+
|	    2.	Part of the RST.f file referred to above	     |
+--------------------------------------------------------------------+
|

| |# first, some distance constraints prepared by makeDIST_RST: | |# (comment line is input to makeRST, &rst namelist is output) | |# | |#( proton 1 proton 2 upper bound) | |#--------------------------------------------- | |# | |# 2 ILE HA 3 ALA HN 4.00 | |# | | &rst iat= 23, 40, r3= 4.00, r4= 4.50, | | r1 = 1.3, r2 = 1.8, rk2=0.0, rk3=32.0, ir6=1, &end | |# | |# 3 ALA HA 4 GLU HN 4.00 | |# | | &rst iat= 42, 50, r3= 4.00, r4= 4.50, &end | |# | |# 3 ALA HN 3 ALA MB 5.50 | |# | | &rst iat= 40, -1, r3= 6.22, r4= 6.72, | | igr1= 0, 0, 0, 0, igr2= 44, 45, 46, 0, &end | |# | |# .......etc...... | |# | |# next, some dihedral angle constraints, from makeANG_RST: | |# | | &rst iat= 213, 215, 217, 233, r1=-190.0, | | r2=-160.0, r3= -80.0, r4= -50.0, &end | | | | &rst iat= 233, 235, 237, 249, r1=-190.0, | | r2=-160.0, r3= -80.0, r4= -50.0, &end | | | |# .......etc....... | | | +--------------------------------------------------------------------+

+--------------------------------------------------------------------+
|     2.  Part of the RST.f file referred to above (continued)	     |
+--------------------------------------------------------------------+
|#								     |
|#								     |
|#   next, chirality and omega constraints prepared by makeCHIR_RST: |
|#								     |
|#								     |
|#  chirality for residue  1  atoms:   CA CG HB2 HB3		     |
| &rst iat= 3 , 8 , 6 , 7 ,					     |
|   r1=10., r2=60., r3=80., r4=130., rk2 = 10., rk3=10.,  &end	     |
|#								     |
|#  chirality for residue  1  atoms:   CB SD HG2 HG3		     |
| &rst iat= 5 , 11 , 9 , 10 , &end				     |
|#								     |
|#  chirality for residue  1  atoms:   N C HA CB		     |
| &rst iat= 1 , 18 , 4 , 5 , &end				     |
|#								     |
|#  chirality for residue  2  atoms:   CA CG2 CG1 HB		     |
| &rst iat= 22 , 26 , 30 , 25 , &end#				     |
|#								     |
|   ......etc........						     |
|								     |
|#  trans-omega constraint for residue	2			     |
| &rst iat= 22 , 20 , 18 , 3 ,					     |
|   r1=155., r2=175., r3=185., r4=205., rk2 = 80., rk3=80.,  &end    |
|#								     |
|#  trans-omega constraint for residue	3			     |
| &rst iat= 41 , 39 , 37 , 22 , &end				     |
|#								     |
|#  trans-omega constraint for residue	4			     |
| &rst iat= 51 , 49 , 47 , 41 , &end				     |
|#								     |
|#  ......etc........						     |
|#								     |
+--------------------------------------------------------------------+

The next example is an input file for volume-based NOE refinement. As with the distance/angle RST file shown above, the user would generally not construct this file, but create it from a "7-column" file using the makeVOL_RST program. Hand-editing might be used at the top of the file, to change the correlation times, etc.

+---------------------------------------------------------------------------------+
|		      3.  Sample NOESY intensity input file			  |
+---------------------------------------------------------------------------------+
|# A part of a NOESY intensity file, made by makeVOL_RST :			  |
|										  |
| &noeexp									  |
|  id2o=1,		  (exchangeable protons removed)			  |
|  oscale=6.21e-4,	  (scale between exp. and calc. intensity units)	  |
|  taumet=0.04,		  (correlation time for methyl rotation, in ns.)	  |
|  taurot=4.2,		  (protein tumbling time, in ns.)			  |
|  NPEAK = 13*3,	  (three peaks, each with 13 mixing times)		  |
|  EMIX = 2.0E-02,  3.0E-02,  4.0E-02,	5.0E-02,  6.0E-02,			  |
|     8.0E-02,	0.1,  0.126,  0.175,  0.2,  0.25,  0.3,	 0.35,			  |
|			  (mixing times, in sec.)				  |
|  IHP(1,1) = 13*423,  IHP(1,2) = 13*1029,  IHP(1,3) = 13*421,			  |
|			  (number of the first proton)				  |
|  JHP(1,1) = 78*568,  JHP(1,2) = 65*1057,  JHP(1,3) = 13*421,			  |
|			  (number of the second proton)				  |
|  AEXP(1,1) = 5.7244,	7.6276,	 7.7677,  9.3519,				  |
|	       10.733,	15.348,	 18.601,					  |
|	       21.314,	26.999,	 30.579,					  |
|	       33.57,  37.23,  40.011,						  |
|			  (intensities for the first cross-peak)		  |
|  AEXP(1,2) =	8.067,	11.095,	 13.127,  18.316,				  |
|	       22.19,	26.514,	 30.748,					  |
|	       39.438,	44.065,	 47.336,					  |
|	       54.467,	56.06,	 60.113,					  |
|  AEXP(1,3) =	7.708,	13.019,	 15.943,  19.374,				  |
|	       25.322,	28.118,	 35.118,					  |
|	       40.581,	49.054,	 53.083,					  |
|	       56.297,	59.326,	 62.174,					  |
| &end										  |
|SUBMOL1									  |
|RES 27 27 29 29 39 41 57 57 70 70 72 72 82 82	  (residues in this submol)	  |
|END										  |
|END										  |
|										  |
+---------------------------------------------------------------------------------+

Next, we illustrate the form of the file that holds residual dipolar coupling restraints. Again, this would generally be created from a human-readable input using the program makeDIP_RST.

+-----------------------------------------------------------------+
|   5.	Residual dipolar restraints, prepared by makeDIP_RST:	  |
+-----------------------------------------------------------------+
| &align							  |
|   ndip=91, dcut=-1.0, fj=0.1, dfac_al=37*-0.46623, 54*-0.21341, |
|   s11=3.883, s22=53.922, s12=33.855, s13=-4.508, s23=-0.559,	  |
|   id(1)=188,	 jd(1)=189,  dobs(1)=  6.24,			  |
|   id(2)=208,	 jd(2)=209,  dobs(2)= -10.39,			  |
|   id(3)=243,	 jd(3)=244,  dobs(3)= -8.12,			  |
|								  |
|   ....							  |
|   id(91)=1393,   jd(91)=1394,	 dobs(91)= -19.64,		  |
| &end								  |
+-----------------------------------------------------------------+

Finally, we show how the detailed input to sander could be used to generate a more complicated restraint. Here is where the user would have to understand the details of the RST file, since there are no "canned" programs to create this sort of restraint. This illustrates, though, the potential power of the program.

+---------------------------------------------------------------------------------------+
|			   5.  A more complicated constraint				|
+---------------------------------------------------------------------------------------+
|											|
|# 1)	 Define two centers of mass. COM1 is defined by					|
|#	 {C1 in residue 1; C1 in residue 2; N2 in residue 3; C1 in residue 4}.		|
|#	 COM2 is defined by {C4 in residue 1; O4 in residue 1;	N* in residue 1}.	|
|#	 (These definitions are effected by the igr1/igr2 and grnam1/grnam2		|
|#	 variables; You can use up to 200 atoms to define a center-of-mass		|
|#	 group)										|
|#											|
|# 2)	 Set up a distance restraint between COM1 and COM2 which goes from a		|
|#	 target value of 5.0A to 2.5A, with a force constant of 1.0, over steps 1-5000. |
|#											|
|# 3)	 Set up a distance restraint between COM1 and COM2 which remains fixed		|
|#	 at the value of 2.5A as the force slowly constant decreases from		|
|#	 1.0 to 0.01 over steps 5001-10000.						|
|#											|
|# 4)	 Sets up no distance restraint past step 10000, so that free (unrestrained)	|
|#	 dynamics takes place past this step.						|
|#											|
|											|
| &rst iat=-1,-1, nstep1=1,nstep2=5000,							|
|    iresid=1,irstyp=0,ifvari=1,ninc=0,imult=0,ir6=0,ifntyp=0,				|
|    r1=0.00000E+00,r2=5.0000,r3=5.0000,						|
|	r4=99.000,rk2=1.0000,rk3=1.0000,						|
|    r1a=0.00000E+00,r2a=2.5000,r3a=2.5000,						|
|	r4a=99.000,rk2a=1.0000,rk3a=1.0000,						|
|    igr1 = 2,3,4,5,0,									|
|    grnam1(1)='C1',grnam1(2)='C1',grnam1(3)='N2',grnam1(4)='C1',			|
|    igr2 = 1,1,1,0,									|
|    grnam2(1)='C4',grnam2(2)='O4',grnam2(3)='N*',					|
| &end											|
| &rst iat=-1,-1, nstep1=5001,nstep2=10000,						|
|    iresid=1,irstyp=0,ifvari=1,ninc=0,imult=0,ir6=0,ifntyp=0,				|
|    r1=0.00000E+00,r2=2.5000,r3=2.5000,						|
|	r4=99.000,rk2=1.0000,rk3=1.0000,						|
|    r1a=0.00000E+00,r2a=2.5000,r3a=2.5000,						|
|	r4a=99.000,rk2a=1.0000,rk3a=0.0100,						|
|    igr1 = 2,3,4,5,0,									|
|    grnam1(1)='C1',grnam1(2)='C1',grnam1(3)='N2',grnam1(4)='C1',			|
|    igr2 = 1,1,1,0,									|
|    grnam2(1)='C4',grnam2(2)='O4',grnam2(3)='N*',					|
| &end											|
|											|
+---------------------------------------------------------------------------------------+

\


[Contents] [Previous] [Next]
Updated on January 5, 2000. Comments to case@scripps.edu