Product Name - Current File Formats

Element records contain the various types of information about the elements (Table 4).


# code vdw radius cov. radius min val max val common weight

Element Record

Table 4 . Element Record Definition
Contents	Comment
element	Record identifier
element code	One or two letter element name
vdw radius	van der Waals radius of the element
covalent radius	Covalent radius of the element
minimum valence	Minimum number of bonds allowed for the element
maximum valence	Maximum number of bonds allowed for the element
common valence	Number of bonds in the most common state
atomic weight	Atomic weight of the element

The bond record specifies the bond lengths between various elements. All the bond records must come after the element records. The bond lengths for all lone pairs are assumed to be 1.1 Å.


element H   1.10        0.32    1.0    1.0    1.0        1.008
element C   1.55        0.77    4.0    4.5    4.0        12.011
element N   1.40        0.75    2.5    5.0    3.0        14.007

Bond Record

Table 5 . Bond Record Definition
Contents	Comment
bond	Record identifier
element code	One- or two-letter element name
element code	One- or two-letter element name
bond length	Bond length in angstroms

Free-format files, with the .frm extension, should be stored in and used from the $BIOSYM/data/insight directory. If you do not have write permission in this directory, you can set the INSIGHT_DATA environment variable to another location (for example, with setenv) before you start Insight.


bond H  N        1.03 

bond C  C        1.54 

bond C  O        1.43 

bond L  S        1.1
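The element and bond records above can be read with a few lines of code. The following is a minimal sketch, not the parser used by Insight: it assumes one record per line, whitespace-separated values, and comment lines that begin with #. The names Element and parse_records are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Element:
    code: str
    vdw_radius: float
    covalent_radius: float
    min_valence: float
    max_valence: float
    common_valence: float
    atomic_weight: float


def parse_records(text):
    """Return (elements, bonds) parsed from element and bond records."""
    elements = {}
    bonds = {}
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens or tokens[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if tokens[0] == "element":
            if bonds:
                # all bond records must come after the element records
                raise ValueError("element record found after a bond record")
            code = tokens[1]
            elements[code] = Element(code, *map(float, tokens[2:8]))
        elif tokens[0] == "bond":
            # a bond is stored under an unordered pair of element codes
            bonds[frozenset((tokens[1], tokens[2]))] = float(tokens[3])
    return elements, bonds


sample = """\
# code vdw radius cov. radius min val max val common weight
element H   1.10   0.32   1.0   1.0   1.0   1.008
element C   1.55   0.77   4.0   4.5   4.0   12.011
element N   1.40   0.75   2.5   5.0   3.0   14.007
bond H  N   1.03
bond C  C   1.54
"""
elements, bonds = parse_records(sample)
print(elements["C"].atomic_weight)
print(bonds[frozenset(("H", "N"))])
```

Storing each bond under an unordered pair means a lookup for N-H finds the same length as one for H-N; whether the real reader treats the two element codes as unordered is an assumption.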

Free-Format Files (.frm)

Format files consist of sequentially executed commands from the set described below. All but the FORMAT and BOND_TABLE commands are a single line. The commands in the group beginning FORMAT_ are followed by any number of field specifiers and terminated by an END_FORMAT command. All commands and field specifiers must appear exactly as listed here; no abbreviations or lower case letters are allowed.

IGNORE_FOR number

On input: Skip the given number of lines. Useful for skipping a fixed length header section (a line of the header can be read as the TITLE field).

On output: No function on output.

IGNORE_TO [start] string

On input: Skip lines until a line is encountered which contains the given string. The optional start parameter may be used to indicate in what column testing for the match should begin, or may be * to check for a match anywhere in the line. Note that column numbering begins at 1 for the leftmost column. If no start column is supplied, matching begins in column 1. This command might be used to get down to the ATOM section of a pdb file.

On output: No function on output.

IGNORE_WHILE [start] string

On input: Skip lines while they match the string given. The optional start parameter functions as in IGNORE_TO.

On output: No function on output.

MARKER [start] string

On input: Skip one line from the input file.

On output: Output the given string starting in the column specified or column 1 if no start is given. An asterisk (*) given for the start column is interpreted as column 1 in this command. Used for such things as END markers that separate atom and connectivity sections of the file.

FORMAT_FOR number

On input: Read the specified number of lines using the format that follows. The number field is often a symbolic variable such as $NUM_ATOMS filled by an earlier read.

On output: Write the specified number of lines using the given format. When fields corresponding to atom data are included, the data come from the list of specified atoms starting at the beginning and advancing one atom each time the format is applied. If a symbolic variable like $NUM_ATOMS is used, then it is evaluated to the number of atoms in the object specified in the put command.

FORMAT_TO [start] string

On input: Read input lines using the format until a line containing the given string is encountered. The line with the matching string is not processed. This type of read can be used in conjunction with the MARKER command for files with sections separated by markers such as END.

On output: Write out the information in the format for every atom specified in the put command.

FORMAT_WHILE [start] string

On input: Read input lines until encountering a line whose initial characters do not match the given string. Reading stops so that this non-matching line is the next line to be read. This type of read is designed for pdb-style files where sections are delimited by different keywords. The length and presence of the string do not affect the columns of the format specification.

On output: Write out the information in the format for every atom in the list. The string does not automatically appear in the line being written but may be output using a MARKER field in the format.

FORMAT_TO_EOF

On input: Read input lines using the format until the end of the input file is encountered. This type of read should be the last in a format description file, since all subsequent reads will fail.

On output: Write out the information in the format for every atom in the list.

END_FORMAT

On input: Marks end of a format specification.

On output: Marks end of a format specification.

BIDIRECTIONAL_BONDS

On input: Lets the system know that all bonds are listed twice, once in each direction. Insight .mdf files use this convention.

On output: Lets the system know that all bonds are listed twice, once in each direction. Insight .mdf files use this convention.

Fields

Fields are described by a keyword corresponding to some piece of information about the atom or molecule, a start position in the input line where this information is found, and a field length indicating how many characters are to be read/written for this field. The start position and/or field length may be an asterisk indicating space/comma-delimited fields. Floating-point fields such as ATOM_X may have an optional number of decimal places in their length specifier. The format is length.decimal_length.

The regular field types are described in Table 6.

Table 6 . Field Type
Field Name	Description	Type	Examples
ATOM_NAME	Name of atom (5 characters max) Can be referenced later to add more information to an already defined atom. It is therefore important that atom names be unique.	(string)	CD1,N
ATOM_NUMBER	User defined atom number, not necessarily the same as the Insight sequence number. Can be referenced later, for example when processing a connectivity section. If no atom numbers are defined they are set to the sequence numbers.	(integer)	1,12,32
ATOM_X	X coordinate of atom.	(float)
ATOM_Y	Y coordinate of atom.	(float)
ATOM_Z	Z coordinate of atom.	(float)
CELL_A,CELL_B,CELL_C	Unit cell dimensions. The presence of any of these fields causes subsequent atom coordinates to be read or written as fractional space coordinates.	(float)
ALPHA,BETA, GAMMA	Unit cell angles in degrees. The presence of any of these fields causes subsequent atom coordinates to be read or written as fractional space coordinates.	(float)
BOND_ORDER	Bond order for a corresponding bond. The codes used to indicate various types of bonds may be defined in an optional BOND_TABLE.	(integer)
BOND_FROM_NAME	Name of the first of a pair of atoms to be connected. It has the special significance of advancing to the next bond record when encountered during output.	(string)
BOND_FROM_NUMBER	Number of the first of a pair of atoms to be connected. It has the special significance of advancing to the next bond record when encountered during output.	(integer)
BOND_NUMBER	Sequential number for each bond entry list. No function on input.	(integer)
BOND_TO_NAME	Name of the second of a pair of atoms that are to be connected.	(string)	HD1
BOND_TO_NUMBER	Number of the second of a pair of atoms to be connected.	(integer)
CHARGE	Atom charge.	(float)
ELEMENT_NAME	Element type of an atom. Converted to an element code by Insight.	(string)	C,H,Ca,Br
ELEMENT_NUMBER	Periodic table index of the element type of an atom.	(integer)
GROUP	Charge group name.	(string)
OCCUPANCY	Occupancy factor for atom.	(float)
POTENTIAL_TYPE	Potential atom type for an atom (7 characters max).	(string)	c=,hs
RESIDUE_TYPE	The type of the current monomer/residue (4 characters max).	(string)	GLY,ARGn
RESIDUE_NUMBER	The monomer/residue sequence number of the current monomer/residue, including an optional chain code and alternate sequence indicator. This is also called the monomer/residue name in Insight parlance (7 characters max).	(string)	1,A12,C172A
SPACE_GROUP	The name of the crystallographic space group for the molecule. The presence of any of these fields causes subsequent atom coordinates to be read or written as fractional space coordinates.	(string)	P 1, C m c 2_1, R 3b
TEMP_FACTOR	Temperature factor for atom.	(float)
TITLE	Title for the system.	(string)
There are several special fields:
DEFINE_ATOM	Used to indicate that this format should cause a new atom to be created each time the format is executed. This would usually be true for the first format of a format file but then not used in subsequent sections, such as connectivity, that refer to already defined atoms. Note: It is a common error to neglect to include DEFINE_ATOM in at least one format.
NEXT_LINE	This means we are to skip to the next line and read it using the fields defined in the rest of the format. On output a new line is started, and output of the subsequent fields is on that line. There can be any number of NEXT_LINEs in a format.
NEXT_ATOM	On input: This means we are to skip to the next atom and save further information in it. This could involve creating a new atom if we are executing a DEFINE_ATOM format but it is more geared to situations where you might have 5 charges per line. On output: Skip to the next atom and take any further output from it. Terminate execution of format if we run off the end of the atom list.
NUM_ATOMS	Variable to hold the number of atoms. Once read it can be used as $NUM_ATOMS in a FORMAT_FOR command. When part of a format during output, the number of selected atoms is written.
NUM_BONDS	Variable used to hold the total number of connections defined. It is often used to control the number of reads in the connectivity section. On output it is the sum of all the connections for all selected atoms, each bond being counted only once.
SPACES n	For doing the equivalent of the X format in FORTRAN. Advances the current character position in the input/output line by the number of spaces specified. This has no effect if the next field has an absolute start column. SPACES is useful for implementing FORTRAN formats in the following style: (I4,1X,A4,1X,1X,3(F9.5,1X)..... is written as: ATOM_NAME * 4 SPACES 1 ATOM_NUMBER * 4 SPACES 2 ATOM_X * 9 SPACES 1

Format Descriptions

A format description consists of one of the format commands, followed by a variable number of field definitions and terminated by an END_FORMAT command. When processing a format there is a notion of current position which is important for delimited reading. The initial position is at the start of the input line. As the program reads each of the defined fields, it starts at either the current position, if the start of the field is *, or moves to the given column if one is supplied. If the field length is *, then reading continues until a delimiter is encountered (if a non-digit is found in an integer or float field, reading stops there as if it were a delimiter). If the field length is explicitly given then reading continues for exactly that many characters or until end of line, whichever happens first.
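The reading rules in this section can be illustrated with a short sketch. It is an approximation under stated assumptions, not the Insight implementation: the function name read_field is hypothetical, space and comma are taken as the only delimiters, and the rule that a non-digit ends an integer or float field is omitted.

```python
def read_field(line, pos, start, length):
    """Return (text, new_pos) for one field.

    start and length are strings as written in a format file: a number or "*".
    Columns in the format file are 1-based; pos is a 0-based index into line.
    """
    if start != "*":
        pos = int(start) - 1  # absolute column given: move to it
    if length == "*":
        # delimited read: skip leading delimiters, then read up to the next one
        while pos < len(line) and line[pos] in " ,":
            pos += 1
        end = pos
        while end < len(line) and line[end] not in " ,":
            end += 1
    else:
        width = int(length.split(".")[0])  # decimal places are ignored on input
        end = min(pos + width, len(line))  # stop early at end of line
    return line[pos:end].strip(), end


# A line laid out to match the column positions of the pdblike sample format.
line = "ATOM  %5d  %-3s %-3s  %4d    %8.3f%8.3f%8.3f" % (1, "N", "GLY", 1, 1.0, 2.5, -3.25)
fields = [
    ("ATOM_NUMBER", "7", "5"),
    ("ATOM_NAME", "14", "3"),
    ("ATOM_X", "31", "8.3"),
    ("ATOM_Y", "*", "8.3"),
    ("ATOM_Z", "*", "8.3"),
]
values = {}
pos = 0  # the initial position is the start of the input line
for name, start, length in fields:
    values[name], pos = read_field(line, pos, start, length)
print(values)
```

ATOM_Y and ATOM_Z use * as the start column, so each is read from wherever the previous field ended. That is the current-position behavior described above.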

Bond Code Table

To accommodate a wide variety of bond order representations, the free format utility allows definition of a bond code table. This table allows you to associate the bond order codes of the file format being read/written with the Insight II bond orders. A table giving the code for any or all of the Insight II bond orders (SINGLE, DOUBLE, TRIPLE, PART_DOUBLE) is given in the following example:


BOND_TABLE

PART_DOUBLE			1.5

TRIPLE			4

END_TABLE

When using this example during input, a BOND_ORDER field with the value of 1.5 is interpreted as a partial double bond. On output the BOND_ORDER for a triple bond is written as 4.

If no bond table is given, the default bond order codes are:


1=SINGLE

2=DOUBLE

3=TRIPLE

4=PART_DOUBLE
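As an illustration, a bond code table can be treated as a set of overrides applied to the default codes. This sketch assumes that bond orders not listed in the table keep their default codes, which the text above does not state explicitly. The names DEFAULT_CODES and parse_bond_table are hypothetical.

```python
# Default codes, used when no bond table is given.
DEFAULT_CODES = {"SINGLE": "1", "DOUBLE": "2", "TRIPLE": "3", "PART_DOUBLE": "4"}


def parse_bond_table(lines):
    """Return a dict mapping each Insight II bond order to its file code."""
    codes = dict(DEFAULT_CODES)
    inside = False
    for raw in lines:
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0] == "BOND_TABLE":
            inside = True
        elif tokens[0] == "END_TABLE":
            inside = False
        elif inside:
            codes[tokens[0]] = tokens[1]  # override the code for this bond order
    return codes


codes = parse_bond_table(["BOND_TABLE", "PART_DOUBLE 1.5", "TRIPLE 4", "END_TABLE"])
by_code = {code: order for order, code in codes.items()}  # used when reading
print(codes["TRIPLE"])
print(by_code["1.5"])
```

On output the codes dictionary gives the text to write for each bond order; on input the inverted by_code dictionary turns a BOND_ORDER value back into a bond order.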

Fractional Coordinates

The free format utility reads and writes atom coordinates as fractional space coordinates if any of the following fields are encountered in the format file: CELL_A, CELL_B, CELL_C, ALPHA, BETA, GAMMA, or SPACE_GROUP. If none of these field types are encountered, Cartesian space coordinates are assumed when reading and writing ATOM_X, ATOM_Y, and ATOM_Z fields.

Atom Creation vs. Referencing Existing Atoms

Whether a new atom is created depends on whether a format contains a DEFINE_ATOM statement. Every execution of a format containing DEFINE_ATOM creates a new entry in the atom list. Subsequent values from the fields of the format are saved in that atom. When no DEFINE_ATOM is present in a format, then each time an ATOM_NAME or ATOM_NUMBER field is read it is taken as a reference to an existing atom. This is why uniqueness of atom names and numbers can be important.

Important in this system is the notion of current atom. When you begin processing a file there are no atoms defined and hence the current atom is null. After processing a format in which atoms were defined, the atom list is non-empty. Before a subsequent format is processed, the current atom is set to the first atom of the atom list. After every application of the format, the current atom advances to the next in the list. This automatic stepping down the atom list provides for an implicit correspondence between different sections of an input file. The most common example is a file that has an atom definition section, and then a connectivity section where the lines correspond sequentially to the atoms in the atom section. When atom names or numbers are explicitly specified, an attempt is made to find that atom in the existing atom list and make it the current atom.

These rules are:


If there is a DEFINE_ATOM in the format
		{
		create a new atom and make it the current atom
		}
else if there is an atom number/name in the format
		{
		find the specified atom and make it the current atom
		}
Add the fields read to the current atom
Advance the current atom to the next in the list
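The rules above can be sketched as a small class. This is an illustrative model only; the class AtomList and its methods are hypothetical names, atoms are represented as plain dictionaries, and the error handling is an assumption.

```python
class AtomList:
    def __init__(self):
        self.atoms = []      # each atom is a dict of field name -> value
        self.current = None  # index of the current atom; None means no atoms yet

    def start_format(self):
        """Before a format is processed, point at the first atom, if any."""
        self.current = 0 if self.atoms else None

    def apply(self, fields, define_atom):
        """Apply one execution of a format; fields maps field names to values."""
        if define_atom:
            self.atoms.append({})                # create a new atom ...
            self.current = len(self.atoms) - 1   # ... and make it current
        else:
            for key in ("ATOM_NUMBER", "ATOM_NAME"):
                if key in fields:
                    matches = [i for i, atom in enumerate(self.atoms)
                               if atom.get(key) == fields[key]]
                    if not matches:
                        raise LookupError(f"no atom with {key} = {fields[key]!r}")
                    self.current = matches[0]    # explicit reference wins
                    break
        if self.current is None or self.current >= len(self.atoms):
            raise IndexError("ran off the end of the atom list")
        self.atoms[self.current].update(fields)  # add the fields read
        self.current += 1                        # advance to the next atom


atoms = AtomList()
atoms.start_format()                             # atom definition section
atoms.apply({"ATOM_NUMBER": 1, "ATOM_NAME": "N"}, define_atom=True)
atoms.apply({"ATOM_NUMBER": 2, "ATOM_NAME": "CA"}, define_atom=True)
atoms.start_format()                             # section with implicit ordering
atoms.apply({"CHARGE": -0.5}, define_atom=False)
atoms.apply({"CHARGE": 0.1}, define_atom=False)
atoms.start_format()                             # section with explicit names
atoms.apply({"ATOM_NAME": "CA", "OCCUPANCY": 1.0}, define_atom=False)
print(atoms.atoms)
```

The second section shows the implicit correspondence: the charges carry no atom name or number, so they are assigned to the atoms in list order.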

Sample Free-Format Files

syblike Example

Following is the sample format file syblike.frm, for doing free_format input/output of syblike files.


# SYBLIKE.FRM

# Format file for doing free_format input/output of syblike files

#

IGNORE_WHILE "*"



# interpret bond order of 5 as partial double

BOND_TABLE

PART_DOUBLE		5

END_TABLE



# number of atom records

FORMAT_FOR 1

NUM_ATOMS			1	4

MARKER			6	"MOL"

TITLE			12	100

END_FORMAT



# atom records

FORMAT_FOR $NUM_ATOMS

DEFINE_ATOM

ATOM_NUMBER			1	4

ELEMENT_NUMBER			5	4

ATOM_X			9	9

ATOM_Y			18	9

ATOM_Z			27	9

ATOM_NAME			36	4

END_FORMAT



IGNORE_WHILE "*"



# number of bond records

FORMAT_FOR 1

NUM_BONDS			1	4

MARKER			6	"MOL"

END_FORMAT



# bond records

FORMAT_FOR $NUM_BONDS

BOND_NUMBER			*	4

BOND_FROM_NUMBER			*	4

BOND_TO_NUMBER			*	4

SPACES				9

BOND_ORDER		*	4

END_FORMAT



MARKER "0    MOL"

chemdlike Example

Following is the sample format file chemdlike.frm, for free format input/output of chemdlike files:


# CHEMDLIKE.FRM

# chemdlike format file for free format input/output

#

BIDIRECTIONAL_BONDS

LINE_LENGTH  85



IGNORE_WHILE "*"



# cell parameters

FORMAT_FOR 1

CELL_A			39	8

CELL_B			47	8

CELL_C			55	8

END_FORMAT

FORMAT_FOR 1

ALPHA			22	8

BETA			30	8

GAMMA			38	8

END_FORMAT



# number of atoms 

FORMAT_FOR 1

NUM_ATOMS			1	4

TITLE			10	60

END_FORMAT



MARKER "       Title2 not used"

FORMAT_FOR $NUM_ATOMS

DEFINE_ATOM

# special marker string to put 0's in all bond_to fields that

# will not be filled with actual bonds

MARKER  		42	"   0   0   0   0   0   0   0   0"

ATOM_NUMBER			1	4

ATOM_NAME			6	4

ATOM_X			12	9

ATOM_Y			22	9

ATOM_Z			32	9

BOND_TO_NUMBER			42	4

BOND_TO_NUMBER			*	4

BOND_TO_NUMBER			*	4

BOND_TO_NUMBER			*	4

BOND_TO_NUMBER			*	4

BOND_TO_NUMBER			*	4

BOND_TO_NUMBER			*	4

BOND_TO_NUMBER			*	4

CHARGE			75	7.3

# marker for atom group field

MARKER			85	"1"

END_FORMAT

pdblike Example

Following is the sample format file pdblike.frm, for doing free format input/output of pdblike files:


# PDBLIKE.FRM                                  Revised 7/13/89

# Format file for doing free format input/output of pdblike files

#

# NOTE: Since the connectivity section may contain lines with

# fewer bonds than the possible four, there may be messages about

# inability to find atoms to connect to.



IGNORE_TO "ATOM"



FORMAT_WHILE "ATOM"

DEFINE_ATOM

MARKER			1	"ATOM"

ATOM_NUMBER			7	5

ATOM_NAME			14	3

RESIDUE_TYPE			18	3

RESIDUE_NUMBER			23	4

ATOM_X			31	8.3

ATOM_Y			*	8.3

ATOM_Z			*	8.3

END_FORMAT



MARKER "TER"



# PDB files specify bonds twice, once in each direction, so we

# need to set the bidirectional bonds flag

BIDIRECTIONAL_BONDS



FORMAT_WHILE "CONECT"

MARKER			1	"CONECT"

BOND_FROM_NUMBER			7	5

BOND_TO_NUMBER			*	5

BOND_TO_NUMBER			*	5

BOND_TO_NUMBER			*	5

BOND_TO_NUMBER			*	5

END_FORMAT

mdllike Example

Following is the sample format file mdllike.frm, for doing free format input/output of mdllike files:


# MDLLIKE.FRM

# Format file for doing free format input/output of mdllike files



FORMAT_FOR 1

TITLE		1	80

END_FORMAT



#Molecule Header

MARKER		""



#Comments

MARKER  "File Written using Insight Free Format Output"




#number of atoms and bonds

FORMAT_FOR 1

NUM_ATOMS 			1	3

NUM_BONDS			*	3

END_FORMAT



#atom records

FORMAT_FOR $NUM_ATOMS

DEFINE_ATOM

ATOM_X			*	10.4

ATOM_Y			*	10.4

ATOM_Z			*	10.4

SPACES 		1

ELEMENT_NAME			*	3

#NOTE: we cannot do the charges because they are coded in a

#non-standard way

END_FORMAT



#bond records

FORMAT_FOR $NUM_BONDS

BOND_FROM_NUMBER			*	3

BOND_TO_NUMBER			*	3

BOND_ORDER			*	3.0

END_FORMAT

Standard Graph Definition File (.grf)

This section provides a description of the file format needed for the creation of standard graphs using the Graph/Get command.

As with graph files, you may include any number of comment lines at the top of the file. You may define as many graphs as you like, but remember that only nine graphs fit on the screen without overlapping one another.

Each graph may define multiple plots and may define the title of the graph. Each plot may define the color to be used, the point connection, and the symbol to use if points are to be displayed.

Each element of the graph definition is identified by a string. GRAPH indicates a new graph. The string TITLE is optional. If you want to give the graph a title, enter TITLE on the line following GRAPH. Follow TITLE with a space and then the actual title.

PLOT indicates a new plot. As mentioned above, a graph definition may contain several plot definitions. For each plot you may optionally specify:

The color of the plot using the keyword COLOR, followed by either a hue description (RED, BLUE, GREEN, YELLOW, CYAN, WHITE, etc.) or an RGB specification (e.g., 255,255,0).
Whether or not to display the plot as a bar graph, by using the keyword BAR followed by ON or OFF. If you specify BAR ON, then you must specify the dependent axis (keyword DEPENDENT AXIS) on the following line and which axis is dependent. The dependent axis may be X, Y, or Z. You can also optionally specify the width of the bars with the keyword SCALE followed by a real number.
Whether or not to display the lines of the plot. This is done using the keyword CONNECTION followed by ON or OFF. If CONNECTION is OFF, then you can display the points only by using the keywords POINT SYMBOL followed by the name of the symbol (X, BOX, CROSS, STAR, TRIANGLE). If no point symbol is specified, then the plot is not displayed. The plot is in the graph, but it is blank. If a point symbol is given, then you can control the size of its display by using the keyword SCALE followed by a real number.

If you specify BAR ON and CONNECTION OFF and specify a point symbol, the point symbols are not displayed. This is because individual points are simply not drawn if a bar display is used.

Following the optional display definitions, the x, y, and optionally z, functions are defined using the keywords X FUNCTION, Y FUNCTION, and Z FUNCTION, followed by the name of the function for that axis.

Below is the order in which each graph element definition should occur and which elements are optional:

GRAPH <required>
TITLE <optional>
PLOT <required>
COLOR <optional>
BAR <optional>
DEPENDENT AXIS <required, but only if BAR is ON>
SCALE <optional, but only if BAR is specified>
CONNECTION <optional>
POINT SYMBOL <optionally specified if CONNECTION is OFF>
SCALE <optional, but only if POINT SYMBOL is specified>
X FUNCTION <required>
Y FUNCTION <required>
Z FUNCTION <optional>

Blank lines may occur only in the comments at the top. Graph and plot definitions may not contain or be separated by any blank lines. If any required elements are missing or in the wrong order an error declaring a bad file format is displayed.

For standard graphs, all functions given in the .grf must be contained within the graph data file (.tbl). If a specific function in the standard graph definition file (.grf) cannot be located in the graph data file (.tbl), an error does not occur, but an informational message is displayed and the plot is not created.
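The element order listed above can be captured in a small generator, so that a definition is always written in a valid sequence and without blank lines. This is a hedged sketch: the functions plot_lines and graph_lines are hypothetical helpers, and the optional bar-width SCALE is left out for brevity.

```python
def plot_lines(x, y, z=None, color=None, bar_axis=None, connection=None,
               symbol=None, symbol_scale=None):
    """Return the lines of one PLOT definition in the required order."""
    lines = ["PLOT"]
    if color is not None:
        lines.append(f"COLOR {color}")
    if bar_axis is not None:
        # DEPENDENT AXIS is required on the line after BAR ON
        lines += ["BAR ON", f"DEPENDENT AXIS {bar_axis}"]
    if connection is not None:
        lines.append("CONNECTION ON" if connection else "CONNECTION OFF")
    if symbol is not None:
        lines.append(f"POINT SYMBOL {symbol}")
        if symbol_scale is not None:
            lines.append(f"SCALE {symbol_scale}")
    lines += [f"X FUNCTION {x}", f"Y FUNCTION {y}"]
    if z is not None:
        lines.append(f"Z FUNCTION {z}")
    return lines


def graph_lines(plots, title=None):
    """Return the lines of one GRAPH definition containing the given plots."""
    lines = ["GRAPH"]
    if title is not None:
        lines.append(f"TITLE {title}")
    for plot in plots:
        lines += plot
    return lines


bar_graph = graph_lines(
    [plot_lines("function_3", "function_4", color="YELLOW", bar_axis="Y")]
)
print("\n".join(bar_graph))
```

Joining the lines with single newlines keeps the definition free of blank lines, which the format requires.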

Sample Graph Definition File

This sample of a standard graph file defines five graphs.

The first is 2D, with only one plot and a title. Notice that you may optionally specify a Z Function.

The second is 3D, with only one plot. This graph definition accepts the default color and point connection attributes.

The third graph defines two plots, and each is 2D. The first plot defines RED to be the color. The second plot defines the color to be BLUE (RED and BLUE are hues; see the Graph/Color commands description), and specifies that only the points should be displayed (CONNECTION is OFF) using the TRIANGLE symbol.

The next graph defines two plots, the first 3D and the second 2D. In the first, only points are displayed using the BOX symbol. The color uses an RGB specification; in this case yellow. In the second plot, both lines and points are displayed (if not specifically turned off, CONNECTION is ON). The color is light blue, the point symbol is a STAR, and the scale of the points is 4.0.

The last graph defines a single 2D plot. The color is yellow, with a bar display, and Y specified as the dependent axis.


GRAPH	! First graph !
TITLE Sample 1
PLOT	! Only plot in first graph !
X FUNCTION function_1
Y FUNCTION function_2
GRAPH	! Second graph !
PLOT	! Only plot in second graph !
X FUNCTION time
Y FUNCTION energy
Z FUNCTION None
GRAPH	! Third graph !
PLOT	! First plot in third graph !
COLOR RED
X FUNCTION function_a
Y FUNCTION function_b
PLOT	! Second plot in third graph !
COLOR BLUE
CONNECTION off
POINT SYMBOL TRIANGLE
X FUNCTION function_c
Y FUNCTION function_c
GRAPH	! Fourth graph !
TITLE Sample 
PLOT	! First plot in fourth graph !
COLOR 255,255,0
CONNECTION OFF
POINT SYMBOL BOX
X FUNCTION function_a
Y FUNCTION function_b
Z FUNCTION function_c
PLOT	! Second plot in fourth graph !
COLOR 0,255,255
POINT SYMBOL STAR
SCALE 4.0
X FUNCTION function_1
Y FUNCTION function_2
GRAPH	! Last graph !
PLOT	! Only plot in last graph !
COLOR YELLOW
BAR ON
DEPENDENT AXIS Y
X FUNCTION function_3
Y FUNCTION function_4

Hessian Files (.hessian, .hessianx, .xhessian)

The .hessian file contains the data for sets of gradients.

Following a successful completion, the finite-difference data are used to generate a second-derivative matrix. This is mass weighted and diagonalized to generate the harmonic vibrational spectrum. The second-derivative matrix (not mass-weighted) is appended to the .hessian file. Following the data for the last displacement, the flag matrix appears, followed by the lower triangle of elements of the second-derivative matrix. These data are in 5f12.7 format:


HESSIAN

H(1,1)

H(2,1)  H(2,2)

H(3,1)  H(3,2)  H(3,3)

The data continue to H(3N,3N), where N is the number of atoms.
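Because only the lower triangle is stored, a reader has to mirror each value across the diagonal to recover the full symmetric matrix. A minimal sketch follows, assuming the values have already been read into a flat list in row order; the function name unpack_lower_triangle is hypothetical, and n here stands for 3N.

```python
def unpack_lower_triangle(values, n):
    """Build a full symmetric n x n matrix from row-ordered lower-triangle values."""
    expected = n * (n + 1) // 2
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    h = [[0.0] * n for _ in range(n)]
    it = iter(values)
    for i in range(n):
        for j in range(i + 1):
            # H(i,j) is stored once and mirrored to H(j,i)
            h[i][j] = h[j][i] = next(it)
    return h


# Values in the order H(1,1), H(2,1), H(2,2), H(3,1), H(3,2), H(3,3)
h = unpack_lower_triangle([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3)
for row in h:
    print(row)
```

The length check catches a truncated file early: a matrix of dimension 3N needs exactly 3N(3N+1)/2 stored values.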

The Discover program can output .hessian files, and the quantum programs produce and/or use files having "hessian" as part or all of their suffix.

The .hessian suffix indicates an ASCII Hessian in Discover format, and .hessianx, an ASCII Hessian in Turbomole format. Zindo, DMol and Turbomole can read both .hessian and .hessianx formats as input. Files of type .xhessian (XDR format) are no longer produced by the quantum programs (however, the quantum programs can still read them).

The following Hessian files are produced by quantum runs:

product	calculation type	Hessian file type
DMol	optimization	.hessian
	frequency	.hessian
Turbomole	optimization	.hessian
	frequency	.hessianx
Zindo	optimization	.hessian
	frequency	.hessian

Sample .hessian File


$hessian

 1  1   0.6780639398   0.0000000000   0.0000000000  -0.1259825011   0.0000000000

 1  2   0.0000000000  -0.2760407194   0.0000000000  -0.0947402056  -0.2760407194

 1  3   0.0000000000   0.0947402056

 2  1   0.0000000000   0.2160004237   0.0000000000   0.0000000000  -0.0719526695

 2  2   0.0000000000   0.0000000000  -0.0720238769   0.0000000000   0.0000000000

 2  3  -0.0720238769   0.0000000000

 3  1   0.0000000000   0.0000000000   1.2506493403   0.0000000000   0.0000000000

 3  2  -1.0175861358  -0.0926737292   0.0000000000  -0.1165316022   0.0926737292

 3  3   0.0000000000  -0.1165316022

 4  1  -0.1259825011   0.0000000000   0.0000000000   0.0877428007   0.0000000000

 4  2   0.0000000000   0.0191198500   0.0000000000  -0.0434219786   0.0191198500

 4  3   0.0000000000   0.0434219786

 5  1   0.0000000000  -0.0719526695   0.0000000000   0.0000000000   0.0239681673

 5  2   0.0000000000   0.0000000000   0.0239922511   0.0000000000   0.0000000000

 5  3   0.0239922511   0.0000000000

 6  1   0.0000000000   0.0000000000  -1.0175861358   0.0000000000   0.0000000000

 6  2   1.1102928577  -0.0172186375   0.0000000000  -0.0463533610   0.0172186375

 6  3   0.0000000000  -0.0463533610

 7  1  -0.2760407194   0.0000000000  -0.0926737292   0.0191198500   0.0000000000

 7  2  -0.0172186375   0.2759933671   0.0000000000   0.1240272754  -0.0190724977

 7  3   0.0000000000  -0.0141349086

 8  1   0.0000000000  -0.0720238769   0.0000000000   0.0000000000   0.0239922511

 8  2   0.0000000000   0.0000000000   0.0240124468   0.0000000000   0.0000000000

 8  3   0.0240191791   0.0000000000

 9  1  -0.0947402056   0.0000000000  -0.1165316022  -0.0434219786   0.0000000000

 9  2  -0.0463533610   0.1240272754   0.0000000000   0.1525595219   0.0141349086

 9  3   0.0000000000   0.0103254413

10  1  -0.2760407194   0.0000000000   0.0926737292   0.0191198500   0.0000000000

10  2   0.0172186375  -0.0190724977   0.0000000000   0.0141349086   0.2759933671

10  3   0.0000000000  -0.1240272754

11  1   0.0000000000  -0.0720238769   0.0000000000   0.0000000000   0.0239922511

11  2   0.0000000000   0.0000000000   0.0240191791   0.0000000000   0.0000000000

11  3   0.0240124468   0.0000000000

12  1   0.0947402056   0.0000000000  -0.1165316022   0.0434219786   0.0000000000

12  2  -0.0463533610  -0.0141349086   0.0000000000   0.0103254413  -0.1240272754

12  3   0.0000000000   0.1525595219

$hessian (projected)

 1  1   0.6780518841   0.0000000000   0.0000000000  -0.1259711247   0.0000000000

 1  2   0.0000000000  -0.2760403797   0.0000000000  -0.0947424832  -0.2760403797

 1  3   0.0000000000   0.0947424832

 2  1   0.0000000000   0.2159909630   0.0000000000   0.0000000000  -0.0719448964

 2  2   0.0000000000   0.0000000000  -0.0720230333   0.0000000000   0.0000000000

 2  3  -0.0720230333   0.0000000000

 3  1   0.0000000000   0.0000000000   1.2506493402   0.0000000000   0.0000000000

 3  2  -1.0175861358  -0.0926737292   0.0000000000  -0.1165316022   0.0926737292

 3  3   0.0000000000  -0.1165316022

 4  1  -0.1259711247   0.0000000000   0.0000000000   0.0877339193   0.0000000000

 4  2   0.0000000000   0.0191186027   0.0000000000  -0.0434183083   0.0191186027

 4  3   0.0000000000   0.0434183083

 5  1   0.0000000000  -0.0719448964   0.0000000000   0.0000000000   0.0239642810

 5  2   0.0000000000   0.0000000000   0.0239903077   0.0000000000   0.0000000000

 5  3   0.0239903077   0.0000000000

 6  1   0.0000000000   0.0000000000  -1.0175861358   0.0000000000   0.0000000000

 6  2   1.1102928578  -0.0172186375   0.0000000000  -0.0463533610   0.0172186375

 6  3   0.0000000000  -0.0463533610

 7  1  -0.2760403797   0.0000000000  -0.0926737292   0.0191186027   0.0000000000

 7  2  -0.0172186375   0.2759938209   0.0000000000   0.1240265791  -0.0190720439

 7  3   0.0000000000  -0.0141342123

 8  1   0.0000000000  -0.0720230333   0.0000000000   0.0000000000   0.0239903077

 8  2   0.0000000000   0.0000000000   0.0240163628   0.0000000000   0.0000000000

 8  3   0.0240163628   0.0000000000

 9  1  -0.0947424832   0.0000000000  -0.1165316022  -0.0434183083   0.0000000000

 9  2  -0.0463533610   0.1240265791   0.0000000000   0.1525603398   0.0141342123

 9  3   0.0000000000   0.0103246234

10  1  -0.2760403797   0.0000000000   0.0926737292   0.0191186027   0.0000000000

10  2   0.0172186375  -0.0190720439   0.0000000000   0.0141342123   0.2759938209

10  3   0.0000000000  -0.1240265791

11  1   0.0000000000  -0.0720230333   0.0000000000   0.0000000000   0.0239903077

11  2   0.0000000000   0.0000000000   0.0240163628   0.0000000000   0.0000000000

11  3   0.0240163628   0.0000000000

12  1   0.0947424832   0.0000000000  -0.1165316022   0.0434183083   0.0000000000

12  2  -0.0463533610  -0.0141342123   0.0000000000   0.0103246234  -0.1240265791

12  3   0.0000000000   0.1525603398

$end
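
The layout of the sample can be read mechanically. The following is a minimal Python sketch, assuming (from the sample alone, not from a formal specification) that each data line holds a row index, a continuation-line index, and then the values for that part of the row, and that sections are delimited by lines beginning with $. The function name parse_hessian and the small 2x2 input are hypothetical illustrations, not part of any Insight or Discover utility.

```python
def parse_hessian(text, section="$hessian"):
    """Return the matrix stored under `section` as a list of rows of floats.

    Assumed layout (inferred from the sample file only): row index,
    continuation-line index, then the values for that part of the row.
    """
    rows = {}
    active = False
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue                      # blank lines carry no data
        if stripped.startswith("$"):
            active = stripped == section  # "$hessian", "$hessian (projected)", "$end"
            continue
        if active:
            fields = stripped.split()
            row = int(fields[0])          # fields[1] is the continuation-line index
            rows.setdefault(row, []).extend(float(v) for v in fields[2:])
    return [rows[i] for i in sorted(rows)]


# Hypothetical 2x2 input in the same layout as the sample above.
sample = """\
$hessian
 1  1   0.5000000000  -0.1000000000
 2  1  -0.1000000000   0.2500000000
$hessian (projected)
 1  1   0.4000000000  -0.1000000000
 2  1  -0.1000000000   0.2000000000
$end
"""
matrix = parse_hessian(sample)
projected = parse_hessian(sample, "$hessian (projected)")
```

Grouping by the row index rather than by line count means the sketch does not need to know how many values appear on each line.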

Dynamics Trajectory History Files (.his and .fhis)

Special Information for the Discover 2.9.x Program

There are two forms of the Discover 2.9.x history file, .his and .fhis.

.his is the file to which the dynamics history is periodically written during a Discover 2.9.x dynamics calculation. It is a binary file, and for a reasonable-length dynamics run it can become fairly large. It contains coordinates and other pertinent information for the system being simulated. The frequency with which this file is updated can be modified with the initialize and restart dynamics commands of the Discover 2.9.x program (see the Discover User Guide).

The .his file is written using FORTRAN unformatted I/O with the records described in Table 7. For each record, the types of the variables and the lengths of the arrays, if applicable, are given. The first frame contains extra information about the atom types, movable atoms, etc.; subsequent frames contain only the changing information--coordinates, velocities, etc.

Table 7 . Format of .his File
record	type	array	contents
1	integer		control variable: 0 for first frame, nonzero for subsequent frames
2	character*4	20	character string giving version information
	real*8		control variable: the Discover version (Vershn)
3	character*4	20	title
4	character*4	20	title
5	integer		number of forcefield atom types (NAtTyp)
	character*4	NAtTyp	names of forcefield atom types
	real*8	NAtTyp	atomic masses of forcefield atom types
6	integer		number of residue names (NNmRes)
	character*4	NNmRes	names of residues
7	integer		number of atoms in the system (NAtoms)
	integer	NAtoms	index of atom's forcefield atom type
	character*4	NAtoms	name of atom (for Vershn < 2.9.0)
	character*5	NAtoms	name of atom (for Vershn >= 2.9.0)
8	integer		reserved
	integer		number of moveable atoms in the system (NAtMov)
	integer	NAtMov	index of moveable atoms (for Vershn >= 2.6; not present prior to that)
9	integer		number of molecules (NMol)
	integer	NMol	number of atoms per molecule
	integer	NMol	number of residues per molecule
10	integer		total number of residues (NRes)
	integer	2,NRes	first and last atoms in each residue
	integer	NRes	index into names of residues
11A	integer		number of bonds (NBonds)
11B	integer	2,NBonds	I and J atoms for each bond (this record exists only if NBonds is > 0)
12	real*8	6	unit cell parameters: a, b, c, alpha, beta, gamma
	real*8	3,3	lattice vectors
	real*8	3,3	transformation matrix from crystal to Cartesian coordinates
	real*8	3,3	transformation matrix from Cartesian to crystal coordinates
	real*8	3,3,196	matrices for space group symmetry operators
	real*8	3,196	translation vector for each operator in the space group
	real*8	3,3,196	rotation matrix for each operator
	integer		number of symmetry operations in the space group
	real*8	4	reserved
	integer		reserved
	real*8	6	reserved
	integer	6	reserved
13	integer		number of component energies (NEner)
	real*8		time step in fs
	integer		frequency (in steps) for writing the frames
	integer		starting step number
14	real*8		total energy (kcal mol^-1)
	real*8		total potential energy (kcal mol^-1)
	real*8		total kinetic energy (kcal mol^-1)
	real*8	NEner	component energies (kcal mol^-1)
	real*8	NMol	potential energy per molecule (kcal mol^-1)
	real*8	NMol,NEner	component energies per molecule (kcal mol^-1)
	real*8	NMol	van der Waals dispersion energy per molecule
	real*8	NMol	van der Waals repulsion energy per molecule
	real*8	NMol	van der Waals energy per molecule (kcal mol^-1)
	real*8	NMol	coulombic energy per molecule (kcal mol^-1)
	real*8		pressure in bar
	real*8		reserved
	real*8	3x3	pressure tensor in bar
	real*8	3x3	reserved
	real*8	3x3	kinetic energy contribution to pressure
	real*8	3x3	reserved
	real*8	3x3	virial contribution to the pressure
	real*8	3x3	reserved
15	real*4	3,NAtoms	Cartesian coordinates of the atoms in angstroms
16	real*4	3,NAtoms	Cartesian velocities of the atoms in angstroms per timestep
Subsequent frames repeat the following records:
N	integer		control variable: 0 for first frame, nonzero for subsequent frames
N+1	real*8		total energy (kcal mol^-1)
	real*8		total potential energy (kcal mol^-1)
	real*8		total kinetic energy (kcal mol^-1)
	real*8	NEner	component energies (kcal mol^-1)
	real*8	NMol	potential energy per molecule (kcal mol^-1)
	real*8	NMol,NEner	component energies per molecule (kcal mol^-1)
	real*8	NMol	van der Waals dispersion energy per molecule
	real*8	NMol	van der Waals repulsion energy per molecule
	real*8	NMol	van der Waals energy per molecule (kcal mol^-1)
	real*8	NMol	coulombic energy per molecule (kcal mol^-1)
	real*8		pressure in bar
	real*8		reserved
	real*8	3x3	pressure tensor in bar
	real*8	3x3	reserved
	real*8	3x3	kinetic energy contribution to pressure
	real*8	3x3	reserved
	real*8	3x3	virial contribution to the pressure
	real*8	3x3	reserved
N+2	real*8	6	unit cell parameters: a, b, c, alpha, beta, gamma
	real*8	3,3	lattice vectors in angstroms
In the following two records, data were written for all atoms (N = NAtoms) prior to version 2.6; for version 2.6 and later, only the coordinates and velocities of the moving atoms are present (N = NAtMov).
N+3	real*4	3,N	Cartesian coordinates in angstroms
N+4	real*4	3,N	Cartesian velocities in angstroms per timestep
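
The record structure in Table 7 can be walked with a small reader. The following is a minimal Python sketch, assuming that every unformatted record is framed by 4-byte length markers in little-endian byte order. That framing is typical of FORTRAN unformatted sequential files but depends on the compiler and machine that wrote the file; it is not stated in this document. The function names and the synthetic data are hypothetical.

```python
import io
import struct


def read_fortran_records(stream, endian="<"):
    """Yield the raw payload of each record in an unformatted sequential file.

    Assumes 4-byte length markers before and after every record; marker size
    and byte order depend on the compiler and machine that wrote the file.
    """
    while True:
        head = stream.read(4)
        if not head:
            return                                    # clean end of file
        if len(head) != 4:
            raise ValueError("truncated record marker")
        (size,) = struct.unpack(endian + "i", head)
        payload = stream.read(size)
        tail = stream.read(4)
        if len(payload) != size or tail != head:
            raise ValueError("record markers do not match")
        yield payload


def pack_record(payload, endian="<"):
    """Frame one payload with length markers (helper for the demonstration)."""
    marker = struct.pack(endian + "i", len(payload))
    return marker + payload + marker


# Synthetic stand-in for records 1 and 2 of Table 7 (not real .his data).
version_text = b"DISCOVER".ljust(80)                  # character*4 x 20 = 80 bytes
demo = io.BytesIO(
    pack_record(struct.pack("<i", 0))                 # record 1: control variable
    + pack_record(version_text + struct.pack("<d", 2.9))  # record 2: text + Vershn
)
records = list(read_fortran_records(demo))
(control,) = struct.unpack("<i", records[0])
(vershn,) = struct.unpack("<d", records[1][80:88])
```

Because the marker layout is an assumption, a mismatch between the leading and trailing markers is treated as an error rather than silently skipped; a failure on the first record usually means the byte order or marker size differs.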

.fhis is a formatted ASCII version of the .his file. The .fhis file is created from the .his file by the utility formhis and can be reconverted into an unformatted history file with the utility uformhis. The .fhis file is a text file that can be viewed and edited. It is also independent of a particular machine's representation of numbers and so can be transferred between dissimilar computers. The file is written using FORTRAN formatted I/O. Table 8 shows the FORTRAN format used in creating the .fhis file. A format that is enclosed in parentheses and preceded by a number indicates that the information is on more than one line, each of which has the indicated format. The number indicates the number of lines.

Table 8 . Format of .fhis File
record	format	contents
1	I1	control variable: 0 for first frame, nonzero for subsequent frames
2	20A4,F4.2	character string identifying the version; control variable: the Discover version (Vershn)
3	20A4	title
4	20A4	title
5	9I5	number of forcefield atom types (NAtTyp); number of residue names (NNmRes); number of atoms (NAtoms); reserved; number of moveable atoms (NAtMov); number of molecules (NMol); number of residues (NRes); number of bonds (NBonds); number of space group symmetry operations (NSymOp)
6	NAtTyp(A4,F10.6)	name and atomic mass for each forcefield atom type
7	NNmRes(A4)	name of each residue
8	NAtoms(I3,A4)	index of forcefield type and name for each atom (for Vershn >= 2.9.0 the format is (I3,A5))
9	NMol(2I5)	number of atoms and residues for each molecule
10	NRes(3I5)	first and last atom and index of name for each residue
11	NBonds(2I5)	I and J atoms for each bond
Record 12 is present only if Vershn is greater than or equal to 2.6
12	NAtMov(I5)	indices of the moving atoms
13	11(3E14.8)	unit cell parameters: a, b, c, alpha, beta, gamma; unit cell vectors (3x3 matrix); transformation matrix from cell coordinates to Cartesian coordinates (3x3 matrix); transformation matrix from Cartesian coordinates to cell coordinates (3x3 matrix)
Records 14-16 are present only if the calculation uses periodic boundary conditions (PBC), in which case NSymOp is greater than 0.
14	NSymOp(9F5.2)	matrices for space group symmetry operators
15	NSymOp(3F5.2)	translation vector for each operator in the space group
16	NSymOp*3(3E14.8)	rotation matrix for each operator
17	3E14.8	reserved (3 long real*8 vector)
18	E14.8	reserved
19	2(3E14.8)	reserved (6 long real*8 vector)
20	7I3	reserved (7 long integer vector)
21	3I10,F6.2	number of component energies (NEner); frequency (in steps) for writing frames; initial step number; timestep in fs
22	3E14.8	total energy, potential energy, and kinetic energy
23	NEner(E14.8)	component energies
24	NMol*NEner(E14.8)	component energies for each molecule (the index of the molecules runs fastest; thus, the list of the first component energies for each molecule comes first, then for the second component energy)
25	NMol(5E14.8)	the total, dispersion, repulsion, van der Waals, and electrostatic energies for each molecule
26	E14.8	the pressure for PBC calculations (in bar)
27	E14.8	reserved (real*8 number)
28	3(3E14.8)	pressure tensor (3x3 matrix) in bar
29	3(3E14.8)	reserved (3x3 matrix)
30	3(3E14.8)	kinetic energy contribution to the pressure (3x3 matrix)
31	3(3E14.8)	reserved (3x3 matrix)
32	3(3E14.8)	virial contribution to the pressure (3x3 matrix)
33	3(3E14.8)	reserved (3x3 matrix)
34	NAtoms(3E14.8)	x, y and z coordinates for each atom
35	NAtoms(3E14.8)	x, y and z velocities for each atom in angstroms per timestep
Subsequent frames repeat the following records:
N	I1	control variable: 0 for first frame, 1 for subsequent frames
N+1	3E14.8	total energy, potential energy, and kinetic energy
N+2	NEner(E14.8)	component energies
N+3	NMol*NEner(E14.8)	component energies for each molecule (the index of the molecules runs fastest; thus, the list of the first component energies for each molecule comes first, then for the second component energy)
N+4	NMol(5E14.8)	the total, dispersion, repulsion, van der Waals, and electrostatic energies for each molecule
N+5	E14.8	the pressure for PBC calculations (in bar)
N+6	E14.8	reserved (real*8 number)
N+7	3(3E14.8)	pressure tensor (3x3 matrix) in bar
N+8	3(3E14.8)	reserved (3x3 matrix)
N+9	3(3E14.8)	kinetic energy contribution to the pressure (3x3 matrix)
N+10	3(3E14.8)	reserved (3x3 matrix)
N+11	3(3E14.8)	virial contribution to the pressure (3x3 matrix)
N+12	3(3E14.8)	reserved (3x3 matrix)
N+13	2(3E14.8)	unit cell parameters: a, b, c, alpha, beta, gamma
N+14	3(3E14.8)	unit cell vectors
In the following two records, data were written for all atoms (N = NAtoms) prior to Discover version 2.6; for version 2.6 and later, the coordinates and velocities are present only for moving atoms (N = NAtMov).
N+15	N(3E14.8)	x, y and z coordinates for each atom
N+16	N(3E14.8)	x, y and z velocities for each atom in angstroms per timestep
For Discover version 2.6 and later there are 18 component energies:
	1	bond
	2	angle
	3	torsion
	4	out-of-plane
	5	bond-bond
	6	bond-angle
	7	angle-angle
	8	bond-torsion
	9	angle-torsion
	10	angle-angle-torsion
	11	1-3 bond-bond
	12	out-of-plane-out-of-plane
	13	torsion-torsion
	14	total van der Waals
	15	van der Waals repulsion
	16	van der Waals attraction (dispersion)
	17	electrostatic
	18	10-12 hydrogen bond
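
Because the .fhis file is written with fixed-width FORTRAN formats, adjacent fields can run together without intervening spaces (for example, a negative E14.8 value directly after another number). The following is a minimal Python sketch of cutting a line into fields by width instead of by whitespace. The helper name split_fixed and both input lines are hypothetical; the field names are taken from record 5 of Table 8.

```python
def split_fixed(line, width, count, convert):
    """Cut `count` fields of `width` characters from `line` and convert each."""
    return [convert(line[i * width:(i + 1) * width]) for i in range(count)]


# Field order of record 5 (format 9I5), as listed in Table 8.
RECORD5_NAMES = ["NAtTyp", "NNmRes", "NAtoms", "reserved", "NAtMov",
                 "NMol", "NRes", "NBonds", "NSymOp"]

# Hypothetical record 5 line: nine integers, each right-justified in 5 columns.
record5 = "".join(f"{value:5d}" for value in (3, 2, 12, 0, 12, 1, 2, 11, 0))

# Hypothetical record 22 line (format 3E14.8): three 14-character fields with
# no separating blanks, which a whitespace split would not handle.
record22 = "0.67805188E+00-.12597112E+000.00000000E+00"

counts = dict(zip(RECORD5_NAMES, split_fixed(record5, 5, 9, int)))
energies = split_fixed(record22, 14, 3, float)
```

A multi-line record such as NMol(5E14.8) would apply the same helper once per line, with the line count taken from the values read in record 5.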

Special Information for the Discover 97.0/3.0.0 Program

The Discover 97.0/3.0.0 program typically sends information during dynamics runs to .arc, .out, .tbl, and/or user-named files. See the Insight online help and the discussion of the output command in the Discover 2.9.7/97.0/3.0.0 User Guide for information on controlling what the Discover 97.0/3.0.0 program includes in these files and how often information is output during a dynamics run.

However, the Discover 97.0/3.0.0 program can also read and write history files. These files are in the same format as those of Discover 2.9.x and can be read by the Insight program. (Some fields are set to 0.0, however, because certain data are not stored by the Discover 97.0/3.0.0 program.) The history file is written by using a print command during a minimization or dynamics simulation.

The readFile command may be used to read a particular frame of a history file into the Discover program. In this way a history file might be converted into an archive file, for instance, by using the writeFile archive command. The return value of the readFile command, when it is applied to a history or archive file, is the potential energy of that frame. This would allow you to, for instance, construct scripts that sort the frames in an archive or history file based on energy.

Layout Template File (.ltpl)

The layout template file contains descriptions of one or more layout templates. These templates describe the relative sizes and positions of windows in a window layout.

A system layout template file that contains simple default templates is read when Insight II starts up. These templates are therefore automatically available in every session. If you create some layout templates that you want to make generally available to all users at your site, you can add them to this file and they will be present each time Insight II is run. The pathname of this file is:

$BIOSYM/data/insight/insight.ltpl

The layout template file is a free-format file. Blank lines are ignored, and any line that begins with the character "!" is considered a comment and is also ignored.

The first line must contain the header !BIOSYM layout_template 1.

The next lines contain definitions of layout templates, each with the following format.

The first line contains the keyword Layout_template: followed by the name of the layout template.
The next line contains the keyword Layout_template_type: followed by one of the two valid types, Free_format or Stacked.

For Free_format templates the next lines contain template entry definitions, each with the following format.

The first line contains the keyword Layout_template_entry:
The next lines contain one of the keywords Left:, Right:, Top:, or Bottom:, followed by the position of the corresponding edge of the window expressed as a percent of the total size of the layout.

For Stacked templates, the next line contains the keyword Stack_offset: followed by the number of pixels the window is to be offset from the top left corner of the preceding window.

Sample Layout Template File


!BIOSYM layout_template 1

Layout_template:SIDE_BY_SIDE
Layout_template_type:Free_format

    Layout_template_entry:
    Left:0.000000
    Right:50.000000
    Top:0.000000
    Bottom:100.000000

    Layout_template_entry:
    Left:50.000000
    Right:100.000000
    Top:0.000000
    Bottom:100.000000

Layout_template:STACKED
Layout_template_type:Stacked
Stack_offset:30
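
The rules above are simple enough to check with a short reader. The following is a minimal Python sketch, assuming only the keywords described in this section; the function name parse_ltpl and the dictionary layout it returns are hypothetical and are not how Insight II itself reads the file.

```python
def parse_ltpl(text):
    """Parse layout template text into a list of template dictionaries."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]  # skip blanks
    if not lines or not lines[0].startswith("!BIOSYM layout_template"):
        raise ValueError("missing '!BIOSYM layout_template' header")
    templates = []
    for ln in lines[1:]:
        if ln.startswith("!"):
            continue                                  # comment line
        key, _, value = ln.partition(":")
        value = value.strip()
        if key == "Layout_template":
            templates.append({"name": value, "type": None, "entries": []})
        elif key == "Layout_template_type":
            templates[-1]["type"] = value             # Free_format or Stacked
        elif key == "Layout_template_entry":
            templates[-1]["entries"].append({})
        elif key in ("Left", "Right", "Top", "Bottom"):
            # Edge position as a percent of the total layout size.
            templates[-1]["entries"][-1][key.lower()] = float(value)
        elif key == "Stack_offset":
            templates[-1]["stack_offset"] = int(value)  # offset in pixels
        else:
            raise ValueError(f"unrecognized keyword: {key!r}")
    return templates


sample = """\
!BIOSYM layout_template 1
Layout_template:SIDE_BY_SIDE
Layout_template_type:Free_format
    Layout_template_entry:
    Left:0.000000
    Right:50.000000
    Top:0.000000
    Bottom:100.000000
    Layout_template_entry:
    Left:50.000000
    Right:100.000000
    Top:0.000000
    Bottom:100.000000
Layout_template:STACKED
Layout_template_type:Stacked
Stack_offset:30
"""
templates = parse_ltpl(sample)
```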

Exclusion Shell File (.ludi_pseudo_protein, fort.12)

This file describes the exclusion shell that Ludi constructs from the active analogs. No fragment will be fit outside of this shell. The file is in PDB format and can be read into Insight II by turning the Load_Pseudo_Protein parameter on in the Ludi/Load command. The fort.12 file produced by the Ludi/Run background job is automatically renamed to <run_name>.ludi_pseudo_protein when the background job completes.

Molecular Data File (.mdf)

The molecular data (.mdf) file contains static information about a molecular system. This is information that does not change during the course of a calculation.

Note that .mdf files are used by several Insight II products; therefore, some of the information present may be ignored by some programs or used only by certain programs.

The molecular data file has been changed minimally since the previous versions of the Discover and Insight programs. The primary change is that the potential type identifier can now be longer (up to seven characters).

Note that the order of connections listed in the .mdf file is important for atoms whose out-of-plane (oop) flag is 2 or for which chirality information is given. Therefore, these connections must not be reordered.

The .mdf file consists of one header, one end statement, and three main sections:

Topology section
Symmetry section
Atomset section

The sections begin and end with the character #. The order of the sections is not important, and unneeded sections can be omitted. Records within the sections begin with keyword identifiers that start with @.

In addition, comment records, beginning with !, are allowed.

The overall structure of an .mdf file is shown in Table 9. Descriptions and examples of each major part follow.

Table 9 . Structure of an .mdf File
section	record	where described	contents
<first line>	!header	page 60	file identifier
<any line>	!comment	page 60	optional comments
#topology		page 60	general topology of the molecule
	@column	page 60	column headers for types of data contained in atom records
	@molecule name	page 61	molecule identifier
	atom records	page 61	atom name, element, forcefield atom type, charge group name, isotopic number, formal charge, atomic charge, switching atom flag, out-of-plane flag, chirality flag, occupancy, X-ray temperature factor, number of connections, connectivity (order and types of data as defined by @column record)
#symmetry		page 65	symmetry information
	@periodicity	page 65	translational periodicity of the system
	@group	page 67	symmetry group name associated with periodicity of system
	@matrix	page 68	matrix representations of the symmetry operators
	@helix	page 69	indicates that helical symmetry is present
#atomset		page 69	named subsets of atoms for specific purposes
	@degree n	page 69	number of atoms associated (used for general case)
	@list	page 71	list of associated atoms (used, for example, to define and name pseudoatoms, backbone atoms, or subsets)
	@quartet	page 72	states that four atoms are associated (used, for example, to define and name torsional or dihedral angles)
<last line>	#end	page 73	end-of-file marker

Header Record

The first record of a molecular data file must be:


!BIOSYM molecular_data #

The ! must be the first character in the file. The Discover program interprets this line as indicating an ASCII file containing molecular data records as outlined in this section. The string molecular_data indicates that the contents of the file are those of an .mdf file; the # is replaced by an actual number, which identifies the file format for the Discover program. The number 4, for example, indicates that the file format is as specified here for the Discover program, versions 2.9.5/3.2 and later.

Comment Record

Comment lines begin with an ! and may occur anywhere after the first record. By convention, the Insight program inserts a system title and a date as a comment record after the version record.

Topology Section

The topology section contains tabular information about atoms in a molecule or system of associated molecules. Its first line is:


#topology

Next, the column headings are defined. The molecule name and atomic data follow.

Column Record

The column headings for the table of atomic information are defined at the beginning of the topology section, in an @column record. All @column records must precede the first molecule or atom record.

Column records have the following syntax:

@column # type specifier

where @column is a keyword identifying the record, # is the number of a column containing a certain type of atomic data, and specifier (for example, the name of a forcefield) further defines the type, when necessary.

The types of atomic data are shown in Table 10. Column headings must all be listed, in the order given.

@column 1 element
@column 2 atom_type cvff
...
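
As an illustration of the syntax above, the following minimal Python sketch collects @column records into a lookup keyed by column number. The function name parse_column_records is hypothetical, and the input is limited to the two example records shown above.

```python
def parse_column_records(lines):
    """Map column number -> (type, specifier) for every @column record."""
    columns = {}
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "@column":
            continue                      # not a column record
        number = int(fields[1])
        data_type = fields[2]
        # The specifier (for example, a forcefield name) is optional.
        specifier = fields[3] if len(fields) > 3 else None
        columns[number] = (data_type, specifier)
    return columns


columns = parse_column_records([
    "@column 1 element",
    "@column 2 atom_type cvff",
])
```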

Molecule Name Record

The syntax of the molecule name record is:

@molecule name type

where @molecule is an identifying keyword, name is a molecule name for identification purposes, and type is the optional type of molecule for classification purposes. If type is present, all molecules of the same type must be topologically identical.

Examples:


@molecule crambin

@molecule wat4 water

@molecule h2o5 water

@molecule benz1 c6h6

Atom Records

The atom records have no identifying keyword--they are identified by the fact that they immediately follow the molecule name record. Only unique atoms are included in the topology section. Symmetrically or translationally equivalent atoms are not included, although bonds to such atoms may be indicated.

Atom records consist of the fields shown in Table 10. The values allowed for flag settings are also shown.

Table 10 . Types and Order of Atomic Data and Flag Settings
type	flag setting	description
<first column>	--	complete atom name in standard Insight format of residue:atom_name
element	--	the chemical symbol of the atom
atom_type	--	forcefield atom type (followed by forcefield name in @column definition)
charge_group	--	name of the charge group (followed by the forcefield name in @column definition)
isotope	--	isotopic number (0 indicates use of default)
formal_charge	--	formal charge as a string (e.g., 1+, 2-, 1/2-)
charge	--	floating-point value of the atomic charge (followed by the forcefield name in @column definition)
switching_atom		flag for the switching atom in a group
	0	indicates it is not a switching atom
	1	switching atom for the group
oop_flag		flag for out-of-plane atoms (followed by forcefield name in @column definition)
	0	indicates it is not an oop atom
	1	oop atom, use the order of atom types in the forcefield to determine the improper torsion
	2	oop atom, use the order of atoms in connectivity record in .mdf file to determine the improper torsion
chirality_flag		chirality of the connections
	0	neither chiral nor prochiral
	1	prochiral, priorities 0 0 1 2 in connectivity record
	2	prochiral, priorities 0 1 1 2 in connectivity record
	3	prochiral, priorities 0 1 2 2 in connectivity record
	4	chiral, priorities 0 1 2 3 in connectivity record
	8	not determined
	9	unable to determine
occupancy	--	partial occupancy factor
xray_temp_factor	--	isotropic temperature factor from experiment (X-ray)
n_connections	--	number of bonds to that atom
<last columns>		connectivity records--the syntax is described below (if the chirality flag is set or the oop flag is 2, the order of these records is important)

The syntax for atom records consists of one record for each of the first 11 data types listed in the table, followed on the same line by the connectivity records, which consist of several records.

Examples of atom records:

         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890


ACE_1:CA            C  c3      meA  0  0  -0.3000 1 0 8  1.0000  0.0000 4 connectivity record

N-M_3:N             N  n       nme  0  0  -0.5000 1 1 8  1.0000  0.0000 3 connectivity record

(The two lines of numbers above the records merely indicate the column numbers; they are not part of the file.)
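As a rough illustration of how such a record might be read, the sketch below splits one atom record on whitespace and labels the fields using the names given in the @column lines. The function name parse_atom_record is hypothetical, and the sketch assumes that no field contains embedded spaces and that the last @column entry names the connectivity records. The record and column list are taken from Example 1 later in this section.

```python
# Hypothetical helper (not part of any Insight or Discover API): label the
# whitespace-separated fields of one atom record with the @column names.
def parse_atom_record(line, columns):
    """columns holds the names from the @column lines, in order.
    ASSUMPTION: the last column names the connectivity records, which
    take all remaining fields on the line."""
    fields = line.split()
    n_fixed = len(columns) - 1
    if len(fields) < 1 + n_fixed:
        raise ValueError("atom record has too few fields: %r" % line)
    record = {"name": fields[0]}  # <first column>: residue:atom_name
    record.update(zip(columns[:-1], fields[1:1 + n_fixed]))
    record[columns[-1]] = fields[1 + n_fixed:]
    return record


columns = ["element", "atom_type", "charge_group", "isotope", "formal_charge",
           "charge", "switching_atom", "oop_flag", "chirality_flag",
           "occupancy", "xray_temp_factor", "connections"]
line = "ACE_1:C  C  c'  pepC 0  0   0.3800 1 1 8  1.0000  0.0000 CA O/2.0 ALA_2:N"
atom = parse_atom_record(line, columns)
print(atom["name"], atom["atom_type"], atom["connections"])
```

All values are kept as strings here; converting charge or occupancy to floating-point numbers is left to the caller.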

Connectivity Records

The syntax of each connectivity record is:

resname_resnumber:atom%cellxyz#symop/bondorder,wedgebond

The number of connectivity records equals the number of atoms (including ghost atoms) to which the atom is connected. The meaning and default value of each item are shown in Table 11. Except for atom and /bondorder, all other portions of a record may be omitted if the default values are satisfactory.

Table 11 . Connectivity Record Items
item	default value	definition
resname	current name	residue name
_resnumber:	_current number:	alphanumeric residue "number"
atom	--	atom name (must be present)
%cellxyz	%000	cell offsets to be applied to the atom (3 signed numbers with no intervening spaces)
#symop	#1	integer index of the symmetry operation to be applied to the atom
/bondorder	/1.0	floating-point number indicating the bond order of the connection
,wedgebond	,0	optional number indicating the stereochemistry of bonds in a 2D molecule (i.e., a "sketch") (0 for a non-wedged bond; -1 and 1 for the narrow and wide ends, respectively, of a wedge-up bond; and -2 and 2 for the narrow and wide ends of a wedge-down bond)

The order in which the atoms are listed in the connectivity record should correspond to the chirality flag. The priority ordering used for determining chirality and prochirality is reflected by the listing order from lowest to highest priorities.

Examples of connectivity records:

full form	equivalent short form using default values
LEU_6:N%000#1/1.0	N/1.0
LEU_6:N%000#1/2.0	N/2.0
LEU_6:N%+0-1+0#1/1.0	N%0-10/1.0
LEU_6:N%000#5/2.0	N#5/2.0
ALA_7:N%000#1/1.0	ALA_7:N/1.0

(for all these examples except the last one, the current (default) value of resname_resnumber: is LEU_6:)
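The equivalence between the full and short forms can be checked with a small parser. The following is a minimal sketch, not the parser used by Insight: the function name parse_connectivity and the regular expression are assumptions, cell offsets are assumed to be single digits with optional signs, and /bondorder is treated as optional (defaulting to 1.0) because the sample files omit it for single bonds.

```python
import re

# ASSUMED grammar for resname_resnumber:atom%cellxyz#symop/bondorder,wedgebond
_CONN = re.compile(
    r"^(?:(?P<res>[^:]+):)?"           # optional resname_resnumber:
    r"(?P<atom>[^%#/,]+)"              # atom name (must be present)
    r"(?:%(?P<cell>(?:[+-]?\d){3}))?"  # optional cell offsets, e.g. %0-10
    r"(?:#(?P<symop>\d+))?"            # optional symmetry operation index
    r"(?:/(?P<order>\d+(?:\.\d*)?))?"  # optional bond order
    r"(?:,(?P<wedge>-?\d+))?$"         # optional wedge-bond code
)


def parse_connectivity(text, current_residue):
    """Return one connectivity record with the Table 11 defaults filled in."""
    m = _CONN.match(text)
    if m is None:
        raise ValueError("not a connectivity record: %r" % text)
    cell_text = m.group("cell") or "000"
    cell = tuple(int(n) for n in re.findall(r"[+-]?\d", cell_text))
    return {
        "residue": m.group("res") or current_residue,
        "atom": m.group("atom"),
        "cell": cell,
        "symop": int(m.group("symop") or 1),
        "bond_order": float(m.group("order") or 1.0),
        "wedge": int(m.group("wedge") or 0),
    }


full = parse_connectivity("LEU_6:N%+0-1+0#1/1.0", "LEU_6")
short = parse_connectivity("N%0-10/1.0", "LEU_6")
print(full == short)
```

Passing the current residue explicitly mirrors the note above: a short form can only be expanded when the residue of the atom record being read is known.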

Example atom and connectivity records for helix:


eth_1:C1	C  c   a    0	 0  -0.1200  1 0 1 4 C2 C3%001

eth_1:C2	C  c   a    0	 0  -0.1200  1 0 1 4 C1 C3

eth_1:C3	C  c   b    0	 0  -0.1200  1 0 1 4 C2 C1%00-1

This example is for an isolated 3-atom helix containing all single bonds. Atom C1 is connected to atom C2 and to a helical image of atom C3 so that C1 lies between C3 and the image of C3. Likewise, C3 is bonded to atom C2 and to the helical image of C1. The locations of the image atoms are generated by means of helix information stored in the .car or .arc files.

Symmetry Section

The symmetry section contains information about the geometry-independent periodicity and symmetry of a molecular system. Its first line is:


#symmetry

This is followed by records describing the periodicity, the symmetry group associated with the periodicity, explicit matrix representations of the operators of the group, and whether helical symmetry is present. Some of these records are optional.

Periodicity Record

The periodicity record is optional and defines the translational periodicity of the molecular system and the system of axes used to set up the periodicity.

The syntax of the periodicity record is:

@periodicity type axes

where @periodicity is the keyword identifier, type is the number of dimensions in which translational periodicity occurs, and axes is a description of how the periodic vectors relate to the Cartesian coordinate system. The type can have the values shown in Table 12.

Table 12 . Types of Periodicity
value of periodicity type	definition
0	no periodicity (default)
2	2D periodicity
3	3D periodicity

The axes entry consists of the letters x, y, and z. The number of letters used is equal to the periodicity type, and each letter can appear no more than once in the string. If no axes entry is present, the default alignment is assumed. The order and number of letters in the axes entry has the following significance:

For 3D periodicity, cell vector a is associated with the x axis, b is associated with y, and c is associated with z. The following rules apply in the .mdf file:

The first letter (x, y, or z) specifies which cell vector is to be aligned with its associated Cartesian axis.
The second letter specifies which cell vector is to lie in the plane formed by its associated Cartesian axis and the Cartesian axis associated with the first letter.
The third letter specifies that the remaining cell vector lies somewhere in the space formed by the Cartesian axes.
Thus the axes entry zyx means that the c cell vector lies along the z axis, the b cell vector lies in the z, y plane, and the a cell vector lies somewhere in z, y, x space. Note that currently only the default values (Table 13) are supported.

For 2D periodicity, vectors k and l describe a basal plane. By convention, the basal plane lies in the plane of two of the Cartesian axes and perpendicular to the third Cartesian axis. The following rules apply in the .mdf file:

The two axes letters together specify the Cartesian plane associated with the basal plane.

The first axes letter specifies the Cartesian axis upon which the basal plane k vector lies. The second axes letter specifies which Cartesian axis forms the plane in which the l vector lies.

Thus the axes entry yx means that the k basal plane vector lies along the y axis, the l basal plane vector lies in the x, y plane, and the basal plane is perpendicular to the z axis. Only the default values (Table 13) are currently supported.

Table 13 . Default Values of axes Entries
Currently, the Discover and Insight programs support only the default axes specifications.
value of periodicity type	default axes specification	meaning
0	x	(no periodicity)
2	xy	k = x axis; l is in x, y plane
3	xyz	a = x axis; b is in x, y plane

Examples:


@periodicity 3 xyz



@periodicity 2 xy



@periodicity 3

The last entry implies that the default axes specification is used.
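The rules for the periodicity record can be summarized in a short validation routine. This is only a sketch under stated assumptions: the function name parse_periodicity is hypothetical, the accepted types are limited to those listed in Table 12, and the axes entry is not checked when the type is 0.

```python
# Default axes specifications, copied from Table 13.
DEFAULT_AXES = {0: "x", 2: "xy", 3: "xyz"}


def parse_periodicity(record):
    """Return (type, axes) for a record such as '@periodicity 3 xyz'."""
    fields = record.split()
    if len(fields) not in (2, 3) or fields[0] != "@periodicity":
        raise ValueError("not a periodicity record: %r" % record)
    ptype = int(fields[1])
    if ptype not in DEFAULT_AXES:
        raise ValueError("unsupported periodicity type: %d" % ptype)
    # A missing axes entry means the default alignment.
    axes = fields[2] if len(fields) == 3 else DEFAULT_AXES[ptype]
    if ptype > 0:
        # One letter per periodic dimension, each of x, y, z at most once.
        if (len(axes) != ptype or len(set(axes)) != len(axes)
                or not set(axes) <= set("xyz")):
            raise ValueError("invalid axes entry: %r" % axes)
    return ptype, axes


print(parse_periodicity("@periodicity 3"))
```

A caller that targets Discover or Insight would additionally compare the returned axes with DEFAULT_AXES[ptype], since only the default specifications are supported.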

Group Record

The group record contains the symmetry group associated with the periodicity of the system. Note that if the periodicity record is present and its type is nonzero, then the group record must be present.

The syntax of the group record is:

@group name or @group matrix #

where @group is the keyword identifier, name is the symmetry group name, matrix is a keyword indicating that matrix representations of the operators follow, and # is the number of matrices.

The symmetry group name is that associated with the periodicity type. For example, if the periodicity type is 3, then the group name is the name of a space group (see the Discover 2.9.x/97.0/3.0.0 User Guide). If the periodicity type is 2, then the group name is the name of a plane group.

Likewise, the matrix representations that follow are those for the operators associated with the given periodicity type.

Examples:


@group (P21 21 2)

@group matrix 4

Matrix Records

If a group is given the special name matrix, then the representation matrices of the complete set of operators of the space group must follow, one 4 X 4 matrix for each @matrix record.

The syntax of a matrix record is:

@matrix # name
a b c d
e f g h
i j k l
m n o p

where @matrix is the keyword identifier, # is the number of the operator and runs from 1 continuously to the number of operators in the group, and name is an optional name for the operator. The single letters on the next four lines are the individual floating-point elements of the 4 X 4 representation matrix of the operator.

All matrices, including the identity operator, must be specified.

Example:


! Space group 5 (C 2)

@matrix 1

 1.   0.   0.   0.

 0.   1.   0.   0.

 0.   0.   1.   0.

 0.   0.   0.   1.

@matrix 2

-1.   0.   0.   0.

 0.   1.   0.   0.

 0.   0.  -1.   0.

 0.   0.   0.   1.

@matrix 3

 1.   0.   0.   0.

 0.   1.   0.   0.

 0.   0.   1.   0.

 0.5  0.5  0.   1.

@matrix 4

-1.   0.   0.   0.

 0.   1.   0.   0.

 0.   0.  -1.   0.

 0.5  0.5  0.   1.
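A reader for these records might look like the sketch below. The function names are hypothetical. Lines beginning with ! are treated as comments, as in the example above. The apply_operator function rests on an assumption inferred only from the example, where the translation (0.5 0.5 0.) appears in the fourth row: coordinates are taken as a row vector [x y z 1] multiplied on the left of the matrix. Verify this convention before relying on it.

```python
def parse_matrices(lines):
    """Collect @matrix records; checks numbering and the 4 X 4 shape."""
    matrices, current = [], None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # blank line or comment
        if line.startswith("@matrix"):
            fields = line.split()
            number = int(fields[1])
            if number != len(matrices) + 1:
                raise ValueError("operator numbers must run 1, 2, 3, ...")
            current = {"number": number,
                       "name": fields[2] if len(fields) > 2 else None,
                       "rows": []}
            matrices.append(current)
        elif line.startswith(("@", "#")):
            break  # another record or section begins
        elif current is not None:
            row = [float(x) for x in line.split()]
            if len(row) != 4 or len(current["rows"]) == 4:
                raise ValueError("each operator needs four rows of four numbers")
            current["rows"].append(row)
    if any(len(m["rows"]) != 4 for m in matrices):
        raise ValueError("incomplete matrix")
    return matrices


def apply_operator(rows, xyz):
    # ASSUMPTION: row-vector convention, [x y z 1] * M (translation in row 4).
    v = list(xyz) + [1.0]
    out = [sum(v[i] * rows[i][j] for i in range(4)) for j in range(4)]
    return tuple(out[:3])


text = """@matrix 1
 1. 0. 0. 0.
 0. 1. 0. 0.
 0. 0. 1. 0.
 0. 0. 0. 1.
@matrix 2
-1. 0. 0. 0.
 0. 1. 0. 0.
 0. 0. -1. 0.
 0.5 0.5 0. 1."""
ops = parse_matrices(text.splitlines())
print(apply_operator(ops[1]["rows"], (0.1, 0.2, 0.3)))
```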

Helix Record

The helix record indicates that helical symmetry is present in the system.

The syntax of the helix record is:

@helix

where @helix is the keyword identifier.

A helix record can be present only when the periodicity type is 0 or 2. Currently, the presence of this record means that all molecules in the system have helical symmetry.

Helix information is currently used only by the Polymer programs. The Discover program ignores the helix record, since it does not currently support infinite helices. The Insight program uses the helix record to display helical systems.

Examples:


#symmetry

@periodicity 2 xy

@helix



#symmetry

@helix

Atomset Section

The atomset section is used to define named lists of atoms so that different programs using the .mdf file can use the names to refer to sets of atoms. Its first line is:


#atomset

Each atomset record is introduced by a line having the following syntax:

@degree type name [other]

where @degree is an integer or synonymous word indicating how many subsequent atoms make up a single entry. For example, a list of bonds or distances has a degree of 2, and a list of dihedral angles has a degree of 4. A general list, with no association of atoms, has a degree of 1. The synonyms list, pair, triplet, and quartet are used for degrees 1 through 4, respectively.

The type field specifies a general type for the set of atoms and is used in determining how the set is to be used (Table 14).

Table 14 . Atom Set Types
value of atom set type	definition
backbone	atoms in the backbone or main chain of a polymer
torsion	quartets of atoms bonded together to form torsion angles
subset	general list of atoms corresponding to subsets in Insight program
pseudoatom	list of atoms defining a pseudoatom

The name field is the identifying name given to the set of atoms. Depending on the type of set, this can be the name of a torsion, pseudoatom, backbone, or general subset.

Depending on the type of set, other information may be either required or optional. This information is detailed below with the descriptions of the syntaxes for each type of set.

Following each set definition is a list of zero or more atoms that belong to the set. The list can continue for more than one line without any explicit continuation characters, up to the next line beginning with a @ or # symbol or the end of the file.

Atom specifications are in the standard Insight format. However, if either the molecule or molecule and residue portions of an atom specification are identical to those of the preceding atom in the list, they may be omitted. These current values will then be used as defaults.

The general syntax of the second line is:

moleculename:residuename:atomname

Because of the ability to use default values, a line such as:


poly1:eth1:c1 c2 eth2:c1 c2 poly2:eth1:c1 c2

is equivalent to the following explicit list:


poly1:eth1:c1 poly1:eth1:c2 poly1:eth2:c1 poly1:eth2:c2 poly2:eth1:c1 poly2:eth1:c2
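The defaulting rule can be expressed in a few lines. The sketch below (the function name expand_atom_list is hypothetical) carries the most recent molecule and residue names forward through the list. It handles plain names only and does not interpret wildcards or relative residue numbers.

```python
def expand_atom_list(text):
    """Expand abbreviated atom specifications to molecule:residue:atom."""
    molecule = residue = None
    full = []
    for spec in text.split():
        parts = spec.split(":")
        if len(parts) == 3:
            molecule, residue, atom = parts   # full specification
        elif len(parts) == 2:
            residue, atom = parts             # molecule carried forward
        elif len(parts) == 1:
            atom = parts[0]                   # molecule and residue carried forward
        else:
            raise ValueError("bad atom specification: %r" % spec)
        if molecule is None or residue is None:
            raise ValueError("first atom must be fully specified: %r" % spec)
        full.append(":".join((molecule, residue, atom)))
    return full


print(expand_atom_list("poly1:eth1:c1 c2 eth2:c1 c2 poly2:eth1:c1 c2"))
```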

Definition of Backbone Atoms

The complete syntax for defining backbone atoms is:

@list backbone name
atom1 atom2 atom3 ...

where name is used to identify the backbone.

The atom list contains zero or more atoms and can continue for more than one line without any explicit continuation characters--it is considered finished at the next @ or # symbol. The specification for the first atom is in the standard Insight format. If the molecule or residue portions of the specifications for the other atoms are missing, they default to the previous appropriate value used in the list. Wildcards are allowed.

Example of defining backbone atoms:


@list backbone 1

poly1:eth_1:C1 C2

poly1:sty_2:C1 C2

Definition of Subsets of Atoms

The complete syntax for defining subsets is:

@list subset name
atom1 atom2 atom3 ...

where name is used to identify the subset. The atom list is the same as for defining backbone atoms.

Example of defining subset atoms:


@list subset eth1

poly1:eth1:c1 c2 h1 h2 h3 h4

Definition of Pseudoatoms

The complete syntax for defining pseudoatoms is:

@list pseudoatom name A
atom1 atom2 atom3 ...

where name is the name for the pseudoatom in the form of a simple atom specification, and A (arithmetic average) indicates the method of calculating pseudoatom coordinates. The atom list is the same as for defining backbone atoms.

Examples of defining pseudoatoms:


@list pseudoatom xmol:xres:x A

*:*:*



@list pseudoatom water:xres:cm A

water:*:*



@list pseudoatom poly1:sty_2:XPHE A

poly1:sty_2:C3 C4 C5 C6 C7 C8

H3 H4 H5 H6 H7 H8



@list pseudoatom poly1:xres:x A

poly1:eth_1:C* sty_2:C1,C2

Definition of Torsions

The complete syntax for assigning names to torsions is:

@quartet torsion name
atom1 atom2 atom3 atom4

where name is the name of the torsion being defined and must include the molecule and residue names in the standard Insight format. Wildcards are allowed. The molecule and residue names can be omitted from the atom names, in which case they are assumed to be the same as in the torsion name. Relative residue numbers denoted by a signed integer may be used (e.g., -1:C or +1:N). Full molecule and residue names may be given, but must also be used in the torsion name.

Examples of defining torsions:


@quartet torsion *:*:phi

-1:C N CA C



@quartet torsion *:VAL_*:chi1

N CA CB CG1



@quartet torsion crn:tor:tors

crn:1:C 2:N 2:CA 2:HA
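To show how the set definitions and their atom lists fit together, here is a minimal sketch of a reader for the body of an atomset section. The function name parse_atomsets and the dictionary layout are assumptions. Lines beginning with ! are treated as comments, as in the sample files, and atom names are stored as written, without expanding defaults, wildcards, or relative residue numbers.

```python
# Synonyms for the degree, as described for the @degree field.
DEGREES = {"list": 1, "pair": 2, "triplet": 3, "quartet": 4}


def parse_atomsets(lines):
    """Read '@degree type name [other]' records and their atom lists."""
    sets = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue
        if line.startswith("#"):
            break  # the next section header ends the atomset section
        if line.startswith("@"):
            fields = line[1:].split()
            if len(fields) < 3:
                raise ValueError("expected '@degree type name': %r" % line)
            word = fields[0]
            degree = DEGREES[word] if word in DEGREES else int(word)
            sets.append({"degree": degree, "type": fields[1],
                         "name": fields[2], "other": fields[3:], "atoms": []})
        elif sets:
            # Atom lists continue across lines without continuation characters.
            sets[-1]["atoms"].extend(line.split())
        else:
            raise ValueError("atom list before any set definition: %r" % line)
    return sets


sample = ["@quartet torsion *:*:phi", "", "-1:C N CA C", "",
          "@list pseudoatom poly1:sty_2:XPHE A",
          "poly1:sty_2:C3 C4 C5 C6 C7 C8", "H3 H4 H5 H6 H7 H8"]
for s in parse_atomsets(sample):
    print(s["type"], s["name"], s["degree"], len(s["atoms"]))
```

For a set of degree n, a caller would typically group the atoms list into consecutive entries of n atoms each.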

End Record

The end of any section is marked either by the next section header starting with # or by the end of the file. The special header #end can also be used to end a section without introducing another section.
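The sectioning rule lends itself to a simple splitter. The following sketch (the function name split_sections is hypothetical) groups the lines of a file by section header. Lines that precede the first header, such as the !BIOSYM line, and lines that follow #end are ignored.

```python
def split_sections(lines):
    """Return a list of (section_name, body_lines) pairs."""
    sections, current = [], None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("#"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            if name == "end":
                current = None            # section closed, none opened
            else:
                current = (name, [])      # a new header closes the old section
                sections.append(current)
        elif current is not None:
            current[1].append(line)
    return sections


demo = ["!BIOSYM molecular_data 4", "#topology", "@molecule X",
        "#symmetry", "@helix", "#end", "stray line",
        "#atomset", "@list backbone 1"]
for name, body in split_sections(demo):
    print(name, body)
```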

Sample .mdf Files

Example 1: Nonperiodic, Nonhelical System


!BIOSYM molecular_data 4

!

!DATE:      Fri Sep 27 13:50:15 1993     INSIGHT generated molecular data file

!

#topology

!

@column 1 element

@column 2 atom_type cvff

@column 3 charge_group cvff

@column 4 isotope

@column 5 formal_charge

@column 6 charge cvff

@column 7 switching_atom cvff

@column 8 oop_flag cvff

@column 9 chirality_flag

@column 10 occupancy

@column 11 xray_temp_factor

@column 12 connections

!

@molecule ACEALANM

!

ACE_1:CA            C  c3      meA  0  0  -0.3000 1 0 8  1.0000  0.0000 HA1 HA2 HA3 C

ACE_1:HA1           H  h       meA  0  0   0.1000 0 0 8  1.0000  0.0000 CA

ACE_1:HA2           H  h       meA  0  0   0.1000 0 0 8  1.0000  0.0000 CA

ACE_1:HA3           H  h       meA  0  0   0.1000 0 0 8  1.0000  0.0000 CA

ACE_1:C             C  c'      pepC 0  0   0.3800 1 1 8  1.0000  0.0000 CA O/2.0 ALA_2:N

ACE_1:O             O  o'      pepC 0  0  -0.3800 0 0 8  1.0000  0.0000 C/2.0

ALA_2:N             N  n       pepN 0  0  -0.5000 1 1 8  1.0000  0.0000 ACE_1:C CA HN

ALA_2:CA            C  ca      pepN 0  0   0.1200 0 0 8  1.0000  0.0000 N HA C CB

ALA_2:HN            H  hn      pepN 0  0   0.2800 0 0 8  1.0000  0.0000 N

ALA_2:HA            H  h       pepN 0  0   0.1000 0 0 8  1.0000  0.0000 CA

ALA_2:C             C  c'      pepC 0  0   0.3800 1 1 8  1.0000  0.0000 CA O/2.0 N-M_3:N

ALA_2:O             O  o'      pepC 0  0  -0.3800 0 0 8  1.0000  0.0000 C/2.0

ALA_2:CB            C  c3      meB  0  0  -0.3000 1 0 8  1.0000  0.0000 CA HB1 HB2 HB3

ALA_2:HB1           H  h       meB  0  0   0.1000 0 0 8  1.0000  0.0000 CB

ALA_2:HB2           H  h       meB  0  0   0.1000 0 0 8  1.0000  0.0000 CB

ALA_2:HB3           H  h       meB  0  0   0.1000 0 0 8  1.0000  0.0000 CB

N-M_3:N             N  n       nme  0  0  -0.5000 1 1 8  1.0000  0.0000 ALA_2:C CA HN

N-M_3:CA            C  c3      nme  0  0  -0.0800 0 0 8  1.0000  0.0000 N HA1 HA2 HA3

N-M_3:HN            H  hn      nme  0  0   0.2800 0 0 8  1.0000  0.0000 N

N-M_3:HA1           H  h       nme  0  0   0.1000 0 0 8  1.0000  0.0000 CA

N-M_3:HA2           H  h       nme  0  0   0.1000 0 0 8  1.0000  0.0000 CA

N-M_3:HA3           H  h       nme  0  0   0.1000 0 0 8  1.0000  0.0000 CA

!

#atomset

!

@quartet torsion *:ALA_2:omeg

CA     C      *:N    *:CA

@quartet torsion *:ALA_2:phi

*:C    N      CA     C

@quartet torsion *:ALA_2:chi1

N      CA     CB     HB1

Example 2: 3D-Periodic, Nonhelical System

This .mdf file contains 3 molecules having 3D symmetry with explicit space group matrices.


!BIOSYM molecular_data 4

!DATE:      Thu Jun 11 15:24:13 1993     INSIGHT generated molecular data file

!

#topology

!

@column 1 element

@column 2 atom_type cvff

@column 3 charge_group cvff

@column 4 isotope

@column 5 formal_charge

@column 6 charge cvff

@column 7 switching_atom cvff

@column 8 oop_flag cvff

@column 9 chirality_flag

@column 10 n_connections

@column 11 connectivity

!

@molecule WTR1   water

WTR_1:O1  O  o*      WTR 16  0  -0.8200  1 0 0 2 H1 H2

WTR_1:H1  H  h*      WTR  2  0   0.4100  0 0 0 1 O1

WTR_1:H2  H  h*      WTR  2  0   0.4100  0 0 0 1 O1

!

@molecule SF6 sulfur_hexafluoride

sf6_1:S   S  s       a   0   1+     1.5000  1 0 0 6 F1 F1#2 F1#3 F1#4 F2 F3

sf6_1:F1  F  f       a   0   1/6-  -0.2500  0 0 0 1 S

sf6_1:F2  F  f       a   0   1/6-  -0.2500  0 0 0 1 S

sf6_1:F3  F  f       a   0   1/6-  -0.2500  0 0 0 1 S

!

@molecule poly1 ethylene-styrene

eth_1:C1  C  c       a    0   0  -0.1200  1 0 1 4 H1 H2 C2 sty_2:C2%010

eth_1:H1  H  h       a    0   0   0.0600  0 0 0 1 C1

eth_1:H2  H  h       a    0   0   0.0600  0 0 0 1 C1

eth_1:C2  C  c       b    0   0  -0.1200  1 0 1 4 H3 H4 C1 sty_2:C1

eth_1:H3  H  h       b    0   0   0.0600  0 0 0 1 C2

eth_1:H4  H  h       b    0   0   0.0600  0 0 0 1 C2

sty_2:C1  C  c       me1  0   0  -0.0600  1 0 4 4 H1 C2 eth_1:C2 sty_2:C3

sty_2:H1  H  h       me1  0   0   0.0600  0 0 0 1 C1

sty_2:C2  C  c       me2  0   0  -0.1200  1 0 1 4 H2 H3 C1 eth_1:C1%0-10

sty_2:H2  H  h       me2  0   0   0.0600  0 0 0 1 C2

sty_2:H3  H  h       me2  0   0   0.0600  0 0 0 1 C2

sty_2:C3  C  cp      ph1  0   0   0.0000  1 1 0 3 C1 C4/1.5 C8/1.5

sty_2:C4  C  cp      ph1  0   0  -0.1000  0 1 0 3 C3/1.5 C5/1.5 H4

sty_2:H4  H  h       ph1  0   0   0.1000  0 0 0 1 C4

sty_2:C8  C  cp      ph1  0   0  -0.1000  0 1 0 3 C3/1.5 C7/1.5 H8

sty_2:H8  H  h       ph1  0   0   0.1000  0 0 0 1 C8

sty_2:C5  C  cp      ph2  0   0  -0.1000  0 1 0 3 C4/1.5 C6/1.5 H5

sty_2:H5  H  h       ph2  0   0   0.1000  0 0 0 1 C5

sty_2:C6  C  cp      ph2  0   0  -0.1000  1 1 0 3 C5/1.5 C7/1.5 H6

sty_2:H6  H  h       ph2  0   0   0.1000  0 0 0 1 C6

sty_2:C7  C  cp      ph2  0   0  -0.1000  0 1 0 3 C6/1.5 C8/1.5 H7

sty_2:H7  H  h       ph2  0   0   0.1000  0 0 0 1 C7

!

#symmetry

@periodicity 3 xyz

@group matrix 4

@matrix 1 identity

 1.000  0.000  0.000  0.000

 0.000  1.000  0.000  0.000

 0.000  0.000  1.000  0.000

 0.000  0.000  0.000  1.000

@matrix 2 C4_1

 0.000  1.000  0.000  0.000

 1.000  0.000  0.000  0.000

 0.000  0.000  1.000  0.000

 0.000  0.000  0.000  1.000

@matrix 3 C4_2

-1.000  0.000  0.000  0.000

 0.000  1.000  0.000  0.000

 0.000  0.000  1.000  0.000

 0.000  0.000  0.000  1.000

@matrix 4 C4_3

 0.000 -1.000  0.000  0.000

-1.000  0.000  0.000  0.000

 0.000  0.000  1.000  0.000

 0.000  0.000  0.000  1.000

Example 3: Nonperiodic, Helical System

This .mdf file contains a single helical molecule with no translational periodicity.


!BIOSYM molecular_data 4

!DATE:      Thu Jun 11 17:44:53 1993     INSIGHT generated molecular data file



#topology



@column 1 element 

@column 2 atom_type cvff

@column 3 charge_group cvff

@column 4 isotope

@column 5 formal_charge

@column 6 charge cvff

@column 7 switching_atom cvff

@column 8 oop_flag cvff

@column 9 chirality_flag

@column 10 occupancy

@column 11 xray_temp_factor

@column 12 connections



@molecule TEST_6_11_HLX1

ETHE_1:C1           C  c2      0    0  0   1.0000 0 1 8  0.0000  0.0000 H11 H12 C2 C2%001

ETHE_1:H11          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C1

ETHE_1:H12          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C1

ETHE_1:C2           C  c2      0    0  0   1.0000 0 1 8  0.0000  0.0000 H21 H22 C1 C1%00-1

ETHE_1:H21          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C2

ETHE_1:H22          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C2



#symmetry

@helix

!

#atomset

@list backbone TEST_6_11_HLX1

TEST_6_11_HLX1:ETHE_1:C1 C2

Example 4: 2D-Periodic, Helical System

This .mdf file contains 4 helical molecules exhibiting 2D translational periodicity.


!BIOSYM molecular_data 4

!DATE:      Thu Jun 11 17:42:58 1993     INSIGHT generated molecular data file



#topology



@column 1 element

@column 2 atom_type cvff

@column 3 charge_group cvff

@column 4 isotope

@column 5 formal_charge

@column 6 charge cvff

@column 7 switching_atom cvff

@column 8 oop_flag cvff

@column 9 chirality_flag

@column 10 occupancy

@column 11 xray_temp_factor

@column 12 connections



@assembly NEW_CELL



@molecule TEST_6_11_HLX

ETHE_1:C1           C  c2      0    0  0   1.0000 0 1 8  0.0000  0.0000 H11 H12 C2 C2%001

ETHE_1:H11          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C1

ETHE_1:H12          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C1

ETHE_1:C2           C  c2      0    0  0   1.0000 0 1 8  0.0000  0.0000 H21 H22 C1 C1%00-1

ETHE_1:H21          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C2

ETHE_1:H22          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C2



@molecule TEST_6_11_HLX01

ETHE_1:C1           C  c2      0    0  0   1.0000 0 1 8  0.0000  0.0000 H11 H12 C2 C2%001

ETHE_1:H11          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C1

ETHE_1:H12          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C1

ETHE_1:C2           C  c2      0    0  0   1.0000 0 1 8  0.0000  0.0000 H21 H22 C1 C1%00-1

ETHE_1:H21          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C2

ETHE_1:H22          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C2



@molecule TEST_6_11_HLX0101

ETHE_1:C1           C  c2      0    0  0   1.0000 0 1 8  0.0000  0.0000 H11 H12 C2 C2%001

ETHE_1:H11          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C1

ETHE_1:H12          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C1

ETHE_1:C2           C  c2      0    0  0   1.0000 0 1 8  0.0000  0.0000 H21 H22 C1 C1%00-1

ETHE_1:H21          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C2

ETHE_1:H22          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C2



@molecule TEST_6_11_HLX010101

ETHE_1:C1           C  c2      0    0  0   1.0000 0 1 8  0.0000  0.0000 H11 H12 C2 C2%001

ETHE_1:H11          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C1

ETHE_1:H12          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C1

ETHE_1:C2           C  c2      0    0  0   1.0000 0 1 8  0.0000  0.0000 H21 H22 C1 C1%00-1

ETHE_1:H21          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C2

ETHE_1:H22          H  h       0    0  0   0.0000 0 0 8  0.0000  0.0000 C2



#symmetry

@periodicity 2 xy

@group (P1)

@helix

!

#atomset



@list backbone TEST_6_11_HLX

TEST_6_11_HLX:ETHE_1:C1 C2



@list backbone TEST_6_11_HLX01

TEST_6_11_HLX01:ETHE_1:C1 C2



@list backbone TEST_6_11_HLX0101

TEST_6_11_HLX0101:ETHE_1:C1 C2



@list backbone TEST_6_11_HLX010101

TEST_6_11_HLX010101:ETHE_1:C1 C2

Brookhaven Protein Databank File (.pdb)

For a complete description of the .pdb file format, contact the Brookhaven Protein Data Bank. Directions for postal, gopher, and ftp contact are in the Insight Products System Guide, Preparing for the Installation. You can visit the Protein Data Bank on the World Wide Web at:

http://www.pdb.bnl.gov

The format of the PDB files has been revised since the last release of Insight. For a complete description of the new format, load this URL into your web browser:

http://www.pdb.bnl.gov/Format.doc/Contents_Guide_2.html

This section describes how Insight products handle the new or changed parts of the .pdb file format.

File readers from Insight releases earlier than Insight II 97.0 ignore the new records, and the changes to the existing record types do not affect those readers.

The Insight II 97.0 file reader handles the new parts of the PDB file format in the following ways.

Reading PDB Files

Check for a REMARK line indicating this is a FORMAT 2.0 file.
Look for a segment identifier in columns 73-76 of the .pdb file. If present, a subset is created that collects the atoms for that segment.
Read the element name from columns 77, 78.
Read the formal charge from columns 79, 80.
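The column positions listed above translate directly into string slices. The sketch below is an illustration only: the function name read_format2_fields is hypothetical, and the sample line is constructed for the demonstration rather than taken from a real PDB entry. Column numbers in the text are 1-based, so columns 73-76 correspond to the slice [72:76].

```python
def read_format2_fields(line):
    """Extract the segment identifier, element name, and formal charge."""
    padded = line.rstrip("\n").ljust(80)  # short lines yield empty fields
    return {
        "segment": padded[72:76].strip(),        # columns 73-76
        "element": padded[76:78].strip(),        # columns 77, 78
        "formal_charge": padded[78:80].strip(),  # columns 79, 80
    }


# Illustrative line: the first 72 columns are padded, then the new fields follow.
line = "ATOM      1  N   ALA A   1".ljust(72) + "SEG1" + " N" + "1+"
print(read_format2_fields(line))
```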

Writing PDB Files

Write out a REMARK line that says it is FORMAT 2.0.
Write out the element name in columns 77, 78.
Write out the formal charge in columns 79, 80.
No segment identifiers are written out.

X-PLOR Coordinate File (.pdbx)

The .pdbx file contains the Cartesian coordinates of a molecular system, suitable for input to X-PLOR. Refer to Chapter 6 of X-PLOR, Version 3.1, A System for X-ray Crystallography and NMR (Axel Brunger, Yale Univ. Press, 1992) for a complete description and usage of this file format.

The .pdbx file is essentially similar to the Brookhaven Protein Data Bank (PDB) format, with the following differences:

X-PLOR does not use chain identifier information. Instead it uses the characters in columns 73-76 for the segment name.
The insertion character is treated as part of the residue number. The residue number is a string consisting of a maximum of four characters.
X-PLOR ignores any reference to atom numbers and generates its own numbering scheme.
The REMARK record of PDB files is treated as a title record.
No other type of PDB specification, such as HETAT, SCALE, or SEQU, is interpreted at present. These additional records have to be removed before one reads PDB coordinates with X-PLOR.
The PDB convention requires an END statement at the end of the coordinate file. X-PLOR uses the same convention.
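Removing the uninterpreted records can be automated. The sketch below is one possible approach, not a tool supplied with Insight or X-PLOR: the function name strip_for_xplor is hypothetical, and the choice to keep only ATOM, REMARK, and END records is an assumption based solely on the record types mentioned in the list above.

```python
# ASSUMPTION: only these record types are wanted in the X-PLOR input.
KEPT_RECORDS = {"ATOM", "REMARK", "END"}


def strip_for_xplor(lines):
    """Drop records X-PLOR does not interpret; make sure END closes the file."""
    kept = []
    for raw in lines:
        line = raw.rstrip("\n")
        # The record name occupies the first six columns of a PDB line.
        if line[:6].strip() in KEPT_RECORDS:
            kept.append(line)
    if not kept or kept[-1][:6].strip() != "END":
        kept.append("END")
    return kept


pdb = ["REMARK example title",
       "HETATM    1  O   HOH     1",
       "ATOM      2  N   ALA     1",
       "SCALE1      1.000000"]
print("\n".join(strip_for_xplor(pdb)))
```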

NMR Peak Intensity/Integral (.pks)

Description of Sections

Table 15 . Sections of .pks File
Section	Example	Description
Header	!BIOSYM nmr_peak_intensities 2
Data
mixing times	<float1> <float2>...<floatN>	mixing times associated with peak intensities
peak intensities	<pkID> <floatw2> <floatw1> <floatlw2> <floatlw1> <float1> <float2> ... <floatN>	measured peak intensities

Table 16 . Variables in .pks File
Variable	Description
<floati>	experimental mixing time
<pkID>	peak specification (integer > 0)
<floatw2>	peak position in spectrum along w2 axis in ppm
<floatw1>	peak position in spectrum along w1 axis in ppm
<floatlw2>	line width along w2 axis
<floatlw1>	line width along w1 axis
<floatN>	peak intensity corresponding to N-th mixing time

Rules:

The number of peak intensity values is equal to the number of mixing times.
Mixing time values are separated by white space (space, tab, or new line).
Peak intensity information must be entered on a single line.
For Felix to read the line, the peaks must occur in order of increasing peak ID.
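These rules can be checked while reading a file. The following is a minimal sketch, assuming the section headers #mixing_times and #peak_intensities used in the sample file and treating lines that begin with ! as comments. The function name parse_pks and the dictionary keys are hypothetical.

```python
def parse_pks(lines):
    """Return (mixing_times, peaks), enforcing the rules listed above."""
    section = None
    mixing_times, peaks = [], []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # blank line, header line, or comment
        if line.startswith("#"):
            section = line[1:]
            continue
        fields = line.split()
        if section == "mixing_times":
            # Mixing times may be spread over several lines.
            mixing_times.extend(float(x) for x in fields)
        elif section == "peak_intensities":
            peak_id = int(fields[0])
            values = [float(x) for x in fields[1:]]
            if peak_id <= 0:
                raise ValueError("peak ID must be a positive integer")
            if len(values) != 4 + len(mixing_times):
                raise ValueError("peak %d: one intensity per mixing time" % peak_id)
            if peaks and peak_id <= peaks[-1]["id"]:
                raise ValueError("peak IDs must be in increasing order")
            peaks.append({"id": peak_id, "w2": values[0], "w1": values[1],
                          "lw2": values[2], "lw1": values[3],
                          "intensities": values[4:]})
    return mixing_times, peaks


sample = ["!BIOSYM nmr_peak_intensities 2", "#mixing_times",
          "2.000000E-02 4.000000E-02 8.000000E-02 1.200000E-01",
          "#peak_intensities",
          " 120 5.315 2.091   8.575    17.070    4.0410E+05 2.3580E+05 6.4820E+05 4.1640E+05",
          " 121 5.256 3.133   8.575    17.790    1.9220E+06 2.6670E+06 3.3220E+06 3.8150E+06"]
times, peaks = parse_pks(sample)
print(times, [p["id"] for p in peaks])
```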

Sample .pks File


!BIOSYM nmr_peak_intensities 2

!

#mixing_times

2.000000E-02 4.000000E-02 8.000000E-02 1.200000E-01

!

#peak_intensities

!Peak W2_Pos W1_Pos LineWdth2 LineWdth1                 Intensities

!

 120 5.315 2.091   8.575    17.070    4.0410E+05 2.3580E+05 6.4820E+05 4.1640E+05

 121 5.256 3.133   8.575    17.790    1.9220E+06 2.6670E+06 3.3220E+06 3.8150E+06

 122 5.123 1.009   8.575    17.910    8.4710E+06 1.1300E+07 2.3800E+07 3.0970E+07

 123 4.324 2.344  10.020    17.600    5.8450E+05 2.0510E+06 4.7590E+06 7.6090E+06

 124 4.081 4.400   9.328    16.770    3.3020E+05 -1.3270E+05 2.8710E+05 1.2310E+06

 125 4.006 1.212   8.575    16.890   -6.1340E+05 7.5890E+05 1.3390E+06 2.7490E+06

 126 3.336 1.003   9.005    17.560   -8.9400E+04 -7.3900E+05 2.0200E+06 2.5500E+06

 127 4.081 8.570   8.575    16.890    4.7400E+06 1.2700E+07 2.3900E+07 3.1900E+07

 128 5.235 1.789   9.320    25.200    4.6700E+05 7.3400E+05 9.7000E+05 2.0400E+06

 129 2.323 0.763   8.575    17.070    2.4600E+06 4.4700E+06 6.6900E+06 9.0400E+06

 130 3.456 1.234   8.575    17.790   -1.1000E+05 -4.1000E+04 3.1600E+05 1.3900E+06

 131 5.289 2.569   8.575    17.910    3.16000E+05 6.6000E+06 1.5400E+07 2.1000E+07

 132 1.312 1.132  10.020    17.600    5.3800E+06 8.3500E+06 1.1390E+07 1.6870E+07

 133 1.789 1.766   9.328    16.770    3.3800E+06 8.3500E+06 1.1390E+07 1.6870E+07

 134 3.232 2.737   8.575    16.890    9.4200E+05 2.6000E+06 8.6100E+06 1.2400E+07

Pseudoatom Library (.plb)

There are two pseudoatom library files, cvffa.plb and amber.plb, which contain the pseudoatoms corresponding to the monomers used in the cvffa and amber forcefields, respectively. These pseudoatom library files are located in the directory referred to by the $BIOSYM_LIBRARY environment variable.

Description of Sections

Table 17 . Sections in .plb File
Section	Example	Description
Header	!BIOSYM pseudo_atom_library 1
Data
plb_entry		list of pseudoatoms for each monomer
	<pseudoatom_name> <atom_name1> <atom_name2> ... <atom_nameN> Type <pseudo_type> <prochirality> {Center: <center_name> Reference: <ref_name>}

Table 18 . Variables in .plb File
Variable	Description
<pseudoatom_name>	name assigned to pseudoatom
<atom_name1> ... <atom_nameN>	names of atoms which make up pseudoatom
Type <pseudo_type>	pseudoatom type (used for bound corrections). Valid types include CH2, CH3, 2CH3, ArH2, 2ArH2, NH2, 2NH2, NH3
<prochirality>	Prochiral or Not_Prochiral. If prochirality = Prochiral, the optional prochiral fields below are required
<center_name>	name of the atom which is the prochiral center
<ref_name>	name of the atom bonded to the prochiral center which is on the path to the pseudoatom

Each plb_entry begins with a line naming the monomer and consists of a list of pseudoatom entries. Each pseudoatom entry contains four lines with the above information and is followed by a line with the "!" character in the first column.

Sample .plb File


!BIOSYM pseudo_atom_library 1

!

! CVFF

!

#plb_entry

ALAN                    Alanine, positive N-terminus

	HNX

	HN1 HN2 HN3

	Type NH3

	Not_Prochiral

!

	HBX

	HB1 HB2 HB3

	Type CH3

	Not_Prochiral

!

!

#plb_entry

ALA                     Alanine, polypeptide residue

	HBX

	HB1 HB2 HB3

	Type CH3

	Not_Prochiral

!

!

	.

	.

	.

#plb_entry

LEUN                    Leucine, positive N-terminus

	HNX

	HN1 HN2 HN3

	Type NH3

	Not_Prochiral

!

	HBX

	HB1 HB2

	Type CH2

	Not_Prochiral

!

	HD1X

	HD11 HD12 HD13

	Type CH3

	Prochiral Center: CG Reference: CD1

!

	HD2X

	HD21 HD22 HD23

	Type CH3

	Prochiral Center: CG Reference: CD2

!

	HDX

	HD11 HD12 HD13 HD21 HD22 HD23

	Type 2CH3

	Not_Prochiral

!

!

	.

	.

	.
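The layout shown in the sample can be read with a few lines of Python. The following is a minimal sketch, not part of the Biosym software: the names `Pseudoatom` and `parse_plb` are hypothetical, and it assumes that every pseudoatom entry has exactly four non-blank lines, that the monomer name is the first word on the line after `#plb_entry`, and that blank lines and lone "." lines can be skipped.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Pseudoatom:
    name: str
    atoms: List[str]
    ptype: str
    prochiral: bool
    center: Optional[str] = None      # only set for Prochiral entries
    reference: Optional[str] = None   # only set for Prochiral entries


def parse_plb(text: str) -> Dict[str, List[Pseudoatom]]:
    """Return {monomer_name: [Pseudoatom, ...]} (hypothetical helper)."""
    library: Dict[str, List[Pseudoatom]] = {}
    monomer = None
    expect_monomer = False
    pending: List[str] = []           # lines of the entry being collected
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == ".":
            continue                  # blank line or elision marker
        if line.startswith("#plb_entry"):
            expect_monomer, pending = True, []
        elif line.startswith("!"):
            pending = []              # header, comment, or entry separator
        elif expect_monomer:
            monomer = line.split()[0]
            library[monomer] = []
            expect_monomer = False
        elif monomer is not None:
            pending.append(line)
            if len(pending) == 4:
                name, atoms, type_line, chirality = pending
                fields = chirality.split()
                prochiral = fields[0] == "Prochiral"
                library[monomer].append(Pseudoatom(
                    name=name,
                    atoms=atoms.split(),
                    ptype=type_line.split()[1],
                    prochiral=prochiral,
                    center=fields[2] if prochiral else None,
                    reference=fields[4] if prochiral else None,
                ))
                pending = []
    return library


SAMPLE = """\
!BIOSYM pseudo_atom_library 1
#plb_entry
LEUN                    Leucine, positive N-terminus
\tHBX
\tHB1 HB2
\tType CH2
\tNot_Prochiral
!
\tHD1X
\tHD11 HD12 HD13
\tType CH3
\tProchiral Center: CG Reference: CD1
!
"""
library = parse_plb(SAMPLE)
print(library["LEUN"][1])
```

Keeping the prochiral center and reference as optional fields mirrors the braces around those fields in the syntax line of Table 17.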

Proton Chemical Shifts (.ppm)

Description of Sections

Table 19 . Sections in .ppm File
Section	Example	Description
Header	!BIOSYM nmr_chemical_shifts 1
chemical shifts	<atom_spec> <float1> <float2> <float3>	Chemical shift information

Table 20 . Variables in .ppm File
Variable	Description
<atom_spec>	reference to a hydrogen or pseudoatom
<float1>	chemical shift in ppm
<float2>	T1 leakage rate
<float3>	Line width in Hz

Rule:

Chemical shifts may be stereo-specifically assigned using the labeling conventions described at the beginning of this document. For example, the proR HB of ASN may be assigned a chemical shift of 1.15 via the following entry:


1:ASN_1:HBR   1.1500 0.0000 0.0000

Sample .ppm File


!BIOSYM nmr_chemical_shifts 1

#chemical_shifts

! Atom Spec              PPM    T1 Leak   Line Width

!

1:SERN_1:HA              4.390     1.000    20.000

1:SERN_1:HB*             4.080     1.000    20.000

1:SERN_1:HG              1.100     1.000    20.000

1:ASN_2:HN               8.570     1.000    20.000

1:ASN_2:HA               4.550     1.000    20.000

1:ASN_2:HB2              2.740     1.000    20.000

1:ASN_2:HB1              3.230     1.000    20.000

1:ASN_2:HD2*             6.710     1.000    20.000

1:PHE_3:HN               9.300     1.000    20.000

1:PHE_3:HA               4.040     1.000    20.000

1:PHE_3:HB*              3.570     1.000    20.000

1:PHE_3:HD*              7.230     1.000    20.000

1:PHE_3:HE*              7.390     1.000    20.000

1:PHE_3:HZ               7.270     1.000    20.000

...

1:AR+C_7:HD*             2.405     1.000    20.000
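As an illustration of the four-field layout, the sketch below reads chemical shift lines into a dictionary keyed by atom specification. The function name `parse_ppm` is hypothetical, and the code assumes that the atom specification contains no spaces and that any line without exactly four fields (such as the "..." elision above) can be skipped.

```python
from typing import Dict, Tuple


def parse_ppm(text: str) -> Dict[str, Tuple[float, float, float]]:
    """Map atom_spec -> (shift in ppm, T1 leakage rate, line width in Hz)."""
    shifts = {}
    for raw in text.splitlines():
        line = raw.strip()
        # "!" lines are the header or comments; "#" introduces the section.
        if not line or line[0] in "!#":
            continue
        fields = line.split()
        if len(fields) != 4:
            continue                  # not a data line (assumption)
        spec, ppm, t1_leak, width = fields
        shifts[spec] = (float(ppm), float(t1_leak), float(width))
    return shifts


SAMPLE = """\
!BIOSYM nmr_chemical_shifts 1
#chemical_shifts
! Atom Spec              PPM    T1 Leak   Line Width
!
1:SERN_1:HA              4.390     1.000    20.000
1:SERN_1:HB*             4.080     1.000    20.000
...
1:AR+C_7:HD*             2.405     1.000    20.000
"""
shifts = parse_ppm(SAMPLE)
print(shifts["1:SERN_1:HB*"])
```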

Dynamics Scratch File (.pre)

The dynamics scratch file is used as a buffer file for temporary storage of the thermodynamic state table generated during constant-pressure dynamics. This table is appended to the .out file after the energy table.

Protein Bond Angle Table (pro_angle.dat)

Much of the following information is taken straight from the files themselves and serves to explain the syntax and meaning of the values.

The pro_angle.dat file contains a protein bond angle table consisting of residue-specific bond angles and standard deviations. This information was derived from R. A. Engh and R. Huber (Acta. Cryst., A47, 292-300, 1991).

Later lines take precedence over earlier ones with atoms of the same names.
The atom name in column 1 is associated with the residue name in column 4, column 2 is associated with column 5, and so on.
The only wildcarding allowed is a single * character; this implies a match with any residue name.
A zero entry implies that a specific bond is not present in the database and will not be checked.

For example, the following line describes a C-N-CA bond angle (where the carbonyl carbon may be in any residue while the nitrogen and alpha carbon atoms are in a glycine) as having a mean bond angle of 120.6° with a standard deviation of 1.7°:


C       N       CA      *       GLY     GLY     120.6   1.7

Description of Sections

Table 21 . Sections in pro_angle.dat File
Section	Example	Description
Comment	!File rules	! character implies comment
Data	<atom1> <atom2> <atom3> <residue1> <residue2> <residue3> <mean> <std_devn>	Atom and residue IDs, bond angle mean value, and standard deviation

Table 22 . Variables in pro_angle.dat File
Variable	Description
<atom1>	Valid atom name
<atom2>	Valid atom name
<atom3>	Valid atom name
<residue1>	Valid residue name
<residue2>	Valid residue name
<residue3>	Valid residue name
<mean>	Bond angle mean value in degrees
<std_devn>	Bond angle standard deviation in degrees

Sample pro_angle.dat File


! Created Sept 7 1994. 

! Residue Specific bond Angles and standard deviations

! Information derived from 

! R.A.Engh and R.Huber, Acta. Cryst., A47 292-300 (1991),

! File rules

! 1) Later lines take precedence over earlier ones with

! 	atoms of same names.

! 2) Atom name in column 1 is associated with residue name 

! 	in column 4, 2 with 5  etc.  

! 3) Only wildcarding allowed is a single * character

!	this implies a match with any residue name

! 4) A zero entry implies this specific bond not present

!	in the data base and will not be checked. 

C	N	CA 	*	*	*	121.7	1.8

C	N	CA	*	GLY	GLY	120.6	1.7

C	N	CA	*	PRO	PRO	122.6	5.0

CA	C	N	*	*	*	116.2	2.0

CA	C	N	GLY	GLY	*	116.4	2.1

CA	C	N	*	*	PRO	116.9	1.5

CA	C	N	GLY	GLY	PRO	118.2	2.1

CA	C	O	*	*	*	120.8	1.7

CA	C	O	GLY	GLY	GLY	120.8	2.1

CB	CA	C	*	*	*	110.1	1.9

CB	CA	C	ALA	ALA	ALA	110.5	1.5

CB	CA	C	ILE	ILE	ILE	109.1	2.2

CB	CA	C	THR	THR	THR	109.1	2.2

CB	CA	C	VAL	VAL	VAL	109.1	2.2

N	CA	C	*	*	*	111.2	2.8

N 	CA	C	*	GLY	GLY	112.5	2.9

N	CA	C	PRO	*	*	111.8	2.5

N	CA	C	PRO	GLY	GLY	0.0	0.0

N	CA	CB	*	*	*	110.5	1.7

N	CA	CB	ILE	ILE	ILE	111.5	1.7

N	CA	CB	THR	THR	THR	111.5	1.7

N	CA	CB	VAL	VAL	VAL	111.5	1.7

N	CA	CB	ALA	ALA	ALA	110.4	1.5

N	CA	CB	PRO	PRO	PRO	103.0	1.1 

O	C	N	*	*	*	123.0	1.6

O	C	N	*	*	PRO	122.0	1.4
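The file rules can be illustrated with a small lookup routine. This is a sketch under stated assumptions rather than the algorithm used by the software: the function names are hypothetical, "later lines take precedence" is read as "the last matching line wins", and a zero mean is read as "no value, do not check".

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple[Tuple[str, ...], Tuple[str, ...], float, float]


def parse_angle_table(text: str) -> List[Row]:
    """Read data lines as (atom names, residue names, mean, std_devn)."""
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue                          # blank line or comment
        f = line.split()
        rows.append((tuple(f[0:3]), tuple(f[3:6]), float(f[6]), float(f[7])))
    return rows


def lookup_angle(rows: List[Row], atoms: Sequence[str],
                 residues: Sequence[str]) -> Optional[Tuple[float, float]]:
    """Return (mean, std_devn) for the angle, or None if not checked."""
    found = None
    for row_atoms, row_res, mean, sd in rows:
        if row_atoms != tuple(atoms):
            continue
        # Column 4 pairs with atom 1, column 5 with atom 2, and so on.
        if all(p == "*" or p == r for p, r in zip(row_res, residues)):
            found = (mean, sd)                # later lines override earlier
    if found is None or found[0] == 0.0:
        return None                           # zero entry: angle not checked
    return found


SAMPLE = """\
! File rules
C	N	CA 	*	*	*	121.7	1.8
C	N	CA	*	GLY	GLY	120.6	1.7
N	CA	C	*	*	*	111.2	2.8
N	CA	C	PRO	GLY	GLY	0.0	0.0
"""
rows = parse_angle_table(SAMPLE)
print(lookup_angle(rows, ("C", "N", "CA"), ("ALA", "GLY", "GLY")))
```

With these rows, a C-N-CA angle that ends in a glycine picks up the glycine-specific line, while any other residue combination falls back to the all-wildcard line.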

Protein Bond Length Table (pro_bond.dat)

Much of the following information is taken straight from the files themselves and serves to explain the syntax and meaning of the values.

The pro_bond.dat file contains a protein bond length table consisting of residue-specific bond lengths and standard deviations. This information was derived from
R. A. Engh and R. Huber (Acta. Cryst., A47, 292-300, 1991).

Later lines take precedence over earlier ones with atoms of the same names.
The atom name in the first column is associated with the residue name in the third column.
The only wildcarding allowed is a single * character; this implies a match with any residue name.

For example, the following line states that the bond between a glycine nitrogen and an alpha carbon has a mean value of 1.451 Å, with a standard deviation of 0.016 Å:


N       CA      GLY     1.451   0.016

Description of Sections

Table 23 . Sections in pro_bond.dat File
Section	Example	Description
Comment	!File rules	! character implies comment
Data	<atom1> <atom2> <residue> <mean> <std_devn>	Atom names, residue name, bond length mean value, and standard deviation

Table 24 . Variables in pro_bond.dat File
Variable	Description
<atom1>	Valid atom name
<atom2>	Valid atom name
<residue>	Valid residue name
<mean>	Bond length mean value in Å
<std_devn>	Bond length standard deviation in Å

Sample pro_bond.dat File


! Created Sept 7 1994. 

! Residue Specific bond lengths and standard deviations

! Information derived from 

! R.A.Engh and R.Huber, Acta. Cryst., A47 292-300 (1991),

! File rules

! 1) Later lines take precedence over earlier ones with

! 	atoms of same names.

! 2) Atom name in first column is associated with residue name 

! 	in third column.  

! 3) Only wildcarding allowed is a single * character

!	this implies a match with any residue name

CA	C	*	1.525	0.021

CA	C 	GLY	1.516	0.018

C	O 	* 	1.231	0.020

CB	CA	*	1.530	0.020

CB	CA	ALA	1.521	0.033

CB	CA	ILE	1.540	0.027

CB 	CA	THR	1.540	0.027

CB	CA	VAL	1.540	0.027

N	CA	*	1.458	0.019

N	CA	PRO	1.466	0.015

N	CA	GLY	1.451	0.016

N 	C 	*	1.329	0.014

N 	C 	PRO 	1.341	0.016

SG	SG	CYS	2.000	0.100
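One plausible use of a mean and standard deviation table is to express an observed bond length as a number of standard deviations from the tabulated mean. The sketch below shows that arithmetic only; the function names are hypothetical, the last matching line is assumed to win, and nothing here is taken from how the software itself applies the table.

```python
from typing import List, Optional, Tuple

BondRow = Tuple[str, str, str, float, float]


def load_bond_table(text: str) -> List[BondRow]:
    """Read data lines as (atom1, atom2, residue, mean, std_devn)."""
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("!"):
            a1, a2, res, mean, sd = line.split()
            rows.append((a1, a2, res, float(mean), float(sd)))
    return rows


def bond_stats(rows: List[BondRow], atom1: str, atom2: str,
               residue: str) -> Optional[Tuple[float, float]]:
    """Return (mean, std_devn) in angstroms from the last matching line."""
    match = None
    for a1, a2, res, mean, sd in rows:
        if (a1, a2) == (atom1, atom2) and res in ("*", residue):
            match = (mean, sd)
    return match


def deviation_in_sd(rows: List[BondRow], atom1: str, atom2: str,
                    residue: str, observed: float) -> Optional[float]:
    """(observed - mean) / std_devn, or None if the bond is not tabulated."""
    stats = bond_stats(rows, atom1, atom2, residue)
    if stats is None:
        return None
    mean, sd = stats
    return (observed - mean) / sd


SAMPLE = """\
N	CA	*	1.458	0.019
N	CA	PRO	1.466	0.015
N	CA	GLY	1.451	0.016
"""
table = load_bond_table(SAMPLE)
print(deviation_in_sd(table, "N", "CA", "GLY", 1.483))
```

For the glycine line, an observed N-CA length of 1.483 Å is (1.483 - 1.451) / 0.016, or about two standard deviations above the mean.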

NMR Project (.proj)

The .proj file contains a history of the refinement steps performed on a given molecular system along with the input files used for each step. Each new step is appended to the end of the file so it may be used as a running account of the structure refinement process. Note that since this process may take place over a long period of time, each step in the file begins with a project header line which contains the version of the NMR software used.

Description of Sections

Table 25 . Sections in .proj File
Section	Example	Description
Data		One or more sections (separated by a blank line) containing the following information:
	!BIOSYM project 1 <timestamp>
	<run_description>
	<comment>
	Project files written: <file1> <file2> ... <filen>
	{<RUN> files written: <run_file1> <run_file2> ... <run_filen>}

Table 26 . Variables in .proj File
Variable	Description
<timestamp>	day_of_week, month, day, HH:MM:SS, year
<run_description>	description of the type of run, molecule name, etc.
<comment>	comment for the project update conducted at the given time stamp
<file1>, <file2>, ... <filen>	NMR database files used in the current step (e.g., file.rstrnt, file.ppm, file.asn, file.pks, etc.)
<RUN>	Run type (e.g., RMA, DGII)
<run_file1>, <run_file2>, ... <run_filen>	List of specific input files created for use in the given run

Sample .proj File


!BIOSYM project 1

Mon Nov 4 11:04:14 1991

RMA Run: test Molecule=CRAM7AVG NMR_project=test.

Test of average structure.

Project files written:

cram7avg.ppm cram7avg.pks cram7avg.asn cram7avg.rstrnt

RMA files written:

test.rmainp test.mdh test.shift test.rma_temp test.rstrnt_temp

Updated RMA files are test_01.rma and test_01.rstrnt



!BIOSYM project 1

Mon Nov 4 11:58:29 1991

RMA Run: test Molecule=CRAM7AVG NMR_project=test.

Test of average structure.

Project files written:

cram7avg.ppm cram7avg.pks cram7avg.asn cram7avg.rstrnt

RMA files written:

test.rmainp test.mdh test.shift test.rma_temp test.rstrnt_temp

Updated RMA files are test_02.rma and test_02.rstrnt
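Because each step starts with a project header line, a .proj file can be split into its refinement steps with very little code. The sketch below is illustrative only: `split_steps` and `files_written` are hypothetical names, and the code assumes that the file list appears either on the same line as the "files written:" label (as in Table 25) or on the next non-blank line (as in the sample).

```python
from typing import List


def split_steps(text: str) -> List[List[str]]:
    """Group the non-blank lines of a .proj file by project header line."""
    steps: List[List[str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("!BIOSYM project"):
            steps.append([line])              # a new step begins here
        elif line and steps:
            steps[-1].append(line)
    return steps


def files_written(step: List[str], label: str) -> List[str]:
    """Return the file names listed after '<label> files written:'."""
    prefix = f"{label} files written:"
    for i, line in enumerate(step):
        if line.startswith(prefix):
            names = line[len(prefix):].split()
            if not names and i + 1 < len(step):
                names = step[i + 1].split()   # list is on the following line
            return names
    return []


SAMPLE = """\
!BIOSYM project 1
Mon Nov 4 11:04:14 1991
RMA Run: test Molecule=CRAM7AVG NMR_project=test.
Test of average structure.
Project files written:
cram7avg.ppm cram7avg.pks cram7avg.asn cram7avg.rstrnt
RMA files written:
test.rmainp test.mdh test.shift test.rma_temp test.rstrnt_temp
Updated RMA files are test_01.rma and test_01.rstrnt

!BIOSYM project 1
Mon Nov 4 11:58:29 1991
RMA Run: test Molecule=CRAM7AVG NMR_project=test.
"""
steps = split_steps(SAMPLE)
print(len(steps), files_written(steps[0], "RMA"))
```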

Insight Protein Miscellaneous Properties Table (pro_misc.dat)

Much of the following information is taken straight from the files themselves and serves to explain the syntax and meaning of the values.

The pro_misc.dat file contains a table of miscellaneous protein properties. This information was derived from J. Thornton and co-workers (J. Appl. Cryst., 26, 283-291, 1993), R. A. Engh and R. Huber (Acta. Cryst., A47, 292-300, 1991), and from M. Macarthur (private communication).

The first word on a line is a property keyword, the second word is a property mean value, and the third word is a property standard deviation.
Values of mean and standard deviations can be modified by you, but names of keywords cannot be changed.
Unrecognized keywords are ignored.
Kabsch-Sander H-bond energies are given in Kcal.

For example, the following line describes the omega torsion angle as having a mean value of 180.0°, with a standard deviation of 5.8°:


OMEGA                   180.0   5.8

Description of Sections

Table 27 . Sections in pro_misc.dat File
Section	Example	Description
Comment	!File rules	! character implies comment
Data	<keyword> <mean> <std_devn>	Property name, mean value, and standard deviation

Table 28 . Variables in pro_misc.dat File
Variable	Description
<keyword>	Name of per-residue property
<mean>	Property mean value
<std_devn>	Property standard deviation

The following keywords are recognized in this release:

Table 29 . Keywords in pro_misc.dat File
Keyword	Description
CHI_1_RANGE_1, CHI_1_RANGE_2, CHI_1_RANGE_3	Three ranges for chi1 torsion
CHI_2_RANGE_1, CHI_2_RANGE_2, CHI_2_RANGE_3	Three ranges for chi2 torsion
PROLINE_PHI	Proline specific phi torsion
HELIX_PHI, HELIX_PSI	Phi, Psi angles in a helix found by Kabsch-Sander method
CHI_3_SS_RANGE_1, CHI_3_SS_RANGE_2	Two allowed ranges for disulfide bond torsion
OMEGA	Peptide bond torsion angle
E_H_BOND_KS	H-bond donor energy found by Kabsch-Sander method
CA_VIRTUAL_TORSION	Alpha carbon virtual torsion CA-N-C-CB

Sample pro_misc.dat File


! Created Sept 15 1994. 

! Contains information on miscellaneous protein properties. 

! Information derived from 

! J. Thornton and co workers J. Appl. Cryst. vol 26, 283-291 (1993)

! R.A.Engh and R.Huber, Acta. Cryst., A47 292-300 (1991),

! M.Macarthur private communication.

! File rules

! 1) First word on line is property keyword  

! 	second word is property mean value

!	third word is property standard deviation	

! 2) Values of mean and standard deviation can be modified by user

!	names of keywords cannot be changed 

! 3) Unrecognised keywords ignored.

! 	  

! Kabsch+Sander H-Bond energies in Kcal

CHI_1_RANGE_1  			64.1	15.7

CHI_1_RANGE_2			183.6	16.8

CHI_1_RANGE_3			-66.7	15.0

CHI_2_RANGE_1			68.7	21.3	

CHI_2_RANGE_2			177.5	19.4	

CHI_2_RANGE_3			-71.8	21.1	

PROLINE_PHI			-65.4	11.2

HELIX_PHI			-65.3	11.9

HELIX_PSI			-39.4	11.3

CHI_3_SS_RANGE_1			96.8	14.8

CHI_3_SS_RANGE_2			-85.8	10.7

OMEGA			180.0	5.8

E_H_BOND_KS			-2.02	0.75

CA_VIRTUAL_TORSION			33.9	3.5
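The keyword rules can be sketched as a small reader that keeps only the keywords of Table 29 and silently ignores anything else. The function name `parse_misc` and the constant `RECOGNIZED` are hypothetical; the code assumes every data line has exactly three whitespace-separated words.

```python
from typing import Dict, Tuple

# Keywords listed in Table 29.
RECOGNIZED = {
    "CHI_1_RANGE_1", "CHI_1_RANGE_2", "CHI_1_RANGE_3",
    "CHI_2_RANGE_1", "CHI_2_RANGE_2", "CHI_2_RANGE_3",
    "PROLINE_PHI", "HELIX_PHI", "HELIX_PSI",
    "CHI_3_SS_RANGE_1", "CHI_3_SS_RANGE_2",
    "OMEGA", "E_H_BOND_KS", "CA_VIRTUAL_TORSION",
}


def parse_misc(text: str) -> Dict[str, Tuple[float, float]]:
    """Map keyword -> (mean, std_devn), skipping unrecognized keywords."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue                      # blank line or comment
        fields = line.split()
        if len(fields) != 3 or fields[0] not in RECOGNIZED:
            continue                      # unrecognized keywords are ignored
        props[fields[0]] = (float(fields[1]), float(fields[2]))
    return props


SAMPLE = """\
! Kabsch+Sander H-Bond energies in Kcal
OMEGA			180.0	5.8
E_H_BOND_KS			-2.02	0.75
MY_CUSTOM_KEY			1.0	2.0
"""
props = parse_misc(SAMPLE)
print(props["OMEGA"])
```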

X-PLOR Molecular Structure File (.psf)

The .psf file contains information about the molecular structure. This file is created by the WRITe STRUcture statement in X-PLOR and is suitable for input using the STRUcture statement in X-PLOR. The contents of this file consist of atom names, types, charges, and masses; residue names and segment names; and a list of bond terms, angle terms, dihedral terms, improper terms, explicit hydrogen-bonding terms, explicit nonbonded exclusions, and nonbonded group partitions. It does not contain atomic coordinates, parameters, constraints, restraints, or any other information that is specific to effective energy terms, such as diffraction data.

Refer to Chapter 3 of X-PLOR, Version 3.1, A System for X-ray Crystallography and NMR (Axel Brunger, Yale Univ. Press, 1992) for the description and usage of this file format.

Residue Library (.rlb)

The residue libraries have been created for use in assigning potential function atom types and partial charges to peptides, proteins, and nucleic acids. They contain experimental data for the twenty standard amino acid residues, and for other selected residues.

Three standard residue libraries are provided:

Consistent valence amino acids, the default (cvffa.rlb)
AMBER DNA and RNA nucleic acids plus amino acids (amber.rlb)
Consistent valence amino acids, for use with the potential energy function consortium CFF91 forcefield.

This section is intended to help you understand the structure of these files and to aid you in using the library to prepare molecules for molecular mechanics simulations.

Note: A complete description of the residue library is given below. However, only the potential function atom type, partial charge, charge group, and named torsion fields are used by Insight II. All other numeric fields may be set to 0 for use with Insight II. Insight II uses the geometry, topology, and bond order information found in the fragment libraries for all building functions.

The residue library provides connectivity information by specifying the parent of each and every atom found in the molecule. For each atom, the parent of the atom is uniquely specified, the bond order is given, and if the atom is involved in a ring closure then the ring closure atom is also specified. If the parent of a given atom is found in the current residue then the name of that parent atom is given. If the parent is in the preceding residue then the bond order to the parent atom is followed by an asterisk, *, to indicate that the parent is located in the previous residue.

The three geometrical parameters that are provided for each atom are:

The distance of the current atom from its parent atom,
The angle subtended by the current atom, its parent, and its grandparent, and
The torsional or dihedral angle subtended by the atom, its parent, its grandparent, and its great-grandparent.

In addition, specific torsion angles can be given names in the residue library. Each torsion name represents a torsion angle between four specific atoms.

The next two fields contain two flags:

A side chain flag.
An out-of-plane flag.

The side chain flag is used to indicate whether a named torsion is a side chain torsion or a backbone torsion (in peptides). If it is a side chain torsion, then the side chain flag is set to 1 or 2; otherwise, the flag is set to 0. The out-of-plane flag is used to indicate whether a given atom is a central atom of an out-of-plane group. If it is a central atom of an out-of-plane group, then the out-of-plane flag is set to 1; otherwise, the flag is set to 0. This information is used in locating and assigning internal coordinates and their associated potential function parameters.

The next two items specified for each atom in the residue library are:

The potential function atom type.
The partial atomic charge.

These atom types are used at run time to select parameters for the internal coordinates from the associated potential function parameter library, cvff.frc, amber.frc, or cff91.frc. The partial atomic charge is used directly in the calculations of the Coulombic contribution to the nonbond potential energy.

The final two fields are used to identify the charge group to which the atom belongs and whether or not this atom is the switching atom (i.e., the atom used to decide whether this charge group is within the cutoff distance for nonbond calculations). A portion of Insight's residue library is given as an illustration in Portion of Insight's Residue Library. (Note that the first two lines shown below are used merely to label the column numbers; they are not part of the library.) The residue libraries are found in the directory pointed to by the environment variable $BIOSYM_LIBRARY.

Portion of Insight's Residue Library

(This excerpt does not include the beginning of the file.)
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890


SD   CG   1.0           1.740   111.000  195.000 chi2 1 0 s    0.1200 csc  1   

CE   SD   1.0           1.670   101.000  194.000 chi3 1 0 c3  -0.3200 csc  0   

HE1  CE   1.0           1.080   110.000  300.000      0 0 h    0.1000 csc  0   

HE2  CE   1.0           1.080   110.000  180.000 chi4 1 0 h    0.1000 csc  0   

HE3  CE   1.0           1.080   110.000   60.000      0 0 h    0.1000 csc  0   

PHE                         20                Phenylalanine, polypeptide residue

N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1   

CA   N    1.0           1.436   123.100  180.000 omeg 0 0 ca   0.1200 pepN 0   

HN   N    1.0           1.080   123.000    0.000      0 0 hn   0.2800 pepN 0   

HA   CA   1.0           1.080   110.000  300.000      0 0 h    0.1000 pepN 0   

C    CA   1.0           1.509   109.600  180.000 phi  0 1 c'   0.3800 pepC 1   

O    C    2.0           1.263   118.100    0.000      0 0 o'  -0.3800 pepC 0   

CB   CA   1.0           1.554   111.600   60.000      0 0 c2  -0.2000 meB  1   

HB1  CB   1.0           1.080   110.000   63.000      0 0 h    0.1000 meB  0   

HB2  CB   1.0           1.080   110.000  183.000      0 0 h    0.1000 meB  0   

CG   CB   1.0           1.472   113.700  303.000 chi1 1 1 cp   0.0000 arG  1   

CD1  CG   1.5           1.376   123.100   87.400 chi2 1 1 cp  -0.1000 arD1 1   

HD1  CD1  1.0           1.080   120.000    0.000      0 0 h    0.1000 arD1 0   

CE1  CD1  1.5           1.368   122.600  180.000      0 1 cp  -0.1000 arE1 1   

HE1  CE1  1.0           1.080   120.000  180.000      0 0 h    0.1000 arE1 0   

CZ   CE1  1.5           1.388   118.900    0.000      0 1 cp  -0.1000 arZ  1   

HZ   CZ   1.0           1.080   120.000  180.000      0 0 h    0.1000 arZ  0   

CE2  CZ   1.5           1.380   120.600    0.000      0 1 cp  -0.1000 arE2 1   

HE2  CE2  1.0           1.080   120.000  180.000      0 0 h    0.1000 arE2 0   

CD2  CE2  1.5  CG   1.5 1.376   118.000    0.000      0 1 cp  -0.1000 arD2 1   

HD2  CD2  1.0           1.080   120.000  180.000      0 0 h    0.1000 arD2 0   

PHEn                        21                Phenylalanine, neutral N-terminus

N    N    1.0           0.000     0.000    0.000      0 0 n2  -0.5000 pepN 1   

CA   N    1.0           1.436     0.000    0.000      0 0 ca   0.1200 pepN 0   

HN1  N    1.0           1.080   123.000    0.000      0 0 hn   0.1400 pepN 0   

HN2  N    1.0           1.080   123.000  180.000      0 0 hn   0.1400 pepN 0   

HA   CA   1.0           1.080   110.000  120.000      0 0 h    0.1000 pepN 0   

C    CA   1.0           1.509   109.600    0.000      0 1 c'   0.3800 pepC 1   

Version Number


123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_ 

         2.2

Starting with Version 2.2 of the Discover program, the first line of the residue library must specify the version number. If a version record is missing, an old format style is assumed. For residue libraries following the format described here, the correct version number is 2.2. The version number must be a floating-point number in columns 10-15.
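A reader can detect the version record by looking at columns 10-15 of the first line. The sketch below uses a hypothetical function name, `library_version`, and assumes that anything that does not parse as a floating-point number in that field means the version record is missing.

```python
from typing import Optional


def library_version(first_line: str) -> Optional[float]:
    """Return the version in columns 10-15, or None for the old format."""
    field = first_line[9:15].strip()      # columns 10-15, 1-indexed
    try:
        return float(field)
    except ValueError:
        return None                       # no version record present


print(library_version(" " * 9 + "2.2"))
```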

Header Card/First Line of a Residue

Columns 1-4, Residue Name

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
PHE                         20                Phenylalanine, polypeptide residue

The 1- to 4-letter abbreviation for a residue (the residue name) is always found at the very beginning of the list of atoms for that residue. The next residue name follows the list of atoms for the previous residue. For example, in the library excerpt above, PHE begins the list of atoms pertaining to phenylalanine, and PHEn follows that list (PHEn begins the list of atoms pertaining to the neutral N-terminus version of phenylalanine). Remember, the naming conventions are completely optional.

Columns 26-30, Number of Atoms

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
PHE                         20                Phenylalanine, polypeptide residue

The number listed in the first line of a residue, in columns 26-30, represents the number of atoms in that residue. For example, PHE in the library excerpt above contains 20 atoms.

Columns 36-45, pK_a values

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
PHE                         20                Phenylalanine, polypeptide residue

The number listed in the first line of a residue, in columns 36-45, represents the pK_a value for that residue. Note that PHE in the library excerpt above has no pK_a value because it has no ionizable protons. The pK_a values are used by Insight II to assign hydrogens by pH.

Columns 47-127, Residue comments

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789..
PHE                         20                Phenylalanine, polypeptide residue

An optional brief description of a residue may be placed in columns 47-132.
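The header card fields can be extracted by slicing on the column ranges given above. This is a minimal sketch with a hypothetical function name, `parse_header_card`; it treats a blank pK_a field as "no value", and the sample line is built from the stated column positions rather than copied from a library file.

```python
from typing import Dict, Optional, Union

Value = Union[str, int, Optional[float]]


def parse_header_card(line: str) -> Dict[str, Value]:
    """Slice a residue header card on its documented column ranges."""
    line = line.ljust(46)                 # pad so every slice is in range
    pka = line[35:45].strip()             # columns 36-45
    return {
        "name": line[0:4].strip(),        # columns 1-4
        "n_atoms": int(line[25:30]),      # columns 26-30
        "pka": float(pka) if pka else None,
        "comment": line[46:].strip(),     # columns 47 onward
    }


# Name in columns 1-4, atom count right-justified in 26-30, comment from 47.
card = f"{'PHE':<25}{20:>5}{'':16}Phenylalanine, polypeptide residue"
header = parse_header_card(card)
print(header)
```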

Atom Cards/Second and Following Lines of a Residue

Columns 1-4, Residue Atoms

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1

The names listed in the second and following lines of a residue define the atoms contained in that residue. The first field (columns 1-4) contains the atom name. This name must be unique within this residue. By convention, the atom name begins with the atomic symbol. Any choice of letters or numbers may be included in the atom name, so long as the total number of characters does not exceed 4.

Columns 6-9, Parent Atoms

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1

Names listed in columns 6-9 are the atom names of this atom's parent (i.e., the name of the atom to which this atom is bonded). The parent atom must be defined before it can be used as a parent, unless the parent exists in a previous residue (see column 15).

Columns 11-13, Bond Order (may be set to 0.0 for Insight II)

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1

The bond order information is used in generating automatic parameters when no explicit parameters are available. Currently the automatic parameter procedures recognize bond orders of 1.0, 1.5, 2.0, or 3.0, which correspond to single, partial double (automatic), double, and triple bonds, respectively.

Column 15, Atom Bonds to Previous Residue

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1

An asterisk * indicates that the parent atom is found in the previous residue. For example, in the first line for the residue PHE, N is bonded to C. The * denotes that the parent atom is not the C in PHE, but rather the C in the preceding residue.

Columns 16-19, Ring Closure Atoms (not required by Insight II)

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
CD2  CE2  1.5  CG   1.5 1.376   118.000    0.000      0 1 cp  -0.1000 arD2 1

Names listed in columns 16-19 define ring closure bonds. For example, in PHE, CG closes the 6-atom ring CG-CD2-CE2-CZ-CE1-CD1-CG. A ring closure atom only designates the atom that closes a ring and does not change other parameters in the atom card.

Columns 21-23, Ring Closure Bond Order (may be set to 0.0 for Insight II)

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
CD2  CE2  1.5  CG   1.5 1.376   118.000    0.000      0 1 CP  -0.1000 arD2 1

A number in this field defines the bond order for ring closing bonds in the same way a number in columns 11-13 defines normal bonds.

Columns 25-29, Bond Distance Parameters (may be set to 0.0 for Insight II)

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1

Numbers that appear in columns 25-29 represent atom-parent bond distance parameters (in angstroms). For example, in the residue PHE, N-C * has a bond distance of 1.348 Å. Other examples, in PHE, include: CA-N = 1.436 Å, C-CA = 1.509 Å, O-C = 1.263 Å, CB-CA = 1.554 Å.

Columns 33-39, Valence Angle Parameters (may be set to 0.0 for Insight II)

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
CA   N    1.0           1.436   123.000  180.000 omeg 0 0 ca   0.1200 pepN 0

Numbers that appear in columns 33-39 represent atom-parent-grandparent bond angles, or valence angles (in degrees). For example, in the residue PHE, CA-N-C * has a valence angle of 123.000°. Other examples, in PHE, include: O-C-CA = 118.100°, HA-CA-N = 110.000°, and HZ-CZ-CE1 = 120.000°.

Columns 42-48, Torsion Angle Parameters (may be set to 0.0 for Insight II)

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
C    CA   1.0           1.509   109.600  180.000 phi  0 1 c'   0.3800 pepC 1

Numbers that appear in columns 42-48 represent atom-parent-grandparent-greatgrandparent dihedral angles, or torsion angles (in degrees). For example, in PHE, C-CA-N-C * has a torsion angle of 180.000°. Other examples, in PHE, include: O-C-CA-N = 0.000°, CG-CB-CA-N = 303.000°, and HB1-CB-CA-N = 63.000°.

Columns 50-53, Torsion Angle Names

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1

Columns 50-53 are reserved for torsion angle names, which correspond to the torsion angle parameters in columns 42-48. Each name represents a torsion angle between four specific atoms.

Column 55, Side Chain Flag (may be set to 0 for Insight II)

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1

Column 55 can have only two values, 0 and 1. This flag has a value of 0 if the named torsion angle is a main chain (backbone) torsion. This flag has a value of 1 if the named torsion angle is a side chain torsion. For example, in PHE, the flag has a value of 0 for the torsion angles corresponding to psi, omeg, and phi, which are main chain torsions. The flag has a value of 1 for the torsion angles corresponding to chi1 and chi2, which are side chain torsions.

Column 57, Out-of-Plane Flag

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1

Column 57 can have only three values, 0, 1, and 2. A value of 1 indicates that the present atom is a central atom of an out-of-plane group; thus, any central atom that has a potential to move out of the plane of the bond is flagged with a value of 1. A value of 2 is used for AMBER atom types to indicate the use of standardized ordering for the pseudotorsion atoms.

Columns 59-60, Potential Function Atom Types

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1

Names listed in columns 59-60 represent potential function atom types. For example, n is the atom type name for the atom N and can be interpreted as an amide nitrogen. The potential atom type is the primary link into the forcefield parameters and so determines the chemistry of each atom (the internal force constants, nonbond interactions, etc.).

Columns 63-69, Partial Atomic Charges

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1

Numbers listed in columns 63-69 are the partial atomic charges in electrons for the corresponding atoms in columns 1-4. Examples of partial charge values, in PHE, are N = -0.50 e, CA = 0.12 e, O = -0.38 e, and HE1 = 0.10 e. Each atom can have a unique partial atomic charge.

Columns 71-74, Charge Group Name

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1

Each atom must be associated with a charge group. This charge group is used during nonbond calculations if a cutoff distance has been specified. If the distance between the switching atoms of two charge groups is less than the cutoff distance, then the interactions between all atoms in the two charge groups are calculated. Charge group names are arbitrary, but must be unique within the residue. In general, charge groups should be as close to neutral as possible (unless the group is charged). This prevents cutting off only part of a dipole, which has the undesirable effect of creating transient monopoles during a calculation.

Column 76, Switching Atom Flag

123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_
N    C    1.0 *         1.348   116.300  180.000 psi  0 1 n   -0.5000 pepN 1

This flag is 1 if the atom is the switching atom (see above) of the charge group, otherwise it is 0. Although the choice of the switching atom is arbitrary, the atom closest to the geometric center of the group is recommended. Finally, atoms belonging to the same group must be contiguous in the residue library entry.
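
As an illustration only, the column layout described above can be read with fixed-width slicing. The following Python sketch is not part of Insight II or Discover: the function name parse_atom_card and the field names are hypothetical, blank numeric fields are returned as None (an assumption), and the atom type is read from columns 59-60 as given in the heading for that field.

```python
# Column ranges (1-based, inclusive) taken from the headings above.
FIELDS = {
    "atom": (1, 4),
    "parent": (6, 9),
    "bond_order": (11, 13),
    "prev_residue": (15, 15),
    "ring_atom": (16, 19),
    "ring_bond_order": (21, 23),
    "bond_length": (25, 29),
    "valence_angle": (33, 39),
    "torsion_angle": (42, 48),
    "torsion_name": (50, 53),
    "side_chain": (55, 55),
    "out_of_plane": (57, 57),
    "atom_type": (59, 60),
    "charge": (63, 69),
    "charge_group": (71, 74),
    "switching": (76, 76),
}
FLOAT_FIELDS = {"bond_order", "ring_bond_order", "bond_length",
                "valence_angle", "torsion_angle", "charge"}
INT_FIELDS = {"side_chain", "out_of_plane", "switching"}


def parse_atom_card(line):
    """Split one atom card into named fields (hypothetical helper)."""
    padded = line.rstrip("\n").ljust(76)  # short lines are padded with blanks
    card = {}
    for name, (first, last) in FIELDS.items():
        text = padded[first - 1:last].strip()
        if name == "prev_residue":
            card[name] = (text == "*")
        elif name in FLOAT_FIELDS:
            card[name] = float(text) if text else None
        elif name in INT_FIELDS:
            card[name] = int(text) if text else None
        else:
            card[name] = text
    return card


# The N card of PHE, assembled field by field so that each value
# lands in the columns named in the headings above.
example = (
    "N".ljust(4) + " " + "C".ljust(4) + " " + "1.0" + " " + "*" + " " * 9
    + "1.348" + " " * 3 + "116.300" + " " * 2 + "180.000" + " "
    + "psi".ljust(4) + " 0 1 " + "n".ljust(2) + " " * 2 + "-0.5000" + " pepN 1"
)
card = parse_atom_card(example)
print(card["atom"], card["parent"], card["prev_residue"], card["charge"])
```

Slicing by column rather than splitting on whitespace matters here because optional fields such as the ring closure atom may be blank, which would shift every later field in a whitespace split.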

Terminal Card/EOF Last Line of Residue Library

The input for the entire residue library ends with the word elib in columns 1-4 of the final line of the file.

Specific Format for Insight's Residue Library

Insight's default residue library contains experimental data for the 20 standard amino acid residues and for other selected residues. The internal coordinate information that is contained in the residue library is meant to reflect, as much as possible, experimental structural information obtained from X-ray diffraction. Information on the original structural data is provided in Hagler et al. (1978, 1979a-c, 1985). (For complete references, refer to the Insight II references list.) For each residue, the information provided in the residue library consists of:

A 1- to 4-letter abbreviation for the residue.
A list of atoms.
A list of parent atoms, internal coordinates (bond lengths, valence bond angles, torsion angles), selected torsion names, potential atom types, and partial atomic charges.

Specific naming conventions are given for residue names, residue atoms, torsion names, and potential function atom types.

Residue Names

The 20 standard amino acid residues are listed in alphabetical order. Other selected residues are then listed in no particular order. The standard 3-letter abbreviations are used to represent the standard amino acids with no terminal ends--the residues that have no caps and can exist internally in a protein chain (e.g., GLY for glycyl, TYR for tyrosyl). Positively charged amino acids, with no caps, have a plus (+) sign in column 4 of the residue name (e.g., ARG+ for positively charged arginyl). Negatively charged amino acids, with no caps, have a minus (-) sign in column 4 of the residue name (e.g., ASP- for negatively charged aspartyl).

Capped residues are represented by an N in column 4 for a charged amino terminal NH₃⁺ (e.g., GLYN for NH₃⁺-glycyl-); an n in column 4 for a neutral amino terminal NH₂ (e.g., GLYn for NH₂-glycyl-); or a C in column 4 for a charged carboxyl terminal COO⁻ (e.g., GLYC for -glycine). Table 30 shows a summary of these residue naming conventions.

Table 30 . General Reference for Residue Names in Insight's Residue Library

Residue Name Description
res Internal neutral residue.
res+ Positively charged residue.
res- Negatively charged residue.
resN Charged amino terminal NH₃⁺.
resn Neutral amino terminal NH₂.
resC Charged carboxyl terminal COO⁻.
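
The convention in Table 30 can be expressed as a small lookup on the fourth character of the residue name. This is a hedged sketch: classify_residue_name is a hypothetical helper, and it assumes that every 4-character name ending in +, -, N, n, or C follows the convention, which may not hold for the other selected residues in the library.

```python
# Meaning of the character in column 4 of the residue name (Table 30).
SUFFIXES = {
    "+": "positively charged residue",
    "-": "negatively charged residue",
    "N": "charged amino terminal",
    "n": "neutral amino terminal",
    "C": "charged carboxyl terminal",
}


def classify_residue_name(name):
    """Return (3-letter base name, description) for a residue name."""
    if len(name) == 4 and name[3] in SUFFIXES:
        return name[:3], SUFFIXES[name[3]]
    return name, "internal neutral residue"


for name in ("GLY", "ARG+", "ASP-", "GLYN", "GLYn", "GLYC"):
    print(name, classify_residue_name(name))
```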

Residue Atom Names

Atom names are based on the Brookhaven naming convention, using the Greek alphabet. The backbone atoms are represented by N, CA, C, and O; then all the remaining atoms (the side chain atoms) are named in order from the alpha carbon CA. Table 31 includes examples of residue atom names and their corresponding Greek letters.

Table 31 . Greek Letter Notation Used in Insight's Residue Library for Residue Atom Names

Greek Letter	Greek Name	Residue Library Letter	Examples of Atom Names
α	alpha	A	CA, HA
β	beta	B	CB, HB
γ	gamma	G	CG, HG, OG, SG
δ	delta	D	CD, HD, OD, ND, SD
ε	epsilon	E	CE, HE, OE, NE, SE
ζ	zeta	Z	CZ, HZ, OZ, NZ
η	eta	H	CH, HH, OH, NH

If there are two or more atoms in the same position relative to CA, the atoms are given number representations. For example, in PHE, two carbons are in the delta position; they are thus designated CD1 and CD2.

Torsion Angle Names

Torsion angle names are provided for selected torsion angles. Table 32 includes the torsion angle names and their corresponding torsion angles found in Insight's residue library. Note that a torsion name is associated with the residue in which the grandparent atom is found.

Table 32 . Torsion Angle Names Included in Insight's Residue Library

Torsional Angle Name	Atoms Included in Angle¹
phi (φ_i)	C_i-CA_i-N_i-C_(i-1)
psi (ψ_i)	N_(i+1)-C_i-CA_i-N_i
omeg (ω_i)	CA_i-C_i-N_(i+1)-CA_(i+1)
chi1 (χ_i¹)	N_i-CA_i-CB_i-CG_i
chi2 (χ_i²)	CA_i-CB_i-CG_i-CD_i
chi3 (χ_i³)	CB_i-CG_i-CD_i-CE_i

¹ C_(i-1) means this atom exists in the previous residue.
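
The residue offsets in Table 32 can be written down as data. In this sketch (the names TORSIONS and torsion_atoms are hypothetical) each atom is paired with an offset relative to residue i; the generic side chain names CG, CD, and CE are copied from the table, although the actual atom names vary by residue.

```python
# Each torsion is four (atom name, residue offset) pairs, as in Table 32.
TORSIONS = {
    "phi":  [("C", 0), ("CA", 0), ("N", 0), ("C", -1)],
    "psi":  [("N", 1), ("C", 0), ("CA", 0), ("N", 0)],
    "omeg": [("CA", 0), ("C", 0), ("N", 1), ("CA", 1)],
    "chi1": [("N", 0), ("CA", 0), ("CB", 0), ("CG", 0)],
    "chi2": [("CA", 0), ("CB", 0), ("CG", 0), ("CD", 0)],
    "chi3": [("CB", 0), ("CG", 0), ("CD", 0), ("CE", 0)],
}


def torsion_atoms(name, i):
    """Return the four (atom name, residue number) pairs for residue i."""
    return [(atom, i + offset) for atom, offset in TORSIONS[name]]


print(torsion_atoms("phi", 5))
print(torsion_atoms("omeg", 5))
```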

Definitions

Parents

A parent of an atom completes a bond between two atoms.

The atoms in columns 6-9 represent bonds to the corresponding atoms in columns 1-4. The atoms in columns 6-9 are thus parents of the corresponding atoms in columns 1-4. For the residue PHE, C * is the parent of N, N is the parent of CA, CA is the parent of C, C is the parent of O, CA is the parent of CB, etc. An atom can be a parent of one or more atoms.

Grandparents

A grandparent of an atom completes a valence angle between three atoms. For the residue PHE, CA is the parent of CB and N is the parent of CA, so N is the grandparent of CB. N completes a valence angle between the three atoms CB-CA-N. N is also the grandparent of C; thus, N also completes a valence angle between the three atoms C-CA-N. Other examples of valence angles, specified for PHE, include: O-C-CA, CG-CB-CA, HA-CA-N, HZ-CZ-CE1.

Greatgrandparents

A greatgrandparent of an atom completes a torsion angle between four atoms. For the residue PHE, CB is the parent of CG (CG-CB), CA is the grandparent of CG (CG-CB-CA), and N is the parent of CA, so N is the greatgrandparent of CG. N completes a torsion angle between the four atoms CG-CB-CA-N. Other examples of torsion angles, specified for PHE, include: O-C-CA-N, C-CA-N-C *, CZ-CE1-CD1-CG, HB1-CB-CA-N.
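
The parent, grandparent, and greatgrandparent of an atom can be found by following the parent column repeatedly. The sketch below assumes a plain dictionary built from the parent relationships quoted above for PHE; the helper name ancestors and the label "C*" for the C atom of the previous residue are illustrative choices, not part of the file format.

```python
def ancestors(atom, parents, depth=3):
    """Follow the parent links up to `depth` times.

    Returns [parent, grandparent, greatgrandparent], truncated when the
    chain leaves the residue (no parent entry is available).
    """
    chain = []
    current = atom
    for _ in range(depth):
        current = parents.get(current)
        if current is None:
            break
        chain.append(current)
    return chain


# Parent relationships for part of PHE, as listed in the text above.
PHE_PARENTS = {"N": "C*", "CA": "N", "C": "CA", "O": "C", "CB": "CA", "CG": "CB"}

print(ancestors("CG", PHE_PARENTS))  # chain that defines CG-CB-CA-N
print(ancestors("O", PHE_PARENTS))   # chain that defines O-C-CA-N
```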

Distance, Torsion, Chiral, and NOE Volume Restraints (.rstrnt)

The .rstrnt file contains descriptions of the restraints to be applied during a minimization or dynamics calculation. The .rstrnt file replaces the .noe file that was used in earlier versions of Discover. The .rstrnt file has three sections. The distance and dihedral sections specify the upper and lower bounds for applying the restraint, as well as the force constants for the biharmonic restraining force outside this range. The chiral section specifies the chirality to be achieved at asymmetric centers.

The .rstrnt file is an ASCII file that can be created by you or written by NMRchitect. The file can contain the following records:

Header record.
Comment records.
Section identifiers.
Remote prochiral center section.
Distance restraints section.
Dihedral restraints section.
Chirality restraints section.
Mixing times restraints section.
NOE Volume restraints section.

Description of Sections

Header Record

The header record must be the first record in the file and contain:


!BIOSYM restraint n

where n is an integer (usually 1). Discover then interprets the file as being an ASCII file containing restraint records as outlined here.

Comment Record

Comment lines begin with an exclamation mark (!) and may occur anywhere after the first record.

Section Identifiers

Section identifiers must start in column 1 with a pound sign (#). All non-comment records that come after a section identifier and before the next section identifier (or the end of the file) are assumed to be records appropriate to that section.

The identifier lines introducing the sections are:


#remote_prochiral_centers
#chiral
#distance
#NOE_distance
#NOE_distance_overlapped
#mixing_times
#NOE_volume
#NOE_volume_overlapped
#NMR_dihedral
#3J_dihedral
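
The header, comment, and section identifier rules above are enough to split a restraint file into sections. The following is a minimal sketch under stated assumptions: read_sections is a hypothetical helper, blank lines are skipped (the text does not say how Discover treats them), and each record is simply split on whitespace without checking its fields.

```python
def read_sections(text):
    """Group the records of a .rstrnt file by section identifier."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("!BIOSYM restraint"):
        raise ValueError("first record must be the !BIOSYM restraint header")
    sections = {}
    current = None
    for line in lines[1:]:
        if not line.strip() or line.startswith("!"):
            continue  # blank line (assumed ignorable) or comment record
        if line.startswith("#"):
            current = line[1:].strip()
            sections.setdefault(current, [])
        elif current is None:
            raise ValueError("data record before any section identifier")
        else:
            sections[current].append(line.split())
    return sections


sample = """!BIOSYM restraint 1
! chirality and distance restraints
#chiral
1:THRN_1:CA S
1:ILE_35:CB S
#distance
1:PRO_2:CA 1:PHE_4:CA 4.700 7.200 1.00 1.00 1000.000
"""
for name, records in read_sections(sample).items():
    print(name, len(records))
```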

Atom Specification

The restraints records use the following syntax for selecting particular atoms and pseudoatoms, and for defining pseudoatoms.


molecule#:residuename_residue#:atomname

where the molecule number, residue name, residue number, and atom name are as defined in the .mdf file. Colons (:) and underscores (_) are used to delimit these numbers and names as shown.

The atom name can be that of an actual atom, a pseudoatom defined in the atom set section of the .mdf file or a pseudoatom defined using the define average command in Discover.

A previously undefined pseudoatom can be referenced with wildcards or a list. Wildcards can be used for pseudoatoms consisting of atoms in the same residue if all these atoms have names beginning with some common characters. For example, if atoms 1:ASN_2:HB1 and 1:ASN_2:HB2 are present, then 1:ASN_2:HB* defines a pseudoatom consisting of these two atoms. The asterisk wildcard can match strings of any length. These two atoms can also be referred to as a list, that is, 1:ASN_2:HB1,HB2. In the list syntax, atom names are separated by commas without intervening spaces.

The pseudoatom is defined when the wildcard or list appears for the first time in the .rstrnt file. Thereafter, this pseudoatom is used whenever the same pattern appears.
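
The atom specification syntax, including the list and wildcard forms, can be taken apart as sketched below. The helper names are hypothetical, the residue number is assumed to be a plain integer, and residue names are assumed not to contain an underscore or colon. Wildcards are matched with Python's fnmatchcase, which also treats ? and [ as special characters; the text above describes only the asterisk.

```python
import re
from fnmatch import fnmatchcase

# molecule#:residuename_residue#:atomname
SPEC = re.compile(r"^(\d+):([^_:]+)_(\d+):(\S+)$")


def parse_atom_spec(spec):
    """Return (molecule number, residue name, residue number, atom names)."""
    match = SPEC.match(spec)
    if match is None:
        raise ValueError("not an atom specification: %r" % spec)
    molecule, resname, resnum, atoms = match.groups()
    # A comma-separated list names the members of a pseudoatom.
    return int(molecule), resname, int(resnum), atoms.split(",")


def expand_names(patterns, residue_atoms):
    """Select the atoms of a residue that match any name or wildcard."""
    return [atom for atom in residue_atoms
            if any(fnmatchcase(atom, pattern) for pattern in patterns)]


print(parse_atom_spec("1:ASN_2:HB1,HB2"))
print(expand_names(["HB*"], ["N", "CA", "HA", "HB1", "HB2"]))
```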

One of a pair of prochiral hydrogens can be selected by using its prochiral specification. For example, on encountering the atom name HBS, Discover looks in the specified residue to find two atoms with names HB1 and HB2, determines their prochirality, and selects the pro-S atom to be used in the restraint. Similarly, on encountering HGR*, Discover looks for two pseudoatoms with names HG1* and HG2*, creates the pseudoatoms if necessary from (HG11,HG12,HG13) and from (HG21,HG22,HG23), and then selects the pro-R pseudoatom to use in the restraint. In each case, the character R (or S) is replaced with 1 or 2 and pro-R (or pro-S) is selected. Wildcards are allowed in this context.

Prochirality can be determined only if the molecular data file contains the priority sequence of the substituents at each prochiral center. If the prochiral atoms are not directly bonded to the prochiral center, the remote_prochiral_centers section of the restraints file should contain an entry indicating how these atoms are connected.

Record Format

In each section of the restraints file, data records appropriate to that section follow its identifier line. Within a record, the data is in free format, which means that at least one blank space is required between fields and that each field must contain a non-blank entry. All fields must be specified--no blank fields are allowed, except for trailing blank fields, which are read as zeroes.

The contents of each record are described in the following sections and tables.

Remote Prochiral Centers Section

When prochiral atoms are separated from the prochiral center by more than one bond, they must be listed in this section prior to using their prochiral specification in any restraint record. The identifier line for this section is:


#remote_prochiral_centers

The identifier is followed by records as shown in Table 33. The atom specifications in this section should not use any prochiral specification, since that would lead to a cyclic definition.

Table 33 . Remote Prochiral Center Definition

field# contents comments
1 atom specification one of the prochiral atoms
2 atom specification the second prochiral atom
3 atom specification the atom bonded to the first prochiral atom that leads to the prochiral center
4 atom specification the atom bonded to the second prochiral atom that leads to the prochiral center
5 atom specification the prochiral center

Sample:


#remote_prochiral_centers 

1:VAL_8:HG1* 1:VAL_8:HG2* 1:VAL_8:CG1 1:VAL_8:CG2 1:VAL_8:CB

Chirality Restraints Section

The records in the chirality restraints section specify chirality around asymmetric centers, as shown in Table 34.


#chiral

Table 34 . Chirality Restraints Definition

field# contents comments
1 atom specification the asymmetric center
2 S or R one character representing the desired chirality at the center


#chiral

1:THRN_1:CA S

1:ILE_35:CB S

Distance Restraints Section

The distance restraints section specifies upper and lower bounds for distances between pairs of atoms, force constants, and a limit for the force, using the format shown in Table 35.


#distance

Table 35 . Distance Restraints Definition

field# contents comments
1 atom specification one of the atoms in the pair
2 atom specification the other atom
3 lower bound* the smallest separation allowed between the pair of atoms (in angstroms)
4 upper bound the greatest separation allowed between the pair of atoms (in angstroms)
5 K_L force constant applied when atoms are closer than the lower bound (kcal mol^-1 Å^-2)
6 K_U force constant applied when atoms are farther apart than upper bound (kcal mol^-1 Å^-2)
7 maximum force limit on the magnitude of force (kcal mol^-1 Å^-1)
* a value of -1.0 signifies that no lower bound information is available, and that the sum of the van der Waals radii will be used instead.


#distance

1:AR+N_1:CA        1:ASP-_3:CA         4.700 7.200 1.00 1.00 1000.000

1:PRO_2:CA         1:PHE_4:CA          4.700 7.200 1.00 1.00 1000.000
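
To make the roles of the seven fields concrete, the sketch below reads one distance record and evaluates a flat-bottomed restraint. The functional form is an assumption: the text states only that a force constant applies outside the bounds and that the force magnitude is limited, so the harmonic penalty K(d - bound)² and the simple clamp used here are illustrative and may differ from what Discover computes. The special lower bound of -1.0 is not handled, and both function names are hypothetical.

```python
def parse_distance_record(line):
    """Return (atom1, atom2, lower, upper, k_lower, k_upper, max_force)."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d" % len(fields))
    lower, upper, k_lower, k_upper, max_force = (float(x) for x in fields[2:])
    return fields[0], fields[1], lower, upper, k_lower, k_upper, max_force


def restraint_force(d, lower, upper, k_lower, k_upper, max_force):
    """Signed force along the interatomic axis (positive pushes atoms apart).

    Assumes a penalty of K * (d - bound)**2 outside the bounds, so the
    force is 2 * K * (bound - d), clamped to +/- max_force.
    """
    if d < lower:
        force = 2.0 * k_lower * (lower - d)
    elif d > upper:
        force = -2.0 * k_upper * (d - upper)
    else:
        return 0.0  # no restraint force between the bounds
    return max(-max_force, min(max_force, force))


record = parse_distance_record(
    "1:PRO_2:CA         1:PHE_4:CA          4.700 7.200 1.00 1.00 1000.000")
for d in (3.7, 5.0, 8.2):
    print(d, restraint_force(d, *record[2:]))
```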

NOE Distance Restraints Section


#NOE_distance

This section contains distance restraints derived from NOE data. These restraints are the same as the restraints in the distance restraints section. However, the records in this section have an expanded format (Table 36), to contain additional data relevant to NOE analysis.

Table 36 . NOE Distance Restraints Definition

field# contents comments
1 atom specification one of the atoms in the pair
2 atom specification the other atom
3 lower bound the smallest separation allowed between the pair of atoms (in angstroms)
4 upper+correction the greatest separation allowed between the pair of atoms (in angstroms)
5 upper bound not currently used in Discover
6 K_L force constant applied when atoms are closer than the lower bound (kcal mol^-1 Å^-2)
7 K_U force constant applied when atoms are farther apart than upper bound (kcal mol^-1 Å^-2)
8 maximum force limit on the magnitude of force (kcal mol^-1 Å^-1)



#NOE_distance

!ATOM #1            ATOM #2                Distance            Force Constant   Max

!                                  Lower   Upper     Upper     Lower   Upper   Force

!                                        + correction

1:CYS_3:HA         1:CYS_4:HN        2.00   3.00     3.00       1.000   1.000   1000.0

1:GLY_31:HA*       1:CYS_32:HN       3.00   5.00     4.00       1.000   1.000   1000.0

1:SER_6:HBR        1:ILE_7:HN        2.00   3.00     3.00       1.000   1.000   1000.0

1:VAL_8:HGS*       1:VAL_8:HA        3.00   5.00     4.00       1.000   1.000   1000.0

NOE Overlapped Distance Restraints Section

This section contains overlapped distance restraints derived from NOE data. The first line of each of these restraints shares almost the same format as the NOE distance restraints. The only difference is that the column corresponding to the pseudo atom correction is absent in the overlapped restraint category. To assign multiple pairs of protons to the same restraint, one can put one additional pair per line with the continuation symbol "+" in the first column of succeeding lines.


#NOE_distance_overlapped

Table 37 . NOE Overlapped Distance Restraints Definitions

First line of the overlapped distance restraint:
field # contents comments
1 atom specification one of the atoms in the pair
2 atom specification the other atom
3 lower bound the smallest effective separation allowed between the pairs of atoms (in angstroms)
4 upper+correction the greatest effective separation allowed between the pairs of atoms (in angstroms)
5 K_L force constant applied when effective distance is smaller than the lower bound (kcal mol^-1 Å^-2)
6 K_U force constant applied when effective distance is bigger than the upper bound (kcal mol^-1 Å^-2)
7 maximum force limit on the magnitude of force (kcal mol^-1 Å^-1)


Table 38 . NOE Overlapped Distance Restraints Definition

Succeeding line of the overlapped distance restraint:
field# contents comments
1 Continuation Symbol a "+" sign in the first column indicates a continuation of the definition of the multiple spin pairs in the same restraint.
2 atom specification the 1st atom
3 atom specification the 2nd atom



#NOE_distance_overlapped

!ATOM #1            ATOM #2          Effective Distance   Force Constant    Max

!                                    Lower    Upper       Lower    Upper    Force

1:CYS_3:HA          1:CYS_4:HN       2.00     5.00        1.000    1.000    1000.0

+ 1:GLY_31:HA*      1:CYS_32:HN

+ 1:SER_6:HBR       1:ILE_7:HN
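
Continuation lines can be attached to the restraint they extend as sketched here. The helper group_overlapped is hypothetical; it assumes blank lines and comment records may be skipped, and it is lenient about leading whitespace before the "+" even though the format places it in the first column.

```python
def group_overlapped(lines):
    """Collect each overlapped restraint with all of its atom pairs."""
    restraints = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("!"):
            continue  # blank line (assumed ignorable) or comment record
        if stripped.startswith("+"):
            if not restraints:
                raise ValueError("continuation line before any restraint")
            first, second = stripped[1:].split()[:2]
            restraints[-1]["pairs"].append((first, second))
        else:
            fields = stripped.split()
            restraints.append({
                "pairs": [(fields[0], fields[1])],
                "values": [float(x) for x in fields[2:]],
            })
    return restraints


sample_lines = [
    "1:CYS_3:HA 1:CYS_4:HN 2.00 5.00 1.000 1.000 1000.0",
    "+ 1:GLY_31:HA* 1:CYS_32:HN",
    "+ 1:SER_6:HBR 1:ILE_7:HN",
]
grouped = group_overlapped(sample_lines)
print(len(grouped), grouped[0]["pairs"])
```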

Mixing Times Restraints Section

Each field contains the value of one mixing time (in seconds) at which the subsequent NOE volume restraints were determined.


#mixing_times

The format of each entry is as shown in Table 39.

Table 39 . Mixing Times Restraints Definition

field# contents comments
1 tmix1 mixing time 1
2 tmix2 mixing time 2
m tmixm mixing time m

The sample specifies that the subsequent volume entries are associated with mixing times of 50, 100, 150, and 200 ms.


#mixing_times 

0.05 0.1 0.15 0.2

NOE Volume Restraints Section

This section contains NOE peak volume restraints derived from experimentally measured NOE peak volumes or integrals. In the direct NOE refinement scheme, the volume restraints are compared to theoretical NOE volumes calculated for the current model structure. The number of fields will be 2m + 4, where m is the number of mixing times.


#NOE_volume

The format of each entry is as shown in Table 40.

Table 40 . NOE Volume Restraints Definition

field# contents comments
1 atom specification one of the atoms in the pair
2 atom specification the other atom
3 NOE volume in LB 1 lower bound on the NOE volume for mixing time 1
4 NOE volume in UB 1 upper bound on the NOE volume for mixing time 1
5 NOE volume in LB 2 lower bound on the NOE volume for mixing time 2
6 NOE volume in UB 2 upper bound on the NOE volume for mixing time 2
2m+1 NOE volume in LB m lower bound on the NOE volume for mixing time m
2m+2 NOE volume in UB m upper bound on the NOE volume for mixing time m
2m+3 K_L lower bound force constant
2m+4 K_U upper bound force constant


#NOE_volume



1:GLY_2:HAR 1:PHE_HD* 0.075 0.125 0.175 0.225 0.275 0.325 0.375 0.425 40 80 

1:ALA_4:HB* 1:CYS_10:HAR -999.0 -999.0 0.175 0.225 0.275 0.325 0.375 0.425 40 80
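
Because the record length depends on the number of mixing times, a reader needs m before it can interpret a volume record. The sketch below (parse_volume_record is a hypothetical helper) checks for exactly 2m + 4 fields and pairs the lower and upper bounds by mixing time; it does not apply the trailing-blank-field rule described under Record Format, and it attaches no meaning to any particular bound value.

```python
def parse_volume_record(line, n_mixing):
    """Split one NOE volume record given the number of mixing times."""
    fields = line.split()
    expected = 2 * n_mixing + 4
    if len(fields) != expected:
        raise ValueError("expected %d fields, got %d" % (expected, len(fields)))
    values = [float(x) for x in fields[2:2 + 2 * n_mixing]]
    return {
        "atoms": (fields[0], fields[1]),
        # (lower bound, upper bound) for mixing times 1..m
        "bounds": list(zip(values[0::2], values[1::2])),
        "k_lower": float(fields[-2]),
        "k_upper": float(fields[-1]),
    }


mixing_times = [float(x) for x in "0.05 0.1 0.15 0.2".split()]
volume = parse_volume_record(
    "1:ALA_4:HB* 1:CYS_10:HAR -999.0 -999.0 0.175 0.225 "
    "0.275 0.325 0.375 0.425 40 80",
    len(mixing_times),
)
for tmix, (low, high) in zip(mixing_times, volume["bounds"]):
    print(tmix, low, high)
```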

NOE Overlapped Volume Restraints Section

This section contains NOE peak volume restraints derived from experimentally measured overlapped NOE peak volumes or integrals. In the direct NOE refinement scheme, the volume restraints are compared to theoretical NOE volumes calculated for the current model structure. The first line of each overlapped restraint shares the same format as that of the non-overlapped case. The number of fields will be 2m + 4, where m is the number of mixing times. Each succeeding line then adds a spin pair to the definition of the overlapped peaks.


#NOE_volume_overlapped

The format of each entry is as shown in Tables 41 and 42.

Table 41 . NOE Overlapped Volume Restraints Definition

First line of the restraint:
field # contents comments
1 atom specification one of the atoms in the pair
2 atom specification the other atom
3 NOE volume in LB 1 lower bound on the NOE volume for mixing time 1
4 NOE volume in UB 1 upper bound on the NOE volume for mixing time 1
5 NOE volume in LB 2 lower bound on the NOE volume for mixing time 2
6 NOE volume in UB 2 upper bound on the NOE volume for mixing time 2
2m+1 NOE volume in LB m lower bound on the NOE volume for mixing time m
2m+2 NOE volume in UB m upper bound on the NOE volume for mixing time m
2m+3 K_L lower bound force constant
2m+4 K_U upper bound force constant

Table 42 . NOE Overlapped Volume Restraints Definition

Succeeding line of the overlapped volume restraint:
field# contents comments
1 Continuation Symbol a "+" sign in the first column indicates a continuation of the definition of the multiple spin pairs in the same restraint.
2 atom specification the 1st atom
3 atom specification the 2nd atom



#NOE_volume_overlapped



1:GLY_2:HAR 1:PHE_HD* 0.075 0.125 0.175 0.225 0.275 0.325 0.375 0.425 40 80 

+ 1:ALA_4:HB* 1:CYS_10:HAR

NMR Dihedral Restraints Section

Each record specifies a range for a dihedral angle and the force constants for the biharmonic restraint force (Table 43).


#NMR_dihedral

Table 43 . NMR Dihedral Restraints Definition

field# contents comments
1 atom specification four atoms defining the dihedral angle, listed in bonding sequence
2 atom specification
3 atom specification
4 atom specification
5 lower bound the smallest dihedral angle allowed (degrees)
6 upper bound the greatest dihedral angle allowed (degrees)
7 K_L force constant applied when angle is too small (kcal mol^-1 rad^-2)
8 K_U force constant applied when angle is too large (kcal mol^-1 rad^-2)
9 maximum force limit on the magnitude of force (kcal mol^-1 rad^-1)


#NMR_dihedral

1:CYS_4:C 1:PRO_5:N 1:PRO_5:CA 1:PRO_5:C -120.0 -60 50.0 50.0 500.0
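
Note that the bounds in Table 43 are given in degrees while the force constants are per radian squared, so any deviation has to be converted before a force constant is applied. The sketch below reads one record and reports the deviation in radians. Both helper names are hypothetical, and the comparison is a plain numeric one that ignores the 360° periodicity of dihedral angles, which a real implementation would need to handle.

```python
import math


def parse_dihedral_record(line):
    """Split one NMR dihedral record into named fields."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError("expected 9 fields, got %d" % len(fields))
    lower, upper, k_lower, k_upper, max_force = (float(x) for x in fields[4:])
    return {
        "atoms": tuple(fields[:4]),
        "lower": lower,
        "upper": upper,
        "k_lower": k_lower,
        "k_upper": k_upper,
        "max_force": max_force,
    }


def violation_radians(angle, lower, upper):
    """Signed deviation from the allowed range, converted to radians.

    Negative below the lower bound, positive above the upper bound,
    zero inside the range. Angles are in degrees; no periodic wrap.
    """
    if angle < lower:
        return math.radians(angle - lower)
    if angle > upper:
        return math.radians(angle - upper)
    return 0.0


dihedral = parse_dihedral_record(
    "1:CYS_4:C 1:PRO_5:N 1:PRO_5:CA 1:PRO_5:C -120.0 -60 50.0 50.0 500.0")
print(violation_radians(-30.0, dihedral["lower"], dihedral["upper"]))
```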

3J Coupling Dihedral Section