Molecular Models

# 4.3 Gromacs

The simulation program Gromacs bases on algorithms for flexible models, mainly applied for bio-systems. Since the MMBZS database mainly contains rigid molecular models, the proposed force fields are only partially applicable for Gromacs. Rigid models are constructed by using workarounds such as bond and angles constrains. The introduction of these constraints is only possible for models with less than four interaction sites. The database therefore only provides Gromacs input files for such molecular structures. Furthermore, it should be noted that Gromacs distinguishes between two types of interaction sites:

1. The interaction site
can be Lennard-Jones and/or Coulomb point charge interactions and have a mass.
2. The virtual site
can be Coulomb point charges and have no mass.

Neither point dipoles nor point quadrupoles can be modeled in Gromacs explicitly. Therefore, point dipoles and point quadrupoles are modeled by using point charges that reproduce such dipoles and quadrupoles. This procedure is described in section 3. The point charges are introduced using virtual sites.

The structure and content of the Gromacs input files provided by the MMBZS database are described in the following. The input file for a single molecular model is separated into four individual files for the sake of simplicity and to avoid a large and cluttered input file. Each of the four individual files has a separate task:

1. forcefield.itp
Defines the type of interactions between similar and differently parameterized interaction sites.
2. atomtypes.atp
Lists all sites used in the force field model and additionally gives the mass of these interaction sites.
3. *.itp
Defines the topology of the models and thus indicates the relative position of the sites to each other.
4. *.pdb
Contains the coordianes of all sites for a single molecule in pdb format, which can be used to visualize
the model, for instance in VMD, and to generate a simulation box using the "genbox" tool in gromacs.

In the following, the structure of the individual files is described in more detail, with special attention on the implementation of rigid models in Gromacs. In general, Gromacs input files are divided into sections. The name of each section is define using brakets. Then, each line within a section defines a set of parameters of the model. Since several parameters are defined per line, the name or purpose of a parameter is given in a comment line below the name of the section for the sake of clarity. A comment is indicated by a semicolon ";".

# forcefield.itp File

The forcefield.itp file consists of only one section "defaults". The title of a sections is always enclosed by square brackets. In this section three parameters have to be defined:

1. "nbfunc"
Defines the type of interaction used to model the dispersive and repulsive interactions. Two interaction types are available: Lennard-Jones ("1") and Buckingham ("2"). For all models of the database this parameter is set to: "1".
2. "comb-rule"
Defines the calculation of the interaction parameters for the interaction between two unlike sites, i.e. the combination rule. To use the convention already introduced in section 3, this parameter is set to: "2", which means that the Lorentz-Berthelot mixing rules are applied.
3. "gen-pairs"
Specifies whether certain interactions between two unlike interaction sites are to be specified manually, thus modifying the existing convention for certain interaction pairs. This results in the values "yes" and "no" for this parameter. Since a convention for interactions between two unlike interaction sites has already been made, this parameter is set to: "no" for the files provided here.

These general settings are identical for each model of the database and therefore the input file "forcefield.itp" is the same for each model of the database:

[ defaults ] ; nbfunc comb-rule gen-pairs 1 2 no

# atomtypes.atp File

The input file atomtypes.atp has no subdividing sections or further structure. This file is a list of the all interaction sites of the molecular model and their respective masses. In each line only one site name and the corresponding mass of the interaction site is listed, whereby the two information are separated at least by one blank character. The naming of the individual interaction sites is done by enumerating the sites, taking into account the type of site. For a molecular model consisting of $n$ interaction sites and $m$ virtual sites, the input file atomtypes.atp has the following form:

A1 #Mass_of_interaction_site_A1 A2 #Mass_of_interaction_site_A2 . . . . . . An #Mass_of_interaction_site_An VS1 #Mass_of_virtual_site_VS1 VS2 #Mass_of_virtual_site_VS2 . . . . . . VSm #Mass_of_virtual_site_VSm

# *.itp File

After the sites of the molecular model have been introduced in the input file atomtypes.atp, the interaction parameters of the individual sites are assigned in the itp input file within the section "atomtypes". The following parameters can be considered interaction parameters in the simulation program Gromacs:

1. Lennard-Jones interaction parameter
2. charge
3. mass
4. Type of the site

After the individual sites are fully specified, the molecular model is defined under the section "moleculetype" by combining interaction sites to molecules. The interaction sites used in molecules will be referred to as atoms in the following -- in analogy to the Gromacs documentation. In the following sections, the topology of the molecular model is defined. This procedure enables the user ingeneral to define several molecular models in the four input files presented. Nevertheless, the input files provided on the MMBZS database always only contain single molecular models. In the following, the structure of the sections of the input file itp is explained in detail.

[ atomtypes ]

In this section, the individual sites are specified by setting the following parameters for each site:

1. name
The name assigned to the interaction site in the input file \texttt{atomtypes.atp}, this name serves as reference which site is defined in this line.
2. bond_type
This tag define the bonds that the site can form, but since only rigid models are provided here, this tag is not used.
3. ptype
This parameter indicates whether the interaction site is an interaction site of type (A) or a virtual site (D) -- see above.
4. mass, charge, LJ-parameter (sigma, epsilon)
These parameters define the molecular interactions of the site.

For each site, the parameters listed above are defined in the order shown in the Figure below, with the parameters separated by at least one space. Note that each site is defined in a separate line.

name bond_type mass charge ptype sigma epsilon

[ moleculetype ]

After the individual interaction sites have been specified, a molecule is defined in this section that consists of the sites defined above. First, the molecule is named , followed by an integer. The integer defines the number of bonds or constraints below which non-bonded interactions within a molecule are excluded. For instance, a three indicates that all non-bonded interactions between sites separated by three or less bonds are neglected. Since the database only consists of rigid molecular models, this integer is set to a large number so that no internal interactions are considered. In the case of n interaction sites, the number is set to n-1.

[ moleculetype ] moleculename n-1

[ atoms ]

The individual atoms of the molecule are specified in the section "atoms". Therefore a consecutive integer is assigned to each site in the molecule. Based on this reference number, the individual atoms of the molecule are defined using the following parameters:

1. nr
Internal molecule reference number for each atoms
2. type
This parameter is used to describe the interaction properties of the atom. This is achieved by setting this parameter to the name of the interaction site already defined in the "atomtypes" section.
3. resnr, residue, cgnr
These parameters can be used to define further substructures like functional groups. The input files provided by the MMBZS database does not use this feature, whereby these parameters have the same value for each atom (resnr = 1, residue = MOD, cgnr = 1).
4. atom
This parameter generally describes which physical part of a substance is modeled by this site.
5. charge, mass
These parameters adjust the corresponding interaction properties of the atom. In the case of the input files provided by the MMBZS database no further adjustments are needed, thus these parameters are identical to those of the corresponding interaction sites.

[ atoms ] nr type resnr residue atom gnr charge mass

After the atoms of the molecule have been defined, the topology of the molecule is now defined in the following sections. First, the relative arrangement of the atoms described by an interaction site is defined. In the second step the relative position of the virtual sites is described based on the interaction sites. The definition of the position of the interaction sites is analogous to the Z-Matrix representation, where distances and angles between the sites are given.

[ constraints ]

In the section "constraints" all distances between two atoms, each corresponding to a interaction sites, are defined and fixed over the whole simulation. This is done with four parameters. The first two parameters give the reference number of the atoms whose distance should be defined. The next parameter describes the function type to be used internally to keep the distance between the given atoms constant, in this case the parameter is set to: "1", for each pair of atoms. The last parameter then specifies the distance between the specified atoms in nm.

[ constraints ] #site_1_nr. #site_2_nr. 1 #distance

If, however, the number of interaction sites is greater than two, additional bond angles must be defined, which must also be kept constant over the entire simulation. To keep the bond angle constant, the option "all-angles" in the *.mdp file should be used.

[ angletypes ]

Defining an angle is similar to the definition of sites and is done in two steps. First, the type of the angle has to be specified. This is done after the section "atomtypes" in the section "angletypes". Subsequently, the position of the angle is defined in the section "angles", which follows the section "constrains".

The definition of the angle is done by five parameters. The first three parameters indicate the sites between which the angle is located. The names of the interaction sites are used for referencing. This procedure is analogous to the Z-Matrix. The next parameter defines the function that describes the bond angle potential, at this point the option "1" is selected for all angles, since all molecular models in the MMBZS are rigid molecular models and the value is to be ignored by Gromacs. The fifth parameter describes the magnitude of the angle. The last parameter is used to calculate the bond angle potential. Since the option "all-angles" must be selected in the "*.mdp" input file, which transform all angles into constraints, this parameter is ignored here and will not be used any more. Therefore, the value of this parameter can be chosen arbitrarily.

[ angletypes ] ;i j k func th0 cth A1 A2 A3 1 57.4074 1000.000

[ virtual_sites3 ]

After the positions of the interaction sites have been defined, the positions of the atoms corresponding to virtual sites are specified in the section "virtual_sites" on the basis of the atoms corresponding to the interaction sites. If more than two interaction sites were used in the molecular model, the section "virtual_sites3" is used, otherwise "virtual_sites2". The more detailed structure of these sections can be found in the Gromacs manual in chapter 4.7.

[ exclusions ]

Since the models of the database are rigid models, intramolecular interactions must be switched off in Gromacs. This section describes how to exclude intramolecular interactions between atoms by using the internal reference numbers of the atoms. Since all molecular interactions are to be excluded, all interactions with the other atoms must be excluded for each atom. Thus the following scheme results, if the number of sites is $n$:

[ exclusions ] 1 2 3 ... n 2 1 3 ... n . . . . . . . . . . . . . . . n-1 1 ... n-2 n n 1 2 ... n-1

# *.pdb File

The pdb file is provided for initialization of the simulation box. In general a pdb file has the following form for a model consisting of $n$ interaction sites and $m$ virtual sites:

MODEL: #model_name CRYST1 #heigth #width #depth 90.00 90.00 90.00 ATOM 1 A1 MOD 1 #x #y #z ATOM 2 A2 MOD 1 #x #y #z . . . . . . . . . . . . . . . . . . . . . . . . ATOM n An MOD 1 #x #y #z ATOM n+1 VS1 MOD 1 #x #y #z . . . . . . . . . . . . . . . . . . . . . . . . ATOM n+m VSm MOD 1 #x #y #z

In the first line, "#model_name" indicates the molecular model which is specified in the pdb file. In the second line, a box is defined around the entire model, which simplifies the construction of a simulation box. The parameters are: "#height", "#width", "#depth"; which are the height, width and depth of the simulation box, respectively. In the following, the position of the individual interaction sites are defined line by line, with the parameters:"#x", "#y" and "#z", which are the Cartesian coordinates of the corresponding site in an arbitrary but fixed Cartesian coordinate system. It is important to pay attention to the format of this file, especially to the spacing of the individual information. Since this file is read in columns, informations are required to be in a certain position.

This is shown schematically in the figure below, where the lengths of the information and the spaces are indicated by square brackets:

MODEL: #model_name [10] | [3] | [6] | [3] |[6]| [3] |[4]| CRYST1 |#heigth| |#width| |#depth| | 90.00 90.00 90.00 [6] | [5] | [4] |[1]| [3] |[1]|[1]| [9] | [8] | [8] | [8] | ATOM | 1 | A1 | | MOD | | 1 | | #x | #y | #z | . . . . . . . .

# Example

The 'C2H4 I'-model published by Vrabec et al. in [Vrabec, 2001] is taken as an full example for the Gromacs input files provided by the MMBZS database. Note that the point quardupole of the original model was converted into point charges, since the Gromacs simulation program cannot handle point quardupoles explicitly. The input files of this model for the simulation program Gromacs has than the following form:

forcefield.itp File

[ defaults ] ; nbfunc comb-rule gen-pairs 1 2 no

atomtypes.atp File

A1 14.027 A2 14.027 VS1 0 VS2 0 VS3 0

CHN.itp File

; rigid model: use the option "all-angles" *.mdp file [ atomtypes ] ; name bond_type mass charge ptype sigma epsilon A1 A1 14.0270 0.00000 A 0.37610 0.63980 A2 A2 14.0270 0.00000 A 0.37610 0.63980 VS1 VS1 0.00000 12.7515 D 0.00000 0.00000 VS2 VS2 0.00000 -25.5030 D 0.00000 0.00000 VS3 VS3 0.00000 12.7515 D 0.00000 0.00000 [ moleculetype ] C2H4_I 1 [ atoms ] ;nr type resnr residue atom cgnr charge mass 1 A1 1 MOD A1 1 0.00000 14.0270 2 A2 1 MOD A2 1 0.00000 14.0270 3 VS1 1 MOD VS1 1 12.7515 0.00000 4 VS2 1 MOD VS2 1 -25.5030 0.00000 5 VS3 1 MOD VS3 1 12.7515 0.00000 [ constraints ] ; fixed bond length ; d (nm) 1 2 1 0.12700 [ exclusions ] 1 2 3 4 5 5 1 2 3 4 4 5 1 2 3 3 4 5 1 2 2 3 4 5 1 [ virtual_sites2 ] ; Vsite from to funct a 3 1 2 1 0.64811736904293 4 1 2 1 0.5 5 1 2 1 0.35188263095707

CHN.pdb File

MODEL: C2H4_I CRYST1 1.0 1.0 1.2 90.00 90.00 90.00 ATOM 1 A1 MOD 1 0.000 0.000 -0.634 ATOM 2 A2 MOD 1 0.000 0.000 0.634 ATOM 3 VS1 MOD 1 0.000 0.000 0.188 ATOM 4 VS2 MOD 1 0.000 0.000 0.000 ATOM 5 VS3 MOD 1 0.000 0.000 -0.188