GROMACS Tutorial 1 -- Water
In this introductory tutorial, I'll show you how to create a box of water and run a simple simulation on it with constant temperature and pressure. At the end we'll find out the density of water.
Every GROMACS simulations needs three essential files: structure (.gro/.pdb), topology (.top), and parameters (.mdp). The structure file contains the Cartesian coordinates of every atomic site in the system. The topology file contains information on how each atomic site interacts with other atomic sites, whether that is in non-bonded interactions or bonded interactions. This information is provided by the force field. Non-bonded interactions included van der Waals interactions and Coulomb interactions. Bonded interactions include bonds, angles, and dihedrals. The parameters file includes information on how long to run the simulation, the timestep, temperature and pressure coupling, etc. Below we'll obtain / create these files.
At this point I would suggest creating a directory to store files for this tutorial.
We'll start with the topology file. Typically a topology file uses an
statement to include the force field to be used. This force field includes
[ bondtypes ],
[ angletypes ], and
[ dihedraltypes ]
directives. Then in the topology file usually we specify different
moleculetype ] directives which contain
[ atoms ],
[ bonds ], and
dihedrals ] which refer back to the force field. Don't worry about this too
much right now. Water models include all of these for us. See Chapter 5 of the
reference manual for more information.
Create a file named
topol.top with the following text:
#include "oplsaa.ff/forcefield.itp" #include "oplsaa.ff/tip4pew.itp" [ System ] TIP4PEW [ Molecules ]
As you can see we've included the force field for OPLS-AA. Additionally we've
included the TIP4PEW water model. After this you'll see a
[ System ] directive,
which includes just the name of the system, which can be anything you want.
Lastly, we list out each
moleculetype and how many there are under
Molecules ]. Right now we don't have any (we'll get those in a minute).
The structure of TIP4PEW is already provided by GROMACS in the topology
directory. This standard location is typically
/usr/share/gromacs/top, but you
my have it installed in a different directory. If you are properly sourcing
GMXRC then it will be located at
$GMXDATA/top. In that directory you'll see
.gro files, one of which is
tip4p.gro. You'll also see the folder
oplsaa.ff which we've included in our topology file above. There isn't a
structure file specific to TIP4PEW. Four-point water structure is essentially
the same for TIP4P and TIP4PEW. What makes them different is the force field
To create a box of water using that structure file do:
$ gmx solvate -cs tip4p -o conf.gro -box 2.3 2.3 2.3 -p topol.top
If you open back up
topol.top you'll see that a line has been added at the
end with the word
SOL and number.
SOL is the name of the
is defined in
oplsaa.ff/tip4pew.itp. When we ran gmx solvate, GROMACS added
enough water molecules to fill a box 2.3 nm in each direction.
Now we need a set of parameter files so that GROMACS knows what to do with our starting structure. Simulations almost always have three main parts: minimization, equilibration, and production. Minimization and equilibration can be broken down into multiple steps. Each of these needs it's own parameters file. In this case we'll be doing two minimizations, two equilibrations, and one production run.
The files we'll be using should be downloaded from here.
There will be a few things common to all five of our files. In each description, I only give a very small comment. See the GROMACS page on this for more information on each option.
|cutoff-scheme||Verlet||Use in creating neighbor lists. This is now the default, but we provide it here in order to avoid any notes.|
|coulombtype||PME||Use Particle-Mesh Ewald for long-range (k-space) electrostatics.|
|rcoulomb||1.0||Cut-off for real/k-space for PME (nm).|
|vdwtype||Cut-off||van der Walls forces cut-off at
|rvdw||1.0||Cut-off for VDW (nm).|
|DispCorr||EnerPress||Long-range correction for VDW for both energy and pressure.|
Cut-off distances should be set keeping in mind how the force field was parameterized. In other words, its a good idea to look at the journal article that describes how the force field was created. We've chosen 1.0 nm for our cut-offs here, which is common enough for OPLS, but you may determine for your system to choose something else.
Additionally in each part we'll also be outputting an energy file, a log file,
and a compressed trajectory file. The rate of output (in simulation steps) for
these is set using
respectively. We'll output more information in the production run.
For each part, except for the second minimization, we'll also be constraining
all bonds involving a hydrogen using the LINCS algorithm by setting
constraint-algorithm = lincs and
constraints = h-bonds. This allows us to use
a larger time step than otherwise.
For the first minimization we use the steepest descents algorithm by setting
integrator = steep to minimize the energy of the system with a maximum of
1,000 steps (
nsteps = 1000). The minimization will stop if the energy
converges before then. Additionally we have
-DFLEXIBLE. This lets GROMACS know to use flexible water, since by default all
water models are rigid using an algorithm called SETTLE. In the water model's
topology file, which we have includes, there is an if statement that looks for
FLEXIBLE variable to defined. The purpose of this first minimization is to
get the molecules in a good starting position so we can turn on SETTLE without
In the second minimization we are simply removing
define = -DFLEXIBLE and
increasing the number of maximum steps to 50,000.
The last three parts -- the two equilibrations and production -- all use the
leap-frog integrator by setting
integrator = md. Additionally each one will
use a 2 fs time step by setting
dt = 0.002.
For the first equilibration step there are a few things to note. We are adding several parameters shown below:
|gen-vel||yes||Generate velocities for each atomic site according to a Maxwell-Boltzmann distribution. Only generate velocities for your first equilibration step. This gets us close to the temperature at which we will couple the system.|
|gen-temp||298.15||Temperature in K to use for
|tcoupl||Nose-Hoover||The algorithm to use for temperature coupling. Nose-Hoover correctly produces the canonical ensemble.|
|tc-grps||System||Which groups to couple. You can couple different groups of atoms separately, but we'll just couple the whole system.|
|tau-t||2.0||Time constant for coupling. See the manual for details.|
|ref-t||298.15||The temperature in K at which to couple.|
|nhchainlength||1||Leap-frog integrator only supports 1, but by default this is 10. This is set so GROMACS doesn't complain to us.|
The point of this first equilibration is to get us to the correct temperature
(298.15 K) before adding pressure coupling. Adding temperature and pressure
coupling at the same time can cause your system to be unstable and crash. We
don't want to shock our system at the beginning. Additionally, we have set
nsteps = 50000, so with a 2 fs timestep, that means that this will run for 100
ps. This is adequate for what we are doing here, but in larger / more
complicated systems you may need to equilibrate longer.
The second equilibration adds pressure coupling. Note that we are not
generating velocities again, since that will undo some of the work we just did.
We also set
continuation = yes for the constraints, since we are continuing
the simulation from the first equilibration. This part will run for 1 ns. Again,
this may need to be longer for other systems.
|pcoupl||Parrinello-Rahman||The algorithm to use for pressure coupling. Parrinello-Rahman correctly produces the isobaric-isothermal ensemble when used with Nose-Hoover.|
|tau-p||2.0||Time constant for coupling. See the manual for details.|
|ref-p||1.0||The pressure in bar at which to couple.|
|compressibility||4.46e-5||The compressibility of the system in bar^-1.|
For the production run, everything is exactly the same as the last equilibration, except we are outputting more data and running for 10 ns.
We have all the files we need now to run each part of the simulation. Each part you typically run gmx grompp to preprocess the three files we now have (.gro, .top, and .mdp) into a .tpr file (sometimes confusingly also called a topology file).
First, let's run our two minimization steps by doing the following:
$ gmx grompp -f mdp/min.mdp -o min -pp min -po min $ gmx mdrun -deffnm min $ gmx grompp -f mdp/min2.mdp -o min2 -pp min2 -po min2 -c min -t min $ gmx mdrun -deffnm min2
At each part we are reading in the .mdp file with the
-f flag. By default if
-p flags are not specified GROMACS uses
for the structure and topology files. Additionally we are outputting a processed
-pp and mdp file
-po. These are optional, but probably worth
looking at, especially the processed mdp file, since it is commented.
At each subsequent step we read in the previous step's last structure file or
checkpoint file using the
-t flags. By default GROMACS outputs
checkpoint files every 15 minutes and at the last step. If the checkpoint file
is not present, GROMACS will use the structure file defined by
-c, so it is a
good practice to specify both. At each gmx mdrun we are telling GROMACS to use
a default name for each input and output file, since several files are output.
Note we are using
-maxwarn 1 for the second minimization. Only use this flag
if you know what you are doing! In this case we get a warning about the
efficiency of L-BFGS which we can safely bypass.
To get a feel for what's going on, let's extract the potential energy of both of
these parts using the GROMACS command gmx energy. Do the following and enter
the number that corresponds with
Potential, followed by enter again:
$ gmx energy -f min.edr -o min-energy.xvg
Now do the same for the second minimization:
$ gmx energy -f min2.edr -o min2-energy.xvg
The header of the resulting
.xvg. file will contain information for use with
the Grace plotting program. I use gnuplot so some of these lines will cause
errors. I just simply replace every
@ character with
# in the
and then I can use gnuplot. To plot with first start gnuplot:
Then in the gnuplot terminal do:
> plot 'min-energy.xvg' w l
To plot the second minimization step do:
> plot 'min2-energy.xvg' w l
Your first plot should look something like this:
My second minimization didn't change anything, so I have nothing to plot.
Equilibration 1 (NVT)
Now that we have a good starting structure, let's do the first equilibration step, by adding the temperature coupling:
$ gmx grompp -f mdp/eql.mdp -o eql -pp eql -po eql -c min2 -t min2 $ gmx mdrun -deffnm eql
Let's take a look at how the temperature varies throughout the simulation:
$ gmx energy -f eql.edr -o eql-temp.xvg
Choose the number corresponding to
Temperature at the prompt and hit enter
again. Plot it in gnuplot as above. You should see something like:
Note that the temperature initially fluctuates wildly but eventually settles.
Equilibration 2 (NPT)
For our last equilibration, as stated earlier, we're adding a pressure coupling:
$ gmx grompp -f mdp/eql2.mdp -o eql2 -pp eql2 -po eql2 -c eql -t eql $ gmx mdrun -deffnm eql2
You can check out the temperature and pressure using gmx energy as above. Here's a plot of the pressure:
Note that pressure fluctuates quite a bit, which is normal. The average after full equilibration should be close to 1 bar in this case.
Now for the production part do:
$ gmx grompp -f mdp/prd.mdp -o prd -pp prd -po prd -c eql2 -t eql2 $ gmx mdrun -deffnm prd
Using gmx energy as above on
prd.edr, get the average temperature, pressure,
and density. Are they what you expect?
Here's my output:
Energy Average Err.Est. RMSD Tot-Drift ------------------------------------------------------------------------------- Temperature 298.145 0.019 8.65629 0.0338992 (K) Pressure 3.25876 0.97 688.616 -2.75083 (bar) Density 995.381 0.15 12.92 0.0705576 (kg/m^3)
You can also visualize your simulation using a program like vmd. To open the production part with vmd do:
$ vmd prd.gro prd.xtc
Here's a snapshot:
Note that do to the period boundary condition this can look kind of strange with bonds stretching across the box. You can make molecules whole by using gmx trjconv:
$ gmx trjconv -f prd.xtc -s prd.tpr -pbc mol -o prd-mol.xtc
Viewing that file should look much nicer:
In this tutorial we generate a box of TIP4PEW water using gmx solvate. We simulated it in five distinct parts: minimization 1, minimization 2, equilbiration 1, equilibration 2, and production. Each part used its own .mdp files which were explained. At each part we used gmx energy to extract useful information about the simulation. After the production run we were able to find the density of TIP4PEW water.