GromacsWrapper — a Python framework for GROMACS
- Release:
0.8.5+23.g463820c.dirty
- Date:
November 20, 2023
GromacsWrapper is a pure Python package that wraps system calls to GROMACS tools into thin classes. This allows for fairly seamless integration of the GROMACS tools into Python scripts. This is generally superior to shell scripts because of Python’s better error handling and superior data structures. It also allows for modularization and code re-use. In addition, commands, warnings and errors are logged to a file so that there exists a complete history of what has been done.
GROMACS versions 4.6.x, 2016.x, 2018.x, 2019.x, 2020.x, 2021.x,
2022.x, and 2023.x are all supported. GromacsWrapper detects your
Gromacs tools and provides them as gromacs.grompp()
,
gromacs.mdrun()
, etc, regardless of your Gromacs version, which
allows one to write scripts that are broadly Gromacs-version
agnostic. Source your GMXRC
file or make the gmx binary
(for versions ≥ 5.x) or all the GROMACS tools available on your
PATH
for GromacsWrapper to find the GROMACS installation.
Warning
Although GromacsWrapper has been used in published research over the last 10 years or so, there is no guarantee that any of the defaults chosen are suitable for any particular project or simulation setup.
It is your responsibility to ensure that you are running simulations with sensible parameters. Provide appropriate template files instead of the bundled defaults and check the logger output!
Please report bugs in the issue tracker and go to the discussions forum for questions.
Contributions are very welcome — start by raising an issue in the issue tracker to describe what feature you’re adding or what bug you’re fixing.
See also
Other approaches to interfacing Python and GROMACS are listed under Alternatives to GromacsWrapper.
Getting started
See Installation for supported versions of Python and download and installation instructions. The Quick Start provides a brief example of how to use GromacsWrapper in the most basic fashion.
The source code itself is available in the GromacsWrapper git repository.
You also need to have a version of GROMACS installed.
License
The GromacsWrapper package is made available under the terms of the GNU Public License v3 (or any higher version at your choice) except as noted below. See the file COPYING for the licensing terms for all modules.
Citing
GromacsWrapper was written by Oliver Beckstein with contributions from many other people. Please see the file AUTHORS for all the names.
If you find this package useful and use it in published work I’d be grateful if it was acknowledged in text as
“… used GromacsWrapper (Oliver Beckstein et al, https://github.com/Becksteinlab/GromacsWrapper doi: 10.5281/zenodo.17901)”
or in the Acknowledgements section.
Thank you.
Contact
Please use the issue tracker to report bugs, installation problems,
and feature requests (mention @orbeckst
in the issue report) and
use the discussions forum for general questions.
Installation
This document should help you to install the GromacsWrapper package. Please raise and issue in the Issue Tracker if problems occur or if you have suggestions on how to improve the package or these instructions. Ask for help in the discussions forum.
pip
installation
The latest release can be directly installed with pip:
pip install GromacsWrapper
(This will automatically download and install the latest version of GromacsWrapper from PyPi.)
conda
installation
New in version 0.8.1.
Changed in version 0.8.3: Package migrated from bioconda to conda-forge.
Install as a conda-forge package with the conda package manager from the conda-forge channel
conda install -c conda-forge gromacswrapper
The conda-forge channel should be explicitly specified if you are not already using it by default.
Note
The conda-forge channel also contains conda-forge packages for GROMACS (earlier versions ≤ 2021.x are available as GROMACS bioconda packages), which can be used for testing and system setup; for running in a high-performance environment you are advised to carefully benchmark and possibly compile a version of GROMACS that is tuned for the system.
Manual Download
If your prefer to download manually, get the latest stable release from https://github.com/Becksteinlab/GromacsWrapper/releases and either
pip install GromacsWrapper-0.8.5.tar.gz
or install from the unpacked source:
tar -zxvf GromacsWrapper-0.8.5.tar.gz
cd GromacsWrapper-0.8.5
pip install .
Source code access
The tar archive from https://github.com/Becksteinlab/GromacsWrapper/releases contains a full source code distribution.
In order to follow code development you can also browse the code git repository at https://github.com/Becksteinlab/GromacsWrapper and checkout the main branch:
git clone https://github.com/Becksteinlab/GromacsWrapper.git
cd GromacsWrapper
Code contributions are welcome. We use black for uniform code formatting so please install black and run it on your code.
Requirements
Python >= 3.8 and GROMACS (4.6.x, 2016, 2018, 2019, 2020, 2021, 2022, 2023) must be installed.
System requirements
Tested with Python 3.8–3.12 on Linux and Mac OS X. Earlier Python versions were only supported until release 0.8.5.
Required Python modules
The basic package makes use of numpy and numkit (which uses scipy); all dependencies are installed during a normal installation process.
Quick Start
Given a PDB file 1iee.pdb
, set up and run a simple simulation (assuming
you have all other input files at hand such as the MDP files).
Start with importing the package. If you can find Gromacs in your shell then GromacsWrapper can find it, too. Check the release of the loaded Gromacs package:
>>> import gromacs
>>> print(gromacs.release())
2018.2
You can get help through the usual Python mechanisms:
>>> help(gromacs.pdb2gmx)
DESCRIPTION
gmx pdb2gmx reads a .pdb (or .gro) file, reads some database files,
adds hydrogens to the molecules and generates coordinates in GROMACS
...
...
OPTIONS
Options to specify input files:
-f [<.gro/.g96/...>] (eiwit.pdb)
Structure file: gro g96 pdb brk ent esp tpr
...
...
Now set up the system: (1) generate topology, (2) put in a dodecahedral simulation box, (3) solvate with water (for simplicity, we leave out the “add ions step”):
>>> gromacs.pdb2gmx(f="1iee.pdb", o="protein.gro", p="topol.top",
... ff="oplsaa", water="tip4p")
>>> gromacs.editconf(f="protein.gro", o="boxed.gro",
... bt="dodecahedron", d=1.5, princ=True,
... input="Protein")
>>> gromacs.solvate(cp="boxed.gro", cs="tip4p", p="topol.top",
... o="solvated.gro")
Given an MDP input file for energy minimization, generate the TPR file and run the energy minimization locally:
>>> gromacs.grompp(f="emin.mdp", c="solvated.gro", p="topol.top",
... o="emin.tpr")
>>> gromacs.mdrun(v=True, deffnm="emin")
Assuming it all went well, set up and run a MD simulation, starting from the energy minimized system:
>>> gromacs.grompp(f="md.mdp", c="emin.gro", p="topol.top", o="md.tpr")
>>> gromacs.mdrun(v=True, deffnm="md")
See also
The documentation of the gromacs
package itself contains
more examples and explains the common arguments of all commands.
GromacsWrapper Overview
GromacsWrapper (package gromacs
) is a thin shell around the Gromacs
tools for light-weight integration into python scripts or interactive use in
ipython.
Modules
gromacs
The top level module contains all gromacs tools; each tool can be run directly or queried for its documentation. It also defines the root logger class (name gromacs by default).
gromacs.config
Configuration options. Not really used much at the moment.
gromacs.cbook
The Gromacs cook book contains typical applications of the tools. In many cases this not more than just an often-used combination of parameters for a tool.
gromacs.tools
Contains classes that wrap the gromacs tools. They are automatically generated from the list of tools in
gromacs.tools.gmx_tools
.gromacs.fileformats
Classes to represent data files in various formats such as xmgrace graphs. The classes allow reading and writing and for graphs, also plotting of the data.
gromacs.utilities
Convenience functions and mixin-classes that are used as helpers in other modules.
gromacs.setup
Functions to set up a MD simulation, containing tasks such as solvation and adding ions, energy minimizqtion, MD with position-restraints, and equilibrium MD.
gromacs.qsub
Functions to handle batch submission queuing systems.
gromacs.run
Classes to run mdrun in various way, including on multiprocessor systems.
Examples
The following examples should simply convey the flavour of using the package. See the individual modules for more examples.
Getting help
In python:
gromacs.g_dist.help()
gromacs.g_dist.help(long=True)
In ipython
:
gromacs.g_dist ?
Simple usage
Gromacs flags are given as python keyword arguments:
gromacs.g_dist(v=True, s='topol.tpr', f='md.xtc', o='dist.xvg', dist=1.2)
Input to stdin of the command can be supplied:
gromacs.make_ndx(f='topol.tpr', o='md.ndx',
input=('keep "SOL"', '"SOL" | r NA | r CL', 'name 2 solvent', 'q'))
Output of the command can be caught in a variable and analyzed:
rc, output, junk = gromacs.grompp(..., stdout=False) # collects command output
for line in output.split('\n'):
line = line.strip()
if line.startswith('System has non-zero total charge:'):
qtot = float(line[34:])
break
(See gromacs.cbook.grompp_qtot()
for a more robust implementation of this
application.)
Warnings and Exceptions
A number of package-specific exceptions (GromacsError
) and
warnings (GromacsFailureWarning
, GromacsImportWarning
,
GromacsValueWarning
, AutoCorrectionWarning
,
BadParameterWarning
) can be raised.
If you want to stop execution at, for instance, a AutoCorrectionWarning
or
BadParameterWarning
then use the python warnings
filter:
import warnings
warnings.simplefilter('error', gromacs.AutoCorrectionWarning)
warnings.simplefilter('error', gromacs.BadParameterWarning)
This will make python raise an exception instead of moving on. The default is to always report, eg:
warnings.simplefilter('always', gromacs.BadParameterWarning)
The following exceptions are defined:
- exception gromacs.GromacsError[source]
Error raised when a gromacs tool fails.
Returns error code in the errno attribute and a string in strerror. # TODO: return status code and possibly error message
- exception gromacs.MissingDataError[source]
Error raised when prerequisite data are not available.
For analysis with
gromacs.analysis.core.Simulation
this typically means that theanalyze()
method has to be run first.
The following warnings are defined:
- exception gromacs.GromacsValueWarning[source]
Warns about problems with the value of an option or variable.
- exception gromacs.AutoCorrectionWarning[source]
Warns about cases when the code is choosing new values automatically.
Logging
The library uses python’s logging module to keep a history of what it has been
doing. In particular, every wrapped Gromacs command logs its command line
(including piped input) to the log file (configured in
gromacs.config.logfilename
). This facilitates debugging or simple
re-use of command lines for very quick and dirty work. The logging facilty
appends to the log file and time-stamps every entry. See gromacs.config
for more details on configuration.
It is also possible to capture output from Gromacs commands in a file instead of displaying it on screen, as described under Input and Output.
Normally, one starts logging with the start_logging()
function but in
order to obtain logging messages (typically at level debug) right from the
start one may set the environment variable GW_START_LOGGING
to any
value that evaluates to True
(e.g., “True” or “1”).
Version
The package version is recorded in the gromacs.__version__
variable.
- gromacs.__version__ = '0.8.5+23.g463820c.dirty'
Version of the package, following semantic versioning in the form MAJOR.MINOR.PATCH. When PATCH increases, bugs are fixed or documentation or metadata are updated. Increases in MINOR can introduce new features and deprecate old code. API-breaking and backwards incompatible changes can only occur when MAJOR is increased, except during initial development while MAJOR == 0, in which also increases in MINOR may (rarely) introduce breaking changes.
Additional information after PATCH indicates that you are working with an unreleased version, with the number of git commits after the release and the commit ID encoded in the trailing string.
Configuration
This section documents how to configure the GromacsWrapper package. There
are options to configure where log files and templates directories are located
and options to tell exactly which commands to load into this package. Any
configuration is optional and all options have sane defaults. Further
documentation can be found at gromacs.config
.
Default configuration
Note
Do not configure anything. This is the best approach.
If you are used to loading your Gromacs environment by sourcing the
GMXRC
file yourself or via module then do not
configure anything and let GromacsWrapper find your Gromacs
installation. Only read on if there are specific things that you want
to configure or if you always want to use exactly the same version of Gromacs
with GromacsWrapper.
Basic options
Place an INI file named ~/.gromacswrapper.cfg
in your home directory, it
may look like the following document. The GMXRC parameter is the path your
GMXRC
start-up script:
[Gromacs]
GMXRC = /usr/local/gromacs/bin/GMXRC
The Gromacs software suite needs some environment variables that are set up
sourcing the GMXRC
file. You may source it yourself (then do not
include it in the config file) or set the option like the above one. If this
option isn’t provided, GromacsWrapper will guess that Gromacs was globally
installed as if it was installed somewhere on your PATH
or if you
externally set the Gromacs environment.
As there isn’t yet any way to know which Gromacs version to use, GromacsWrapper will first try to use “modern” Gromacs (i.e., version 5, 2016, 2018, 2019, 2020, 2021, …) if available, then to use Gromacs 4.x. If you have modern versions (collectively referred to as “version 5”) and want to use version 4 or just want to document it, you may specify which version will be used with the release parameter:
[Gromacs]
GMXRC = /usr/local/gromacs/bin/GMXRC
release = 4.6.7
For now GromacsWrapper will guess which tools are available to put it into
gromacs.tools
, but you can always configure it manually with the
tools parameter. Gromacs 5/2016/…/2021 has a driver command (typically
called gmx) but depending on how you compile Gromacs, you can have
different drivers installed. For example, you might have 4 “gmx” commands
[Gromacs]
tools = gmx gmx_d gmx_mpi gmx_mpi_d
for single and double precision work and compiled with MPI support.
For Gromacs 4, tools are separate executables and you can specify them explicitly:
[Gromacs]
GMXRC = /usr/local/gromacs/bin/GMXRC
release = 4
tools =
g_cluster g_dyndom g_mdmat g_principal g_select g_wham mdrun
do_dssp g_clustsize g_enemat g_membed g_protonate g_sgangle g_wheel mdrun_d
editconf g_confrms g_energy g_mindist g_rama g_sham g_x2top mk_angndx
eneconv g_covar g_filter g_morph g_rdf g_sigeps genbox pdb2gmx
g_anadock g_current g_gyrate g_msd g_sorient genconf
g_anaeig g_density g_h2order g_nmeig g_rms g_spatial genion tpbconv
g_analyze g_densmap g_hbond g_nmens g_rmsdist g_spol genrestr trjcat
g_angle g_dielectric g_helix g_nmtraj g_rmsf g_tcaf gmxcheck trjconv
g_bar g_dih g_helixorient g_order g_rotacf g_traj gmxdump trjorder
g_bond g_dipoles g_kinetics g_pme_error g_rotmat g_tune_pme grompp
g_bundle g_disre g_lie g_polystat g_saltbr g_vanhove make_edi xpm2ps
g_chi g_dist g_luck g_potential g_sas g_velacc make_ndx
Commands will be available directly from the gromacs
module:
import gromacs
gromacs.mdrun_d # either v5 `gmx_d mdrun` or v4 `mdrun_d`
gromacs.mdrun # either v5 `gmx mdrun` or v4 `mdrun`
Gromacs 4 tools will also be aliased to Gromacs 5 names (i.e., Gromacs 5/2016/2018/2019/2020/2021 names) so that it is, at least in principle, possible to run GromacsWrapper scripts under any version of Gromacs (between 4.x and at least 2021.x, except for incompatible changes in input files and command behavior).
Changed in version 0.6.0: The format of the tools
variable in the [Gromacs]
section of the
config file was changed for Gromacs 5 commands.
More options
Other parameters can be set to customize where templates for job submission systems and mdp files are located:
[DEFAULT]
# Directory to store user templates and rc files.
configdir = ~/.gromacswrapper
# Directory to store user supplied queuing system scripts.
qscriptdir = %(configdir)s/qscripts
# Directory to store user supplied template files such as mdp files.
templatesdir = %(configdir)s/templates
And there are options for how to handle logging:
[Logging]
# name of the logfile that is written to the current directory
logfilename = gromacs.log
# loglevels (see Python's logging module for details)
# ERROR only fatal errors
# WARN only warnings
# INFO interesting messages
# DEBUG everything
# console messages written to screen
loglevel_console = INFO
# file messages written to logfilename
loglevel_file = DEBUG
Creating default configuration files and directories
If needed you may set up basic configuration files and directories using
gromacs.config.setup()
:
import gromacs
gromacs.config.setup()
API documentation
The gromacs
package makes Gromacs tools available via thin
Python wrappers, which are generated in gromacs.tools
and made
available in the top-level name space gromacs
. The
functionality to generate the tool wrappers resides in the core
modules.
Building blocks to solve commonly encountered tasks related to set-up and running of simulations are collected as building blocks.
Gromacs core modules
This section documents the modules, classes, and functions on which the other parts of the package rely. The information is probably mostly relevant to anyone who wants to extend the package.
gromacs.core
– Core functionality
Here the basic command class GromacsCommand
is defined. All Gromacs
command classes in gromacs.tools
are automatically generated from
it. The documentation of GromacsCommand
applies to all wrapped Gromacs
commands and should be read by anyone using this package.
Input and Output
Each command wrapped by either GromacsCommand
or Command
takes three additional keyword arguments: stdout, stderr, and
input. stdout and stderr determine how the command returns its own
output.
The input keyword is a string that is fed to the standard input of the
command (actually, subprocess.Popen.stdin
). Or, if it is not string-like
then we assume it’s actually a file-like object that we can read from, e.g. a
subprocess.Popen.stdout
or a File
.
By setting the stdout and stderr keywords appropriately, one can have the
output simply printed to the screen (use True
; this is the default,
although see below for the use of the capture_output
gromacs.environment
flag), capture in a python variable as a string for
further processing (use False
), write to a file (use a File
instance) or as input for another command (e.g. use the
subprocess.Popen.stdin
).
When writing setup- and analysis pipelines it can be rather cumbersome to have
the gromacs output on the screen. For these cases GromacsWrapper allows you to
change its behaviour globally. By setting the value of the
gromacs.environment
Flag
capture_output
to True
(in the GromacsWrapper
gromacs.environment.flags
registry)
import gromacs.environment
gromacs.environment.flags['capture_output'] = True
all commands will capture their output (like stderr = False
and stdout
= False
). Explicitly setting these keywords overrides the global
default. The default value for flags['capture_output']
is False
,
i.e. output is directed through STDOUT and STDERR.
Warning
One downside of flags['capture_output'] = True
is that it becomes much
harder to debug scripts unless the script is written in such a way to show
the output when the command fails. Therefore, it is advisable to only
capture output on well-tested scripts.
A third value of capture_output
is the value "file"
:
gromacs.environment.flags['capture_output'] = "file"
This writes the captured output to a file. The file name is specified in
flags['capture_output_filename'
and defaults to
“gromacs_captured_output.txt”. This file is over-written for each
command. In this way one can investigate the output from the last command
(presumably because it failed). STDOUT and STDERR are captured into this file
by default. STDERR is printed first and then STDOUT, which does not necessarily
reflect the order of output one would see on the screen. If your code captures
STDOUT for further processing then an uncaptured STDERR is written to the
capture file.
Note
There are some commands for which capturing output
(flags['capture_output'] = True
) might be problematic. If the command
produces a large or inifinite amount of data then a memory error will occur
because Python nevertheless stores the output internally first. Thus one
should avoid capturing progress output from
e.g. Mdrun
unless the output has been throttled
appropriately.
Classes
- class gromacs.core.GromacsCommand(*args, **kwargs)[source]
Base class for wrapping a Gromacs tool.
Limitations: User must have sourced
GMXRC
so that the python script can inherit the environment and find the gromacs programs.The class doc string is dynamically replaced by the documentation of the gromacs command the first time the doc string is requested. If the tool is not available at the time (i.e., cannot be found on
PATH
) then the generic doc string is shown and anOSError
exception is only raised when the user is actually trying to the execute the command.Set up the command with gromacs flags as keyword arguments.
The following are generic instructions; refer to the Gromacs command usage information that should have appeared before this generic documentation.
As an example, a generic Gromacs command could use the following flags:
cmd = GromacsCommand('v', f=['md1.xtc','md2.xtc'], o='processed.xtc', t=200, ...)
which would correspond to running the command in the shell as
GromacsCommand -v -f md1.xtc md2.xtc -o processed.xtc -t 200
Gromacs command line arguments
Gromacs boolean switches (such as
-v
) are given as python positional arguments ('v'
) or as keyword argument (v=True
); note the quotes in the first case. Negating a boolean switch can be done with'nov'
,nov=True
orv=False
(and evennov=False
works as expected: it is the same asv=True
).Any Gromacs options that take parameters are handled as keyword arguments. If an option takes multiple arguments (such as the multi-file input
-f file1 file2 ...
) then the list of files must be supplied as a python list.If a keyword has the python value
None
then it will not be added to the Gromacs command line; this allows for flexible scripting if it is not known in advance if an input file is needed. In this case the default value of the gromacs tool is used.Keywords must be legal python keywords or the interpreter raises a
SyntaxError
but of course Gromacs commandline arguments are not required to be legal python. In this case “quote” the option with an underscore (_
) and the underscore will be silently stripped. For instance,-or
translates to the illegal keywordor
so it must be underscore-quoted:cmd(...., _or='mindistres.xvg')
Command execution
The command is executed with the
run()
method or by calling it as a function. The two next lines are equivalent:cmd(...) cmd.run(...)
When the command is run one can override options that were given at initialization or one can add additional ones. The same rules for supplying Gromacs flags apply as described above.
Non-Gromacs keyword arguments
The other keyword arguments (listed below) are not passed on to the Gromacs tool but determine how the command class behaves. They are only useful when instantiating a class, i.e. they determine how this tool behaves during all future invocations although it can be changed by setting
failuremode
. This is mostly of interest to developers.- Keywords:
- failure
determines how a failure of the gromacs command is treated; it can be one of the following:
- ‘raise’
raises GromacsError if command fails
- ‘warn’
issue a
GromacsFailureWarning
None
just continue silently
- docstring
additional documentation (ignored) []
Changed in version 0.6.0: The doc keyword is now ignored (because it was not worth the effort to make it work with the lazy-loading of docs).
- Popen(*args, **kwargs)
Returns a special Popen instance (
PopenWithInput
).The instance has its input pre-set so that calls to
communicate()
will not need to supply input. This is necessary if one wants to chain the output from one command to an input from another.- TODO:
Write example.
- command_name = None
Derive a class from command; typically one only has to set command_name to the name of the script or executable. The full path is required if it cannot be found by searching
PATH
.
- commandline(*args, **kwargs)
Returns the commandline that run() uses (without pipes).
- property failuremode
mode determines how the GromacsCommand behaves during failure
It can be one of the following:
- ‘raise’
raises GromacsError if command fails
- ‘warn’
issue a
GromacsFailureWarning
None
just continue silently
- failuremodes = ('raise', 'warn', None)
Available failure modes.
- help(long=False)
Print help; same as using
?
inipython
. long=True also gives call signature.
- run(*args, **kwargs)
Run the command; args/kwargs are added or replace the ones given to the constructor.
- class gromacs.core.Command(*args, **kwargs)[source]
Wrap simple script or command.
Set up the command class.
The arguments can always be provided as standard positional arguments such as
"-c", "config.conf", "-o", "output.dat", "--repeats=3", "-v", "input.dat"
In addition one can also use keyword arguments such as
c="config.conf", o="output.dat", repeats=3, v=True
These are automatically transformed appropriately according to simple rules:
Any single-character keywords are assumed to be POSIX-style options and will be prefixed with a single dash and the value separated by a space.
Any other keyword is assumed to be a GNU-style long option and thus will be prefixed with two dashes and the value will be joined directly with an equals sign and no space.
If this does not work (as for instance for the options of the UNIX
find
command) then provide options and values in the sequence of positional arguments.Example
Create a
Ls
class whose instances execute thels
command:LS = type("LS", (gromacs.core.Command,), {'command_name': 'ls'}) ls = LS() ls() # lists directory like ls ls(l=True) # lists directory like ls -l
Now create an instance that performs a long directory listing by default:
lslong = LS(l=True) lslong() # like ls -l
- Popen(*args, **kwargs)[source]
Returns a special Popen instance (
PopenWithInput
).The instance has its input pre-set so that calls to
communicate()
will not need to supply input. This is necessary if one wants to chain the output from one command to an input from another.- TODO:
Write example.
- __call__(*args, **kwargs)[source]
Run command with the given arguments:
rc,stdout,stderr = command(*args, input=None, **kwargs)
All positional parameters args and all gromacs kwargs are passed on to the Gromacs command. input and output keywords allow communication with the process via the python subprocess module.
- Arguments:
- inputstring, sequence
to be fed to the process’ standard input; elements of a sequence are concatenated with newlines, including a trailing one [
None
]- stdin
None
or automatically set toPIPE
if input given [None
]- stdout
how to handle the program’s stdout stream [
None
]- filehandle
anything that behaves like a file object
None
orTrue
to see output on screen
False
orPIPE
returns the output as a string in the stdout parameter
- stderr
how to handle the stderr stream [
None
]STDOUT
merges standard error with the standard out stream
False
orPIPE
returns the output as a string in the stderr return parameter
None
orTrue
keeps it on stderr (and presumably on screen)
Depending on the value of the GromacsWrapper flag
gromacs.environment.flags```['capture_output']`
the above default behaviour can be different.All other kwargs are passed on to the Gromacs tool.
- Returns:
The shell return code rc of the command is always returned. Depending on the value of output, various strings are filled with output from the command.
- Notes:
In order to chain different commands via pipes one must use the special
PopenWithInput
object (seeGromacsCommand.Popen()
method) instead of the simple call described here and first construct the pipeline explicitly and then call thePopenWithInput.communicate()
method.STDOUT
andPIPE
are objects provided by thesubprocess
module. Any python stream can be provided and manipulated. This allows for chaining of commands. Usefrom subprocess import PIPE, STDOUT
when requiring these special streams (and the special boolean switches
True
/False
cannot do what you need.)(TODO: example for chaining commands)
- command_name = None
Derive a class from command; typically one only has to set command_name to the name of the script or executable. The full path is required if it cannot be found by searching
PATH
.
- help(long=False)[source]
Print help; same as using
?
inipython
. long=True also gives call signature.
- class gromacs.core.PopenWithInput(*args, **kwargs)[source]
Popen class that knows its input.
Set up the instance, including all the input it shoould receive.
Call
PopenWithInput.communicate()
later.
Note
Some versions of python have a bug in the subprocess module ( issue 5179 ) which does not clean up open file descriptors. Eventually code (such as this one) fails with the error:
OSError: [Errno 24] Too many open files
A weak workaround is to increase the available number of open file descriptors with
ulimit -n 2048
and run analysis in different scripts.Initialize with the standard
subprocess.Popen
arguments.- Keywords:
- input
string that is piped into the command
gromacs.config
– Configuration for GromacsWrapper
The config module provides configurable options for the whole package;
It serves to define how to handle log files, set where template files are
located and which gromacs tools are exposed in the gromacs
package.
In order to set up a basic configuration file and the directories
a user can execute gromacs.config.setup()
.
If the configuration file is edited then one can force a rereading of
the new config file with gromacs.config.get_configuration()
:
gromacs.config.get_configuration()
However, this will not update the available command classes (e.g. when new
executables were added to a tool group). In this case one either has to
reload()
a number of modules (gromacs
, gromacs.config
,
gromacs.tools
) although it is by far easier simply to quit python and
freshly import gromacs
.
Almost all aspects of GromacsWrapper (paths, names, what is loaded)
can be changed from within the configuration file. The only exception
is the name of the configuration file itself: This is hard-coded as
~/.gromacswrapper.cfg
although it is possible to read other
configuration files with the filename argument to
get_configuration()
.
Configuration management
Important configuration variables are
- gromacs.config.configdir = '/home/docs/.gromacswrapper'
Directory to store user templates and rc files. The default value is
~/.gromacswrapper
.
- gromacs.config.path = ['.', '/home/docs/.gromacswrapper/qscripts', '/home/docs/.gromacswrapper/templates']
Search path for user queuing scripts and templates. The internal package-supplied templates are always searched last via
gromacs.config.get_templates()
. Modifygromacs.config.path
directly in order to customize the template and qscript searching. By default it has the value['.', qscriptdir, templatesdir]
. (Note that it is not a good idea to have template files and qscripts with the same name as they are both searched on the same path.)path
is updated whenever cfg is re-read withget_configuration()
.
When GromacsWrapper starts up it runs check_setup()
. This
notifies the user if any config files or directories are missing and
suggests to run setup()
. The check if the default set up exists
can be suppressed by setting the environment variable
GROMACSWRAPPER_SUPPRESS_SETUP_CHECK
to ‘true’ (‘yes’ and ‘1’
also work).
Users
Users will likely only need to run gromacs.config.setup()
once and
perhaps occasionally execute gromacs.config.get_configuration()
. Mainly
the user is expected to configure GromacsWrapper by editing the configuration
file ~/.gromacswrapper.cfg
(which has ini-file syntax as described in
ConfigParser
).
- gromacs.config.setup(filename='/home/docs/.gromacswrapper.cfg')[source]
Prepare a default GromacsWrapper global environment.
Create the global config file.
Create the directories in which the user can store template and config files.
This function can be run repeatedly without harm.
- gromacs.config.get_configuration(filename='/home/docs/.gromacswrapper.cfg')[source]
Reads and parses the configuration file.
Default values are loaded and then replaced with the values from
~/.gromacswrapper.cfg
if that file exists. The global configuration instancegromacswrapper.config.cfg
is updated as are a number of global variables such asconfigdir
,qscriptdir
,templatesdir
,logfilename
, …Normally, the configuration is only loaded when the
gromacs
package is imported but a re-reading of the configuration can be forced anytime by callingget_configuration()
.- Returns:
a dict with all updated global configuration variables
- gromacs.config.check_setup()[source]
Check if templates directories are setup and issue a warning and help.
Set the environment variable
GROMACSWRAPPER_SUPPRESS_SETUP_CHECK
skip the check and make it always returnTrue
:return
True
if directories were found andFalse
otherwiseChanged in version 0.3.1: Uses
GROMACSWRAPPER_SUPPRESS_SETUP_CHECK
to suppress check (useful for scripts run on a server)
Developers
Developers are able to access all configuration data through
gromacs.config.cfg
, which represents the merger of the package default
values and the user configuration file values.
- gromacs.config.cfg = <gromacs.config.GMXConfigParser object>
cfg
is the instance ofGMXConfigParser
that makes all global configuration data accessible
- class gromacs.config.GMXConfigParser(*args, **kwargs)[source]
Customized
ConfigParser.SafeConfigParser
.Reads and parses the configuration file.
Default values are loaded and then replaced with the values from
~/.gromacswrapper.cfg
if that file exists. The global configuration instancegromacswrapper.config.cfg
is updated as are a number of global variables such asconfigdir
,qscriptdir
,templatesdir
,logfilename
, …Normally, the configuration is only loaded when the
gromacswrapper
package is imported but a re-reading of the configuration can be forced anytime by callingget_configuration()
.- property configuration
Dict of variables that we make available as globals in the module.
Can be used as
globals().update(GMXConfigParser.configuration) # update configdir, templatesdir ...
A subset of important data is also made available as top-level package
variables as described under Location of template files (for historical
reasons); the same variable are also available in the dict
gromacs.config.configuration
.
- gromacs.config.configuration = {'configdir': '/home/docs/.gromacswrapper', 'configfilename': '/home/docs/.gromacswrapper.cfg', 'logfilename': 'gromacs.log', 'loglevel_console': 20, 'loglevel_file': 10, 'path': ['.', '/home/docs/.gromacswrapper/qscripts', '/home/docs/.gromacswrapper/templates'], 'qscriptdir': '/home/docs/.gromacswrapper/qscripts', 'templatesdir': '/home/docs/.gromacswrapper/templates'}
Dict containing important configuration variables, populated by
get_configuration()
(mainly a shortcut; usecfg
in most cases).
Default values are hard-coded in
- gromacs.config.CONFIGNAME = '/home/docs/.gromacswrapper.cfg'
Default name of the global configuration file.
- gromacs.config.defaults = {'configdir': '/home/docs/.gromacswrapper', 'logfilename': 'gromacs.log', 'loglevel_console': 'INFO', 'loglevel_file': 'DEBUG', 'qscriptdir': '/home/docs/.gromacswrapper/qscripts', 'templatesdir': '/home/docs/.gromacswrapper/templates'}
Initial defaults for directories, filenames, and logger options.
Accessing configuration and template files
The following functions can be used to access configuration data. Note that
files are searched first with their full filename, then in all directories
listed in gromacs.config.path
, and finally within the package itself.
- gromacs.config.get_template(t)[source]
Find template file t and return its real path.
t can be a single string or a list of strings. A string should be one of
a relative or absolute path,
a file in one of the directories listed in
gromacs.config.path
,a filename in the package template directory (defined in the template dictionary
gromacs.config.templates
) ora key into
templates
.
The first match (in this order) is returned. If the argument is a single string then a single string is returned, otherwise a list of strings.
- Arguments:
t : template file or key (string or list of strings)
- Returns:
os.path.realpath(t) (or a list thereof)
- Raises:
ValueError
if no file can be located.
- gromacs.config.get_templates(t)[source]
Find template file(s) t and return their real paths.
t can be a single string or a list of strings. A string should be one of
a relative or absolute path,
a file in one of the directories listed in
gromacs.config.path
,a filename in the package template directory (defined in the template dictionary
gromacs.config.templates
) ora key into
templates
.
The first match (in this order) is returned for each input argument.
- Arguments:
t : template file or key (string or list of strings)
- Returns:
list of os.path.realpath(t)
- Raises:
ValueError
if no file can be located.
Logging
Gromacs commands log their invocation to a log file; typically at loglevel INFO (see the python logging module for details).
- gromacs.config.logfilename = 'gromacs.log'
File name for the log file; all gromacs command and many utility functions (e.g. in
gromacs.cbook
andgromacs.setup
) append messages there. Warnings and errors are also recorded here. The default is gromacs.log.
- gromacs.config.loglevel_console = 20
The default loglevel that is still printed to the console.
- gromacs.config.loglevel_file = 10
The default loglevel that is still written to the
logfilename
.
Gromacs tools and scripts
Fundamentally, GromacsWrapper makes existing Gromacs tools (executables) available as functions. In order for this to work, these executables must be found in the environment of the Python process that runs GromacsWrapper, and the user must list all the tools that are to be made available.
Setting up the environment
The standard way to set up the Gromacs environment is to source GMXRC
in
the shell before running the Python process. GMXRC
adjusts a number of
environment variables (such as PATH
and LD_LIBRARY_PATH
)
but also sets Gromacs-specific environment variables such as GMXBIN
,
GMXDATA
, and many others:
source /usr/local/bin/GMXRC
(where the path to GMXRC
is often set differently to disntinguish different
installed versions of Gromacs).
Alternatively, GromacsWrapper can itself source a GMXRC
file and set the
environment with the set_gmxrc_environment()
function. The path to a
GMXRC
file can be set in the config file in the [Gromacs]
section as
[Gromacs]
GMXRC = /usr/local/bin/GMXRC
When GromacsWrapper starts up, it tries to set the environment using the
GMXRC
defined in the config file. If this is left empty or is not in the
file, nothing is being done.
- gromacs.config.set_gmxrc_environment(gmxrc)[source]
Set the environment from
GMXRC
provided in gmxrc.Runs
GMXRC
in a subprocess and puts environment variables loaded by it into this Python environment.If gmxrc evaluates to
False
then nothing is done. If errors occur then only a warning will be logged. Thus, it should be safe to just call this function.
List of tools
The list of Gromacs tools can be specified in the config file in the
[Gromacs]
section with the tools
variable.
The tool groups are a list of names that determines which tools are made
available as classes in gromacs.tools
. If not provided
GromacsWrapper will first try to load Gromacs 5.x then Gromacs 4.x
tools.
If you choose to provide a list, the Gromacs tools section of the config file can be like this:
[Gromacs]
# Release of the Gromacs package to which information in this sections applies.
release = 4.5.3
# tools contains the file names of all Gromacs tools for which classes are
# generated. Editing this list has only an effect when the package is
# reloaded.
# (Note that this example has a much shorter list than the actual default.)
tools =
editconf make_ndx grompp genion genbox
grompp pdb2gmx mdrun mdrun_d
# which tool groups to make available
groups = tools extra
For Gromacs 5.x and later (e.g., 2021) use a section like the following, where driver commands are supplied:
[Gromacs]
# Release of the Gromacs package to which information in this sections applies.
release = 5.0.5
# GMXRC contains the path for GMXRC file which will be loaded. If not
provided is expected that it was sourced as usual before importing this
library.
GMXRC = /usr/local/gromacs/bin/GMXRC
# tools contains the command names of all Gromacs tools for which classes are generated.
# Editing this list has only an effect when the package is reloaded.
# (Note that this example has a much shorter list than the actual default.)
tools = gmx gmx_d
For example, on the commandline you would run
gmx grompp -f md.mdp -c system.gro -p topol.top -o md.tpr
and within GromacsWrapper this would become
gromacs.grompp(f="md.mdp", c="system.gro", p="topol.top", o="md.tpr")
Note
Because of changes in the Gromacs tool in 5.x, GromacsWrapper scripts might break, even if the tool names are still the same.
Location of template files
Template variables list files in the package that can be used as templates such as run input files. Because the package can be a zipped egg we actually have to unwrap these files at this stage but this is completely transparent to the user.
- gromacs.config.qscriptdir = '/home/docs/.gromacswrapper/qscripts'
Directory to store user supplied queuing system scripts. The default value is
~/.gromacswrapper/qscripts
.
- gromacs.config.templatesdir = '/home/docs/.gromacswrapper/templates'
Directory to store user supplied template files such as mdp files. The default value is
~/.gromacswrapper/templates
.
- gromacs.config.templates = {'darwin.sh': '/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/darwin.sh', 'em.mdp': '/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/em.mdp', 'gromacswrapper.cfg': '/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/gromacswrapper.cfg', 'gromacswrapper_465.cfg': '/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/gromacswrapper_465.cfg', 'local.sh': '/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/local.sh', 'md_CHARMM27.mdp': '/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/md_CHARMM27.mdp', 'md_CHARMM27_gpu.mdp': '/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/md_CHARMM27_gpu.mdp', 'md_G43a1.mdp': '/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/md_G43a1.mdp', 'md_OPLSAA.mdp': '/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/md_OPLSAA.mdp', 'md_OPLSAA_gpu.mdp': '/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/md_OPLSAA_gpu.mdp'}
GromacsWrapper comes with a number of templates for run input files and queuing system scripts. They are provided as a convenience and examples but WITHOUT ANY GUARANTEE FOR CORRECTNESS OR SUITABILITY FOR ANY PURPOSE.
All template filenames are stored in
gromacs.config.templates
. Templates have to be extracted from the GromacsWrapper python egg file because they are used by external code: find the actual file locations from this variable.Gromacs mdp templates
These are supplied as examples and there is NO GUARANTEE THAT THEY PRODUCE SENSIBLE OUTPUT — check for yourself! Note that only existing parameter names can be modified with
gromacs.cbook.edit_mdp()
at the moment; if in doubt add the parameter with its gromacs default value (or empty values) and modify later withedit_mdp()
.The safest bet is to use one of the
mdout.mdp
files produced bygromacs.grompp()
as a template as this mdp contains all parameters that are legal in the current version of Gromacs.Queuing system templates
The queing system scripts are highly specific and you will need to add your own into
gromacs.config.qscriptdir
. Seegromacs.qsub
for the format and how these files are processed.
- gromacs.config.qscript_template = '/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/local.sh'
The default template for SGE/PBS run scripts.
gromacs.environment
– Run time modification of behaviour
Some aspects of GromacsWrapper can be determined globally. The
corresponding flags Flag
are set in the environment (think of
them like environment variables). They are accessible through the
pseudo-dictionary gromacs.environment.flags
.
The entries appear as ‘name’-‘value’ pairs. Flags check values and illegal ones
raise a ValueError
. Documentation on all flags can be obtained with
print gromacs.environment.flags.doc()
List of GromacsWrapper flags with default values
- class gromacs.environment.flagsDocs[source]
capture_output = False
Select if Gromacs command output is always captured.
>>> flags['capture_output'] = False
By default a
GromacsCommand
will direct STDOUT and STDERR output from the command itself to the screen (through /dev/stdout and /dev/stderr). When running the command, this can be changed with the keywords stdout and stderr as described ingromacs.core
andCommand
.If this flag is set to
True
then by default STDOUT and STDERR are captured as if one had setstdout=False, stderr=False
Explicitly setting stdout and/or stderr overrides the behaviour described above.
If set to the special keyword
"file"` then the command writes to the file whose name is given by ``flags['capture_output_filename']
. This file is over-written for each command. In this way one can investigate the output from the last command (presumably because it failed). STDOUT and STDERR are captured into this file by default. STDERR is printed first and then STDOUT, which does not necessarily reflect the order of output one would see on the screen.The default is False.
capture_output_filename = ‘gromacs_captured_output.txt’
Name of the file that captures output if
flags['capture_output'] = "file"
>>> flags['capture_output_filename'] = 'gromacs_captured_output.txt'
This is an experimental feature. The default is ‘gromacs_captured_output.txt’.
Classes
- gromacs.environment.flags
- class gromacs.environment.Flags(*args)[source]
Global registry of flags. Acts like a dict for item access.
There are a number flags defined that influence how GromacsWrapper behaves. They are accessible through the pseudo-dictionary
The entries appear as ‘name’-‘value’ pairs. Flags check values and illegal ones raise a
ValueError
. Documentation on all flags can be obtained withprint gromacs.environment.flags.__doc__
New flags are added with the
Flags.register()
method which takes a newFlag
instance as an argument.For developers: Initialize Flags registry with a list of
Flag
instances.
- class gromacs.environment.Flag(name, default, mapping=None, doc=None)[source]
A Flag, essentially a variable that knows its default and legal values.
Create a new flag which will be registered with Flags.
Usage
newflag = Flag(name, default, mapping, doc)
- Arguments:
- name
name of the flag, must be a legal python name
- default
default value
- mapping
dict that maps allowed input values to canonical values; if
None
then no argument checking will be performed and all values are directly set.- doc
doc string; may contain string interpolation mappings for:
%%(name)s name of the flag %%(default)r default value %%(value)r current value %%(mapping)r mapping
Doc strings are generated dynamically and reflect the current state.
gromacs.formats
– Accessing various files
This module contains classes that represent data files on disk. Typically one creates an instance and
reads from a file using a
read()
method, orpopulates the instance (in the simplest case with a
set()
method) and the uses thewrite()
method to write the data to disk in the appropriate format.
For function data there typically also exists a plot()
method
which produces a graph (using matplotlib).
The module defines some classes that are used in other modules; they
do not make use of gromacs.tools
or gromacs.cbook
and
can be safely imported at any time.
Contents
Simple xmgrace XVG file format
Gromacs produces graphs in the xmgrace (“xvg”) format. These are
simple multi-column data files. The class XVG
encapsulates
access to such files and adds a number of methods to access the data
(as NumPy arrays), compute aggregates, or quickly plot it.
The XVG
class is useful beyond reading xvg files. With the
array keyword or the XVG.set()
method one can load data from
an array instead of a file. The array should be simple “NXY” data
(typically: first column time or position, further columns scalar
observables). The data should be a NumPy numpy.ndarray
array
a
with shape
(M, N)
where M-1 is the
number of observables and N the number of observations, e.g.the
number of time points in a time series. a[0]
is the time or
position and a[1:]
the M-1 data columns.
Errors
The XVG.error
attribute contains the statistical error for
each timeseries. It is computed from the standard deviation of the
fluctuations from the mean and their correlation time. The parameters
for the calculations of the correlation time are set with
XVG.set_correlparameters()
.
See also
numkit.timeseries.tcorrel()
Plotting
The XVG.plot()
and XVG.errorbar()
methods are set up to
produce graphs of multiple columns simultaneously. It is
typically assumed that the first column in the selected (sub)array
contains the abscissa (“x-axis”) of the graph and all further columns
are plotted against the first one.
Data selection
Plotting from XVG
is fairly flexible as one can always pass
the columns keyword to select which columns are to be
plotted. Assuming that the data contains [t, X1, X2, X3]
, then one
can
plot all observable columns (X1 to X3) against t:
xvg.plot()
plot only X2 against t:
xvg.plot(columns=[0,2])
plot X2 and X3 against t:
xvg.plot(columns=[0,2,3])
plot X1 against X3:
xvg.plot(columns=[2,3])
Coarse grainining of data
It is also possible to coarse grain the data for plotting (which typically results in visually smoothing the graph because noise is averaged out).
Currently, two alternative algorithms to produce “coarse grained” (decimated) graphs are implemented and can be selected with the method keyword for the plotting functions in conjuction with maxpoints (the number of points to be plotted):
mean histogram (default) — bin the data (using
numkit.timeseries.regularized_function()
and compute the mean for each bin. Gives the exact number of desired points but the time data are whatever the middle of the bin is.smooth subsampled — smooth the data with a running average (other windows like Hamming are also possible) and then pick data points at a stepsize compatible with the number of data points required. Will give exact times but not the exact number of data points.
For simple test data, both approaches give very similar output.
For the special case of periodic data such as angles, one can use the
circular mean (“circmean”) to coarse grain. In this case, jumps across
the -180º/+180º boundary are added as masked datapoints and no line is
drawn across the jump in the plot. (Only works with the simple
XVG.plot()
method at the moment, errorbars or range plots are
not implemented yet.)
See also
Examples
In this example we generate a noisy time series of a sine wave. We store the time, the value, and an error. (In a real example, the value might be the mean over multiple observations and the error might be the estimated error of the mean.)
>>> import numpy as np
>>> import gromacs.formats
>>> X = np.linspace(-10,10,50000)
>>> yerr = np.random.randn(len(X))*0.05
>>> data = np.vstack((X, np.sin(X) + yerr, np.random.randn(len(X))*0.05))
>>> xvg = gromacs.formats.XVG(array=data)
Plot value for all time points:
>>> xvg.plot(columns=[0,1], maxpoints=None, color="black", alpha=0.2)
Plot bin-averaged (decimated) data with the errors, over 1000 points:
>>> xvg.errorbar(maxpoints=1000, color="red")
(see output in Figure Plot of Raw vs Decimated data)

Plot of Raw vs Decimated data. Example of plotting raw data
(sine on 50,000 points, gray) versus the decimated graph (reduced
to 1000 points, red line). The errors were also decimated and
reduced to the errors within the 5% and the 95% percentile. The
decimation is carried out by histogramming the data in the desired
number of bins and then the data in each bin is reduced by either
numpy.mean()
(for the value) or
scipy.stats.scoreatpercentile()
(for errors).
In principle it is possible to use other functions to decimate the
data. For XVG.plot()
, the method keyword can be changed (see
XVG.decimate()
for allowed method values). For
XVG.errorbar()
, the method to reduce the data values (typically
column 1) is fixed to “mean” but the errors (typically columns 2 and 3)
can also be reduced with error_method = “rms”.
If one wants to show the variation of the raw data together with the decimated and smoothed data then one can plot the percentiles of the deviation from the mean in each bin:
>>> xvg.errorbar(columns=[0,1,1], maxpoints=1000, color="blue", demean=True)
The demean keyword indicates if fluctuations from the mean are
regularised [1]. The method XVG.plot_coarsened()
automates this approach and can plot coarsened data selected by the
columns keyword.
Footnotes
Classes and functions
- class gromacs.fileformats.xvg.XVG(filename=None, names=None, array=None, permissive=False, **kwargs)[source]
Class that represents the numerical data in a grace xvg file.
All data must be numerical.
NAN
andINF
values are supported via python’sfloat()
builtin function.The
array
attribute can be used to access the the array once it has been read and parsed. Thema
attribute is a numpy masked array (good for plotting).Conceptually, the file on disk and the XVG instance are considered the same data. Whenever the filename for I/O (
XVG.read()
andXVG.write()
) is changed then the filename associated with the instance is also changed to reflect the association between file and instance.With the permissive =
True
flag one can instruct the file reader to skip unparseable lines. In this case the line numbers of the skipped lines are stored inXVG.corrupted_lineno
.A number of attributes are defined to give quick access to simple statistics such as
mean
: mean of all data columnsstd
: standard deviationmin
: minimum of datamax
: maximum of dataerror
: error on the mean, taking correlation times into account (see alsoXVG.set_correlparameters()
)tc
: correlation time of the data (assuming a simple exponential decay of the fluctuations around the mean)
These attributes are numpy arrays that correspond to the data columns in
XVG.array
, i.e.XVG.array[1:]
.Note
Only simple XY or NXY files are currently supported, not Grace files that contain multiple data sets separated by ‘&’.
Any kind of formatting (i.e. xmgrace commands) is discarded.
Initialize the class from a xvg file.
- Arguments:
- filename
is the xvg file; it can only be of type XY or NXY. If it is supplied then it is read and parsed when
XVG.array
is accessed.- names
optional labels for the columns (currently only written as comments to file); string with columns separated by commas or a list of strings
- array
read data from array (see
XVG.set()
)- permissive
False
raises aValueError
and logs and errior when encountering data lines that it cannot parse.True
ignores those lines and logs a warning—this is a risk because it might read a corrupted input file [False
]- stride
Only read every stride line of data [1].
- savedata
True
includes the data (XVG.array`
and associated caches) when the instance is pickled (seepickle
); this is oftens not desirable because the data are already on disk (the xvg file filename) and the resulting pickle file can become very big.False
omits those data from a pickle. [False
]- metadata
dictionary of metadata, which is not touched by the class
- property array
Represent xvg data as a (cached) numpy array.
The array is returned with column-first indexing, i.e. for a data file with columns X Y1 Y2 Y3 … the array a will be a[0] = X, a[1] = Y1, … .
- decimate(method, a, maxpoints=10000, **kwargs)[source]
Decimate data a to maxpoints using method.
If a is a 1D array then it is promoted to a (2, N) array where the first column simply contains the index.
If the array contains fewer than maxpoints points or if maxpoints is
None
then it is returned as it is. The default for maxpoints is 10000.Valid values for the reduction method:
“mean”, uses
XVG.decimate_mean()
to coarse grain by averaging the data in bins along the time axis“circmean”, uses
XVG.decimate_circmean()
to coarse grain by calculating the circular mean of the data in bins along the time axis. Use additional keywords low and high to set the limits. Assumes that the data are in degrees.“min” and “max* select the extremum in each bin
“rms”, uses
XVG.decimate_rms()
to coarse grain by computing the root mean square sum of the data in bins along the time axis (for averaging standard deviations and errors)“percentile” with keyword per:
XVG.decimate_percentile()
reduces data in each bin to the percentile per“smooth”, uses
XVG.decimate_smooth()
to subsample from a smoothed function (generated with a running average of the coarse graining step size derived from the original number of data points and maxpoints)
- Returns:
numpy array
(M', N')
from a(M', N)
array withM' == M
(except whenM == 1
, see above) andN' <= N
(N'
is maxpoints).
- decimate_circmean(a, maxpoints, **kwargs)[source]
Return data a circmean-decimated on maxpoints.
Histograms each column into maxpoints bins and calculates the weighted circular mean in each bin as the decimated data, using
numkit.timeseries.circmean_histogrammed_function()
. The coarse grained time in the first column contains the centers of the histogram time.If a contains <= maxpoints then a is simply returned; otherwise a new array of the same dimensions but with a reduced number of maxpoints points is returned.
Keywords low and high can be used to set the boundaries. By default they are [-pi, +pi].
This method returns a masked array where jumps are flagged by an insertion of a masked point.
Note
Assumes that the first column is time and that the data are in degrees.
Warning
Breaking of arrays only works properly with a two-column array because breaks are only inserted in the x-column (a[0]) where y1 = a[1] has a break.
- decimate_error(a, maxpoints, **kwargs)[source]
Return data a error-decimated on maxpoints.
Histograms each column into maxpoints bins and calculates an error estimate in each bin as the decimated data, using
numkit.timeseries.error_histogrammed_function()
. The coarse grained time in the first column contains the centers of the histogram time.If a contains <= maxpoints then a is simply returned; otherwise a new array of the same dimensions but with a reduced number of maxpoints points is returned.
See also
numkit.timeseries.tcorrel()
Note
Assumes that the first column is time.
Does not work very well because often there are too few datapoints to compute a good autocorrelation function.
- decimate_max(a, maxpoints, **kwargs)[source]
Return data a max-decimated on maxpoints.
Histograms each column into maxpoints bins and calculates the maximum in each bin as the decimated data, using
numkit.timeseries.max_histogrammed_function()
. The coarse grained time in the first column contains the centers of the histogram time.If a contains <= maxpoints then a is simply returned; otherwise a new array of the same dimensions but with a reduced number of maxpoints points is returned.
Note
Assumes that the first column is time.
- decimate_mean(a, maxpoints, **kwargs)[source]
Return data a mean-decimated on maxpoints.
Histograms each column into maxpoints bins and calculates the weighted average in each bin as the decimated data, using
numkit.timeseries.mean_histogrammed_function()
. The coarse grained time in the first column contains the centers of the histogram time.If a contains <= maxpoints then a is simply returned; otherwise a new array of the same dimensions but with a reduced number of maxpoints points is returned.
Note
Assumes that the first column is time.
- decimate_min(a, maxpoints, **kwargs)[source]
Return data a min-decimated on maxpoints.
Histograms each column into maxpoints bins and calculates the minimum in each bin as the decimated data, using
numkit.timeseries.min_histogrammed_function()
. The coarse grained time in the first column contains the centers of the histogram time.If a contains <= maxpoints then a is simply returned; otherwise a new array of the same dimensions but with a reduced number of maxpoints points is returned.
Note
Assumes that the first column is time.
- decimate_percentile(a, maxpoints, **kwargs)[source]
Return data a percentile-decimated on maxpoints.
Histograms each column into maxpoints bins and calculates the percentile per in each bin as the decimated data, using
numkit.timeseries.percentile_histogrammed_function()
. The coarse grained time in the first column contains the centers of the histogram time.If a contains <= maxpoints then a is simply returned; otherwise a new array of the same dimensions but with a reduced number of maxpoints points is returned.
Note
Assumes that the first column is time.
- Keywords:
- per
percentile as a percentage, e.g. 75 is the value that splits the data into the lower 75% and upper 25%; 50 is the median [50.0]
See also
numkit.timeseries.regularized_function()
withscipy.stats.scoreatpercentile()
- decimate_rms(a, maxpoints, **kwargs)[source]
Return data a rms-decimated on maxpoints.
Histograms each column into maxpoints bins and calculates the root mean square sum in each bin as the decimated data, using
numkit.timeseries.rms_histogrammed_function()
. The coarse grained time in the first column contains the centers of the histogram time.If a contains <= maxpoints then a is simply returned; otherwise a new array of the same dimensions but with a reduced number of maxpoints points is returned.
Note
Assumes that the first column is time.
- decimate_smooth(a, maxpoints, window='flat')[source]
Return smoothed data a decimated on approximately maxpoints points.
Produces a smoothed graph using the smoothing window window; “flat” is a running average.
select points at a step size approximatelt producing maxpoints
If a contains <= maxpoints then a is simply returned; otherwise a new array of the same dimensions but with a reduced number of points (close to maxpoints) is returned.
Note
Assumes that the first column is time (which will never be smoothed/averaged), except when the input array a is 1D and therefore to be assumed to be data at equidistance timepoints.
TODO: - Allow treating the 1st column as data
- default_color_cycle = ['black', 'red', 'blue', 'orange', 'magenta', 'cyan', 'yellow', 'brown', 'green']
Default color cycle for
XVG.plot_coarsened()
:['black', 'red', 'blue', 'orange', 'magenta', 'cyan', 'yellow', 'brown', 'green']
- default_extension = 'xvg'
Default extension of XVG files.
- property error
Error on the mean of the data, taking the correlation time into account.
See [FrenkelSmit2002] p526:
error = sqrt(2*tc*acf[0]/T)
where acf() is the autocorrelation function of the fluctuations around the mean, y-<y>, tc is the correlation time, and T the total length of the simulation.
[FrenkelSmit2002]D. Frenkel and B. Smit, Understanding Molecular Simulation. Academic Press, San Diego 2002
- errorbar(**kwargs)[source]
errorbar plot for a single time series with errors.
Set columns keyword to select [x, y, dy] or [x, y, dx, dy], e.g.
columns=[0,1,2]
. SeeXVG.plot()
for details. Only a single timeseries can be plotted and the user needs to select the appropriate columns with the columns keyword.By default, the data are decimated (see
XVG.plot()
) for the default of maxpoints = 10000 by averaging data in maxpoints bins.x,y,dx,dy data can plotted with error bars in the x- and y-dimension (use filled =
False
).For x,y,dy use filled =
True
to fill the region between y±dy. fill_alpha determines the transparency of the fill color. filled =False
will draw lines for the error bars. Additional keywords are passed topylab.errorbar()
.By default, the errors are decimated by plotting the 5% and 95% percentile of the data in each bin. The percentile can be changed with the percentile keyword; e.g. percentile = 1 will plot the 1% and 99% perentile (as will percentile = 99).
The error_method keyword can be used to compute errors as the root mean square sum (error_method = “rms”) across each bin instead of percentiles (“percentile”). The value of the keyword demean is applied to the decimation of error data alone.
See also
XVG.plot()
lists keywords common to both methods.
- property ma
Represent data as a masked array.
The array is returned with column-first indexing, i.e. for a data file with columns X Y1 Y2 Y3 … the array a will be a[0] = X, a[1] = Y1, … .
inf and nan are filtered via
numpy.isfinite()
.
- property max
Maximum of the data columns.
- maxpoints_default = 10000
Aim for plotting around that many points
- property mean
Mean value of all data columns.
- property min
Minimum of the data columns.
- parse(stride=None)[source]
Read and cache the file as a numpy array.
Store every stride line of data; if
None
then the class default is used.The array is returned with column-first indexing, i.e. for a data file with columns X Y1 Y2 Y3 … the array a will be a[0] = X, a[1] = Y1, … .
- plot(**kwargs)[source]
Plot xvg file data.
The first column of the data is always taken as the abscissa X. Additional columns are plotted as ordinates Y1, Y2, …
In the special case that there is only a single column then this column is plotted against the index, i.e. (N, Y).
- Keywords:
- columnslist
Select the columns of the data to be plotted; the list is used as a numpy.array extended slice. The default is to use all columns. Columns are selected after a transform.
- transformfunction
function
transform(array) -> array
which transforms the original array; must return a 2D numpy array of shape [X, Y1, Y2, …] where X, Y1, … are column vectors. By default the transformation is the identity [lambda x: x
].- maxpointsint
limit the total number of data points; matplotlib has issues processing png files with >100,000 points and pdfs take forever to display. Set to
None
if really all data should be displayed. At the moment we simply decimate the data at regular intervals. [10000]- method
method to decimate the data to maxpoints, see
XVG.decimate()
for details- color
single color (used for all plots); sequence of colors (will be repeated as necessary); or a matplotlib colormap (e.g. “jet”, see
matplotlib.cm
). The default is to use theXVG.default_color_cycle
.- ax
plot into given axes or create new one if
None
[None
]- kwargs
All other keyword arguments are passed on to
matplotlib.pyplot.plot()
.
- Returns:
- ax
axes instance
- plot_coarsened(**kwargs)[source]
Plot data like
XVG.plot()
with the range of all data shown.Data are reduced to maxpoints (good results are obtained with low values such as 100) and the actual range of observed data is plotted as a translucent error band around the mean.
Each column in columns (except the abscissa, i.e. the first column) is decimated (with
XVG.decimate()
) and the range of data is plotted alongside the mean usingXVG.errorbar()
(see for arguments). Additional arguments:- Kewords:
- maxpoints
number of points (bins) to coarsen over
- color
single color (used for all plots); sequence of colors (will be repeated as necessary); or a matplotlib colormap (e.g. “jet”, see
matplotlib.cm
). The default is to use theXVG.default_color_cycle
.- method
Method to coarsen the data. See
XVG.decimate()
The demean keyword has no effect as it is required to be
True
.See also
- set(a)[source]
Set the array data from a (i.e. completely replace).
No sanity checks at the moment…
- set_correlparameters(**kwargs)[source]
Set and change the parameters for calculations with correlation functions.
The parameters persist until explicitly changed.
- Keywords:
- nstep
only process every nstep data point to speed up the FFT; if left empty a default is chosen that produces roughly 25,000 data points (or whatever is set in ncorrel)
- ncorrel
If no nstep is supplied, aim at using ncorrel data points for the FFT; sets
XVG.ncorrel
[25000]- force
force recalculating correlation data even if cached values are available
- kwargs
see
numkit.timeseries.tcorrel()
for other options
- property std
Standard deviation from the mean of all data columns.
- property tc
Correlation time of the data.
See
XVG.error()
for details.
- gromacs.fileformats.xvg.break_array(a, threshold=3.141592653589793, other=None)[source]
Create a array which masks jumps >= threshold.
Extra points are inserted between two subsequent values whose absolute difference differs by more than threshold (default is pi).
Other can be a secondary array which is also masked according to a.
Returns (a_masked, other_masked) (where other_masked can be
None
)
Gromacs XPM file format
Gromacs stores matrix data in the xpm file format. This implementation of a Python reader is based on Tsjerk Wassenaar’s post to gmx-users numerical matrix from xpm file (Mon Oct 4 13:05:26 CEST 2010). This version returns a NumPy array and can guess an appropriate dtype for the array.
Classes
- class gromacs.fileformats.xpm.XPM(filename=None, **kwargs)[source]
Class to make a Gromacs XPM matrix available as a NumPy
numpy.ndarray
.The data is available in the attribute
XPM.array
.Note
By default, the rows (2nd dimension) in the
XPM.array
are re-ordered so that row 0 (i.e.array[:,0]
corresponds to the first residue/hydrogen bond/etc. The original xpm matrix is obtained for reverse =False
. TheXPM
reader always reorders theXPM.yvalues
(obtained from the xpm file) to match the order of the rows.Initialize xpm structure.
- Arguments:
- filename
read from xpm file directly
- autoconvert
try to guess the type of the output array from the colour legend [
True
]- reverse
reverse rows (2nd dimension): re-orders the rows so that the first row corresponds e.g. to the first residue or first H-bonds and not the last) [
True
]
- xvalues
Values of on the x-axis, extracted from the xpm file.
- yvalues
Values of on the y-axis, extracted from the xpm file. These are in the same order as the rows in the xpm matrix. If reverse =
False
then this is typically a descending list of numbers (highest to lowest residue number, index number, etc). For reverse =True
it is resorted accordingly.
- COLOUR = re.compile(' ^.*" # start with quotation mark\n (?P<symbol>[ -~])# printable ASCII symbol used in the actual pixmap: \'space\' to \'~\'\n \\s+ , re.VERBOSE)
compiled regular expression to parse the colors in the xpm file:
static char *gromacs_xpm[] = { "14327 9 2 1", " c #FFFFFF " /* "None" */, "o c #FF0000 " /* "Present" */,
Matches are named “symbol”, “color” (hex string), and “value”. “value” is typically autoconverted to appropriate values with
gromacs.fileformats.convert.Autoconverter
. The symbol is matched as a printable ASCII character in the range 0x20 (space) to 0x7E (~).
- property array
XPM matrix as a
numpy.ndarray
.The attribute itself cannot be assigned a different array but the contents of the array can be modified.
- default_extension = 'xpm'
Default extension for files read/written by this class.
Example: Analysing H-bonds
Run gromacs.g_hbond()
to produce the existence map (and the log
file for the atoms involved in the bonds; the ndx file is also
useful):
gromacs.g_hbond(s=TPR, f=XTC, g="hbond.log", hbm="hb.xpm", hbn="hb.ndx")
Load the XPM:
hb = XPM("hb.xpm", reverse=True)
Calculate the fraction of time that each H-bond existed:
hb_fraction = hb.array.mean(axis=0)
Get the descriptions of the bonds:
desc = [line.strip() for line in open("hbond.log") if not line.startswith('#')]
Note
It is important that reverse=True
is set so that the rows in
the xpm matrix are brought in the same order as the H-bond
labels.
Show the results:
print "\n".join(["%-40s %4.1f%%" % p for p in zip(desc, 100*hb_fraction)])
Gromacs parameter MDP file format
The .mdp file contains a list of keywords that are used to set up a
simulation with Grompp
. The class MDP
parses this file and provides access to the keys and values as ordered
dictionary.
- class gromacs.fileformats.mdp.MDP(filename=None, autoconvert=True, **kwargs)[source]
Class that represents a Gromacs mdp run input file.
The MDP instance is an ordered dictionary.
Parameter names are keys in the dictionary.
Comments are sequentially numbered with keys Comment0001, Comment0002, …
Empty lines are similarly preserved as Blank0001, ….
When writing, the dictionary is dumped in the recorded order to a file. Inserting keys at a specific position is not possible.
Currently, comments after a parameter on the same line are discarded. Leading and trailing spaces are always stripped.
See also
For editing a mdp file one can also use
gromacs.cbook.edit_mdp()
(which works like a poor replacement for sed).Initialize mdp structure.
- Arguments:
- filename
read from mdp file
- autoconvertboolean
True
converts numerical values to python numerical types;False
keeps everything as strings [True
]- kwargs
Populate the MDP with key=value pairs. (NO SANITY CHECKS; and also does not work for keys that are not legal python variable names such as anything that includes a minus ‘-’ sign or starts with a number).
- default_extension = 'mdp'
Default extension for files read/written by this class.
- write(filename=None, skipempty=False)[source]
Write mdp file to filename.
- Keywords:
- filename
output mdp file; default is the filename the mdp was read from
- skipemptyboolean
True
removes any parameter lines from output that contain empty values [False
]
Note
Overwrites the file that the mdp was read from if no filename supplied.
Gromacs NDX index file format
The .ndx file contains lists of atom indices that are grouped in
sections by group names. The classes NDX
and
uniqueNDX
can parse such ndx files and provide convenient
access to the individual groups.
- class gromacs.fileformats.ndx.NDX(filename=None, **kwargs)[source]
Gromacs index file.
Represented as a ordered dict where the keys are index group names and values are numpy arrays of atom numbers.
Use the
NDX.read()
andNDX.write()
methods for I/O. Access groups by name via theNDX.get()
andNDX.set()
methods.Alternatively, simply treat the
NDX
instance as a dictionary. Setting a key automatically transforms the new value into a integer 1D numpy array (not a set, as would be the make_ndx behaviour).Note
The index entries themselves are ordered and can contain duplicates so that output from NDX can be easily used for g_dih and friends. If you need set-like behaviour you will have do use
gromacs.formats.uniqueNDX
orgromacs.cbook.IndexBuilder
(which uses make_ndx throughout).Example
Read index file, make new group and write to disk:
ndx = NDX() ndx.read('system.ndx') print ndx['Protein'] ndx['my_group'] = [2, 4, 1, 5] # add new group ndx.write('new.ndx')
Or quicker (replacing the input file
system.ndx
):ndx = NDX('system') # suffix .ndx is automatically added ndx['chi1'] = [2, 7, 8, 10] ndx.write()
- default_extension = 'ndx'
Default extension for files read/written by this class.
- format = '%6d'
standard ndx file format: ‘%6d’
- property groups
Return a list of all groups.
- ncol = 15
standard ndx file format: 15 columns
- property ndxlist
Return a list of groups in the same format as
gromacs.cbook.get_ndx_groups()
.- Format:
[ {‘name’: group_name, ‘natoms’: number_atoms, ‘nr’: # group_number}, ….]
- setdefault(**kwargs)[source]
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- property sizes
Return a dict with group names and number of entries,
- class gromacs.fileformats.ndx.uniqueNDX(filename=None, **kwargs)[source]
Index that behaves like make_ndx, i.e. entries behaves as sets, not lists.
The index lists behave like sets: - adding sets with ‘+’ is equivalent to a logical OR: x + y == “x | y” - subtraction ‘-’ is AND: x - y == “x & y” - see
join()
for ORing multiple groups (x+y+z+…)Example
I = uniqueNDX('system.ndx') I['SOLVENT'] = I['SOL'] + I['NA+'] + I['CL-']
- join(*groupnames)[source]
Return an index group that contains atoms from all groupnames.
The method will silently ignore any groups that are not in the index.
Example
Always make a solvent group from water and ions, even if not all ions are present in all simulations:
I['SOLVENT'] = I.join('SOL', 'NA+', 'K+', 'CL-')
Gromacs Preprocessed Topology (top) Parser
New in version 0.5.0.
Gromacs can produce preprocessed topology files that contain all topology information (generated using grompp -pp processed.top
).
Reading the regular topol.top is not supported, for now, since the #include
statements are not handled.
The TOP
parser can read an write processed.top files.
The TOP
also provides an interface to modify the force-field terms and parameters in a programmatic way.
Example applications involve system preparation for Hamiltonian-replica exchange (REST2 with lambda scaling), and automated force-field parametrization.
Gromacs TOP file format
Classes
- class gromacs.fileformats.top.TOP(fname)[source]
Class to make a TOP object from a GROMACS processed.top file
The force-field and molecules data is exposed as python object.
Note
Only processed.top files generated by GROMACS ‘grompp -pp’ are supported - the usual topol.top files are not supported (yet!)
Initialize the TOP structure.
- Arguments:
- fname
name of the processed.top file
- class gromacs.fileformats.top.SystemToGroTop(system, outfile='output.top', multiple_output=False)[source]
Converter class - represent TOP objects as GROMACS topology file.
Initialize GROMACS topology writer.
- Arguments:
- system
blocks.System
object, containing the topology- outfile
name of the file to write to
- multiple_output
if True, write moleculetypes to separate files, named mol_MOLNAME.itp (default: False)
History
Sources adapted from code by Reza Salari https://github.com/resal81/PyTopol
Example: Read a processed.top file and scale charges
Run grompp -pp
to produce a processed.top from conf.gro, grompp.mdp and topol.top files:
$ grompp -pp
This file now containts all the force-field information:
from gromacs.fileformats import TOP
top = TOP("processed.top")
Scale the LJ epsilon by an arbitrary number, here 0.9
scaling = 0.9
for at in top.atomtypes:
at.gromacs['param']['lje'] *= scaling
Write out the scaled down topology:
top.write("output.top")
Note
You can use this to prepare a series of top files for Hamiltonian Replica
Exchange (HREX) simulations. See scripts/gw-partial_tempering.py
for an example.
Gromacs TOP - BLOCKS boiler-plate code
Classes
- class gromacs.fileformats.blocks.System[source]
Top-level class containing molecule topology.
Contains all the parameter types (AtomTypes, BondTypes, … ) and molecules.
- class gromacs.fileformats.blocks.Molecule[source]
Class that represents a Molecule
Contains all the molecule attributes: atoms, bonds, angles dihedrals. Also contains settle, cmap and exclusion sections, if present.
- class gromacs.fileformats.blocks.Atom[source]
Class that represents an Atom
Contains only the simplest atom attributes, that are contained like in section example below.
Molecule
cantains anatoms
that’s a list-container forAtom
instances.
- class gromacs.fileformats.blocks.Param(format)[source]
Class that represents an abstract Parameter.
This class is the parent to AtomType, BondType and all the other parameter types.
The class understands a parameter line and that a
comment
that may follow. CMapType is an exception (it’s a multi-line parameter).convert()
provides a rudimentary support for parameter unit conversion between GROMACS and CHARMM notation: change kJ/mol into kcal/mol and nm into Angstrom.disabled
for supressing output when writing-out to a file.
History
Sources adapted from code by Reza Salari https://github.com/resal81/PyTopol
gromacs.fileformats.convert
— converting entries of tables
The Autoconverter
converts input values to appropriate Python
types.
It is mainly used by gromacs.fileformats.xpm.XPM
to automagically generate useful NumPy arrays from xpm files. Custom
conversions beyond the default ones in Autoconverter
can be
provided with the constructor keyword mapping.
See also
The Autoconverter
class was taken and slightly adapted from
recsql.converter in RecSQL.
- class gromacs.fileformats.convert.Autoconverter(mode='fancy', mapping=None, active=True, sep=False, **kwargs)[source]
Automatically convert an input value to a special python object.
The
Autoconverter.convert()
method turns the value into a special python value and casts strings to the “best” type (seebesttype()
).The defaults for the conversion of a input field value to a special python value are:
value
python
‘
---
’None
‘’
None
‘True’
True
‘x’
True
‘X’
True
‘yes’
True
‘Present’
True
‘False’
False
‘-’
False
‘no’
False
‘None’
False
‘none’
False
If the sep keyword is set to a string instead of
False
then values are split into tuples. Probably the most convenient way to use this is to set sep =True
(orNone
) because this splits on all white space whereas sep = ‘ ‘ would split multiple spaces.Example
With sep =
True
: ‘foo bar 22 boing---
’ –> (‘foo’, ‘bar’, 22, ‘boing’, None)With sep = ‘,’: 1,2,3,4 –> (1,2,3,4)
Initialize the converter.
- Arguments:
- mode
defines what the converter does
- “simple”
convert entries with
besttype()
- “singlet”
convert entries with
besttype()
and apply mappings- “fancy”
first splits fields into lists, tries mappings, and does the stuff that “singlet” does
- “unicode”
convert all entries with
to_unicode()
- mapping
any dict-like mapping that supports lookup. If``None`` then the hard-coded defaults are used
- active or autoconvert
initial state of the
Autoconverter.active
toggle.False
deactivates any conversion. [True
]- sep
character to split on (produces lists); use
True
orNone
(!) to split on all white space.
Changed in version 0.7.0: removed encoding keyword argument
- convert(x)
Convert x (if in the active state)
- active
If set to
True
then conversion takes place;False
just returnsbesttype()
applid to the value.
- property active
Toggle the state of the Autoconverter.
True
uses the mode,False
does nothing
- gromacs.fileformats.convert.besttype(x)[source]
Convert string x to the most useful type, i.e. int, float or unicode string.
If x is a quoted string (single or double quotes) then the quotes are stripped and the enclosed string returned.
Note
Strings will be returned as Unicode strings (using
to_unicode()
).Changed in version 0.7.0: removed encoding keyword argument
gromacs.utilities
– Helper functions and classes
The module defines some convenience functions and classes that are
used in other modules; they do not make use of gromacs.tools
or gromacs.cbook
and can be safely imported at any time.
Classes
FileUtils
provides functions related to filename handling. It
can be used as a base or mixin class. The gromacs.analysis.Simulation
class is derived from it.
- class gromacs.utilities.FileUtils[source]
Mixin class to provide additional file-related capabilities.
- check_file_exists(filename, resolve='exception', force=None)[source]
If a file exists then continue with the action specified in
resolve
.resolve
must be one of- “ignore”
always return
False
- “indicate”
return
True
if it exists- “warn”
indicate and issue a
UserWarning
- “exception”
raise
IOError
if it exists
Alternatively, set force for the following behaviour (which ignores resolve):
True
same as resolve = “ignore” (will allow overwriting of files)
False
same as resolve = “exception” (will prevent overwriting of files)
None
ignored, do whatever resolve says
- default_extension = None
Default extension for files read/written by this class.
- filename(filename=None, ext=None, set_default=False, use_my_ext=False)[source]
Supply a file name for the class object.
Typical uses:
fn = filename() ---> <default_filename> fn = filename('name.ext') ---> 'name' fn = filename(ext='pickle') ---> <default_filename>'.pickle' fn = filename('name.inp','pdf') --> 'name.pdf' fn = filename('foo.pdf',ext='png',use_my_ext=True) --> 'foo.pdf'
The returned filename is stripped of the extension (
use_my_ext=False
) and if provided, another extension is appended. Chooses a default if no filename is given.Raises a
ValueError
exception if no default file name is known.If
set_default=True
then the default filename is also set.use_my_ext=True
lets the suffix of a provided filename take priority over a defaultext
tension.Changed in version 0.3.1: An empty string as ext = “” will suppress appending an extension.
- class gromacs.utilities.AttributeDict[source]
A dictionary with pythonic access to keys as attributes — useful for interactive work.
- class gromacs.utilities.Timedelta[source]
Extension of
datetime.timedelta
.Provides attributes ddays, dhours, dminutes, dseconds to measure the delta in normal time units.
ashours gives the total time in fractional hours.
Functions
Some additional convenience functions that deal with files and directories:
- gromacs.utilities.openany(directory[, mode='r'])[source]
Context manager to open a compressed (bzip2, gzip) or plain file (uses
anyopen()
).
- gromacs.utilities.anyopen(datasource, mode='rt', reset=True)[source]
Open datasource (gzipped, bzipped, uncompressed) and return a stream.
datasource can be a filename or a stream (see
isstream()
). By default, a stream is reset to its start if possible (viaseek()
orreset()
).If possible, the attribute
stream.name
is set to the filename or “<stream>” if no filename could be associated with the datasource.- Arguments:
- datasource
a file (from
file
oropen()
) or a stream (e.g. fromurllib2.urlopen()
orcStringIO.StringIO
)- mode
{‘r’, ‘w’, ‘a’} (optional), Open in r(ead), w(rite) or a(ppen) mode. More complicated modes (‘r+’, ‘w+’, …) are not supported; only the first letter of mode is used and thus any additional modifiers are silently ignored.
- reset
bool (optional), try to read (mode ‘r’) the stream from the start
- Returns:
file-like object
- gromacs.utilities.isstream(obj)[source]
Detect if obj is a stream.
We consider anything a stream that has the methods
close()
and either set of the following
read()
,readline()
,readlines()
write()
,writeline()
,writelines()
- Arguments:
- obj
stream or str
- Returns:
bool,
True
if obj is a stream,False
otherwise
See also
New in version 0.7.1.
- gromacs.utilities.realpath(*args)[source]
Join all args and return the real path, rooted at /.
Expands
~
and environment variables such as$HOME
.Returns
None
if any of the args is none.
- gromacs.utilities.in_dir(directory[, create=True])[source]
Context manager to execute a code block in a directory.
The directory is created if it does not exist (unless create =
False
is set)At the end or after an exception code always returns to the directory that was the current directory before entering the block.
- gromacs.utilities.find_first(filename, suffices=None)[source]
Find first filename with a suffix from suffices.
- Arguments:
- filename
base filename; this file name is checked first
- suffices
list of suffices that are tried in turn on the root of filename; can contain the ext separator (
os.path.extsep
) or not
- Returns:
The first match or
None
.
- gromacs.utilities.withextsep(extensions)[source]
Return list in which each element is guaranteed to start with
os.path.extsep
.
- gromacs.utilities.which(program)[source]
Determine full path of executable program on
PATH
.(Jay at http://stackoverflow.com/questions/377017/test-if-executable-exists-in-python)
New in version 0.5.1.
Functions that improve list processing and which do not treat strings as lists:
- gromacs.utilities.iterable(obj)[source]
Returns
True
if obj can be iterated over and is not a string.
- gromacs.utilities.asiterable(obj)[source]
Returns obj so that it can be iterated over; a string is not treated as iterable
- gromacs.utilities.firstof(obj)[source]
Returns the first entry of a sequence or the obj.
Treats strings as single objects.
Functions that help handling Gromacs files:
- gromacs.utilities.unlink_gmx(*args)[source]
Unlink (remove) Gromacs file(s) and all corresponding backups.
- gromacs.utilities.unlink_gmx_backups(*args)[source]
Unlink (rm) all backup files corresponding to the listed files.
- gromacs.utilities.number_pdbs(*args, **kwargs)[source]
Rename pdbs x1.pdb … x345.pdb –> x0001.pdb … x0345.pdb
- Arguments:
args: filenames or glob patterns (such as “pdb/md*.pdb”)
format: format string including keyword num [“%(num)04d”]
Functions that make working with matplotlib easier:
- gromacs.utilities.activate_subplot(numPlot)[source]
Make subplot numPlot active on the canvas.
Use this if a simple
subplot(numRows, numCols, numPlot)
overwrites the subplot instead of activating it.
- gromacs.utilities.remove_legend(ax=None)[source]
Remove legend for axes or gca.
See http://osdir.com/ml/python.matplotlib.general/2005-07/msg00285.html
Miscellaneous functions:
- gromacs.utilities.convert_aa_code(x)[source]
Converts between 3-letter and 1-letter amino acid codes.
Data
- gromacs.utilities.amino_acid_codes = {'A': 'ALA', 'C': 'CYS', 'D': 'ASP', 'E': 'GLU', 'F': 'PHE', 'G': 'GLY', 'H': 'HIS', 'I': 'ILE', 'K': 'LYS', 'L': 'LEU', 'M': 'MET', 'N': 'ASN', 'P': 'PRO', 'Q': 'GLN', 'R': 'ARG', 'S': 'SER', 'T': 'THR', 'V': 'VAL', 'W': 'TRP', 'Y': 'TYR'}
translation table for 1-letter codes –> 3-letter codes .. Note: This does not work for HISB and non-default charge state aa!
analysis.collections
– Handling of groups of simulation instances
This module contains classes and functions that combine multiple
gromacs.analysis.core.Simulation
objects. In this way the
same kind of analysis or plotting task can be carried out
simultaneously for all simulations in the collection.
- class gromacs.collections.Collection(iterable=(), /)[source]
Multiple objects (organized as a list).
Methods are applied to all objects in the Collection and returned as new Collection:
>>> from gromacs.analysis.collections import Collection >>> animals = Collection(['ant', 'boar', 'ape', 'gnu']) >>> animals.startswith('a') Collection([True, False, True, False])
Similarly, attributes are returned as a Collection.
Using
Collection.save()
one can save the whole collection to disk and restore it later with theCollection.load()
method>>> animals.save('zoo') >>> arc = Collection() >>> arc.load('zoo') >>> arc.load('zoo', append=True) >>> arc ['ant', 'boar', 'ape', 'gnu', 'ant', 'boar', 'ape', 'gnu']
gromacs.tools
– Gromacs commands classes
The underlying idea of GromacsWrapper is to automatically generate Python
classes that run the actual GROMACS tools. These classes are written in such a
way that they can be called as if they were functions with parameters
resembling the GROMACS tools commandline arguments. This a two-step process
when gromacs
is imported: First a tool class is generated for each
GROMACS tool (the Command list). Then an instance of each of these
classes is instantiated in placed into the top level gromacs
module. Thus, gromacs.grompp
is an instance of
gromacs.tools.Grompp
. Users typically only access the tool instances
at the top level of the module.
Note
Because the tool instances are autogenerated on the fly and depend on the
installed GROMACS release, there is no autogenerated documentation available
for them. Use in Python help(gromacs.grompp)
to get help.
The following documentation for the gromacs.tools
module describes in
more detail how GROMACS tools are generated and managed and is primarily of
interest to developers.
Gromacs tool instantiation
A Gromacs command class produces an instance of a Gromacs tool command
(gromacs.core.GromacsCommand
), any argument or keyword argument
supplied will be used as default values for when the command is run.
Classes have the same name of the corresponding Gromacs tool with the first letter capitalized and dot and dashes replaced by underscores to make it a valid python identifier.
The list of tools to be loaded is configured with the tools
and groups
options of the ~/.gromacswrapper.cfg
file. Guesses are made if these
options are not provided; see gromacs.config
for details.
In the following example we create two instances of the
gromacs.tools.Trjconv
command (which runs the Gromacs trjconv
command):
from gromacs.tools import Trjconv
trjconv = tools.Trjconv()
trjconv_compact = tools.Trjconv(ur='compact', center=True, boxcenter='tric', pbc='mol',
input=('protein','system'))
The first one, trjconv
, behaves as the standard commandline tool but the
second one, trjconv_compact
, will by default create a compact
representation of the input data by taking into account the shape of the unit
cell. Of course, the same effect can be obtained by providing the corresponding
arguments to trjconv
but by naming the more specific command differently
one can easily build up a library of small tools that will solve a specific,
repeatedly encountered problem reliably. This is particularly helpful when doing
interactive work.
Aliased commands
GromacsWrapper has been around since ancient GROMACS 4.5.x and throughout the
history it has provided a way to run different versions of GROMACS commands
with the same Python script in a backwards compatible manner by aliasing
equivalent GROMACS tools to the same GromacsWrapper tool (using
NAMES5TO4
). Modern GROMACS (since 5.x and throughout 2016-2023) tools
are aliased to their Gromacs 4 tool names for backwards compatibility.
For example, in “classic” GROMACS we had the g_sas command that
became gmx sasa since GROMACS 5.x. In GromacsWrapper you can access
the same tool as gromacs.g_sas
or gromacs.sasa
.
Warning
You should check that the different GROMACS versions of tools give equivalent answers for your problem. The aliasing just makes it easy for you to call the tool in the same manner. You are still responsible for validating your own results.
Sometimes GROMACS changes commands in an way that is fundamentally incompatible
and in this way there is not much that GromacsWrapper can do. The best you can
probably do in your own scripts is to use gromacs.release
(see docs
for gromacs.tools.Release
) to check in your own script which release
of GROMACS is running and then call different tools depending on what you
find. For example, GROMACS 2023 replaced gmx do_dssp with
gmx dssp which is not directly argument-compatible so GromacsWrapper
does not alias it. Therefore, scripts relying on gromacs.do_dssp
will break unless you account for it explicitly with code similar to
release_year = int(gromacs.release()[:4])
if release_year >= 2023:
gromacs.dssp(s=TPR, f=XTC, sel="Protein",
o="dssp.dat", pbc=True, hmode="dssp")
else:
gromacs.do_dssp(s=TPR, f=XTC, input=["Protein"]
ssdump="ssdump.dat", o="ss.xpm", sc="scount.xvg")
Note
It is recommended to keep a single version of all tools for a project and record the version in the methods section of a publication.
Multi index
It is possible to extend the tool commands and patch in additional
functionality. For example, the GromacsCommandMultiIndex
class makes a
command accept multiple index files and concatenates them on the fly; the
behaviour mimics Gromacs’ “multi-file” input that has not yet been enabled for
all tools.
- class gromacs.tools.GromacsCommandMultiIndex(**kwargs)[source]
Command class that accept multiple index files.
It works combining multiple index files into a single temporary one so that tools that do not (yet) support multi index files as input can be used as if they did.
It creates a new file only if multiple index files are supplied.
Set up the command with gromacs flags as keyword arguments.
The following are generic instructions; refer to the Gromacs command usage information that should have appeared before this generic documentation.
As an example, a generic Gromacs command could use the following flags:
cmd = GromacsCommand('v', f=['md1.xtc','md2.xtc'], o='processed.xtc', t=200, ...)
which would correspond to running the command in the shell as
GromacsCommand -v -f md1.xtc md2.xtc -o processed.xtc -t 200
Gromacs command line arguments
Gromacs boolean switches (such as
-v
) are given as python positional arguments ('v'
) or as keyword argument (v=True
); note the quotes in the first case. Negating a boolean switch can be done with'nov'
,nov=True
orv=False
(and evennov=False
works as expected: it is the same asv=True
).Any Gromacs options that take parameters are handled as keyword arguments. If an option takes multiple arguments (such as the multi-file input
-f file1 file2 ...
) then the list of files must be supplied as a python list.If a keyword has the python value
None
then it will not be added to the Gromacs command line; this allows for flexible scripting if it is not known in advance if an input file is needed. In this case the default value of the gromacs tool is used.Keywords must be legal python keywords or the interpreter raises a
SyntaxError
but of course Gromacs commandline arguments are not required to be legal python. In this case “quote” the option with an underscore (_
) and the underscore will be silently stripped. For instance,-or
translates to the illegal keywordor
so it must be underscore-quoted:cmd(...., _or='mindistres.xvg')
Command execution
The command is executed with the
run()
method or by calling it as a function. The two next lines are equivalent:cmd(...) cmd.run(...)
When the command is run one can override options that were given at initialization or one can add additional ones. The same rules for supplying Gromacs flags apply as described above.
Non-Gromacs keyword arguments
The other keyword arguments (listed below) are not passed on to the Gromacs tool but determine how the command class behaves. They are only useful when instantiating a class, i.e. they determine how this tool behaves during all future invocations although it can be changed by setting
failuremode
. This is mostly of interest to developers.- Keywords:
- failure
determines how a failure of the gromacs command is treated; it can be one of the following:
- ‘raise’
raises GromacsError if command fails
- ‘warn’
issue a
GromacsFailureWarning
None
just continue silently
- docstring
additional documentation (ignored) []
Changed in version 0.6.0: The doc keyword is now ignored (because it was not worth the effort to make it work with the lazy-loading of docs).
Virtual Gromacs commands
The following “commands” do not exist as tools in the Gromacs package but are added here because they are useful.
- class gromacs.tools.Release[source]
Release string of the currently loaded Gromacs version.
- Returns:
str, Release string such as “2018.2” or “4.6.5” or
None
if Gromacs can not be found.
Note
The release string is obtained from the output of
gmx grompp -version
, specifically the line starting withGromacs version:
. If this changes then this function will break.Example
This command allows user code to work around known issues with old/new versions of Gromacs:
if gromacs.release.startswith("4"): # do something for classic Gromacs else: # do it the modern way
(Note that calling
gromacs.release()
will simply return the release string, which is equivalent tostr(gromacs.release)
. For convenience, thestartswith()
method exists, which directly works with the release string.)New in version 0.8.0.
Helpers
These helper functions are necessary for collecting and setting up the Gromacs tools. They are mostly of interest to developers.
- gromacs.tools.NAMES5TO4 = {'check': 'gmxcheck', 'convert-tpr': 'tpbconv', 'distance': 'g_dist', 'do_dssp': 'do_dssp', 'dump': 'gmxdump', 'editconf': 'editconf', 'eneconv': 'eneconv', 'gangle': 'g_sgangle', 'genconf': 'genconf', 'genion': 'genion', 'genrestr': 'genrestr', 'grompp': 'grompp', 'make_edi': 'make_edi', 'make_ndx': 'make_ndx', 'mdrun': 'mdrun', 'pdb2gmx': 'pdb2gmx', 'sasa': 'g_sas', 'solvate': 'genbox', 'trjcat': 'trjcat', 'trjconv': 'trjconv', 'trjorder': 'trjorder', 'xpm2ps': 'xpm2ps'}
dict of names in Gromacs 5 that correspond to an equivalent tool in in Gromacs 4. The names are literal Gromacs names.
- gromacs.tools.V4TOOLS = ('g_cluster', 'g_dyndom', 'g_mdmat', 'g_principal', 'g_select', 'g_wham', 'mdrun', 'do_dssp', 'g_clustsize', 'g_enemat', 'g_membed', 'g_protonate', 'g_sgangle', 'g_wheel', 'mdrun_d', 'editconf', 'g_confrms', 'g_energy', 'g_mindist', 'g_rama', 'g_sham', 'g_x2top', 'mk_angndx', 'eneconv', 'g_covar', 'g_filter', 'g_morph', 'g_rdf', 'g_sigeps', 'genbox', 'pdb2gmx', 'g_anadock', 'g_current', 'g_gyrate', 'g_msd', 'g_sorient', 'genconf', 'g_anaeig', 'g_density', 'g_h2order', 'g_nmeig', 'g_rms', 'g_spatial', 'genion', 'tpbconv', 'g_analyze', 'g_densmap', 'g_hbond', 'g_nmens', 'g_rmsdist', 'g_spol', 'genrestr', 'trjcat', 'g_angle', 'g_dielectric', 'g_helix', 'g_nmtraj', 'g_rmsf', 'g_tcaf', 'gmxcheck', 'trjconv', 'g_bar', 'g_dih', 'g_helixorient', 'g_order', 'g_rotacf', 'g_traj', 'gmxdump', 'trjorder', 'g_bond', 'g_dipoles', 'g_kinetics', 'g_pme_error', 'g_rotmat', 'g_tune_pme', 'grompp', 'g_bundle', 'g_disre', 'g_lie', 'g_polystat', 'g_saltbr', 'g_vanhove', 'make_edi', 'xpm2ps', 'g_chi', 'g_dist', 'g_luck', 'g_potential', 'g_sas', 'g_velacc', 'make_ndx')
List of tools coming with standard Gromacs 4.x.
- gromacs.tools.tool_factory(clsname, name, driver, base=<class 'gromacs.core.GromacsCommand'>)[source]
Factory for GromacsCommand derived types.
- gromacs.tools.load_v4_tools()[source]
Load Gromacs 4.x tools automatically using some heuristic.
Tries to load tools (1) in configured tool groups (2) and fails back to automatic detection from
GMXBIN
(3) then to a prefilled list.Also load any extra tool configured in
~/.gromacswrapper.cfg
- Returns:
dict mapping tool names to GromacsCommand classes
- gromacs.tools.load_v5_tools()[source]
Load Gromacs 2023/…/2016/5.x tools automatically using some heuristic.
Tries to load tools (1) using the driver from configured groups (2) and falls back to automatic detection from
GMXBIN
(3) then to rough guesses.In all cases the command
gmx help
is run to get all tools available.- Returns:
dict mapping tool names to GromacsCommand classes
- gromacs.tools.find_executables(path)[source]
Find executables in a path.
Searches executables in a directory excluding some know commands unusable with GromacsWrapper.
- Parameters:
path – dirname to search for
- Returns:
list of executables
Gromacs tools
Each command class in the Command list below is used to create
a command instance in the top level gromacs
module if the
Gromacs tools can be found in the file system (see
gromacs.config
). For example, the class Grompp
is used
to create the command gromacs.grompp
.
Registry
The Command list below reflects the Gromacs commands that were
available when the documentation was built and can vary from
installation to installation. All currently available Gromacs commands
are listed in the dictionary gromacs.tools.registry
, in
particular, gromacs.tools.registry.keys()
lists the names.
- gromacs.tools.registry = {'Grompp': <gromacs.tools.Grompp', 'Mdrun': <gromacs.tools.Mdrun', ...}
dict
that contains all currently available Gromacs commands as well as the Virtual Gromacs commands. Theregistry
is generated when thegromacs.tools
package is imported for the first time.
Command list
The list below reflects the Gromacs commands that were available when
the documentation was built and can vary from installation to
installation. All currently available Gromacs commands are listed in
the dictionary gromacs.tools.registry
, which can be processed
at run time.
- class gromacs.tools.Anaeig
- class gromacs.tools.Analyze
- class gromacs.tools.Angle
- class gromacs.tools.Awh
- class gromacs.tools.Bar
- class gromacs.tools.Bundle
- class gromacs.tools.Check
- class gromacs.tools.Chi
- class gromacs.tools.Cluster
- class gromacs.tools.Clustsize
- class gromacs.tools.Confrms
- class gromacs.tools.Convert_tpr
- class gromacs.tools.Convert_trj
- class gromacs.tools.Covar
- class gromacs.tools.Current
- class gromacs.tools.Density
- class gromacs.tools.Densmap
- class gromacs.tools.Densorder
- class gromacs.tools.Dielectric
- class gromacs.tools.Dipoles
- class gromacs.tools.Disre
- class gromacs.tools.Distance
- class gromacs.tools.Dos
- class gromacs.tools.Dssp
- class gromacs.tools.Dump
- class gromacs.tools.Dyecoupl
- class gromacs.tools.Editconf
- class gromacs.tools.Eneconv
- class gromacs.tools.Enemat
- class gromacs.tools.Energy
- class gromacs.tools.Extract_cluster
- class gromacs.tools.Filter
- class gromacs.tools.Freevolume
- class gromacs.tools.Gangle
- class gromacs.tools.Genconf
- class gromacs.tools.Genion
- class gromacs.tools.Genrestr
- class gromacs.tools.Grompp
- class gromacs.tools.Gyrate
- class gromacs.tools.H2order
- class gromacs.tools.Hbond
- class gromacs.tools.Helix
- class gromacs.tools.Helixorient
- class gromacs.tools.Help
- class gromacs.tools.Hydorder
- class gromacs.tools.Insert_molecules
- class gromacs.tools.Lie
- class gromacs.tools.Make_edi
- class gromacs.tools.Make_ndx
- class gromacs.tools.Mdmat
- class gromacs.tools.Mdrun
- class gromacs.tools.Mindist
- class gromacs.tools.Mk_angndx
- class gromacs.tools.Msd
- class gromacs.tools.Nmeig
- class gromacs.tools.Nmens
- class gromacs.tools.Nmr
- class gromacs.tools.Nmtraj
- class gromacs.tools.Nonbonded_benchmark
- class gromacs.tools.Order
- class gromacs.tools.Pairdist
- class gromacs.tools.Pdb2gmx
- class gromacs.tools.Pme_error
- class gromacs.tools.Polystat
- class gromacs.tools.Potential
- class gromacs.tools.Principal
- class gromacs.tools.Rama
- class gromacs.tools.Rdf
- class gromacs.tools.Report_methods
- class gromacs.tools.Rms
- class gromacs.tools.Rmsdist
- class gromacs.tools.Rmsf
- class gromacs.tools.Rotacf
- class gromacs.tools.Rotmat
- class gromacs.tools.Saltbr
- class gromacs.tools.Sans
- class gromacs.tools.Sasa
- class gromacs.tools.Saxs
- class gromacs.tools.Select
- class gromacs.tools.Sham
- class gromacs.tools.Sigeps
- class gromacs.tools.Solvate
- class gromacs.tools.Sorient
- class gromacs.tools.Spatial
- class gromacs.tools.Spol
- class gromacs.tools.Tcaf
- class gromacs.tools.Traj
- class gromacs.tools.Trajectory
- class gromacs.tools.Trjcat
- class gromacs.tools.Trjconv
- class gromacs.tools.Trjorder
- class gromacs.tools.Tune_pme
- class gromacs.tools.Vanhove
- class gromacs.tools.Velacc
- class gromacs.tools.Wham
- class gromacs.tools.Wheel
- class gromacs.tools.X2top
- class gromacs.tools.Xpm2ps
- class gromacs.tools.G_anaeig
- class gromacs.tools.G_analyze
- class gromacs.tools.G_angle
- class gromacs.tools.G_awh
- class gromacs.tools.G_bar
- class gromacs.tools.G_bundle
- class gromacs.tools.Gmxcheck
- class gromacs.tools.G_chi
- class gromacs.tools.G_cluster
- class gromacs.tools.G_clustsize
- class gromacs.tools.G_confrms
- class gromacs.tools.Tpbconv
- class gromacs.tools.G_convert_trj
- class gromacs.tools.G_covar
- class gromacs.tools.G_current
- class gromacs.tools.G_density
- class gromacs.tools.G_densmap
- class gromacs.tools.G_densorder
- class gromacs.tools.G_dielectric
- class gromacs.tools.G_dipoles
- class gromacs.tools.G_disre
- class gromacs.tools.G_dist
- class gromacs.tools.G_dos
- class gromacs.tools.G_dssp
- class gromacs.tools.Gmxdump
- class gromacs.tools.G_dyecoupl
- class gromacs.tools.G_enemat
- class gromacs.tools.G_energy
- class gromacs.tools.G_extract_cluster
- class gromacs.tools.G_filter
- class gromacs.tools.G_freevolume
- class gromacs.tools.G_sgangle
- class gromacs.tools.G_gyrate
- class gromacs.tools.G_h2order
- class gromacs.tools.G_hbond
- class gromacs.tools.G_helix
- class gromacs.tools.G_helixorient
- class gromacs.tools.G_help
- class gromacs.tools.G_hydorder
- class gromacs.tools.G_insert_molecules
- class gromacs.tools.G_lie
- class gromacs.tools.G_mdmat
- class gromacs.tools.G_mindist
- class gromacs.tools.G_mk_angndx
- class gromacs.tools.G_msd
- class gromacs.tools.G_nmeig
- class gromacs.tools.G_nmens
- class gromacs.tools.G_nmr
- class gromacs.tools.G_nmtraj
- class gromacs.tools.G_nonbonded_benchmark
- class gromacs.tools.G_order
- class gromacs.tools.G_pairdist
- class gromacs.tools.G_pme_error
- class gromacs.tools.G_polystat
- class gromacs.tools.G_potential
- class gromacs.tools.G_principal
- class gromacs.tools.G_rama
- class gromacs.tools.G_rdf
- class gromacs.tools.G_report_methods
- class gromacs.tools.G_rms
- class gromacs.tools.G_rmsdist
- class gromacs.tools.G_rmsf
- class gromacs.tools.G_rotacf
- class gromacs.tools.G_rotmat
- class gromacs.tools.G_saltbr
- class gromacs.tools.G_sans
- class gromacs.tools.G_sas
- class gromacs.tools.G_saxs
- class gromacs.tools.G_select
- class gromacs.tools.G_sham
- class gromacs.tools.G_sigeps
- class gromacs.tools.Genbox
- class gromacs.tools.G_sorient
- class gromacs.tools.G_spatial
- class gromacs.tools.G_spol
- class gromacs.tools.G_tcaf
- class gromacs.tools.G_traj
- class gromacs.tools.G_trajectory
- class gromacs.tools.G_tune_pme
- class gromacs.tools.G_vanhove
- class gromacs.tools.G_velacc
- class gromacs.tools.G_wham
- class gromacs.tools.G_wheel
- class gromacs.tools.G_x2top
Gromacs building blocks
Building blocks are small classes or functions that can be freely combined in setup or analysis scripts or used interactively. These modules act as “library” for common tasks.
gromacs.cbook
– Gromacs Cook Book
The cbook
(cook book) module contains short recipes for tasks
that are often repeated. In the simplest case this is just one of the
gromacs tools with a certain set of default command line options.
By abstracting and collecting these invocations here, errors can be reduced and the code snippets can also serve as canonical examples for how to do simple things.
Miscellaneous canned Gromacs commands
Simple commands with new default options so that they solve a specific problem (see also Manipulating trajectories and structures):
- gromacs.cbook.rmsd_backbone([s="md.tpr", f="md.xtc"[, ...]])
Computes the RMSD of the “Backbone” atoms after fitting to the “Backbone” (including both translation and rotation).
Manipulating trajectories and structures
Standard invocations for manipulating trajectories.
- gromacs.cbook.trj_compact([s="md.tpr", f="md.xtc", o="compact.xtc"[, ...]])
Writes an output trajectory or frame with a compact representation of the system centered on the protein. It centers on the group “Protein” and outputs the whole “System” group.
- gromacs.cbook.trj_xyfitted([s="md.tpr", f="md.xtc"[, ...]])
Writes a trajectory centered and fitted to the protein in the XY-plane only.
This is useful for membrane proteins. The system must be oriented so that the membrane is in the XY plane. The protein backbone is used for the least square fit, centering is done for the whole protein., but this can be changed with the input =
('backbone', 'protein','system')
keyword.Note
Gromacs 4.x only
- gromacs.cbook.trj_fitandcenter(xy=False, **kwargs)[source]
Center everything and make a compact representation (pass 1) and fit the system to a reference (pass 2).
- Keywords:
- s
input structure file (tpr file required to make molecule whole); if a list or tuple is provided then s[0] is used for pass 1 (should be a tpr) and s[1] is used for the fitting step (can be a pdb of the whole system)
If a second structure is supplied then it is assumed that the fitted trajectory should not be centered.
- f
input trajectory
- o
output trajectory
- input
A list with three groups. The default is
['backbone', 'protein','system']
. The fit command uses all three (1st for least square fit, 2nd for centering, 3rd for output), the centered/make-whole stage use 2nd for centering and 3rd for output.- input1
If input1 is supplied then input is used exclusively for the fitting stage (pass 2) and input1 for the centering (pass 1).
- n
Index file used for pass 1 and pass 2.
- n1
If n1 is supplied then index n1 is only used for pass 1 (centering) and n for pass 2 (fitting).
- xyboolean
If
True
then only do a rot+trans fit in the xy plane (good for membrane simulations); default isFalse
.- kwargs
All other arguments are passed to
Trjconv
.
Note that here we first center the protein and create a compact box, using
-pbc mol -ur compact -center -boxcenter tric
and write an intermediate xtc. Then in a second pass we perform a rotation+translation fit (or restricted to the xy plane if xy =True
is set) on the intermediate xtc to produce the final trajectory. Doing it in this order has the disadvantage that the solvent box is rotating around the protein but the opposite order (with center/compact second) produces strange artifacts where columns of solvent appear cut out from the box—it probably means that after rotation the information for the periodic boundaries is not correct any more.Most kwargs are passed to both invocations of
gromacs.tools.Trjconv
so it does not really make sense to use eg skip; in this case do things manually.By default the input to the fit command is (‘backbone’, ‘protein’,’system’); the compact command always uses the second and third group for its purposes or if this fails, prompts the user.
Both steps cannot performed in one pass; this is a known limitation of
trjconv
. An intermediate temporary XTC files is generated which should be automatically cleaned up unless bad things happened.The function tries to honour the input/output formats. For instance, if you want trr output you need to supply a trr file as input and explicitly give the output file also a trr suffix.
Note
For big trajectories it can take a very long time and consume a large amount of temporary diskspace.
We follow the g_spatial documentation in preparing the trajectories:
trjconv -s a.tpr -f a.xtc -o b.xtc -center -boxcenter tric -ur compact -pbc mol trjconv -s a.tpr -f b.xtc -o c.xtc -fit rot+trans
- gromacs.cbook.cat(prefix='md', dirname='.', partsdir='parts', fulldir='full', resolve_multi='pass')[source]
Concatenate all parts of a simulation.
The xtc, trr, and edr files in dirname such as prefix.xtc, prefix.part0002.xtc, prefix.part0003.xtc, … are
moved to the partsdir (under dirname)
concatenated with the Gromacs tools to yield prefix.xtc, prefix.trr, prefix.edr, prefix.gro (or prefix.md) in dirname
Store these trajectories in fulldir
Note
Trajectory files are never deleted by this function to avoid data loss in case of bugs. You will have to clean up yourself by deleting dirname/partsdir.
Symlinks for the trajectories are not handled well and break the function. Use hard links instead.
Warning
If an exception occurs when running this function then make doubly and triply sure where your files are before running this function again; otherwise you might overwrite data. Possibly you will need to manually move the files from partsdir back into the working directory dirname; this should onlu overwrite generated files so far but check carefully!
- Keywords:
- prefix
deffnm of the trajectories [md]
- resolve_multi
how to deal with multiple “final” gro or pdb files: normally there should only be one but in case of restarting from the checkpoint of a finished simulation one can end up with multiple identical ones.
“pass” : do nothing and log a warning
“guess” : take prefix.pdb or prefix.gro if it exists, otherwise the one of prefix.partNNNN.gro|pdb with the highes NNNN
- dirname
change to dirname and assume all tarjectories are located there [.]
- partsdir
directory where to store the input files (they are moved out of the way); partsdir must be manually deleted [parts]
- fulldir
directory where to store the final results [full]
- class gromacs.cbook.Frames(structure, trj, maxframes=None, format='pdb', **kwargs)[source]
A iterator that transparently provides frames from a trajectory.
The iterator chops a trajectory into individual frames for analysis tools that only work on separate structures such as
gro
orpdb
files. Instead of turning the whole trajectory immediately into pdb files (and potentially filling the disk), the iterator can be instructed to only provide a fixed number of frames and compute more frames when needed.Note
Setting a limit on the number of frames on disk can lead to longish waiting times because
trjconv
must re-seek to the middle of the trajectory and the only way it can do this at the moment is by reading frames sequentially. This might still be preferrable to filling up a disk, though.Warning
The maxframes option is not implemented yet; use the dt option or similar to keep the number of frames manageable.
Set up the Frames iterator.
- Arguments:
- structure
name of a structure file (tpr, pdb, …)
- trj
name of the trajectory (xtc, trr, …)
- format
output format for the frames, eg “pdb” or “gro” [pdb]
- maxframesint
maximum number of frames that are extracted to disk at one time; set to
None
to extract the whole trajectory at once. [None
]- kwargs
All other arguments are passed to class:~gromacs.tools.Trjconv; the only options that cannot be changed are sep and the output file name o.
- property all_frames
Unordered list of all frames currently held on disk.
Holds the current frame number of the currently extracted batch of frames. Increases when iterating.
- totalframes
Total number of frames read so far; only important when maxframes > 0 is used.
- class gromacs.cbook.Transformer(s='topol.tpr', f='traj.xtc', n=None, force=None, dirname='.', outdir=None)[source]
Class to handle transformations of trajectories.
Center, compact, and fit to reference structure in tpr (optionally, only center in the xy plane):
center_fit()
Write compact xtc and tpr with water removed:
strip_water()
Write compact xtc and tpr only with protein:
keep_protein_only()
Set up Transformer with structure and trajectory.
Supply n = tpr, f = xtc (and n = ndx) relative to dirname.
- Keywords:
- s
tpr file (or similar); note that this should not contain position restraints if it is to be used with a reduced system (see
strip_water()
)- f
trajectory (xtc, trr, …)
- n
index file (it is typically safe to leave this as
None
; in cases where a trajectory needs to be centered on non-standard groups this should contain those groups)- force
Set the default behaviour for handling existing files:
True
: overwrite existing trajectoriesFalse
: throw a IOError exceptionNone
: skip existing and log a warning [default]
- dirname
directory in which all operations are performed, relative paths are interpreted relative to dirname [.]
- outdir
directory under which output files are placed; by default the same directory where the input files live
- center_fit(**kwargs)[source]
Write compact xtc that is fitted to the tpr reference structure.
See
gromacs.cbook.trj_fitandcenter()
for details and description of kwargs (including input, input1, n and n1 for how to supply custom index groups). The most important ones are listed here but in most cases the defaults should work.- Keywords:
- s
Input structure (typically the default tpr file but can be set to some other file with a different conformation for fitting)
- n
Alternative index file.
- o
Name of the output trajectory.
- xyBoolean
If
True
then only fit in xy-plane (useful for a membrane normal to z). The default isFalse
.- force
True
: overwrite existing trajectoriesFalse
: throw a IOError exceptionNone
: skip existing and log a warning [default]
- Returns:
dictionary with keys tpr, xtc, which are the names of the the new files
- fit(xy=False, **kwargs)[source]
Write xtc that is fitted to the tpr reference structure.
Runs
gromacs.tools.trjconv
with appropriate arguments for fitting. The most important kwargs are listed here but in most cases the defaults should work.Note that the default settings do not include centering or periodic boundary treatment as this often does not work well with fitting. It is better to do this as a separate step (see
center_fit()
orgromacs.cbook.trj_fitandcenter()
)- Keywords:
- s
Input structure (typically the default tpr file but can be set to some other file with a different conformation for fitting)
- n
Alternative index file.
- o
Name of the output trajectory. A default name is created. If e.g. dt = 100 is one of the kwargs then the default name includes “_dt100ps”.
- xyboolean
If
True
then only do a rot+trans fit in the xy plane (good for membrane simulations); default isFalse
.- force
Override standard behavior (potentially dangerous) -
True
: overwrite existing trajectories -False
: throw a IOError exception -None
: skip existing and log a warning [default]- fitgroup
index group to fit on [“backbone”]
Note
If keyword input is supplied then it will override fitgroup; input =
[fitgroup, outgroup]
- kwargs
kwargs are passed to
trj_xyfitted()
- Returns:
dictionary with keys tpr, xtc, which are the names of the the new files
- keep_protein_only(os=None, o=None, on=None, compact=False, groupname='proteinonly', **kwargs)[source]
Write xtc and tpr only containing the protein.
- Keywords:
- os
Name of the output tpr file; by default use the original but insert “proteinonly” before suffix.
- o
Name of the output trajectory; by default use the original name but insert “proteinonly” before suffix.
- on
Name of a new index file.
- compact
True
: write a compact and centered trajectoryFalse
: use trajectory as it is [False
]- groupname
Name of the protein-only group.
- keepalso
List of literal make_ndx selections of additional groups that should be kept, e.g. [‘resname DRUG’, ‘atom 6789’].
- forceBoolean
True
: overwrite existing trajectoriesFalse
: throw a IOError exceptionNone
: skip existing and log a warning [default]
- kwargs
are passed on to
gromacs.cbook.trj_compact()
(unless the values have to be set to certain values such as s, f, n, o keywords). The input keyword is always mangled: Only the first entry (the group to centre the trajectory on) is kept, and as a second group (the output group) groupname is used.
- Returns:
dictionary with keys tpr, xtc, ndx which are the names of the the new files
Warning
The input tpr file should not have any position restraints; otherwise Gromacs will throw a hissy-fit and say
Software inconsistency error: Position restraint coordinates are missing
(This appears to be a bug in Gromacs 4.x.)
- outfile(p)[source]
Path for an output file.
If
outdir
is set then the path isoutdir/basename(p)
else justp
- rp(*args)[source]
Return canonical path to file under dirname with components args
If args form an absolute path then just return it as the absolute path.
- strip_fit(**kwargs)[source]
Strip water and fit to the remaining system.
First runs
strip_water()
and thenfit()
; see there for arguments.strip_input is used for
strip_water()
(but is only useful in special cases, e.g. when there is no Protein group defined. Then set strip_input =['Other']
.input is passed on to
fit()
and can contain the[center_group, fit_group, output_group]
fitgroup is only passed to
fit()
and just contains the group to fit to (“backbone” by default)Warning
fitgroup can only be a Gromacs default group and not a custom group (because the indices change after stripping)
By default fit = “rot+trans” (and fit is passed to
fit()
, together with the xy =False
keyword)
Note
The call signature of
strip_water()
is somewhat different from this one.
- strip_water(os=None, o=None, on=None, compact=False, resn='SOL', groupname='notwater', **kwargs)[source]
Write xtc and tpr with water (by resname) removed.
- Keywords:
- os
Name of the output tpr file; by default use the original but insert “nowater” before suffix.
- o
Name of the output trajectory; by default use the original name but insert “nowater” before suffix.
- on
Name of a new index file (without water).
- compact
True
: write a compact and centered trajectoryFalse
: use trajectory as it is [False
]- centergroup
Index group used for centering [“Protein”]
Note
If input is provided (see below under kwargs) then centergroup is ignored and the group for centering is taken as the first entry in input.
- resn
Residue name of the water molecules; all these residues are excluded.
- groupname
Name of the group that is generated by subtracting all waters from the system.
- forceBoolean
True
: overwrite existing trajectoriesFalse
: throw a IOError exceptionNone
: skip existing and log a warning [default]
- kwargs
are passed on to
gromacs.cbook.trj_compact()
(unless the values have to be set to certain values such as s, f, n, o keywords). The input keyword is always mangled: Only the first entry (the group to centre the trajectory on) is kept, and as a second group (the output group) groupname is used.
- Returns:
dictionary with keys tpr, xtc, ndx which are the names of the the new files
Warning
The input tpr file should not have any position restraints; otherwise Gromacs will throw a hissy-fit and say
Software inconsistency error: Position restraint coordinates are missing
(This appears to be a bug in Gromacs 4.x.)
Processing output
There are cases when a script has to to do different things depending on the output from a Gromacs tool.
For instance, a common case is to check the total charge after
grompping a tpr file. The grompp_qtot
function does just that.
- gromacs.cbook.grompp_qtot(*args, **kwargs)[source]
Run
gromacs.grompp
and return the total charge of the system.- Arguments:
The arguments are the ones one would pass to
gromacs.grompp()
.- Returns:
The total charge as reported
Some things to keep in mind:
The stdout output of grompp is only shown when an error occurs. For debugging, look at the log file or screen output and try running the normal
gromacs.grompp()
command and analyze the output if the debugging messages are not sufficient.Check that
qtot
is correct. Because the function is based on pattern matching of the informative output of grompp it can break when the output format changes. This version recognizes lines like' System has non-zero total charge: -4.000001e+00'
using the regular expression
System has non-zero total charge: *(?P<qtot>[-+]?d*.d+([eE][-+]d+)?)
.
- gromacs.cbook.get_volume(f)[source]
Return the volume in nm^3 of structure file f.
(Uses
gromacs.editconf()
; error handling is not good)
- gromacs.cbook.parse_ndxlist(output)[source]
Parse output from make_ndx to build list of index groups:
groups = parse_ndxlist(output)
output should be the standard output from
make_ndx
, e.g.:rc,output,junk = gromacs.make_ndx(..., input=('', 'q'), stdout=False, stderr=True)
(or simply use
rc,output,junk = cbook.make_ndx_captured(…)
which presets input, stdout and stderr; of course input can be overriden.)
- Returns:
The function returns a list of dicts (
groups
) with fields- name
name of the groups
- nr
number of the group (starts at 0)
- natoms
number of atoms in the group
Working with topologies and mdp files
- gromacs.cbook.create_portable_topology(topol, struct, **kwargs)[source]
Create a processed topology.
The processed (or portable) topology file does not contain any
#include
statements and hence can be easily copied around. It also makes it possible to re-grompp without having any special itp files available.- Arguments:
- topol
topology file
- struct
coordinate (structure) file
- Keywords:
- processed
name of the new topology file; if not set then it is named like topol but with
pp_
prepended- includes
path or list of paths of directories in which itp files are searched for
- grompp_kwargs*
other options for grompp such as
maxwarn=2
can also be supplied
- Returns:
full path to the processed topology
- gromacs.cbook.edit_mdp(mdp, new_mdp=None, extend_parameters=None, **substitutions)[source]
Change values in a Gromacs mdp file.
Parameters and values are supplied as substitutions, eg
nsteps=1000
.By default the template mdp file is overwritten in place.
If a parameter does not exist in the template then it cannot be substituted and the parameter/value pair is returned. The user has to check the returned list in order to make sure that everything worked as expected. At the moment it is not possible to automatically append the new values to the mdp file because of ambiguities when having to replace dashes in parameter names with underscores (see the notes below on dashes/underscores).
If a parameter is set to the value
None
then it will be ignored.- Arguments:
- mdpfilename
filename of input (and output filename of
new_mdp=None
)- new_mdpfilename
filename of alternative output mdp file [None]
- extend_parametersstring or list of strings
single parameter or list of parameters for which the new values should be appended to the existing value in the mdp file. This makes mostly sense for a single parameter, namely ‘include’, which is set as the default. Set to
[]
to disable. [‘include’]- substitutions
parameter=value pairs, where parameter is defined by the Gromacs mdp file; dashes in parameter names have to be replaced by underscores. If a value is a list-like object then the items are written as a sequence, joined with spaces, e.g.
ref_t=[310,310,310] ---> ref_t = 310 310 310
- Returns:
Dict of parameters that have not been substituted.
Example
edit_mdp('md.mdp', new_mdp='long_md.mdp', nsteps=100000, nstxtcout=1000, lincs_iter=2)
Note
Dashes in Gromacs mdp parameters have to be replaced by an underscore when supplied as python keyword arguments (a limitation of python). For example the MDP syntax is
lincs-iter = 4
but the corresponding keyword would belincs_iter = 4
.If the keyword is set as a dict key, eg
mdp_params['lincs-iter']=4
then one does not have to substitute.Parameters aa_bb and aa-bb are considered the same (although this should not be a problem in practice because there are no mdp parameters that only differ by a underscore).
This code is more compact in
Perl
as one can uses///
operators:s/^(\s*${key}\s*=\s*).*/$1${val}/
See also
One can also load the mdp file with
gromacs.formats.MDP
, edit the object (a dict), and save it again.
- gromacs.cbook.add_mdp_includes(topology=None, kwargs=None)[source]
Set the mdp include key in the kwargs dict.
Add the directory containing topology.
Add all directories appearing under the key includes
Generate a string of the form “-Idir1 -Idir2 …” that is stored under the key include (the corresponding mdp parameter)
By default, the directories
.
and..
are also added to the include string for the mdp; when fed intogromacs.cbook.edit_mdp()
it will result in a line such asinclude = -I. -I.. -I../topology_dir ....
Note that the user can always override the behaviour by setting the include keyword herself; in this case this function does nothing.
If no kwargs were supplied then a dict is generated with the single include entry.
- Arguments:
- topologytop filename
Topology file; the name of the enclosing directory is added to the include path (if supplied) [
None
]- kwargsdict
Optional dictionary of mdp keywords; will be modified in place. If it contains the includes keyword with either a single string or a list of strings then these paths will be added to the include statement.
- Returns:
kwargs with the include keyword added if it did not exist previously; if the keyword already existed, nothing happens.
Note
The kwargs dict is modified in place. This function is a bit of a hack. It might be removed once all setup functions become methods in a nice class.
- gromacs.cbook.grompp_qtot(*args, **kwargs)[source]
Run
gromacs.grompp
and return the total charge of the system.- Arguments:
The arguments are the ones one would pass to
gromacs.grompp()
.- Returns:
The total charge as reported
Some things to keep in mind:
The stdout output of grompp is only shown when an error occurs. For debugging, look at the log file or screen output and try running the normal
gromacs.grompp()
command and analyze the output if the debugging messages are not sufficient.Check that
qtot
is correct. Because the function is based on pattern matching of the informative output of grompp it can break when the output format changes. This version recognizes lines like' System has non-zero total charge: -4.000001e+00'
using the regular expression
System has non-zero total charge: *(?P<qtot>[-+]?d*.d+([eE][-+]d+)?)
.
Working with index files
Manipulation of index files (ndx
) can be cumbersome because the
make_ndx
program is not very sophisticated (yet) compared to
full-fledged atom selection expression as available in Charmm, VMD, or
MDAnalysis. Some tools help in building and interpreting index files.
See also
The gromacs.formats.NDX
class can solve a number
of index problems in a cleaner way than the classes and
functions here.
- class gromacs.cbook.IndexBuilder(struct=None, selections=None, names=None, name_all=None, ndx=None, out_ndx='selection.ndx', offset=0)[source]
Build an index file with specified groups and the combined group.
This is not a full blown selection parser a la Charmm, VMD or MDAnalysis but a very quick hack.
Example
How to use the
IndexBuilder
:G = gromacs.cbook.IndexBuilder('md_posres.pdb', ['S312:OG','T313:OG1','A38:O','A309:O','@a62549 & r NA'], offset=-9, out_ndx='selection.ndx') groupname, ndx = G.combine() del G
The residue numbers are given with their canonical resids from the sequence or pdb. offset=-9 says that one calculates Gromacs topology resids by subtracting 9 from the canonical resid.
The combined selection is
OR
ed by default and written to selection.ndx. One can also add all the groups in the initial ndx file (or the make_ndx default groups) to the output (see the defaultgroups keyword forIndexBuilder.combine()
).Generating an index file always requires calling
combine()
even if there is only a single group.Deleting the class removes all temporary files associated with it (see
IndexBuilder.indexfiles
).- Raises:
If an empty group is detected (which does not always work) then a
gromacs.BadParameterWarning
is issued.- Bugs:
If
make_ndx
crashes with an unexpected error then this is fairly hard to diagnose. For instance, in certain cases it segmentation faults when a tpr is provided as a struct file and the resulting error messages becomesGromacsError: [Errno -11] Gromacs tool failed Command invocation: make_ndx -o /tmp/tmp_Na1__NK7cT3.ndx -f md_posres.tpr
In this case run the command invocation manually to see what the problem could be.
See also
In some cases it might be more straightforward to use
gromacs.formats.NDX
.Build a index group from the selection arguments.
If selections and a structure file are supplied then the individual selections are constructed with separate calls to
gromacs.make_ndx()
. UseIndexBuilder.combine()
to combine them into a joint selection orIndexBuilder.write()
to simply write out the individual named selections (useful with names).- Arguments:
- structfilename
Structure file (tpr, pdb, …)
- selectionslist
The list must contain strings or tuples, which must be be one of the following constructs:
“<1-letter aa code><resid>[:<atom name]”
Selects the CA of the residue or the specified atom name.
example:
"S312:OA"
or"A22"
(equivalent to"A22:CA"
)(“<1-letter aa code><resid>”, “<1-letter aa code><resid>, [“<atom name>”])
Selects a range of residues. If only two residue identifiers are provided then all atoms are selected. With an optional third atom identifier, only this atom anme is selected for each residue in the range. [EXPERIMENTAL]
“@<make_ndx selection>”
The
@
letter introduces a verbatimmake_ndx
command. It will apply the given selection without any further processing or checks.example:
"@a 6234 - 6238"
or'@"SOL"'
(note the quoting) or"@r SER & r 312 & t OA"
.- nameslist
Strings to name the selections; if not supplied or if individuals are
None
then a default name is created. When simply usingIndexBuilder.write()
then these should be supplied.- name_allstring
Name of the group that is generated by
IndexBuilder.combine()
.- offsetint, dict
This number is added to the resids in the first selection scheme; this allows names to be the same as in a crystal structure. If offset is a dict then it is used to directly look up the resids.
- ndxfilename or list of filenames
Optional input index file(s).
- out_ndxfilename
Output index file.
- combine(name_all=None, out_ndx=None, operation='|', defaultgroups=False)[source]
Combine individual groups into a single one and write output.
- Keywords:
- name_allstring
Name of the combined group,
None
generates a name. [None
]- out_ndxfilename
Name of the output file that will contain the individual groups and the combined group. If
None
then default from the class constructor is used. [None
]- operationcharacter
Logical operation that is used to generate the combined group from the individual groups: “|” (OR) or “&” (AND); if set to
False
then no combined group is created and only the individual groups are written. [“|”]- defaultgroupsbool
True
: append everything to the default groups produced by make_ndx (or rather, the groups provided in the ndx file on initialization — if this wasNone
then these are truly default groups);False
: only use the generated groups
- Returns:
(combinedgroup_name, output_ndx)
, a tuple showing the actual group name and the name of the file; useful when all names are autogenerated.
Warning
The order of the atom numbers in the combined group is not guaranteed to be the same as the selections on input because
make_ndx
sorts them ascending. Thus you should be careful when using these index files for calculations of angles and dihedrals. Usegromacs.formats.NDX
in these cases.See also
IndexBuilder.write()
.
- gromacs.cbook.parse_ndxlist(output)[source]
Parse output from make_ndx to build list of index groups:
groups = parse_ndxlist(output)
output should be the standard output from
make_ndx
, e.g.:rc,output,junk = gromacs.make_ndx(..., input=('', 'q'), stdout=False, stderr=True)
(or simply use
rc,output,junk = cbook.make_ndx_captured(…)
which presets input, stdout and stderr; of course input can be overriden.)
- Returns:
The function returns a list of dicts (
groups
) with fields- name
name of the groups
- nr
number of the group (starts at 0)
- natoms
number of atoms in the group
- gromacs.cbook.get_ndx_groups(ndx, **kwargs)[source]
Return a list of index groups in the index file ndx.
- Arguments:
ndx is a Gromacs index file.
kwargs are passed to
make_ndx_captured()
.
- Returns:
list of groups as supplied by
parse_ndxlist()
Alternatively, load the index file with
gromacs.formats.NDX
for full control.
- gromacs.cbook.make_ndx_captured(**kwargs)[source]
make_ndx that captures all output
Standard
make_ndx()
command with the input and output pre-set in such a way that it can be conveniently used forparse_ndxlist()
.- Example::
ndx_groups = parse_ndxlist(make_ndx_captured(n=ndx)[0])
Note that the convenient
get_ndx_groups()
function does exactly that and can probably used in most cases.- Arguments:
keywords are passed on to
make_ndx()
- Returns:
(returncode, output,
None
)
File editing functions
It is often rather useful to be able to change parts of a template file. For specialized cases the two following functions are useful:
- gromacs.cbook.edit_mdp(mdp, new_mdp=None, extend_parameters=None, **substitutions)[source]
Change values in a Gromacs mdp file.
Parameters and values are supplied as substitutions, eg
nsteps=1000
.By default the template mdp file is overwritten in place.
If a parameter does not exist in the template then it cannot be substituted and the parameter/value pair is returned. The user has to check the returned list in order to make sure that everything worked as expected. At the moment it is not possible to automatically append the new values to the mdp file because of ambiguities when having to replace dashes in parameter names with underscores (see the notes below on dashes/underscores).
If a parameter is set to the value
None
then it will be ignored.- Arguments:
- mdpfilename
filename of input (and output filename of
new_mdp=None
)- new_mdpfilename
filename of alternative output mdp file [None]
- extend_parametersstring or list of strings
single parameter or list of parameters for which the new values should be appended to the existing value in the mdp file. This makes mostly sense for a single parameter, namely ‘include’, which is set as the default. Set to
[]
to disable. [‘include’]- substitutions
parameter=value pairs, where parameter is defined by the Gromacs mdp file; dashes in parameter names have to be replaced by underscores. If a value is a list-like object then the items are written as a sequence, joined with spaces, e.g.
ref_t=[310,310,310] ---> ref_t = 310 310 310
- Returns:
Dict of parameters that have not been substituted.
Example
edit_mdp('md.mdp', new_mdp='long_md.mdp', nsteps=100000, nstxtcout=1000, lincs_iter=2)
Note
Dashes in Gromacs mdp parameters have to be replaced by an underscore when supplied as python keyword arguments (a limitation of python). For example the MDP syntax is
lincs-iter = 4
but the corresponding keyword would belincs_iter = 4
.If the keyword is set as a dict key, eg
mdp_params['lincs-iter']=4
then one does not have to substitute.Parameters aa_bb and aa-bb are considered the same (although this should not be a problem in practice because there are no mdp parameters that only differ by a underscore).
This code is more compact in
Perl
as one can uses///
operators:s/^(\s*${key}\s*=\s*).*/$1${val}/
See also
One can also load the mdp file with
gromacs.formats.MDP
, edit the object (a dict), and save it again.
- gromacs.cbook.edit_txt(filename, substitutions, newname=None)[source]
Primitive text file stream editor.
This function can be used to edit free-form text files such as the topology file. By default it does an in-place edit of filename. If newname is supplied then the edited file is written to newname.
- Arguments:
- filename
input text file
- substitutions
substitution commands (see below for format)
- newname
output filename; if
None
then filename is changed in place [None
]
substitutions is a list of triplets; the first two elements are regular expression strings, the last is the substitution value. It mimics
sed
search and replace. The rules for substitutions:substitutions ::= "[" search_replace_tuple, ... "]" search_replace_tuple ::= "(" line_match_RE "," search_RE "," replacement ")" line_match_RE ::= regular expression that selects the line (uses match) search_RE ::= regular expression that is searched in the line replacement ::= replacement string for search_RE
Running
edit_txt()
does pretty much what a simplesed /line_match_RE/s/search_RE/replacement/
with repeated substitution commands does.
Special replacement values: -
None
: the rule is ignored -False
: the line is deleted (even if other rules match)Note
No sanity checks are performed and the substitutions must be supplied exactly as shown.
All substitutions are applied to a line; thus the order of the substitution commands may matter when one substitution generates a match for a subsequent rule.
If replacement is set to
None
then the whole expression is ignored and whatever is in the template is used. To unset values you must provided an empty string or similar.Delete a matching line if replacement=``False``.
gromacs.setup
– Setting up a Gromacs MD run
Individual steps such as solvating a structure or energy minimization are set up in individual directories. For energy minimization one should supply appropriate mdp run input files; otherwise example templates are used.
Warning
You must check all simulation parameters for yourself. Do not rely on any defaults provided here. The scripts provided here are provided under the assumption that you know what you are doing and you just want to automate the boring parts of the process.
User functions
The individual steps of setting up a simple MD simulation are broken down in a sequence of functions that depend on the previous step(s):
topology()
generate initial topology file (limited functionality, might require manual setup)
solvate()
solvate globular protein and add ions to neutralize
energy_minimize()
set up energy minimization and run it (using
mdrun_d
)em_schedule()
set up and run multiple energy minimizations one after another (as an alternative to the simple single energy minimization provided by
energy_minimize()
)MD_restrained()
set up restrained MD
MD()
set up equilibrium MD
Each function uses its own working directory (set with the dirname
keyword
argument, but it should be safe and convenient to use the defaults). Other
arguments assume the default locations so typically not much should have to be
set manually.
One can supply non-standard itp files in the topology directory. In
some cases one does not use the topology()
function at all but
sets up the topology manually. In this case it is safest to call the
topology directory top
and make sure that it contains all relevant
top, itp, and pdb files.
Example
Run a single protein in a dodecahedral box of SPC water molecules and
use the GROMOS96 G43a1 force field. We start with the structure in
protein.pdb
:
from gromacs.setup import *
f1 = topology(protein='MyProtein', struct='protein.pdb', ff='G43a1', water='spc', force=True, ignh=True)
Each function returns “interesting” new files in a dictionary in such a away that it can often be used as input for the next function in the chain (although in most cases one can get away with the defaults of the keyword arguments):
f2 = solvate(**f1)
f3 = energy_minimize(**f2)
Now prepare input for a MD run with restraints on the protein:
MD_restrained(**f3)
Use the files in the directory to run the simulation locally or on a cluster. You can provide your own template for a queuing system submission script; see the source code for details.
Once the restraint run has completed, use the last frame as input for the equilibrium MD:
MD(struct='MD_POSRES/md.gro', runtime=1e5)
Run the resulting tpr file on a cluster.
User functions
The following functions are provided for the user:
- gromacs.setup.topology(struct=None, protein='protein', top='system.top', dirname='top', posres='posres.itp', ff='oplsaa', water='tip4p', **pdb2gmx_args)[source]
Build Gromacs topology files from pdb.
- Keywords:
- struct
input structure (required)
- protein
name of the output files
- top
name of the topology file
- dirname
directory in which the new topology will be stored
- ff
force field (string understood by
pdb2gmx
); default “oplsaa”- water
water model (string), default “tip4p”
- pdb2gmxargs
other arguments for
pdb2gmx
Note
At the moment this function simply runs
pdb2gmx
and uses the resulting topology file directly. If you want to create more complicated topologies and maybe also use additional itp files or make a protein itp file then you will have to do this manually.
- gromacs.setup.solvate(struct='top/protein.pdb', top='top/system.top', distance=0.9, boxtype='dodecahedron', concentration=0, cation='NA', anion='CL', water='tip4p', solvent_name='SOL', with_membrane=False, ndx='main.ndx', mainselection='"Protein"', dirname='solvate', **kwargs)[source]
Put protein into box, add water, add counter-ions.
Currently this really only supports solutes in water. If you need to embedd a protein in a membrane then you will require more sophisticated approaches.
However, you can supply a protein already inserted in a bilayer. In this case you will probably want to set distance =
None
and also enable with_membrane =True
(using extra big vdw radii for typical lipids).Note
The defaults are suitable for solvating a globular protein in a fairly tight (increase distance!) dodecahedral box.
- Arguments:
- structfilename
pdb or gro input structure
- topfilename
Gromacs topology
- distancefloat
When solvating with water, make the box big enough so that at least distance nm water are between the solute struct and the box boundary. Set boxtype to
None
in order to use a box size in the input file (gro or pdb).- boxtype or bt: string
Any of the box types supported by
Editconf
(triclinic, cubic, dodecahedron, octahedron). Set the box dimensions either with distance or the box and angle keywords.If set to
None
it will ignore distance and use the box inside the struct file.bt overrides the value of boxtype.
- box
List of three box lengths [A,B,C] that are used by
Editconf
in combination with boxtype (bt
in editconf) and angles. Setting box overrides distance.- angles
List of three angles (only necessary for triclinic boxes).
- concentrationfloat
Concentration of the free ions in mol/l. Note that counter ions are added in excess of this concentration.
- cation and anionstring
Molecule names of the ions. This depends on the chosen force field.
- waterstring
Name of the water model; one of “spc”, “spce”, “tip3p”, “tip4p”. This should be appropriate for the chosen force field. If an alternative solvent is required, simply supply the path to a box with solvent molecules (used by
genbox()
’s cs argument) and also supply the molecule name via solvent_name.- solvent_name
Name of the molecules that make up the solvent (as set in the itp/top). Typically needs to be changed when using non-standard/non-water solvents. [“SOL”]
- with_membranebool
True
: use specialvdwradii.dat
with 0.1 nm-increased radii on lipids. Default isFalse
.- ndxfilename
How to name the index file that is produced by this function.
- mainselectionstring
A string that is fed to
Make_ndx
and which should select the solute.- dirnamedirectory name
Name of the directory in which all files for the solvation stage are stored.
- includes
List of additional directories to add to the mdp include path
- kwargs
Additional arguments are passed on to
Editconf
or are interpreted as parameters to be changed in the mdp file.
- gromacs.setup.energy_minimize(dirname='em', mdp='/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/em.mdp', struct='solvate/ionized.gro', top='top/system.top', output='em.pdb', deffnm='em', mdrunner=None, mdrun_args=None, **kwargs)[source]
Energy minimize the system.
This sets up the system (creates run input files) and also runs
mdrun_d
. Thus it can take a while.Additional itp files should be in the same directory as the top file.
Many of the keyword arguments below already have sensible values.
- Keywords:
- dirname
set up under directory dirname [em]
- struct
input structure (gro, pdb, …) [solvate/ionized.gro]
- output
output structure (will be put under dirname) [em.pdb]
- deffnm
default name for mdrun-related files [em]
- top
topology file [top/system.top]
- mdp
mdp file (or use the template) [templates/em.mdp]
- includes
additional directories to search for itp files
- mdrunner
gromacs.run.MDrunner
instance; by default we just trygromacs.mdrun_d()
andgromacs.mdrun()
but a MDrunner instance gives the user the ability to run mpi jobs etc. [None]- mdrun_args
arguments for mdrunner (as a dict), e.g.
{'nt': 2}
; empty by defaultNew in version 0.7.0.
- kwargs
remaining key/value pairs that should be changed in the template mdp file, eg
nstxtcout=250, nstfout=250
.
Note
If
mdrun_d()
is not found, the function falls back tomdrun()
instead.
- gromacs.setup.em_schedule(**kwargs)[source]
Run multiple energy minimizations one after each other.
- Keywords:
- integrators
list of integrators (from ‘l-bfgs’, ‘cg’, ‘steep’) [[‘bfgs’, ‘steep’]]
- nsteps
list of maximum number of steps; one for each integrator in in the integrators list [[100,1000]]
- kwargs
mostly passed to
gromacs.setup.energy_minimize()
- Returns:
dictionary with paths to final structure (‘struct’) and other files
- Example:
Conduct three minimizations:
low memory Broyden-Goldfarb-Fletcher-Shannon (BFGS) for 30 steps
steepest descent for 200 steps
finish with BFGS for another 30 steps
We also do a multi-processor minimization when possible (i.e. for steep (and conjugate gradient) by using a
gromacs.run.MDrunner
class for a mdrun executable compiled for OpenMP in 64 bit (seegromacs.run
for details):import gromacs.run gromacs.setup.em_schedule(struct='solvate/ionized.gro', mdrunner=gromacs.run.MDrunnerOpenMP64, integrators=['l-bfgs', 'steep', 'l-bfgs'], nsteps=[50,200, 50])
Note
You might have to prepare the mdp file carefully because at the moment one can only modify the nsteps parameter on a per-minimizer basis.
- gromacs.setup.MD_restrained(dirname='MD_POSRES', **kwargs)[source]
Set up MD with position restraints.
Additional itp files should be in the same directory as the top file.
Many of the keyword arguments below already have sensible values. Note that setting mainselection =
None
will disable many of the automated choices and is often recommended when using your own mdp file.- Keywords:
- dirname
set up under directory dirname [MD_POSRES]
- struct
input structure (gro, pdb, …) [em/em.pdb]
- top
topology file [top/system.top]
- mdp
mdp file (or use the template) [templates/md.mdp]
- ndx
index file (supply when using a custom mdp)
- includes
additional directories to search for itp files
- mainselection
make_ndx selection to select main group [“Protein”] (If
None
then no canonical index file is generated and it is the user’s responsibility to set tc_grps, tau_t, and ref_t as keyword arguments, or provide the mdp template with all parameter pre-set in mdp and probably also your own ndx index file.)- deffnm
default filename for Gromacs run [md]
- runtime
total length of the simulation in ps [1000]
- dt
integration time step in ps [0.002]
- qscript
script to submit to the queuing system; by default uses the template
gromacs.config.qscript_template
, which can be manually set to another template fromgromacs.config.templates
; can also be a list of template names.- qname
name to be used for the job in the queuing system [PR_GMX]
- mdrun_opts
option flags for the mdrun command in the queuing system scripts such as “-stepout 100”. [“”]
- kwargs
remaining key/value pairs that should be changed in the template mdp file, eg
nstxtcout=250, nstfout=250
or command line options forgrompp` such as ``maxwarn=1
.In particular one can also set define and activate whichever position restraints have been coded into the itp and top file. For instance one could have
define = “-DPOSRES_MainChain -DPOSRES_LIGAND”
if these preprocessor constructs exist. Note that there must not be any space between “-D” and the value.
By default define is set to “-DPOSRES”.
- Returns:
a dict that can be fed into
gromacs.setup.MD()
(but check, just in case, especially if you want to change thedefine
parameter in the mdp file)
Note
The output frequency is drastically reduced for position restraint runs by default. Set the corresponding
nst*
variables if you require more output. The pressure coupling option refcoord_scaling is set to “com” by default (but can be changed via kwargs) and the pressure coupling algorithm itself is set to Pcoupl = “Berendsen” to run a stable simulation.
- gromacs.setup.MD(dirname='MD', **kwargs)[source]
Set up equilibrium MD.
Additional itp files should be in the same directory as the top file.
Many of the keyword arguments below already have sensible values. Note that setting mainselection =
None
will disable many of the automated choices and is often recommended when using your own mdp file.- Keywords:
- dirname
set up under directory dirname [MD]
- struct
input structure (gro, pdb, …) [MD_POSRES/md_posres.pdb]
- top
topology file [top/system.top]
- mdp
mdp file (or use the template) [templates/md.mdp]
- ndx
index file (supply when using a custom mdp)
- includes
additional directories to search for itp files
- mainselection
make_ndx
selection to select main group [“Protein”] (IfNone
then no canonical index file is generated and it is the user’s responsibility to set tc_grps, tau_t, and ref_t as keyword arguments, or provide the mdp template with all parameter pre-set in mdp and probably also your own ndx index file.)- deffnm
default filename for Gromacs run [md]
- runtime
total length of the simulation in ps [1000]
- dt
integration time step in ps [0.002]
- qscript
script to submit to the queuing system; by default uses the template
gromacs.config.qscript_template
, which can be manually set to another template fromgromacs.config.templates
; can also be a list of template names.- qname
name to be used for the job in the queuing system [MD_GMX]
- mdrun_opts
option flags for the mdrun command in the queuing system scripts such as “-stepout 100 -dgdl”. [“”]
- kwargs
remaining key/value pairs that should be changed in the template mdp file, e.g.
nstxtcout=250, nstfout=250
or command line options for :program`grompp` such asmaxwarn=1
.
- Returns:
a dict that can be fed into
gromacs.setup.MD()
(but check, just in case, especially if you want to change the define parameter in the mdp file)
Helper functions
The following functions are used under the hood and are mainly useful when writing extensions to the module.
- gromacs.setup.make_main_index(struct, selection='"Protein"', ndx='main.ndx', oldndx=None)[source]
Make index file with the special groups.
This routine adds the group __main__ and the group __environment__ to the end of the index file. __main__ contains what the user defines as the central and most important parts of the system. __environment__ is everything else.
The template mdp file, for instance, uses these two groups for T-coupling.
These groups are mainly useful if the default groups “Protein” and “Non-Protein” are not appropriate. By using symbolic names such as __main__ one can keep scripts more general.
- Returns:
groups is a list of dictionaries that describe the index groups. See
gromacs.cbook.parse_ndxlist()
for details.- Arguments:
- structfilename
structure (tpr, pdb, gro)
- selectionstring
is a
make_ndx
command such as"Protein"
orr DRG
which determines what is considered the main group for centering etc. It is passed directly tomake_ndx
.- ndxstring
name of the final index file
- oldndxstring
name of index file that should be used as a basis; if None then the
make_ndx
default groups are used.
This routine is very dumb at the moment; maybe some heuristics will be added later as could be other symbolic groups such as __membrane__.
- gromacs.setup.get_lipid_vdwradii(outdir='.', libdir=None)[source]
Find vdwradii.dat and add special entries for lipids.
See
gromacs.setup.vdw_lipid_resnames
for lipid resnames. Add more if necessary.
- gromacs.setup._setup_MD(dirname, deffnm='md', mdp='/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/md_OPLSAA.mdp', struct=None, top='top/system.top', ndx=None, mainselection='"Protein"', qscript='/home/docs/checkouts/readthedocs.org/user_builds/gromacswrapper/checkouts/latest/gromacs/templates/local.sh', qname=None, startdir=None, mdrun_opts='', budget=None, walltime=0.3333333333333333, dt=0.002, runtime=1000.0, **mdp_kwargs)[source]
Generic function to set up a
mdrun
MD simulation.See the user functions for usage.
Defined constants:
- gromacs.setup.CONC_WATER = 55.345
Concentration of water at standard conditions in mol/L. Density at 25 degrees C and 1 atmosphere pressure: rho = 997.0480 g/L. Molecular weight: M = 18.015 g/mol. c = n/V = m/(V*M) = rho/M = 55.345 mol/L.
- gromacs.setup.vdw_lipid_resnames = ['POPC', 'POPE', 'POPG', 'DOPC', 'DPPC', 'DLPC', 'DMPC', 'DPPG']
Hard-coded lipid residue names for a
vdwradii.dat
file. Use together withvdw_lipid_atom_radii
inget_lipid_vdwradii()
.
- gromacs.setup.vdw_lipid_atom_radii = {'C': 0.25, 'H': 0.09, 'N': 0.16, 'O': 0.155}
Increased atom radii for lipid atoms; these are simply the standard values from
GMXLIB/vdwradii.dat
increased by 0.1 nm (C) or 0.05 nm (N, O, H).
gromacs.scaling
– Partial tempering
- Author:
Jan Domanski, @jandom
New in version 0.5.0.
Helper functions for scaling gromacs topologies; useful for setting up simulations with Hamiltonian replicate exchange and partial tempering (REST2).
- gromacs.scaling.scale_dihedrals(mol, dihedrals, scale, banned_lines=None)[source]
Scale dihedral angles
gromacs.qsub
– utilities for batch submission systems
The module helps writing submission scripts for various batch submission
queuing systems. The known ones are listed stored as
QueuingSystem
instances in
queuing_systems
; append new ones to this list.
The working paradigm is that template scripts are provided (see
gromacs.config.templates
) and only a few place holders are substituted
(using gromacs.cbook.edit_txt()
).
User-supplied template scripts can be stored in
gromacs.config.qscriptdir
(by default ~/.gromacswrapper/qscripts
)
and they will be picked up before the package-supplied ones.
At the moment, some of the functions in gromacs.setup
use this module
but it is fairly independent and could conceivably be used for a wider range of
projects.
Queuing system templates
The queuing system scripts are highly specific and you will need to add
your own. Templates should be shell scripts. Some parts of the
templates are modified by the
generate_submit_scripts()
function. The “place
holders” that can be replaced are shown in the table below. Typically,
the place holders are either shell variable assignments or batch
submission system commands. The table shows SGE commands but PBS and
LoadLeveler have similar constructs; e.g. PBS commands start with
#PBS
and LoadLeveller uses #@
with its own command keywords).
place holder |
default |
replacement |
description |
regex |
---|---|---|---|---|
#$ -N |
GMX_MD |
sgename |
job name |
/^#.*(-N|job_name)/ |
#$ -l walltime= |
00:20:00 |
walltime |
max run time |
/^#.*(-l walltime|wall_clock_limit)/ |
#$ -A |
BUDGET |
budget |
account |
/^#.*(-A|account_no)/ |
DEFFNM= |
md |
deffnm |
default gmx name |
/^ *DEFFNM=/ |
STARTDIR= |
. |
startdir |
remote jobdir |
/^ *STARTDIR=/ |
WALL_HOURS= |
0.33 |
walltime h |
mdrun’s -maxh |
/^ *WALL_HOURS=/ |
NPME= |
npme |
PME nodes |
/^ *NPME=/ |
|
MDRUN_OPTS= |
“” |
mdrun_opts |
more options |
/^ *MDRUN_OPTS=/ |
Lines with place holders should not have any white space at the
beginning. The regular expression pattern (“regex”) is used to find
the lines for the replacement and the literal default values
(“default”) are replaced. (Exception: any value that follows an equals
sign “=” is replaced, regardless of the default value in the table
except for MDRUN_OPTS
where only “” will be replace.) Not all
place holders have to occur in a template; for instance, if a queue
has no run time limitation then one would probably not include
walltime and WALL_HOURS place holders.
The line # JOB_ARRAY_PLACEHOLDER
can be replaced by
generate_submit_array()
to produce a “job array”
(also known as a “task array”) script that runs a large number of
related simulations under the control of a single queuing system
job. The individual array tasks are run from different sub
directories. Only queuing system scripts that are using the
bash shell are supported for job arrays at the moment.
A queuing system script must have the appropriate suffix to be properly recognized, as shown in the table below.
Queuing system |
suffix |
notes |
---|---|---|
Sun Gridengine |
.sge |
Sun’s Sun Gridengine |
Portable Batch queuing system |
.pbs |
|
LoadLeveler |
.ll |
IBM’s LoadLeveler |
bash script |
.bash, .sh |
|
csh script |
.csh |
Example queuing system script template for PBS
The following script is a usable PBS script for a super computer. It contains almost all of the replacement tokens listed in the table (indicated by ++++++).
#!/bin/bash
# File name: ~/.gromacswrapper/qscripts/supercomputer.somewhere.fr_64core.pbs
#PBS -N GMX_MD
# ++++++
#PBS -j oe
#PBS -l select=8:ncpus=8:mpiprocs=8
#PBS -l walltime=00:20:00
# ++++++++
# host: supercomputer.somewhere.fr
# queuing system: PBS
# set this to the same value as walltime; mdrun will stop cleanly
# at 0.99 * WALL_HOURS
WALL_HOURS=0.33
# ++++
# deffnm line is possibly modified by gromacs.setup
# (leave it as it is in the template)
DEFFNM=md
# ++
TPR=${DEFFNM}.tpr
OUTPUT=${DEFFNM}.out
PDB=${DEFFNM}.pdb
MDRUN_OPTS=""
# ++
# If you always want to add additional MDRUN options in this script then
# you can either do this directly in the mdrun commandline below or by
# constructs such as the following:
## MDRUN_OPTS="-npme 24 $MDRUN_OPTS"
# JOB_ARRAY_PLACEHOLDER
#++++++++++++++++++++++ leave the full commented line intact!
# avoids some failures
export MPI_GROUP_MAX=1024
# use hard coded path for time being
GMXBIN="/opt/software/SGI/gromacs/4.0.3/bin"
MPIRUN=/usr/pbs/bin/mpiexec
APPLICATION=$GMXBIN/mdrun_mpi
$MPIRUN $APPLICATION -stepout 1000 -deffnm ${DEFFNM} -s ${TPR} -c ${PDB} -cpi $MDRUN_OPTS -maxh ${WALL_HOURS} > $OUTPUT
rc=$?
# dependent jobs will only start if rc == 0
exit $rc
Save the above script in ~/.gromacswrapper/qscripts
under the name
supercomputer.somewhere.fr_64core.pbs
. This will make the script
immediately usable. For example, in order to set up a production MD run with
gromacs.setup.MD()
for this super computer one would use
gromacs.setup.MD(..., qscripts=['supercomputer.somewhere.fr_64core.pbs', 'local.sh'])
This will generate submission scripts based on
supercomputer.somewhere.fr_64core.pbs
and also the default local.sh
that is provided with GromacsWrapper.
In order to modify MDRUN_OPTS
one would use the additonal mdrun_opts
argument, for instance:
gromacs.setup.MD(..., qscripts=['supercomputer.somewhere.fr_64core.pbs', 'local.sh'],
mdrun_opts="-v -npme 20 -dlb yes -nosum")
Currently there is no good way to specify the number of processors when creating run scripts. You will need to provide scripts with different numbers of cores hard coded or set them when submitting the scripts with command line options to qsub.
Classes and functions
- class gromacs.qsub.QueuingSystem(name, suffix, qsub_prefix, array_variable=None, array_option=None)[source]
Class that represents minimum information about a batch submission system.
Define a queuing system’s functionality
- Arguments:
- name
name of the queuing system, e.g. ‘Sun Gridengine’
- suffix
suffix of input files, e.g. ‘sge’
- qsub_prefix
prefix string that starts a qsub flag in a script, e.g. ‘#$’
- Keywords:
- array_variable
environment variable exported for array jobs, e.g. ‘SGE_TASK_ID’
- array_option
qsub option format string to launch an array (e.g. ‘-t %d-%d’)
- gromacs.qsub.generate_submit_scripts(templates, prefix=None, deffnm='md', jobname='MD', budget=None, mdrun_opts=None, walltime=1.0, jobarray_string=None, startdir=None, npme=None, **kwargs)[source]
Write scripts for queuing systems.
This sets up queuing system run scripts with a simple search and replace in templates. See
gromacs.cbook.edit_txt()
for details. Shell scripts are made executable.- Arguments:
- templates
Template file or list of template files. The “files” can also be names or symbolic names for templates in the templates directory. See
gromacs.config
for details and rules for writing templates.- prefix
Prefix for the final run script filename; by default the filename will be the same as the template. [None]
- dirname
Directory in which to place the submit scripts. [.]
- deffnm
Default filename prefix for mdrun
-deffnm
[md]- jobname
Name of the job in the queuing system. [MD]
- budget
Which budget to book the runtime on [None]
- startdir
Explicit path on the remote system (for run scripts that need to cd into this directory at the beginning of execution) [None]
- mdrun_opts
String of additional options for mdrun.
- walltime
Maximum runtime of the job in hours. [1]
- npme
number of PME nodes
- jobarray_string
Multi-line string that is spliced in for job array functionality (see
gromacs.qsub.generate_submit_array()
; do not use manually)- kwargs
all other kwargs are ignored
- Returns:
list of generated run scripts
- gromacs.qsub.generate_submit_array(templates, directories, **kwargs)[source]
Generate a array job.
For each
work_dir
in directories, the array job willcd into
work_dir
run the job as detailed in the template
It will use all the queuing system directives found in the template. If more complicated set ups are required, then this function cannot be used.
- Arguments:
- templates
Basic template for a single job; the job array logic is spliced into the position of the line
# JOB_ARRAY_PLACEHOLDER
The appropriate commands for common queuing systems (Sun Gridengine, PBS) are hard coded here. The queuing system is detected from the suffix of the template.
- directories
List of directories under dirname. One task is set up for each directory.
- dirname
The array script will be placed in this directory. The directories must be located under dirname.
- kwargs
See
gromacs.setup.generate_submit_script()
for details.
- gromacs.qsub.detect_queuing_system(scriptfile)[source]
Return the queuing system for which scriptfile was written.
- gromacs.qsub.queuing_systems = [<Sun Gridengine QueuingSystem instance>, <PBS QueuingSystem instance>, <LoadLeveler QueuingSystem instance>, <Slurm QueuingSystem instance>]
Pre-defined queuing systems (SGE, PBS). Add your own here.
gromacs.run
– Running simulations
The gromacs.run
module contains tools for launching a Gromacs MD
simulation with gmx mdrun. The basic tool is the MDrunner
class that customizes how mdrun
is actually
called. It enables setting a driver such as mpiexec for launching
MPI-enabled runs. The Example: How to create your own MDrunner with mpiexec -n should make clearer what one
needs to do.
Additionally, Helper functions are provided to check and manage MD runs.
Example: How to create your own MDrunner with mpiexec -n
Question: How do I change the GromacsWrapper configuration file so that
mdrun
gets called with anmpiexec -n
prefix?Answer: That’s not directly supported but if you just want to change how
mdrun
is launched then you can create a customMDrunner
for this purpose.
In many cases, you really only need the path to mpiexec and then you
can just derive your own class MDrunnerMPI
:
import gromacs.run
class MDrunnerMPI(gromacs.run.MDrunner):
"""Manage running :program:`mdrun` as an MPI multiprocessor job."""
mdrun = "gmx_mpi mdrun"
mpiexec = "/opt/local/bin/mpiexec"
The full path to the MPI runner mpiexec (or mpirun) is
stored in the class attribute MDrunnerMPI.mpiexec
.
This class can then be used as
mdrun_mpi = MDrunnerMPI(s="md.tpr", deffnm="md")
rc = mdrun_mpi.run(ncores=16)
Our MDrunnerMPI
only supports running mpiexec -n ncores
gmx mdrun ...
, i.e., only the -n ncores
arguments for mpiexec
is supported. If you need more functionality then you need write your own
MDrunner.mpicommand()
method, which you would add to your own
MDrunnerMPI
class.
The included MDrunnerOpenMP
could be used instead of our own
MDrunnerMPI
; the only difference is that multiple names of MPI-enabled
mdrun
binaries are stored as a tuple in the attribute
MDrunnerOpenMP.mdrun
so that the class works for old Gromacs 4.x and
modern Gromacs ≥ 2016.
If you need to run some code before or after launching you can add it as the
MDrunnerMPI.prehook()
and MDrunnerMPI.posthook()
methods as shown
in MDrunnerMpich2Smpd
.
MDrunner
The MDrunner
wraps gromacs.tools.Mdrun
to customize launching
a Gromacs MD simulation from inside the Python interpreter.
- class gromacs.run.MDrunner(dirname='.', **kwargs)[source]
A class to manage running mdrun in various ways.
In order to do complicated multiprocessor runs with mpiexec or similar you need to derive from this class and override
MDrunner.mdrun
with the path to themdrun
executableMDrunner.mpiexec
with the path to the MPI launcherMDrunner.mpicommand()
with a function that returns the mpi command as a list
In addition there are two methods named
prehook()
andposthook()
that are called right before and after the process is started. If they are overriden appropriately then they can be used to set up a mpi environment.The
run()
method can take arguments for the mpiexec launcher but it can also be used to supersede the arguments for mdrun.The actual mdrun command is set in the class-level attribute
mdrun
. This can be a single string or a sequence (tuple) of strings. On instantiation, the first entry inmdrun
that can be found on thePATH
is chosen (withfind_gromacs_command()
). For example,gmx mdrun
from Gromacs 5.x but justmdrun
for Gromacs 4.6.x. Similarly, alternative executables (such as double precision) need to be specified here (e.g.("mdrun_d", "gmx_d mdrun")
).Note
Changing mdrun arguments permanently changes the default arguments for this instance of
MDrunner
. (This is arguably a bug.)Changed in version 0.5.1: Added detection of bare Gromacs commands (Gromacs 4.6.x) or commands run through gmx (Gromacs 5.x).
Changed in version 0.6.0: Changed syntax for Gromacs 5.x commands.
Set up a simple run with
mdrun
.- Keywords:
- dirname
Change to this directory before launching the job. Input files must be supplied relative to this directory.
- keywords
All other keword arguments are used to construct the
mdrun
commandline. Note that only keyword arguments are allowed.
- check_success()[source]
Check if mdrun finished successfully.
(See
check_mdrun_success()
for details)
- commandline(**mpiargs)[source]
Returns simple command line to invoke mdrun.
If
mpiexec
is set thenmpicommand()
provides the mpi launcher command that prefixes the actualmdrun
invocation:The mdrun-args are set on initializing the class. Override
mpicommand()
to fit your system if the simple default OpenMP launcher is not appropriate.
- mdrun = ('mdrun', 'gmx mdrun')
Path to the mdrun executable (or the name if it can be found on
PATH
); this can be a tuple and then the program names are tried in sequence. For Gromacs 5 prefix with the driver command, e.g.,gmx mdrun
.New in version 0.5.1.
- mpicommand(*args, **kwargs)[source]
Return a list of the mpi command portion of the commandline.
- Only allows primitive mpi at the moment:
mpiexec -n ncores mdrun mdrun-args
(This is a primitive example for OpenMP. Override it for more complicated cases.)
- mpiexec = None
path to the MPI launcher (e.g. mpiexec)
- run(pre=None, post=None, mdrunargs=None, **mpiargs)[source]
Execute the mdrun command (possibly as a MPI command) and run the simulation.
- Keywords:
- pre
a dictionary containing keyword arguments for the
prehook()
- post
a dictionary containing keyword arguments for the
posthook()
- mdrunargs
a dictionary with keyword arguments for mdrun which supersede and update the defaults given to the class constructor
- mpiargs
all other keyword arguments that are processed by
mpicommand()
- run_check(**kwargs)[source]
Run mdrun and check if run completed when it finishes.
This works by looking at the mdrun log file for ‘Finished mdrun on node’. It is useful to implement robust simulation techniques.
- Arguments:
kwargs are keyword arguments that are passed on to
run()
(typically used for mpi things)- Returns:
True
if run completed successfullyFalse
otherwise
- class gromacs.run.MDrunnerDoublePrecision(dirname='.', **kwargs)[source]
Manage running mdrun_d.
Set up a simple run with
mdrun
.- Keywords:
- dirname
Change to this directory before launching the job. Input files must be supplied relative to this directory.
- keywords
All other keword arguments are used to construct the
mdrun
commandline. Note that only keyword arguments are allowed.
Example implementations
- class gromacs.run.MDrunnerOpenMP(dirname='.', **kwargs)[source]
Manage running mdrun as an OpenMP multiprocessor job.
Set up a simple run with
mdrun
.- Keywords:
- dirname
Change to this directory before launching the job. Input files must be supplied relative to this directory.
- keywords
All other keword arguments are used to construct the
mdrun
commandline. Note that only keyword arguments are allowed.
- mdrun = ('mdrun_openmp', 'gmx_openmp mdrun')
Path to the mdrun executable (or the name if it can be found on
PATH
); this can be a tuple and then the program names are tried in sequence. For Gromacs 5 prefix with the driver command, e.g.,gmx mdrun
.New in version 0.5.1.
- mpiexec = 'mpiexec'
path to the MPI launcher (e.g. mpiexec)
- class gromacs.run.MDrunnerMpich2Smpd(dirname='.', **kwargs)[source]
Manage running mdrun as mpich2 multiprocessor job with the SMPD mechanism.
Set up a simple run with
mdrun
.- Keywords:
- dirname
Change to this directory before launching the job. Input files must be supplied relative to this directory.
- keywords
All other keword arguments are used to construct the
mdrun
commandline. Note that only keyword arguments are allowed.
- mdrun = 'mdrun_mpich2'
Path to the mdrun executable (or the name if it can be found on
PATH
); this can be a tuple and then the program names are tried in sequence. For Gromacs 5 prefix with the driver command, e.g.,gmx mdrun
.New in version 0.5.1.
- mpiexec = 'mpiexec'
path to the MPI launcher (e.g. mpiexec)
Helper functions
- gromacs.run.check_mdrun_success(logfile)[source]
Check if
mdrun
finished successfully.Analyses the output from
mdrun
in logfile. Right now we are simply looking for the line “Finished mdrun on node” in the last 1kb of the file. (The file must be seeakable.)- Arguments:
- logfilefilename
Logfile produced by
mdrun
.
- Returns:
True
if all ok,False
if not finished, andNone
if the logfile cannot be opened
Alternatives to GromacsWrapper
GromacsWrapper is simplistic; in particular it does not directly link to the GROMACS libraries but relies on Python wrappers to call GROMACS tools. Some people find this very crude (the author included). Other people have given more thought to the problem and you are encouraged to see if their efforts speed up your work more than does GromacsWrapper.
- gmxapi (M.E. Irrgang, J.M. Hays, and P.M. Kasson)
gmxapi
provides interfaces for managing and extending molecular dynamics simulation workflows. In this repository, a Python package provides thegmx
module for high-level interaction with GROMACS.gmx.core
provides Python bindings to the gmxapi C++ GROMACS external API.Irrgang, M. E., Hays, J. M., & Kasson, P. M. gmxapi: a high-level interface for advanced control and extension of molecular dynamics simulations. Bioinformatics 2018. DOI: 10.1093/bioinformatics/bty484
- gromacs_py (Samuel Murail, Maxence Delaunay, Damien Espana)
Gromacs_py is a Python library allowing a simplified use of the Gromacs MD simulation software. Gromacs_py can build a system topologie based on a pdb file, create the simulation system (pbc box, adding water and ions) and run minimisation, equilibration and production runs. One of the main objective of the Gromacs_py wrapper is to automatize routine operations for MD simulation of multiple systems.
- MDAnalysis (N. Michaud-Agrawal, E. J. Dennning, and O. Beckstein)
Reads various trajectories (dcd, xtc, trr) and makes coordinates available as numpy arrays. It also has a fairly sophisticated selection language, similar to Charmm or VMD.
- ParmEd
A general tool for working with topology files for all the popular MD codes, including the parmed.gromacs module for ITP and TOP files.
- pymacs (Daniel Seeliger)
pymacs is a python module for dealing with structure files and trajectory data from the GROMACS molecular dynamics package. It has interfaces to some gromacs functions and uses gromacs routines for command line parsing, reading and writing of structure files (pdb,gro,…) and for reading trajectory data (only xtc at the moment). It is quite useful to write python scripts for simulation setup and analysis that can be combined with other powerful python packages like numpy, scipy or plotting libraries like pylab. It has an intuitive data structure (Model –> Chain –> Molecule –> Atom) and allows modifications at all levels like
Changing of atom, residue and chain properties (name, coordinate, b-factor,…
Deleting and inserting atoms, residues, chains
Straightforward selection of structure subsets
Structure building from sequence
Handling gromacs index files
- gmxscript (Pedro Lacerda)
gmxscript is a framework for GROMACS simulations. Its main goal is make simulation protocols easily reproducible and to define canonical steps to perform and analyze a simulation. The commands are stored in very readable and structured Python file that requires no programming knowledge except syntax.
- GROMACS XTC Library
Version 1.1 of the separate xtc/trr library contains example code to access a GROMACS trajectory from python. It appears to be based on grompy (also see below).
- various implementations of python wrappers
See the discussion on the gmx-developers mailinglist: check the thread [gmx-developers] Python interface for Gromacs
- grompy (René Pool, Martin Hoefling, Roland Schulz)
uses ctypes to wrap libgmx:
“Here’s a bunch of code I wrote to wrap libgmx with ctypes and make use of parts of gromacs functionality. My application for this was the processing of a trajectories using gromac’s pbc removal and fitting routines as well as reading in index groups etc. It’s very incomplete atm and also focused on wrapping libgmx with all gromacs types and definitions…
… so python here feels a bit like lightweight c-code glueing together gromacs library functions :-)
The attached code lacks a bit of documentation, but I included a test.py as an example using it.”
Roland Schulz added code:
“I added a little bit wrapper code to easily access the atom information in tpx. I attached the version. It is backward compatible …”
A working grompy tar ball (with Roland’s enhancements) is cached at gmane.org and the latest sources are hosted at https://github.com/GromPy
- LOOS (Grossfield lab at the University of Rochester)
The idea behind LOOS (Lightweight Object-Oriented Structure library) is to provide a lightweight C++ library for analysis of molecular dynamics simulations. This includes parsing a number of PDB variants, as well as the native system description and trajectory formats for CHARMM, NAMD, and Amber. LOOS is not intended to be an all-encompassing library and it is primarily geared towards reading data in and processing rather than manipulating the files and structures and writing them out.
The LOOS documentation is well written and comprehensive and the code is published under the GPL.
- copernicus
Copernicus is a Python-based client-server network that allows running of large and complicated MD simulation workflows. It supports GROMACS (grompp and mdrun).
- VMD (Schulten lab at UIUC)
VMD is a great analysis tool; the only downside is that (at the moment) trajectories have to fit into memory. In some cases this can be circumvented by reading a trajectory frame by frame using the bigdcd script (which might also work for GROMACS xtcs).
- JGromacs (Márton Münz and Philip C Biggin)
JGromacs is a Java library designed to facilitate the development of cross-platform analysis applications for Molecular Dynamics (MD) simulations. The package contains parsers for file formats applied by GROMACS. It provides a multilevel object-oriented representation of simulation data to integrate and interconvert sequence, structure and dynamics information. In addititon, a basic analysis toolkit is included in the package. The programmer is also provided with simple tools (e.g. XML-based configuration) to create applications with a user interface resembling the command-line UI of GROMACS applications.
Please open an issue in the issue tracker to let us know of other efforts so that they can be added here. Thanks.