Programs¶
Programs summary tables:
Executable | Used for | Description
---|---|---
rbcavity | Preparation | Cavity mapping and preparation of docking site (.as) file.
rbcalcgrid | Preparation | Calculation of vdW grid files (usually called by make_grid.csh wrapper script).
rbdock | Docking | The main rDock docking engine itself.

Executable | Used for | Description
---|---|---
sdtether | Preparation | Prepares a ligand SD file for tethered scaffold docking. Annotates ligand SD file with tethered substructure atom indices. Requires Open Babel Python bindings.
rbhtfinder | Preparation | Used to optimise a high-throughput docking protocol from an initial exhaustive docking of a small representative ligand library. Parametrises a multi-step protocol for your system.
make_grid.csh | Preparation | Creates the vdW grid files required for grid-based docking protocols.
rbconvgrid | Analysis | Converts rDock vdW grids to InsightII grid format for visualisation.
rbmoegrid | Analysis | Converts rDock vdW grids to MOE grid format for visualisation.
rblist | Analysis | Outputs miscellaneous information for ligand SD file records.
sdrmsd | Analysis | Calculation of ligand Root Mean Squared Displacement (RMSD) between reference and docked poses, taking into account ligand topological symmetry. Requires Open Babel Python bindings.
sdfilter | Analysis | Utility for filtering SD files by arbitrary data field expressions. Useful for simple post-docking filtering by score components.
sdsort | Analysis | Utility for sorting SD files by arbitrary data field. Useful for simple post-docking sorting by score components.
sdreport | Analysis | Utility for reporting SD file data field values. Output in tab-delimited or CSV format.
sdsplit | Utility | Splits an SD file into multiple smaller SD files of fixed number of records.
sdmodify | Utility | Sets the molecule title line of each SD record equal to a given SD data field.
Programs reference¶
rbdock¶
rbdock – The rDock docking engine itself.
$RBT_ROOT/bin/rbdock
{-i input ligand MDL SD file}
{-o output MDL SD file}
{-r system definition.prm file}
{-p docking protocol.prm file}
[-n number of docking runs/ligand]
[-s random seed]
[-T debug trace level]
[[-t SCORE.INTER threshold] | [-t filter definition file]]
[-ap -an -allH -cont]
Simple exhaustive docking¶
The minimum requirement for rbdock is to specify the input (-i) and output
(-o) ligand SD file names, the system definition .prm file (-r) and the
docking protocol .prm file (-p). This will perform one docking run per
ligand record in the input SD file and output all docked ligand poses to the
output SD file. Use -n to increase the number of docking runs per ligand
record.
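As a minimal sketch of the invocation described above, the following Python helper assembles an rbdock argument list without running it. The helper name and the file names (ligands.sd, docked.sd, receptor.prm, dock.prm) are hypothetical; only the -i, -o, -r, -p and -n flags are taken from the synopsis.

```python
import os


def build_rbdock_command(ligands, output, system_prm, protocol_prm, n_runs=None):
    """Assemble an rbdock argument list from the four required arguments."""
    # RBT_ROOT is the installation root used in the synopsis above.
    exe = os.path.join(os.environ.get("RBT_ROOT", ""), "bin", "rbdock")
    cmd = [exe, "-i", ligands, "-o", output, "-r", system_prm, "-p", protocol_prm]
    if n_runs is not None:
        # -n raises the number of docking runs per ligand record (default: one).
        cmd += ["-n", str(n_runs)]
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_rbdock_command("ligands.sd", "docked.sd", "receptor.prm", "dock.prm", n_runs=50)
print(" ".join(cmd))
```

The list form can be passed directly to subprocess.run once the files exist.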
High-throughput docking, option 1¶
The -t and -cont options can be used to construct high-throughput
protocols. If the argument following -t is numeric it is interpreted as a
threshold value for SCORE.INTER, the total intermolecular score between
ligand and receptor/solvent. In the absence of -cont, the threshold acts as an
early termination filter, and the docking runs for each ligand will be
terminated early once the threshold value has been exceeded. Note that the
threshold is applied only at the end of each individual docking run, not during
the runs themselves. If the -cont (continue) option is specified as well,
the threshold acts as an output pose filter instead of a termination filter.
The docking runs for each ligand run to completion as in the exhaustive case,
but only the docking poses that exceed the threshold value of SCORE.INTER
are written to the output SD file.
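The two behaviours can be illustrated with a small simulation. This is a sketch of the logic as described in the text, not rDock code: it assumes that a pose "exceeds" the threshold when its SCORE.INTER is lower (more favourable) than the threshold, and the scores are made up.

```python
def simulate_threshold(run_scores, threshold, cont=False):
    """Return (runs_performed, poses_passing_threshold) for one ligand.

    Assumption: a pose "exceeds" the threshold when its SCORE.INTER is
    lower (more favourable) than the threshold value.
    """
    performed = []
    for score in run_scores:
        performed.append(score)
        # The check happens only after a run has finished, never mid-run.
        if not cont and score < threshold:
            break  # termination filter: stop docking this ligand
    passing = [s for s in performed if s < threshold]
    return len(performed), passing


scores = [-12.0, -21.5, -18.0, -25.0]  # hypothetical SCORE.INTER per run
print(simulate_threshold(scores, -20.0))             # early termination
print(simulate_threshold(scores, -20.0, cont=True))  # output pose filter
```

Without cont the ligand stops after the second run; with cont all four runs complete and two poses pass the filter.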
High-throughput docking, option 2¶
Alternatively, if the argument following -t is non-numeric it is interpreted
as a filter definition file. The filter definition file can be used to define
multiple termination filters and multiple output pose filters in a generic way.
Any docking score component can be used in the filter definitions.
run_rbscreen.pl generates a filter definition file for multi-stage,
high-throughput docking, with progressive score thresholds for early termination
of poorly performing ligands. The use of filter definition files is preferred
over the more limited SCORE.INTER filtering described above, whose use is
now deprecated.
Automated ligand protonation/deprotonation¶
The -ap option activates the automated protonation of ligand positive
ionisable centres, notably amines, guanidines, imidazoles, and amidines. The
-an option activates the automated deprotonation of ligand negative
ionisable centres, notably carboxylic acids, phosphates, phosphonates,
sulphates, and sulphonates. The precise rules used by rDock for protonation
and deprotonation are quite crude, and are not user-customisable. Therefore
these flags are not recommended for detailed validation experiments, in which
care should be taken that the ligand protonation states are set correctly in the
input SD file. Note that rDock is not capable of converting ionised centres back
to the neutral form; these are unidirectional transformations.
Control of ligand non-polar hydrogens¶
By default, rDock uses an implicit non-polar hydrogen model for receptor and
ligand, and all of the scoring function validation has been performed on this
basis. If the -allH option is not defined (recommended), all explicit
non-polar hydrogens encountered in the ligand input SD file are removed, and
only the polar hydrogens (bonded to O, N, or S) are retained. If the -allH
option is defined (not recommended), no hydrogens are removed from the ligand.
Note that rDock is not capable of adding explicit non-polar hydrogens if none
exist. In other words, the -allH option disables hydrogen removal; it does
not activate hydrogen addition. You should always make sure that polar hydrogens
are defined explicitly. If the ligand input SD file contains no explicit
non-polar hydrogens, the -allH option has no effect. Receptor protonation is
controlled by the system definition .prm file.
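The retention rule can be sketched as follows. This is an illustration of the rule stated above rather than rDock's implementation; the data layout (a list of hydrogen label and bonded-element pairs) and the function name are hypothetical.

```python
POLAR_PARENTS = {"O", "N", "S"}


def retained_hydrogens(hydrogens, all_h=False):
    """hydrogens: list of (hydrogen_label, element_of_bonded_heavy_atom)."""
    if all_h:
        # -allH only disables removal; nothing is ever added.
        return list(hydrogens)
    return [(label, parent) for label, parent in hydrogens if parent in POLAR_PARENTS]


ligand_h = [("H1", "C"), ("H2", "N"), ("H3", "O"), ("H4", "C")]
print(retained_hydrogens(ligand_h))              # polar hydrogens only
print(retained_hydrogens(ligand_h, all_h=True))  # input kept unchanged
```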
rbcavity¶
rbcavity – Cavity mapping and preparation of docking site (.as) file.
$RBT_ROOT/bin/rbcavity
{-r system definition .prm file}
[-ras -was -d -v -s]
[-l distance from cavity]
[-b border]
Exploration of cavity mapping parameters¶
rbcavity -r .prm file
You can run rbcavity with just the -r argument when first preparing a new
receptor for docking. This allows you to rapidly explore the impact of the
cavity mapping parameters on the generated cavities, whilst avoiding the
overhead of actually writing the docking site (.as) file to disk. The number
of cavities and volume of each cavity are written to standard output.
Visualisation of cavities¶
rbcavity -r .prm file -d
If you have access to InsightII you can use the -d option to dump the cavity
volumes in InsightII grid file format. There is no need to write the docking
site (.as) file first. The InsightII grid files should be loaded into the
reference coordinate space of the receptor and contoured at a contour level of
0.99.
Writing the docking site (.as) file¶
rbcavity -r .prm file -was
When you are happy with the mapping parameters, use the -was option to write the
docking site (.as) file to disk. The docking site file is a binary file that
contains the cavity volumes in a compact format, and a pre-calculated cuboid
grid extending over the cavities. The grid represents the distance from each
point in space to the nearest cavity grid point, and is used by the cavity
penalty scoring function. Calculating the distance grid can take a long time
(whereas the cavity mapping itself is usually very fast), hence the -was
option should be used sparingly.
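To make the idea of the distance grid concrete, here is a brute-force sketch that stores, for every grid point, the distance to the nearest cavity point. It illustrates the concept only: the function, the tiny grid and the cavity points are hypothetical, and rDock's actual algorithm and binary file layout are not shown. It requires Python 3.8 or later for math.dist.

```python
import math


def distance_grid(cavity_points, origin, shape, step):
    """Brute-force distance from every grid point to the nearest cavity point."""
    grid = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                p = (origin[0] + i * step, origin[1] + j * step, origin[2] + k * step)
                grid[(i, j, k)] = min(math.dist(p, c) for c in cavity_points)
    return grid


cavity = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]  # hypothetical cavity grid points
grid = distance_grid(cavity, origin=(0.0, 0.0, 0.0), shape=(4, 1, 1), step=1.0)
print(grid[(3, 0, 0)])
```

The cost grows with the number of grid points times the number of cavity points, which gives a feel for why this step dominates the run time.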
Analysis of cavity atoms¶
rbcavity -r .prm file -ras -l distance
Use the -l option to list the receptor atoms within a given distance of any
of the cavity volumes, for example to determine which receptor OH/NH3+ groups
should be flexible. This option requires access to the pre-calculated distance
grid embedded within the docking site (.as) file, and is best used in
combination with the -ras option, which loads a previously generated docking
site file. This avoids the time-consuming step of generating the cavity distance
grid again. If -l is used without -ras, the cavity distance grid will be
calculated on-the-fly each time.
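The typical sequence (explore, write the site file, then analyse it) might be scripted as in the sketch below. The helper name, the receptor.prm file name and the 4.0 distance are hypothetical; the flags are those listed in the synopsis. The commands are printed, not executed.

```python
def rbcavity_command(prm_file, write_site=False, read_site=False,
                     list_distance=None, dump_grids=False):
    """Assemble an rbcavity argument list from the documented flags."""
    cmd = ["rbcavity", "-r", prm_file]
    if write_site:
        cmd.append("-was")   # write the docking site (.as) file
    if read_site:
        cmd.append("-ras")   # reuse a previously written docking site file
    if dump_grids:
        cmd.append("-d")     # dump cavity volumes as InsightII grids
    if list_distance is not None:
        cmd += ["-l", str(list_distance)]
    return cmd


prm = "receptor.prm"  # hypothetical system definition file
steps = [
    rbcavity_command(prm),                                     # explore parameters
    rbcavity_command(prm, write_site=True),                    # write .as file once
    rbcavity_command(prm, read_site=True, list_distance=4.0),  # list nearby atoms
]
for step in steps:
    print(" ".join(step))
```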
Miscellaneous options¶
The -s option writes out various statistics on the cavity and on the
receptor atoms in the vicinity of the cavity. These values have been used in
genetic programming model building for docking pose false positive removal. The
-v option writes out the receptor coordinates in PSF/CRD format for use by
the rDock Viewer (not documented here). Note that the PSF/CRD files are not
suitable for simulation purposes, only for visualisation, as the atom types are
not set correctly. The -b option controls the size of the cavity distance
grid, and represents the border beyond the actual cavity volumes. It should not
be necessary to vary this parameter (default = 8 Å) unless longer-range scoring
functions are implemented.
rbcalcgrid¶
rbcalcgrid – Calculation of vdW grid files (usually called by
make_grid.csh wrapper script).
$RBT_ROOT/bin/rbcalcgrid
{-r system definition file}
{-o output suffix for generated grids}
{-p vdW scoring function prm file}
[-g grid step]
[-b border]
Note that, unlike rbdock and rbcavity, spaces are not tolerated between
the command-line options and their corresponding arguments. See
$RBT_ROOT/bin/make_grid.csh for common usage.
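Because each option must be fused with its value, it can help to build the argument list programmatically. The sketch below assumes hypothetical file names (receptor.prm, _vdw.grd) and a hypothetical helper name; calcgrid_vdw.prm is the scoring function file name mentioned elsewhere in this document.

```python
def rbcalcgrid_args(system_prm, out_suffix, sf_prm, grid_step=None, border=None):
    """Assemble rbcalcgrid arguments with no space between flag and value."""
    # Each option is fused with its value because spaces are not tolerated.
    args = ["rbcalcgrid", "-r" + system_prm, "-o" + out_suffix, "-p" + sf_prm]
    if grid_step is not None:
        args.append("-g" + str(grid_step))
    if border is not None:
        args.append("-b" + str(border))
    return args


args = rbcalcgrid_args("receptor.prm", "_vdw.grd", "calcgrid_vdw.prm", grid_step=0.3)
print(" ".join(args))
```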
make_grid.csh¶
Creates vdW grids for all receptor .prm files listed on the command line.
Front-end to rbcalcgrid.
rbconvgrid¶
rbmoegrid¶
rbmoegrid – Calculates grids for a given atom type.
rbmoegrid -o <OutputRoot> -r <ReceptorPrmFile> -p <SFPrmFile> [-g <GridStep> -b <border> -t <tripos_type>]
-o <OutFileName> (.grd is suffixed)
-r <ReceptorPrmFile> - receptor param file (contains active site params)
-p <SFPrmFile> - scoring function param file (default calcgrid_vdw.prm)
-g <GridStep> - grid step (default = 0.5A)
-b <Border> - grid border around docking site (default = 1.0A)
-t <AtomType> - Tripos atom type (default is C.3)
sdrmsd¶
sdrmsd – Calculation of ligand root mean squared displacement (RMSD)
between reference and docked poses. It takes into account molecule topological
symmetry. Requires Open Babel Python bindings.
$RBT_ROOT/bin/sdrmsd [options] {reference SD file} {input SD file}
With two arguments¶
sdrmsd calculates the RMSD between each record in the input SD file and the
first record of the reference SD file. If there is a mismatch in the number of
atoms, the record is skipped and the RMSD is not calculated. The RMSD is
calculated over the heavy (non-hydrogen) atoms only. Results are written to
standard output. If any record is skipped, a warning message is printed
to standard error.
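For orientation, the core calculation can be sketched in plain Python. This is a simplified illustration, not the sdrmsd code: it assumes atoms are listed in matching order, compares heavy-atom counts only, and ignores the topological symmetry handling that sdrmsd performs through Open Babel. The coordinates are made up.

```python
import math


def heavy_atom_rmsd(reference, pose):
    """reference, pose: lists of (element, x, y, z) in matching atom order.

    Returns None when the heavy-atom counts differ, mirroring the skip rule.
    """
    ref = [a[1:] for a in reference if a[0] != "H"]
    pos = [a[1:] for a in pose if a[0] != "H"]
    if len(ref) != len(pos) or not ref:
        return None
    total = sum((r - p) ** 2 for ra, pa in zip(ref, pos) for r, p in zip(ra, pa))
    return math.sqrt(total / len(ref))


reference = [("C", 0.0, 0.0, 0.0), ("O", 1.0, 0.0, 0.0), ("H", 2.0, 0.0, 0.0)]
pose = [("C", 0.0, 0.0, 1.0), ("O", 1.0, 0.0, 1.0), ("H", 5.0, 5.0, 5.0)]
print(heavy_atom_rmsd(reference, pose))
```

The hydrogen is ignored, so the pose shifted by 1 along z has an RMSD of 1.0.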
With fitting¶
A molecular superposition is performed before the RMSD is calculated. The output indicates that an RMSD FIT calculation was done.
sdrmsd -f reference.sdf input.sdf
sdrmsd --fit reference.sdf input.sdf
Output an SD file¶
This option writes an output SD file containing the input molecules, with an extra RMSD field added to each record. If fitting was done, the molecule coordinates will also be fitted to the reference.
sdrmsd -o output.sdf reference.sdf input.sdf
sdrmsd --out=output.sdf reference.sdf input.sdf
sdtether¶
sdtether – Prepares a ligand SD file for tethered scaffold docking.
Requires Open Babel Python bindings. Annotates ligand SD file with tethered
substructure atom indices.
$RBT_ROOT/bin/sdtether {ref. SDfile} {in SDfile} {out SDfile} "{SMARTS query}"
sdtether performs the following actions:
Runs the SMARTS query against the reference SD file to determine the tethered substructure atom indices and coordinates.
If more than one substructure match is retrieved (e.g. due to topological symmetry, or if the query is too simple) all substructure matches are retained as the reference and all ligands will be tethered according to all possible matches.
Runs the SMARTS query against each record of the input ligand SD file in turn.
For each substructure match, the ligand coordinates are transformed such that the principal axes of the matching substructure coordinates are aligned with the reference substructure coordinates.
In addition, an SD data field is added to the ligand record which lists the atom indices of the substructure match, for later retrieval by rDock.
Each transformed ligand is written to the output SD file.
Note that if the SMARTS query returns more than one substructure match for a ligand, that ligand is written multiple times to the output file, once for each match, each of which will be docked independently with different tethering information.
sdfilter¶
sdfilter – Post-process an SD file by filtering the records according to
data fields or attributes.
sdfilter -f '$<DataField> <Operator> <Value>' [-s <DataField>] [sdFiles]
or
sdfilter -f <filename> [-s <DataField>] [sdFiles]
Note
Multiple filters are allowed and are OR'd together. Filters can be provided in a file, one per line. Standard Perl operators should be used, e.g.
eq ne lt gt le ge # for strings
== != < > <= >= # for numeric
_REC (record #) is provided as a pseudo-data field. If the -s option is used, _COUNT (#occurrences of DataField) is provided as a pseudo-data field. If no SD file list is given, input is read from standard input. Output is to standard output.
For example, if results.sd contains multiple ligands each having multiple poses (ordered by score), then running
sdfilter -f'$_COUNT == 1' results.sd
will get you the first entry for each ligand.
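The effect of the `$_COUNT == 1` filter can be mimicked in a few lines. This sketch assumes that _COUNT is a running count of how often the current value of the chosen data field has been seen so far, which is consistent with the example above but is an interpretation; the field name "Name" and the record dictionaries are hypothetical.

```python
from collections import Counter


def first_pose_per_ligand(records, field="Name"):
    """Keep records whose running count of `field` equals 1."""
    seen = Counter()
    kept = []
    for rec in records:
        seen[rec[field]] += 1
        if seen[rec[field]] == 1:  # equivalent of $_COUNT == 1
            kept.append(rec)
    return kept


records = [
    {"Name": "lig1", "SCORE": -30.1},
    {"Name": "lig1", "SCORE": -28.4},
    {"Name": "lig2", "SCORE": -22.0},
    {"Name": "lig2", "SCORE": -19.7},
]
print(first_pose_per_ligand(records))
```

Because the poses are ordered by score within each ligand, the first entry is also the best-scoring one.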
sdreport¶
sdreport – Produces text summaries of SD records.
sdreport [-l] [-t [<FieldName, FieldName...>]] [-c <FieldName, FieldName...>] [-id <IDField>] [-nh] [-o] [-s] [-sup] [sdFiles]
-l (list format) output all data fields for each record as processed
-t (tab format) tabulate selected fields for each record as processed
-c (csv format) comma delimited output of selected fields for each record as processed
-s (summary format) output summary statistics for each unique value of ligand ID
-sup (supplier format) tabulate supplier details (from Catalyst)
-id <IDField> data field to use as ligand ID
-nh don't output column headings in -t and -c formats
-o use old (v3.00) score field names as default columns in -t and -c formats, else use v4.00 field names
-norm use normalised score field names as default columns in -t and -c formats (normalised = score / #ligand heavy atoms)
Note
If -l, -t or -c are combined with -s, the listing/table is
output within each ligand summary. -sup should not be combined with
other options. Default field names for -t and -c are RiboDock score
field names. Default ID field name is Name. If sdFiles is not given,
reads from standard input. Output is to standard output.
sdsplit¶
sdsplit – Splits SD records into multiple files of equal size.
sdsplit [-<RecSize>] [-o <OutputRoot>] [sdFiles]
-<RecSize> record size to split into (default = 1000 records)
-o <OutputRoot> Root name for output files (default = tmp)
Note
If SD file list not given, reads from standard input.
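The splitting logic can be sketched as below. This is an illustration, not the sdsplit script: it works on a string in memory, relies on the standard `$$$$` SD record terminator, and drops any trailing text that is not terminated by `$$$$`. The function name and the toy records are hypothetical.

```python
def split_sd_records(sd_text, rec_size=1000):
    """Split SD text into chunks of at most rec_size records."""
    records, current = [], []
    for line in sd_text.splitlines(keepends=True):
        current.append(line)
        if line.strip() == "$$$$":  # SD record terminator
            records.append("".join(current))
            current = []
    # Group complete records into chunks; an unterminated tail is ignored.
    return ["".join(records[i:i + rec_size]) for i in range(0, len(records), rec_size)]


sd_text = "".join(f"mol{i}\n(molfile body omitted)\n$$$$\n" for i in range(5))
chunks = split_sd_records(sd_text, rec_size=2)
print([chunk.count("$$$$") for chunk in chunks])
```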
sdsort¶
sdsort – Sorts SD records by given data field.
sdsort [-n] [-r] [-f <DataField>] [sdFiles]
-n numeric sort (default is text sort)
-r descending sort (default is ascending sort)
-f <DataField> specifies sort field
-s fast mode. Sorts the records for each named compound independently (must be consecutive)
-id <NameField> specifies compound name field (default = 1st title line)
Note
_REC (record #) is provided as a pseudo-data field. If SD file list not
given, reads from standard input. Output is to standard output. Fast mode can
be safely used for partial sorting of huge SD files of raw docking hits
without running into memory problems.
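The idea behind fast mode can be sketched with itertools.groupby: records for each compound are sorted one group at a time, so only one group needs to be held in memory. This is an illustrative sketch, not the sdsort implementation; the record dictionaries and key names are hypothetical.

```python
from itertools import groupby


def fast_sort(records, name_key, sort_key, reverse=False):
    """Yield records sorted within each run of consecutive same-name records."""
    # groupby only merges consecutive records, hence the "must be consecutive" rule.
    for _, group in groupby(records, key=name_key):
        yield from sorted(group, key=sort_key, reverse=reverse)


records = [
    {"name": "A", "score": -10.0},
    {"name": "A", "score": -25.0},
    {"name": "B", "score": -5.0},
    {"name": "B", "score": -30.0},
]
result = list(fast_sort(records, lambda r: r["name"], lambda r: r["score"]))
print([(r["name"], r["score"]) for r in result])
```

The compound order of the input is preserved; only the poses within each compound are reordered.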
sdmodify¶
sdmodify – Script to set the first title line equal to a given data field.
sdmodify -f <DataField> [sdFiles]
Note
If sdFiles is not given, reads from standard input. Output is to standard
output.
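What the script does to a single record can be sketched as follows. This is a simplified illustration, not the sdmodify code: it assumes the usual SD layout in which a data header line such as `>  <Name>` is followed by the value on the next line, and it leaves the record unchanged when the field is missing. The abbreviated record is hypothetical.

```python
def set_title_from_field(record_lines, field):
    """Return a copy of one SD record with line 1 replaced by the field value."""
    lines = list(record_lines)
    for i, line in enumerate(lines[:-1]):
        # A data header looks like ">  <FieldName>"; its value is on the next line.
        if line.startswith(">") and f"<{field}>" in line:
            lines[0] = lines[i + 1]
            break
    return lines


record = ["", "  (header line)", "", "M  END", ">  <Name>", "ligand_42", "", "$$$$"]
print(set_title_from_field(record, "Name")[0])
```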
rbhtfinder¶
rbhtfinder – Script that simulates the result of a high-throughput
protocol.
1st) Run an exhaustive docking of a small representative part of the
whole library.
2nd) Store the result of sdreport -t over that exhaustive docking
in a file that will be the input of this
script.
3rd) rbhtfinder <sdreport_file> <output_file> <thr1max> <thr1min> <ns1> <ns2>
<ns1> and <ns2> are the number of steps in stage 1 and in
stage 2. If not present, the default values are 5 and 15.
<thr1max> and <thr1min> set up the range of thresholds that will
be simulated in stage 1. The threshold of stage 2 depends
on the value of the threshold of stage 1.
An input of -22 -24 will try protocols:
5 -22 15 -27
5 -22 15 -28
5 -22 15 -29
5 -23 15 -28
5 -23 15 -29
5 -23 15 -30
5 -24 15 -29
5 -24 15 -30
5 -24 15 -31
The output of the program is a table with 7 columns. The first column
represents the time, as a percentage of the time it
would take to do the docking in exhaustive mode, i.e.
docking each ligand 100 times. Anything
above 12 is too long.
The second column is the first percentage: the percentage of
ligands that pass the first stage.
The third column is the second percentage: the percentage of
ligands that pass the second stage.
The last four columns represent the protocol.
All the protocols tried are written at the end.
The ones for which time is less than 12%, perc1 is
less than 30% and perc2 is less than 5% but greater than 1%
are followed by a series of *** to indicate that they are good choices.
WARNING! This is a simulation based on a small set.
The numbers are an indication, not factual values.
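The protocol grid and the selection criteria described above can be reproduced with a short sketch. It is not the rbhtfinder code: the stage 2 offsets of 5, 6 and 7 below the stage 1 threshold are inferred from the example table only, integer thresholds are assumed, and both function names are hypothetical.

```python
def candidate_protocols(thr1max, thr1min, ns1=5, ns2=15):
    """List (ns1, thr1, ns2, thr2) tuples in the order shown in the example.

    Assumption: stage 2 thresholds are thr1 - 5, thr1 - 6 and thr1 - 7,
    inferred from the example table; integer thresholds only.
    """
    protocols = []
    for thr1 in range(thr1max, thr1min - 1, -1):
        for offset in (5, 6, 7):
            protocols.append((ns1, thr1, ns2, thr1 - offset))
    return protocols


def is_good_choice(time_pct, perc1, perc2):
    """Criteria for the *** marker as stated in the text above."""
    return time_pct < 12 and perc1 < 30 and 1 < perc2 < 5


for protocol in candidate_protocols(-22, -24):
    print(*protocol)
```

Called with -22 and -24, the loop prints the same nine protocols as the example table.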
An example file would look as follows:
# 3 steps as the running filters (set by the "3" in next line)
3
if - -10 SCORE.INTER 1.0 if - SCORE.NRUNS 9 0.0 -1.0,
if - -20 SCORE.INTER 1.0 if - SCORE.NRUNS 14 0.0 -1.0,
if - SCORE.NRUNS 49 0.0 -1.0,
# 1 writing filter (defined by the "1" in next line)
1
- SCORE.INTER -10,
In other (more understandable) words, rDock first runs 3 consecutive steps:
1. Run 10 runs and check whether SCORE.INTER is lower than -10. If so:
2. Run 5 more runs (15 runs in total) to see whether SCORE.INTER reaches -20. If so:
3. Run up to 50 runs to freely sample the different conformations the molecule displays.
Second, for the output, write only those poses where SCORE.INTER is better than -10 (to avoid excessive output).
rblist¶
rblist – Output interaction center info for ligands in SD file (with optional autoionisation).
rblist -i <InputSDFile> [-o <OutputSDFile>] [-ap] [-an] [-allH]
-i <InputSDFile> - input ligand SD file
-o <OutputSDFile> - output SD file with descriptors (default = no output)
-ap - protonate all neutral amines, guanidines, imidazoles (default = disabled)
-an - deprotonate all carboxylic, sulphur and phosphorous acid groups (default = disabled)
-allH - read all hydrogens present (default = polar hydrogens only)
-tr - rotate all secondary amides to trans (default = leave alone)
-l - verbose listing of ligand atoms and rotatable bonds (default = compact table format)