Validation experiments¶
In this section you will find the instructions about how to reproduce our validation experiments using different test sets. Three different sets were analyzed for three different purposes:
ASTEX set for binding mode prediction in proteins.
RNA set for assess RNA-ligand docking.
DUD set for database enrichment.
Binding mode prediction in proteins¶
First of all, please go to ASTEX set SourceForge download page to download a compressed file with the necessary data.
After downloading the file ASTEX_rDock_TestSet.tar.gz
, uncompress the file
with the following command, which will create a folder called
ASTEX_rDock_TestSet
:
$ tar -xvzf ASTEX_rDock_TestSet.tar.gz
$ cd ASTEX_rDock_TestSet/
Here you will have the instructions for one of the systems (1sj0
), to run
with the rest of the systems, just change the pdb code with the one desired.
Then, make sure that the necessary environmental variables for running rDock are
well defined and run the following commands for entering to the folder and
running rDock with the same settings that we have used:
$ cd 1sj0/
#first create the cavity using rbcavity
$ rbcavity -r 1sj0_rdock.prm -was > 1sj0_cavity.log
#then use rbdock to run docking
$ rbdock -r 1sj0_rdock.prm -p dock.prm -n 100 -i 1sj0_ligand.sd \
-o 1sj0_docking_out > 1sj0_docking_out.log
#sdsort for sorting the results according to their score
$ sdsort -n -f'SCORE' 1sj0_docking_out.sd > 1sj0_docking_out_sorted.sd
#calculate rmsd from the output comparing with the crystal structure of the ligand
$ sdrmsd 1sj0_ligand.sd 1sj0_docking_out_sorted.sd
Binding mode prediction in RNA¶
In a similar way of the section above, here you will find a brief tutorial on how to run rDock with the RNA TestSet used in the validation. As in the first section, please go to RNA set SourceForge download page to download a compressed file with the necessary data.
After downloading the file RNA_rDock_TestSet.tar.gz
, uncompress the file
with the following command, which will create a folder called
RNA_rDock_TestSet
:
$ tar -xvzf RNA_rDock_TestSet.tar.gz
$ cd RNA_rDock_TestSet/
Here you will have the instructions for one of the systems (1nem
), to run
with the rest of the systems, just change the pdb code with the one desired.
Then, make sure that the necessary environmental variables for running rDock are
well defined and run the following commands for entering to the folder and
running rDock with the same settings that we have used (if you have run the
previous set, the variables should already be correctly defined):
$ cd 1nem/
#first create the cavity using rbcavity
$ rbcavity -r 1nem_rdock.prm -was > 1nem_cavity.log
#then use rbdock to run docking
$ rbdock -r 1nem_rdock.prm -p dock.prm -n 100 -i 1nem_lig.sd \
-o 1nem_docking_out > 1nem_docking_out.log
#sdsort for sorting the results according to their score
$ sdsort -n -f'SCORE' 1nem_docking_out.sd > 1nem_docking_out_sorted.sd
#calculate rmsd from the output comparing with the crystal structure of the ligand
$ sdrmsd 1nem_lig.sd 1nem_docking_out_sorted.sd
Database enrichment (actives vs. decoys – for HTVS)¶
In this section you will find a brief tutorial on how to run rDock with the DUD test set used in the validation and how to perform different analysis of the results. As in the sections above, please go to DUD set SourceForge download page to download a compressed file with the necessary data.
After downloading the file DUD_rDock_TestSet.tar.gz
, uncompress the file
with the following command, which will create a folder called
DUD_rDock_TestSet
:
$ tar -xvzf DUD_rDock_TestSet.tar.gz
$ cd DUD_rDock_TestSet/
Here you will have the instructions for one of the systems (hivpr
), to run
with the rest of the systems, just change the DUD system code with the one
desired. Then, make sure that the necessary environmental variables for running
rDock are well defined and run the following commands for entering to the folder
and running rDock with the same settings that we have used (if you have run the
previous sets, the variables should already be correctly defined):
$ cd hivpr/
#first create the cavity using rbcavity
$ rbcavity -r hivpr_rdock.prm -was > hivpr_cavity.log
As the number of ligands to dock is very high, we suggest you to use any
distributed computing environments, such as SGE or Condor, and configure rDock
to run in multiple CPUs. Namely, split the input ligands file in as many parts
as desired (very easy using sdsplit
tool) and run independent rDock docking
jobs for each “splitted” input file. However, for this example purpose, you will
have the instructions for running all set of actives and decoys in one docking
job:
#uncompress ligand file
$ gunzip hivpr_ligprep.sdf.gz
#use rbdock to run docking
$ rbdock -r hivpr_rdock.prm -p dock.prm -n 100 -i hivpr_ligprep.sdf \
-o hivpr_docking_out > hivpr_docking_out.log
#sdsort with -n and -s flags will sort internally each ligand by increasing
#score and sdfilter will get only the first entry of each ligand
$ sdsort -n -s -fSCORE hivpr_docking_out.sd | sdfilter \
-f'$_COUNT == 1' > hivpr_1poseperlig.sd
#sdreport will print all the scores of the output in a tabular format and,
#with command awk, we will format the results
$ sdreport -t hivpr_1poseperlig.sd | awk '{print $2,$3,$4,$5,$6,$7}' > dataforR_uq.txt
At this point, you should have a file called hivpr_docking_out.sd
with all
docking poses written by rDock (100 * number of input ligands), a file called
hivpr_1poseperlig.sd
with the best scored docking pose for each ligand and a
file called dataforR_uq.txt
that will be used for calculating ROC curves
using R. The next step is to calculate ROC curves and other statistics. To do
so, please visit section Calculating ROC curves
and jump to the subsection “R commands for generating ROC curves”.