Help for using the HDOCK server


1. How to provide input for docked molecules

The HDOCK server predicts the binding complexes between two molecules, such as proteins and nucleic acids, using a hybrid docking strategy. Users therefore need to provide input for the two molecules to be docked. The HDOCK server can accept four types of input for molecules:

Only ONE type of input is needed for each molecule.

If more than one type of input is provided, the first one will be used. For the "PDB ID:ChainID" input, the user can provide a single chain ID or multiple chain IDs. For example, "1CGI:E" stands for chain E of the PDB file 1CGI; "1AHW:AB" stands for chains A and B of the PDB file 1AHW. If only a sequence is provided, the server will automatically construct a model structure from a homologous template in the Protein Data Bank using an in-house modeling pipeline of HH Suite, Clustalw2, and MODELLER. In addition, users are recommended to submit their own PDB file if the protein contains multiple chains, as our pipeline is currently designed to model single-chain proteins.
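
As an illustration of the "PDB ID:ChainID" convention described above, the following minimal Python sketch splits such a string into its PDB ID and chain IDs. The function name is hypothetical, and the sketch assumes a 4-character PDB ID and single-character chain IDs; it is not the server's own parser.

```python
def parse_pdb_chain_spec(spec):
    """Split a 'PDBID:Chains' string into (pdb_id, [chain, ...]).

    Hypothetical helper; assumes a 4-character PDB ID and
    single-character chain IDs, as in "1CGI:E" or "1AHW:AB".
    """
    pdb_id, sep, chains = spec.strip().partition(":")
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"expected a 4-character PDB ID, got {pdb_id!r}")
    if not sep or not chains:
        raise ValueError("expected at least one chain ID after ':'")
    # Each character after the colon is treated as one chain ID.
    return pdb_id.upper(), list(chains)


print(parse_pdb_chain_spec("1CGI:E"))   # ('1CGI', ['E'])
print(parse_pdb_chain_spec("1AHW:AB"))  # ('1AHW', ['A', 'B'])
```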

NOTE: For docking efficiency, if one molecule is much larger than the other, it is recommended to input the larger molecule as the receptor.
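
One simple way to follow this recommendation before submitting is to compare the sizes of the two structures locally. The sketch below uses the number of ATOM records as the size measure, which is an assumption made here for illustration; the function names are hypothetical.

```python
def count_atoms(pdb_text):
    """Count ATOM records in PDB-format text."""
    return sum(1 for line in pdb_text.splitlines() if line.startswith("ATOM"))


def assign_roles(pdb_a, pdb_b):
    """Return (receptor, ligand), choosing the larger molecule as receptor.

    "Larger" is measured by ATOM record count, an assumption of this sketch.
    """
    if count_atoms(pdb_a) >= count_atoms(pdb_b):
        return pdb_a, pdb_b
    return pdb_b, pdb_a
```

The two arguments are the full text of each PDB file, for example as read with open(path).read().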

Molecular Type:
"Select a type" is not needed for structure input, as the HDOCK server is able to determine the molecular type from the input structure. However, for sequence input, users are strongly recommended to select a molecular type; otherwise, the server will guess one of "Protein", "ssRNA", or "dsDNA" based on the input sequence.
Here are the definitions of different molecular types:

Type             Description
Protein         Standard protein molecule
ssRNA           General single-chain RNA molecule
ssDNA           General single-chain DNA molecule
dsDNA           Double-stranded B-DNA duplex molecule
dsRNA           Double-stranded A-RNA duplex molecule
where the maximum input sequence length is 500 for double-stranded (ds) RNA/DNA molecules.
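
The sketch below shows why selecting a molecular type explicitly is safer than relying on a guess. The alphabet-based heuristic is purely hypothetical and is not the rule used by the server; only the three candidate types and the length limit of 500 come from the text above.

```python
MAX_DS_LENGTH = 500  # limit quoted above for dsRNA/dsDNA sequence input


def guess_molecule_type(sequence):
    """Hypothetical heuristic: guess a type from the sequence alphabet."""
    letters = set(sequence.upper())
    if not letters:
        raise ValueError("empty sequence")
    if letters <= set("ACGU") and "U" in letters:
        return "ssRNA"
    if letters <= set("ACGT"):
        return "dsDNA"
    return "Protein"


def check_length(sequence, molecule_type):
    """Reject double-stranded input longer than the documented limit."""
    if molecule_type in ("dsDNA", "dsRNA") and len(sequence) > MAX_DS_LENGTH:
        raise ValueError(
            f"{molecule_type} sequence has length {len(sequence)}, "
            f"above the limit of {MAX_DS_LENGTH}"
        )
```

Note that a short peptide made only of alanine, cysteine, glycine, and threonine (A, C, G, T) would be misread as DNA by any alphabet-based guess, which is one reason to select the type by hand.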

2. RNA/DNA 3D structure modeling

The HDOCK server now accepts sequence input for single-stranded (ss) or double-stranded (ds) RNA/DNA. Only the sequence of a single strand is needed. The input can contain the sequence alone, like this
>example
GGAGCGGUAGUUCAGUCGGUUAGAAUACCUGCCUGUCACGCAGGGGGUCGCGGGUUCGAGUCCCGUCCGUUCCGCCA
or both the sequence and its secondary structure for single-stranded (ss) RNA/DNA like this
>example
GGAGCGGUAGUUCAGUCGGUUAGAAUACCUGCCUGUCACGCAGGGGGUCGCGGGUUCGAGUCCCGUCCGUUCCGCCA
(((((((..((((.........))))((((.(((((...))))))))).(((((.......))))))))))))....
HDOCK will then build the 3D structure from the single sequence, or model a double-stranded 3D duplex structure by constructing a complementary Watson-Crick paired second strand.
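
The two ideas above, a Watson-Crick complementary strand and a dot-bracket secondary structure, can be sketched in Python as follows. This is only an illustration for checking input before submission, not the modeling code used by HDOCK; the function names are hypothetical, and writing the second strand in the 5'->3' direction is an assumption of this sketch.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complementary_strand(sequence, rna=False):
    """Return the Watson-Crick partner strand, written 5'->3'."""
    table = RNA_COMPLEMENT if rna else DNA_COMPLEMENT
    # Reverse the input so the antiparallel partner is also read 5'->3'.
    return "".join(table[base] for base in reversed(sequence.upper()))


def base_pairs(dot_bracket):
    """Return the 0-based (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


print(complementary_strand("GGAU", rna=True))  # AUCC
print(base_pairs("((..))"))                    # [(0, 5), (1, 4)]
```

A quick check worth doing before submitting is that the dot-bracket line has the same length as the sequence line and that base_pairs raises no error.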

3. How to specify the binding site (optional)

The HDOCK server performs global docking to predict the binding complexes between two molecules. Therefore, no information about the binding site is required for a docking job. However, the server also gives users the option to specify the binding site residues if such information is available, so that the predicted models have a higher accuracy. Two types of binding site information can be provided.

4. SAXS experimental data curve

The small-angle X-ray scattering (SAXS) experimental data can be provided as a post-docking filter for ranking the binding modes predicted by the HDOCK docking. The SAXS data file contains three columns, q, I(q), and error, like this
        0.0000E+00  1.4612E+07  3.0685E+03
        1.0000E-03  1.4743E+07  4.8653E+03
        2.0000E-03  1.4827E+07  7.3394E+03
        3.0000E-03  1.4685E+07  1.0573E+04
        4.0000E-03  1.4674E+07  1.3206E+04
        5.0000E-03  1.4659E+07  1.5831E+04
        6.0000E-03  1.4729E+07  1.5466E+04
        7.0000E-03  1.4707E+07  1.7649E+04
        8.0000E-03  1.4594E+07  2.3642E+04
        9.0000E-03  1.4787E+07  2.8835E+04
With the SAXS experimental curve, the binding models will be ranked according to a weighted score that combines the docking energy score calculated by our scoring function with the CHI value, which measures how well the predicted binding modes fit the SAXS experimental data.
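
The following Python sketch reads the three-column format shown above and computes a goodness-of-fit value against a model profile. The chi definition used here (square root of the mean squared, error-weighted residual after a least-squares scale factor) is a common convention assumed for illustration; the exact CHI definition and the weights used by the server are not stated here. The function names are hypothetical.

```python
import math


def parse_saxs(text):
    """Parse 'q  I(q)  error' rows into a list of (q, intensity, error)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        try:
            rows.append(tuple(float(x) for x in fields))
        except ValueError:
            continue  # skip non-numeric lines such as column headers
    return rows


def chi(experimental, computed):
    """Chi between measured rows and model intensities at the same q values.

    Assumed definition: sqrt(mean(((I_exp - c * I_calc) / error)^2)),
    with c the least-squares scale factor.
    """
    pairs = list(zip(experimental, computed))
    num = sum(i * ic / s ** 2 for (_, i, s), ic in pairs)
    den = sum(ic * ic / s ** 2 for (_, _, s), ic in pairs)
    c = num / den
    total = sum(((i - c * ic) / s) ** 2 for (_, i, s), ic in pairs)
    return math.sqrt(total / len(pairs))
```

Here computed is a list of model intensities evaluated at the same q values as the experimental rows, for example taken from a profile calculated by a third-party program.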

5. Post-docking process (optional)

This step is for advanced users who want to obtain more than 100 predicted complex models or to filter the docked complex models with their own experimental information. The downloaded package contains an HDOCK output file, named like hdock_5c984053e4b83.out, that includes all 4392 docking solutions like this
Grid spacing:     1.200
Angle step:    15.000
Initial rotation:     0.00000   0.00000   0.00000
1CGI_r_b.pdb      23.562    26.523    22.675
1CGI_l_b.pdb      47.776    34.961    33.826
   1.27246   0.01055   5.02167    -0.328    -0.164     0.264   -445.20      0.45      1.00
   2.80075   0.00162   3.49381    -0.286    -0.209     0.111   -444.37      0.38      1.00
   0.02137   0.00051  -0.00948    -0.267    -0.212     0.104   -444.28      0.36      1.00
   2.98094   0.00164   3.31735    -0.237    -0.259     0.116   -444.15      0.37      1.00
   3.04247   0.00300   3.25767    -0.340    -0.315     0.134   -442.80      0.49      1.00
   ...
where the first 5 lines have the following definitions
   The 1st line is the Grid spacing of three (x, y, z) translational degrees of freedom.
   The 2nd line is the Euler angle step for three rotational degrees of freedom.
   The 3rd line is the initial rotation of the ligand before docking (optional).
   The 4th line stands for the receptor file and its center of geometry.
   The 5th line is the ligand file and its center of geometry.
The predicted binding modes start from the 6th line. Each mode is represented by three translations, three rotations, its binding score, its RMSD from the initial ligand orientation, and the translational ID for the rotation.
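
For users who want to filter the solutions with their own scripts, the layout described above can be read with a short Python sketch like the one below. The class and function names are hypothetical, and the field order simply follows the description above (three translations, three rotations, score, RMSD, ID). Lines are classified by their shape, so the optional initial-rotation line may be absent.

```python
from dataclasses import dataclass


@dataclass
class Solution:
    translation: tuple  # first three columns, per the description above
    rotation: tuple     # next three columns
    score: float        # binding score
    rmsd: float         # RMSD from the initial ligand orientation
    trans_id: float     # translational ID for the rotation


def parse_hdock_out(text):
    """Return (header, molecules, solutions) from HDOCK .out file text."""
    header, molecules, solutions = {}, [], []
    for line in text.splitlines():
        fields = line.split()
        if ":" in line:
            # "Grid spacing:", "Angle step:", "Initial rotation:" lines
            key, _, value = line.partition(":")
            header[key.strip()] = tuple(float(x) for x in value.split())
        elif len(fields) == 4:
            # file name plus center of geometry; receptor first, then ligand
            molecules.append((fields[0], tuple(float(x) for x in fields[1:])))
        elif len(fields) == 9:
            v = [float(x) for x in fields]
            solutions.append(
                Solution(tuple(v[0:3]), tuple(v[3:6]), v[6], v[7], v[8])
            )
    return header, molecules, solutions
```

With the parsed list, a filter such as [s for s in solutions if s.rmsd < 0.4] keeps only the modes that satisfy a user-defined condition.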

Users can download our "createpl_linux" program and run it locally to generate complex models like this

	createpl_linux hdock_5c984053e4b83.out top100.pdb -nmax 100 -complex -models
where binding site residues or restraints can be applied to filter the complex models. Users can type
	createpl_linux
to see the detailed usage of the program.
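
When many output files need to be processed, the command above can be wrapped in a small Python script. The sketch below uses only the arguments that appear in the example command; the wrapper function names are hypothetical, and any other options should be taken from the usage text printed by the program itself.

```python
import shutil
import subprocess


def createpl_command(out_file, pdb_file, nmax=100, program="createpl_linux"):
    """Build the argument list for the createpl_linux call shown above."""
    return [program, out_file, pdb_file,
            "-nmax", str(nmax), "-complex", "-models"]


def run_createpl(out_file, pdb_file, nmax=100, program="createpl_linux"):
    """Run createpl_linux; raises if it is missing or exits with an error."""
    if shutil.which(program) is None:
        raise FileNotFoundError(f"{program} not found; pass its full path")
    subprocess.run(createpl_command(out_file, pdb_file, nmax, program),
                   check=True)
```

If the program is in the current directory rather than on the PATH, pass program="./createpl_linux".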

After generating the complex models, users may also use a third-party program like FoXS to calculate the SAXS CHI values of the models based on their small-angle X-ray scattering (SAXS) profile file.

6. Explanations of evaluation metrics