EVA home   EVA e-mail   EVA mirrors   -   Secondary structure   Comparative modelling   Threading   Contacts
 
Eva-CM, a web server for CONTINUOUS and AUTOMATED evaluation of protein structure prediction servers, works by continually taking new structures from the Protein Data Bank (PDB), submitting their sequences to the participating modeling servers, collecting the returned predictions, evaluating them, and displaying the results on the web.

CRITERIA:

The intent for EVA is to have a small number of simple criteria, arranged hierarchically from coarse to fine, that measure the main aspects of comparative modeling: fold recognition, alignment, overall accuracy, core accuracy, loop accuracy, and sidechain accuracy. A second intent is to choose criteria whose distributions, as determined by EVA, can be used to predict the accuracy of a comparative model when the actual structure is not known.

1) Does the model have the correct fold?

The model has the correct fold when a structure-based alignment of the model with the actual target structure aligns at least 30% of the residues. The alignment is defined by the least-squares superposition that maximizes the subset of the CE equivalences (Shindyalov, I.N., Bourne, P.E. 1998. Protein Eng. 11, pp. 739-747) corresponding to pairs of identical Calpha atoms within 3.5A of each other. It is obtained by applying the MODELLER "SUPERPOSE REFINE_FIT = on" command to the CE model-target alignment with a Calpha-Calpha distance cutoff of 3.5A, and then removing the equivalences between non-identical residues from the refined list. Two residues are identical when they occupy the same position in the sequence.
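
As a rough illustration, the counting step of this criterion can be sketched in Python. This is not the MODELLER procedure itself: it assumes the coordinates are already superposed, and the function names, the data layout, and the choice of the target length as the denominator for the 30% threshold are assumptions made only for this example.

```python
import math

CUTOFF = 3.5         # Calpha-Calpha distance cutoff in Angstroms (from the text)
MIN_FRACTION = 0.30  # minimum fraction of aligned residues (from the text)


def aligned_residues(model_ca, target_ca, equivalences, cutoff=CUTOFF):
    """Return the equivalences that count as aligned.

    model_ca, target_ca: dicts mapping sequence position -> (x, y, z),
    assumed to be already superposed.
    equivalences: list of (model_position, target_position) pairs,
    for example taken from a structure-based alignment.
    """
    kept = []
    for i, j in equivalences:
        if i != j:
            continue  # drop equivalences between non-identical residues
        if math.dist(model_ca[i], target_ca[j]) <= cutoff:
            kept.append((i, j))
    return kept


def has_correct_fold(n_aligned, n_residues, min_fraction=MIN_FRACTION):
    """True when at least min_fraction of the residues are aligned.

    Using the target length as n_residues is an assumption of this sketch.
    """
    return n_aligned / n_residues >= min_fraction
```

Whether the cutoff is inclusive (<=) or strict (<) is not stated in the criterion; the sketch uses the inclusive form.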
 

2) If a model does not have the correct fold, did a modeling server fail because there is no related fold or because it did not find the correct fold?

In other words, is the modeled sequence related structurally to any of the known protein structures? The answer is obtained by using CE to determine whether the actual target structure has at least 30% of its residues aligned within 3.5A with any of the known structures. This additional fold assignment qualification penalizes a modeling server that returns a model for a sequence unrelated to any of the known structures.
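
The decision described in points 1 and 2 can be summarized in a short Python sketch. The function name and the returned labels are hypothetical and exist only for illustration; both inputs are fractions of aligned residues as defined in the text.

```python
def classify_model(model_fraction, best_known_fraction, threshold=0.30):
    """Classify one returned model.

    model_fraction: fraction of residues aligned between model and target.
    best_known_fraction: largest fraction of target residues aligned with
    any known structure (the CE check of point 2).
    """
    if model_fraction >= threshold:
        return "correct fold"
    if best_known_fraction >= threshold:
        # A related structure existed, but the server did not find it.
        return "fold missed: related known structure exists"
    # No related structure existed, yet the server still returned a model.
    return "no related known structure"
```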

3) Difficulty of the modeling case.

The difficulty level is obtained by comparing the modeled sequence with the sequence of the structurally most similar known structure. The most similar known structure is the one with the largest number of residues aligned with the target structure, as defined in point 2. Once this closest template is found (by a CE search), the difficulty level is based on a comparison of the pairwise alignment derived from the structure alignment with the pairwise sequence alignment produced by the ALIGN command in MODELLER.
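
The template selection can be sketched as below. The text does not say how the two alignments are compared, so the overlap measure in the second function (the fraction of structure-derived residue pairs that the sequence alignment reproduces) is only one plausible choice and is an assumption of this example, as are all names.

```python
def closest_template(aligned_counts):
    """Pick the known structure with the most residues aligned to the target.

    aligned_counts: dict mapping a structure identifier -> number of
    residues aligned with the target structure.
    """
    return max(aligned_counts, key=aligned_counts.get)


def alignment_overlap(structure_pairs, sequence_pairs):
    """Fraction of structure-derived pairs also present in the sequence alignment.

    Both arguments are iterables of (target_position, template_position).
    This measure is an assumption; the source does not define the comparison.
    """
    structure_pairs = set(structure_pairs)
    if not structure_pairs:
        return 0.0
    return len(structure_pairs & set(sequence_pairs)) / len(structure_pairs)
```

Under this assumed measure, a low overlap would indicate a case where sequence alignment alone recovers little of the structural alignment.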

4) Alignment accuracy.

Alignment accuracy is the fraction of aligned residues defined in (1). Note that this does not measure the accuracy of the input alignment, but the fraction of the residues that are modeled "correctly", some of them possibly because loop modeling worked well. 

5) Overall accuracy.

The RMSD and DRMS errors are calculated separately for all pairs of identical Calpha, main-chain, side-chain, and all atoms from the model and the target, so four values of each are reported. Each RMSD is calculated after a least-squares superposition of the atoms that are compared, using the MODELLER SUPERPOSE command.
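
For readers unfamiliar with the two measures, here is a minimal Python sketch using their usual textbook definitions. It is not MODELLER's implementation: the RMSD function assumes the two coordinate sets are already superposed (the least-squares fit is not shown), and the DRMS normalization by the number of atom pairs is an assumption.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """Root-mean-square deviation between two superposed coordinate lists."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def drms(a, b):
    """Distance RMS: compares intramolecular distances, so no superposition is needed."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two coordinate lists of equal length >= 2")
    pairs = list(combinations(range(len(a)), 2))
    total = sum(
        (math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2 for i, j in pairs
    )
    return math.sqrt(total / len(pairs))
```

A rigid shift of one structure changes the RMSD of unsuperposed coordinates but leaves the DRMS at zero, which is why the two measures are reported together.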

6) Accuracy of the correctly aligned positions.

The RMSD and DRMS errors are calculated separately for all pairs of aligned Calpha, mainchain, sidechain, and all atoms from the model and the target. The aligned residues are defined in (1), thus the same comment as in (4) also applies here. Each RMSD is calculated for a least-squares superposition of the atoms that are compared.

7) Accuracy of the protein core.

The core region is defined by the target residues that are buried or in secondary structure according to MODELLER (WRITE_DATA command). A residue is buried when less than 20% of its surface area is exposed. The RMSD and DRMS are calculated for core Calpha atoms, mainchain atoms, sidechain atoms, and all atoms. Rotamer classes are also compared (chi1, chi2, chi3, chi4, as well as chi1+2). The MODELLER COMPARE command is used. The rotamer results are reported separately for the 20 residue types.
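
The core definition reduces to a simple per-residue test, sketched below in Python. The function names and the input layout are hypothetical; in EVA the exposure and secondary-structure data come from MODELLER, which this sketch does not call.

```python
def is_core(fraction_exposed, in_secondary_structure, buried_cutoff=0.20):
    """A residue is core when it is buried or in secondary structure.

    fraction_exposed: exposed fraction of the residue's surface area (0.0 to 1.0).
    Buried means strictly less than 20% exposed, as stated in the text.
    """
    buried = fraction_exposed < buried_cutoff
    return buried or in_secondary_structure


def core_flags(residues):
    """residues: list of (fraction_exposed, in_secondary_structure) tuples."""
    return [is_core(exposed, in_ss) for exposed, in_ss in residues]
```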

8) Accuracy of the non-core regions.

Each non-core region, or "loop", is treated separately. For each loop, local and global RMSDs are calculated for the Calpha, main-chain, side-chain, and all atoms of the model and the target. The global superposition is defined as the least-squares superposition of the Calpha atoms aligned as defined in (1).
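
A possible way to split a chain into loops and compute a global RMSD per loop is sketched below. It assumes a list of per-residue core flags and Calpha coordinates that are already in the global superposition; the names are hypothetical. The local RMSD is not shown because the text does not define the local superposition (presumably a fit on the loop atoms alone).

```python
import math


def loop_segments(core_flags):
    """Return (start, end) index pairs, end exclusive, for each run of non-core residues."""
    segments = []
    start = None
    for i, core in enumerate(core_flags):
        if not core and start is None:
            start = i                      # a loop begins here
        elif core and start is not None:
            segments.append((start, i))    # the loop ended at the previous residue
            start = None
    if start is not None:
        segments.append((start, len(core_flags)))  # loop runs to the chain end
    return segments


def global_loop_rmsd(model_ca, target_ca, segment):
    """RMSD over one loop, using coordinates already in the global superposition."""
    start, end = segment
    squared = [math.dist(model_ca[i], target_ca[i]) ** 2 for i in range(start, end)]
    return math.sqrt(sum(squared) / len(squared))
```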

10) Stereochemical quality of the model.

Several stereochemical criteria are also tabulated, including the Procheck criteria for Phi/Psi values and H-bonds, as well as the stereochemical g-factor.

11) Contribution of a model.

All returned models are assessed and they contribute equally to the server average.
 

The Eva Team

