EVA: EValuation of Automatic protein structure prediction
Objectives
CASP addresses the question "how well can experts predict protein structure if given sufficient incentive to do so?" In contrast, the question addressed by EVA is "how well could molecular biologists predict protein structure if they simply took the output from the programs out there?" Thus, the goals are:
- Provide a continuous, fully automated, and statistically significant analysis of structure prediction servers.
- As many of us have shown, accuracy estimates based on small numbers of samples are NOT representative. EVA running for a year could produce a fairly representative picture. Even after running for a month, EVA could produce more reliable estimates than CASP can in 2 years (at least for the particular, restricted but important, question of how well servers do).
- EVA will NOT answer requests from users! It will NOT be a meta-server; rather, it will simply sit there and evaluate servers based on known structures.
- EVA will NOT evaluate any server without the consent of the author. (Of course, the hope is that most of you to whom this message goes would co-operate.)
- We are seriously concerned about a negative aspect of the freedom of the Web: any newcomer can spend a day hacking out a program that predicts 3D structure and put it on the Web, and it will be used.
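The sample-size argument in the goals above can be illustrated with a minimal sketch. It assumes that per-protein accuracy scores are independent and have a standard deviation of about 10 percentage points (a hypothetical figure, not an EVA result), and that about 20 targets arrive per week. The function name and all numbers are illustrative assumptions only.

```python
import math


def standard_error(per_protein_sd: float, n_targets: int) -> float:
    """Standard error of a mean accuracy estimated from n_targets independent proteins."""
    if n_targets <= 0:
        raise ValueError("n_targets must be positive")
    return per_protein_sd / math.sqrt(n_targets)


ASSUMED_SD = 10.0       # hypothetical spread of per-protein accuracy (percentage points)
TARGETS_PER_WEEK = 20   # assumed rate of new targets

for label, weeks in [("one week", 1), ("one month", 4), ("one year", 52)]:
    n = TARGETS_PER_WEEK * weeks
    se = standard_error(ASSUMED_SD, n)
    print(f"{label}: n={n}, standard error of the mean ~ {se:.2f}")
```

Under these assumptions the uncertainty of the average shrinks with the square root of the number of targets, which is why a continuously running evaluation could separate servers that a small test set cannot.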
Technical aspects
- Repeat: no use of EVA upon request, only for evaluation purposes.
- Targets: provided by the PDB pre-release, i.e. 20+ per week.
- Update of evaluation data: once a month (or maybe on a running-average basis).
- The detail of the data displayed remains to be discussed (clearly, we could profit from the lessons learned by the Alamos pioneers: Fidelis et al., here!).
- The particular programs used for the evaluation remain to be discussed. We hope that the decision on which software to use for evaluation would be supported by the community.
- Anybody who feels that she or he has a better method for evaluation could make this method available to EVA, and it will be used in addition from that moment on. (There may be some mechanism to decide after a year that some of the results could be left out from then on, since they do NOT add much to the overall evaluation procedure.)
- Any developer will have the chance at any point to add (any) comment regarding her (or his) server's output, linked to the EVA results of that server.
- There will be a public forum where users (and developers) can deposit their statements about any result (i.e. developers can talk about the results of others, here).
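The running-average option mentioned for updating the evaluation data could look roughly like the following sketch. It keeps a cumulative mean that is updated as each new target is scored, so no earlier scores need to be stored. The class name, its fields, and the example scores are hypothetical; the actual update scheme remains to be decided.

```python
from dataclasses import dataclass


@dataclass
class RunningMean:
    """Cumulative mean of per-target scores for one server (hypothetical helper)."""
    count: int = 0
    mean: float = 0.0

    def update(self, score: float) -> float:
        # Incremental update: new_mean = old_mean + (score - old_mean) / n
        self.count += 1
        self.mean += (score - self.mean) / self.count
        return self.mean


tracker = RunningMean()
for score in [70.0, 80.0, 90.0]:  # made-up per-target accuracies
    tracker.update(score)

print(tracker.count, tracker.mean)  # prints: 3 80.0
```

A windowed average (for instance, over the most recent targets only) would be an alternative if recent server versions should weigh more than old ones.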
Methods covered
In its first implementation, EVA should address the following fields of protein structure prediction:
- Predictions of protein structure in 1D (secondary structure, solvent accessibility)
- Predictions of protein structure in 2D (inter-residue distances)
- Predictions of protein structure in 3D (homology modelling)
- Predictions of protein structure in "3D-", i.e. threading results that may imply 3D predictions but never actually go the entire way, and may be restricted to stating "A is similar to B".
- Predictions of novel folds.
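As an illustration of what an automated evaluation of the 1D category might compute, here is a minimal sketch of three-state per-residue accuracy (commonly called Q3), a widely used measure for secondary structure prediction. Which measures EVA would actually apply remains to be discussed, so this is only an example; the function name and the two short state strings are made up.

```python
def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state (H, E or C) equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Made-up example: H = helix, E = strand, C = other
observed = "CHHHHEEECC"
predicted = "CHHHCEEEEC"
print(q3(predicted, observed))  # prints: 80.0
```

Averaging such per-protein values over many targets would give the per-server figure that the evaluation reports.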
Soon after EVA is running, we hope to extend it to cover other aspects of protein structure as well, such as membrane regions, signal peptides, cleavage sites, and maybe structural/functional motifs (others to be added to the list).
Framework
EVA could physically be located in one particular place, or could be shared between the various labs involved in the effort. Currently, we favour running it on one particular machine and installing various WWW mirrors for the output. It may be reasonable to request some financial help to bring it online. At this moment, we envision that EVA would be realised by a few people who are actually willing to spend resources on the issue. Apart from that, we hope that there will be a slightly larger group of people on the board of the service, i.e. people who contribute their experience, their tools, their ideas, and their support.