Site Map | Search ExPASy | Contact us |
Hosted by NCSC US | Mirror sites: | Canada | China | Korea | Switzerland | Taiwan |
An experimental unified interface to query several protein identification tools accessible on the web.
Introduction
How to use CombSearch
Detailed interface description
Program-specific features
Known limitations
FAQ
Acknowledgements
What is CombSearch?
CombSearch is an attempt to provide a unified interface to query several protein
identification tools accessible on the web. Currently it includes PeptIdent, TagIdent and MultiIdent from ExPASy, MS-Fit from ProteinProspector, Mowse from the UK Human Genome Mapping Project Resource Centre, ProFound from PROWL and PeptideSearch
from the Bioanalytical Research Group at EMBL.
The aim in creating CombSearch was to allow researchers to fill in only one form and query
several programs and databases.
What is CombSearch NOT?
CombSearch is NOT a new program for searching databases. It only combines the necessary
information in order to call several protein identification tools on the web. It also
makes no judgement as to the accuracy of the data provided or of the search results.
A CombSearch query is composed of two pages.
On the first page (phase 1) you are asked to enter the information you have on your
protein, such as MW, pI, masses of peptide fragments after digestion, AA composition,
a short sequence tag, etc. Then you press "Analyse".
CombSearch looks through all the data and determines which program can be called. It will
then present you with a second page (phase 2) where all the programs supported by
CombSearch are listed. On this page you will see a marked checkbox next to the program
name, if there is enough information to call it. Minimal data requirements are also listed
here. At this point it is possible to select some program specific options, for example
the database to be used. It is also possible to deselect the programs you don't need. Once
you are satisfied with your choice click on "Query now" to perform it.
The last page will give you a summary of programs you have selected to query and whether
they could be successfully contacted. CombSearch has to call all the programs and
depending on Internet traffic and individual engine workload it may take some time to get
results, so be patient. Waiting times of up to 5 minutes are quite normal if you query all
the programs.
Results.
Responses from each engine are sent separately by e-mail and the results will contain the
name of your query and the name of the program. We strongly suggest filling in the
"Sample name" field; otherwise you could easily confuse
unlabelled responses in your mailbox. Depending on the network traffic and
the workload of the programs, e-mails will arrive with some delay.
CombSearch first page (phase 1)
This is the main page, which asks you to fill in an interactive form with the data
needed to initialize your query. In the table below you will find a description of
the different fields and the data they require.
Field name: | Description: | Required by, Used by: | Value limit (default): |
Sample name | Name for your query | PeptIdent, MS-Fit, Mowse, ProFound, MultiIdent, TagIdent. | |
Organism Species or Organism Classification | Used for OS/OC field of SWISS-PROT db or converted to corresponding fields of MS-Fit and ProFound. | PeptIdent, MS-Fit, ProFound, MultiIdent, TagIdent. | |
Keyword | Used for KW field of SWISS-PROT db. | MultiIdent, TagIdent. | |
Protein Mass (MW) | Mass of whole protein in [Da]. Leave empty to cover whole possible range. | ProFound, PeptideSearch, PeptideSearch Edman, TagIdent. PeptIdent, MS-Fit, Mowse, MultiIdent. | 0 - 1000000 |
MW range | Range of MW in percent. | ProFound, PeptideSearch, PeptideSearch Edman, TagIdent. PeptIdent, MS-Fit, Mowse, MultiIdent. | 0 - 20 - 100 % |
pI (isoelectric point) Min-Max | Isoelectric point of the whole protein. Leave empty to cover the whole range. | TagIdent. PeptIdent, MS-Fit, MultiIdent. | 0 < pI < 14 |
Peptide Mass Fingerprinting |
|||
Cleavage agent | Enzyme or chemical reagent used to produce peptide fragments. | PeptIdent, MS-Fit, Mowse, ProFound, PeptideSearch, MultiIdent, PeptideSearch Edman. | Arg-C Asp-N Glu Chymotrypsin CNBr Glu-C (E) Lys-C Trypsin V8-bicarb (E) V8-phosph (E,D) |
Allow for # missed cleavage sites (MC) | Number of sites that are not cleaved by the enzyme or the chemical (thus producing incorrect size fragments). | MS-Fit. PeptIdent, ProFound, PeptideSearch. | 0 - 1 - 5 ? |
Cysteine | Modifications of cysteine, such as iodoacetylation, performed during sample preparation/analysis. If the program does not support a modification, free cysteine will be used instead. | PeptIdent, MS-Fit, ProFound, PeptideSearch, PeptideSearch Edman. | Acetamidomethyl-Cys Acrylamido-Cys Aedans-Cys Aminoethyl-Cys Benzyl-Cys __-dimethyl-Cys Carboxymethyl-Cys Carbamidomethyl-Cys Cys Cysteic Acid-Cys Pyridylethyl-Cys S-cysteinyl-Cys S-farnesyl-Cys S-palmityl-Cys |
Methionine | Whether methionine is oxidized or not. | PeptIdent, MS-Fit, PeptideSearch, MultiIdent, PeptideSearch Edman. | Normal Oxidized |
Number of peptide fragments required for protein match | Minimal number of matching peptide fragments to identify a protein. | PeptideSearch. PeptIdent, MS-Fit. | 0 - 5 - 20 |
Ion mode | Whether peptide fragments are in protonated or neutral form during the MS analysis. | PeptIdent, ProFound, PeptideSearch, MultiIdent. | Neutral [M] Protonated [M+H]+ |
Isotopic resolution | Whether monoisotopic or average masses were measured for peptide fragments on MS. | PeptIdent, MS-Fit, ProFound, PeptideSearch, MultiIdent, PeptideSearch Edman. | Average Monoisotopic |
within mass range | Accuracy in measuring peptide fragment masses in [Da] or ppm. | PeptIdent, MS-Fit, Mowse, PeptideSearch. ProFound, MultiIdent. | < 5 [Da] or > 0.0001 ppm |
List of peptide masses | Put here a list of peptide masses separated by spaces or newlines. All characters other than numbers will be ignored. | PeptIdent, MS-Fit, Mowse, ProFound, PeptideSearch. MultiIdent. | > 1 [Da] |
Single AA substitution / peptide | How many mutations are allowed per peptide, the minimal number of fragments with no AA substitutions required for a match, and in which direction a shift in fragment mass is allowed. All these fields are for MS-Fit; please check their page for a more detailed description. | MS-Fit. | |
Min. number of matches with no AA substitutions | (see above) | MS-Fit. | 0 - 1 - 4 |
Peptide Mass shift | (see above) | MS-Fit. | |
Amino Acid Composition |
|||
AA names | AA composition in molar percent. These fields are MultiIdent specific. If Gly field is filled, constellation 2 is used, otherwise constellation 4 is used. For more information check MultiIdent site. | MultiIdent. | the sum of all AA must not exceed 100% |
Tagging |
|||
Peptide Mass (neutral) within mass range. | Mass of the peptide fragment used for Edman sequencing and its accuracy. | PeptideSearch Edman. | 0 - 2000 < 5 [Da] or > 10 ppm |
Peptide sequence tag | Sequence of the peptide tag. All valid AA names are allowed and "x" can be used to designate any AA. MultiIdent and TagIdent only work with tags of 6 AA or smaller. | PeptideSearch Edman, TagIdent. MultiIdent. | PeptideSearch Edman tag = any length; MultiIdent, TagIdent tag <= 6 AA. |
Search for all possible tag permutations | Allows permutation in AA tag order | MultiIdent, TagIdent | |
Display only the sequences matching the tag. | Display a sequence in found proteins matching the tag. | TagIdent | |
Include scan of your tag against all fragments SWISS-PROT/TrEMBL | Allows you to perform a search of all proteins matching your tag and having the OS/OC field you specified. | TagIdent | |
Results format |
|||
Display the predicted N-terminal sequence of the protein | When checked, returned result will include a short (40 AA) sequence of the specified region of the protein. | MultiIdent, TagIdent. | N-terminal sequence C-terminal sequence (internal) area |
Number of results returned | How many results you want to be displayed. | PeptIdent, MS-Fit, ProFound, PeptideSearch, MultiIdent, PeptideSearch Edman. | 0 - 30 - 100 |
Your e-mail | Please note that CombSearch checks the general validity of the e-mail, so be sure to provide your correct e-mail address. | MultiIdent. PeptIdent, Mowse, TagIdent. | Any valid e-mail address. |
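The "List of peptide masses" field accepts free-form text in which only the numbers matter. As an illustration, the following Python sketch shows one way such a field could be read. It is not the actual CombSearch code: the function name `parse_peptide_masses` is hypothetical, and the sketch assumes that a decimal point (not a comma) is used and that masses not above the 1 [Da] limit are simply dropped.

```python
import re

def parse_peptide_masses(text, min_mass=1.0):
    """Extract peptide masses from free-form text (hypothetical helper).

    Numbers may be separated by spaces, newlines or any other characters;
    everything that is not part of a number is ignored.
    """
    # Match integers and decimals written with a decimal point.
    tokens = re.findall(r"\d+(?:\.\d+)?", text)
    masses = [float(token) for token in tokens]
    # Assumption: values not above the "> 1 [Da]" limit are discarded.
    return [mass for mass in masses if mass > min_mass]

masses = parse_peptide_masses("842.51 1045.56\n1179.60; 2211.10 Da")
print(masses)
```

Note that a mass written with a decimal comma (for example "842,51") would be split into two numbers by this sketch, so it is safer to use decimal points.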
CombSearch second page (phase 2)
On this page you have a list of the programs which can be used with the data you entered.
A marked checkbox appears next to the name of each program which has sufficient data
to function. There are also some fields for selecting additional program-specific
options, which are summarized in the table below:
Field name | Description | Program |
Database on which the scan should be performed | Database that will be used for your search by this engine. | MS-Fit, TagIdent |
Search for | Specify whether your sample contains a single protein or a binary mixture. | ProFound |
Type of Edman search | Specify the direction of the Edman sequencing. | PeptideSearch by sequence tag |
PeptIdent.
PeptIdent only requires a list of peptide masses and their error to function. The error can be either in [Da] or ppm and is
passed directly from the CombSearch form. Peptide fragment properties such as ion mode,
isotopic resolution, oxidized methionine and cysteine modifications are also supported.
At the moment PeptIdent allows digestion solely by Trypsin, and cannot be used if you
select another cleavage agent. The only limitation of CombSearch compared to the original
interface is that acrylamide adducts on cysteines cannot be combined with other
cysteine modifications (such as iodoacetamide) at the same time. PeptIdent will also make
use of protein mass, pI and organism species data if they are entered.
PeptIdent, as all ExPASy tools, has its own e-mail server and will send you results
directly.
MS-Fit.
Minimal requirements for MS-Fit are a list of peptide masses, their error and the minimum number of peptides for a
match. The error will be passed from the CombSearch form as [Da] or ppm (% and mmu are not
supported). You will be able to choose the database on the second page of the CombSearch
query. The species to be used will be translated from the "Organism species"
field according to this table. Ion mode ([M]+
or [M+H]+), isotopic resolution, oxidized methionine and cysteine modifications
will be passed from the CombSearch form. The following values will be set by default: DNA
Frame translation = 3, N Terminus = Hydrogen, C Terminus = Free Acid, Possible
Modifications Mode = Peptide N-terminal Gln to pyroGlu, Protein N-terminus Acetylated,
Phosphorylation of S, T and Y, Report MOWSE Scores = yes, Pfactor = 0.4. Combined
digestions such as "CNBr/Trypsin" are not supported. MS-Fit will also use MW and pI
data.
MS-Fit does not have its own e-mail server, so the results will be relayed to you by
CombSearch.
Mowse.
Mowse requires a list of peptide masses and their
error to function. The error in "%" is calculated from the [Da] or ppm value
entered in the CombSearch form, relative to the smallest entered peptide fragment
mass. The value of Pfactor is fixed to 0.2 as in the original Mowse interface.
Mowse has its own e-mail server and will send you the results.
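The conversion of the error to "%" described above can be illustrated with a short Python sketch. The exact formula used by CombSearch is not given on this page, so this is only one plausible reading: an error in [Da] is divided by the smallest peptide mass, while an error in ppm is assumed to need nothing more than a change of units (1 ppm = 0.0001 %). The function name `tolerance_to_percent` is hypothetical.

```python
def tolerance_to_percent(tolerance, unit, peptide_masses):
    """Convert a mass error given in Da or ppm to percent (hypothetical helper)."""
    if unit == "ppm":
        # Assumption: ppm is already a relative error, so only the units change.
        return tolerance / 1e4
    if unit == "Da":
        if not peptide_masses:
            raise ValueError("at least one peptide mass is required")
        # Relate the absolute error to the smallest peptide mass, which
        # gives the largest (most permissive) relative error.
        return tolerance / min(peptide_masses) * 100.0
    raise ValueError("unit must be 'Da' or 'ppm'")

# An error of 0.5 Da on a list whose smallest mass is 500 Da is about 0.1 %.
print(tolerance_to_percent(0.5, "Da", [500.0, 1000.0, 2211.1]))
```

Using the smallest mass means the percentage is at least as wide as the original [Da] window for every peptide in the list.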
ProFound.
This program requires MW and a list of peptide
masses to function. ProFound will automatically generate an error range for peptide
fragment masses if it is omitted or conflicting. Data on ion mode ([M+H]+),
cysteine modifications and digest chemistry are also supported. The taxonomic category will be translated from the
"Organism species" field and you can select whether you search for a single
protein or a binary mixture on the second page of the CombSearch form. Unlike the original
form, you cannot enter monoisotopic and average data at the same time. You also cannot
enter multiple digestion data for the same sample.
ProFound does not have an e-mail server, so the data will be forwarded to you through
CombSearch.
PeptideSearch.
Minimal requirements for PeptideSearch are MW, a
list of peptide masses and their error, and a
number of peptide fragments for a match. The error in "%" is calculated from
the value in the [Da] or ppm field of the CombSearch form, relative to the smallest
peptide fragment mass. All the other data, such as ion mode, isotopic
resolution, cysteine modifications and cleavage agent, are fully supported as well.
PeptideSearch does not have its own e-mail server, so the results will be relayed to you
by CombSearch.
MultiIdent.
MultiIdent requires data about amino acid composition to function.
CombSearch will pass all the information from the "Amino Acid Composition"
fields. If you entered information for glycine (Gly), Constellation 2 will be used;
otherwise Constellation 4 is used. For more information about constellations consult the MultiIdent and AACompIdent pages. The data about MW,
pI, OS/OC, keyword, tag <= 6 AA, and peptide mass fragments are also used. All the
enzymes, except for "Asp N + N-terminal Glu", and all cysteine modifications are
supported and the full range of peptide masses (from 0 to 5000) is considered. The list of
best-matched proteins is 5 times the number of results to be reported. Finally, the
results will report:
"closest SWISS-PROT entries for selected species and keyword, having pI and MW values
in the specified range, and within that: the AAComposition scores, the peptide mass
fingerprinting scores and/or the integrated scores for AAComposition and peptide mass
fingerprinting".
MultiIdent, as all ExPASy tools, has its own e-mail server and will send you results
directly.
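The constellation rule described above (Gly filled in means Constellation 2, otherwise Constellation 4) and the limit on the sum of the composition can be sketched in a few lines of Python. This is only an illustration and not the actual CombSearch code; the function name `choose_constellation` and the use of a dictionary keyed by three-letter AA names are assumptions.

```python
def choose_constellation(composition):
    """Pick the MultiIdent constellation for an AA composition (hypothetical helper).

    composition maps three-letter AA names to molar percent;
    fields left empty are either absent or set to None.
    """
    filled = {aa: value for aa, value in composition.items() if value is not None}
    # The form states that the sum of all AA must not exceed 100 %.
    # A tiny margin is allowed for floating-point rounding.
    if sum(filled.values()) > 100.0 + 1e-9:
        raise ValueError("the sum of all AA must not exceed 100%")
    # Constellation 2 is used when the Gly field is filled, otherwise 4.
    return 2 if "Gly" in filled else 4

print(choose_constellation({"Ala": 9.5, "Gly": 8.0, "Leu": 10.2}))
print(choose_constellation({"Ala": 9.5, "Leu": 10.2}))
```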
PeptideSearch by sequence tag (or PeptideSearch Edman).
This program requires MW, a peptide sequence tag, its
mass and error. The error is converted to "%" if
necessary. The data about cleavage agent, cysteine and methionine state and mass type of
the tag are taken from the "Peptide Mass Fingerprinting" part of the CombSearch form.
You can select the type of Edman degradation on the second page of CombSearch. Please
note that, from the data above, only Edman-type searches can be performed by
PeptideSearch by sequence tag.
PeptideSearch by sequence tag does not have an e-mail server, so the data will be
forwarded to you through CombSearch.
TagIdent.
TagIdent requires data about MW and pI of the protein.
It can also use a sequence tag if it is composed of 6 AA or less. The OS/OC field is
generated from "Organism Species or
Organism Classification". The database selection is made on the second page of
CombSearch. All the other fields are fully implemented in the CombSearch form as well.
TagIdent, as all ExPASy tools, has its own e-mail server and will send you the results
directly.
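The minimal requirements given in the program descriptions above determine which checkboxes are marked on the second page. The following Python sketch summarizes them in a lookup table. It is not the actual CombSearch code: all field names (`peptide_masses`, `mass_tolerance`, and so on) and the function `callable_programs` are hypothetical, and treating an unspecified cleavage agent as Trypsin is an assumption.

```python
# Minimal requirements per program, as given in the program descriptions above.
# All field names are hypothetical and chosen only for this illustration.
REQUIREMENTS = {
    "PeptIdent": {"peptide_masses", "mass_tolerance"},
    "MS-Fit": {"peptide_masses", "mass_tolerance", "min_matching_peptides"},
    "Mowse": {"peptide_masses", "mass_tolerance"},
    "ProFound": {"mw", "peptide_masses"},
    "PeptideSearch": {"mw", "peptide_masses", "mass_tolerance",
                      "min_matching_peptides"},
    "MultiIdent": {"aa_composition"},
    "PeptideSearch Edman": {"mw", "sequence_tag", "tag_mass", "tag_tolerance"},
    "TagIdent": {"mw", "pi"},
}

def is_filled(value):
    """A field counts as filled unless it is None or an empty string/list/dict."""
    return value is not None and value != "" and value != [] and value != {}

def callable_programs(query):
    """Return the programs whose minimal requirements are met by the query."""
    provided = {name for name, value in query.items() if is_filled(value)}
    selected = []
    for program, needed in REQUIREMENTS.items():
        if not needed <= provided:
            continue
        # PeptIdent allows digestion solely by Trypsin.
        # Assumption: an unspecified cleavage agent is treated as Trypsin.
        agent = query.get("cleavage_agent") or "Trypsin"
        if program == "PeptIdent" and agent != "Trypsin":
            continue
        selected.append(program)
    return selected

query = {
    "mw": 45000,
    "peptide_masses": [842.51, 1045.56, 1179.60],
    "mass_tolerance": 0.5,
    "cleavage_agent": "Trypsin",
}
print(callable_programs(query))
```

Keeping the requirements in a single table makes it easy to see at a glance why a given program is not pre-selected, for example TagIdent when no pI is entered.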
Known limitations are all program-specific; please consult the Program-specific features section above.
Q: Why can't I select feature X of the search engine Y in CombSearch?
A: It is a good question. Basically there are two reasons: the real estate of your screen and
simplicity. We tried to put as much information as possible on a single page without
sacrificing function, so some program-specific fields were left out. If you feel that some
feature should be present, send us an e-mail with the program name and the feature you
would like to see in CombSearch. Also make sure that the feature is not already present
under another name. To do so, check the Program-specific features section
on this page.
Q: Why does CombSearch not support engine X?
A: Another good question. The answer is again screen size and simplicity. As far as we
know CombSearch is the first attempt (well, second if you count MultiIdent) to create a
unified interface for several protein identification tools available on the web. Since all
the programs were created by groups in different places, the challenge was to find the
parts common to all of them; therefore we had to leave some tools out. Again, if you feel that
some particular engine should be in CombSearch, send us your suggestions.
The authors of this project would like to express gratitude to the following people:
Ron D. Appel for coming up with the idea of this project.
Elisabeth Gasteiger and Pierre-Alain Binz for helping us to understand various aspects of protein identification techniques using computers, designing CombSearch interface, providing help on query strings, beta testing and bearing with us for all this time.
Bertrand Ibrahim and Weijiang Wang for their help with TCL and Internet programming.
Remi Hammerli would like to personally thank:
all those who had to do without his presence during the many hours spent on this project.
Pavel Dobrokhotov's gratefulness goes to: