ExPASy Home page

Site Map

Search ExPASy

Contact us

Mirror sites:

An experimental unified interface to query several protein identification tools accessible on the web.

Help

Introduction
How to use CombSearch
Detailed interface description
Program-specific features
Known limitations
FAQ
Acknowledgements

Introduction.

What is CombSearch?
CombSearch is an attempt to provide a unified interface to query several protein identification tools accessible on the web. Currently it includes PeptIdent, TagIdent and MultiIdent from ExPASy, MS-Fit from ProteinProspector, Mowse from the UK Human Genome Mapping Project Resource Centre, ProFound from PROWL and PeptideSearch from the Bioanalytical Research Group at EMBL.
The aim in creating CombSearch was to allow researchers to fill in only one form and query several programs and databases.
What is CombSearch NOT?
CombSearch is NOT a new program for searching databases. It only combines the necessary information in order to call several protein identification tools on the web. It also makes no assessment of the accuracy of the data you provide or of the search results returned.

How to use CombSearch.
- A CombSearch query consists of two input pages followed by a summary page.
  On the first page (phase 1) you are asked to enter the information you have on your protein, such as MW, pI, mass of peptide fragments after a digestion, AA composition, small sequence tag, etc. Then you press "Analyse".
  CombSearch looks through all the data and determines which program can be called. It will then present you with a second page (phase 2) where all the programs supported by CombSearch are listed. On this page you will see a marked checkbox next to the program name, if there is enough information to call it. Minimal data requirements are also listed here. At this point it is possible to select some program specific options, for example the database to be used. It is also possible to deselect the programs you don't need. Once you are satisfied with your choice click on "Query now" to perform it.
  The last page will give you a summary of programs you have selected to query and whether they could be successfully contacted. CombSearch has to call all the programs and depending on Internet traffic and individual engine workload it may take some time to get results, so be patient. Waiting times of up to 5 minutes are quite normal if you query all the programs.
- Results.
  Responses from each engine are sent separately by e-mail, and each result will contain the name of your query and the name of the program. We strongly suggest that you fill in the "Sample name" field; otherwise you could easily confuse the unlabelled responses in your mailbox. Depending on network traffic and the workload of the programs, e-mails will arrive with some delay.
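
The phase 1 analysis described above can be pictured as a check of the entered data against each program's minimal requirements. The following is only a sketch, not the CombSearch implementation: the requirement sets are taken from the "Program-specific features" section of this page, while the field names (`peptide_masses`, `mass_error`, `pi_range`, and so on) and the function name are hypothetical.

```python
# Minimal data requirements per program, as listed under
# "Program-specific features". All field names are hypothetical.
REQUIREMENTS = {
    "PeptIdent": {"peptide_masses", "mass_error"},
    "MS-Fit": {"peptide_masses", "mass_error", "min_matching_peptides"},
    "Mowse": {"peptide_masses", "mass_error"},
    "ProFound": {"protein_mw", "peptide_masses"},
    "PeptideSearch": {"protein_mw", "peptide_masses", "mass_error",
                      "min_matching_peptides"},
    "MultiIdent": {"aa_composition"},
    "PeptideSearch Edman": {"protein_mw", "sequence_tag", "tag_mass",
                            "tag_mass_error"},
    "TagIdent": {"protein_mw", "pi_range"},
}


def callable_programs(form):
    """Return the programs whose minimal requirements are met by `form`."""
    # A field counts as filled when it holds a non-empty value.
    filled = {name for name, value in form.items()
              if value not in (None, "", [], {})}
    selected = []
    for program, needed in REQUIREMENTS.items():
        if not needed <= filled:
            continue  # some required field is missing
        # Per this page, PeptIdent currently accepts Trypsin digests only.
        if program == "PeptIdent" and form.get("cleavage_agent", "Trypsin") != "Trypsin":
            continue
        selected.append(program)
    return selected


form = {
    "peptide_masses": [842.5, 1045.6],
    "mass_error": 0.5,
    "protein_mw": 25000.0,
    "cleavage_agent": "Trypsin",
}
print(callable_programs(form))
```

With this example form, MS-Fit and PeptideSearch would stay unchecked because the hypothetical `min_matching_peptides` field is empty. The real form may pre-fill such fields with defaults.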

Detailed interface description.

CombSearch first page (phase 1)
This is the main page, where you fill in a form with the data needed to set up your query. The table below describes the different fields and the data each one expects.

Field name:	Description:	Required by, Used by:	Value limit (default):
Sample name	Name for your query	PeptIdent, MS-Fit, Mowse, ProFound, MultiIdent, TagIdent.
Organism Species or Organism Classification	Used for OS/OC field of SWISS-PROT db or converted to corresponding fields of MS-Fit and ProFound.	PeptIdent, MS-Fit, ProFound, MultiIdent, TagIdent.
Keyword	Used for KW field of SWISS-PROT db.	MultiIdent, TagIdent.
Protein Mass (MW)	Mass of whole protein in [Da]. Leave empty to cover whole possible range.	ProFound, PeptideSearch, PeptideSearch Edman, TagIdent. PeptIdent, MS-Fit, Mowse, MultiIdent.	0 - 1000000
MW range	Range of MW in percents.	ProFound, PeptideSearch, PeptideSearch Edman, TagIdent. PeptIdent, MS-Fit, Mowse, MultiIdent.	0 - 20 - 100 %
pI (isoelectric point) Min-Max	Isoelectric point of the whole protein. Leave empty to cover the whole range.	TagIdent. PeptIdent, MS-Fit, MultiIdent.	0 < pI < 14
Peptide Mass Fingerprinting
Cleavage agent	Enzyme or chemical reagent used to produce peptide fragments.	PeptIdent, MS-Fit, Mowse, ProFound, PeptideSearch, MultiIdent, PeptideSearch Edman.	Arg-C Asp-N Glu Chymotrypsin CNBr Glu-C (E) Lys-C Trypsin V8-bicarb (E) V8-phosph (E,D)
Allow for # missed cleavage sites (MC)	Number of sites that are not cleaved by the enzyme or the chemical (thus producing incorrect size fragments).	MS-Fit. PeptIdent, ProFound, PeptideSearch.	0 - 1 - 5 ?
Cysteine	Modifications of cysteine, such as iodoacetylation, performed during sample preparation/analysis. If the program does not support a modification free cysteine will be used instead.	PeptIdent, MS-Fit, ProFound, PeptideSearch, PeptideSearch Edman.	Acetamidomethyl-Cys Acrylamido-Cys Aedans-Cys Aminoethyl-Cys Benzyl-Cys __-dimethyl-Cys Carboxymethyl-Cys Carbamidomethyl-Cys Cys Cysteic Acid-Cys Pyridylethyl-Cys S-cysteinyl-Cys S-farnesyl-Cys S-palmityl-Cys
Methionine	Whether methionine is oxidized or not.	PeptIdent, MS-Fit, PeptideSearch, MultiIdent, PeptideSearch Edman.	Normal Oxidized
Number of peptide fragments required for protein match:	Minimal number of matching peptide fragments to identify a protein.	PeptideSearch. PeptIdent, MS-Fit	0 - 5 - 20
Ion mode	Whether peptide fragments are in protonated or neutral form during the MS analysis.	PeptIdent, ProFound, PeptideSearch, MultiIdent.	Neutral [M] Protonated [M+H]⁺
Isotopic resolution	Whether monoisotopic or average masses were measured for peptide fragments on MS.	PeptIdent, MS-Fit, ProFound, PeptideSearch, MultiIdent, PeptideSearch Edman.	Average Monoisotopic
within mass range	Accuracy of the measured peptide fragment masses in [Da] or ppm	PeptIdent, MS-Fit, Mowse, PeptideSearch. ProFound, MultiIdent.	< 5 [Da] or > 0.0001 ppm
List of peptide masses	Put here a list of peptide masses separated by spaces or newlines. All characters other than numbers will be ignored.	PeptIdent, MS-Fit, Mowse, ProFound, PeptideSearch. MultiIdent.	> 1 [Da]
Single AA substitution / peptide	How many mutations are allowed per peptide, the minimal number of fragments with no AA substitutions required for a match, and in which direction a shift in fragment mass is allowed. All these fields are for MS-Fit; please check its page for a more detailed description.	MS-Fit.
Min. number of matches with no AA substitutions			0 - 1 - 4
Peptide Mass shift
Amino Acid Composition
AA names	AA composition in molar percent. These fields are MultiIdent specific. If Gly field is filled, constellation 2 is used, otherwise constellation 4 is used. For more information check MultiIdent site.	MultiIdent.	the sum of all AA must not exceed 100%
Tagging
Peptide Mass (neutral) within mass range.	Mass of the peptide fragment used for Edman sequencing and its accuracy.	PeptideSearch Edman.	0 - 2000 < 5 [Da] or > 10 ppm
Peptide sequence tag	Sequence of the peptide tag. All valid AA names are allowed and "x" can be used to designate any AA. MultiIdent and TagIdent only work with tags of 6 AA or smaller.	PeptideSearch Edman, TagIdent. MultiIdent.	PeptideSearch Edman tag = any length MultiIdent, TagIdent tag <= 6 AA.
Search for all possible tag permutations	Allows permutation in AA tag order	MultiIdent, TagIdent
Display only the sequences matching the tag.	Display a sequence in found proteins matching the tag.	TagIdent
Include scan of your tag against all fragments SWISS-PROT/TrEMBL	Allows you to perform a search of all proteins matching your tag and having the OS/OC field you specified.	TagIdent
Results format
Display the predicted N-terminal sequence of the protein	When checked, returned result will include a short (40 AA) sequence of the specified region of the protein.	MultiIdent, TagIdent.	N-terminal sequence C-terminal sequence (internal) area
Number of results returned	How many results you want to be displayed.	PeptIdent, MS-Fit, ProFound, PeptideSearch, MultiIdent, PeptideSearch Edman.	0 - 30 - 100
Your e-mail	Please note that CombSearch checks the general validity of the e-mail, so be sure to provide your correct e-mail address.	MultiIdent. PeptIdent, Mowse, TagIdent.	Any valid e-mail address.
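
To make two of the fields above more concrete, here is a minimal sketch of how a peptide mass list and an MW range could be interpreted. It is not the CombSearch code. It assumes that "characters other than numbers" means everything except digits and the decimal point, that the MW range is applied symmetrically around the protein mass, and that the middle value in "0 - 20 - 100 %" is the default. The function names are hypothetical.

```python
import re


def parse_peptide_masses(text):
    """Parse masses separated by spaces or newlines, dropping stray characters."""
    masses = []
    for token in text.split():
        # Assumption: keep only digits and the decimal point.
        cleaned = re.sub(r"[^0-9.]", "", token)
        try:
            mass = float(cleaned)
        except ValueError:
            continue  # the token held no usable number
        if mass > 1.0:  # the table gives "> 1 [Da]" as the value limit
            masses.append(mass)
    return masses


def mw_window(mw, range_percent=20.0):
    """Return (low, high) for a protein mass and a symmetric percent range."""
    delta = mw * range_percent / 100.0
    return mw - delta, mw + delta


print(parse_peptide_masses("842.51 1045.56\n2211.10 Da"))
print(mw_window(25000.0))
```

Tokens that contain no digits at all, such as a unit label pasted along with the masses, are simply skipped.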

CombSearch second page (phase 2)
This page lists the programs that can be used with the data you entered. A marked checkbox appears next to each program that has sufficient data to function. There are also some fields for selecting additional program-specific options, which are summarized in the table below:

Field name	Description	Program
Database on which the scan should be performed	Database that will be used for your search by this engine.	MS-Fit, TagIdent
Search for	Specify whether your sample contains a single protein or a binary mixture.	ProFound
Type of Edman search	Specify the direction of the Edman sequencing.	PeptideSearch by sequence tag

CombSearch third page (Queries sent)
This page reports whether or not your data has been submitted successfully.

Program-specific features.
- PeptIdent.
  PeptIdent only requires a list of peptide masses and their error to function. The error can be either in [Da] or ppm and is passed directly from the CombSearch form. Peptide fragment properties such as ion mode, isotopic resolution, oxidized methionine and cysteine modifications are also supported. At the moment PeptIdent allows digestion solely by Trypsin, and cannot be used if you select another cleavage agent. The only limitation of CombSearch compared to the original interface is that acrylamide adducts on cysteines cannot be combined with other cysteine modifications (such as iodoacetamide). PeptIdent will also make use of protein mass, pI and organism species data if they are entered.
  PeptIdent, as all ExPASy tools, has its own e-mail server and will send you results directly.
- MS-Fit.
  Minimal requirements for MS-Fit are a list of peptide masses and their error, and the minimum number of peptides for a match. The error will be passed from the CombSearch form as [Da] or ppm (% and mmu are not supported). You will be able to choose the database on the second page of the CombSearch query. The species to be used will be translated from the "Organism species" field according to this table. Ion mode ([M]⁺ or [M+H]⁺), isotopic resolution, oxidized methionine and cysteine modifications will be passed from the CombSearch form. The following values will be set by default: DNA Frame translation = 3, N Terminus = Hydrogen, C Terminus = Free Acid, Possible Modifications Mode = Peptide N-terminal Gln to pyroGlu, Protein N-terminus Acetylated, Phosphorylation of S, T and Y, Report MOWSE Scores = yes, Pfactor = 0.4. Combined digestions such as "CNBr/Trypsin" are not supported. MS-Fit will use MW and pI data.
  MS-Fit does not have its own e-mail server, so the results will be relayed to you by CombSearch.
- Mowse.
  Mowse requires a list of peptide masses and their error to function. The error in "%" is calculated from the [Da] or ppm entered in CombSearch form by taking the value of the smallest entered peptide fragment mass. The value of Pfactor is fixed to 0.2 as in the original Mowse interface.
  Mowse has its own e-mail server and will send you the results.
- ProFound.
  This program requires MW and a list of peptide masses to function. ProFound will automatically generate an error range for peptide fragment masses if it is omitted or conflicting. Data on ion mode ([M+H]⁺), cysteine modifications and digest chemistry are also supported. The taxonomic category will be translated from the "Organism species" field, and you can select whether to search for a single protein or a binary mixture on the second page of the CombSearch form. Unlike the original form, you cannot enter monoisotopic and average data at the same time, nor multiple digestion data for the same sample.
  ProFound does not have an e-mail server, so the data will be forwarded to you through CombSearch.
- PeptideSearch.
  Minimal requirements for PeptideSearch are MW, a list of peptide masses and their error, and a number of peptide fragments for a match. The error in "%" is calculated from the value in the [Da] or ppm field of the CombSearch form, relative to the smallest peptide fragment mass. All other data, such as ion mode, isotopic resolution, cysteine modifications and cleavage agent, are fully supported as well.
  PeptideSearch does not have its own e-mail server, so the results will be relayed to you by CombSearch.
- MultiIdent.
  MultiIdent requires data about amino acid composition to function. CombSearch will pass all the information from the "Amino Acid Composition" fields. If you entered information for glycine (Gly), constellation 2 will be used; otherwise constellation 4 is used. For more information about constellations consult the MultiIdent and AACompIdent pages. The data about MW, pI, OS/OC, keyword, tag <= 6 AA, and peptide mass fragments is also used. All the enzymes, except for "Asp N + N-terminal Glu", and cysteine modifications are supported, and the full range of peptide masses (from 0 to 5000) is considered. The list of best-matched proteins is 5 times the number of results to be reported. Finally, the results will report:
  "closest SWISS-PROT entries for selected species and keyword, having pI and MW values in the specified range, and within that: the AAComposition scores, the peptide mass fingerprinting scores and/or the integrated scores for AAComposition and peptide mass fingerprinting".
  MultiIdent, as all ExPASy tools, has its own e-mail server and will send you results directly.
- PeptideSearch by sequence tag (or PeptideSearch Edman).
  This program requires MW, a peptide sequence tag, its mass and error. The error is converted to "%" if necessary. The data about cleavage agent, cysteine and methionine state and mass type of the tag are taken from the "Peptide Mass Fingerprinting" part of the CombSearch form. You can select the type of Edman degradation on the second page of CombSearch. Please note that, with the data above, only Edman-type searches can be performed by PeptideSearch by sequence tag.
  PeptideSearch by sequence tag does not have an e-mail server, so the data will be forwarded to you through CombSearch.
- TagIdent.
  TagIdent requires data about MW and pI of the protein. It can also use a sequence tag if it is composed of 6 AA or less. The OS/OC field is generated from "Organism Species or Organism Classification". The database selection is made on the second page of CombSearch. All the other fields are fully implemented in the CombSearch form as well.
  TagIdent, as all ExPASy tools, has its own e-mail server and will send you the results directly.
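
The Mowse and PeptideSearch entries above mention that an error entered in [Da] or ppm is converted to "%" using the smallest peptide fragment mass. The following is a sketch of one plausible reading of that conversion, not the actual CombSearch code; the function name and the unit strings are hypothetical. Note that a ppm value can be converted to percent without any reference mass (1 ppm = 0.0001 %), so in this sketch the smallest mass matters only for errors given in [Da].

```python
def tolerance_to_percent(value, unit, peptide_masses):
    """Convert a mass tolerance in Da or ppm to a percentage.

    Assumption: a Da tolerance is expressed relative to the smallest
    entered peptide mass, which yields the largest percentage.
    """
    if unit == "ppm":
        # Parts per million to parts per hundred.
        return value / 1e4
    if unit == "Da":
        if not peptide_masses:
            raise ValueError("at least one peptide mass is needed for a Da tolerance")
        return 100.0 * value / min(peptide_masses)
    raise ValueError(f"unsupported unit: {unit!r}")


masses = [1000.0, 2000.0]
print(tolerance_to_percent(0.5, "Da", masses))   # 0.5 Da relative to 1000 Da
print(tolerance_to_percent(100.0, "ppm", masses))
```

Taking the smallest mass as the reference makes the resulting percentage the most permissive one for the whole list, so a tolerance that was tight in [Da] may be looser for the larger fragments once it is expressed in "%".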

Known limitations.
- All known limitations are program-specific; please consult the section above.
FAQ

Please see this section before crying, blowing your top, ceasing to use CombSearch (WHAT?!), slamming your computer, or sending <flame> e-mails (check all that apply).
- Q: Why can't I select feature X of the search engine Y in CombSearch?
  A: It is a good question. Basically there are two reasons: the real estate of your screen and simplicity. We tried to put as much information as possible on a single page without sacrificing function, so some program-specific fields were left out. If you feel that some feature should be present, send us an e-mail with the program name and the feature you would like to see in CombSearch. Also make sure that the feature is not already present under another name. For that, check Program-specific features on this page.
- Q: Why does CombSearch not support engine X?
  A: Another good question. The answer is again screen size and simplicity. As far as we know CombSearch is the first attempt (well, second if you count MultiIdent) to create a unified interface for several protein identification tools available on the web. The programs are all created by groups in different places, and the challenge was to find the parts they have in common, so we had to leave some tools out. Again, if you feel that some particular engine should be in CombSearch, send us your suggestions.
Acknowledgements.
- The authors of this project would like to express gratitude to the following people:
  - Ron D. Appel for coming up with the idea of this project.
  - Elisabeth Gasteiger and Pierre-Alain Binz for helping us to understand various aspects of protein identification techniques using computers, designing CombSearch interface, providing help on query strings, beta testing and bearing with us for all this time.
  - Bertrand Ibrahim and Weijiang Wang for their help with TCL and Internet programming.
- Remi Hammerli would like to personally thank:
  - all those who had to do without his presence during the many hours spent on this project.
- Pavel Dobrokhotov's gratitude goes to:
  - My parents and my brother, whose care helped me every day to concentrate on my studies. I always wondered how they could accept that flying through the Andes at full afterburner and listening to MODs and S3Ms was part of the project.
