Pagni, M.1, Falquet, L.1, Iseli, C.2, Jongeneel, V.2, Junier, T.3, Bucher, P.3
1Swiss Institute of Bioinformatics, Ch. des Boveresses 155 CH-1066 Epalinges, Switzerland
2Office of Information Technology, Ludwig Institute for Cancer Research, Ch. des Boveresses 155 CH-1066 Epalinges, Switzerland
3Swiss Institute for Experimental Cancer Research, Ch. des Boveresses 155 CH-1066 Epalinges, Switzerland
Contact Marco.Pagni@isb-sib.ch
High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public databases. Their large volume, high degree of fragmentation and lack of gene structure annotations prevent efficient and effective searches of HTG and EST data for protein sequence homologies by standard search methods. trEST and trGEN are regularly regenerated databases of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Hits is a web-based data retrieval and analysis system providing access to precomputed matches between protein sequences (including sequences from trEST and trGEN) and patterns and profiles from Prosite and Pfam. The three resources can be accessed via the Hits home page (http://hits.isb-sib.ch).
The collection of tools available to query the Hits database is continuously being expanded and improved. New species, including mouse and rat, have been added to the trEST and trGEN databases.
The authors thank the Swiss federal government, the Ludwig Institute for Cancer Research and GlaxoSmithKline for financial support.
Category Protein Databases