| Pairwise and Multiple Sequence Alignments and Similarity Searches |
Alignments are fundamental to most bioinformatics methods. A good alignment can be used to infer the common ancestry of a set of sequences, to make predictions about function and structure, and to identify conserved regions that may be of structural or functional importance. In this part of the practical we will encounter some basic alignment techniques using server-based sequence database search methods and multiple sequence alignment tools.
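To make the idea of a pairwise alignment concrete, the following is a minimal sketch of global alignment by dynamic programming (the Needleman-Wunsch approach). It is for illustration only and is not the method used by the servers in this practical: the function name and the scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen for readability, whereas real tools use substitution matrices such as BLOSUM and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not recommended defaults.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, top, bottom = needleman_wunsch("ACGT", "ACT")
    print(s)
    print(top)
    print(bottom)
```

The table costs time and memory proportional to the product of the two sequence lengths, which is one reason database searches rely on faster heuristic methods rather than exhaustive alignment against every entry.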
| Patterns and Profiles, Protein Motifs and Domains |
Introduction
In general, protein sequences can be grouped into clusters. It is possible to identify groups of sequences that are maximally similar to one another and minimally similar to sequences in other groups. These clusters of evolutionarily related protein sequences are called families. The sequence information from alignments of these families can be combined in order to search for function or for more distantly related sequences (remote homologues). Evolutionary information from aligned homologous sequences combined in this way is known as a profile.
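As a rough illustration of how aligned family members can be combined into a profile, the sketch below builds a position-specific frequency table from a toy alignment and scores a candidate sequence against it with log-odds. The alignment, the function names, the uniform background frequency and the pseudocount are all assumptions made for this example; profile tools used in practice apply sequence weighting, realistic background frequencies and gap handling.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Return one {residue: frequency} dict per alignment column.

    Gap characters are ignored; the pseudocount keeps unseen residues
    from getting a frequency of zero.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return profile


def score_sequence(profile, seq, background=1 / 20):
    """Sum of per-column log2 odds for an ungapped sequence of profile length."""
    if len(seq) != len(profile):
        raise ValueError("sequence length must match the profile length")
    return sum(math.log2(col[aa] / background) for col, aa in zip(profile, seq))


if __name__ == "__main__":
    # Hypothetical five-column alignment of four family members.
    family = ["HCKDG", "HCRDG", "HCKEG", "HCRDA"]
    profile = build_profile(family)
    print(score_sequence(profile, "HCKDG"))  # resembles the family
    print(score_sequence(profile, "WWWWW"))  # unrelated to the family
```

A sequence that matches the conserved columns receives a positive score, while one that shares no residues with the family scores below zero. This is what allows a profile to pick out remote homologues that a single-sequence comparison might miss.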
Another general property of sequences is that similarity may be restricted to short stretches of sequence, called domains and motifs.
The definition of these conserved sub-sequences depends on their size and function. Domains are stretches of sequence that appear as structural modules, often within many proteins. Motifs are short conserved sub-sequences that often correspond to active or functional sites. Motifs can be used to help predict function and in the identification of remote homologues.
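Short motifs are often written as patterns that can be matched directly against a sequence. The sketch below converts a pattern in PROSITE syntax into a Python regular expression and reports every match, including overlapping ones. The helper names and the test sequence are hypothetical, and the conversion is simplified: it handles `x`, `[...]`, `{...}`, repeat counts such as `x(2,4)` and the `<` and `>` terminal anchors, but not anchors placed inside brackets. The example pattern `N-{P}-[ST]-{P}` is the PROSITE N-glycosylation site pattern.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into Python regex syntax (simplified)."""
    regex = pattern.rstrip(".").replace("-", "")
    regex = regex.replace("x", ".")                       # x = any residue
    regex = regex.replace("{", "[^").replace("}", "]")    # {P} = anything but P
    regex = regex.replace("(", "{").replace(")", "}")     # x(2,4) = 2 to 4 repeats
    regex = regex.replace("<", "^").replace(">", "$")     # terminal anchors
    return regex


def find_motif(pattern, sequence):
    """Return (1-based start, matched text) for each hit, overlaps included."""
    regex = prosite_to_regex(pattern)
    # A lookahead with a capture group lets finditer report overlapping hits.
    return [(m.start() + 1, m.group(1))
            for m in re.finditer(f"(?=({regex}))", sequence)]


if __name__ == "__main__":
    # Hypothetical sequence, made up for this example.
    for start, hit in find_motif("N-{P}-[ST]-{P}", "MKNGSALNPTWNATQ"):
        print(start, hit)
```

Patterns this short match many sequences by chance, so a hit is best read as a hint to be checked against other evidence rather than as proof of function.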
In the second part of the practical we will go through several examples to illustrate the concepts of patterns, profiles, motifs, domains and families. Each exercise is centered on the analysis of a specific sequence. The examples include results from database searches and from the application of a range of tools.
Tools and Databases
References