Querying With SRS
3.1 The Basics of Querying
Suppose you want to know how many entries there are, in one or more
databanks, on a particular subject. As you will see throughout this
chapter, asking that question with SRS requires only that you supply some
of the details that you already know. SRS will examine all the entries in
the selected databanks and return a list of entries that contain your
search term.
For example, using the term "Aldehyde reductase" as the basis for a
search against ENZYME will return a list of entries in the databank that
are related to this enzyme. The query method you use affects both the
amount and the type of information included in the results list.
In this chapter you will learn about the different query methods, how to
formulate a query, and how to use each method.
Picking a Search Method
There are several ways to search a databank with SRS. They
are listed here:
- Quick Search
- Browse Index
- Standard Query
- Extended Query
- Expression Query
Deciding which query type is best suited to the task at hand can seem
intimidating given the number of options. A brief description of each
search option is given below to help you choose the best one.
You will learn how to use these in the next section.
Quick Search
The "Quick Search" option on the "Top" page (shown in figure 1) is useful
when you want to get a general idea of the type and approximate amount
of information available for a subject. It works by searching all data
fields of type 'text' for the selected databanks. This is the fastest way
to generate query results; there are only three steps from selecting the
databanks to viewing the results. With this search method, however, some
of the results may have only a cursory relationship with the subject of
interest.
FIGURE 1. The Quick Search Query.
To learn how to use the "Quick Search" method, please skip ahead now to
the "Doing a Quick Search" section.
Browse Index
Browse the index of a specific data field when you want to find out how
many entries on a subject are available through that data field. Figure
2 shows the "Browse Index" form. The "Keyword" field, for example, is available
for many databanks; this field is used to make clear the relationship between
an entry and a particular subject.
FIGURE 2. Browse an Index.
If you are interested in the proteins involved in a subject you can search
the "Keyword" field of SWISS-PROT for that subject. You might not get all
the entries that are related to the subject, but this will give you a starting
point.
To learn how to browse the indices of a databank with SRS, please
skip ahead now to the "Browsing Databank Indices with SRS" section.
Standard Query
The more details you have about a subject, the better your chances of finding
only the entries that are interesting. The "Standard" query form allows
you to enter up to four separate search terms against up to four different
data fields at once, as you can see in figure 3. Using the "Standard" query
form you can search for a phrase or expression in the "Description" field,
a specific "Organism", a "Date" range, and a particular "Author".
For example, if you know of a scientist who published data about "breast
cancer" in "humans" at some time over a two-year period, enter those details
in the query to get a list of related entries.
FIGURE 3. The Standard Query Form.
To learn how to query a databank with the "Standard" query form, please
skip ahead now to the "Using the Standard Query Form" section.
Extended Query
A variation on the "Standard" query form is the "Extended" query form.
It lists all the common data fields and allows you to enter a search term
for as many fields as you want, but you must use at least one field.
The "Extended" query form, shown in figure 4, is similar to the "Standard"
query form in that both can use several search terms for the query.
FIGURE 4. The Extended Query Form.
They differ in several ways. For example, the "Extended" query form allows
you to enter a search term for more than four data fields. This means that
you can further limit the results of a query.
The data fields in the "Standard" query form can be arranged in any
order that you want. The "Extended" query form lists the available data
fields in a fixed order.
The "Extended" form makes numeric range queries easier. The different
ways of searching for numeric range data are all displayed in a drop-down
menu. You only need to pick the one you want and enter the range details.
Finally, if a field has a limited number of search terms available,
like the "Division" field of EMBL, they will all be displayed as a group
in the "Extended" form (see figure 4).
Expression Query
Using the syntax described for the SRS Query Language, you can write your
query in the "Expression" box, found on the "Results" page (see figure 5).
FIGURE 5. The Expression Query.
To learn how to query a databank using the "Expression" box please skip
ahead now to the "Doing an Expression Query" section. To learn more about
the SRS query language see the "SRS Query Language" chapter.
Constructing Search Terms
You can enter a query in the "Quick Search" input field
of the "Top" page or you can use one of the query forms. The query forms
make searching a databank very easy because they set out in an ordered
way the data fields that you can search against. The "Standard" query form
lets you mix and match data fields to create your own query.
Completing an "Extended" query form is as easy as completing a
questionnaire where you answer only the questions that you want. Regardless
of the query option you have selected, you will still have to make a choice
about the way to construct your search term.
There are several ways to construct a search in SRS. They are
listed here:
- Single-Word Search
- Multi-Word Search
- Number
- Regular Expressions
Single-Word Search
When you search a databank for a single word in a single field you will
get, as the results of your query, a list of entries that match that word
in the selected data field. To increase the number of entries that are
returned you can search for the word in more fields.
For example, when you look for "reductase" in the "AltName" (alternate
name) field of ENZYME you will get all the entries that include "reductase"
as part of the name. If you also search for "reductase" in the "Description"
field, you will get a listing of entries that have the word "reductase"
in the "Description" field as well as those that have it in the "AltName"
field. For this to work, the "combine search words with OR" option must be
used. There is more information about query operators later. See figure 6.
FIGURE 6. Single-Word, Multi-Field Phrase.
The "Quick Search" option combines all the fields with a data type of text
using the "OR" combination operator.
There will be some overlap because some of the entries that include
the word in the "AltName" field will also include it in the "Description"
field (or some other field). Regardless of the number of times the query
hits an entry, SRS will include it in the results listing only once.
Note: Ranking of hits is not dependent on the number of hits for
an entry.
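The single-word, multi-field behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how SRS is implemented: the `ENTRIES` dictionary, the entry IDs and field contents, and the functions `search_field` and `search_fields_or` are all invented for the example. It shows that combining fields with "OR" amounts to a set union, so an entry hit in several fields still appears only once.

```python
# Hypothetical in-memory stand-in for a databank. The entry IDs and the
# field contents are invented for illustration; they are not ENZYME data.
ENTRIES = {
    "E1": {"AltName": "aldehyde reductase",
           "Description": "reductase acting on aldehydes"},
    "E2": {"AltName": "alcohol dehydrogenase",
           "Description": "also known as a reductase"},
    "E3": {"AltName": "lactase",
           "Description": "hydrolyses lactose"},
}


def search_field(word, field):
    """Return the set of entry IDs whose given field contains the word."""
    word = word.lower()
    return {
        entry_id
        for entry_id, fields in ENTRIES.items()
        if word in fields.get(field, "").lower().split()
    }


def search_fields_or(word, fields):
    """Search several fields and combine the hits with OR (set union)."""
    hits = set()
    for field in fields:
        hits |= search_field(word, field)  # a repeated hit is stored once
    return hits


alt_only = search_field("reductase", "AltName")
both = search_fields_or("reductase", ["AltName", "Description"])
print(sorted(alt_only))
print(sorted(both))
```

Entry "E1" matches in both fields but is reported a single time, which mirrors the statement that SRS lists an entry only once regardless of how many times the query hits it.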
Multiple-Word Phrases
You can search for a phrase such as "Aldehyde reductase" in many ways,
depending on the search method and query operators you use. For example,
if you type "Aldehyde reductase", with quotes, SRS will search for the
complete string "Aldehyde reductase". But without the quotes, SRS will
search for entries that have both "Aldehyde" and "reductase" or the complete
string "Aldehyde reductase". You can make explicit the relationship between
the words by including an operator in the string, for example, "Aldehyde
& reductase" (AND), or "Aldehyde | reductase" (OR), or "Aldehyde !
reductase" (BUTNOT).
FIGURE 7. Multi-Word, Multi-Field Phrase.
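One way to think about the "&", "|" and "!" operators described above is as set operations on the lists of entries that each word returns. The following Python sketch assumes that reading; the function `combine` and the entry IDs are hypothetical and do not represent SRS internals.

```python
def combine(left, right, operator):
    """Combine two sets of entry IDs with an SRS-style operator symbol."""
    if operator == "&":   # AND: entries found by both words
        return left & right
    if operator == "|":   # OR: entries found by either word
        return left | right
    if operator == "!":   # BUTNOT: entries found by the first word only
        return left - right
    raise ValueError(f"unknown operator: {operator!r}")


# Invented hit lists for the two words of the phrase.
aldehyde = {"E1", "E4"}
reductase = {"E1", "E2"}

print(sorted(combine(aldehyde, reductase, "&")))
print(sorted(combine(aldehyde, reductase, "|")))
print(sorted(combine(aldehyde, reductase, "!")))
```

Note that "!" is not symmetric: swapping the two words changes which entries are kept.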
If you are using one of the query forms and split the phrase across two
text boxes, the relationship needs to be made explicit by using one of the
options in the "combine searches with" drop-down menu. It is set to "AND"
by default.
FIGURE 8. "Combine Searches with" Drop-down menu.
Numbers and Regular Expressions
You are not limited to words and phrases when querying a databank with
SRS. You can search for numeric data like dates and you can search a databank
using a regular expression.
You can look for entries whose sequence length ("SeqLength", see figure 9)
is less than or equal to (":n"), less than (":!n"), greater than ("!n:"),
or greater than or equal to ("n:") a given number n. The following
examples show how to use this syntax.
- 12:15 means greater than or equal to 12 and less than or equal to 15.
- 12: means greater than or equal to 12.
- !12: means greater than 12 (but not 12).
- :12 means less than or equal to 12.
- :!12 means less than 12.
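To make the range notation listed above concrete, here is a minimal Python sketch that interprets a range string and tests whether a number falls inside it. The functions `parse_range` and `in_range` are hypothetical helpers written for this illustration; they only mimic the notation and are not part of SRS.

```python
def parse_range(expr):
    """Parse a range such as '12:15', '!12:' or ':!12'.

    Returns (low, low_inclusive, high, high_inclusive); a missing bound
    is returned as None. A leading '!' excludes the bound itself.
    """
    left, separator, right = expr.partition(":")
    if not separator:
        raise ValueError(f"not a range expression: {expr!r}")

    def bound(text):
        text = text.strip()
        if not text:
            return None, True          # open-ended side
        if text.startswith("!"):
            return int(text[1:]), False  # bound value is excluded
        return int(text), True           # bound value is included

    low, low_inclusive = bound(left)
    high, high_inclusive = bound(right)
    return low, low_inclusive, high, high_inclusive


def in_range(value, expr):
    """Return True if value satisfies the range expression."""
    low, low_inclusive, high, high_inclusive = parse_range(expr)
    if low is not None:
        if value < low or (value == low and not low_inclusive):
            return False
    if high is not None:
        if value > high or (value == high and not high_inclusive):
            return False
    return True


for expression in ["12:15", "12:", "!12:", ":12", ":!12"]:
    print(expression, [n for n in range(10, 18) if in_range(n, expression)])
```

The loop at the end prints, for each of the five forms, which of the numbers 10 to 17 satisfy it, which makes the difference between the inclusive and the "!" forms easy to see.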
You can write a numeric query in the "Expression" box like this:
it tells SRS to look for all entries in EMBL that have a "SeqLength" that
is greater than 12 and less than 15.
The numeric range query is written using the SRS query language syntax.
You can also write this same query using the "Standard" or "Extended" query
form. See figure 9. When done in the "Extended" query form you do not need
to use the query language syntax.
The difference between doing the query with the "Extended" query form
and using any other query option is that the "Extended" query form lets
you select, from a drop-down menu, how the values you enter are to be
interpreted.
FIGURE 9. Numeric Data Query.
If you are unsure of the spelling of a search word, you can combine the
characters you do know with regular expression characters and get a list of
matching entries as your result. For example, "/ca.*r/" will include "cancer"
and "carter", etc., in the results. You can also apply controls to the
regular expression that limit the type of search it performs, which can
make the query considerably faster. You need to include the "/" character
at the start and end of the regular expression string.
You can apply the SRS query language wildcard to the search word if
you would rather not use regular expression syntax. Using the query language
the search term above becomes "ca*r". Read more about the query language
in chapter 8, "SRS Query Language".
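The relationship between the regular expression "/ca.*r/" and the wildcard form "ca*r" can be tried out with Python's standard `re` and `fnmatch` modules. This is a sketch under assumptions: the word list is invented, the helper names `match_regex` and `match_wildcard` are hypothetical, and matching each pattern against a whole word is a choice made for the example. SRS's own regular expression dialect and matching rules may differ.

```python
import fnmatch
import re

# Invented index words used only for this illustration.
WORDS = ["cancer", "carter", "car", "cat", "carcinoma"]


def match_regex(srs_regex, words):
    """Match a '/.../'-delimited regular expression against whole words."""
    if len(srs_regex) < 2 or not (srs_regex.startswith("/")
                                  and srs_regex.endswith("/")):
        raise ValueError("regular expression must start and end with '/'")
    pattern = re.compile(srs_regex[1:-1])  # strip the enclosing slashes
    return [word for word in words if pattern.fullmatch(word)]


def match_wildcard(term, words):
    """Match a wildcard term, where '*' stands for any run of characters."""
    return [word for word in words if fnmatch.fnmatchcase(word, term)]


regex_hits = match_regex("/ca.*r/", WORDS)
wildcard_hits = match_wildcard("ca*r", WORDS)
print(regex_hits)
print(wildcard_hits)
```

Under these assumptions both forms select the same words: those that begin with "ca" and end with "r".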