Next:
Usage Up: The
Computational Biochemistry Server Previous:
Methods
In some cases, a protein can be recognized by fragmenting it
according to a known pattern and using the molecular weights of
the fragments as a fingerprint. This method cannot determine the
composition of an unknown protein, but it is effective for
identifying an unknown sample if its sequence is recorded in a
protein database.
One way of breaking a protein into smaller pieces according to a
certain pattern is to digest it with an enzyme. For example,
trypsin cleaves a protein after every arginine (R) or lysine (K)
that is not followed by a proline (P). AspN cleaves a protein
before every aspartic acid (D). A table of recognized enzymes and
their cleavage rules is given below.
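The two cleavage rules just described can be sketched in a few lines of Python. This is only an illustration, not the server's implementation: the names `CLEAVAGE_RULES` and `digest` are hypothetical, and the sketch assumes that the proline exception for trypsin applies to both K and R, as is the common convention.

```python
# Cleavage rules as predicates on the two residues flanking a candidate bond:
# prev is the residue before the bond, nxt is the residue after it.
# Assumption: the proline exception for trypsin covers both K and R.
CLEAVAGE_RULES = {
    "Trypsin": lambda prev, nxt: prev in "KR" and nxt != "P",
    "AspN": lambda prev, nxt: nxt == "D",
}


def digest(sequence, enzyme):
    """Split a one-letter amino acid sequence at every bond the enzyme cuts."""
    rule = CLEAVAGE_RULES[enzyme]  # raises KeyError for enzymes not listed
    cuts = [i for i in range(1, len(sequence))
            if rule(sequence[i - 1], sequence[i])]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


print(digest("AKRPDGKM", "Trypsin"))  # ['AK', 'RPDGK', 'M']
print(digest("AKRPDGKM", "AspN"))     # ['AKRP', 'DGKM']
```

Note that the bond between R and P is left intact by the trypsin rule, and that AspN cuts on the N-terminal side of D rather than after it.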
The molecular weights of the fragments can be measured
experimentally by mass spectrometry to a good level of accuracy.
More importantly, these methods typically require very small
samples, on the order of fractions of a picomole.
The problem of identifying a sampled protein can be reduced to
digesting the protein with an enzyme, measuring the molecular weight
of each of the pieces, and then comparing this set of weights to
what would be obtained from the digestion of each protein in the
database. The process can be repeated with several different enzymes
to increase its selectivity.
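The comparison step can be illustrated with a small, self-contained Python sketch. It is not the scoring used by the server, which is not described here: it simply counts how many observed weights fall within a tolerance of some theoretical tryptic fragment. The function names, the tolerance, and the toy database are hypothetical, and the residue masses are approximate average values rounded to two decimals.

```python
# Approximate average residue masses in daltons (rounded; for illustration).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.11, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water is added per free peptide


def tryptic_fragments(sequence):
    """Cut after K or R unless the next residue is P (assumed rule)."""
    cuts = [i for i in range(1, len(sequence))
            if sequence[i - 1] in "KR" and sequence[i] != "P"]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def fragment_mass(fragment):
    """Theoretical average mass of one peptide fragment."""
    return sum(RESIDUE_MASS[aa] for aa in fragment) + WATER


def count_hits(observed, sequence, tol=0.5):
    """Number of observed weights explained by some theoretical fragment."""
    theoretical = [fragment_mass(f) for f in tryptic_fragments(sequence)]
    return sum(any(abs(w - t) <= tol for t in theoretical) for w in observed)


def rank(observed, database, tol=0.5):
    """Database entries sorted from most to fewest matched weights."""
    scores = {name: count_hits(observed, seq, tol)
              for name, seq in database.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy database with made-up entries; real searches run over SwissProt.
toy_db = {"protA": "GAKSVR", "protB": "MFKWE"}
print(rank([274.3, 360.4], toy_db))  # [('protA', 2), ('protB', 0)]
```

A plain hit count ignores how likely each match is to occur by chance, so a real search would need a more careful score; the sketch only shows the shape of the computation.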
The function MassSearch locates the best candidates in the SwissProt
database that would fit the given weights once digested by the given
enzyme. The function DNAMassSearch locates the best candidates in
the EMBL DNA database that encode a protein whose digestion would
fit the given weights.
This type of searching has been found particularly useful in the
following circumstances:
- To identify proteins when the amount available is very small,
for example the amounts that can be separated on 2D gels.
- To determine whether an unknown protein is already known in
the database before spending a significant effort in sequencing.
- To identify more than one protein which cannot be separated
by other means (this method has been successfully used to identify
two proteins which were digested together).
Searching is more precise when more than one digestion is
available. In general, it is much better to perform two digestions
with different enzymes (each with half of the material, and hence at
a slightly lower accuracy) than a single digestion with all the
material. The precision of the retrieval increases with the number
of digestions available.
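One way to picture why a second digestion helps is to combine per-digestion scores for each candidate. The sketch below assumes that each digestion has already produced a score per database entry (for instance, a count of matched fragment weights) and simply adds them up. Summation, the function name `combine_scores`, and the example scores are all assumptions made for illustration; the server's actual method of combining digestions is not described here.

```python
def combine_scores(per_digestion_scores):
    """Sum each candidate's scores over all digestions and rank the totals.

    per_digestion_scores is a list with one dict per digestion, each
    mapping a candidate name to its score for that digestion.
    """
    totals = {}
    for scores in per_digestion_scores:
        for name, score in scores.items():
            totals[name] = totals.get(name, 0) + score
    # Highest total first; ties are broken alphabetically by name.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Made-up scores: trypsin alone cannot separate P1 from P2.
trypsin_scores = {"P1": 3, "P2": 3, "P3": 1}
aspn_scores = {"P1": 2, "P2": 0, "P3": 1}
print(combine_scores([trypsin_scores, aspn_scores]))
# [('P1', 5), ('P2', 3), ('P3', 2)]
```

In this made-up example the first digestion leaves two candidates tied, and the second digestion, which cuts at different positions, separates them.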
CBRG