Back to Darwin's Home
Functions in Darwin that we think are better than the ones provided
by other systems:
- Local Alignments at Best PAM. This is a simple alignment
between two amino acid sequences in which we tune the scoring
matrix by adjusting its PAM value (the evolutionary distance between
the sequences). This gives a maximum likelihood alignment. Very
distant sequences are typically aligned better with this technique.
- Multiple Sequence Alignments / Probabilistic Ancestral Sequences.
Darwin can compute an MSA together with a PAS. A PAS is a probability
profile of the possible amino acids at each position of the root
of the tree. The MSA derived in this way (and with the additional
heuristics by Chantal Korostensky) is usually much better than the
ones produced by other methods.
- PAM distance between sequences. Darwin can compute statistically
sound distances between sequences, including their variances.
These are also used to build phylogenetic trees. Darwin's algorithm
for building trees based on distance matrices (optionally with
variances) is very good.
- Mass profile. We call mass profiling the search of a protein
database using the fragment masses obtained by digesting a protein
and reading the digest with a mass spectrometer. We do this for
protein or for RNA/DNA databases, with the appropriate conversions.
A dynamic programming search of a protein database, given a fragment
of a protein and its partial weights (i.e. all or some of the
weights of subfragments of the fragment), is also available.
- Nucleotide-Peptide alignment. Darwin does a nucleotide-peptide
alignment which includes the genetic code table in the scoring
function. This has the advantage of allowing the detection of
deletions in the RNA/DNA that cause frame shifts and the clear
detection of introns.
- Least Squares, Best Basis. This is a purely statistical
facility. Darwin computes the best subset of variables that will
fit certain data. This is the full solution of the problem sometimes
called "stepwise regression" in statistics. It is usually a very
useful tool when we are given a large number of observed variables
and want to determine which ones best explain a certain
phenomenon.
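
The idea behind Least Squares, Best Basis can be illustrated with a
small sketch. This is not Darwin code and not necessarily the
algorithm Darwin uses: it is plain Python that tries every subset of
k variables, fits each by ordinary least squares (no intercept term,
an assumption made here for brevity), and keeps the subset with the
smallest residual sum of squares. The names solve, fit_subset and
best_basis, and the sample data, are hypothetical.

```python
from itertools import combinations

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_subset(rows, y, subset):
    """Fit y on the chosen columns via the normal equations.

    Returns (coefficients, residual sum of squares).
    """
    cols = [[row[j] for j in subset] for row in rows]
    k = len(subset)
    ata = [[sum(r[i] * r[j] for r in cols) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * yi for r, yi in zip(cols, y)) for i in range(k)]
    coef = solve(ata, aty)
    rss = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, yi in zip(cols, y))
    return coef, rss

def best_basis(rows, y, k):
    """Exhaustively search all size-k column subsets for the smallest residual."""
    best = None
    for subset in combinations(range(len(rows[0])), k):
        try:
            coef, rss = fit_subset(rows, y, subset)
        except ValueError:
            continue  # skip subsets whose columns are linearly dependent
        if best is None or rss < best[2]:
            best = (subset, coef, rss)
    return best

# Hypothetical data: four observed variables, y = 2*x0 + 3*x2 exactly.
rows = [
    [1, 5, 2, 7],
    [2, 1, 0, 3],
    [3, 4, 1, 1],
    [4, 2, 5, 9],
    [5, 3, 3, 2],
    [6, 6, 4, 4],
]
y = [2 * r[0] + 3 * r[2] for r in rows]

subset, coef, rss = best_basis(rows, y, 2)
print(subset, coef, rss)
```

Exhaustive search examines every subset, so it finds the true optimum
where greedy stepwise selection may not, but its cost grows
combinatorially with the number of variables.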