In the following equations, $x_i$ is the $i$th component of vector x and $y_i$ is the $i$th component of vector y. $x_l$ and $y_l$ are the lengths of proteins x and y, respectively. $x_{li}$ and $y_{li}$ are the frequencies of the $i$th amino acid, normalised by the lengths of proteins x and y, respectively.
The similarity in amino acid composition between x and y was calculated using the following equations.
One minus Euclidean distance:
(1)
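Under the standard definition of the Euclidean distance, and assuming the length-normalised frequencies are the inputs, Eq. (1) would take the form $S_1 = 1 - \sqrt{\sum_i (x_{li} - y_{li})^2}$. This is a reconstruction from the equation's name, not the typeset original. A minimal Python sketch under that assumption (the function name and the toy vectors are hypothetical):

```python
import math

def one_minus_euclidean(x, y):
    """Return 1 minus the Euclidean distance between two composition vectors.

    Assumes x and y are already length-normalised amino acid frequencies.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return 1.0 - distance

# Toy three-component vectors; real composition vectors would have 20 entries.
x = [0.5, 0.3, 0.2]
y = [0.4, 0.4, 0.2]
print(one_minus_euclidean(x, y))
```

Identical compositions give a similarity of exactly 1, and the value decreases as the compositions diverge.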
One over standardised Euclidean distance:
(2)
where $\sigma_i$ = standard deviation for each amino acid $i$.
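Writing $\sigma_i$ for the standard deviation of amino acid $i$, the usual standardised Euclidean distance would give Eq. (2) the form $S_2 = 1 / \sqrt{\sum_i \left((x_{li} - y_{li}) / \sigma_i\right)^2}$. This form is assumed from the equation's name. How the standard deviations were estimated is not stated here, so the sketch below simply takes them as an argument (all names are hypothetical):

```python
import math

def one_over_standardised_euclidean(x, y, sigma):
    """Return the reciprocal of the standardised Euclidean distance.

    sigma holds one standard deviation per component; how it is estimated
    (e.g. across a whole protein set) is an assumption left to the caller.
    """
    if not (len(x) == len(y) == len(sigma)):
        raise ValueError("x, y and sigma must have the same length")
    distance = math.sqrt(sum(((a - b) / s) ** 2 for a, b, s in zip(x, y, sigma)))
    # Identical vectors have zero distance; report infinite similarity
    # instead of dividing by zero.
    return math.inf if distance == 0.0 else 1.0 / distance

print(one_over_standardised_euclidean([0.5, 0.3], [0.3, 0.3], [0.1, 0.2]))
```

Unlike the "one minus" measures, this similarity is unbounded above, so it is not directly comparable in scale with the others.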
One minus cosine theta distance (normalised Euclidean distance):
(3)
where and
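One plausible reading of Eq. (3), given the label "normalised Euclidean distance", is that each vector is first scaled to unit length and the Euclidean distance is then taken: $S_3 = 1 - \sqrt{\sum_i \left(x_i / \lVert x \rVert - y_i / \lVert y \rVert\right)^2}$ with $\lVert x \rVert = \sqrt{\sum_i x_i^2}$. Under this reading the distance equals $\sqrt{2 - 2\cos\theta}$, where $\theta$ is the angle between the two vectors. This interpretation is an assumption, and the sketch below (with hypothetical names) follows it:

```python
import math

def vector_norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in v))

def one_minus_normalised_euclidean(x, y):
    """Return 1 minus the Euclidean distance between unit-scaled vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    nx, ny = vector_norm(x), vector_norm(y)
    if nx == 0.0 or ny == 0.0:
        raise ValueError("zero vectors cannot be scaled to unit length")
    distance = math.sqrt(sum((a / nx - b / ny) ** 2 for a, b in zip(x, y)))
    return 1.0 - distance

print(one_minus_normalised_euclidean([0.5, 0.3, 0.2], [0.4, 0.4, 0.2]))
```

Because both vectors are rescaled, this measure depends only on the direction of the composition vectors, so raw counts and length-normalised frequencies give the same result.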
One minus Manhattan metric distance:
(4)
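With the standard Manhattan (city-block) metric, and assuming length-normalised frequencies as inputs, Eq. (4) would take the form $S_4 = 1 - \sum_i \lvert x_{li} - y_{li} \rvert$. The following is a minimal sketch under that assumption, with a hypothetical function name:

```python
def one_minus_manhattan(x, y):
    """Return 1 minus the Manhattan distance between two composition vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return 1.0 - sum(abs(a - b) for a, b in zip(x, y))

print(one_minus_manhattan([0.5, 0.3, 0.2], [0.4, 0.4, 0.2]))
```

For two frequency vectors that each sum to 1, the Manhattan distance lies between 0 and 2, so this similarity ranges from 1 (identical) down to -1 (no shared amino acids).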
One minus relative entropy:
(5)
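Taking relative entropy in its usual (Kullback-Leibler) sense, Eq. (5) would take the form $S_5 = 1 - \sum_i x_{li} \log (x_{li} / y_{li})$. The logarithm base and the treatment of zero frequencies are not recoverable from the text, so the sketch below makes both explicit as assumptions: natural logarithm by default, and terms with $x_{li} = 0$ contribute zero.

```python
import math

def one_minus_relative_entropy(x, y, base=math.e):
    """Return 1 minus the relative entropy of x with respect to y.

    Assumptions: x and y are frequency vectors summing to 1; terms with
    x_i == 0 contribute nothing; y_i == 0 with x_i > 0 is rejected because
    the relative entropy is infinite there.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    total = 0.0
    for p, q in zip(x, y):
        if p == 0.0:
            continue
        if q == 0.0:
            raise ValueError("relative entropy is infinite when y_i = 0 and x_i > 0")
        total += p * math.log(p / q, base)
    return 1.0 - total

print(one_minus_relative_entropy([0.5, 0.5], [0.25, 0.75], base=2))
```

Relative entropy is not symmetric, so swapping x and y generally changes the result; whether the original analysis symmetrised it is not stated.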
Pearson's correlation coefficient:
(6)
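The standard Pearson correlation coefficient, which Eq. (6) presumably follows, is $r = \sum_i (x_i - \bar{x})(y_i - \bar{y}) / \sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}$, where $\bar{x}$ and $\bar{y}$ are the component means. A minimal sketch, with a hypothetical function name:

```python
import math

def pearson_correlation(x, y):
    """Return the Pearson correlation between two composition vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0.0 or spread_y == 0.0:
        raise ValueError("correlation is undefined for a constant vector")
    return covariance / math.sqrt(spread_x * spread_y)

print(pearson_correlation([0.5, 0.3, 0.2], [0.4, 0.4, 0.2]))
```

The coefficient is unchanged when either vector is multiplied by a positive constant, so raw counts and length-normalised frequencies yield the same value.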