34 Applied Data Mining
them is the dot product of x and y divided by the L2-norms of x and y (i.e.,
their Euclidean distances from the origin). Recall that the dot product of
vectors $\mathbf{x} = [x_1, x_2, \cdots, x_n]$ and $\mathbf{y} = [y_1, y_2, \cdots, y_n]$ is $\sum_{i=1}^{n} x_i * y_i$; the cosine
similarity is defined as:

$$\mathrm{CosSim}(\mathbf{x}, \mathbf{y}) = \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\| * \|\mathbf{y}\|} \tag{2.4.1}$$
We must show that the cosine similarity is indeed a distance measure.
Since we have defined the angle between two vectors to lie in the range 0 to
180 degrees, no negative similarity value is possible. Two vectors have an
angle of zero if and only if they point in the same direction, though possibly
with different magnitudes. Symmetry is obvious: the angle between x and y is the
same as the angle between y and x. The triangle inequality is best argued
by physical reasoning: one way to rotate from x to y is to rotate first to z
and thence to y, and the sum of those two rotations cannot be less than the
rotation directly from x to y.
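Equation (2.4.1) can be sketched in a few lines of plain Python (the function name `cos_sim` is ours, not from the text):

```python
import math

def cos_sim(x, y):
    # Dot product of x and y divided by the product of their L2-norms
    dot = sum(xi * yi for xi, yi in zip(x, y))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    norm_y = math.sqrt(sum(yi * yi for yi in y))
    return dot / (norm_x * norm_y)

# Vectors along the same direction (angle 0) give similarity 1,
# regardless of their magnitudes
print(cos_sim([1, 2], [2, 4]))   # 1.0
# Orthogonal vectors (angle 90 degrees) give similarity 0
print(cos_sim([1, 0], [0, 1]))   # 0.0
```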
2.4.2 Adjusted Cosine Similarity
Although cosine similarity can correct for differences of scale between
individuals, it only distinguishes differences in direction across the
dimensions and cannot measure differences in the values themselves. This
can lead to the following situation: suppose content is rated on a 5-star
scale, and two users X and Y rate two resources (1, 2) and (4, 5)
respectively. The cosine similarity of these ratings is 0.98, suggesting the
two users are very similar. Yet judging from the scores, X appears to dislike
both resources while Y likes them. The reason for this situation is that a
distance metric measures the absolute distance between points in space,
directly from their coordinates, whereas cosine similarity depends only on
the angle between the vectors and thus reflects differences in direction,
not in location. Hence the adjusted cosine similarity was introduced: an
average value is subtracted from every dimension. For example, the average
of the X and Y ratings is 3, so after adjustment the vectors become (-2, -1)
and (1, 2); computing the cosine similarity now yields -0.8. The similarity
is negative and the difference is not small, which clearly agrees better
with reality.
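The numbers in this example are easy to reproduce (a sketch in plain Python; the helper `cos_sim` is our own name):

```python
import math

def cos_sim(x, y):
    # Standard cosine similarity: dot product over product of L2-norms
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

x, y = [1, 2], [4, 5]
print(round(cos_sim(x, y), 2))          # 0.98 -> the users look very similar

# Adjusted: subtract the overall rating average (3) from every dimension
avg = 3
x_adj = [v - avg for v in x]            # [-2, -1]
y_adj = [v - avg for v in y]            # [1, 2]
print(round(cos_sim(x_adj, y_adj), 2))  # -0.8 -> actually quite dissimilar
```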
Based on the above exposition, computing similarity with the basic cosine
measure in the item-based case has one important drawback: the difference in
rating scales between users is not taken into account. The adjusted
cosine similarity offsets this drawback by subtracting the corresponding
user's average rating from each co-rated pair. Formally, the similarity
between items i and j under this scheme is given by