Affinity propagation (AP) finds a set of exemplars in the dataset and assigns each nonselected point to one of them. An exemplar is the representative of a cluster.
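Two types of messages are exchanged between data points: the responsibility r(i, k), sent from point i to candidate exemplar k, which reflects how well suited k is to serve as the exemplar for i, and the availability a(i, k), sent from candidate exemplar k to point i, which reflects how appropriate it would be for i to choose k as its exemplar.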
Both r(i, k) and a(i, k) are initialized to 0 at the beginning of the algorithm and are then updated iteratively.
The self-similarity s(k, k) is initialized with the same value (typically chosen using heuristic knowledge) for each point at the start, and it is used in the updates that follow as each point's preference for being chosen as an exemplar. The similarity s(i, k) denotes the extent to which point k is suited to be the exemplar of point i. The value of s(i, i) can be set as a constant; a common choice is the median of the input similarities.
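For reference, the message updates in the standard formulation of affinity propagation (due to Frey and Dueck) can be written as follows. The symbols r, a, and s are the ones used above; the primed indices i' and k' are introduced here only to express the sum and the maximum, and the notation may differ slightly from the one used elsewhere in this chapter.

```latex
\begin{aligned}
r(i,k) &\leftarrow s(i,k) - \max_{k' \neq k} \bigl\{ a(i,k') + s(i,k') \bigr\} \\
a(i,k) &\leftarrow \min\Bigl\{ 0,\; r(k,k) + \sum_{i' \notin \{i,k\}} \max\bigl\{ 0,\, r(i',k) \bigr\} \Bigr\}, \qquad i \neq k \\
a(k,k) &\leftarrow \sum_{i' \neq k} \max\bigl\{ 0,\, r(i',k) \bigr\}
\end{aligned}
```

The responsibility compares candidate k against the best competing candidate for point i, while the availability accumulates the positive support that k receives from other points.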
The exemplar for data point i is the point k that maximizes the sum of availability and responsibility. Its index is defined with the following formula:
arg max {a(i, k) + r(i, k), k = 1, …, N}
Let R = (r(i, j)) be the responsibility matrix and A = (a(i, j)) the availability matrix, and let t denote the iteration count. At each iteration, the newly computed R and A are blended with their values from the previous iteration using a damping factor, which suppresses numerical oscillations that might arise.
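The whole procedure can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the code from the chapter's bundle: the function name `affinity_propagation`, the damping parameter `lambda`, its default of 0.5, and the fixed iteration count `max_iter` are all hypothetical choices made here. Damping is written as a convex combination, new value = (1 - lambda) * update + lambda * old value, and the update rules are the standard ones from the affinity propagation literature.

```r
# Minimal affinity propagation sketch (illustrative; names and defaults are assumptions).
# S: N x N similarity matrix with the preferences s(k, k) on the diagonal.
affinity_propagation <- function(S, lambda = 0.5, max_iter = 200) {
  N <- nrow(S)
  R <- matrix(0, N, N)  # responsibilities r(i, k), initialized to 0
  A <- matrix(0, N, N)  # availabilities a(i, k), initialized to 0
  for (t in seq_len(max_iter)) {
    # Responsibility: r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
    AS <- A + S
    R_new <- matrix(0, N, N)
    for (i in seq_len(N)) {
      for (k in seq_len(N)) {
        R_new[i, k] <- S[i, k] - max(AS[i, -k])
      }
    }
    R <- (1 - lambda) * R_new + lambda * R  # damped update

    # Availability: built from the positive parts of r(i', k), keeping r(k, k) as is
    Rp <- pmax(R, 0)
    diag(Rp) <- diag(R)
    col_sums <- colSums(Rp)
    A_new <- matrix(0, N, N)
    for (i in seq_len(N)) {
      for (k in seq_len(N)) {
        if (i == k) {
          # a(k, k) = sum over i' != k of max(0, r(i', k))
          A_new[k, k] <- col_sums[k] - Rp[k, k]
        } else {
          # a(i, k) = min(0, r(k, k) + sum over i' not in {i, k} of max(0, r(i', k)))
          A_new[i, k] <- min(0, col_sums[k] - Rp[i, k])
        }
      }
    }
    A <- (1 - lambda) * A_new + lambda * A  # damped update
  }
  # Exemplar index for each point: arg max over k of a(i, k) + r(i, k)
  apply(A + R, 1, which.max)
}

# Toy data: two well-separated groups on a line (made up for illustration).
x <- c(0, 0.1, 0.3, 10, 10.2, 10.3)
S <- -outer(x, x, "-")^2                 # similarity = negative squared distance
diag(S) <- median(S[row(S) != col(S)])   # preference = median of the similarities
exemplars <- affinity_propagation(S)
print(exemplars)
```

The returned vector holds, for each point, the index of its exemplar, so points that share a value belong to the same cluster. A production implementation would also test for convergence instead of running a fixed number of iterations.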
See the R code file ch_05_affinity_clustering.R in the bundle of R code for an implementation of the previously mentioned algorithm. The code can be tested with the following command:
> source("ch_05_affinity_clustering.R")
Given the massive number of images and other multimedia documents, classifying images has become harder than ever. Unsupervised image categorization is frequently used for image and video summarization, or as a preprocessing step for supervised classification methods.
One major issue in unsupervised image categorization is estimating the distribution of image categories. Another is finding the most descriptive prototypes of those categories.
Each image can be represented as a high-dimensional data instance, including features related to color, texture, and shape. The exemplar technique is applied here; it represents image categories by a small set of images or image fragments. Once images are described in terms of exemplars, the dimensionality of each image data instance is reduced to a relatively small size, which eases further processing. The distance measures applied here include the Chamfer, Hausdorff, and shuffle distances.
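As an illustration of one of these measures, the symmetric Hausdorff distance between two finite point sets is the largest distance from any point in one set to its nearest neighbor in the other set. The following is a minimal sketch in base R; the function name `hausdorff_distance` and the representation of an image fragment as a matrix with one point per row are assumptions made here, not details given in the text.

```r
# Symmetric Hausdorff distance between two point sets (illustrative sketch).
# P, Q: numeric matrices with one point per row and the same number of columns.
hausdorff_distance <- function(P, Q) {
  n <- nrow(P)
  m <- nrow(Q)
  # Pairwise Euclidean distances; keep only the P-versus-Q block
  D <- as.matrix(dist(rbind(P, Q)))[seq_len(n), n + seq_len(m), drop = FALSE]
  p_to_q <- max(apply(D, 1, min))  # farthest P point from its nearest Q point
  q_to_p <- max(apply(D, 2, min))  # farthest Q point from its nearest P point
  max(p_to_q, q_to_p)
}

# Two small made-up point sets: Q contains P plus one extra point at (1, 2).
P <- rbind(c(0, 0), c(1, 0))
Q <- rbind(c(0, 0), c(1, 0), c(1, 2))
print(hausdorff_distance(P, Q))
```

Negating such a distance gives a similarity that could serve as s(i, k) when images are compared by their shapes.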
The natural categories in a dataset can take various complex forms, and overlapping categories are common.
Unsupervised image categorization or classification is a clustering problem. Image clustering identifies sets of similar image primitives, such as pixels, line elements, and regions. For complex datasets, a prototype-based clustering algorithm is recommended. Affinity propagation can be applied to unsupervised categorization by finding a subset of representative exemplars.