10.2 Random PPI (RPPI)

In order to implement PPI in Section 7.2.1, one must first know how many skewers, K, need to be generated; this value of K is generally determined on an empirical basis. Second, one also needs to know how many dimensions, denoted by q, are to be retained after dimensionality reduction. Third, even though PPI does not need knowledge of the number of endmembers, p, to be generated, it does require a parameter t used to threshold the PPI counts it generates in order to extract endmembers. Finally, it requires human intervention to manually select the final endmembers from those data sample vectors whose PPI counts survive the threshold t. Most importantly, for PPI to work effectively, the skewers must be generated in a sufficiently random manner that they cover as many of the directions onto which the data samples are projected as possible. However, this practice also comes with a drawback: the results obtained from different skewer sets of the same size are different. Consequently, the results must be interpreted by human analysts, who manipulate them to produce the best possible sets of endmembers.

The RPPI presented in this section is originally derived from its earlier version, called the automatic PPI (APPI), developed in Chaudhry et al. (2006). It also inherits the original structure of PPI but remedies the above-mentioned drawbacks suffered by PPI. Like APPI, it does not require dimensionality reduction (DR). More specifically, RPPI converts the disadvantage of using random vectors as skewers into an advantage that can, surprisingly, resolve all the above-mentioned issues, except that the value of K must still be selected prior to PPI implementation.

The idea can be illustrated by the coin-flipping example described in the introduction. If each of the N trials conducted in the coin-flipping experiment is considered a skewer, the number of trials, N, plays the same role as the number of skewers, K. If we further interpret the outcome of a head turning up as the outcome of a data sample vector r falling at one of the two ends of a skewer, then the number of heads turned up in the N trials, nH, can be interpreted as the number of skewers on which the data sample vector occurred at one of their end points after orthogonal projection. This number is exactly the PPI count of the data sample vector r, N_PPI(r), defined by (7.1). Within this context, we can interpret PPI using K skewers as a random experiment carried out by a random algorithm that makes use of K randomly generated skewers. In this case, a single run produced by PPI using one random set of skewers can be considered a realization of such a random algorithm. The underlying assumption is that if an endmember is present in the data, it should appear in every realization produced by PPI. If this is not true for one set of skewers used by PPI, a user who happens to generate this particular set of skewers may not be able to find this endmember at all. If PPI is implemented repeatedly using different sets of randomly generated skewers, all the resulting realizations shall eventually converge to a common set, which is supposed to be the set of the true endmembers. As a result, the intersection of the final endmembers generated by all the different runs should include all the true endmembers. PPI is then terminated when the final sets of endmembers produced by consecutive runs converge to the same set. Accordingly, PPI automatically finds endmembers without appealing to any criterion for determining p. PPI implemented in such a random fashion is called RPPI, which can be described as follows.
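To make the skewer/PPI-count interpretation concrete, the following is a minimal sketch of how the PPI count N_PPI(r) in (7.1) can be computed. It assumes the data are held in an N x L NumPy array (N data sample vectors, L spectral bands); the function name ppi_counts and its interface are illustrative choices, not part of the original PPI formulation.

    import numpy as np

    def ppi_counts(data, K, rng=None):
        # data: (N, L) array of N data sample vectors with L spectral bands.
        # K:    number of randomly generated skewers.
        # Returns an (N,) integer array whose ith entry is N_PPI(r_i), the
        # number of skewers on which r_i falls at an extreme projection.
        rng = np.random.default_rng(rng)
        N, L = data.shape
        counts = np.zeros(N, dtype=int)
        for _ in range(K):
            skewer = rng.standard_normal(L)   # random direction
            skewer /= np.linalg.norm(skewer)  # normalize to a unit skewer
            proj = data @ skewer              # orthogonal projections
            counts[np.argmax(proj)] += 1      # sample at one end point
            counts[np.argmin(proj)] += 1      # sample at the other end point
        return counts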

RPPI Algorithm

1. Let K be the number of skewers to be fixed.
2. Initially, set k = 0 and generate a set of K skewers, denoted by skewer_set(0). Run PPI using skewer_set(0) to produce an initial endmember set E(0), defined as the set made up of all data sample vectors whose PPI counts are greater than 0, that is, E(0) = {r : N_PPI^(0)(r) > 0}.
3. For k ← k + 1, generate a kth skewer set of K skewers, denoted by skewer_set(k), and run PPI using skewer_set(k) to produce PPI counts for all data sample vectors and form a kth endmember set S(k) = {r : N_PPI^(k)(r) > 0}.
4. Let E(k) = E(k-1) ∩ S(k) and p(k) = |E(k)|, the number of data sample vectors in E(k). If p(k) ≠ p(k-1), go to step 3. Otherwise, continue.
5. If E(k) ≠ E(k-1), go to step 3. Otherwise, the algorithm is terminated. In this case, p = p(k) and the sample pixel vectors in E(k) are the desired true endmembers.

It should be noted that for a given set of K skewers, one cycle through steps 3–5 constitutes a single run of PPI. A sketch of the full loop is given below.
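The following is a minimal sketch of the RPPI loop under the same assumptions as before, reusing the ppi_counts sketch above. The function name rppi and the single combined convergence test are illustrative choices rather than the author's implementation; since E(k) ⊆ E(k-1) by construction, equal sizes and equal sets coincide, so steps 4 and 5 collapse into one set comparison.

    def rppi(data, K, rng=None):
        # Sketch of the RPPI loop: repeatedly run PPI with fresh random
        # skewers and intersect the sets {r : N_PPI(r) > 0} until two
        # consecutive runs converge to the same endmember set.
        rng = np.random.default_rng(rng)
        # Step 2: initial endmember set E(0).
        E = set(np.flatnonzero(ppi_counts(data, K, rng) > 0))
        while True:
            # Step 3: new skewer set, new endmember set S(k).
            S = set(np.flatnonzero(ppi_counts(data, K, rng) > 0))
            # Steps 4-5: intersect and test convergence.
            E_new = E & S
            if E_new == E:        # consecutive runs agree: terminate
                break
            E = E_new
        return sorted(E)          # indices of the p = |E(k)| endmember pixels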

There are four advantages of using RPPI over PPI. First, RPPI automatically determines the actual number of endmembers, p. Second, the number of pixels corresponding to endmembers extracted by RPPI is generally much smaller than that extracted by PPI. Third, there is no need to perform dimensionality reduction. Finally, and most importantly, RPPI does not use a threshold value to threshold the PPI count of each data sample vector to produce endmembers, as required by PPI. This is a significant advantage, since determining an appropriate threshold for the PPI count is key to the success of PPI. Intuitively, the higher the PPI count of a sample vector, the more likely the sample vector is to be an endmember. Unfortunately, according to our extensive experiments, this is generally not true; what is true is that endmembers must have PPI counts at least greater than 0. This is the main reason that S(k) in step 3 of RPPI is formed from all data sample vectors r with N_PPI^(k)(r) > 0. ENVI's PPI, by contrast, requires an appropriate selection of a threshold value and human involvement to manually determine the final selection of endmembers, as contrasted in the snippet below. The only disadvantage of implementing RPPI is the computational complexity resulting from the multiple runs needed to repeatedly implement PPI.
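The contrast between the two selection rules can be seen in a few lines, again reusing the ppi_counts sketch above; the threshold value t = 10, the skewer count K = 500, and the variable names are purely illustrative.

    counts = ppi_counts(data, K=500)

    # ENVI-style PPI: keep samples whose PPI count exceeds a user-chosen
    # threshold t; a human analyst then picks the final endmembers.
    t = 10                                    # illustrative threshold only
    ppi_candidates = np.flatnonzero(counts > t)

    # RPPI criterion: any sample hit at least once is a candidate; the
    # intersection over repeated runs, not a threshold, prunes the set.
    rppi_candidates = np.flatnonzero(counts > 0)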
