In this chapter, a survey of the literature related to biometric indexing is presented. The chapter is organized as follows. Various iris data indexing techniques will be reviewed in Section 3.1. Section 3.2 will cover an in-depth discussion of fingerprint biometric data indexing. Work related to face biometric data indexing will be presented in Section 3.3. Section 3.4 will review the work on multimodal biometric data indexing. Finally, the chapter will be summarized in Section 3.5.
In the last few decades, iris recognition algorithms for verification [19, 21, 20, 58, 59, 45, 88, 22, 74, 46, 33] have been extensively investigated. However, work on iris data indexing has been scarcely reported. All the existing iris data indexing techniques [83, 66, 67, 61, 62, 26, 78, 69, 71, 40] can be divided into two categories: iris texture-based indexing and iris color-based indexing. In iris texture-based indexing, the index keys are generated from the iris texture, whereas the color of the iris is used in iris color-based indexing. Detailed descriptions of these techniques are given in Sections 3.1.1 and 3.1.2.
A coarse iris classification technique using fractals has been proposed by Yu et al. [83]. This technique classifies iris images into four categories. In this method, an iris image is segmented into sixteen blocks: eight blocks belong to the upper part and the remaining eight to the lower part of the iris. Then, the fractal dimension values are calculated from the image blocks, and the mean fractal dimension values of the upper and lower parts are used to classify iris images into four categories.
Mukherjee and Ross [67, 66] develop two methods for iris data indexing. In the first method, the IrisCode feature [21] is calculated using Gabor wavelets. The IrisCode features in their method are as large as 2048 dimensions. To reduce the dimensionality of the feature vector, row/column/block averaging and Principal Component Analysis (PCA) are used. Then a k-means clustering approach is applied to partition the reduced-dimension iris codes into multiple groups. In the other method, the iris texture is divided into a number of blocks. For each block, the histogram of signed differences of pixel intensities (between similarly positioned pixels in the block and adjacent blocks) is calculated. This method is known as Signed Pixel Level Difference Histogram (SPLDH) analysis [67, 66]. Like the IrisCode method, this method also deals with features of around 256 dimensions.
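The reduce-then-cluster pipeline of the first method can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy iris codes, the block-averaging reduction and the tiny k-means routine are all simplified stand-ins.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def block_average(vec, block):
    """Reduce dimensionality by averaging consecutive blocks of values."""
    return [sum(vec[i:i + block]) / block for i in range(0, len(vec), block)]

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: returns a cluster label for each point."""
    rnd = random.Random(seed)
    centers = [list(c) for c in rnd.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

# Hypothetical usage: reduce 2048-bit codes to 8 values, then cluster.
codes = [[i % 2] * 2048 for i in range(6)]        # toy stand-in for IrisCodes
reduced = [block_average(c, 256) for c in codes]  # 2048 -> 8 dimensions
groups = kmeans(reduced, 2)
```

At search time, only the templates falling in the query's cluster would need fine matching, which is the point of the partitioning step.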
Local Binary Pattern (LBP) [68] is used in another iris data indexing method proposed by Mukherjee [66]. The author divides the normalized iris into a number of blocks and computes block-based LBP values. Finally, k-means clustering is applied to group the iris data.
Mehrotra et al. [62] use the Scale Invariant Feature Transform (SIFT) [55] to extract texture features for iris data indexing. This method finds a number of key points. Then high-dimensional (128 dimensions) SIFT features are extracted for each key point. The key points are indexed using geometric hashing [81]. This indexing method also deals with a large number of key points (≈ 100 key points) with high-dimensional feature vectors.
In another work, Mehrotra et al. [61] propose an iris data indexing technique based on the Discrete Cosine Transform (DCT). In this method, the normalized iris image [61] is divided into sub-bands using a multiresolution DCT transformation [82], and a DCT energy-based histogram is calculated for each sub-band. Then, each histogram is divided into fixed-size bins to group the iris images. To generate a global key for each iris image, they obtain the bin number for each sub-band and traverse all sub-bands in Morton order [65]. A B-tree structure is used to store the iris data. At the time of probing, a key is generated from the query iris image and the tree is traversed using the query key. Finally, the retrieved templates are compared with the query template to find the best match.
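The Morton-order traversal used to build the global key can be illustrated with a small sketch. The helper names are hypothetical, and the derivation of bin numbers from DCT energy histograms is omitted; only the Z-order visiting and key concatenation are shown.

```python
def morton_index(x, y, bits=4):
    """Interleave the bits of (x, y) into a Morton (Z-order) index."""
    idx = 0
    for i in range(bits):
        idx |= ((x >> i) & 1) << (2 * i)      # x bits -> even positions
        idx |= ((y >> i) & 1) << (2 * i + 1)  # y bits -> odd positions
    return idx

def global_key(bin_grid):
    """Visit a 2-D grid of per-sub-band bin numbers in Morton order and
    concatenate them into a single global key string."""
    h, w = len(bin_grid), len(bin_grid[0])
    cells = sorted((morton_index(x, y), bin_grid[y][x])
                   for y in range(h) for x in range(w))
    return "".join(str(b) for _, b in cells)
```

The resulting key string is what would be inserted into (and searched in) the B-tree.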
An iris indexing mechanism based on the context clustering property of the Burrows-Wheeler Transform (BWT) [8] is developed by Gadde et al. [26]. This method converts a normalized gray-scale iris image to a binary image and chooses a horizontal n-bit pattern. The locations of this pattern in the iris image are then found using the BWT. An indexing scheme is defined based on the count of occurrences of the n-bit binary pattern in the normalized binary iris image. For this purpose, the image is divided vertically into K segments, and the iris image is assigned as its index the number of the segment containing the maximum number of occurrences of the n-bit pattern. At the time of probing, the test image is assigned four indices corresponding to the top four segments with the maximum counts of occurrence of the n-bit pattern. Finally, the test image is compared only against those images in the database that have these four indices as their index, starting with the first index.
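The segment-wise pattern counting can be sketched as follows. This is a simplified stand-in that counts non-overlapping occurrences directly instead of locating them via the BWT, and the data layout is assumed.

```python
from collections import Counter

def segment_ranks(binary_rows, pattern, k_segments):
    """Rank the K vertical segments of a binary iris image (a list of
    bit-strings) by the number of occurrences of an n-bit pattern.
    Note: str.count counts non-overlapping occurrences only."""
    counts = Counter({s: 0 for s in range(k_segments)})
    for row in binary_rows:
        seg_w = len(row) // k_segments
        for s in range(k_segments):
            counts[s] += row[s * seg_w:(s + 1) * seg_w].count(pattern)
    return sorted(counts, key=lambda s: -counts[s])

# Enrollment assigns the single best segment; probing keeps the top four.
rows = ["10101010", "10100000"]
enroll_index = segment_ranks(rows, "101", 2)[0]
probe_indices = segment_ranks(rows, "101", 2)[:4]
```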
In another work, Rathgeb and Uhl [71] utilize biometric hash generation (HG) [76] for iris indexing. The basic idea of this method is to generate hash values from different parts of an iris image. In this approach, the normalized iris image [71] is divided into several blocks and an n-bit binary hash is computed using HG [76]. Then an n-bit Karnaugh Map (KM) [79] is constructed to store the iris data. A pointer to each iris template is stored at the corresponding node of the KM. At the time of querying, the extracted hash is mapped onto the stored KM. If a node points to one or more iris templates, the extracted iris code is matched against them. This procedure is repeated recursively for neighboring nodes until an identification is yielded or a certain search depth is reached.
Apart from the above texture-based iris data indexing techniques, Vatsa et al. [78] develop another indexing scheme which consists of two phases. In the first phase, the method uses the Euler code [7, 77] to generate a small subset of possible matches. In the next phase, it uses a 2ν-SVM [17] match score fusion algorithm to find the best matches from the list of possible matches obtained in the first phase, using textural and topological features of the iris images.
A few iris color-based indexing schemes have been proposed in the existing literature. Fu et al. [24] make the first attempt to classify iris images using color. This technique is based on artificial color filtering. All iris images are divided into nine classes and an artificial color filter [15] is designed for each class. For an iris image, all nine filters are applied and the class corresponding to the filter giving the maximum response is assigned. The artificial color filter is trained with a discriminator that assigns a pixel either to the class of interest or to some other class.
Another attempt to index iris images using color has been made by Puhan and Sudha [69]. This method is also a group-based color indexing scheme which relies on the natural iris color. In this technique, the iris color image is converted from the RGB to the YCbCr color space, and two types of color indices, namely blue and red indices, are computed using the Cb and Cr components, respectively. The ranges of the red and blue color indices are partitioned into a number of groups. Depending upon the values of its red and blue color indices, an image is assigned to one of these groups. During searching, a few groups from the blue and red color indices are selected based on the blue and red color indices of the query image. The nearest identities are declared based on the intersection between these groups.
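A minimal sketch of computing blue/red color indices and the group assignment is given below. It assumes the standard ITU-R BT.601 RGB-to-YCbCr conversion and equal-width groups; the paper's exact formulas and group boundaries may differ.

```python
def rgb_to_cbcr(r, g, b):
    """ITU-R BT.601 conversion of an RGB pixel to its Cb/Cr components."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr

def color_group(pixels, n_groups=8):
    """Average Cb (blue index) and Cr (red index) over the iris pixels,
    then quantize each into one of n_groups equal-width bins."""
    cbs, crs = zip(*(rgb_to_cbcr(*p) for p in pixels))
    blue_index = sum(cbs) / len(cbs)
    red_index = sum(crs) / len(crs)
    width = 256 / n_groups
    return (min(int(blue_index // width), n_groups - 1),
            min(int(red_index // width), n_groups - 1))
```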
Jayaraman et al. [40] develop a hierarchical approach to efficiently retrieve iris data from a large iris database. This approach uses iris color to index the database and iris texture to retrieve the iris images from the indexed database. First, a set of iris images similar to the query is selected by applying the indexing technique. The index value is computed by averaging the intensity values of all red and blue color pixels. A kd-tree structure is used to store the iris data. Then, iris texture features of the selected images are used to determine the identity of the query image. The iris texture features are extracted using the Speeded Up Robust Features (SURF) [2] algorithm.
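The kd-tree storage of the two-dimensional (red, blue) index values can be sketched with a minimal implementation. This is hypothetical illustration code; a production system would use a library kd-tree.

```python
def build_kdtree(points, depth=0):
    """Build a 2-D kd-tree over ((red, blue), id) points, alternating axes."""
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {"point": points[mid], "axis": axis,
            "left": build_kdtree(points[:mid], depth + 1),
            "right": build_kdtree(points[mid + 1:], depth + 1)}

def nearest(node, target, best=None):
    """Recursive nearest-neighbour search in the kd-tree."""
    if node is None:
        return best
    d2 = lambda a, b: (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    pt = node["point"]
    if best is None or d2(pt[0], target) < d2(best[0], target):
        best = pt
    axis = node["axis"]
    diff = target[axis] - pt[0][axis]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, target, best)
    if diff ** 2 < d2(best[0], target):   # hypersphere crosses the split plane
        best = nearest(far, target, best)
    return best
```

In the actual scheme, the tree would return a shortlist of color-similar irises whose SURF texture features are then compared against the query.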
A summary of the different iris biometric-based data indexing techniques is given in Table 3.1.
Fingerprint-based biometric data indexing has been extensively studied. Several techniques [6, 10, 11, 14, 13, 27, 30, 41, 47, 48, 49, 54, 51, 52, 57, 66, 73, 4, 86, 16, 18, 23, 37, 43, 53, 60, 80] have been proposed in the literature for fingerprint data indexing. The existing fingerprint indexing techniques can be divided into two main categories based on the feature extraction methods. The first category is minutiae-based indexing, which uses the minutiae points of a fingerprint to extract the indexing features; the second category is ridge orientation-based indexing, which uses the orientation of the ridges and valleys of a fingerprint for index key generation. Detailed descriptions of some important fingerprint data indexing methods are given in the following.
In this category of fingerprint indexing techniques, the index feature vectors are extracted from either the minutiae description or the triplet structure formed by the minutiae points. Germain et al. [27] present an algorithm to generate a candidate list using minutiae triplets from the database. This method uses two structures, called map and multimap [9], to compute all index score values. At enrollment time, all possible minutiae triplets are extracted from a fingerprint and the set of index keys from these triplets is created. The length of each side, the ridge counts between each pair of sides and the angles measured with respect to the sides of a triplet are used as index key features. In this method, a label is given to each index key and an entry is added to the multimap structure for the index key. In the querying process, a set of index keys is generated for a given query and used to retrieve the items from the multimap which are stored under the same index. Each retrieved item represents a hypothesized match between subsets of features in the query instance and the reference model instance that created the item stored in the multimap. This hypothesized match is labeled by the reference model identifier. Further, a map structure is used to tally the votes for these hypothesized matches and generate a candidate list.
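The multimap enrollment and vote-based retrieval can be sketched as follows. The key here uses only quantized side lengths of the minutiae triangle; the actual key of Germain et al. also includes ridge counts and angles, and all names are illustrative.

```python
from collections import defaultdict
from itertools import combinations

def triplet_key(p1, p2, p3, q=10):
    """Quantized, order-invariant key from the side lengths of a minutiae
    triangle (a simplified stand-in for the full index key)."""
    dist = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    return tuple(sorted(round(dist(a, b) / q)
                        for a, b in combinations((p1, p2, p3), 2)))

def enroll(multimap, fp_id, minutiae):
    """Add an entry to the multimap for every minutiae triplet."""
    for tri in combinations(minutiae, 3):
        multimap[triplet_key(*tri)].append(fp_id)

def candidates(multimap, minutiae):
    """Vote for enrolled fingerprints sharing triplet keys with the query."""
    votes = defaultdict(int)
    for tri in combinations(minutiae, 3):
        for fp_id in multimap.get(triplet_key(*tri), []):
            votes[fp_id] += 1
    return sorted(votes, key=lambda f: -votes[f])
```

Enrolling two prints and querying with minutiae close to the first returns that print at the top of the vote-ranked candidate list.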
Bebis et al. [4] propose a fingerprint indexing technique based on Delaunay triangulations [5]. This method first generates all Delaunay triangles from the extracted minutiae points. The ratios of the largest side to the two smallest sides and the cosine of the angle between the two smallest sides of the triangle are defined as the index feature vector. The method uses a fingerprint retrieval method similar to that described by Germain et al. [27] to generate the candidate set.
Bhanu and Tan [6] propose a fingerprint indexing technique based on minutiae triplets. First, all possible triplets are created from the extracted minutiae points. Geometric features like angles, handedness, type, direction and maximum side are used for indexing. In the querying process, this method searches for matching triplets between the query and all stored templates by applying constraints on the geometric features of the triplets. Templates are selected based on the number of matched triplets. Further, the posterior probability of the query and the selected templates is calculated as the index score, and the index scores are sorted in descending order. Finally, the candidate set is generated by applying a threshold on the index score.
Another Delaunay triangulation-based fingerprint indexing approach is proposed by Liang et al. [49]. However, this approach uses low-order Delaunay triangulation [1, 29]. In this approach, the authors first find all minutiae points of a fingerprint and generate a set of minutiae triplets using low-order Delaunay triangulation [1]. For each triplet, the minutia details of each vertex of the triangle, the minimum and median angles in the triangle, the triangle handedness, the length of the longest edge and the differences between the angles of two edges are computed. This information is used as the indexing features. In this way, a set of indexing features is generated from all the triplets. At the time of querying, the matched triplets between the query and the stored templates are first found by applying an individual threshold on each feature. Finally, indexing scores are calculated between the matched triplets and the query triplets, and a candidate list is generated based on a threshold value.
Ross and Mukherjee [73] suggest ridge-associated features for calculating index vectors. This approach is also based on Delaunay triangulation [29, 1]. A set of triplets is derived using Delaunay triangulation and a feature vector is generated from each triplet. Features corresponding to the geometry of the triangles formed by the triplets are computed. Further, the shapes of the ridges associated with the three minutiae points constituting the triplet are estimated; the shape features are computed by fitting a quadratic curve to the ridges associated with each triplet. Once the feature vectors of all triplets are generated, these vectors are partitioned into a number of clusters using the k-means clustering algorithm for indexing. Querying is performed by searching the k-means clusters. The score value is computed from the correspondence between the fingerprint templates of the retrieved clusters and the query template.
Zhang et al. [86] introduce an indexing technique based on minutiae structure. The direction, coherence and curvature are extracted from each minutia point as indexing features. In addition, an orientation vector is defined to characterize each minutia. To estimate the similarity between two minutiae points, an ordered sequence of these features is maintained. At the time of querying, similar minutiae points of two fingerprints are determined by comparing the direction, coherence, curvature and orientation features. An index score is computed which represents the possibility that two minutiae correspond, based not only on their similarity but also on their similarities to other minutiae points. Finally, the fingerprints are sorted in descending order of the index scores and the top fingerprints are retrieved to generate the candidate list.
In a recent work, a Minutia Cylinder-Code (MCC) [12] based fingerprint indexing technique has been proposed by Cappelli et al. [14]. They represent the spatial and directional relationships between a minutia and its neighborhood with a minutia cylinder [12]. Each minutia cylinder is then encoded into a fixed-length binary feature vector. Hence, a fingerprint image is represented by a set of bit vectors. The bit vectors are indexed by means of Locality Sensitive Hashing (LSH) [36, 28]. In LSH, each vector is projected into a low-dimensional subspace. For this purpose, they define a set of hash functions which map a binary vector to a natural number in the low-dimensional space. A hash table is maintained for each hash function, and the feature vectors are stored in the hash tables based on their hash values. To retrieve the candidates from the database, the same hash functions are applied to each binary vector of the query template and the number of collisions with database vectors is counted using an accumulator matrix. The candidates are finally ranked according to the similarity distances between the query and the retrieved templates. The similarity between two MCC codes is measured by the Hamming distance.
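The LSH step can be illustrated with a bit-sampling sketch: each hash function reads a fixed subset of bit positions of a binary vector, and collisions across the tables are counted. All parameters are hypothetical, and the accumulator matrix of the original is reduced here to a simple vote dictionary over one code per fingerprint.

```python
import random
from collections import defaultdict

def make_hash_funcs(n_tables, bits_per_hash, code_len, seed=0):
    """Each hash function is a fixed subset of bit positions; the selected
    bits, read as an integer, give the bucket number."""
    rnd = random.Random(seed)
    return [rnd.sample(range(code_len), bits_per_hash) for _ in range(n_tables)]

def lsh_key(code, positions):
    return sum(code[p] << i for i, p in enumerate(positions))

def build_tables(codes, funcs):
    """One hash table per hash function, mapping bucket -> fingerprint ids."""
    tables = [defaultdict(list) for _ in funcs]
    for fp_id, code in codes.items():
        for table, pos in zip(tables, funcs):
            table[lsh_key(code, pos)].append(fp_id)
    return tables

def query(code, funcs, tables):
    """Count collisions of the query code across all tables and rank by votes."""
    votes = defaultdict(int)
    for table, pos in zip(tables, funcs):
        for fp_id in table.get(lsh_key(code, pos), []):
            votes[fp_id] += 1
    return sorted(votes, key=lambda f: -votes[f])
```

Similar binary vectors collide in many tables and accumulate many votes, while dissimilar ones rarely collide, which is what makes the search sublinear in practice.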
In this category, fingerprint indexing techniques use the ridge orientation to compute the index feature vectors. Lumini et al. [57] present an indexing scheme based on the directional image, which represents the local ridge and valley directions of a fingerprint image. They generate a multidimensional feature vector from the directional image. Further, they apply the Karhunen-Loève (KL) transform to reduce the dimensionality of the directional feature vectors. At the time of retrieval, they apply a linear search with these low-dimensional features and generate the candidate list based on a threshold value.
Cappelli et al. [11] present a fingerprint indexing method which uses dynamic masks for directional image partitioning. The directional image represents the local average directions of the fingerprint ridge lines and is obtained by the method of Chong et al. [18]. The authors perform a "guided" partitioning of the directional image according to the fingerprint topology. For this purpose they define a set of dynamic masks which are directly derived from the most common fingerprint classes (Arch, Tented Arch, Left Loop, Right Loop, Whorl, etc.). Each mask is characterized by a set of vertices which define the borders of the regions and are used to segment the directional image. The application of these masks produces a numerical feature vector which represents each fingerprint as a multidimensional point. At the time of retrieval, the numerical feature vector generated from a query image is used as an access key for similarity search, which is performed in the multidimensional space.
Lee et al. [47] propose a fingerprint classification method based on a feature map. They use ridge features consisting of orientations and inter-ridge spacings within a relatively small fingerprint area to build the feature map. They divide the fingerprint image into a number of non-overlapping blocks. The local ridge orientation and inter-ridge spacing of a block are calculated using the method proposed by Hong et al. [35], and each block is represented by a vector. They define the feature map as a matrix where each element represents the distribution of a feature vector; the rows and columns of the matrix represent orientation and inter-ridge spacing values, respectively. The dimensionality of the feature map is reduced using the well-known Principal Component Analysis [25] method. Finally, they compute the distances between the query and all stored templates and generate a candidate list by applying a threshold on the distance.
Another fingerprint retrieval framework has been reported by Jiang et al. [41]. The authors use orientation features as the main retrieval feature and the dominant ridge distance as an auxiliary retrieval feature. To compute the orientation features, they first estimate the orientation field [35] of a fingerprint image, which consists of the local dominant orientations in different local neighborhoods of the image. Then they select a reference point and find its location and direction. Finally, they align the local orientations with respect to the reference point. The orientation feature vector is constructed by concatenating the aligned local orientations within a circle. In addition to this feature vector, they define the dominant ridge distance as the second mean of the local ridge distances. The local ridge distance is computed by measuring the distance between the center points of two adjacent ridges along a line perpendicular to the local ridge orientation, using the method of Hong et al. [35]. They perform indexing using both continuous classification and a clustering approach. In continuous classification, they sort the fingerprints in the database according to the dominant ridge distance. At the time of retrieval, they first narrow the search space by selecting a subset of fingerprints whose dominant ridge distances are within a range centered at the query's dominant ridge distance. Then, they use the orientation vector to generate the final list from the reduced search space. In the clustering technique, they use the k-means clustering algorithm to cluster the fingerprints into different groups.
Liu et al. [51] introduce a fingerprint retrieval method based on two complex filter responses. In their approach, they first compute the orientation field of the fingerprint image and find the location and orientation of some singular points (e.g., core or delta). They apply two complex filters to the orientation field at the singular point and calculate the magnitudes of the filter responses. Further, they align the filter responses with the direction of the singular point and construct a numerical feature vector by concatenating the aligned magnitudes of the two filter responses. This feature vector is used as the global feature for fingerprint indexing. For a query fingerprint, retrieval is done by measuring the Euclidean distances between the query and all stored fingerprints, and the final candidate list is generated based on a threshold value.
In another approach, Liu et al. [52] propose clustering fingerprints using multi-scale orientation features and an average ridge distance feature. First, they compute the orientation field by uniformly dividing the fingerprint into a number of blocks [42]. To compute the multi-scale orientation field, they construct a circular tessellation with non-uniform blocks and place it at some reference point (core or delta) of the fingerprint. They compute the orientation feature vectors from each block of the circular tessellation. In addition, they compute the average ridge distance of the fingerprint. They partition all fingerprints by applying k-means clustering to the orientation feature vectors. Further, they divide each partition into a number of bins based on the average ridge distance. At the time of retrieval, they first select the cluster using the orientation feature vectors of the query fingerprint. Then, they use the average ridge distance to retrieve the fingerprints from the bins of the selected cluster.
In a recent work, Cappelli [10] proposes a fingerprint indexing approach based on vector and scalar fingerprint features. Both types of features are obtained from local ridge-line orientations and frequencies. To compute the local orientations, they divide the image into a number of non-overlapping blocks and use a traditional gradient-based technique [70] to estimate the local orientations. Once the orientations are available, they estimate the local ridge frequencies at the same locations. They also find the fingerprint core points in order to correctly align the fingerprint [43, 72] using the local orientations. Then both the orientation and frequency images are aligned with respect to each core point and are downsampled. The downsampled local orientations, the corresponding strengths of the orientations and the downsampled local frequencies form the feature vector for indexing. In addition to this feature vector, two scalar features are used: the average frequency and the average vertical orientation difference of the non-downsampled orientation and frequency images. To compare two fingerprints, they calculate four score values: the first compares local orientations and strengths, the second compares local frequencies, the third compares average frequencies and the fourth compares average vertical orientation differences. The weighted combination of these four scores is used to retrieve the candidate set. At the time of retrieval, the combined scores are calculated between the query template and all stored templates, and the candidates similar to the query fingerprint are retrieved from the database by applying a threshold on the combined score.
Other than the above two categories, some approaches use different features for fingerprint indexing. For example, Li et al. [48] use three symmetric filters to measure different orientation structures of a fingerprint and propose an indexing approach based on the three symmetrical measurements. They design three kinds of symmetrical filters (the core type, the delta type and the parallel type) on the orientation image to map the different structures of fingerprints into three different feature spaces, and obtain three index vectors from these feature spaces. These vectors are reduced to a fixed length and rotated to set the direction of the cores as a vertical line. At the time of querying, the authors generate three lists of candidates; the Euclidean distances between the index vectors of the query template and all stored templates are used to measure similarity. Finally, they combine the three resulting lists using the highest rank method to generate the final candidate set.
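The highest rank method used to merge the three candidate lists can be sketched as follows: each identity keeps its best (lowest) position across the input lists and the identities are re-ranked by that value. This is a generic illustration of the fusion rule, not the authors' code.

```python
def highest_rank_fusion(ranked_lists):
    """Fuse candidate lists: each identity keeps its best (lowest) rank
    across all lists, then identities are re-sorted by that best rank."""
    best = {}
    for ranks in ranked_lists:
        for pos, fp in enumerate(ranks):
            best[fp] = min(best.get(fp, pos), pos)
    return sorted(best, key=lambda f: best[f])
```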
In another work, Gyaourova and Ross [30] develop a method to index and retrieve fingerprint images by utilizing match scores. Their method relies on comparing a fingerprint with a small set of reference images. First, they select a set of reference subjects and compute match scores against each reference subject. Then a discretization function is applied to the match scores to convert them to a discrete domain. These discrete score values are used to generate an index key for the fingerprint. At the time of retrieval, they compute the Hamming distances between the query index key and all stored index keys. A candidate is retrieved by applying a threshold on the calculated Hamming distance.
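A minimal sketch of the score-discretization index key and the Hamming-distance retrieval is given below. The discretization step, the number of levels and the score range are assumptions for illustration only.

```python
def index_code(match_scores, step=0.25, levels=4):
    """Discretize reference match scores (assumed in [0, 1]) into levels."""
    return tuple(min(int(s / step), levels - 1) for s in match_scores)

def hamming(code_a, code_b):
    """Number of positions at which two index keys disagree."""
    return sum(a != b for a, b in zip(code_a, code_b))

def retrieve(query_code, enrolled, threshold):
    """Candidates whose stored index key is within a Hamming radius."""
    return [fp for fp, code in enrolled.items()
            if hamming(query_code, code) <= threshold]
```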
The existing fingerprint data indexing techniques are summarized in Table 3.2.
In the last few decades, extensive studies have been carried out on face recognition algorithms, and several feature extraction methods are known for face-based authentication systems. Nevertheless, retrieval of face biometric data with a large number of features stored in a huge pool of memory is scarcely reported. In the existing literature, very few works on face indexing to reduce the search space have been reported. We describe these techniques [50, 64, 44] in the following.
Lin et al. [50] propose an indexing structure to search for a face in a large database. They compute a set of Eigenfaces from the faces in the database. Then, they assign a rank to each face in the database according to its projection onto each of the Eigenfaces [75]. Similarly, they compute the Eigenface projections for a query and rank the query face. Finally, they select from the database the set of faces nearest to the query face in ranked position with respect to each Eigenface. These selected faces are used for recognition.
A linear subspace approximation method for face indexing has been developed by Mohanty et al. [64]. They build a linear model to create a subspace based on match scores. A linear transformation is applied to project face images into this linear subspace: first a rigid transformation obtained through Principal Component Analysis, and then a non-rigid affine transformation. An iterative stress minimization algorithm is used to obtain a distance matrix in a low-dimensional space, and a linear out-of-sample projection scheme is proposed for test images. Any new face image is projected into this embedded space using an affine transformation.
Kaushik et al. [44] introduce a modified geometric hashing technique to index a face database. They extract features from a face image using the SURF [3, 2] operator. They apply mean centering, principal component analysis, rotation and normalization to preprocess the SURF features. Finally, they use geometric hashing to hash these features and index each facial image in the database.
Other than the above indexing techniques, several dimensionality reduction techniques [75, 85, 56, 63, 87, 84, 56, 64] have been proposed for face identification systems. However, these techniques do not employ any indexing, so their discussion has been skipped. A summary of the above-mentioned face biometric-based indexing techniques is given in Table 3.3.
Multimodal biometric data indexing is a newly explored area in the field of biometric identification, and very few works have been reported in the literature. We give an overview of these techniques in the following.
Jayaraman et al. [38, 39] propose an indexing technique for a multimodal biometric system using iris, ear, face and signature. They use a kd-tree to store the feature vectors of these traits. Before storing the data in the kd-tree, they normalize the features of each biometric trait and project them to a lower-dimensional feature space using PCA. In this technique, feature vectors of more than 100 dimensions are used.
A cascading technique for a multimodal biometric system has been developed by Hong et al. [34], who use face and fingerprint in their multimodal system. In this technique, exhaustive searching is performed with a single biometric trait to reduce the search space, and the final matching is performed with multiple traits in the reduced search space.
Gyaourova and Ross [32, 31] develop an indexing technique to reduce the search space of a multimodal biometric system. They select a fixed set of images for each biometric trait from the database and compute scores between an image and the fixed set of images for each trait. Based on these scores, they generate a fixed-length index code for each biometric trait and store these index codes in the database. To retrieve a set of candidates, they compute the similarity between the probe index code and all enrolled index codes and apply a threshold to select the candidates. They propose two fusion techniques which use the information from multiple modalities. In the first, they concatenate the index codes of the different modalities and retrieve the candidate set. In the second [32, 31], they retrieve a candidate set for the index code of each modality and then take the union of all retrieved candidate sets. We summarize the above-mentioned multimodal biometric indexing techniques in Table 3.4.
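The two fusion strategies can be sketched as follows, assuming small per-trait integer index codes and Hamming-distance retrieval; the trait names, code lengths and thresholds are illustrative.

```python
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def retrieve(code, enrolled, threshold):
    """Candidate set: identities whose stored code lies within a
    Hamming radius of the probe code."""
    return {fp for fp, c in enrolled.items() if hamming(code, c) <= threshold}

def fuse_concat(probe, enrolled_per_trait, threshold):
    """Fusion 1: concatenate the per-trait codes, then retrieve once."""
    traits = sorted(enrolled_per_trait)
    ids = enrolled_per_trait[traits[0]]
    enrolled = {fp: sum((enrolled_per_trait[t][fp] for t in traits), ())
                for fp in ids}
    probe_cat = sum((probe[t] for t in traits), ())
    return retrieve(probe_cat, enrolled, threshold)

def fuse_union(probe, enrolled_per_trait, threshold):
    """Fusion 2: retrieve per trait, then union the candidate sets."""
    out = set()
    for trait, enrolled in enrolled_per_trait.items():
        out |= retrieve(probe[trait], enrolled, threshold)
    return out
```

Concatenation demands agreement across all traits at once, while the union accepts a candidate that matches well on any single trait, so the union generally yields a larger candidate set.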
In this chapter, a survey of biometric data indexing techniques for iris, fingerprint, face and multimodal traits has been presented. From the survey of iris data indexing, it can be observed that the existing indexing techniques use either texture or color features for indexing. The texture feature-based techniques use high-dimensional features, whereas the color features have very low dimensionality; further, the iris color is not stable over time. Several techniques are reported for fingerprint data indexing; however, the majority of these approaches use a continuous classification scheme. In continuous classification, the fingerprint search is performed either by comparing the query fingerprint with all database templates based on index key values, or by comparing the query fingerprint with templates until the match value satisfies some criterion, such as falling below a threshold value. Although this comparison between the query fingerprint and a template is much faster than fine matching, continuous classification only ranks the database templates according to their similarities to the query fingerprint. Some techniques also use a large number of low-dimensional index keys for searching. Face-based indexing techniques are not well studied in the literature. Several methods for dimensionality reduction of face features are available, but these methods are not suitable when the database size is very large. Further, the reported work on multimodal biometric data indexing deals with the high-dimensional feature vectors of the different unimodal traits, and the dimensionality of the feature vector affects the searching time. In addition, the cascading technique uses a single trait to search the database, which does not take advantage of the other biometric traits at the time of searching.
Finally, the multimodal data indexing approaches use different levels of fusion which, in fact, affect the performance of a multimodal biometric identification system.