Chapter 7

Soft Biometric Attributes in the Wild: Case Study on Gender Classification

Modesto Castrillón-Santana; Javier Lorenzo-Navarro    SIANI, Universidad de Las Palmas de Gran Canaria, Las Palmas de Gran Canaria, Spain

Abstract

Soft biometrics has become an active field of research, as it provides useful attributes to assist in recognition systems. Its fusion with strong traits may serve to achieve reasonable recognition rates in less cooperative scenarios. These attributes can also be used to speed up database searches, or to describe an anonymous subject within a demographic group. Agreeing with recent research trends on the need to evaluate biometric systems using “in the wild” datasets, the current state-of-the-art in the emerging field of soft biometrics is presented, together with proposals and results on the particular problem of gender classification “in the wild”.

Keywords

Soft biometrics; In the wild; Gender classification

7.1 Introduction

A biometric trait may be defined as any living physical or behavioral characteristic that can be measured rather than being something possessed, memorized, attached, or injected. Biometrics is the science devoted to automatically recognizing humans based on common traits. Typical human biometric traits are fingerprints, face, iris, gait, DNA, etc.

This field is not necessarily restricted to humans, as similar traits may be present in other creatures. This fact is shown by emerging interest in animal biometrics [71,78], in particular, with applications for identification to aid conservation. For animals, the most frequently used traits are nose prints, DNA, and Retinal Vascular Pattern (RVP), or tail fluke in whales.

This chapter focuses on soft biometric attributes in the wild for humans. We, as humans, are required to prove our identity in our daily lives. Biometric traits used to identify people are assumed to be strong, as they are characteristics that present distinctiveness and permanence to differentiate clearly any two individuals. There is, however, additional or ancillary information that may be extracted from the same biometric traits, as for example, gender, race, mood, kinship, and apparent age from a face [68]. This extra information is not unique, comprising discrete attributes that serve to describe people in pre-defined and meaningful, non-overlapping categories or groups, i.e., to anonymously determine the broad characteristics of an individual [38,39,92,135]. Some attributes or descriptors may lack sufficient distinctiveness to distinguish between two individuals, but are valid for other purposes by providing several qualitative and personal descriptions of an individual [38,68]. These attributes are known as soft biometrics after Jain et al. [67], but have also been known as light biometrics [3] or semantics [107].

Among these semantic attributes, we can mention: gender, age, height, weight, ethnicity, body geometry, skin color, eye color, or hair color and length [135], and others that not every person necessarily has, such as the presence of scars, marks, or tattoos [59].

Different taxonomies have been proposed for soft biometric attributes in the recent literature. Klare and Jain focused exclusively on attributes extracted from the facial pattern [75]. Initially, Dantcheva et al. [39] distinguished facial, body, and accessory descriptors. In a more recent and deeper analysis, those authors have extended the attribute collection emphasizing ease of extraction, and low requirement of subject cooperation [38]. Their updated taxonomy considers four groups of attributes: demographic, anthropometric, medical, material, and behavioral.

Contrary to classical traits, soft biometric attributes present low variance across subjects, offering robustness when faced with low quality and resolution images [38,108]. This is illustrated in Fig. 7.1 presenting different captures in which the facial region to obtain individuals' strong identity information is often not available for various reasons, such as occlusion or distance. However, some soft biometric descriptors such as gender, hair length and color, and clothes color are easily extracted by humans in a natural way. They are useful to distinguish one individual from others [96], and may be obtained from low quality imagery [108]. Most attributes based on appearance are estimated using vision, but alternatives may be designed to obtain at least some of them differently, as an example, we mention a sensing seat to obtain body characteristics [65].

Image
Figure 7.1 Two captures containing people at different distance ranges. Even if facial details are not visible, different attributes are easily estimated by human observers, and serve to identify individuals.

As mentioned above, soft biometric descriptors are not distinctive by themselves, but in the literature their utility has been demonstrated in applications for different purposes such as for [38,39,68,108]: (i) soft biometric multimodal fusion for recognition, (ii) improving the performance obtained from (strong) biometric traits, (iii) searching for space reduction or indexation, and (iv) for demographic and user profiling. Below, each application is briefly described.

Multimodal fusion for recognition. The first mention of soft biometrics was by Jain et al. [67] who suggested that these attributes provide valid information to recognize individuals. Their lack of distinctiveness is certainly not homogeneous; some labels such as gender, ethnicity, and skin color are described as global by some authors [108]. Others have a higher discriminative power like leg thickness. However, this lack of distinctiveness may be overcome by considering multiple soft biometric descriptors that together can define a signature that is robust and more distinctive (and more computational demanding) [108]. Certainly, multimodal strong biometrics is well known as a standard boosting approach [105]. However, focusing strictly on soft biometrics, multimodality was adopted by Alphonse Bertillon in the 19th century to register criminals based on biometric, morphological, and anthropometric measurements. Bertillon created a procedure, Bertillonage, to categorize individuals, with the advantage of generally speeding up the search process [39,96,104]. Most of the characteristics used by Bertillonage are nowadays considered soft biometric labels. Currently, the fusion of multiple soft biometric facial attributes, with the possibility of missing ones, is being used for identification, exploiting visual attributes and a bag of soft biometrics [39,73,96], as well as extending the fusion to body soft biometrics [1]. The multimodal soft biometric approach provides a solution in less constrained scenarios where strong biometrics is not available in a non-intrusive way.

Improved classical biometric performance. Considering that soft biometric fusion allows identification, it may be expected that its fusion with one or more available strong biometric traits will be relevant under highly variable conditions. As mentioned above, the first application of soft biometrics, following Bertillonage, for recognition was by Jain et al. [67]. These authors studied, for the first time, the fusion of classical biometric traits with soft biometric descriptors. In fact, they combined gender, ethnicity and height, with fingerprint recognition, improving recognition accuracy by around 6%. Soft biometric descriptors aid recognition rather than being used for identification, serving to discriminate between groups rather than individuals [67]. Along these lines, it is indeed known that FBI makes use of some offline demographic information to check fingerprint recognition [96]. Similarly, social context information has shown to add useful information to improve face identification [11].

Results achieved with face verification and recognition have recently confirmed [74,117,135] that soft biometrics provides useful additional discriminatory information [4]. In particular, soft biometrics can improve or compensate for the performance of low quality captures of classical strong traits, as they can be applied to more realistic scenarios such as “at a distance” or “on the move” situations. In these scenarios, likely face images are usually captured with low resolution and poor quality [125,127].

Search space indexation. The Bertillonage system categorizes individuals, thus speeding up the search process. This feature may be adopted not only to reduce search time in large biometric databases [133], but also in image retrieval from surveillance video footage or multimedia content databases; all these tasks are time consuming. Therefore, the aim of using soft biometric descriptors is to leverage the human operator by constraining the search to people within a specific group [38,86,96], speeding up the overall process, and thus reducing the processing costs. It should be noted that soft biometric attributes have a semantic meaning for humans to describe anonymous people [6], such as in a typical re-identification scenario, where unregistered people have been captured by non-overlapping camera networks. The re-identification problem refers to determining whether a target person has been observed by any other camera of the system at any moment [19]. It is also worth mentioning the existence of uncommon or unusual features, which are useful for a limited set of users [94,130], as shown in [69,101] with features such as scars, moles, freckles, acne, and wrinkles.

Demographic user profiling. Demographic profiling is used in marketing and broadcasting to describe a market segment in terms of age range, social class, and gender (or even urban tribe [77]). Customer profiling, for example, requires knowing as exactly as possible the number and demographic attributes of people entering/leaving a shopping area. Currently, an important aspect is automatic audience analytics using existing solutions, such as Quividi1 and TruMedia,2 for applications related to market analysis or public security.

In fact, there is the potential for a wide range of applications for marketing purposes. Individual demographic profiles may be used for offline customer analytics but also online ones, where demographic patterns serve to adapt or even restrict (drink vending machines that detect age) the offer to the user. In this sense, available digital signages may update their content in real time to the audience, catching their attention more effectively [33]. The concept of a consumer profile involves the creation of a mavatar (marketing avatar) [62]. Thus, advertising can be designed to target consumers belonging to a group, or even to make personalized offers, like, for example, FaceDeals.3

There is an emerging field of proposals to identify with sufficient accuracy who makes the purchasing decision in a group based on demographic characteristics and the size of the buying group [111]. This involves recognizing kinship [80,81,131] or family members in a group [37], and there are even commercial solutions to provide marketers with the emotional reaction of customers (Kairos,4 emotient,5 or affectiva6), or even the emotional state in a social relation between two or more people [136].

The rest of the chapter is divided into three sections. In the second section, we argue for the need for in the wild scenarios to evaluate biometric proposals in situations closer to real world applications. The third section focuses specifically on the gender classification problem, presenting some selected datasets and an overview of recent proposals and results. The chapter ends summarizing the main conclusions.

7.2 Biometrics in the Wild

So far, computer vision solutions have reported relevant results in several challenging problems. Focusing on biometric systems, recent proposals have been able to solve different tasks with high precision. However, their adaptation to real scenarios is still far from achieving similar performances, which is probably partially due to the limitations and bias present in the datasets used for experimental evaluations [124]. Considering vision based biometrics, in most cases these datasets have been collected under laboratory conditions making use of a single sensor, using similar image resolutions and illumination conditions.

However, real world applications must cope with capture variations that are far different from controlled laboratory conditions. Different internal and external factors hamper progress to solutions to a chosen biometric task. Internal complications can be illustrated by considering, for example, the facial trait, which is of great relevance for soft biometrics. Face muscles allow great plasticity; the use of facial make-up may also significantly affect the facial appearance, all of which lead to modifications to the viewed area. All of these are parameters that will present challenging changes to be dealt with by any automatic system. In addition, external factors comprise variability in terms of illumination conditions, background, distance, and capturing device [92].

Biometric recognition in unconstrained settings (outdoors, at a distance, without cooperation), commonly referred as “in the wild”, imposes restrictions on the acquisition procedure. Thus, this fact increases the domain where biometrics can be used, representing a far more demanding task, as there is a decrease in discriminability of the information obtained. In this sense, people analysis in real scenarios presents a wider collection of challenging situations to cope with due to highly variable lighting conditions, occlusions, weather, pose, expression, illumination, background, and resolution, which severely affect the intra-class variations. As an example, lower quality captures among other alterations degrades computer vision solutions' performance during biometric processing [5,54,57].

These observations added to the current large deployment of cameras is increasing the amount of available data captured by millions of sensors of a wide range of characteristics. This can provide information that may be analyzed to extract useful and valuable data, justifying the interest in tackling real world scenarios with biometric systems. For this purpose, it is necessary to build benchmarks that allow the unbiased evaluation of research proposals. These datasets must therefore contain wider and more challenging imagery closer to real situations as a test-bed for the development of new robust biometric solutions.

To sum up, there is an increasing interest in building datasets that reduce experimental bias [124], including ones with larger variability to better evaluate current solutions in situations closer to real world scenarios. This can be illustrated by the facial recognition problem. This is a field that has progressed far beyond still image recognition in controlled imaging environments. Face recognition “in the wild” must tackle face appearance that is evidently affected by internal and external factors such as age, pose, illumination, and expression (A-PIE) [44], and by artificial modifications due to plastic surgery [91], or hormonal treatments [90]. All these modifications may affect biometric recognition. Some efforts have already been made as, for example, the gathering of the Labeled Faces in the Wild (LFW) dataset [64]. This dataset contains face photographs designed for studying the problem of unconstrained face recognition. It is currently used as the reference benchmark for face recognition systems.

State-of-the-art face recognition with LFW is currently reporting accuracy of 99.5% [84]. Does this mean that face recognition “in the wild” has now been solved? The answer is no [76], suggesting the limitations of this particular dataset, that though showing a step forward, still seems to be far from truly wild conditions. The dataset lacks these full wild conditions, e.g., it includes high-quality images, avoids severe occlusions [44], all of which benefit recognition. In the best cases, wild or non-ideal conditions have been taken into account when building a dataset by diving into Internet photo albums, with the advantage of avoiding illumination, pose, sensor, and partially good resolution limitations. However, these images have been artificially selected to be contained in an album. Therefore, their processing results will certainly be affected by imagery that results in some quality and conditions being introduced by human filtering. This circumstance produces an undesirable effect when some solutions are over-fitted to restricted conditions, and therefore cannot be generalized to other unrestricted scenarios. However, data availability in wilder conditions is increasing, in line with community interest, although substantial effort in expensive annotation is needed. Below, is a list of some, though not all, of the initiatives that have been recently proposed for in the wild challenges and for different biometric related problems:

• Labeled Faces in the Wild (LFW) [64],

• Emotion Recognition in the Wild Challenge (EmotiW) [42],

• Acted Facial Expression in the Wild (AFEW)/Static Facial Expression in the Wild (SFEW) [113],

• Kinship Face in the Wild (KinFaceW) [80],

• International Workshop on Biometrics in the Wild 2015 [8],

• 300 Faces in the Wild [123],

• Quis–Campi [98,119], biometrics in a surveillance scenario.

In the next section, we will specifically focus on one of the global soft biometric attributes that have received the widest attention by the community – gender.

7.3 Gender Classification in the Wild

This section aims to provide an overview of the most relevant solutions related to gender classification (GC) in the wild. It provides a comprehensive starting point for those who are considering approaching this research field.

According to Bruce and Young [15], gender is one of the first attributes extracted from the face by human facial analysis. Indeed, it is obtained before face recognition. In computer vision, gender is probably the soft biometric attribute that has received the most attention. It is of great interest for different applications as well as for the recognition purposes mentioned above, since it is an important variable for social interactions, and commonly studied by marketers.

The GC problem is a bi-class task, which is performed effortlessly with high precision by humans. First attempts on GC using biometrics were based on geometrical features [13,45], but the current standard appearance-based approach was adopted by SexNet in the early 1990s [55].

Nowadays, GC is an active field of research. It should be noted that in 2015, the National Institute of Standards and Technology (NIST) edited for the first time a Face Recognition Vendor Tests (FRVT) report devoted to the problem [97]. This is the result of the recent attention received by the field and is reflected in different surveys [18,93,112], and recent proposals.

According to Dantcheva et al. [38], automated GC has been accomplished by making use of different biometric traits: face, iris, fingerprint, body, hand and speech. Among them, the most useful traits from a distance are face and body. As mentioned, most state-of-the-art GC solutions are based on the facial pattern in the visible spectrum, as suggested by recent surveys and leading proposals [6,7,9,23,35,66,93,97,99,129], but there is increasing interest beyond static 2D visible spectrum face-images to consider 3D data, as well as thermal and near-infrared face images [24,92].

In the standard GC approach, after detecting the biometric trait, the face in most cases, gender information is obtained in two stages: feature extraction and then classification [38,112]. The former extracts features from the biometric trait, the latter assigns the extracted features to one of two classes: male or female. However, many proposals also include, after trait detection, a normalization step. To illustrate this approach, let us consider the face as the input trait. Following face detection, some techniques extract features directly from the detection container, whereas others perform a light 2D normalization that fixes some main facial elements, such as the eye location, comprising 2D translation, scale, and rotation. Another more sophisticated and expensive 3D normalization, also referred to as frontalization, may involve 3D rotations, and is generally based on multi-fiducial points creating an Active Appearance Model (AAM) or similar.

Some authors have pointed out that observing only face patterns may lead to both restrictions of application, as an almost frontal face is needed, and perception errors or illusions [110]. Non-facial cues seem to be particularly important in real scenarios where degraded, low resolution, occluded or noisy images are present [120] rather than in a typical experimental setup with large, high quality facial images [70,120].

Indeed, the integration of non-facial features is consistent with the human perception that is able to perform GC, making use of just non-facial cues or hybrid approaches [16,17,20,70,126]. Among non-facial areas, we may mention external facial features [12,23,73,85,114], the individual clothing [21,47,82], the body [12,20,32,58,121], or a combination of these with other cues [63].

Of great interest in surveillance scenarios are methods based on the full body. However, even including wild conditions, the reduced size datasets used so far makes them very limited compared to facial-based ones. There is also the additional problem of frequently requiring the whole body view, a circumstance that is not always present. Despite these setbacks, body-based approaches are summarized in the next paragraphs.

Full body frontal and back views are used by Cao et al. [20]. The authors designed a part-based approach evaluated sing a subset of the MIT pedestrian dataset (888 images), achieving slightly better results for frontal images, around 76% accuracy. The reduced dataset dimensions indicate the difficulties in building a large series of benchmarks. The results are therefore not easily generalized. Back facing video sequences have been studied by Tan et al. [128] who designed a pyramid segmentation approach to process the sequence, achieving an accuracy of over 92% in a dataset containing 720 videos of 60 individuals.

The recent availability of low cost RGBD sensors provides the possibility of performing the task with 3D data. Linder et al. [87], instead of considering frontal body views, made use of RGBD data including side and back views as well. They built a dataset with 118 individuals using 3D point cloud data to achieve better accuracy, close to 90%, higher than appearance-based alternatives.

In the following subsections, we will give more details on face-based GC, first introducing the most used in the wild facial datasets for GC, and then summarizing other interesting proposals and results.

7.3.1 Datasets

GC in restricted datasets, i.e., those that have been collected under well-constrained environments, has reported very high classification rates that unfortunately have not been generalized to other independent datasets. This conclusion is evidenced by the GC results achieved by the Face Recognition Technology (FERET) dataset [102], a database containing 2413 high-resolution images of 856 individuals, see Fig. 7.2. Even though FERET was created to evaluate facial recognition, it has also been used to evaluate facial GC, as suggested by the Mäkinen et al. survey [93].

Image
Figure 7.2 Sample image from FERET dataset, original resolution 256 × 384 pixels.

Although several tests have achieved almost perfect performance for FERET, the resulting classifiers are not able to maintain a similar performance in unrestricted image collections, evidencing that the dataset bias advantage produces an optimistic GC performance [9]. In fact, different authors have suggested that correct classification rates of such classifiers trained with FERET drop substantially when tested with wild datasets, see, for example, Erdogmus et al. [43] who showed that after almost perfect GC on FERET, only 65.2% was achieved on LFW.

In this sense, recent GC work has shifted towards more challenging and unconstrained viewing conditions, following a similar development in facial recognition research, suggesting the community interest in evaluating GC in the wild, i.e., uncontrolled conditions related to capture (illuminations, sensor, etc.), and subject. This situation has also been remarked in the FRVT report, where large in-the-wild datasets mean greater variability in terms of (i) identity, age, and ethnicity, (ii) pose and illumination conditions, and (iii) image resolution.

In addition to the LFW and GROUPS in the wild datasets considered in the FRVT report, we will also briefly describe PubFig and Adience, and later mention other datasets referred to in the literature:

• Labeled Faces in the Wild (LFW) [64]. As mentioned above, this dataset was created collecting photographs to study the problem of unconstrained face recognition, which nowadays is the standard benchmark for this purpose [84]. The dataset contains more than 13,000 images of 5749 individuals collected from the web. Among them, 1680 have two or more photos in the dataset. Related to GC, we can mention some characteristics: (i) it contains several samples per individual, (ii) both classes are not balanced, and (iii) the inclusion of public figures introduces a selection bias [124]. As argued by Baluja and Rowley [14], GC results are biased when the same individual is present in both the training and test sets.

• Public Figures Face Database (PubFig) [72]. The dataset contains 58,797 images of 200 individuals taken in completely uncontrolled situations with non-cooperative subjects, presenting large variation in pose, lighting, expression, scene, camera, imaging conditions and parameters, etc. However, we are unaware of any previous work which used this set for GC, probably due to the low number of different identities.

• The Images of Groups (GROUPS) [51]. This dataset was created to study social interaction with images present in Flickr with more than one individual. The complete dataset contains 5080 images with 28,231 annotated faces in terms of gender and age group. According to the FRVT report [97], this database is currently the hardest for GC.

• Adience [41]. The authors claim to have included a far wider range of challenging real-world imaging conditions, compared to other datasets such as LFW or PubFig. The dataset comprises changes in appearance, noise, pose, lighting and more, without careful preparation or posing. Once again the sources are Flickr albums. The dataset includes 26,580 samples belonging to 2284 individuals.

The various dataset statistics are summarized in Table 7.1, and some dataset images are shown in Fig. 7.3, excluding PubFig, that, as far as we know, has no GC results reported.

Table 7.1

In the wild datasets statistics

Benchmark Faces Individuals Female Male
LFW 13,233 5749 2977 10,256
GROUPS 28,231 14,503 13,626
Adience 19,487 9411 8192
PubFig 58,797 200

Image

Image
Figure 7.3 Sample images from The images of Groups [51], The Labeled Faces in the Wild [64], and Adience datasets, respectively. Their respective original resolutions are 391 × 293, 249 × 249, and 816 × 816 pixels, suggesting a relevant difference in the facial pattern resolution.

As well as to the 2015 NIST report, the recent survey by Santarcangelo et al. [112] included the following alternatives in their list of unconstrained datasets for GC: ClothesDB [50], Genki-4K [53], and KinFace [122]. The limited number of samples contained in them, 931, 3000 and 600, respectively, compared to GROUPS, LFW, and Adience, led us to exclude them in the proposals summarized below.

We would also like to mention the dataset created by Satta et al. [114] to evaluate GC in children. Their aim was to tackle the difficulties, even for humans, to classify gender among children due to their lack of many gender-specific adult face traits. They have collected a dataset making use of standard face detection techniques, integrating contextual features to boost classification accuracy.

A final remark may be made on the influence of ethnicity and age in GC [10,52,56]. The former may be analogous to the human “other-race” effect, which is explained by psychologists with the “contact hypothesis”, that highlights the difficulties in extracting demographics from faces of other races [31].

Similarly, automatic systems might be positively affected by an ethnicity-balanced training set, or the alternative would be to create ethnicity-specific gender classifiers [48]. In any case, balanced datasets in terms of ethnicity have rarely been considered in the literature for GC. We should mention the EGA (Ethnicity, Gender, and Age) dataset [109], originally created for face recognition purposes, which has recently been evaluated for GC [25,26]. EGA has been created with images from other known face datasets. These samples have been chosen to reduce the distortions due to pose, illumination, and expression (PIE), to concentrate more on demographics. Unfortunately, these PIE restrictions and number of samples do not qualify the dataset as being in-the-wild.

7.3.2 Proposals Summary

As mentioned above, current state-of-the-art GC approaches achieve high precision based simply on visual facial features in controlled scenarios. This fact has also been proven with commercial solutions, as stated in the 2015 FRVT report on Performance of Automated Gender Classification Algorithms, which focused for the first time on GC [97]. This evaluation reported an accuracy of around 96.5% with an independent dataset containing roughly one million facial samples acquired under constrained conditions.

However, these results were not reproduced in datasets with unconstrained capture conditions or in the wild. Two of the above mentioned datasets were evaluated by Ngan et al. [97]: (i) LFW and (ii) GROUPS. In these datasets, available commercial solutions were not able to maintain similar classification rates in both cases. On the one hand, LFW seems to be a simpler dataset as the best accuracy reported by standard commercial solutions was 95.2%, certainly quite close to the numbers reported for constrained datasets. On the other hand, the accuracy achieved for GROUPS was significantly worse, hardly reaching 90.4%. This evidence confirms the difficult scenario represented by this particular dataset. This conclusion has already been highlighted by recent surveys and experimental evaluations [23,96].

Below we summarize GC results for three of the above mentioned in-the-wild datasets: LFW, GROUPS, and Adience. Observe that PubFig is not included due to the absence of experimental evaluations. Two scenarios are considered, depending on if the experimental evaluation includes a single or double datasets. When a single dataset is used, training and test sets are built based on the same dataset, a fact that may be affected by the dataset gathering protocol, or even include the same identity in both training and test sets. These results are commonly referred to in the literature as single/in/intra/within-database GC. A summary of recent results is reported in Table 7.2.

Table 7.2

GC accuracies in recent literature for LFW and GROUPS. Full datasets are used (28,000 samples for GROUPS or 13,233 for LFW) with the following exceptions: 1 Dago's protocol [36] containing around 14,000 samples with inter ocular distance larger than 20 pixels, 1bImage over 20-year-old individuals present in Dago's protocol, 2 22,778 automatically detected faces, 3 over 12-year-olds, 4 1978 facial images, 5 7443 of the total images, 6 BEFIT protocol

Reference Dataset Accuracy (%)
[36] GROUPS1 86.6
[27] GROUPS1 89.8
[23] GROUPS1 91.6
[88] GROUPS1 91.59
[30] GROUPS1 92.46
[28] GROUPS1 93.26
[29] GROUPS1 94.04
[23] GROUPS1b 94.28
[73] GROUPS2 86.4
[22] GROUPS2 90.4
[10] GROUPS3 80.5
[89] GROUPS4 93.3
[61] GROUPS 87.14
[41] GROUPS 88.6
[23] GROUPS 97.2
[115] LFW5 94.8
[129] LFW5 98.0
[106] LFW5 98.0
[36] LFW6 94.01
[43] LFW6 93.98
[88] LFW6 96.25
[10] LFW 79.5
[40] LFW 91.5
[118] LFW 94.6
[61] LFW 95.4
[83] LFW 94.0
[23] LFW 98.1
[41] Adience 76.1
[60] Adience 79.3
[79] Adience 86.8
[132] Adience 87.2

However, single dataset results may be overly optimistic. Indeed, the community knows that achieving high GC rates for a dataset does not generalize to all scenarios due to the influence of dataset bias [124]. Close to perfect accuracies in FERET [102] have been achieved. The work by Moghaddam and Yang [95] reported an accuracy of 96.62%, making use of pixel intensities and support vector machines (SVM). However, even more recent developments, trained with FERET, have not been able to reach significant accuracies on in-the-wild datasets, as, for example, those reported by Erdogmus et al. [43], who achieved an accuracy of just 65.2% on LFW. Therefore, a more challenging evaluation is referred to as cross-database GC where independent datasets are used for training and testing. As observed in Table 7.3, summarizing cross-database results, the achieved accuracies for the test dataset are commonly lower.

Table 7.3

Cross-database accuracies in the literature: 1 Dago's protocol [36] containing around 14,000 samples with inter ocular distance larger than 20 pixels, 2 over 20-year-olds, 3 automatically detected faces of over 20-year-olds, 4 single face per identity, 5 subset containing 10147 samples

Reference Training set Test set Accuracy
[9] FERET UCN 81.29
[9] PAL UCN 74.09
[103] MORPH LFW 75.10
[43] MORPH LFW 76.64
[23] MORPH LFW 88.70
[36] GROUPS1 LFW 89.77
[88] GROUPS1 LFW 94.48
[10] GROUPS2 LFW 79.53
[34] GROUPS3 LFW 91.62
[66] 4 million faces LFW 96.86
[2] CASIA WebFace [134] LFW5 97.1
[103] MORPH GROUPS 76.74
[23] MORPH GROUPS 72.32
[36] LFW GROUPS1 81.02
[88] LFW GROUPS1 83.03
[35] LFW GROUPS3 85.00
[23] LFW GROUPS 90.14
[41] Adience GROUPS 83.0
[41] GROUPS Adience 75.9

Image

Added to the training and test data used to evaluate the different approaches, the main differences among them are related to the features used and classification design adopted [96]. Assuming the state-of-the-art appearance-based methods, approaches are based on features such as intensity raw values, or local descriptors, which may later be filtered or not, while the classification may cover SVMs, linear discriminant, nearest neighbor, or neural networks among others.

Among the three datasets, it can be observed that GROUPS and LFW have received far more attention, with the former apparently being the most challenging of them. The recently arrived Adience has attracted interest in Convolutional Neural Networks (CNN) based solutions. Below, we present a rough chronological summarized description of the different approaches evaluated in the wild, considering single and cross-database results.

Prior to Dago et al. [36], GC evaluations mainly focused on single datasets such as FERET, covering a relative large population containing a significant number of individuals. However, the dataset high quality images and controlled capture conditions are rather different from the image variety of uncontrolled or real-world scenarios. Dago et al. presented results for LFW making use of the BEFIT protocol,7 and configured an experimental protocol for GROUPS using a subset with larger faces.8 In the study, they evaluated the use of LBP and Gabor features using LDA or SVM for classification. For single dataset GC, both features reached similar results, slightly better using Gabor jets, achieving an accuracy of 86.61% for GROUPS and 94.01% for LFW. They also included additional cross-database results, i.e., results after training with one dataset and testing with a different one to avoid dataset bias [124]. They were probably the first researchers interested in performing cross-database evaluations considering in the wild databases. When training with GROUPS and testing with LFW, they reported an accuracy of 89.77% using LBP and SVM, while the opposite combination reached only 81.02%.

Bekios et al. [9] also reported cross-database GC results, but their experiments were restricted to PAL, UCN, and FERET. Their approach combined LDA/PCA features with a Bayesian classifier, focusing on reaching similar to state-of-the-art GC rates with linear classification. On those datasets, they succeeded in achieving quite similar performances while reducing computational requirements needed by SVM based systems. A major conclusion was the proof that cross-database experiments are closer to real-world scenarios and demonstrated that single database experiments are optimistically biased.

In a subsequent study, Bekios et al. [10] tackled the in-the-wild scenario. Their study evaluated dependencies among gender, age, and pose. The authors integrated pose information in the classification, avoiding the initial need for face alignment. Their evaluation using a five-fold strategy reported an accuracy of 80.5% for single database GC in GROUPS.

More recently, other single dataset GC experiments have been carried out on LFW. Shan [115] extracted LBP features that were later classified using SVM to obtain an accuracy of 94.8% in LFW. Subsequently, Shafey et al. [118] reported results for FERET and LFW, making use of Total Variability (i-vectors) and Inter-Session Variability (ISV) modeling techniques, reaching an accuracy of 94.6% for LFW. Tapia et al. [129] compared the fusion of different LBP-based features, scales, and mutual information measures, reporting for LFW an accuracy of 98%. Ren and Li [106] evaluated two types of local descriptors (gradient features and Gabor wavelets), introducing a later selection step based on RealAdaBoost. A linear SVM was applied as a classifier, and results for FERET, KinFace, and LFW were presented. For LFW, they reported 98%, claiming it to be faster than other previous approaches with similar accuracy. Erdogmus et al. [43] explored the best grid setup for the LBP-based features extraction with BANCA and MOBIO databases. Later, they evaluated on FERET, MORPH, and LFW, the latter with the BEFIT protocol and achieved an accuracy of 98%.

Reducing the working image resolution, El Din et al. [40] designed a two-stage system that combined appearance and shape features in the first stage, activating the second stage only for images with low confidence. Their reported accuracy for small, normalized thumbnails, 16×16Image pixels, of LFW reached an accuracy of 91.5%.

Specifically using cross-database results for LFW, Jia and Cristianini [66] focused on assembling and labeling large datasets automatically, avoiding the workload of human annotation. They evaluated their approach with LFW after gathering four million images and achieved an accuracy of 96.86%. Antipov et al. [2] applied CNN to GC on LFW, the authors claimed to use 10 times less training samples than Jia and Cristianini [66], i.e., 400,000, for identical experimental setup. They reported an accuracy of 97.1%Image using three CNNs.

Chen and Gallagher [22] built a facial appearance representation based on the 100 most common names in the USA. Each pairwise name classifier provided a score, the whole collection was used to create the feature vector. The voting of the top five names is used to assign a gender to a test image. The achieved accuracy for GROUPS reached 90.4%. They claimed to beat any previous evaluation on GROUPS, including the application of the gender classifier designed by Kumar et al. [73] integrated into a face verification approach based on describable visual attributes, which using GROUPS gave rise to an accuracy of 86.4%.

Han and Jain [61] applied pose and photometric normalizations before extracting biologically inspired features (BIF), which are later classified with SVM to get age, gender, and race. Their experimental results on GROUPS and LFW reported 87.14% and 95.4% accuracy, respectively. Also for GROUPS, Fazl-Ersi et al. [46] reached an accuracy of 91.4% using a combination of different visual information sources. They combined LBP, SIFT, and color histograms after a feature selection stage. More recently, Mery and Bowyer [89] evaluated a technique based on the sparse representation of random patches achieving in a GROUPS subset containing roughly 2000 samples, an accuracy of 93.3%.

Danisman et al. [34] presented different cross-database results, making use of pixel intensities and SVM classification. Training with their dataset, WebDB, they reported an accuracy for LFW and GROUPS of 91.87% and 88.16%, respectively. In a more recent work [35], they have integrated a Fuzzy Inference System (FIS), combining inner and outer facial features training with GROUPS. The system achieved a performance of 93.35% when testing with LFW.

Eidinger et al. [41] reported results on GROUPS and Adience. Their approach based on LBP-like features and SVM reported accuracies of 87.5% and 76.1%, respectively. From these results, they claimed that Adience represents better real world conditions. More recently, Hassner et al. [60] focused on the frontalization problem to better synthesize the frontal face from a given view. The GC evaluations in Adience boosted accuracy up to 79.3%.

In relation to our work on GC in the wild, we initially proposed, similar to other authors, the integration of facial and non-facial features. More specifically, we focused on the face, and head and shoulder patterns (F and HS, see Fig. 7.4) [23,27,103], extracting features based on LBP and HOG. The best reported in-dataset accuracies for LFW and the GROUPS Dago's protocol were 95% and 91.65%, respectively, though the latter increased to 94.28% when considering only adults for training and testing. The exploration of a larger collection of local descriptors, grid resolutions and the combination with more densely extracted features from the periocular area (P) [30], and the mouth area (M) [28,29] (see Fig. 7.4) have more recently led to an improvement in performance. For the GROUPS Dago's protocol, the accuracy integrating P increased up to 93.54%, including both P and M areas achieved 94.04%.

Image
Figure 7.4 From left to right, head and shoulders (HS) (64 × 64 pixels), face (F) (59 × 65 pixels), periocular (P) (49 × 19 pixels), and mouth (M) (37 × 31 pixels) regions. Sample taken from GROUPS.

Deep learning has recently achieved significant results in different computer vision tasks. Facial analysis is a problem, where the pattern variation complexity is well suited for applications in the wild. Liu et al. [83] adopted this focus to extract up to 40 facial attributes from CelebFaces and LFW, they achieved an accuracy of 94% for GC in LFW. Levi and Hassner [79] apparently reported the first GC results based on Deep CNN for a hard in the wild dataset, particularly for Adience. Using the in-plane aligned version of faces, they increased performance to 86.8%, suggesting the need to work on better alignment and frontalization to boost performance further. Another conclusion was the evidence of a larger number of classification errors in babies and children, an aspect also observed in our studies related to GROUPS [23].

More recently, different studies have suggested combining CNN outputs and local descriptors, also called handcrafted features, for GC. Wolfshaar et al. [132] proposed the fusion of Deep-CNN and SVM for GC, reporting results from FERET and Adience datasets. The latter achieved an accuracy of 87.25%. Mansanet et al. [88] weighted local features and CNN outputs achieving single dataset GC for LFW and GROUPS of 96.25% and 90.58%, respectively. Their cross-database experiments reported 94.48% training with GROUPS and testing with LFW, and 83.03% for the reverse.

Considering this trend, we have also performed a study to compare GC performance based on the fusion of local descriptors versus CNN [23]. The rather similar accuracies achieved for LFW, GROUPS, and MORPH motivated us to evaluate the integration of the CNN outputs in the fusion approach. The outcome is an evident increase in GC performance, reporting state-of-the-art accuracies in both in-database and cross-database GC performance. For full in-database cross-validation, accuracies reached over 97%Image (98%Image in adults) and 99%Image for GROUPS and LFW, respectively. Related to cross-database results [23], training with LFW and testing with GROUPS reported 90.14% and, while for the reverse combination, accuracy was 98%, considering full datasets in both cases. These cross-database accuracies beat most literature in-dataset evaluations for both datasets.

7.3.3 Discussion

Observing the summarized results for single database experiments in Table 7.2, the first impression suggests that LFW accuracies are typically over 90%, achieving up to 98%, thus suggesting little room for improvement. LFW is the lightest in the wild datasets studied as its performance is similar to constrained datasets, and quite different from the typical results achieved for Adience and GROUPS, which clearly exhibit lower accuracies. Thus, both datasets seem to be closer to real world situations, including larger variations in terms of pose, background and resolution. The former is a recent dataset with relatively few experimental evaluations so far, with accuracies under 88%. The latter is, according to the 2015 NIST report, currently the most challenging dataset with difficulties to achieve results over 92%, similar to commercial systems. It is worth highlighting our own recent approach that combines handcrafted features and CNN reaching over 97.2% for the GROUPS dataset.

This conclusion is also confirmed by observing the cross-database results in Table 7.3, where, with the exception of testing with the biased FERET or LFW, accuracy hardly reaches 85%, showing a lack of accuracy of state-of-the-art solutions in cross-database GC in most cases. This is probably due to the special test dataset characteristics, which reduce the complexity. We therefore agree that high accuracy can be achieved for homogeneous, biased and/or reduced datasets of good quality, etc.

Another conclusion is the current challenge scenario is to provide high precision with cross-database classification, i.e., testing with a completely independent in the wild dataset. Cross-database GC is closer to real world applications and avoids any dataset bias. In realistic applications, a gender classifier is first trained with a set of images, and then deployed under different scenarios that may certainly differ from those of the training dataset.

This fact has been demonstrated several times in recent GC literature, single dataset experiments with FERET have reported very high GC rates [7,9,93,132], but the generalization of the classifier is poor if tested in the wild [132], with a significant drop in accuracy [6]. This observation is also confirmed in Table 7.3, when LFW is used for training, in which accuracy testing with GROUPS is remarkably lower than for the reverse situation. GROUPS dataset again highlights the difficulties. Unfortunately, we are aware of just one reported cross-database results testing with Adience, reducing the relevance of further conclusions.

The latest results that integrate CNNs suggest their utility. The characteristic of CNNs as end-to-end classifier reduces the need for human design to select the features to use. This was suggested by Perlin et al. [100] for GC using full or almost full body images. Specifically, ViPer [49] and HATdb [116] extract information from the upper and lower body added to GC performance. Under this approach, the feature extraction task is mainly performed by the CNN. However, several recently reported results combining handcrafted features and CNN [23,88,132] suggest an improvement in the overall performance, when CNN and features are combined, indicating a promising research avenue.

To explore future improvements in GC, we would like to mention the results presented by Klare et al. [74] and the proven connection between age and gender by Bekios et al. [10]. Two aspects are highlighted: (i) the need for training on datasets that are evenly distributed across demographics, to manage different demographic groups; and (ii) the interest in designing solutions for different demographic groups to increase performance on such groups.

The former is evidenced by cross-database results, where training with the hardest datasets, reported higher accuracies than in lighter ones. It is also likely to be affected by the demographic variations included. The latter agrees with the results reported in [23], removing under 20-year-old individuals from the Dago's protocol, and reaching significantly higher GC accuracy, up to 94.2%. Table 7.4 presents the GROUPS average faces per age range and gender, with some evident appearance differences.

Table 7.4

Mean facial patterns per gender and age group in GROUPS

Image

Given existing in the wild datasets, whose compilations have not taken into consideration the balance in terms of variables such as ethnicity, age, etc., it is not straightforward to obtain a general conclusion about the current leading solutions. Fig. 7.5 illustrates a subset of the classification errors obtained when evaluating GROUPS with the state-of-the-art classifier proposed in [23]. The upper and bottom rows present, respectively, females and males wrongly classified in the samples. For both classes only the five most distant to the classification border samples are shown, presenting them according to their output score. For females, two samples produced a score significantly farther from the class border. Both belong to elderly ladies, and is likely to be due to a poor representation of that demographic group within the training set. For male samples, age seems to be again a misclassification factor, but also annotation errors might be present. In any case, it is clear that there is still room for improvement.

Image
Figure 7.5 (Upper row) Sorted list of female samples wrongly classified as males, their respective output scores are −2.01,−1.82,−0.82,−0.82, and −0.79. (Bottom row) Sorted list of male samples wrongly classified as females, their respective output scores are 0.85, 0.81, 0.75, 0.73, and 0.64.

A final comment can be made related to difficult or ambiguous samples [25], El Din et al. [40] made use of a second classifier specialized in difficult or borderline samples, which may be an approach to be considered.

7.4 Conclusions

In this chapter, we have reviewed different approaches in human recognition focusing on soft biometrics. Soft biometrics refers to attributes or descriptors that are extracted from different traits like the face or body. Unlike strong biometrics such as fingerprints, soft biometrics does not have the ability to discriminate between any two different individuals, but does allow people to be grouped into semantic categories like gender, age, ethnicity, and so on. We have explained the different applications, where these traits can be used: identity recognition by means of multimodal fusion; improved classical biometric performance; reduced search space for people in large databases or surveillance systems, and to obtain demographic user profiling in marketing.

After the general introduction to soft biometrics, the need for in the wild experimental evaluation is argued for as the logical evolution of methods that were conceived for solving problems under restricted conditions, and have been successful. In the near future, soft biometrics must be able to tackle deployment in real world scenarios. A key element of this evolution is the need for benchmarks that reproduce conditions similar to those that the systems will face in real situations. Some initiatives in this direction have been presented such as the LFW or EmotiW datasets.

One of the most useful and applied soft biometric traits, gender, is analyzed in depth, focusing exclusively on real world applications and therefore reporting on GC proposals evaluated on in the wild datasets. Indeed, the pros and cons of each of most referenced in the wild public datasets have been analyzed. Additionally, related to the datasets, the two main experimental setups, namely in-database and cross-database are explained. Finally, a review of the different GC proposals, from the first ones to the most recent, was carried out, showing the performance of the different methods.

References

[1] Olasimbo Ayodeji Arigbabu, Sharifah Mumtazah Syed Ahmad, Wan Azizun Wan Adnana, Salman Yussof, Integration of multiple soft biometrics for human identification, Pattern Recognit. Lett. 2015;68:278–287.

[2] Grigory Antipov, Sid-Ahmed Berrania, Jean-Luc Dugelay, Minimalistic CNN-based ensemble model for gender prediction from face images, Pattern Recognit. Lett. Jan. 2016;70:59–65.

[3] Brett Allen, Brian Curless, Zoran Popovic, The space of human body shape: reconstruction and parameterization, ACM Trans. Graph. 2003;22(3):587–594.

[4] Donald Adjeroh, Bojan Cukic, Arun Ross, Research Challenges in Biometrics and Indexed Biography of Relevant Biometric Research Literature. [Technical Report] West Virginia University; 2014.

[5] Fernando Alonso-Fernandez, Julian Fierrez, Javier Ortega-Garcia, Quality measures in biometric systems, IEEE Secur. Priv. Dec. 2012;10(9):52–62.

[6] Yasmina Andreu, Pedro García-Sevilla, Ramón A. Mollineda, Face gender classification: a statistical study when neutral and distorted faces are combined for training and testing purposes, Image Vis. Comput. Jan. 2014;32(1):27–36.

[7] Luis A. Alexandre, Gender recognition: a multiscale decision fusion approach, Pattern Recognit. Lett. 2010;31(11):1422–1427.

[8] Bir Bhanu, Abdenour Hadid, Qiang Ji, Mark Nixon, Vitomir Štruc, Foreword – Biometrics in the Wild 2015, In: 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), 2015, vol. 2. 2015:1–2 10.1109/FG.2015.7284809.

[9] Juan Bekios-Calfa, José M. Buenaposada, Luis Baumela, Revisiting linear discriminant techniques in gender recognition, IEEE Trans. Pattern Anal. Mach. Intell. Apr. 2011;33(4):858–864.

[10] Juan Bekios-Calfa, José M. Buenaposada, Luis Baumela, Robust gender recognition by exploiting facial attributes dependencies, Pattern Recognit. Lett. Jan. 2014;36:228–234.

[11] Romil Bhardwaj, Gaurav Goswami, Richa Singh, Mayank Vatsa, Harnessing social context for improved face recognition, In: International Conference on Biometrics (ICB). 19–22 May 2015:121–126.

[12] Lubomir Bourdev, Subhransu Maji, Jitendra Malik, Describing people: a poselet-based approach to attribute classification, In: International Conference on Computer Vision (ICCV). 2011:1543–1550.

[13] Roberto Brunelli, Tomasso Poggio, Face recognition: features versus templates, IEEE Trans. Pattern Anal. Mach. Intell. 1993;15(10):1042–1052.

[14] Shumeet Baluja, Henry A. Rowley, Boosting sex identification performance, Int. J. Comput. Vis. 2007;71(1):111–119.

[15] Vicki Bruce, Andy Young, Understanding face recognition, Br. J. Psychol. 1986;77(3):305–327.

[16] Vicki Bruce, Andy Young, The Eye of the Beholder. Oxford University Press; 1998.

[17] Deng Cao, Cunjian Chen, D. Adjeroh, A. Ross, Predicting gender and weight from human metrology using a copula model, In: 2012 IEEE Fifth International Conference on Biometrics: Theory, Applications and Systems (BTAS). Sept. 2012:162–169.

[18] Pierluigi Carcagni, Marco Del Coco, Dario Cazzato, Marco Leo, Cosimo Distante, A study on different experimental configurations for age, race, and gender estimation problems, EURASIP J. Image Video Process. 2015.

[19] Dong Seon Cheng, Marco Cristani, Michele Stoppa, Loris Bazzani, Vittorio Murino, Custom pictorial structures for re-identification, In: British Machine Vision Conference. Dundee. Aug. 2011:1–11.

[20] Liangliang Cao, Mert Dikmen, Yun Fu, Thomas S. Huang, Gender recognition from body, In: Proceedings of the 16th ACM International Conference on Multimedia. 2008:725–728.

[21] Huizhong Chen, Andrew Gallagher, Bernd Girod, Describing clothing by semantic attributes, In: European Conference on Computer Vision (ECCV). 2012:609–623.

[22] Huizhong Chen, Andrew C. Gallagher, Bernd Girod, The hidden sides of names–face modeling with first name attributes, IEEE Trans. Pattern Anal. Mach. Intell. 2014;36(9):1860–1873.

[23] Modesto Castrillón-Santana, Javier Lorenzo-Navarro, Enrique Ramón-Balmaseda, Descriptors and regions of interest fusion for gender classification in the wild, Image Vis. Comput. 7 November 2016. in press, preprint arXiv:1507.06838.

[24] Cunjian Chen, Arun Ross, Evaluation of gender classification methods on thermal and near-infrared face images, In: International Joint Conference on Biometrics (IJCB). 2011:1–8.

[25] Modesto Castrillón-Santana, Maria De Marsico, Michele Nappi, Daniel Riccio, MEG: Multi-expert gender classification in a demographics-balanced dataset, In: 18th International Conference on Image Analysis and Processing (ICIAP). 2015.

[26] Modesto Castrillón-Santana, Maria De Marsico, Michele Nappi, Daniel Riccio, MEG: texture operators for multi-expert gender classification, Comput. Vis. Image Underst. 2016 10.1016/j.cviu.2016.09.004.

[27] Modesto Castrillón-Santana, Javier Lorenzo-Navarro, Enrique Ramón-Balmaseda, Improving gender classification accuracy in the wild, In: 18th Iberoamerican Congress on Pattern Recognition (CIARP). 2013:270–277.

[28] Modesto Castrillón-Santana, Javier Lorenzo-Navarro, Enrique Ramón-Balmaseda, Fusion of holistic and part based features for gender classification in the wild, In: New Trends in Image Analysis and Processing – ICIAP 2015 Workshops. Springer International Publishing; 2015:43–50.

[29] Modesto Castrillón-Santana, Javier Lorenzo-Navarro, Enrique Ramón-Balmaseda, Multi-scale score level fusion of local descriptors for gender classification in the wild, Multimed. Tools Appl. 2016 10.1007/s11042-016-3653-2.

[30] Modesto Castrillón-Santana, Javier Lorenzo-Navarro, Enrique Ramón-Balmaseda, On using periocular biometric for gender classification in the wild, Pattern Recognit. Lett. 15 October 2016;82(2):181–189.

[31] Patrick Chiroro, Tim Valentine, An investigation of the contact hypothesis of the own-race bias in face recognition, Q. J. Exp. Psychol. 1995;48(4):879–894.

[32] Matthew Collins, Jianguo Zhang, Paul Miller, Hongbin Wang, Full body image feature representations for gender profiling, In: IEEE International Conference on Computer Vision Workshops (ICCVW). 2009:1235–1242.

[33] In: Cosimo Distante, Sebastiano Battiato, Andrea Cavallaro, eds. Video Analytics for Audience Measurement (VAAM). Springer; 2014.

[34] Taner Danisman, Ioan Marius Bilasco, Chabane Djeraba, Cross-database evaluation of normalized raw pixels for gender recognition under unconstrained settings, In: 22nd International Conference on Pattern Recognition (ICPR). 2014:3144–3149.

[35] Taner Danisman, Ioan Marius Bilasco, Jean Martinet, Boosting gender recognition performance with a fuzzy inference system, Expert Syst. Appl. 2015;42:2772–2784.

[36] Pablo Dago-Casas, Daniel González-Jiménez, Long Long-Yu, José Luis Alba-Castro, Single- and cross-database benchmarks for gender classification under unconstrained settings, In: Proc. First IEEE International Workshop on Benchmarking Facial Image Analysis Technologies. 2011:2152–2159.

[37] Qieyun Dai, Peter Carr, Leonid Sigal, Derek Hoiem, Family member identification from photo collections, In: Proceedings of the 2015 IEEE Winter Conference on Applications of Computer Vision. 2015:982–989.

[38] Antitza Dantcheva, Petros Elia, Arun Ross, What else does your biometrics data reveal? A survey on soft biometrics, IEEE Trans. Inf. Forensics Secur. 2016;11:441–467.

[39] Antitza Dantcheva, Carmelo Velardo, Angela D'Angelo, Jean-Luc Dugelay, Bag of soft biometrics for person identification new trends and challenges, Multimed. Tools Appl. 2011;51:739–777.

[40] Yomna Safaa El-Din, Mohamed N. Moustafa, Hani Mahdi, Gender classification using mixture of experts from low resolution facial images, In: Multiple Classifier Systems. Lecture Notes in Computer Science. Springer; 2013;vol. 7872:49–60.

[41] Eran Eidinger, Roee Enbar, Tal Hassner, Age and gender estimation of unfiltered faces, Special Issue on Facial Biometrics in the Wild. IEEE Trans. Inf. Forensics Secur. Dec. 2014;9(12):2170–2179.

[42] Abhinav Dhall, Roland Goecke, Jyoti Joshi, Karan Sikka, Tom Gedeon, The second emotion recognition in the wild challenge (emotiw 2014), In: Proceedings of the 16th International Conference on Multimodal Interaction. 2014:461–466 10.1145/2663204.2666275.

[43] Nesli Erdogmus, Matthias Vanoni, Sébastien Marcel, Within- and cross-database evaluations for face gender classification via befit protocols, In: IEEE 16th International Workshop on Multimedia Signal Processing (MMSP). 2014:1–6.

[44] Aly A. Farag, Bir Bhanu, Edwin R. Hancock, Gerard Medion, Jie Yang, Guest editorial special issue on facial biometrics in the wild, IEEE Trans. Inf. Forensics Secur. Dec. 2014;9(12):2019–2023.

[45] Jean-Marc Fellous, Gender discrimination and prediction on the basis of facial metric information, Vis. Res. 1997;37(14):1961–1973.

[46] Ehsan Fazl-Ersi, M. Esmaeel Mousa-Pasandi, Robert Laganiere, M. Awad, Age and gender recognition using informative features of various types, In: International Conference on Image Processing. 2014.

[47] David Freire-Obregón, Modesto Castrillón-Santana, Javier Lorenzo-Navarro, Enrique Ramón-Balmaseda, Automatic clothes segmentation for soft biometrics, In: Proceedings of IEEE International Conference on Image Processing (ICIP). 2014:4972–4976.

[48] Wei Gao, Haizhou Ai, Face gender classification on consumer images in a multiethnic environment, In: Advances in Biometrics. Springer; 2009:169–178.

[49] Doug Gray, Shane Brennan, Hai Tao, Evaluating appearance models for recognition, reacquisition, and tracking, In: IEEE International Workshop on Performance Evaluation for Tracking and Surveillance (PETS). 2007.

[50] Andrew Gallagher, Tsuhan Chen, Clothing cosegmentation for recognizing people, In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR). 2008.

[51] Andrew Gallagher, Tsuhan Chen, Understanding images of groups of people, In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR). 2009:256–263.

[52] Guodong Guo, Charles R. Dyer, Yun Fu, Thomas S. Huang, Is gender recognition affected by age? In: IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops). 2009:2032–2039.

[53] The Machine Perception Lab GENKI Database, 2009.

[54] Suriya Gunasekar, Joydeep Ghosh, Alan C. Bovik, Face detection on distorted images augmented by perceptual quality-aware features, IEEE Trans. Inf. Forensics Secur. Dec. 2014;9(12):2119–2131.

[55] Breatice A. Golomb, David T. Lawrence, Terrence J. Sejnowski, SexNet: a neural network identifies sex from human faces, In: Advances in Neural Information Processing Systems. Morgan Kaufmann; 1991:572–577.

[56] Guodong Guo, Guowang Mu, A framework for joint estimation of age, gender and ethnicity on a large database, Image Vis. Comput. 2014;32:761–770.

[57] Javier Galbally, Sébastien Marcel, Julian Fierrez, Image quality assessment for fake biometric detection: application to iris, fingerprint and face recognition, IEEE Trans. Image Process. Feb. 2014;23(2):710–724.

[58] Guodong Guo, Guowang Mu, Yun Fu, Alade Tokuta, Gender from body: a biologically-inspired approach with manifold learning, In: Ninth Asian Conference on Computer Vision (ACCV). 2009:236–245.

[59] Walter Heflin, Brian Scheirer, Terrance E. Boult, Detecting and classifying scars, marks, and tattoos found in the wild, In: IEEE Fifth International Conference on Biometrics: Theory, Applications and Systems (BTAS). 2012:31–38.

[60] Tal Hassner, Shai Harel, Eran Paz, Roee Enbar, Effective face frontalization in unconstrained images, In: Computer Vision and Pattern Recognition. 2015:4295–4304.

[61] Hu. Han, Anil K. Jain, Age, Gender and Race Estimation From Unconstrained Face Images. [Technical Report MSU-CSE-14-5] Michigan State University; 2014.

[62] Andrew Harrison, Brian Mennecke, Anicia Peters, Marketing avatars revisited: a commentary on facial recognition and embodied representations in consumer profiling, Bus. Horiz. 2014;57(1):21–26.

[63] Abdenour Hadid, Matti Pietikäinen, Combining appearance and motion for face and gender recognition from videos, Pattern Recognit. Nov. 2009;42(11):2818–2827.

[64] Gary B. Huang, Manu Ramesh, Tamara Berg, Erik Learned-Miller, Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. [Technical Report 07-49] Amherst: University of Massachusetts; Oct. 2007.

[65] Dimosthenis Ioannidis, Dimitrios Tzovaras, Gabriele Dalle Mura, Marcello Ferro, Gaetano Valenza, Alessandro Tognett, Giovanni Pioggia, Gait and anthropometric profile biometrics: a step forward, In: Second Generation Biometrics: The Ethical, Legal and Social Context. The International Library of Ethics, Law and Technology. Springer; 2012;vol. 11:105–127.

[66] Jia Sen, Nello Cristianini, Learning to classify gender from four million images, Pattern Recognit. Lett. 2015;58:35–41.

[67] Anil K. Jain, Sarat C. Dass, Karthik Nandakumar, Soft biometric traits for personal recognition systems, In: International Conference on Biometric Authentication. 2004:731–738.

[68] Anil K. Jain, Ajay Kumar, Biometrics of next generation: an overview, In: Second Generation Biometrics. Springer; 2012:49–79.

[69] Anil K. Jain, Unsang Park, Facial marks: soft biometric for face recognition, In: International Conference on Image Processing. 2009:37–40.

[70] Izzat Jarudi, Pawan Sinha, Relative Roles of Internal and External Features in Face Recognition. [Technical Report Memo 225] CBCL; 2005.

[71] Hjalmar S. Kühl, Tilo Burghardt, Animal biometrics: quantifying and detecting phenotypic appearance, Trends Ecol. Evol. July 2013;28(7):432–441.

[72] Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, Shree K. Nayar, Attribute and simile classifiers for face verification, In: International Conference on Computer Vision (ICCV). 2009:365–372.

[73] Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, Shree K. Nayar, Describable visual attributes for face verification and image search, IEEE Trans. Pattern Anal. Mach. Intell. Oct. 2011:1962–1977.

[74] Brendan F. Klare, Mark J. Burge, Richard W. Vorder Bruegge, Joshua C. Klontz, Anil K. Jain, Face recognition performance: role of demographic information, IEEE Trans. Inf. Forensics Secur. 2012;7(7).

[75] Brendan Klare, Anil K. Jain, On a taxonomy of facial features, In: Fourth IEEE International Conference on Biometrics: Theory Applications and Systems (BTAS). 2010:1–8.

[76] Joshua C. Klontz, Anil K. Jain, A case study of automated face recognition: the Boston marathon bombings suspects, Computer 2013;46(11):91–94.

[77] Iljung Sam Kwak, Ana Cristina Murillo, David Kriegman, Serge Belongie, From bikers to surfers: visual recognition of urban tribes, In: British Machine Vision Conference (BMVC). 2013.

[78] Santosh Kumar, Sanjay Kumar Singh, Biometric recognition for pet animal, J. Softw. Eng. Appl. 2014;7(5):470–482.

[79] Gil Levi, Tal Hassner, Age and gender classification using convolutional neural networks, In: IEEE Workshop on Analysis and Modeling of Faces and Gestures (AMFG), at the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). Boston. June 2015:34–42.

[80] Jiwen Lu, Junlin Hu, Venice Erin Liong, Xiuzhuang Zhou, Andrea Bottino, Ihtesham Ul Islam, Tiago Figueiredo Vieira, Xiaoqian Qin, Xiaoyang Tan, Songcan Chen, Shahar Mahpod, Yosi Keller, Lilei Zheng, Khalid Idrissi, Christophe Garcia, Stefan Duffner, Atilla Baskurt, Modesto Castrillón-Santana, Javier Lorenzo-Navarro, The FG 2015 kinship verification in the wild evaluation, In: Proc. IEEE International Conference on Automatic Face and Gesture Recognition (FG). 2015:1–7.

[81] Jiwen Lu, Junlin Hu, Xiuzhuang Zhou, Jie Zhou, Modesto Castrillón-Santana, Javier Lorenzo-Navarro, Lu Kou, Yuanyuan Shang, Andrea Bottino, Tiago Figuieiredo Vieira, Kinship verification in the wild: the first kinship verification competition, In: International Joint Conference on Biometrics (IJCB). 2014:1–6.

[82] Bing Li, Xiao-Chen Lian, Bao-Liang Lu, Gender classification by combining clothing, hair and facial component classifiers, Neurocomputing Jan. 2012;76(1):18–27.

[83] Ziwei Liu, Ping Luo, Xiaogang Wang, Xiaoou Tang, Deep learning face attributes in the wild, In: International Conference on Computer Vision. 2015.

[84] Erik Learned-Miller, Gary Huang, Aruni RoyChowdhury, Haoxiang Li, Gang Hua, Labeled faces in the wild: a survey, In: Michal Kawulok, M. Emre Celebi, Bogdan Smolka, eds. Advances in Face Detection and Facial Image Analysis. Springer; 2016:189–248.

[85] Ágata Lapedriza, David Masip, Jordi Vitriá, On the use of external features for face verification, J. Multim. 2006;1(4):11–20.

[86] Javier Lorenzo-Navarro, Modesto Castrillón-Santana, Daniel Hernández-Sosa, On the use of simple geometric descriptors provided by RGB-D sensors for re-identification, Sensors 2013;13(7):8222–8238.

[87] Thomas Linder, Sven Wehner, Kai O. Arras, Real-time full-body human gender recognition in (RGB)-d data, In: IEEE International Conference on Robotics and Automation (ICRA). 2015:3039–3045.

[88] Jordi Mansanet, Alberto Albiol, Roberto Paredes, Local deep neural networks for gender recognition, Pattern Recognit. Lett. 2016;70:80–86.

[89] Domingo Mery, Kevin Bowyer, Automatic facial attribute analysis via adaptive sparse representation of random patches, Pattern Recognit. Lett. 2015;68:260–269.

[90] Gayathri Mahalingam, Karl Ricanek Jr., A. Midori Albert, Investigating the periocular-based face recognition across gender transformation, IEEE Trans. Inf. Forensics Secur. December 2014;9(12):2180–2192.

[91] Maria De Marsico, Michele Nappi, Daniel Riccio, Harry Wechsler, Robust face recognition after plastic surgery using region-based approaches, Pattern Recognit. Apr. 2015;48(4):1261–1276.

[92] Maria De Marsico, Michele Nappi, Massimo Tistarelli, Face recognition in adverse conditions, In: IGI Global. 2014.

[93] Erno Mäkinen, Roope Raisamo, Evaluation of gender classification methods with automatically detected and aligned faces, IEEE Trans. Pattern Anal. Mach. Intell. Mar. 2008;30(3):541–547.

[94] Gian Luca Marcialis, Fabio Roli, Daniele Muntoni, Group-specific face verification using soft biometrics, J. Vis. Lang. Comput. 2009:101–109.

[95] Baback Moghaddam, Ming-Hsuan Yang, Learning gender with support faces, IEEE Trans. Pattern Anal. Mach. Intell. 2002;24(5):707–711.

[96] Mark Nixon, Paulo Correia, Kamal Nasrollahi, Thomas B. Moeslund, Abdenour Hadid, Massimo Tistarelli, On soft biometrics, Pattern Recognit. Lett. December 2015;68(2):218–230.

[97] Mei Ngan, Patrick Grother, Face Recognition Vendor Test (FRVT) Performance of Automated Gender Classification Algorithms. [Technical Report NIST IR 8052] National Institute of Standards and Technology; Apr. 2015.

[98] Joao C. Neves, Gil Santos, Sílvio Filipe, Emanuel Grancho, Silvio Barra, Fabio Narducci, Hugo Proença, Quis-campi: extending in the wild biometric recognition to surveillance environments, In: New Trends in Image Analysis and Processing – ICIAP 2015 Workshops. 2015:59–68.

[99] Choon-Boon Ng, Yong-Haur Tay, Bok-Min Goi, A review of facial gender recognition, Pattern Anal. Appl. July 2015;18:739–755.

[100] Hugo Alberto Perlin, Heitor Silvério Lopes, Extracting human attributes using a convolutional neural network approach, Pattern Recognit. Lett. 15 December 2015;68(2):250–259.

[101] Unsang Park, Shengcai Liao, Brendan Klare, Jimmy Voss, Anil K. Jain, Face Finder: Filtering a Large Face Database Using Scars, Marks and Tattoos. [Technical Report TR11] Michigan State University; 2011.

[102] P. Jonathon Phillips, Harry Wechsler, Jeffery Huang, Patrick J. Rauss, The FERET database and evaluation procedure for face-recognition algorithms, Image Vis. Comput. 1998;16(5):295–306.

[103] Enrique Ramón-Balmaseda, Javier Lorenzo-Navarro, Modesto Castrillón-Santana, Gender classification in large databases, In: 17th Iberoamerican Congress on Pattern Recognition (CIARP). 2012:74–81.

[104] Henry T.F. Rhodes, Alphonse Bertillon: Father of Scientific Detection. Literary Licensing, LLC; 2013.

[105] Arun Ross, Anil K. Jain, Multimodal biometrics: an overview, In: 12th European Signal Processing Conference (EUSIPCO). 2004:1221–1224.

[106] Haoyu Ren, Ze-Nian Li, Gender recognition using complexity-aware local features, In: International Conference on Pattern Recognition. 2014:2389–2394.

[107] Daniel Reid, Mark Nixon, Imputing human descriptions in semantic biometrics, In: ACM Workshop on Multimedia in Forensics, Security and Intelligence. 2010.

[108] Daniel A. Reid, Sina Samangooei, Cunjian Chen, Mark S. Nixon, Arun Ross, Soft biometrics for surveillance: an overview, In: Machine Learning: Theory and Applications. Handbook of Statistics. Elsevier; 2013;vol. 31:327–352.

[109] D. Riccio, G. Tortora, M. De Marsico, H. Wechsler, EGA – ethnicity, gender and age, a pre-annotated face database, In: 2012 IEEE Workshop on Biometric Measurements and Systems for Security and Medical Applications (BIOMS). 2012:1–8.

[110] Richard Russell, A sex difference in facial contrast and its exaggeration by cosmetics, Perception 2009;38(8):1211–1219.

[111] Robert Ravnik, Franc Solina, Vesna Zabkar, Modelling in-store consumer behaviour using machine learning and digital signage audience measurement data, In: Video Analytics for Audience Measurement – First International Workshop (VAAM). Revised Selected Papers. Aug. 24. Lecture Notes in Computer Science. Stockholm, Sweden: Springer; 2014;vol. 8811:53–65.

[112] Vito Santarcangelo, Giovanni Maria Farinella, Sebastiano Battiato, Gender recognition: methods, datasets and results, In: IEEE International Conference on Multimedia & Expo Workshops (ICMEW). 2015:1–6.

[113] Abhinav Dhall, Roland Goecke, Simon Lucey, Tom Gedeon, Static facial expression analysis in tough conditions: data, evaluation protocol and benchmark, in: IEEE ICCV 2011 Workshop BEFIT.

[114] R. Satta, J. Galbally, L. Beslay, Children gender recognition under unconstrained conditions based on contextual information, In: 22nd IEEE International Conference on Pattern Recognition (ICPR). Stockholm, Sweden. 2014.

[115] Caifeng Shan, Learning local binary patterns for gender classification on real-world face images, Pattern Recognit. Lett. 2012;33:431–437.

[116] Gaurav Sharma, Frederic Jurie, Learning discriminative spatial representation for image classification, In: British Machine Vision Conference. 2011.

[117] Walter J. Scheirer, Neeraj Kumar, Karl Ricanek, N. Belhumeur, Terrance E. Boult, Fusing with context: a Bayesian approach to combining descriptive attributes, In: International Joint Conference on Biometrics (IJCB). 2011:1–8.

[118] Laurent El Shafey, Elie Khoury, Sebastien Marcel, Audio-visual gender recognition in uncontrolled environment using variability modeling techniques, In: International Joint Conference on Biometrics. 2014:1–8.

[119] SOCIA-LAB, Soft Computing and Image Analysis Group, International challenge on biometric recognition in the wild (ICB-RW).

[120] Pawan Sinha, Tomasso Poggio, I think I know that face ..., Nature 1996;384(6608):384–404.

[121] Carlos Serra-Toro, Vicente Javier Traver-Roig, Raúl Montoliú-Colás, José Martínez-Sotoca, On the importance of the grid size for gender recognition using full body static images, In: International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP). 2011.

[122] Ming Shao, Siyu Xia, Yun Fu, Genealogical face recognition based on UB KinFace database, In: IEEE CVPR Workshop on Biometrics. 2011.

[123] Jie Shen, Stefanos Zafeiriou, Grigorios S. Chrysos, Jean Kossaifi, Georgios Tzimiropoulos, Maja Pantic, The first facial landmark tracking in-the-wild challenge: Benchmark and results, In: IEEE International Conference on Computer Vision Workshops (ICCVW). 2015.

[124] Antonio Torralba, Alexei A. Efros, Unbiased look at dataset bias, In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2011:1521–1528.

[125] Pedro Tome, Julian Fierrez, Ruben Vera-Rodriguez, Mark S. Nixon, Soft biometrics and their application in person recognition at a distance, IEEE Trans. Inf. Forensics Secur. March 2014;9(3):464–475.

[126] Umar Toseeb, David R.T. Keeble, Eleanor J. Bryant, The significance of hair for face recognition, PLoS ONE 2012;7, e34144.

[127] In: Massimo Tistarelli, Stan Z. Li, Ramalingam Chellappa, eds. Handbook of Remote Biometrics for Surveillance and Security. New York, NY, USA: Springer-Verlag; 2009.

[128] Hao Tang, Hong Liu, Wei Xiao, Gender classification using pyramid segmentation for unconstrained back-facing video sequences, In: ACM Multimedia. 2015:1183–1186.

[129] Juan E. Tapia, Claudio A. Pérez, Gender classification based on fusion of different spatial scale features selected by mutual information from histogram of LBP, intensity and shape, IEEE Trans. Inf. Forensics Secur. 2013;8(3):488–499.

[130] Mazhuvancherry K. Unnikrishnan, How is the individuality of a face recognized? J. Theor. Biol. 2009;261(3):469–474.

[131] Tiago F. Vieira, Andrea Bottino, Aldo Laurentini, Matteo De Simone, Detecting siblings in image pairs, Vis. Comput. Dec. 2014;30(12):1333–1345.

[132] Jos van de Wolfshaar, Mahir F. Karaaba, Marco A. Wiering, Deep convolutional neural networks and support vector machines for gender recognition, In: IEEE Symposium Series on Computational Intelligence: Symposium on Computational Intelligence in Biometrics and Identity Management. 2015.

[133] James Wayman, Anil K. Jain, Davide Maltoni, Dario Maio, Biometric Systems: Technology, Design and Performance Evaluation. Springer Verlag; 2005.

[134] Dong Yi, Zhen Lei, Shengcai Liao, Stan Z. Li, Learning face representation from scratch, arXiv:1411.7923; 2014.

[135] Hao Zhang, J. Ross Beveridge, Bruce A. Draper, P. Jonathon Phillips, On the effectiveness of soft biometrics for increasing face verification rates, Comput. Vis. Image Underst. 2015;137:50–62.

[136] Zhanpeng Zhang, Ping Luo, Chen Change Loy, Xiaoou Tang, Learning social relation traits from face images, In: International Conference on Computer Vision. 2015.


1  www.quividi.com.”

2  www.tru-media.com.”

3  redpepper.land/lab/facedeals.”

4  www.kairos.com.”

5  emotient.com/.”

6  www.affectiva.com/.”

7  http://fipa.cs.kit.edu/431.php.”

8  “Also referred as Dago's protocol, currently also available at BEFIT site, visit http://i14s50.anthropomatik.kit.edu/431.php.”

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.188.113.80