Year: 2012 | Volume: 29 | Issue: 1 | Pages: 54-75
Multimodal Biometric Person Authentication: A Review
Soyuj Kumar Sahoo, Tarun Choubisa, SR Mahadeva Prasanna
Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India
Date of Web Publication: 23-Feb-2012
Correspondence Address: Soyuj Kumar Sahoo, Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India
Abstract
This paper provides a review of multimodal biometric person authentication systems. The paper begins with an introduction to biometrics, their advantages and disadvantages, and authentication systems built using them. A brief discussion on the selection criteria for different biometrics is also given. This is followed by a discussion on the classification of biometric systems, their strengths, and their limitations. Detailed descriptions of the multimodal biometric person authentication system, its different modes of operation, and integration scenarios are also provided. Considering the importance of information fusion in the multi-biometric approach, a separate section is dedicated to the different levels of fusion, which include sensor-level, feature-level, score-level, rank-level, and abstract-level fusion, as well as different rules of fusion. This paper also presents an overview of some performance parameters and error rates for biometric person authentication systems. A separate section is devoted to recent trends in the biometrics field, namely, adaptive biometric systems, analysis of complementary and supplementary information, and physiological biometrics. The paper concludes with a discussion on the issues that are currently holding back the deployment of multimodal biometric person authentication systems and the possible scope for future work.
Keywords: Biometrics, Person authentication, Unimodal, Multimodal, Fusion, Performance parameters, Adaptive biometric system
How to cite this article:
Sahoo SK, Choubisa T, Mahadeva Prasanna SR. Multimodal Biometric Person Authentication: A Review. IETE Tech Rev 2012;29:54-75.
1. Introduction
In today's world, where technology is growing at a rapid pace, several person authentication issues still need to be handled in daily life. A few of these are: Is Arjuna entitled to access this server or privileged information? Does Bhima have any past criminal record? Does Chanakya have the authorization to access this facility? Is Duryodhana the person who committed the crime? Biometrics is a reasonably dependable answer to all these questions. Person recognition can be performed by different methods based on what we know (knowledge based, e.g., password, PIN), what we have (token based, e.g., ATM card, credit card, smart card), and what we are (biometric based, e.g., face, speech, gait). A password or card can be shared, forgotten, or stolen, but not a biometric. Acquiring someone's biometric is more complex than trying combinations of digits or stealing a card. In this way, a biometric is more secure than knowledge- and token-based approaches. Passwords should ideally be different for different applications, whereas the same biometric can be used for most applications, which avoids bookkeeping. Therefore, biometrics can either replace or supplement existing technologies, as the former have several advantages over traditional person authentication by non-biometric means.
Biometric comes from the Greek words bios (life) and metron (measure); hence, a biological measurement is termed a biometric. It refers to a person's physiological (e.g., face, speech, fingerprint, iris) or behavioral (e.g., signature, gait, and also speech) characteristics. Physiological biometrics are related to the shape of the body and are generally more stable. Behavioral biometrics are related to the behavior of the person and are comparably less stable. Hence, a proper selection of the biometric is even more important than building the person authentication system that uses it. The block diagram of a biometric-based person recognition system is given in [Figure 1].
To train any person authentication system, the biometric data are processed for feature extraction. Possibly the most important aspect of any biometric-based authentication system is the selection of an appropriate feature set, which should be reasonably invariant under different degradations. Modeling is done to build a template for every person in such a way that it holds all the variations captured by that particular biometric. In the testing phase, the same features are computed from the unknown test biometric data and then compared with the model of each person. This is accomplished in the pattern matching stage. Finally, after pattern matching, the acceptance or rejection of the person is obtained as the output result.
Feature Extraction: Biometrics capture the physiological or behavioral characteristics of a person and hence carry unique information about his/her gender, physique, emotion, behavior, etc. The person information present in a particular biometric is of prime interest in a biometric-based authentication task. The feature extraction stage extracts this person information from the input biometric data. The objective is to effectively extract the person information for discrimination, at a reduced data rate, for modeling and comparison. The performance of any biometric system depends upon the discriminating ability of the selected features and their robustness against degradation. Therefore, the selection of proper features plays a key role in a biometric authentication task.
Some of the commonly used transform-domain features are spectral, cepstral, supervector, and subspace features. Ideally, the feature extraction technique depends upon the underlying biometric data used in the authentication task. The physiological characteristics of a person are more stable; hence, face and fingerprint are most used in various applications of biometric person authentication. Some of the standard feature extraction techniques extensively used in face verification are principal component analysis (PCA)-based eigenfaces, linear discriminant analysis (LDA)-based fisherfaces, Gabor filter-based features, wavelet coefficients, and features based on fractals and iterated function systems, which are briefly explained by Abate et al. In the case of fingerprint classification, a few of the most successful feature extraction techniques, such as core point detection, minutiae maps, orientation maps, orientation collinearity maps, Gabor feature maps, and multi-channel and structure-based techniques, are explained in the literature. On the other hand, the most frequently used behavioral biometrics in recognition tasks are speech and signature. Some commonly used feature extraction techniques in the case of speech are mel-frequency cepstral coefficients (MFCC), dynamic MFCC, linear prediction cepstral coefficients, and the non-parametric linear prediction residual-derived cepstral coefficients. In the case of the signature biometric, vertical and horizontal projection profiles and the discrete cosine transform (DCT) are generally used for feature extraction.
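As an illustration of the subspace idea, the PCA-based eigenfaces approach can be sketched in a few lines of numpy. This is a minimal illustrative sketch with toy random data, not the implementation of any of the surveyed systems:

```python
import numpy as np

def eigenfaces(images, k):
    """Compute the mean face and the top-k eigenfaces (principal
    components) from flattened face images, shape (n_images, n_pixels)."""
    mean = images.mean(axis=0)
    centered = images - mean
    # Rows of vt are orthonormal principal directions of the data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def project(image, mean, components):
    """Project one flattened image onto the eigenface subspace; the
    resulting coefficients act as the feature vector."""
    return components @ (image - mean)

# Toy data: 6 random "images" of 16 pixels, reduced to 3 coefficients.
rng = np.random.default_rng(0)
faces = rng.normal(size=(6, 16))
mean, comps = eigenfaces(faces, k=3)
coeffs = project(faces[0], mean, comps)
print(coeffs.shape)  # (3,)
```

In a real system, `images` would be rows of flattened, aligned face images, and the low-dimensional projection coefficients would be the feature vectors passed on to the modeling stage.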
Template Generation: Ideally, a good feature should vary less within a person and more across persons. Accordingly, the feature vectors of a person's biometric data should lie in a localized region of the feature space, and the feature vectors of different persons should lie in non-overlapping regions. But this seldom happens: in practice, the feature vectors of different persons overlap with each other. Therefore, further processing of the feature vectors is essential to enhance the person information. In the modeling stage, a second level of processing is performed on the feature vectors of a particular person. The set of representative vectors derived from a large set of feature vectors makes up the template or model for that person. In the testing phase, instead of comparing with all the feature vectors, the comparison is made with the corresponding person's template or model.
One standard modeling or template generation technique, extensively used in the biometric world, is vector quantization (VQ). In this technique, a large set of feature vectors is grouped into non-overlapping clusters using an unsupervised clustering approach. All vectors falling inside a cluster are represented by its centroid, which may be the cluster mean or a member of the cluster. One of the standard algorithms used for clustering is the k-means algorithm. Experimental results have shown that VQ with a properly selected codebook size (set of centroids) can be a good candidate for any biometric modeling. Another standard modeling technique used in person authentication systems is the Gaussian mixture model (GMM). In this technique, the clusters are represented by a mean, a covariance, and a weight. The autoassociative neural network (AANN) is also employed for modeling large sets of feature vectors. An AANN is a feed-forward neural network (FNN) that maps input feature vectors onto themselves. The distribution-capturing ability of AANN models in the context of person authentication has been illustrated in the literature, and a recent overview of the various modeling techniques widely used for the speech biometric is also available.
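The VQ modeling step described above can be sketched as a plain k-means loop. This is an illustrative toy implementation with random data, not the exact algorithm of any cited work:

```python
import numpy as np

def train_codebook(features, k, iters=20, seed=0):
    """Build a VQ codebook (set of centroids) from enrollment feature
    vectors using a plain k-means loop."""
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Assign every vector to its nearest centroid (Euclidean).
        d = np.linalg.norm(features[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

def vq_distortion(features, codebook):
    """Average distance of test vectors to their nearest codeword;
    a lower distortion means a better match to the enrolled person."""
    d = np.linalg.norm(features[:, None] - codebook[None], axis=2)
    return d.min(axis=1).mean()

rng = np.random.default_rng(1)
enroll = rng.normal(size=(200, 12))   # enrollment feature vectors
codebook = train_codebook(enroll, k=8)
print(codebook.shape)  # (8, 12)
```

At test time, the distortion of the test feature vectors against a person's codebook can serve as the dissimilarity score.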
Testing: In the testing stage, the feature vectors extracted from the test biometric data are compared with the model. In the comparison process, a score is assigned to a model based on a similarity measure. Euclidean distance and log-likelihood ratio scores are two standard metrics used for the similarity measure. Depending on the measurement technique and its match score, a decision is taken regarding either accepting or rejecting the claim of the person.
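A minimal sketch of this testing stage, using the Euclidean distance as the dissimilarity measure and a pre-computed threshold; the template, noise levels, and threshold below are illustrative toy values:

```python
import numpy as np

def verify(test_features, template, threshold):
    """Accept the claim if the average Euclidean distance of the test
    vectors to the claimed person's template vectors is below a
    pre-computed threshold (distance acts as a dissimilarity score)."""
    d = np.linalg.norm(test_features[:, None] - template[None], axis=2)
    score = d.min(axis=1).mean()
    return score, score < threshold

rng = np.random.default_rng(2)
template = rng.normal(size=(16, 10))                    # claimed model
genuine = template + rng.normal(0, 0.1, size=(16, 10))  # same person
impostor = rng.normal(3.0, 1.0, size=(16, 10))          # different person
s_gen, ok_gen = verify(genuine, template, threshold=1.0)
s_imp, ok_imp = verify(impostor, template, threshold=1.0)
print(ok_gen, ok_imp)  # True False
```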
1.1 Scope of this Paper
This paper provides an exposition of the multimodal biometric person authentication system. In most of the areas, greater stress is laid on recent advancements. Related algorithms and methods are also discussed wherever required. A detailed introduction to biometrics is excluded; interested readers can refer to the well-written papers on biometrics. In most of the sections, the several approaches and methods proposed in the literature are discussed briefly to develop a basic understanding for the reader. Many details, approaches, and related references cannot be provided due to space limitations. Prime importance is given to the widely used and examined methods, along with the recent advancements in that direction.
Along with the outstanding books on biometrics by Jain et al., Nanavati et al., Wayman et al., and Woodward Jr. et al., some good textbooks on multimodal biometrics have been written by Surer and by Ross et al. Readers' attention should also be drawn to a few survey papers. A detailed survey on biometric authentication, which also briefly describes the biometric modalities, applications, and performance evaluation parameters, is available. Performance issues in biometric authentication based on information-theoretic concepts have also been uncovered; there, the authors describe the recognition capacity and constrained capacity of a biometric system, and the probability of random correspondence of biometrics based on error exponents. Again, considering the importance of the quality of the input biometric sample, some metrics for the comparative evaluation of sample quality measurement algorithms have been described. A very good analysis of the multi-biometric approach can only be done if we have access to quality databases, whose collection is one of the most time- and resource-consuming tasks for the research community. Hence, the most important publicly available multimodal biometric databases have been summarized, and the contents of some new multimodal databases under development have been outlined. Further analysis is also required to determine the number of biometric samples needed in the training phase for better performance of a recognition system. Das et al. have demonstrated the minimum number of samples required to achieve confidence bands of desired width for the receiver operating characteristic (ROC) curve, based on multivariate copula models developed for correlated biometric acquisitions.
Fundamental to the field of biometrics is an ever-increasing need for better recognition and stronger security. But, as public and commercial biometric deployments increase in number, there is also a growing need to understand the privacy issues and to provide much greater ease of use. Considering the above trade-off, a number of quality works are reported in the two special issues on recent advances in biometric systems. In contrast to the above papers, an overview of multimodal biometrics, the challenges and main research areas in multimodal biometrics, and its applications in developing high-security systems has also been provided. Therefore, further references and detailed analysis of the literature are required in this domain.
Our review starts with a basic introduction to biometrics, carries forward to the multimodal concept with a detailed analysis of the different levels and rules of fusion, and finally ends with the definitions and formulas of commonly used performance metrics. Suitable and sufficient references with detailed analysis are provided in each section. A brief analysis of the selection of a particular biometric feature is also given. Considering the complexity involved in feature-level fusion, more emphasis is given to it in this text, with suitable figures and valuable references; this is one of the core aims of this review. Again, considering the importance of the verification mode in day-to-day authentication tasks, comparably more insight is given to it in this paper.
The organization of the paper is as follows: A detailed analysis of the selection of biometric features for a particular application is given in Section II. The classification of biometric authentication systems based on the mode of operation and the number of modalities used is discussed in Section III. The multimodal biometric system (MBS) and its different integration modes and scenarios are presented in Section IV. The various types of fusion and different rules for fusion are discussed in Section V. The performance of biometric systems, different errors, and error rates are discussed in Section VI. Section VII reviews the recent trends in the biometrics field. Finally, a brief summary and the future scope in this area are given in Section VIII.
1.2 Terms in Biometric Field
- Explicit claim of identity: In an explicit claim of identity, a claim is made by the person for a particular identity. Matching is performed between the currently acquired sample and the reference model of the claimed identity in a one-to-one fashion. An explicit claim of identity is also termed verification.
- Implicit claim of identity: In an implicit claim of identity, no claim is made by the person for a particular identity. Matching is performed between the currently acquired sample and all the reference models in a one-to-many fashion. An implicit claim of identity is also termed identification.
- Positive claim of identity: If the user claims an enrolled identity, the claim is termed a positive claim of identity. There are two cases: (1) in the case of an explicit claim, a name or identity (ID) is provided by the person, for example in the form of a name or PIN; (2) in the case of an implicit claim, matching is performed between the currently acquired sample and all the reference models in a one-to-many fashion.
- Negative claim of identity: If the claimed identity of the user is not present in the database, the claim is termed a negative claim of identity. There are two cases: (1) in the case of an explicit claim, a name or ID is provided by the person, and the system searches for that particular name or ID in the database; if no such name or ID is present, the person is enrolled with that name or ID; (2) in the case of an implicit claim, the biometric data of the person are compared with all the saved biometric data.
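The distinction between explicit (one-to-one) and implicit (one-to-many) claims can be summarized in a small sketch; the similarity function, names, and threshold are hypothetical:

```python
import numpy as np

def match_score(sample, template):
    """Toy similarity score: negative Euclidean distance (higher = better)."""
    return -np.linalg.norm(sample - template)

def verify_claim(sample, templates, claimed_id, threshold):
    """Explicit claim (verification): one-to-one match against the
    claimed identity's template only."""
    return match_score(sample, templates[claimed_id]) >= threshold

def identify(sample, templates, threshold):
    """Implicit claim (identification): one-to-many match against all
    templates; return the best identity, or None for open-set rejection."""
    scores = {pid: match_score(sample, t) for pid, t in templates.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

templates = {"arjuna": np.array([1.0, 2.0]), "bhima": np.array([5.0, 5.0])}
probe = np.array([1.1, 2.1])
print(verify_claim(probe, templates, "arjuna", threshold=-0.5))  # True
print(identify(probe, templates, threshold=-0.5))                # arjuna
```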
2. Choosing Biometrics for Person Authentication Applications
Any physiological and/or behavioral characteristic of a person that satisfies the universality, distinctiveness, permanence, and collectability properties can qualify as a biometric. Universality refers to availability across all persons. Distinctiveness ensures enough discrimination across different persons. Permanence refers to the longevity of the characteristic. Finally, collectability indicates how easily the characteristic can be collected from persons. However, for a practical biometric system, there are several other issues that should be considered as measurement requirements. The following two subsections focus on the selection of biometrics in a practical application depending upon the characteristic measurements and user requirements.
2.1 Selection of Biometric based on their Characteristics
In the implementation of any practical biometric system, we should consider the following factors in selecting a particular biometric feature:
Performance: All the errors, generally decision errors such as the false acceptance rate (FAR) and false rejection rate (FRR), should be low. The number of persons processed per unit time, termed the throughput, should be high.
Acceptability: The technology should be acceptable to the users, because a biometric that interferes with privacy is not desirable.
Circumvention: This indicates how easily the biometric system can be fooled; the biometric should be resistant against spoof attacks.
Robustness: The technology should be robust against environmental and operational factors like illumination, background noise, etc.
Population coverage: The biometric should cover a large population, i.e., it should be available from almost everyone.
Size: The size of the system should be small so that it occupies less space, for example, for biometric systems on mobile phones, laptops, etc.
Identity theft deterrence: It should be very hard to steal someone else's identity. For instance, a dummy fingerprint can be made easily, but the production of fake electrocardiogram (ECG) information is difficult.
Reducibility: The biometric data should have correlation, so that it can be compressed, thereby reducing the computational complexity.
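The decision errors mentioned under Performance above (FAR and FRR) can be estimated from samples of genuine and impostor match scores. A minimal sketch, assuming a similarity score where higher means a better match; the score values are made up for illustration:

```python
import numpy as np

def far_frr(genuine_scores, impostor_scores, threshold):
    """FAR: fraction of impostor scores accepted (>= threshold).
    FRR: fraction of genuine scores rejected (< threshold)."""
    far = np.mean(np.asarray(impostor_scores) >= threshold)
    frr = np.mean(np.asarray(genuine_scores) < threshold)
    return far, frr

genuine = [0.9, 0.8, 0.85, 0.6, 0.95]   # scores from true matches
impostor = [0.2, 0.4, 0.7, 0.1, 0.3]    # scores from impostor attempts
far, frr = far_frr(genuine, impostor, threshold=0.5)
print(far, frr)  # 0.2 0.0
```

Sweeping the threshold and plotting FAR against FRR (or against the genuine acceptance rate) yields the ROC curve mentioned in Section 1.1.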
2.2 Selection of Biometric based on User Requirement
In practice, a biometric feature is selected according to user requirements and resources as shown in [Figure 2]. It always depends upon the available resources and the area of application.
Sensor availability: Sometimes, sensors are already available in the devices, and then corresponding biometrics can be chosen for these applications. For example, camera and microphone are available in laptops and mobile phones; hence, face and speech can be used for the authentication.
Device availability: Sometimes, devices are already available; a sensor can be integrated to the device for using it to collect a biometric. For example, mouse, telephones, and Universal Serial Bus (USB) drives are already in use; a small, cheap fingerprint sensor can be easily integrated to them for security application.
Computational Time and Reliability: Computational time is also an important selection criterion. A fast biometric should be selected for applications like airport and Automatic Teller Machine (ATM) authentication; otherwise, a long queue will build up. On the other hand, for a military application, even a relatively slow biometric can be used, but it should be reliable.
Cost: The biometric sensor should not cost more than the physical device with which it is to be integrated. For example, integrating a costly biometric sensor, such as an iris sensor, with a mobile phone is worthless. However, a fingerprint sensor can be easily consolidated into any such device due to its low cost.
Sensor area: Sensor area is also an influencing consideration. If the biometric sensor area is large compared with the device area, then that biometric is useless for the application. For example, a palm print sensor cannot be integrated into a mobile phone. However, a fingerprint sensor can be integrated almost anywhere due to its small size.
Power consumption: Power is a decisive factor for mobile devices; hence, the biometric for these devices should also consume little power. Comparatively, any biometric can be selected for stationary devices due to continuous power availability.
3. Classification of Biometric Person Authentication Systems
A biometric person authentication system is a pattern recognition system in which the recognition task is achieved using the biometric characteristics of a person. Biometric-based person authentication systems are further classified into subgroups depending upon the mode of operation (verification/identification) or the number of modalities used in the recognition task.
3.1 Classification based on the Mode of Operation
Generally, the recognition task can be performed in two ways, verification and identification. The major difference between these two modes of operation is at the testing phase which involves accepting or rejecting the identity claim of any person in the verification task and identifying the unknown person in the identification task.
- Verification: In verification, matching of the test pattern is made only with the claimed model, in a one-to-one fashion. The output of such a comparison is either acceptance or rejection of the person's claim. The three steps involved in person verification are shown in [Figure 3]. In the first step, reference models for all the users are generated and stored in the model database. In the second step, some samples are matched with the reference models to generate genuine and impostor scores; these scores are normalized, and a threshold is calculated. The calculated normalization parameters and threshold are used in the third step, the testing phase, to test the claims. A claim is made by giving a user name, PIN, password, smart card, etc. Computational complexity is low in the verification mode, as matching is performed in a one-to-one fashion. As identity proofs are already given to users to claim their identity, verification systems are used in most commercial and government applications.
Figure 3: Block diagram of a person verification system through threshold.
- Identification: In identification, no claim of identity is made, and the test pattern is matched against all the known reference models in a one-to-many fashion to provide, as the final result, the person identified or not identified. In this process, the person does not give any PIN, password, or smart card. Computational complexity is higher in the identification mode, as matching is performed in a one-to-many fashion. The identification mode is used in forensic applications, as it is not known beforehand to whom the biometric taken from the scene of a crime belongs; hence, matching is performed against all the stored templates.
Identification is further classified into two types, closed set and open set:
- Closed set: In closed set identification, the identification is always only across the group of known persons; hence, in the case of a similarity score, the maximum scorer is treated as the identified user.
- Open set: In open set identification, the test person may also be from outside the known group. Therefore, in open set identification, a threshold is required for deciding whether the test pattern belongs to an outside person or to a person from the enrolled group. Furthermore, based on the comparison of the score with the threshold, if the test pattern is identified as belonging to the group of enrolled persons, then the particular person is identified as in the closed set case.
3.2 Classification based on Number of Biometrics Used
Recognition systems are classified into four types according to the number of biometric characteristics, snapshots, features, and matchers .
Unibiometric: A single biometric characteristic is used, such as speech or any other biometric, but only one.
Unimodal: Unimodal biometric system is a subset of a unibiometric system that uses a single instance/snapshot, a single representation, and a single matcher. It is the most restrictive case.
Multi-biometric system: Multi-biometric system is a biometric system that uses more than one independent or weakly correlated biometric characteristic (e.g., speech and face of the same person, or fingerprints from two different fingers).
Multimodal Biometric System: MBS is a superset of a multi-biometric system that may use more than one correlated biometric measurement (e.g., multiple snapshots/instances, multiple impressions of a finger, multiple images of a face, multiple representations of a single input, multiple matchers of a single representation, or any combination of them). It is the most general case.
A system that utilizes only one biometric is termed a unimodal biometric system. Unimodal biometric systems are easier to install, easier to use, cheaper, and less complex, but they also face problems like noisy data, intra-class variation, non-universality, lack of distinctiveness, spoof attacks, etc. Users are also more restricted with a single biometric; for example, an individual has to put his hand in full contact with the sensor during the acquisition of a palm print. Each unimodal system has an upper bound on performance. All these problems can be addressed by an MBS: there is very little chance that two biometrics suffer from the same type of problem, and hence advantage can be taken of this by fusing them.
4. Multimodal Biometric System
Multimodal biometrics is a method that consolidates the evidence obtained from different sources to overcome the limitations of unimodal biometric systems. Consequently, population coverage is increased, and the system can work under more adverse conditions. Furthermore, MBSs are more reliable because independent biometric modalities are used. In certain situations, the user might find that one form of biometric identification is not exact enough due to non-universality; fingerprint is an example, where at least 4% of the population have worn and/or cut fingerprints. In an MBS, if one of the technologies is unable to identify, the system can still use the others to identify accurately. All the biometric systems cannot be spoofed simultaneously; thus, the probability of accepting an impostor as genuine is greatly reduced with an MBS. The presence of a live user can be ensured by asking the user to provide a random subset of biometric characteristics (e.g., for speaker recognition, the second sentence should be spoken first and then the first sentence), to move the face in random directions (e.g., first right and then left for face-based recognition), to provide data in random block order (e.g., the third block should be signed first and then the second), etc. Therefore, a challenge-response type of mechanism can be used to improve security. In an MBS, the system administrator can decide the level of security: for a high-security region, all the biometric identifiers may be required to recognize the person, while for a lower-security region, only some of them may suffice. However, MBSs increase the cost and verification time. The following are different methods for implementing MBSs in person recognition applications.
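One common way of consolidating evidence from different matchers is score-level fusion with a weighted sum (the matcher-weighting idea). A minimal sketch; the matchers, weights, and score ranges below are hypothetical:

```python
import numpy as np

def minmax_normalize(scores, lo, hi):
    """Map raw matcher scores to [0, 1] using training-set min/max."""
    return np.clip((np.asarray(scores, float) - lo) / (hi - lo), 0.0, 1.0)

def weighted_sum_fusion(score_lists, weights):
    """Matcher-weighting fusion: the fused score is a weighted sum of
    each matcher's normalized scores."""
    w = np.asarray(weights, float)
    w = w / w.sum()
    return sum(wi * np.asarray(s) for wi, s in zip(w, score_lists))

# Hypothetical scores for 3 claims from a face matcher (0-100 range)
# and a speech matcher (0-1 range), fused with weights 0.4 and 0.6.
face = minmax_normalize([72, 40, 90], lo=0, hi=100)
speech = minmax_normalize([0.8, 0.3, 0.9], lo=0, hi=1)
fused = weighted_sum_fusion([face, speech], weights=[0.4, 0.6])
print(np.round(fused, 3))
```

Normalization is essential here because the raw scores of different matchers live on different scales; the fused score is then compared against a single decision threshold.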
4.1 Integration Modes
From the point of view of information flow, an MBS can operate in one of several modes: serial, parallel, hierarchical, pipelining, or a sequential approach with a reject option.
- Serial mode: In the serial mode, the output of one biometric modality is used to narrow down the number of possible individuals to N. The final decision is given by the second biometric system from among these N individuals; more than two biometric systems can be integrated in this way. The multiple sources of information (e.g., multiple modalities) do not have to be acquired simultaneously, and hence a single sensor can be used, for example, for the collection of signature and handwriting. Furthermore, depending on the setting, a decision could be made before acquiring all the modalities, and this early decision can reduce the overall recognition time. For example, consider an MBS designed using face and speech: recognition based on face is fast, but not very accurate, whereas speaker recognition is comparably slow, but much more accurate. Hence, the face-based recognition system is used first to retrieve some top possible individuals, and the speaker recognition system is then used to decide the single identified individual. In serial mode operation, the time consumed by each system is generally different. In one reported system, face and speaker recognition modalities are used in the serial mode; the FRR is reduced by 3.9%, but the system response time is increased by 67%, at a fixed FAR.
- Parallel mode: In the parallel mode, the information from multiple modalities is used simultaneously in order to perform recognition. The time consumption of the systems should be the same; otherwise, one system has to wait. The parallel mode is preferable when all the individual recognition systems are fast. It offers an advantage when the systems have different confusion matrices, but at the disadvantage of requiring separate sensors on the device, unless the same sensor can be used for different modalities.
- Hierarchical mode: In the hierarchical mode of operation, a combination of serial and parallel systems is used, and the advantages of both are inherited. The hierarchical mode is used when a large number of biometric systems are present. In one reported system, four types of features are extracted from the palm print, and a hierarchical approach is used for the decision.
- Pipelining mode: In the pipelining mode of operation, the advantages of a multimodal system can be obtained with a single sensor and a single feature extraction technique. While features are being extracted for the first modality, the second modality is simultaneously collected from the sensor, and so on. The time required by each module should be the same in a pipelined architecture.
- Sequential approach: In the sequential approach, biometric systems are combined sequentially, and each biometric system has a reject option. If a system observes that the quality is not good, then that modality is rejected, and the decision depends on the next biometric system, and so on. Each subsequent system is costlier, more informative, and more computationally complex than the previous one. Classification time is reduced in the multistage approach at a reasonable expense of classification accuracy.

4.2 Integration Scenarios

Several important concepts about integration are given in the literature. There are several motivations for information integration: using more than one type of sensor (e.g., capacitive, optical) can increase reliability; using complementary information (e.g., speech and face) can reduce error rates; using more than one representation (e.g., LDA, DCT) can provide different information; using multiple matching algorithms (e.g., Euclidean distance, threshold absolute distance) can reduce matching errors; implementation cost can be reduced by using several cheap sensors instead of one costly sensor; information can be integrated using more than one fusion rule (e.g., matcher weighting (MW), user weighting) to take advantage of each; and multiple sensors at different positions can capture data from different points of view (e.g., 3D face recognition). Thus, considering the above motivations, MBSs are designed to operate in one of the following five scenarios.
- Multiple sensors: A single biometric trait can be captured through different sensors; for instance, optical, ultrasound, and semiconductor-based sensors can be used. Capacitive sensors have the advantage that they can give good results even in the presence of dust. An average of all the images can be taken to get a better image, and a weighted sum can also be used to exploit the known performance of the individual systems. A multi-sensor fingerprint verification system based on the integration of optical and capacitive sensors has been discussed in the literature. Experimental results demonstrate that multi-sensor fusion outperforms the verification performance of the best matcher based on a single sensor; however, it increases the system cost. Analysis reveals that optical and capacitive sensors offer complementary information, and hence their fusion results in superior accuracy. It has also been shown that the consolidation of infrared (IR) and visible-light face images can outperform either IR or visible light alone, and that an integration method that considers the scores accomplishes better accuracy than one that considers only the ranks.
- Multiple biometrics: A number of biometric features can be captured simultaneously for recognition. For example, recognition systems based on speech and face can be used at the same time. These recognition systems are fused at different levels, such as sensor level, feature level, etc. In a verification system, multiple biometrics are used to improve the accuracy, while in an identification system, the matching speed can also be improved by operating in the serial mode.
The prime criterion for the selection of multiple biometrics is to choose biometrics that provide complementary information. Apart from this primary criterion, there may be other reasons for the choice of multiple biometrics, as mentioned below.
- First, more than one biometric characteristic can be acquired from a single sensor, such as palmprint and hand geometry. Generally, the combination of different biometrics improves the performance, but if a strong biometric is combined with a weak one, the performance can degrade. The combination of palmprint and hand geometry features at score level and feature level is described in  . Palmprint performed better than hand geometry, and score-level fusion performed better than feature-level fusion.
- Second, all biometrics may be available in a single capture, though possibly from more than one sensor, such as face and speech in a video. Face and voice data are captured in  . Two classifiers for face and three classifiers for speech are used. In the first scenario, the face and voice classifiers are combined separately, and the fused face and voice scores are then integrated to compute the final score. In the second scenario, all five classifiers are combined simultaneously. The combined system performs better than one that considers only acoustic or only visual evidence, but user interaction with more than one sensor is needed, which increases user inconvenience.
- Third, if a dynamic biometric is available in the capture (e.g., lip movement in a video), it can also be utilized. Voice and lip movements are dynamic features that provide more security than static features  . Consequently, in order to construct a robust identification or verification system, dynamic biometric cues should be integrated. A dynamic biometric feature, the lip movement of a person, can be incorporated to resist spoofing and criminal attacks. Accordingly, voice, lip movement, and face are combined in  . They obtained 0.21% FRR and 0.33% FAR in person verification using a 2-from-3 decision.
- Fourth, another motivation for combining biometrics can be their physiological locations. Ear and profile face can be fused to take advantage of their physiological proximity and the complementary evidence. Ear and profile face are combined in  . Profile face achieves 93.46% recognition performance, ear achieves 91.77%, and feature-level fusion by the weighted-sum rule achieves 96.84%.
- Fifth, many devices already have one sensor, so another low-cost sensor can be incorporated to operate them in multi-biometric scenarios. A fingerprint sensor is very cheap and can be included in a laptop, which already has a web camera, to run multi-biometric schemes. Accordingly, eigenfaces of the face and minutiae of the fingerprint are integrated in  . At 0.001% FAR, they achieved 64.1%, 14.9%, and 9.8% FRR for face, fingerprint, and their integration, respectively.
- Multiple units of the same biometric: Multiple units of many biometric characteristics are provided by nature, such as two irises, ten fingerprints, etc. Various units of one biometric characteristic can also be used for integration, with the features of each unit concatenated. Capturing more than one unit can make a user uncomfortable due to the additional constraints, and the process can take a long time if the different units are not captured simultaneously. An advantage of capturing multiple units is that at a particular time one finger may have scars while another remains intact. It can also ensure the presence of a live individual by asking him/her to present units in random order (e.g., first the left retina and then the right retina).
- Multiple snapshots/instances of the same biometric: Multiple snapshots are taken at different instants by one sensor (e.g., multiple utterances of the same sentence, multiple images of the face, or multiple impressions of the same finger). An average of all the images is taken to obtain a better image; since averaging is a smoothing process, the averaged image is a better representation. Capturing multiple snapshots is time consuming, but it occurs in the training phase and hence can be tolerated.
- Multiple representations and matching algorithms for the same biometric input signal: Multiple representation techniques can be integrated to take advantage of all of them. For example, the advantage of LDA is that it carries more discriminative information, while PCA has the advantage of reducing the dimension and is hence computationally less complex. Performance can be improved by combining LDA and PCA. The features produced from two fingerprint images are matched using Hough Transform-based matching, string distance-based matching, and 2D dynamic programming-based matching algorithms to produce a matching score. Combinations of algorithms that supply complementary information elevate the performance. The increased discrimination between impostor and genuine users obtained by fusing different fingerprint matching algorithms is also described in , . Minutiae and ridge feature maps are used for matching in  . The hybrid matcher performs better than a minutiae-based fingerprint matching system.
5. Information Fusion in Multimodal Biometric System
MBSs combine information given by multiple biometric traits. This has been shown to be a very promising trend, both in experiments and to some extent in real life biometric person authentication applications. Several important concepts about fusion are given in  . Various classifiers, predictors, and estimators can be used for information fusion  .
5.1 Types of Fusion
Broadly, the information fusion is divided into three parts, pre-mapping fusion, midst-mapping fusion, and post-mapping fusion/late fusion  . Mapping refers to transformation of sensor-data/feature space into opinion/decision space. Pre-mapping fusion refers to consolidating information prior to the application of any classifier (provides a hard decision) or expert (provides an opinion/score) or matching algorithm. Midst-mapping fusion refers to consolidating the information at the time of mapping from sensor-data/feature space into opinion/decision space. Post-mapping fusion refers to consolidating information after the application of the classifiers, experts, or matching algorithms.
Various combination rules, such as maximum, median, etc., are presented to combine the classifiers, but there is no overall winning combination rule  . The product rule is strongly affected by wrongly estimated posterior probabilities: a single bad estimate (e.g., P = 0) drives the overall product to 0. The maximum rule chooses the matcher producing the maximum probability, but it may be affected by noise (outliers). The median and mean average the posterior probabilities, and hence estimation errors are reduced. Simple sum, weighted sum, etc. require normalization before combining the scores. Eigenfinger and eigenpalm are fused at the match score level in  . A recognition rate of 100%, EER of 0.58%, and Total Error Rate (TER) of 0.72% are obtained.
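These fixed combination rules can be sketched in a few lines. The sketch below assumes scores already normalized to [0, 1]; the matcher scores are illustrative.

```python
import statistics

def fuse(scores, rule):
    """Combine normalized matcher scores (all in [0, 1]) for one user
    with a fixed combination rule."""
    if rule == "sum":
        return sum(scores)
    if rule == "product":        # one near-zero score drives the result toward 0
        p = 1.0
        for s in scores:
            p *= s
        return p
    if rule == "max":            # picks the most confident matcher; outlier-sensitive
        return max(scores)
    if rule == "min":
        return min(scores)
    if rule == "median":         # robust to a single outlying estimate
        return statistics.median(scores)
    if rule == "mean":
        return sum(scores) / len(scores)
    raise ValueError(rule)

scores = [0.92, 0.85, 0.10]      # third matcher gives a bad estimate
print(fuse(scores, "product"))   # dragged down by the outlier
print(fuse(scores, "median"))    # 0.85, unaffected by it
```

Running the two calls side by side illustrates the point made above: the product rule is hostage to one bad estimate, while the median simply ignores it.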
- Pre-mapping fusion/early fusion: Prior to application of classifiers, experts, or matchers, information can be combined at the sensor level or at the feature level.
- Sensor level: Raw information acquired from multiple sensors (the same sensor at different positions, or different sensors) is combined in sensor-level fusion to produce fused raw information. Features are then derived from the fused raw information and matching is carried out. Sensor-level fusion can be organized into three main classes: (1) single sensor-multiple instances, (2) intra-class multiple sensors, and (3) inter-class multiple sensors [Figure 4].
- Single sensor-multiple instances: Multiple instances captured from a single sensor are integrated to obtain more reliable and descriptive information. There are three main methods to accomplish this fusion: averaging, weighted summation, and mosaic construction. Averaging, a smoothing operation, is used to combine the multiple instances to reduce noise. Generally, multiple instances collected from the sensor are neither aligned nor of the same size, and hence the data must be aligned and normalized (resized) prior to combination. Weighted summation can be used when greater importance should be given to certain instances. Several panoramic face images are combined at the sensor level to acquire a mosaic  , which provides some 3-D surface information to handle the pose variation problem. Two panoramic face representations were tested: spatial and frequential. The frequential representation gives the better performance, with a correct recognition rate of 97.46%, vs 93.21% for the spatial representation. Due to their very small contact areas and low cost, swipe fingerprint sensors are very convenient for applications like mobile phones, PDAs, portable computers, and security applications  . A hybrid swipe fingerprint mosaicing scheme is developed there to register swipe fingerprint frames.
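Assuming instances that are already aligned and resized to the same length, the averaging and weighted summation methods above can be sketched as follows; the pixel values and weights are illustrative.

```python
def average_instances(instances):
    # instances: equal-length lists of pixel intensities, assumed already
    # aligned and resized; element-wise averaging smooths sensor noise
    n = len(instances)
    return [sum(vals) / n for vals in zip(*instances)]

def weighted_sum(instances, weights):
    # weights favour instances known to be more reliable; they should sum to 1
    return [sum(w * v for w, v in zip(weights, vals)) for vals in zip(*instances)]

snaps = [[100, 110, 120], [104, 106, 118], [102, 108, 122]]
print(average_instances(snaps))             # [102.0, 108.0, 120.0]
print(weighted_sum(snaps, [0.5, 0.3, 0.2])) # first snapshot trusted most
```

Mosaic construction, the third method, additionally requires registering partially overlapping captures and is not shown here.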
- Intra-class multiple sensors: In this case, multiple instances acquired from multiple sensors (of the same class) are combined to account for the different location information of the same sensor or the variability of different sensors (e.g., visual and IR). Hence, this category is further classified into two subcategories: (i) same type-different locations and (ii) different types. In the same type-different locations subcategory, data received by placing the same type of sensor at different locations are fused to elevate the accuracy of the system. For example, 14 different signals are captured from 14 sensors placed at different locations in a data glove for the signature verification problem in  . A technique based on the Singular Value Decomposition is described to obtain singular vectors sensing the maximal energy (i.e., most of the variation in the original data) of the glove data matrix, thereby reducing the effective dimensionality of the data. This technique is examined against a large number of legitimate and forged signatures and shows a remarkable level of accuracy in finding the similarities between genuine samples as well as the differences in genuine-forgery cases. In the different types subcategory, information captured by distinct types of sensors is fused to exploit the complementary behavior of the data. For example, IR imagery is less sensitive to illumination changes than visible imagery  . In contrast, IR has other limitations, such as being opaque to glass. These limitations can be overcome by the fusion of IR and visible images. In weighted averaging-based fusion algorithms, if the weights calculated from previous empirical results are held constant for different input data, the recognition accuracy degrades. Hence, weights should be dynamically and locally allotted for optimal information fusion  .
A multispectral face image fusion algorithm is proposed that dynamically and locally assigns the weights for image fusion using a 2v-Granular Support Vector Machine (2v-GSVM), from which the fused image is generated. Then, 2D log polar Gabor transform and local binary pattern feature extraction algorithms are applied to the fused face image to obtain global and local facial features, respectively. Experimental results show that this algorithm outperforms existing fusion algorithms.
- Inter-class multiple sensors: Most of the existing works on multimodal biometrics focus on consolidating the outputs of multiple classifiers at the decision level, score level, or rank level. Very few studies consider sensor-level fusion across inter-class modalities. In this category, palmprint and palm vein images are fused , . Scale Invariant Feature Transform (SIFT) features are extracted from the fused image and matching is carried out. This method produces 95% recognition accuracy. The fingerprint image is decomposed to three levels using the Daubechies 9/7 wavelet transform, and the face, iris, and signature images are decomposed to two levels, for data-level information fusion in  . This reduces the memory required for storing the multimodal images by 75%. The authors report that the integrity of the biometric features and the matching accuracy of the fused image are not affected.
- Feature level: Different feature vectors, extracted either by using several sensors (e.g., camera, microphone) or by using several feature extraction techniques (e.g., PCA, LDA) on the same sensor data, are integrated in feature-level fusion. Feature-level fusion can be organized into two main categories: (1) intra-class and (2) inter-class [Figure 5]. Intra-class is further classified into four subcategories: (a) same sensor-same features, (b) same sensor-different features, (c) different sensors-same features, and (d) different sensors-different features. Feature mosaicing is performed for the same sensor-same features subcategory. Feature mosaicing is the process of aligning the features of one sample over another to obtain a single master feature vector  . Experimental results for fingerprint feature mosaicing show that it performs better than image mosaicing. The main benefits of feature mosaicing over the existing image mosaicing approach are its low memory requirement and low computational complexity  . For all the other subcategories of intra-class and for inter-class, an optimum subset of features is selected using different approaches. A genetic algorithm (GA) is applied to acquire an optimum feature set for IR and visible face images in the eigenspace domain  . Reported results demonstrate considerable performance enhancement. Amplitude and phase features are extracted from the fused face image using the 2D log polar Gabor wavelet in  . Then, an adaptive SVM learning algorithm is utilized to intelligently choose either the amplitude or phase features to produce a composite feature set. Experimental results show that the fusion of visible light and short-wave IR spectrum face images performs best, with an equal error rate (EER) of 2.86%. It should be noted that in the above approaches, the feature vectors are not concatenated. A general framework of the second fusion approach is illustrated in [Figure 6].
Two sets of feature vectors are obtained and normalized using a normalization technique such as max or min-max. The resultant feature vector is produced by concatenating these normalized feature vectors. Then, a feature selection method (e.g., forward selection, backward elimination, etc.) is employed to reduce the redundancy and yield the final feature vector. Concatenation can lead to the "curse of dimensionality" problem , . Fusion is carried out at the feature level in three different scenarios in  : (i) fusion of PCA and LDA coefficients of the face, (ii) fusion of LDA coefficients of the Red (R), Green (G), and Blue (B) channels of a face image, and (iii) fusion of face and hand modalities. First, the feature vectors are normalized, then concatenated, and finally sequential forward floating selection is employed for effective feature selection. Hand shape and palmprint texture are integrated at the feature level in  . DCT coefficients and shape features are used for palmprint- and hand geometry-based recognition, respectively. Correlation-based feature selection is used for efficient feature selection. Reported results suggest that the fused feature vector can be a promising addition to biometrics-based person authentication systems. SIFT features from face images and minutiae features from fingerprint images are extracted, and the redundant features are eliminated by employing the k-means clustering algorithm. If one feature set dominates the other, weights should be incorporated in the concatenation. Discriminant features for face and palmprint images are extracted using Gabor-based image preprocessing and PCA. Weighted concatenation is performed, with the weights calculated based on separability criteria. Experimental results indicate significant improvement in the recognition accuracy. The average rule, product rule, and weighted sum rule can also be used for feature-level fusion.
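The normalize-then-concatenate step of this framework can be sketched as follows. The `face` and `palm` feature values are made up, and a real system would follow this with a feature selection step such as forward selection.

```python
def min_max(v):
    # map each component of one modality's feature vector into [0, 1]
    lo, hi = min(v), max(v)
    return [(x - lo) / (hi - lo) for x in v]

def fuse_features(f1, f2):
    # normalize each modality's feature vector separately, then concatenate;
    # the vectors may have different lengths and different raw scales
    return min_max(f1) + min_max(f2)

face = [12.0, 40.0, 25.0]       # e.g., three face coefficients
palm = [0.2, 0.9, 0.5, 0.7]     # e.g., four palmprint coefficients
fused = fuse_features(face, palm)
print(len(fused))               # 7: concatenation grows the dimensionality
```

The growing length of the fused vector is exactly the "curse of dimensionality" concern discussed above, which is why a selection step follows the concatenation.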
Features from face and ear are extracted by employing kernel Fisher discriminant analysis (KFDA), a non-linear version of FDA, and combined at the feature level in  . Three feature fusion rules (average, product, and weighted sum), based on the idea of decision-level fusion, are presented. The advantages of feature-level fusion are: (i) it is a general belief that combining information at an early stage gives more accurate results. Feature-level fusion occurs earlier than score- or decision-level fusion, and features carry more information than scores or decisions; thus, fusion at the feature level should give good results; (ii) it can remove the redundant information present in the original multiple feature sets. Redundant information can be detected using correlation measurements across the feature sets: if feature sets are correlated, they contain redundant information  . In contrast, some drawbacks of feature-level fusion are: (i) the relative importance of each feature vector may not be known. If it is known a priori, the redundancy can be removed by eliminating the correlated feature sets; feature selection algorithms (e.g., forward selection, backward elimination) are employed for this; (ii) when the feature sets are non-homogeneous, they are concatenated for fusion, and the resulting feature set may be of large dimensionality, causing the curse of dimensionality problem; (iii) the feature vectors of each biometric modality must be provided at the same time interval (i.e., feature extraction from the different biometric modalities must be synchronous). In general, speech feature vectors are extracted at a rate of 100 per second, while visual feature vectors depend on the frame rate of the video camera (25 frames per second [fps] in the Phase Alternating Line standard and 30 fps in the National Television Standards Committee [NTSC] standard).
Thus, it is difficult to combine speech and visual feature vectors; (iv) most commercial biometric systems do not provide information about the feature vectors they use, for security reasons. Therefore, very little work has been accomplished on feature-level fusion, and post-mapping fusion is generally preferred due to the above problems.
Figure 6: A general framework for feature-level fusion using concatenation.
- Midst-Mapping Fusion: In this approach, data streams are processed simultaneously by different models and their feature vectors are mapped into a combined/composite opinion or decision space. Since the feature vectors are processed by different models, the problems due to feature concatenation, such as the curse of dimensionality, are avoided. Midst-mapping fusion can be accomplished using different variants of the Hidden Markov Model (HMM), such as the multistream HMM  , asynchronous HMM  , product HMM  , and coupled HMM , . For example, joint temporal modeling of the acoustic and visual feature vectors is performed using a multistream HMM to account for the asynchrony between the two modalities in  . The product HMM is introduced in  at an intermediate level to handle loose audiovisual synchronicity, but it cannot handle loose synchronicity within phonemes. The coupled HMM models the state asynchrony of the audio and visual observation sequences while still preserving their natural correlation over time  . The coupled HMM is shown to perform better than the multistream HMM in audiovisual speech recognition.
- Post-Mapping Fusion: Consolidation of information after the classifier can be accomplished by two main approaches: Dynamic classifier selection and classifier fusion. Classifier fusion is again classified into three categories: Fusion at the abstract level, fusion at the rank level, and fusion at the matching score level (measurement level/confidence level/opinion level).
- Dynamic Classifier Selection: In this case, a classifier is chosen based on local accuracy  . The accuracy of a classifier can be assessed using its EER (a high EER implies low accuracy). A particular biometric system may perform well for a particular place. Fingerprint-based recognition systems generally perform better than face-based ones, but the individuals of a particular place may not have proper fingerprints; there, the fingerprint-based system will perform worse than the face-based system, and the face recognition system is more suitable. The winner-take-all approach is used for classifier selection.
- Abstract-level Fusion: Each biometric classifier provides a decision based on matching the input test pattern against the enrolled model. The decisions can then be combined using approaches such as majority voting , , weighted majority voting  , all accepted (forensic applications), all rejected (defense applications), trusting the output of a particular sensor, the behavior knowledge space (BKS) , , the AND rule and OR rule, weighted voting based on the Dempster-Shafer theory (DST) of evidence  , etc., to obtain the final decision. DST considers uncertainties in classification when integrating the classification outputs. The recognition, substitution, and rejection rates are used to measure the belief of every classifier. DST is quite robust due to its consideration of uncertainties, and it outperforms majority voting. On the other hand, its belief measurement procedure is not optimal, as it does not consider the accuracy with respect to each class. An experimental comparison between fixed fusion rules, such as sum, majority voting, and order statistics-based rules, and trained fusion rules, like BKS and the weighted averaging of classifier outputs, for multimodal personal identity verification is presented in  . Theoretical and experimental results suggest that simple averaging of classifier outputs generally performs better for classifiers with similar accuracy, while weighted averaging works well for imbalanced classifiers  . Trained fusion rules require a training set of good quality and adequate size for good performance, but usually trained rules perform better than fixed rules. BKS is a combination scheme designed to avoid the independence assumption. Prior knowledge about the behavior of the classifier ensemble is stored in the BKS, an N-dimensional space where each dimension represents the decision of one classifier. For N sensors, 2^(2^N) fusion rules are possible  .
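The simplest of these decision rules (majority voting, AND, OR) can be sketched directly on per-classifier accept/reject decisions; the votes below are illustrative.

```python
def majority_vote(decisions):
    # decisions: per-classifier booleans (True = accept)
    return sum(decisions) > len(decisions) / 2

def and_rule(decisions):
    # strict: every classifier must accept (defense-style, low FAR)
    return all(decisions)

def or_rule(decisions):
    # lenient: any single acceptance suffices (forensic-style, low FRR)
    return any(decisions)

votes = [True, True, False]
print(majority_vote(votes))  # True
print(and_rule(votes))       # False
print(or_rule(votes))        # True
```

The same three votes yield three different outcomes, showing how the choice of rule trades false acceptances against false rejections.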
A multimodal person authentication system based on face images is described in  . Information is extracted from both profile and frontal view images, and decision-level fusion is performed using the OR and AND rules. Clustering algorithms can also be used for decision fusion  . Five person authentication algorithms, based on grey level and shape information of a person's face and on voice features, are combined using fuzzy k-means and fuzzy VQ. The fuzzy clustering algorithms provide better performance than the classical clustering algorithms.
- Rank-level Fusion: In rank-level fusion, each classifier assigns a rank to each class for the input test pattern presented to it. Combination is accomplished by methods such as highest rank, Borda count, and logistic regression  . In highest (maximum) ranking, the number of times each class receives rank 1 for the given test pattern is counted, and the class with the maximum number of first ranks is declared the identified class. In the Borda count, each class is credited with the number of classes ranked below it. For example, suppose there are three classes c1, c2, and c3, with c1 ranked 1, c2 ranked 2, and c3 ranked 3. Classes c2 and c3 are ranked below c1, so the Borda count for c1 is 2; c3 is ranked below c2, so the Borda count for c2 is 1; no class is ranked below c3, so its Borda count is 0. The Borda counts of each class are then summed over all classifiers, and the class with the maximum total is declared the identified class. Differences between classifiers are not considered in the Borda count: all classifiers are treated equally, even though some work better than others. In maximum ranking and the Borda count, ties are possible (i.e., two or more classes obtain the same value). Hence, logistic regression is employed, in which weights are assigned to the Borda count scores of the better-performing classifiers; the weights are derived by logistic regression. Parametric methods require assumptions about a parametric family of distributions, which cannot always be justified in practical systems. Nonparametric methods require no such assumption and hence can be used in almost all practical conditions. A nonparametric sequential ranking procedure using the sequential probability ratio test is described in .
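The highest-rank and Borda count methods described above can be sketched as follows, using the three-class example with hypothetical rankings from three classifiers.

```python
from collections import Counter

def highest_rank(rankings):
    # rankings: one ordered list of class labels per classifier (best first);
    # the class ranked first most often wins
    firsts = Counter(r[0] for r in rankings)
    return firsts.most_common(1)[0][0]

def borda(rankings):
    # each class earns one point per class ranked below it, summed over classifiers
    n = len(rankings[0])
    totals = Counter()
    for r in rankings:
        for pos, cls in enumerate(r):
            totals[cls] += n - 1 - pos
    return max(totals, key=totals.get)

ranks = [["c1", "c2", "c3"],   # classifier 1
         ["c2", "c1", "c3"],   # classifier 2
         ["c1", "c3", "c2"]]   # classifier 3
print(borda(ranks))            # "c1": Borda totals are c1=5, c2=3, c3=1
```

Logistic regression would extend `borda` by weighting each classifier's counts before summing; ties in either method would then become far less likely.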
- Score-level Fusion: Scores are obtained after the application of the matching algorithms. These scores can be combined, but they must be normalized before combination to transform all the scores into the same interval  . This fusion is also termed fusion at the measurement level, confidence level, or opinion level. Decision- and rank-level fusion occur at a later stage of processing than score-level fusion. In general, score-level fusion provides better results, since more discriminative information is present at the score level than at the decision and rank levels. Scores can also be easily combined. Commercial biometric systems readily provide score data, since no security issue arises from exposing scores. Moreover, score-level fusion can be employed for any combination of biometric modalities, which is a major problem with feature-level fusion, and no information about how the matching scores are related is required. Due to all these advantages, most research studies address score-level fusion. It is classified into two main categories: classification and combination .
- Classification approach: In this approach, a multidimensional vector is formed from the distances generated by the several classifiers. This multidimensional vector is then provided to a classifier, whose final result is "Accept" (genuine individual) or "Reject" (impostor). For example, if the scores generated by speech-, face-, signature-, and handwriting-based recognition systems are 98, 80, 88, and 72, respectively, the vector (98, 80, 88, 72) is composed and classified as accept or reject. If the matching scores of the face and iris are x1 and x2, a two-dimensional feature vector (x1, x2) is constructed and a classifier such as FDA or a radial basis function neural network is employed for classification  . A reduced multivariate polynomial model can be used to integrate the classifiers by taking their outputs as its inputs  . Feature extraction involves a certain decision process (e.g., the difference of means should be high, the variances should be low) to select relevant features, and an estimator is required to calculate the mean and variance; thus, estimators can also be used for fusion. Learning algorithms optimize their parameters from training data; some examples are the GMM, HMM, etc. Most learning algorithms go through trial-and-error efforts in the training phase to optimize the model parameters, and the optimal solution may not be obtained in a finite number of iterations; a termination condition ends the iteration. The FNN is also an example of a learning algorithm for optimization  , but it is not very efficient; the ridge polynomial network is more efficient  . Several classifiers, like the SVM, multilayer perceptron, C4.5 decision tree, Fisher's linear discriminant, and Bayesian classifier, are used for fusion in  . Results show that the Bayesian classifier and SVM perform better than the others.
Data modeling is compulsory for the Bayesian classifier, while the SVM does not presume any specific data distribution. When classifiers must be combined very quickly, the reduced multivariate polynomial classifier can be used  , as its parameters are solved very quickly. The reduced multivariate polynomial model has the advantage over complex neural networks of straightforward model parameter calculation .
- Combination approach: In the combination approach, score normalization is performed first, after which the scores are combined using different techniques like simple sum, weighted sum, max rule, min rule, etc.  . In the context of verification, comparison of the fused score against a threshold yields the final decision. Independence of the classifiers is very beneficial  . If the classifiers are independent, then in the product rule,
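A minimal sketch of the combination approach for verification, assuming min-max normalization and the simple-sum rule; the matcher ranges, scores, and threshold are illustrative.

```python
def min_max_norm(score, lo, hi):
    # map a raw matcher score into [0, 1] using that matcher's known range
    return (score - lo) / (hi - lo)

def verify(raw_scores, ranges, threshold):
    # normalize each matcher's score, fuse with the simple-sum rule,
    # then compare the fused score against a decision threshold
    fused = sum(min_max_norm(s, lo, hi) for s, (lo, hi) in zip(raw_scores, ranges))
    return "accept" if fused >= threshold else "reject"

# face matcher emits scores in [0, 100], fingerprint in [0, 1] (illustrative)
print(verify([82.0, 0.71], [(0, 100), (0, 1)], threshold=1.2))  # accept
```

Without the normalization step, the face matcher's raw scale would swamp the fingerprint score entirely, which is why normalization must precede any combination rule.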
5.2 Rules of Fusion
- Non-probabilistic rules: The normalized score for user i (i = 1, 2,..., I, where I is the total number of individuals in the database) by matcher m (m = 1, 2,..., M, where M is the total number of matchers) is denoted as s_i^m, and f_i represents the fused score for user i. Some fusion rules are discussed in .
- Simple-Sum (SS): The scores of all the matchers for user i are summed to provide the fused score: f_i = s_i^1 + s_i^2 + ... + s_i^M.
The sum rule performs better than the others, since no probability estimation error occurs in it  and the scores of all the classifiers are considered .
- Min-score: The minimum score for user i is selected from among the classifiers. It is useful when a distance scale is used, as a smaller score is better on a distance scale.
- Max-score: The maximum score for user i is selected from among the classifiers. It is useful when a similarity scale is used, because a larger score is better on a similarity scale.
- Matcher Weighting: Weights are assigned to all the matchers based on different methods.
- Exhaustive assignment: Weights are assigned to all the matchers in multiples of 0.1, and all possible combinations are tried  . The combination giving the best result is selected as the final set of weights. Owing to the computational complexity of this method, other methods are tried.
- Weights are assigned to all the matchers based on their EERs. The EER of matcher m (m = 1, 2,..., M, where M is the total number of matchers) is represented as r_m, and the weight associated with matcher m is denoted as w_m. When the EER of a matcher is high, its weight should be low. For example, assume that there are two matchers. The weight associated with a matcher is inversely proportional to its error, and hence w_1 ∝ 1/r_1 and w_2 ∝ 1/r_2.
The sum of all the weights should be equal to 1, and hence w_1 = r_2/(r_1 + r_2) and w_2 = r_1/(r_1 + r_2).
The weights for more accurate matchers are thus higher than those for less accurate matchers. The MW fused score for user i is computed as f_i = w_1 s_i^1 + w_2 s_i^2 + ... + w_M s_i^M.
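The EER-based matcher weighting described above can be sketched as follows; the EERs and normalized scores are illustrative.

```python
def eer_weights(eers):
    # weight each matcher in inverse proportion to its equal error rate,
    # then normalize so the weights sum to 1
    inv = [1.0 / r for r in eers]
    total = sum(inv)
    return [w / total for w in inv]

def mw_fused_score(scores, eers):
    # weighted sum of the (already normalized) matcher scores for one user
    return sum(w * s for w, s in zip(eer_weights(eers), scores))

# two matchers: EERs of 2% and 4% give weights 2/3 and 1/3
w = eer_weights([0.02, 0.04])
print([round(x, 4) for x in w])                          # [0.6667, 0.3333]
print(round(mw_fused_score([0.9, 0.6], [0.02, 0.04]), 4))  # 0.8
```

Note that the function generalizes beyond two matchers, since the inverse-EER weights are normalized by their sum rather than computed pairwise.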
- User Weighting: A weight is assigned to every pair of user and matcher, (i, m), based on the following approaches:
- A weight, a multiple of 0.1 in the range (0, 1), is assigned to every pair of user and matcher, (i, m). This exhaustive weight assignment becomes impractical if the number of matchers or users is very high
- The wolf-lamb concept is employed to obtain the weight corresponding to each user and each matcher. Users who can easily be imitated are referred to as lambs; in contrast, users who can imitate other users are referred to as wolves. When lambs and wolves are present in the system, the accuracy of biometric recognition decreases, as false acceptances increase. A lambness metric can be derived for every pair of user and matcher, (i, m), to calculate the weight for each pair. If user i is a lamb for matcher m, the weight associated with this matcher for user i is decreased; the main aim is to reduce the error rates by decreasing the lambness of user i. It is assumed that the mean and standard deviation of the genuine and impostor score distributions are known for every (i, m) pair; when they are not known, they are estimated from the obtained matching scores, in which case estimation error can create problems. Let μ_{i,m}^{gen} denote the mean of the genuine distribution for the pair (i, m), μ_{i,m}^{imp} the mean of the impostor distribution, and σ_{i,m}^{gen} and σ_{i,m}^{imp} the standard deviations of these distributions, respectively. The d-prime metric can be employed to assess how well these two distributions are separated. The lambness metric for the pair (i, m) can be measured as given in ,
d_{i,m} = (μ_{i,m}^{gen} − μ_{i,m}^{imp}) / sqrt((σ_{i,m}^{gen})^2 + (σ_{i,m}^{imp})^2)
where d(i,m) represents the lambness metric for the pair (i, m). A small d(i,m) means that the difference between the means is small or the variances are high; an overlap region is then present, and user i is a lamb for some wolves. In contrast, a large d(i,m) means that the difference between the means is large or the variances of the distributions are low; the overlap region is then very small or absent, and user i is a lamb for very few wolves or for none. Since a small d(i,m) indicates that user i is a lamb for some wolves under matcher m, the weight for the pair (i, m) should be small to decrease the lambness; that is, w(i,m) is proportional to d(i,m). Assuming that there are two matchers, w(i,1) ∝ d(i,1) and w(i,2) ∝ d(i,2).
The sum of all the weights should be equal to 1, and hence w(i,1) = d(i,1)/(d(i,1) + d(i,2)) and w(i,2) = d(i,2)/(d(i,1) + d(i,2)).
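The user-specific weighting can be sketched as follows (illustrative Python; the d-prime form below is one common separability measure, and the distribution parameters are invented for the example):

```python
import math

# Sketch of wolf-lamb (user-specific) weighting for one user i across
# matchers. The d-prime definition here is one common variant; the exact
# formula in the literature differs by constant factors.

def d_prime(mu_gen, sigma_gen, mu_imp, sigma_imp):
    """Separation of genuine and impostor score distributions for one
    (user, matcher) pair: a large d' means the user is rarely a lamb."""
    return abs(mu_gen - mu_imp) / math.sqrt(sigma_gen**2 + sigma_imp**2)

def user_weights(d_primes):
    """Per-matcher weights proportional to d', summing to 1."""
    total = sum(d_primes)
    return [d / total for d in d_primes]

# User i with two matchers: matcher 1 separates this user's genuine and
# impostor distributions better, so it receives the larger weight.
d1 = d_prime(mu_gen=0.8, sigma_gen=0.1, mu_imp=0.2, sigma_imp=0.1)
d2 = d_prime(mu_gen=0.6, sigma_gen=0.2, mu_imp=0.4, sigma_imp=0.2)
w1, w2 = user_weights([d1, d2])  # w1 > w2; w1 + w2 == 1
```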
- Probabilistic rules: All the score fusion rules above require score normalization prior to fusion, whereas probabilistic methods directly provide values in (0, 1), so no score normalization is required. On the other hand, probability estimation error is a problem in probabilistic methods that does not arise in the non-probabilistic ones. Some probabilistic fusion rules are given below.
- Product rule: When the input pattern Z is provided to matcher i (i = 1, 2, ..., M, where M is the total number of classifiers), the extracted feature vector is x_i. Let w_j denote class j (j = 1, 2, ..., m, where m is the total number of classes), and let P(w_j | x_i) be the posterior probability that the input pattern Z belongs to class w_j given the feature vector x_i. Estimating the probability that Z belongs to class w_j given the feature vectors of all the classifiers requires the assumption that the representations are statistically independent  , and this leads to estimation errors. No such assumption is needed in score fusion rules such as the simple sum, max rule, etc. The input pattern Z is finally assigned to a class c ∈ {1, 2, ..., m}.
The product rule assumes that the representations (i.e., x_1, x_2, ..., x_M) are independent. Assuming equal class priors, the input pattern Z is assigned to class c such that
c = arg max_j Π_{i=1..M} P(w_j | x_i).
Different biometric modalities of a user (e.g., speech, face, signature) are mutually independent, and in general different representations, such as PCA and LDA features in face-based recognition systems, are also independent. Hence, the product rule can be used for MBSs under the independence assumption  . In one comparative study, the product rule outperformed all the other fusion rules  .
- Sum Rule: In uncontrolled and dynamic environments, the data obtained by the sensors contain considerable noise. The sum rule is very useful in this case, since summation is a smoothing operation; the product rule, in contrast, is affected much more by noise. The input pattern Z is assigned to class c such that
c = arg max_j Σ_{i=1..M} P(w_j | x_i).
- Max-rule: The max rule depends on the order of the matching scores generated by the matchers and can therefore be affected by noise (i.e., outliers). It assigns the input pattern to class c such that
c = arg max_j max_{i} P(w_j | x_i).
- Min-rule: The min rule also depends on the order of the matching scores generated by the matchers and can likewise be affected by noise (i.e., outliers). In this case, the input pattern is assigned to class c such that
c = arg max_j min_{i} P(w_j | x_i).
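The four probabilistic fusion rules above can be sketched in a few lines (illustrative Python; the posterior values are invented, and equal class priors are assumed):

```python
import math

# Sketch of the product, sum, max, and min fusion rules over per-matcher
# posterior probabilities P(w_j | x_i), assuming equal class priors.
# posteriors[i][j] = posterior for class j from matcher i.

def fuse(posteriors, rule):
    n_classes = len(posteriors[0])
    combined = []
    for j in range(n_classes):
        col = [p[j] for p in posteriors]  # all matchers' posteriors for class j
        if rule == "product":
            combined.append(math.prod(col))
        elif rule == "sum":
            combined.append(sum(col))
        elif rule == "max":
            combined.append(max(col))
        elif rule == "min":
            combined.append(min(col))
    return combined.index(max(combined))  # assigned class c

# Two matchers, three classes. Matcher 1 gives class 0 a near-zero
# posterior (an outlier), which vetoes class 0 under the product rule
# but is smoothed away under the sum rule.
post = [[0.01, 0.50, 0.49],
        [0.90, 0.05, 0.05]]
fuse(post, "product")  # class 1 (products: 0.009, 0.025, 0.0245)
fuse(post, "sum")      # class 0 (sums: 0.91, 0.55, 0.54)
```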
| 6. Performance Evaluation of Biometric System|| |
Several factors can degrade the performance of a biometric system: some biometrics change with age (e.g., face); sensors suffer wear and tear (e.g., scratches on a fingerprint sensor) or accumulate dirt; ambient conditions differ (e.g., environmental noise, illumination, temperature, humidity); intra-class variation occurs (e.g., a facial image may change with a different hairstyle, the presence or absence of eyeglasses, or cosmetic changes); two successive samples of the same biometric from the same person (e.g., two utterances of a sentence) are never exactly the same owing to environmental conditions and muscle operations (e.g., background noise, changes in mood); the physiological or behavioral characteristics of the user change (e.g., cuts on the finger); and the interaction of the user with the sensor varies (e.g., finger placement, pose, focus, angle), as does the choice of threshold. The genuine distribution is plotted from the matching scores of genuine matches, and the impostor distribution from the matching scores of impostor matches.
On a similarity scale, if a person is actually an impostor but the matching score is higher than the threshold, he is treated as genuine, which increases the FAR; hence performance also depends on the selection of the threshold value. A prediction model can be employed to find the optimal sensor fusion combination instead of performing brute-force experiments. A prediction model based on the area under the ROC curve, the likelihood ratio, and a discriminability measure is described in  ; the authors show how, given the characteristics of the individual sensors, the performance of sensor-level fusion can be predicted, and how good the prediction is.
Biometric recognition systems mainly make four kinds of errors  [Figure 7]:
6.1 Decision Errors
A wrong decision by the system is termed a decision error. Decision errors are divided into two categories: false accept and false reject.
- False Acceptance Rate: The rate of accepting unauthorized persons as authorized persons is labeled the FAR. On a similarity scale, if a person is actually an impostor but the matching score is higher than the threshold, he is treated as genuine; this is termed a false acceptance. A person can be falsely accepted by both identity (positive-ID) and non-identity (negative-ID) systems. In a negative identity system, each person's biometric data should be stored only once in the database; if the name or ID of any person is changed in the database, then under an explicit claim there will be no match and the person will be falsely accepted. In some systems, the maximum number of wrongful attempts is fixed to increase security; a system may, for example, allow at most three attempts, after which the user is rejected whether genuine or impostor. FAR is also referred to as Type II error. Performance analysis of an MBS using FAR and FRR is described in  .
- False Rejection Rate: The rate of rejecting authorized persons is termed the FRR. On a similarity scale, if a person is actually genuine but the matching score is lower than the threshold, he is treated as an impostor; this is termed a false rejection. A person can be falsely rejected by both identity (positive-ID) and non-identity (negative-ID) systems. If a non-identity system is used with an implicit claim, matching takes place against all the stored reference models; even a person who is new to the system will be rejected if his/her biometric matches that of any other person. FRR is also referred to as Type I error. The performance of a system cannot be assessed properly using FAR and FRR alone: both depend on the size of the database, so a confidence interval should be reported along with the FAR or FRR to characterize the performance correctly .
Decision errors are due to matching errors and/or image acquisition errors (or, in some systems, binning errors)  . How these errors combine into decision errors depends on:
- whether one-to-one or one-to-many matching is required: the more persons involved in one-to-many matching, the greater the chance of a false acceptance;
- whether the claim is positive or negative: under a positive claim, a person is falsely accepted if his biometric matches that of any other person in the database; a person can also be falsely accepted in a non-identity (negative-ID) system with an explicit claim, since each person's biometric data are stored only once, and a person who changes his/her name or ID will be falsely accepted because no match is found; and
- the decision policy, e.g., whether the system permits multiple attempts or a single attempt: when only a single attempt is allowed, the FAR will be low but the FRR will be high, so there is a trade-off.
6.2 Matching Errors
Matching errors are divided into two classes, false match and false non-match; they are due to the matching algorithms and the selection of the threshold.
- False Match Rate: The False Match Rate (FMR) is the probability that the system incorrectly declares a successful match between the input pattern and a non-matching pattern in the database. This rate is critical in defense applications, where no unauthorized person should enter the secure area. A false match is also called a false positive .
- False Non-Match Rate: The False Non-Match Rate (FNMR) is the probability that the system incorrectly declares an unsuccessful match between the input pattern and a matching pattern in the database. This rate is critical in forensics, where no suspected person should be missed. A false non-match is also called a false negative .
The difference between decision errors and matching errors is as follows. "FMR" and "FNMR" are often used as if they were synonyms of "FAR" and "FRR," respectively, but they are not: FAR and FRR result from FMR and FNMR combined with image acquisition errors and, to some extent, binning errors. One transaction may consist of more than one attempt, depending on the decision policy; false match/non-match rates are calculated over the number of comparisons, whereas false accept/reject rates are calculated over the number of transactions.
In a positive identification system, some systems allow a maximum of three attempts to match an enrolled template. If in these three attempts the person is unable to match, owing to the matching algorithm, the threshold selection, or the sample quality, a false rejection results. If the person is falsely matched to any of the stored reference models in any of the three attempts, the result is a false acceptance.
In a negative identification system, the claim of an unenrolled person will be falsely rejected if his sample falsely matches any one of the stored reference models, and the claim of an enrolled person will be falsely accepted if his current sample falsely non-matches his stored reference model.
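The threshold-dependence of false acceptance and false rejection described above can be sketched as follows (illustrative Python with invented score lists; not from the paper):

```python
# Sketch: FAR and FRR at a given threshold on a similarity scale, computed
# from example genuine and impostor matching scores. Illustrative data only.

def far_frr(genuine_scores, impostor_scores, threshold):
    # False accept: impostor score at/above threshold (treated as genuine).
    fa = sum(1 for s in impostor_scores if s >= threshold)
    # False reject: genuine score below threshold (treated as impostor).
    fr = sum(1 for s in genuine_scores if s < threshold)
    return fa / len(impostor_scores), fr / len(genuine_scores)

genuine = [0.9, 0.8, 0.85, 0.4]
impostor = [0.3, 0.2, 0.75, 0.1]
far, frr = far_frr(genuine, impostor, threshold=0.5)  # far=0.25, frr=0.25
```

Raising the threshold lowers the FAR but raises the FRR, which is exactly the trade-off discussed above.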
6.3 Acquisition Errors
Acquisition errors occur when a person does not have a biometric of adequate quality (e.g., a worn fingerprint) or when the sensor cannot collect proper data because of wrong interaction with it. Acquisition errors are classified into two groups: failure to enroll and failure to acquire.
- Failure to Enroll: A failure to enroll happens when the data obtained by the sensor are considered invalid or of poor quality, when the user is unable to produce the required biometric (e.g., a user without hands), or when users are unable to match against their own enrolled samples to confirm the enrollment, which may be due to a drunken or unhealthy state, etc. In some systems, if a user cannot enroll on a particular date, enrollment may be reattempted after some days so that a rejection caused by such a temporary condition can be overcome.
- Failure to Acquire/Failure to Capture: A failure to acquire happens when the biometric is present but, owing to mishandling of the system, cannot be captured. It is the probability that the system fails to detect a biometric characteristic presented to the sensor. For example, when a person changes face orientation in front of a face recognition system, the system fails to acquire an image of sufficient quality.
6.4 Binning Error
In the identification phase, matching is performed in a one-to-many fashion and hence can take a very long time. To reduce the search space, the enrolled models are divided into different "bins." An input test sample is assigned to a particular bin according to the available information and matched only against the enrolled models in that bin, reducing computational complexity and time consumption. For example, enrolled models of terrorists may be binned by country, and matching then performed against the particular bin indicated by information gathered at the scene of the crime. If the number of partitions is large, the bin estimated from the available information may be wrong; the input test sample then matches no enrolled model in the assigned bin, which causes a binning error. Thus, the more partitions the database has, the lower the penetration rate but the higher the probability of a binning error.
When the whole database is divided into bins, there is no need to search the entire database for a particular input sample. The depth of search is measured by the penetration rate: a lower penetration rate means fewer comparisons, and hence less computational complexity and time consumption.
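The penetration rate can be illustrated as the expected fraction of the database searched per query (a sketch under the simplifying assumption that every query searches exactly one bin; the bin sizes and routing probabilities are invented):

```python
# Sketch: partitioning enrolled models into bins and computing the
# penetration rate, i.e., the expected fraction of the database that is
# searched per query. Illustrative data only.

def penetration_rate(bins, query_bin_probs):
    """bins: {bin_name: number of enrolled models};
    query_bin_probs: probability that a query is routed to each bin."""
    total = sum(bins.values())
    return sum(query_bin_probs[b] * bins[b] / total for b in bins)

bins = {"A": 500, "B": 300, "C": 200}      # 1000 enrolled models in 3 bins
probs = {"A": 0.5, "B": 0.3, "C": 0.2}     # query routing probabilities
rate = penetration_rate(bins, probs)       # 0.5*0.5 + 0.3*0.3 + 0.2*0.2 = 0.38
```

With no binning (one bin holding everything), the rate is 1.0; finer partitions lower it, at the cost of a higher chance of routing a query to the wrong bin.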
6.5 Other Commonly Used Errors
Some other useful errors and rates are:
- Total Cost: With the Cost of False Acceptance (C_FA) and the Cost of False Rejection (C_FR) both equal to one, the total cost is simply FAR + FRR, but the costs vary according to the application; for example, the cost of a false accept is higher in a military setting. The Total Cost (T_C) is then
T_C = C_FA · FAR + C_FR · FRR.
- Crossover Error Rate/Equal Error Rate: The rate at which the false accept and false reject errors are equal is termed the EER. There is a trade-off between FAR and FRR: decreasing one automatically increases the other. Since the performance of a system cannot be measured by a single error rate (FAR or FRR) alone, the EER is the commonly used parameter when a quick comparison of two systems is required. The EER can be obtained from the plots of FRR and FAR against the score threshold: the FAR and FRR value at the crossover point is the EER. The lower the EER, the more accurate the system is considered to be. The EER depends on the means and variances of the impostor and genuine distributions  ; it can be reduced by increasing the difference between the means of the genuine and impostor distributions and by decreasing the variances of both.
- Genuine Acceptance Rate: A genuine user is either accepted or rejected; rejection causes a false rejection, and acceptance a genuine acceptance, so the FRR and the Genuine Acceptance Rate (GAR) are complementary (GAR = 1 − FRR).
- Total Error Rate: There are two types of errors, false acceptance and false rejection, and hence the total error is
E = F_ARa + F_RRa,
where E is the total error, and F_ARa and F_RRa are the achieved false acceptance and rejection rates  . For a perfectly accurate system the FRR and FAR should be zero, but if the FRR and FAR are acceptable up to some desired values, then errors below those values are not counted, and the total error becomes
E = (F_ARa − F_ARd) + (F_RRa − F_RRd),
where F_ARd and F_RRd are the desired false acceptance and rejection rates.
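The EER can be located by sweeping the decision threshold until the FAR and FRR curves cross, as sketched below (illustrative Python; the score lists and the grid resolution are our own choices):

```python
# Sketch: sweep the threshold to trace FAR/FRR and estimate the EER, the
# rate at the crossover point of the two curves. Illustrative scores only.

def rates(genuine, impostor, t):
    far = sum(s >= t for s in impostor) / len(impostor)  # false accepts
    frr = sum(s < t for s in genuine) / len(genuine)     # false rejects
    return far, frr

def approx_eer(genuine, impostor, steps=1000):
    """Grid-search the threshold in [0, 1] for the smallest |FAR - FRR|."""
    best_t, best_gap = 0.0, float("inf")
    for k in range(steps + 1):
        t = k / steps
        far, frr = rates(genuine, impostor, t)
        if abs(far - frr) < best_gap:
            best_gap, best_t = abs(far - frr), t
    far, frr = rates(genuine, impostor, best_t)
    return (far + frr) / 2  # EER estimate at the crossover threshold

genuine = [0.9, 0.8, 0.7, 0.6, 0.3]
impostor = [0.5, 0.4, 0.35, 0.2, 0.65]
eer = approx_eer(genuine, impostor)  # one error of each kind out of five
```

Increasing the separation between the two score lists (or reducing their spread) lowers the estimated EER, matching the dependence on the distribution means and variances noted above.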
The error of every system should be small. In defense applications, C_FA is very high, and hence the FAR should be very low to reduce the cost; the corresponding cost of false rejection, C_FR = 2 − C_FA, is then very small, and hence the FRR can be higher. The performance of a system on a large population is checked by using the Commercial Off The Shelf (COTS) approach  .
Performance can also be checked using a confusion matrix C  . C_ij denotes the number of patterns that actually belong to class i but are assigned to class j. If there are M classes and patterns can also be rejected, then C is an M×(M+1) matrix. The total number of patterns belonging to class i is Σ_j C_ij; of these, C_ii are correctly assigned, and hence Σ_{j≠i} C_ij are incorrectly assigned. In the same way, the total number of patterns assigned to class j is Σ_i C_ij; of these, C_jj are correctly assigned, and hence Σ_{i≠j} C_ij are incorrectly assigned. The error can be computed from the incorrectly assigned patterns. If there are K classifiers, there are K confusion matrices C^(k), 1 ≤ k ≤ K, and the error corresponding to each classifier is computed.
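The confusion-matrix error computation can be sketched as follows (illustrative Python; the counts are invented, and the last column holds rejected patterns as described above):

```python
# Sketch: overall error from a confusion matrix C, where C[i][j] counts
# class-i patterns assigned to class j and the last column holds
# rejections. Illustrative numbers only.

def classification_error(C):
    total = sum(sum(row) for row in C)            # all patterns
    correct = sum(C[i][i] for i in range(len(C))) # diagonal = correct
    return (total - correct) / total              # incorrectly assigned

# M = 2 classes plus a rejection column -> C is M x (M + 1).
C = [[45, 3, 2],   # class 0: 45 correct, 3 misassigned, 2 rejected
     [5, 40, 5]]   # class 1: 40 correct, 5 misassigned, 5 rejected
err = classification_error(C)  # (100 - 85) / 100 = 0.15
```

With K classifiers, the same computation is repeated on each of the K matrices C^(k).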
A standard database should be used for evaluating the performance of any biometric system. If a biometric system is evaluated only on a local database, it cannot be compared with other techniques that were evaluated on standard databases; the other techniques could be rerun on the local database for comparison, but this is an exhaustive process. It is better to use standard databases, as they contain all the relevant variations  .
| 7. Trends in Biometric Person Authentication|| |
The modeling of person-specific information started in each of the signal processing fields as a scientific curiosity. For instance, speaker recognition started in the speech processing area well before it was included under the umbrella of biometrics, and the same holds for signature verification, fingerprint identification, face recognition, and so on. The proliferation of computers, the internet, and other modern communication means accelerated the need for person authentication schemes. Initially, non-biometric means were developed, but it was quickly realized that the security they offered could be compromised. With this, biometrics became an attractive alternative for person authentication.
The results of laboratory studies of each biometric were very promising, and the respective researchers felt that their biometric would be deployed as the means for person authentication. In field trials, however, it was quickly realized that any one biometric feature is insufficient for reliable person authentication under all conditions, because the operating conditions may vary significantly and the performance varies accordingly. In the case of speaker recognition, for example, the performance degrades significantly when the background noise is high, and it may also be possible to fool the system. At this juncture, the biometric community turned to the old saying "two is better than one," and the field of person authentication using multiple biometrics was born.
In multimodal biometric person authentication, typically more than one biometric feature is used. The hope is that performance degradation will not happen simultaneously for all the biometric features, and that it will be difficult to fool all of them simultaneously. Even though multimodal systems provide better performance than the unimodal case, they have not become the ultimate solution, which shows that there is ample scope for improvement in the multimodal biometric person authentication field as well. Among the different research interests, the recent ones include adaptive MBSs, complementary vs supplementary information, physiological biometrics, and so on.
The basis for an adaptive MBS is the argument that, for better performance, the biometric system needs to adapt to the person over time. For instance, in a highly noisy background, the collected speech data may be of poor quality; in such a scenario, the testing algorithm should suitably adjust its threshold relative to the other biometric modalities to provide reliable performance. With continuous usage, the sensor quality may degrade, resulting in noisy data, and the signal processing algorithms should be able to handle such noisy signals. The amount of data from a single enrollment session may be insufficient to capture all the variations of a person's biometric, and adapting the model with multi-session data will improve the performance of the system.
One of the major factors arguing for the use of multimodal biometrics is the availability of complementary person-specific information among different biometric features. For instance, speech and face can complement each other in a multimodal framework to provide robustness. In addition, other biometrics that provide supplementary information can also be used to improve robustness. For instance, speaker verification through the speech mode may be replaced by audiovisual speaker verification, which adds robustness to the speaker verification module by extracting supplementary information from the visual scene.
Spoofing is another major issue faced by existing biometric systems. One way to counter it is to employ liveness detection along with person authentication. For this, physiological signals like ECG, electroencephalogram (EEG), and electromyogram (EMG) come in handy: during both enrollment and testing, one of these physiological signals may be collected along with the other biometrics. The presence of these signals provides evidence that the same live person supplied the biometric data, and hence spoofing can be avoided.
| 8. Summary and Future Scope|| |
This paper started with an introduction to biometrics, their advantages and disadvantages, and the recognition systems based on them. Biometric selection criteria based on different requirements and available resources were also briefly discussed, followed by the classification of biometric systems and their strengths and limitations. Descriptions of different studies in the field of MBSs, the different modes of operation, levels of fusion, and integration scenarios were also provided. Classifications of sensor-level and feature-level fusion were carried out based on a survey of research papers in these areas. All the studies on MBSs demonstrate the performance improvement obtained by fusing the complementary information available in multiple biometrics. This paper also gave an overview of some performance parameters for biometric recognition systems. Finally, the current trends in the biometric person authentication field were mentioned.
Furthermore, challenges like spoof attacks, noisy data, and privacy issues remain in the biometric recognition task. Hence, research groups should pay more attention to the emerging physiological biometrics like EEG, ECG, and EMG in biometric-based authentication ,, . In practice, these are more robust to spoof attacks and do not raise the privacy concerns associated with the more sensitive biometrics like fingerprint and speech. On the other hand, substantial signal processing effort will be required to improve the recognition accuracy of these physiological biometrics and to combine them with the existing biometrics in multimodal applications. The recently developed periocular biometric of Unsang et al.  , which requires less subject cooperation during acquisition, may also attract the research community in the near future.
| References|| |
|1.||S. Nanavati, M. Thieme, and R. Nanavati, "Biometrics: Identity in a Networked World", In: M. Eldridge, Editor. New York: John Wiley; 2002. |
|2.||A. Abate, M. Nappi, D. Riccio, and G. Sabatino, "2d and 3d face recognition: A survey," Pattern Recognition Letters, Vol. 28, no. 14, pp. 1885-906, 2007. |
|3.||U. Rajanna, A. Erol, and G. Bebis, "A comparative study on feature extraction for fingerprint classification and performance improvements using rank-level fusion," in Proc. Pattern Analysis Application, pp. 263-72, 2010. |
|4.||A. Jain, S. Prabhakar, and H. Lin, "A multichannel approach to fingerprint classification," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 21, no. 4, pp. 348-59, 1999. |
|5.||L. Ern, and G. Sulong, "Fingerprint classification approaches: An overview," in Proc. Sixth Int. Symp. on Signal Processing and its Applications, Vol. 1, pp. 347-50, 2001. |
|6.||J.P. Campbell, "Speaker recognition: A tutorial," Proc. IEEE, Vol. 85, no. 9, pp. 1437-62, Sep. 1997. |
|7.||D. Pati, and S. Prasanna, "Speaker recognition from excitation source perspective," IETE Technical Review, Vol. 27, no. 2, pp. 138-57, Mar.-Apr. 2010. |
|8.||P. Kartik, S. Prasanna, and R. Prasad, "Multimodal biometric person authentication system using speech and signature features," in Proc. IEEE Region 10 Conf. (TENCON), pp. 1-6, 2006. |
|9.||R. Duda, P. Hart, and D. Stork, "Pattern Classification", 2 nd ed. New York: John Wiley; 2001. |
|10.||R. Gray, "Vector quantization," IEEE Trans. Acoustics, Speech, and Signal Process., Vol. 1, no. 12, pp. 4-29, Apr. 1984. |
|11.||Y. Linde, A. Buzo, and R.M. Gray, "An algorithm for vector quantizer design," IEEE Trans. Communications, Vol. 28, no. 1, pp. 84-95, Jan. 1980. |
|12.||V. Chatzis, A.G. Bors, and I. Pitas, "Multimodal decision-level fusion for person authentication," IEEE Trans. Systems, Man, and Cybernetics, Part A: Systems and Humans, Vol. 29, no. 6, pp. 674-80, Nov. 1999. |
|13.||D. Reynolds, and R. Rose, "Robust text-independent speaker identification using gaussian mixture models," IEEE Trans. Speech and Audio Process., Vol. 3, no. 1, pp. 72-83, Jan. 1995. |
|14.||F. Cardinaux, C. Sanderson, and S. Bengio, "User authentication via adapted statistical models of face images," IEEE Trans. Signal Processing, Vol. 54, no. 1, pp. 361-73, Jan. 2006. |
|15.||J. Wang, Y. Li, X. Ao, C. Wang, and J. Zhou, "Multi-modal biometric authentication fusing iris and palmprint based on gmm," in Proc. IEEE/SP 15 th Workshop on Statistical Signal Process. (SSP '09), pp. 349-52, Aug. 2009. |
|16.||S. Sahoo, and S. Prasanna, "Bimodal biometric person authentication using speech and face under degraded condition," in Proc. National Conf. on Communication (NCC'11), Jan. 2011, pp. 1-5. |
|17.||B. Yegnanarayana, and S. Kishore, "AANN: An alternative to GMM for pattern recognition," Neural Networks, Vol. 15, no. 3, pp. 459-69, Apr. 2002. |
|18.||H.S. Jayanna, and S.R. Prasanna, "Analysis, feature extraction, modeling and testing techniques for speaker recognition," IETE Technical Review, Vol. 26, no. 3, pp. 181-90, May-Jun. 2009. |
|19.||C.M. Bishop, "Neural networks for pattern recognition", New York: Oxford University Press; 1995. |
|20.||B. Yegnanarayana, K.S. Reddy, and S.P. Kishore, "Source and system features for speaker recognition using AANN models," in Proc. IEEE Int. Conf. Acoust. Speech and Signal Process., Salt Lake City, UT, USA, pp. 409-12, May 2001. |
|21.||L. Wang, K. Chen, and H. Chi, "Capture interspeaker information with a neural network for speaker recognition," IEEE Trans. Neural Network, Vol. 13, no. 2, pp. 436-45, Mar. 2002. |
|22.||A. Sao, and B. Yegnanarayana, "Face verification using template matching," IEEE Trans. Information Forensics and Security, Vol. 2, no. 3, pp. 636-41, Sep. 2007. |
|23.||R. Ramachandran, K. Farrell, R. Ramachandran, and R. Mammone, "Speaker recognition-general classifier approaches and data fusion methods," Pattern Recognition, Vol. 35, pp. 2801-21, Dec. 2002. |
|24.||A. Rosenberg, "Automatic speaker verification: A review," Proc. IEEE, Vol. 64, no. 4, pp. 460-87, 1976. |
|25.||F. Itakura, "Minimum prediction residual principle applied to speech recognition," IEEE Trans. Acoust., Speech and Signal Proc., Vol. 23, no. 1, pp. 67-72, Feb. 1975. |
|26.||K. Fukunaga, "Introduction to statistical pattern recognition", 2 nd ed. New York: Academic Press; 1972. |
|27.||A. K. Jain, "An introduction to biometric recognition," IEEE Trans. circuits and systems for video technology, Vol. 14, no. 1, pp. 4-20, Jan. 2004. |
|28.||A. Jain, R. Bolle, and S. Pankanti, "Biometrics: Personal identification in networked society", In: A.K. Jain, Editor. Berlin, Heidelberg: Kluwer Academic Publishers; 1999. |
|29.||J. L. Wayman, A.K. Jain, D. Maltoni, and D. Maio, Editors., "Biometric systems: Technology, design and performance evaluation", 1 st ed. Berlin, Heidelberg: Springer; 2004. |
|30.||J. D. Woodward Jr., N.M. Orlans, and P.T. Higgins, "Biometrics: Identity assurance in the information age," New York City, U.S: McGraw-Hill Osborne Media; 2002. |
|31.||E. Surer, "Multimodal biometric verification: Applications that identify," Saarbrücken: VDM Verlag; 2009. |
|32.||A. Ross, A.K. Jain, and D. Zhang, "Handbook of multibiometrics", Berlin, Heidelberg: Springer US: 2006. |
|33.||A. Jain, K. Nandakumar, and A. Nagar, "Biometric template security," EURASIP Journal on Advances in Signal Process., Vol. 2008, p. 17, 2008. |
|34.||J. Bhatnagar, B. Lall, and R. Patney, "Performance issues in biometric authentication based on information theoretic concepts: A review," IETE Technical Review, Vol. 27, no. 4, pp. 273-85, 2010. |
|35.||V. Mane, and D.V. Jadhav, "Review of multimodal biometrics: Applications, challenges and research areas," Int. Journal of Biometrics and Bioinformatics (IJBB), Vol. 3, no. 5, pp. 90-5, Nov. 2009. |
|36.||P. Grother, and E. Tabassi, "Performance of biometric quality measures," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 29, no. 4, pp. 531-43, Apr. 2007. |
|37.||M. Faundez-Zanuy, J. Fierrez-Aguilar, J. Ortega-Garcia, and J. Gonzalez-Rodriguez, "Multimodal biometric databases: An overview," IEEE Aerospace and Electronic Systems Mag., Vol. 21, no. 8, pp. 29-37, Aug. 2006. |
|38.||S. Dass, Z. Yongfang, and A. Jain, "Validating a biometric authentication system: Sample size requirements," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 28, no. 12, pp. 1902-13, Dec. 2006. |
|39.||K. Boyer, V. Govindaraju, and N. Ratha, "Introduction to the special issue on recent advances in biometric systems [guest editorial]," IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 37, no. 5, pp. 1091-5, Oct. 2007. |
|40.||S. Prabhakar, J. Kittler, D. Maltoni, L. O'Gorman, and T. Tan, "Introduction to the special issue on biometrics: Progress and directions," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 29, no. 4, pp. 513-6, 2007. |
|41.||D. Maltoni, D. Maio, A. Jain, and S. Prabhakar, "Multimodal biometric systems," in Handbook of Fingerprint Recognition, ser. Springer Professional Computing. London: Springer; pp. 233-55, 2003. |
|42.||L. Hong, A. Jain, and S. Pankanti, "Can multibiometrics improve performance?" in Proc. 1999 IEEE Workshop on Automatic Identification Advanced Technologies, 1999. |
|43.||L. I. Kuncheva, J.C. Bezdek, and R. Duin, "Decision templates for multiple classifier fusion: An experimental comparison," Pattern Recognition, Vol. 34, pp. 299-314, 2001. |
|44.||P. Gutkowski, "Algorithm for retrieval and verification of personal identity using bimodal biometrics," Information Fusion, Vol. 5, pp. 65-71, 2004. |
|45.||J. You, W.K. Kong, D. Zhang, and K.H. Cheung, "On hierarchical palmprint coding with multiple features for personal identification in large databases," IEEE Trans. Circuits and Systems for Video Technology, Vol. 14, no. 2, pp. 234-43, Feb. 2004. |
|46.||P. Pudil, J. Novovicova, S. Blaha, and J. Kittler, "Multistage pattern recognition with reject option," in Proc. 11 th IAPR Int. Conf. Pattern Recognition, Conf. B: Pattern Recognition Methodology and Systems, Vol. 2, pp. 92-5, 1992. |
|47.||H. El-Shishini, M. Abdel-Mottaleb, M. El-Raey, and A. Shoukry, "A multistage algorithm for fast classification of patterns," Pattern Recognition Letters, Vol. 10, no. 4, pp. 211-5, 1989. |
|48.||C. Sanderson, and K.K. Paliwal, "Identity verification using speech and face information," Digital Signal Processing, Vol. 14, pp. 449-80, 2004. |
|49.||G. L. Marcialis, and F. Roli, "Fingerprint verification by fusion of optical and capacitive sensors," Pattern Recognition Letters, Vol. 25, pp. 1315-22, 2004. |
|50.||X. Chen, P.J. Flynn, and K.W. Bowyer, "Ir and visible light face recognition," Computer Vision and Image Understanding, Vol. 99, pp. 332-58, 2005. |
|51.||A. Kumar, D.C. Wong, H.C. Shen, and A.K. Jain, "Personal verification using palmprint and hand geometry biometric," in Proc. 4 th Int. Conf. on Audio-and Video-Based Biometric Person Authentication (AVBPA), Guildford, U.K., pp. 668-78, Jun. 2003. |
|52.||R. Brunelli, and D. Falavigna, "Person identification using multiple cues," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 12, no. 10, pp. 955-66, 1995. |
|53.||R. Frischholz, and U. Dieckmann, "Bioid: A multimodal biometric identification system," IEEE Computer, Vol. 33, no. 2, pp. 64-8, 2000. |
|54.||U. Dieckmann, P. Plankensteiner, and T. Wagner, "Sesam: A biometric person identification system using sensor fusion," Pattern Recognition Letters, Vol. 18, no. 9, pp. 827-33, 1997. |
|55.||X. Pan, Y. Cao, X. Xu, Y. Lu, and Y. Zhao, "Ear and face based multimodal recognition based on kfda," in Proc. Int. Conf. on Audio, Language and Image Processing, 2008. |
|56.||L. Hong, and A.K. Jain, "Integrating faces and fingerprints for personal identification," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 20, pp. 1295-307, Dec. 1998. |
|57.||G. L. Marcialis, and F. Roli, "Experimental results on fusion of multiple fingerprint matchers," in Proc. 4th Int. Conf. on Audio- and Video-Based Biometric Person Authentication (AVBPA), Guildford, UK, pp. 814-20, Jun. 2003. |
|58.||A. Jain, S. Prabhakar, and S. Chen, "Combining multiple matchers for a high security fingerprint verification system," Pattern Recognition Letters, Vol. 20, no. 11-13, pp. 1371-9, 1999. |
|59.||A. Ross, A.K. Jain, and J. Reisman, "A hybrid fingerprint matcher," Pattern Recognition, Vol. 36, pp. 1661-73, Jul. 2003. |
|60.||K. Toh, W. Yau, and X. Jiang, "A reduced multivariate polynomial model for multimodal biometrics and classifiers fusion," IEEE Trans. Circuits and Systems for Video Technology, Vol. 14, no. 2, pp. 224-33, Feb. 2004. |
|61.||F. Yang, H. Abdi, and A. Monopoli, "Development of a fast panoramic face mosaicking and recognition system," Optical Engineering, Vol. 44, no. 8, pp. 087005-1-087005-10, Aug. 2005. |
|62.||Y. Zhang, J. Yang, and H. Wu, "A hybrid swipe fingerprint mosaicing scheme," in Proc. of Audio and Video-based Biometric Person Authentication (AVBPA), Rye Brook, NY, pp. 131-40, Jul. 2005. |
|63.||S. Sayeed, N.S. Kamel, and R. Besar, "A sensor-based approach for dynamic signature verification using data glove," Signal Processing: An International Journal, Vol. 2, no. 1, pp. 1-10. |
|64.||S. Singh, A. Gyaourova, G. Bebis, and I. Pavlidis, "Infrared and visible image fusion for face recognition," in Proc. SPIE Defense and Security Symposium (Biometric Technology for Human Identification), pp. 585-96, 2004. |
|65.||R. Singh, M. Vatsa, and A. Noore, "Integrated multilevel image fusion and match score fusion of visible and infrared face images for robust face recognition," Pattern Recognition, Vol. 41, no. 3, pp. 880-93, 2008. |
|66.||J. Wang, W.Y. Yau, A. Suwandy, and E. Sung, "Person recognition by fusing palmprint and palm vein images based on Laplacianpalm representation," Pattern Recognition, Vol. 41, pp. 1514-27, 2008. |
|67.||Y. Hao, Z. Sun, and T. Tan, "Comparative studies on multispectral palm image fusion for biometrics," in Proc. ACCV 2007, Part II, LNCS, 2007. |
|68.||A. Noore, R. Singh, and M. Vatsa, "Robust memory-efficient data level information fusion of multimodal biometric images," Information Fusion, Vol. 8, pp. 337-46, 2005. |
|69.||A. Ross, S. Shah, and J. Shah, "Image versus feature mosaicing: A case study in fingerprints," in Proc. SPIE Conf. on Biometric Technology for Human Identification, pp. 1-12, 2006. |
|70.||W. Yau, K. Toh, D. Jiang, T. Chen, and J. Lu, "On fingerprint template synthesis," in Proc. Int. Conf. on Control, Automation, Robotics and Vision, 2001. |
|71.||S. Singh, A. Gyaourova, G. Bebis, and I. Pavlidis, "Infrared and visible image fusion for face recognition," in Proc. SPIE Defense and Security Symposium, Vol. 5404, pp. 585-96, 2004. |
|72.||R. Singh, M. Vatsa, and A. Noore, "Hierarchical fusion of multi-spectral face images for improved recognition performance," Information Fusion, Vol. 9, pp. 200-10, 2008. |
|73.||G. Trunk, "A problem of dimensionality: A simple example," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 1, no. 3, pp. 306-7, 1979. |
|74.||A. Ross, and R. Govindarajan, "Feature level fusion using hand and face biometrics," in Proc. SPIE Conf. on Biometric Technology for Human Identification II, Orlando, USA, Vol. 5779, pp. 196-204, Mar. 2005. |
|75.||A. Kumar, and D. Zhang, "Personal recognition using hand shape and texture," IEEE Trans. Image Processing, Vol. 15, no. 8, pp. 2454-61, Aug. 2006. |
|76.||X. N. Xu, Z.C. Mu, and L. Yuan, "Feature-level fusion method based on KFDA for multimodal recognition fusing ear and profile face," in Proc. 2007 Int. Conf. on Wavelet Analysis and Pattern Recognition, Beijing, China, pp. 1306-10, Nov. 2007. |
|77.||S. Dupont, and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Trans. Multimedia, Vol. 2, no. 3, pp. 141-51, Sep. 2000. |
|78.||S. Bengio, "An asynchronous hidden Markov model for audio-visual speech recognition," in Proc. Advances in Neural Information Process. Systems, NIPS 15. Cambridge, Massachusetts: MIT Press; 2003. |
|79.||S. Nakamura, "Statistical multimodal integration for audiovisual speech processing," IEEE Trans. Neural Networks, Vol. 13, no. 4, pp. 854-66, Jul. 2002. |
|80.||A. V. Nefian, L.H. Liang, T. Fu, and X.X. Liu, "A Bayesian approach to audio-visual speaker identification," in Proc. 4th Int. Conf. on Audio- and Video-Based Biometric Person Authentication, Guildford, UK, pp. 761-9, 2003. |
|81.||A. Nefian, L. Liang, X. Pi, L. Xiaoxiang, C. Mao, and K. Murphy, "A coupled HMM for audio-visual speech recognition," in Proc. Int. Conf. on Acoustics, Speech and Signal Processing, pp. 2013-6, 2002. |
|82.||K. Woods, K. Bowyer, and W. Kegelmeyer, "Combination of multiple classifiers using local accuracy estimates," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 19, no. 4, pp. 405-10, 1997. |
|83.||Y.A. Zuev, and S.K. Ivanov, "The voting as a way to increase the decision reliability," Journal of the Franklin Institute, Vol. 336, pp. 361-78, 1999. |
|84.||L. Lam, and C.Y. Suen, "Application of majority voting to pattern recognition: An analysis of its behavior and performance," IEEE Trans. Systems, Man, and Cybernetics, Vol. 27, no. 5, pp. 553-68, Sep. 1997. |
|85.||L. I. Kuncheva, Combining Pattern Classifiers: Methods and Algorithms. Hoboken, New Jersey: John Wiley; 2004. |
|86.||L. Xu, A. Krzyzak, and C.Y. Suen, "Methods of combining multiple classifiers and their applications to handwriting recognition," IEEE Trans. Systems, Man and Cybernetics, Vol. 22, pp. 418-35, 1992. |
|87.||F. Roli, J. Kittler, G. Fumera, and D. Muntoni, "An experimental comparison of classifier fusion rules for multimodal personal identity verification systems," Multiple Classifier Systems, pp. 325-36, 2002. |
|88.||K. Tumer, and J. Ghosh, "Linear and order statistics combiners for pattern classification," Combining Artificial Neural Nets, pp. 127-62, 1999. |
|89.||K. Veeramachaneni, L.A. Osadciw, and P.K. Varshney, "An adaptive multimodal biometric management algorithm," IEEE Trans. Systems, Man, and Cybernetics, Part C: Applications and Reviews, Vol. 35, no. 3, pp. 344-56, Aug. 2005. |
|90.||S. Pigeon, and L. Vandendorpe, "Image-based multimodal face authentication," Signal Processing, Vol. 69, pp. 59-79, 1997. |
|91.||T. Ho, J. Hull, and S. Srihari, "Decision combination in multiple classifier systems," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 16, no. 1, pp. 66-75, 1994. |
|92.||K. S. Fu, and Y.T. Chien, "Sequential recognition using a nonparametric ranking procedure," IEEE Trans. Information Theory, Vol. IT-13, no. 3, pp. 484-92, Jul. 1967. |
|93.||A. Jain, K. Nandakumar, and A. Ross, "Score normalization in multimodal biometric systems," Pattern Recognition, Vol. 38, pp. 2270-85, 2005. |
|94.||A. Ross, and A. Jain, "Information fusion in biometrics," Pattern Recognition Letters, Vol. 24, no. 13, pp. 2115-25, 2003, (special issue on multimodal biometrics). |
|95.||Y. Wang, T. Tan, and A. Jain, "Combining face and iris biometrics for identity verification," in Proc. Fourth Int. Conf. on AVBPA, Guildford, UK, pp. 805-13, 2003. |
|96.||K. A. Toh, "Deterministic global optimization for FNN training," IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 33, pp. 977-83, Jun. 2003. |
|97.||Y. Shin, and J. Ghosh, "Ridge polynomial networks," IEEE Trans. Neural Networks, Vol. 6, no. 3, pp. 610-22, 1995. |
|98.||S. Ben-Yacoub, Y. Abdeljaoued, and E. Mayoraz, "Fusion of face and speech data for person identity verification," IEEE Trans. Neural Networks, Vol. 10, pp. 1065-74, 1999. |
|99.||K. A. Toh, and W.Y. Yau, "Combination of hyperbolic functions for multimodal biometrics data fusion," IEEE Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 34, no. 2, pp. 1196-209, Apr. 2004. |
|100.||R. Snelick, U. Uludag, A. Mink, M. Indovina, and A. Jain, "Large-scale evaluation of multimodal biometric authentication using state-of-the-art systems," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 27, no. 3, pp. 450-5, Mar. 2005. |
|101.||L. I. Kuncheva, C.J. Whitaker, C.A. Shipp, and R.P. Duin, "Is independence good for combining classifiers?" in Proc. of Int. Conf. on Pattern Recognition (ICPR), Barcelona, Spain, Vol. 2, pp. 168-71, Aug. 2000. |
|102.||R. P. Duin, and D.M. Tax, "Experiments with classifier combining rules," in Proc. 1st Workshop on Multiple Classifier Systems, Cagliari, Italy: Springer; Vol. LNCS 1857, pp. 16-29, Jun. 2000. |
|103.||S. Ribaric, and I. Fratric, "A biometric identification system based on eigenpalm and eigenfinger features," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 27, no. 11, pp. 1698-709, Nov. 2005. |
|104.||J. Kittler, J. Matas, K. Jonsson, and M.R. Sanchez, "Combining evidence in personal identity verification systems," Pattern Recognition Letters, Vol. 18, pp. 845-52, 1997. |
|105.||J. Kittler, M. Hatef, R. Duin, and J. Matas, "On combining classifiers," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 20, no. 3, pp. 226-39, Mar. 1998. |
|106.||L. I. Kuncheva, "A theoretical study on six classifier fusion strategies," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 24, pp. 281-6, Feb. 2002. |
|107.||G. Shakhnarovich, and T. Darrell, "On probabilistic combination of face and gait cues for identification," in Proc. Int. Conf. on Automatic Face and Gesture Recognition, pp. 169-74, 2002. |
|108.||R. Wang, and B. Bhanu, "Performance prediction for multimodal biometrics," in Proc. 18th Int. Conf. on Pattern Recognition (ICPR'06), 2006. |
|109.||A. J. Mansfield, and J.L. Wayman, "Best practices in testing and reporting performance of biometric devices," National Physical Laboratory and San Jose State University, Tech. Rep., Aug. 2002. |
|110.||S. K. Dahel, and Q. Xiao, "Accuracy performance analysis of multimodal biometrics," in Proc. 2003 IEEE Workshop on Information Assurance, United States Military Academy, West Point, NY, pp. 170-3, Jun. 2003. |
|111.||R. M. Bolle, S. Pankanti, and N.K. Ratha, "Evaluation techniques for biometrics-based authentication systems (FRR)," in Proc. 15th Int. Conf. on Pattern Recognition. Yorktown Heights, NY 10598: IEEE, 2000. |
|112.||N. Poh, and S. Bengio, "How do correlation and variance of base-experts affect fusion in biometric authentication tasks?" IEEE Trans. Signal Processing, Vol. 53, no. 11, pp. 4384-96, Nov. 2005. |
|113.||V. Kanhangad, A. Kumar, and D. Zhang, "Comments on an adaptive multimodal biometric management algorithm," IEEE Trans. Systems, Man, and Cybernetics, Part C: Applications and Reviews, Vol. 38, no. 6, pp. 841-3, Nov. 2008. |
|114.||M. Indovina, U. Uludag, R. Snelick, A. Mink, and A. Jain, "Multimodal biometric authentication methods: A COTS approach," in Proc. MMUA 2003, Workshop on Multimodal User Authentication, Dec. 2003. |
|115.||L. Lam, and C. Suen, "Optimal combination of pattern classifiers," Pattern Recognition Letters, Vol. 16, pp. 945-54, 1995. |
|116.||P. Phillips, P. Rauss, and S.Z. Der, "FERET (Face Recognition Technology) recognition algorithm development and test results," Army Research Laboratory, Tech. Rep. ARL-TR-995, Oct. 1996. |
|117.||K. Revett, F. Deravi, and K. Sirlantzis, "Biosignals for user authentication - towards cognitive biometrics?" in Proc. Int. Conf. on Emerging Security Technologies (EST), pp. 71-6, 2010. |
|118.||N. Venkatesh, and S. Jayaraman, "Human electrocardiogram for biometrics using DTW and FLDA," in Proc. 20th Int. Conf. on Pattern Recognition (ICPR), pp. 3838-41, 2010. |
|119.||S. Marcel, and J. Millan, "Person authentication using brainwaves (EEG) and maximum a posteriori model adaptation," IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 29, no. 4, pp. 743-52, 2007. |
|120.||U. Park, R. Jillela, A. Ross, and A. Jain, "Periocular biometrics in the visible spectrum," IEEE Trans. Information Forensics and Security, Vol. 6, no. 1, pp. 96-106, Mar. 2011. |
| Authors|| |
Soyuj Kumar Sahoo was born in India in 1984. He received the B.E. degree with first class honors in electronics and telecommunication engineering from Eastern Academy of Science and Technology, Utkal University, Bhubaneswar, India, in 2006 and the M.Tech degree in Digital Signal Processing from Indian Institute of Technology Guwahati, India, in 2011. He is currently pursuing his Ph.D. degree in Multimedia Communication and Signal Processing at University of Erlangen-Nuremberg, Germany. His research interests are in biometric person authentication, speaker recognition, source separation, and dereverberation.
Tarun Choubisa was born in India in 1986. He received the B.E. degree with honors in electronics and communication engineering from Engineering College Kota, Rajasthan University, India, in 2008 and the M.Tech degree in Digital Signal Processing from Indian Institute of Technology Guwahati, India, in 2010. He is currently pursuing his Ph.D. degree in Electrical Communication Engineering at Indian Institute of Science, Bangalore, India. His research interests are in biometric person authentication, machine learning, digital signal processing, and optical signal processing.
S. R. Mahadeva Prasanna (M'05) was born in India in 1971. He received the B.E. degree in electronics engineering from Sri Siddartha Institute of Technology, Bangalore University, Bangalore, India, in 1994, the M.Tech. degree in industrial electronics from the National Institute of Technology, Surathkal, India, in 1997, and the Ph.D. degree in computer science and engineering from the Indian Institute of Technology Madras, Chennai, India, in 2004. He is currently an Associate Professor in the Department of Electronics and Electrical Engineering, Indian Institute of Technology, Guwahati. His research interests are in speech and signal processing.
[Figure 1], [Figure 2], [Figure 3], [Figure 4], [Figure 5], [Figure 6], [Figure 7]