D3.10: Biometrics in identity management

Quality factors of biometric systems

Several physiological and/or behavioural characteristics are apt to serve as a biometric identifier to recognise a human subject. Such identifying characteristics however have to fulfil some mandatory and some desirable qualities.

Mandatory qualities of a biometric characteristic:

Universality – each individual has the biometric characteristic and can become a biometric capture subject
Distinctiveness – any two individuals have sufficiently distinct biometric characteristics to be separable by suitable extracted biometric features
Permanence – the biometric characteristic must be sufficiently invariant over a long period of time
Collectability – the biometric characteristic can be measured through the evaluation of emitted physical signals

Desired qualities of a biometric characteristic:

Performance – the measurement of the biometric characteristic is robust, fast, accurate and efficient
Acceptance – the delivery of the biometric characteristic is well accepted by the individuals who are biometric capture subjects
Reliability – the biometric characteristic is not easy to forge and the delivery not easy to circumvent by fooling the system

For practical implementations, the following additional considerations have to be taken into account:

The size of the population that has to be enrolled in the biometric system
The biometric application mode, identification or verification
The control scheme of the biometric system components (see below)
The environment for the different processing steps (delivery, capture, extraction, template creation, comparison and matching)
The purpose of the biometric system relative to the security policy
The proportionality of the recognition and data collecting process relative to the intended transaction
The organisational integration in the superordinate identity management system
The ergonomic integration of the biometric delivery process in the authentication protocol of the authorised users, so as to achieve long-term user acceptance
The costs and the requirements on the infrastructure

Biometric system errors

Any biometric processing system relies on a physical measurement step, which is intrinsically error-prone. The systematic and statistical errors of the measurement, together with the algorithms of the biometric extraction and comparison processes, define the limits of application of the biometric system and its capability to separate different individuals. Each biometric template measurement of an individual represents a point in the phase space of the biometric feature vectors. The separation capability of a biometric system depends on the distribution of these points in the feature phase space. If the points representing templates from one individual are clustered in a narrow region that is typically far away from the clusters of all other individuals, the separation capability of the system is high and the error rates are low. Conversely, if the clusters are large and overlapping, the separation capability is low and errors of the first kind (false non-recognition) and of the second kind (false recognition) occur.

There are various reasons for such an overlap of template clusters in the feature vector space. First, the biometric characteristic may not be very distinctive or may vary naturally from measurement to measurement; voice and planar signatures are examples of biometric identifiers with low separation capability. Second, the measurement and feature extraction processes may be poor or oversimplified and reduce a high-dimensional feature vector to a few components with a much poorer separation capability. Early commercial fingerprint systems and many face recognition systems are examples of such oversimplified systems. A third reason for system deterioration is the scaling of centralised systems to large populations. If a central database collects so many reference templates that the cluster region of each individual heavily overlaps with the regions of other individuals, the system will no longer work in the identification mode and becomes susceptible to impostor attacks even in the verification mode.

Figure 7 below illustrates this problem. On the left side, the 3-D feature vector phase space of a biometric system is filled with the points of the reference templates of a given population. Each point represents a measured reference template of an individual. On the right side, the lines represent the measured distances between the feature vectors of the reference templates and those of the corresponding query templates (one endpoint represents the reference template, the other endpoint the query template in the same phase space). The lengths of the straight lines indicate the typical cluster size of the feature vector distribution of one biometric characteristic of an individual in the population. This example clearly would not lead to a good biometric classifier, as the clusters of different individuals overlap heavily. The example is taken from an oversimplified fingerprint recognition system and was selected to illustrate the scaling problem.

Figure 7: Distribution of reference feature vector points in a 3-D feature phase space (left side) and connection lines between the reference vectors and a set of corresponding query vectors (right side). The lengths of the lines indicate the typical cluster size of the feature vectors of one subject in the phase space. The example shows a biometric system with insufficient separation capability for the chosen population.

The scaling problem becomes especially serious in applications with non-cooperative individuals, where the system must provide evidence of false identity claims or of attempts by individuals to enrol multiple times under different identities. The scaling problem can be solved by an appropriate conceptual architecture of the biometric system. The overlap problem can evidently be reduced by reducing the number of reference templates that have to be recognised. In the ideal case, the biometric system only has to recognise query templates coming from one single individual, and therefore only one reference template is stored in the template storage. The acceptance region in the feature space around the point defined by the single stored reference template can then be tuned optimally to this reduced problem. This architecture is realised in the concept of so-called encapsulated biometrics, where each individual carries a personal identity assistant device with a full biometric system integrated in it. Such a system will be presented in the planned FIDIS deliverable D3.14.

The multiple enrolment problem can be reduced by the use of multimodal biometrics. If the system detects that newly delivered reference templates of different and independent biometric characteristics overlap with the already registered reference templates of one single enrolee, it is very likely that the newly claimed identity is claimed by the same person who has already enrolled under a different identity.

Biometric system failures

In addition to the system errors with false results of the recognition process, there are system failures where the biometric system is unable to process the biometric data. If such a failure happens in the initial enrolment mode, we speak of a failure to enrol (FTE); if it happens in a query process, we speak of a failure to acquire (FTA) or a failure to capture (FTC). For a clear definition of these failure notions, we have to break down the biometric processing into three hierarchical steps:

Presentation
A presentation is the interaction of an individual with the capture component of the biometric system. One or more presentations may be necessary or permitted to constitute an attempt to deposit a template. In a typical decision policy, failure to acquire the biometric data sufficient to constitute an attempt after a certain number of presentations represents a failed attempt.
Attempt
An attempt is the presentation of a biometric identifier and the capture of the biometric data for the preprocessing, feature extraction and template generation step. One or more attempts may be necessary or permitted to constitute a biometric transaction, depending on whether the system requires or allows multiple sample templates of a biometric identifier characteristic. In a typical decision policy, an inability to enrol or match a template subsequent to a certain number of attempts constitutes a failed transaction.
Transaction
A transaction is the successful completion of a biometric processing step either in the enrolment or in the query mode. A biometric recognition may consist of one or several biometric transactions using a certain biometric identifier characteristic. The inability to complete a biometric transaction leads to the two following failure types:

FTE – Failure to enrol

The FTE is defined as the probability that an individual attempting to enrol in the biometric system is unable to succeed. Inability means that the individual exhausted the maximum number of presentations and/or attempts without completing the number of transactions required to define a valid reference template.

FTA (FTC, FTM) – Failure to Acquire (Failure to Capture or Failure to Match)

The FTA (FTC, FTM) is defined as the probability that an individual attempting to pass a recognition step in the biometric system is unable to deliver a query template for the regular running of the comparison step. Inability means either that the individual exhausted the maximum number of presentations and/or attempts without completing the number of transactions required to define a query template with sufficiently distinctive features to run a comparison step (FTC), or that the comparison step fails for some reason without delivering a correct matching score (FTM).

The two failure rates are not necessarily equal, and each individual has his or her own specific failure rates.

The corresponding values over a population are the averages over the individual values for FTE and FTA over the population in question. The failure rates are in principle not dependent on the operation mode. An identification process as well as a verification process both need valid reference and query templates. However, the design of a verification and an identification system may lead to different definitions of what a valid template is, which in turn influences the failure rates.
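The presentation, attempt and transaction hierarchy described above can be sketched in code. The following Python sketch is illustrative only: the function names, the callables present and process, and the limits max_presentations and max_attempts are assumptions introduced here and are not part of the deliverable.

```python
from typing import Callable, Optional


def run_attempt(present: Callable[[], Optional[bytes]],
                max_presentations: int) -> Optional[bytes]:
    """One attempt: allow up to max_presentations presentations.

    present() is a hypothetical capture function that returns a usable
    sample, or None if the capture component could not acquire one.
    """
    for _ in range(max_presentations):
        sample = present()
        if sample is not None:
            return sample
    return None  # failed attempt


def run_transaction(present: Callable[[], Optional[bytes]],
                    process: Callable[[bytes], bool],
                    max_presentations: int,
                    max_attempts: int) -> bool:
    """One transaction: allow up to max_attempts attempts.

    process(sample) is a hypothetical step (template generation in
    enrolment, or comparison in query mode) returning True on success.
    A False result corresponds to a failure to enrol in enrolment mode
    and to a failure to acquire in query mode.
    """
    for _ in range(max_attempts):
        sample = run_attempt(present, max_presentations)
        if sample is not None and process(sample):
            return True
    return False  # failed transaction


def population_failure_rate(individual_rates: list) -> float:
    """Population value as the average over the individual failure rates."""
    return sum(individual_rates) / len(individual_rates)
```

With this structure, relaxing max_presentations or max_attempts lowers the failure rates at the cost of a longer interaction for the capture subject.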

Statistical measurement errors

A typical response of a biometric system in a template comparison step is a so-called matching score S, which is a measure of the correspondence between the two templates. The matching score is compared with a recognition threshold T to decide whether or not two templates originate from the same biometric characteristic. If S >= T (assuming that the matching score increases with better correspondence between the two templates), the templates are considered as matching; if S < T, they are considered as non-matching. The choice of the threshold T is clearly critical for the rate of false non-matches of two templates coming from the same biometric characteristic (error of the first kind, type I) and for the rate of false matches of two templates that come from different biometric characteristics (error of the second kind, type II). Expressed more formally, with the stored reference template R and the acquired query template Q, the null and alternative hypotheses are:

H0: the query template Q does come from the same biometric characteristic as the reference template R

H1: the query template Q does not come from the same biometric characteristic as the reference template R

And accordingly the associated decisions are

D0: the query template Q does come from the same biometric characteristic (same person) as the reference template R

D1: the query template Q does not come from the same biometric characteristic as the reference template R

An error is of type I when decision D1 is taken although H0 is true, and of type II when decision D0 is taken although H1 is true.
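The decision rule and the two error types can be written down in a few lines. This is a minimal Python sketch; the names Outcome, decide and classify are chosen here for illustration, and the flag same_source stands for ground truth that is only known in a test setting.

```python
from enum import Enum


class Outcome(Enum):
    CORRECT_MATCH = "correct match"
    CORRECT_NON_MATCH = "correct non-match"
    FALSE_NON_MATCH = "type I error (false non-match)"
    FALSE_MATCH = "type II error (false match)"


def decide(score: float, threshold: float) -> bool:
    """Return True for decision D0 (match) if S >= T, else False (D1)."""
    return score >= threshold


def classify(score: float, threshold: float, same_source: bool) -> Outcome:
    """Compare the decision with the ground truth (H0 true if same_source)."""
    matched = decide(score, threshold)
    if same_source:  # H0 is true
        return Outcome.CORRECT_MATCH if matched else Outcome.FALSE_NON_MATCH
    # H1 is true
    return Outcome.FALSE_MATCH if matched else Outcome.CORRECT_NON_MATCH
```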

Figure 8 below shows typical distributions of the score parameters of comparisons between reference and query templates coming from the same biometric characteristic of the same subject (green distribution) and of comparisons between templates coming from different subjects (red line). The green field to the left of the threshold value T represents the total False Non Match Rate (FNMR) due to errors of the first kind, and the red field to the right of the threshold value represents the total False Match Rate (FMR) due to errors of the second kind. The two points (ZFR, ZFA) on the score parameter axis designate the critical score parameters below which the FNMR becomes zero (ZFR) and above which the FMR becomes zero (ZFA).

Figure 8: Typical distributions of the score parameters of comparisons between templates coming from the same biometric characteristic of a subject (green line) and between templates coming from biometric characteristics of different subjects (red line).

The matching process delivers a matching score S = S(Q,R), which we assume to be normalised to the interval [0,1], with the value S = 1 for a perfect match. This leads to the following error notions, which differ slightly between the verification and the identification mode.

Errors in the verification mode

FMR – False Match Rate

Total probability that the calculated matching score exceeds the threshold value T although the two templates do not come from the same biometric characteristic.

FNMR – False Non Match Rate

Total probability that the calculated matching score is below the threshold value T although the two templates come from the same biometric characteristic.
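In practice both rates are estimated from samples of comparison scores. The following Python sketch assumes two lists of scores in [0,1]: genuine (comparisons of templates from the same biometric characteristic) and impostor (comparisons of templates from different characteristics). The function names and the example scores are illustrative assumptions, not measured data.

```python
def false_match_rate(impostor, threshold):
    """Fraction of impostor comparisons with S >= T (wrongly accepted)."""
    return sum(s >= threshold for s in impostor) / len(impostor)


def false_non_match_rate(genuine, threshold):
    """Fraction of genuine comparisons with S < T (wrongly rejected)."""
    return sum(s < threshold for s in genuine) / len(genuine)


# Made-up example scores, for illustration only.
genuine = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor = [0.1, 0.2, 0.3, 0.45, 0.65]

fmr = false_match_rate(impostor, 0.5)       # one of five impostor scores >= 0.5
fnmr = false_non_match_rate(genuine, 0.5)   # one of five genuine scores < 0.5
```

Raising the threshold lowers the estimated FMR and raises the estimated FNMR, which is the trade-off discussed in the remainder of this section.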

Zero FMR

FNMR(ZFA) is the lowest FNMR that can be achieved without accepting any false matches. It corresponds to the lowest threshold value T = ZFA at which the FMR is still zero. This value is a measure of the total rate of ‘inconvenience’ incurred when a 100 % rejection of impostors is to be achieved.

Zero FNMR

FMR(ZFR) is the lowest FMR that can be achieved without any false non-matches. It corresponds to the highest threshold value T = ZFR at which the FNMR is still zero. This value is a measure of the total rate of ‘insecurity’ incurred when a 100 % acceptance of authorised individuals is to be achieved.
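Both operating points can be estimated from score samples. The sketch below assumes the decision rule S >= T and two lists of scores, genuine and impostor; the function names are illustrative. Note that on a finite sample the zero-FMR threshold is any value just above the highest observed impostor score, so the estimate says nothing about impostor scores that were not observed.

```python
def fnmr_at_zero_fmr(genuine, impostor):
    """Estimate FNMR(ZFA): FNMR at the lowest threshold with FMR = 0.

    With the rule S >= T, the FMR is zero for any threshold just above
    the highest impostor score, so every genuine score at or below that
    value is falsely rejected.
    """
    highest_impostor = max(impostor)
    return sum(s <= highest_impostor for s in genuine) / len(genuine)


def fmr_at_zero_fnmr(genuine, impostor):
    """Estimate FMR(ZFR): FMR at the highest threshold with FNMR = 0.

    With the rule S >= T, the highest such threshold is the lowest
    genuine score, so every impostor score at or above it is accepted.
    """
    zfr = min(genuine)
    return sum(s >= zfr for s in impostor) / len(impostor)
```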

EER – Equal Error Rate

The equal error rate is defined as the error rate at the specific threshold value Te where FMR = FNMR. The EER is a measure of the quality of a biometric system that operates in a typical commercial or civilian environment. For high-security or forensic applications, where either the FMR or the FNMR is the dominant criterion, the EER may not be a good quality indicator.
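On a finite sample of scores the two error rates rarely coincide exactly, so the EER is usually approximated. The Python sketch below is one simple way to do this, under the assumption that the threshold with the smallest gap between FMR and FNMR is an acceptable stand-in for Te; the function names are illustrative.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FMR, FNMR) at the given threshold, using the rule S >= T."""
    fmr = sum(s >= threshold for s in impostor) / len(impostor)
    fnmr = sum(s < threshold for s in genuine) / len(genuine)
    return fmr, fnmr


def equal_error_rate(genuine, impostor):
    """Approximate (Te, EER) by scanning all observed scores as thresholds."""
    candidates = sorted(set(genuine) | set(impostor))

    def gap(threshold):
        fmr, fnmr = error_rates(genuine, impostor, threshold)
        return abs(fmr - fnmr)

    t_e = min(candidates, key=gap)
    fmr, fnmr = error_rates(genuine, impostor, t_e)
    # Average the two rates in case they do not coincide exactly.
    return t_e, (fmr + fnmr) / 2
```

Scanning only observed scores is sufficient here because the empirical FMR and FNMR change only at those values.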

FAR – False Acceptance Rate

The false acceptance rate is closely related to the FMR. It is defined as the rate at which an impostor who uses his own biometric characteristics is successfully accepted as another subject. If a system requires several matches N to accept a biometric verification, the value of the FAR may be substantially different from the FMR.

In most civil application cases N=1 and the rates are only calculated relative to enrolled persons. In addition, the number of allowed presentations and attempts to create a valid template is sufficiently high to reduce the FTA rate to a very low value. Therefore:

\[ \mathrm{FAR}(T) \approx \mathrm{FMR}(T) \]

is a good approximation which is widely used in the literature and in the biometric community.

FRR – False Rejection Rate

Similarly, the false rejection rate is closely related to the FNMR. It is defined as the rate of rejection of (in principle) authorised persons relative to the total number of recognition processes of persons. If a system requires several successful matches N to accept a biometric verification, the value of the FRR may increase relative to the FNMR.

In most civil application cases N=1 and the rates are only calculated relative to enrolled persons. In addition, the number of allowed presentations and attempts to create a valid template is sufficiently high to reduce the FTA rate to a very low value. Therefore:

\[ \mathrm{FRR}(T) \approx \mathrm{FNMR}(T) \]

is a good approximation which is widely used in the literature and in the biometric community.

The approximations for FAR and FRR are further justified by the fact that the values of FMR(T) and FNMR(T) are themselves approximations, calculated over the biometric variability of the identifiers within a population. Exact definitions would require calculating the values of p(S) for each biometric characteristic and for each individual as a function of time and of the location within the feature phase space of the templates, which is infeasible in practice (see also figure 10 below). One has to be aware of this when error rates are calculated on a limited population sample and then used to predict the behaviour of a biometric system in other populations. Such values depend on the relative position of the considered templates in the feature phase space, but also on age, ethnic characteristics, profession and health conditions.

Relations between error parameters in the verification mode

It is often not clear for what purpose a biometric system will be implemented. Depending on the specific application, the biometric system should provide high security with a low FAR or high availability with a low FRR. To characterise a biometric system, the interdependence of the two parameters is best represented by the so-called Receiver Operating Characteristic curve:

ROC curve – Receiver Operating Characteristic

The ROC curve shows the FNMR (FRR) as a function of the FMR (FAR). The plot is most often drawn in a double logarithmic diagram. Sometimes the vertical axis shows the transformed value (1-FNMR) instead of the FNMR.

Figure 9: Typical ROC curve on a double logarithmic diagram. Alternative representations may show (1-FNMR) or have the direction of the axes flipped. The ROC curve is the characteristic quality curve of a biometric system.
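The points of such a curve can be obtained by sweeping the threshold over the observed scores. The following Python sketch assumes two lists of comparison scores, genuine and impostor, and the decision rule S >= T; the function name roc_points is illustrative. Plotting is left out, and points where either rate is exactly zero would have to be dropped or clipped before drawing a double logarithmic diagram.

```python
def roc_points(genuine, impostor):
    """Return a list of (threshold, FMR, FNMR) tuples, one per observed score.

    Thresholds are visited in ascending order, so FMR never increases and
    FNMR never decreases along the list.
    """
    points = []
    for threshold in sorted(set(genuine) | set(impostor)):
        fmr = sum(s >= threshold for s in impostor) / len(impostor)
        fnmr = sum(s < threshold for s in genuine) / len(genuine)
        points.append((threshold, fmr, fnmr))
    return points
```

Each tuple gives one operating point; the threshold itself does not appear on the ROC axes but is useful for choosing the working point of a deployed system.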

Errors in the identification mode

The main difference between the identification mode and the verification mode arises from the matching step. Instead of a clearly defined reference template R claimed by the individual who requested a verification of his biometrics, the comparison step has to use all possible reference templates Ri to calculate all matching scores Si(Q,Ri) over the full database. The result is in general not a clear match but a ranking list of reference templates with matching scores above the threshold value, {Si, …, Si'} > T. Clearly, the size of this list and the probability of an error rise with the number N of entries in the database of reference templates. The decision rule becomes more complex, as it is not clear which entry in the list of reference templates with matching scores above the threshold belongs to the individual who presented the query template. The notions of FMR and FNMR can only be extended to the identification mode in a more or less straightforward way if FMR << (1/N). For a given biometric system with defined ROC characteristics, this constraint clearly limits the potential size of the enrolled population. The error notions become slightly more complicated, as one has to distinguish between the following cases for a given query template Q and N reference templates (R1, …, RN):

Case A: All comparisons of Q with any Ri give matching scores Si(Q, Ri) < T. This means that the individual who delivered the biometric sample Q is either not enrolled in the database, or that the comparison of the query template with the right reference template Rq falls under the FNMR(T). Assuming that the individual was enrolled, the probability of this case is the probability that all (N-1) non-corresponding templates correctly do not match and that the single corresponding template falsely does not match:

\[ P_A = \bigl(1-\mathrm{FMR}(T)\bigr)^{N-1}\,\mathrm{FNMR}(T) \]

Case B: All comparisons of Q with any Ri except Rq give matching scores Si(Q, Ri) < T, and Sq(Q, Rq) > T. In this case the identification mode worked correctly. The chance that this happens is:

\[ P_B = \bigl(1-\mathrm{FMR}(T)\bigr)^{N-1}\,\bigl(1-\mathrm{FNMR}(T)\bigr) \]

Case C: The comparisons of Q with the Ri give k matching scores Si(Q, Ri) < T and (N-k) > 1 matching scores Sj(Q, Rj) > T, with Sq(Q, Rq) > T also among the accepted templates. In this case the identification mode delivers a set of candidates in which the right candidate is still present. For a given k, the chance that this happens is:

\[ P_C(k) = \binom{N-1}{k}\,\bigl(1-\mathrm{FMR}(T)\bigr)^{k}\,\mathrm{FMR}(T)^{N-1-k}\,\bigl(1-\mathrm{FNMR}(T)\bigr) \]

Case D: The comparisons of Q with the Ri give k > 0 matching scores Si(Q, Ri) < T and (N-k) matching scores Sj(Q, Rj) > T, with Sq(Q, Rq) < T not among the accepted templates. In this case the identification mode delivers a set of candidates in which the right candidate is no longer present. For a given k, the chance that this happens is:

\[ P_D(k) = \binom{N-1}{k-1}\,\bigl(1-\mathrm{FMR}(T)\bigr)^{k-1}\,\mathrm{FMR}(T)^{N-k}\,\mathrm{FNMR}(T) \]

The first terms in the expressions C and D are the binomial coefficients. It is clear that mathematically case A is a special case of case D with k=N and case B is a special case of case C with k=N-1. Interpreting all events that fall under case B and C as a correct match and all events falling under A or D as a false non match allows definition of the FNMR for the identification mode.

The FNMR does not change relative to the verification mode as only one reference template is implied in the calculation of the matching score that could lead to false rejection.
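The relations between the four cases can be checked numerically. The Python sketch below rests on an assumption: the N comparisons are treated as independent, each of the (N-1) non-corresponding templates falsely matches with probability FMR(T), and the corresponding template falsely fails to match with probability FNMR(T). The binomial expressions in the code follow from that assumption, and the function name is illustrative.

```python
from math import comb


def case_probabilities(fmr, fnmr, n):
    """Return (P_A, P_B, P_C, P_D) for an enrolled subject and n references.

    P_C and P_D are summed over all admissible k, with case B excluded
    from C and case A excluded from D.
    """
    others = n - 1  # non-corresponding reference templates
    p_a = (1 - fmr) ** others * fnmr
    p_b = (1 - fmr) ** others * (1 - fnmr)
    # C: right template accepted, k of the others rejected, k = 0 .. n-2
    p_c = sum(
        comb(others, k) * (1 - fmr) ** k * fmr ** (others - k) * (1 - fnmr)
        for k in range(0, others)
    )
    # D without A: right template rejected, k - 1 of the others rejected,
    # k = 1 .. n-1
    p_d = sum(
        comb(others, k - 1) * (1 - fmr) ** (k - 1) * fmr ** (n - k) * fnmr
        for k in range(1, n)
    )
    return p_a, p_b, p_c, p_d


p_a, p_b, p_c, p_d = case_probabilities(fmr=1e-3, fnmr=0.02, n=100)
total = p_a + p_b + p_c + p_d   # the four cases are exhaustive
identification_fnmr = p_a + p_d  # cases A and D: right candidate missed
```

Under this assumption the four probabilities sum to one, and the sum of cases A and D equals the FNMR(T) passed in, independent of n, in line with the statement that the FNMR does not change relative to the verification mode.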

In the same sense, we can interpret all events that fall under case D, with the exception of case A, as false match events. This gives a clue for determining the false match rate in the identification mode, for instance in forensic applications with a subsequent human-controlled evaluation of the remaining candidates. Assuming independent comparisons, and writing FMR_N(T) for the false match rate against N reference templates:

\[ \mathrm{FMR}_N(T) = \mathrm{FNMR}(T)\,\Bigl[1-\bigl(1-\mathrm{FMR}(T)\bigr)^{N-1}\Bigr] \]

In applications with automated identification, however, all cases with false matches have to be considered as false match events and therefore contribute to the FMR (cases C and D, excluding A and B):

\[ \mathrm{FMR}_N(T) = 1-\bigl(1-\mathrm{FMR}(T)\bigr)^{N-1} \]

It is evident that even with a very small FMR(T) this value rises rapidly to unacceptable levels when the size N of the database grows. This problem is known as the scaling problem. Useful applications of biometric systems in the identification mode with centralised processing against a large database are therefore largely limited to forensic purposes. The identification mode may, however, be very useful in distributed architectures with small local numbers of reference templates coming from a few persons or even just one person. The identification mode is best suited for personal biometric tokens that recognise only templates from the authorised user. In this case, the biometric processing and the threshold for the matching decision may even be adapted to the individual user. In such architectures, the identification mode is equivalent to the verification mode, but it omits the additional step in which the individual has to claim his identity. This claim is intrinsically realised, and the system becomes more convenient for the individual user.
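The growth with N is easy to illustrate numerically. The Python sketch below assumes independent comparisons, so that the probability of at least one false match among the (N-1) non-corresponding reference templates is 1 - (1 - FMR(T))^(N-1); the function name and the example values of FMR and N are illustrative assumptions.

```python
def identification_false_match_rate(fmr, n):
    """Probability of at least one false match against n - 1 other templates.

    Assumes independent comparisons with the same single-comparison FMR.
    """
    return 1.0 - (1.0 - fmr) ** (n - 1)


# Illustrative sweep: a single-comparison FMR of 1e-5 against growing databases.
single_fmr = 1e-5
rates = {
    n: identification_false_match_rate(single_fmr, n)
    for n in (1, 1_000, 100_000, 1_000_000)
}
```

For small values of N * FMR the result is close to (N-1) * FMR, which is why the condition FMR << (1/N) is needed for the identification mode to remain usable.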

Interpretation of the estimated errors

A restriction applies to all of the above formulas. They are strictly valid only in the ideal case where FMR(T) and FNMR(T) do not depend on the position of the compared templates in the feature vector phase space. In all other practical cases they are only more or less good estimators. An exact calculation of the real values would lead to rather infeasible integrations over the probability density functions of the feature vector distributions in the feature vector phase space. Such a discussion is clearly beyond the scope of this report.

But the following example, represented in figure 10 below, illustrates the problem. In figure 10, we see a (dimensionally reduced) distribution of feature vectors of reference templates in a 2D phase space. We consider 4 locations of a possible reference template feature vector with their acceptance range defined by the threshold value T (indicated by the red circles around the template point) and the corresponding location and acceptance range of query templates (denoted by green and purple circles).

Case A: The reference and the query template are too far away from each other to match, but both lie in a region of the phase space with low density of feature vectors. Thus the result of the comparison process is a no match (Type A of the above explained cases). The FMR(T) at this location is very low.
Case B: The reference and the query template are within the acceptance range from each other to match and both lie in a region of the phase space with low density of feature vectors. Thus the result of the comparison process is a single match with the right reference template (Type B of the above cases). The FMR(T) at this location is very low.
Case C: The reference and the query template are within the acceptance range from each other to match, but both lie in a region of the phase space with high density of feature vectors. Thus the result of the comparison process is a multiple match with many reference templates including the right one (Type C of the above cases). The FMR(T) at this location is much higher than the average FMR(T).
Case D: The reference and the query template are too far away from each other to match, and both lie in a region of the phase space with high density of feature vectors. Thus the result of the comparison process is a multiple match with many reference templates but missing the right one (Type D of the above cases). The FMR(T) at this location is much higher than the average FMR(T).

Figure 10: Distribution of feature reference vectors in a 2D phase space with the same acceptance range but in regions with different population densities.

The examples show that the FMR(T) is not necessarily a constant value for all types of biometric samples in the same system but depends also on the location of the feature vectors in their phase space. The performance of identification systems with large centralised databases may be especially biased by such effects.

It is possible to obtain a homogeneous distribution of the feature vectors by applying a topology-conserving transformation of the phase space, that is, by inserting a so-called self-organising topological map between the feature extraction component and the comparison component. Such a map reorganises the feature vectors homogeneously over a non-linearly transformed phase space. However, this neural network technique works only with a large and representative sample of training vectors. It is therefore rarely applied.

Valuation of a biometric system in identity management

Biometric recognition of individuals is a valuable method to establish a strong link between a person and a specific identity within a set of identities. It has the advantage that manipulating this link is substantially more difficult, both for the individual concerned and for potential impostors. It thereby becomes more difficult for an individual to hide an identity or to usurp new identities. On the other hand, biometric links between an individual and an identity are difficult to terminate, even if there are good and legal reasons to do so. Most biometric characteristics are stable over a long part of an individual's lifespan, much longer than typical business relationships. A widespread use of biometric applications for identity management for civil or business purposes may therefore harm the right to privacy of persons.

Another important point, outlined above, is that biometric techniques always include a measurement of a physical parameter. Like all physical measurements, such a process is intrinsically error-prone and can lead to false results in the identity verification mode and especially in the identification mode. It is therefore necessary that backup mechanisms and legal restrictions protect individuals who are wrongly authenticated or abusively profiled from severe consequences.
