D7.2: Descriptive analysis and inventory of profiling practices

5. Fields of application

After analysing the technologies and techniques of profiling, we now turn to a set of examples of profiling practices. These concern marketing, employment, the financial sector, forensics and e-learning.

5.1. Marketing in general

(Ana Canhoto, LSE)

Commercial enterprise has a great interest in knowing what the behaviour of its customers is or is likely to be. Since, considering the scale of commercial enterprise, it is not always possible for organisations to know each customer individually, they may seek indirect methods to learn about their customers, for instance by aggregating ‘records’ of transactions. The point is that retailers, for instance, do not want only to understand the behaviour of individual customers, but above all to be able to generalise from observed behaviour in order to make predictions about the behaviour of specific types of customers – that is, organisations want to develop ‘community knowledge’.

With this knowledge, the organisation can make informed strategic decisions. For instance, it can organise itself to respond in a specific way to observed behaviour. If a supermarket identifies the top 20 items bought by a key group of its users (e.g., sandwich-snackers) and stocks the shelves adjacent to the sandwich section accordingly, it can push up sales to this group. An enterprise can also use knowledge about its users’ current or expected behaviour to encourage or reward actions that are profitable for the organisation; this was the objective underlying the introduction of a loyalty card by the UK supermarket Tesco. Alternatively, a firm may wish to discourage certain types of behaviour. If it costs more to acquire a new customer than to hold on to an existing one, organisations will act to increase retention, that is, to reduce attrition or churn, among certain groups of customers. Chase Manhattan bank, for example, decided to reduce the required minimum balance in customers’ checking accounts when it realised that this was a key factor in the choice of bank for a key segment of its customers.
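The supermarket example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a `segment` label and a list of `items` per transaction), the segment names and the function name `top_items_for_segment` are hypothetical and do not describe any retailer's actual system.

```python
from collections import Counter

def top_items_for_segment(transactions, segment, n=20):
    """Return the n most frequently bought items for one customer segment."""
    counts = Counter()
    for record in transactions:
        if record["segment"] == segment:
            counts.update(record["items"])
    return [item for item, _ in counts.most_common(n)]

# Hypothetical aggregated transaction records (assumed layout).
transactions = [
    {"segment": "sandwich-snacker", "items": ["sandwich", "crisps", "cola"]},
    {"segment": "sandwich-snacker", "items": ["sandwich", "crisps"]},
    {"segment": "family-shopper", "items": ["nappies", "milk"]},
    {"segment": "sandwich-snacker", "items": ["sandwich", "apple"]},
]

print(top_items_for_segment(transactions, "sandwich-snacker", n=2))
# → ['sandwich', 'crisps']
```

The result describes the group, not any individual customer, which is the sense of ‘community knowledge’ used above.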

5.1.1 Customer Loyalty Programs

(Martin Meints, ICPP)

Customer loyalty programs have become very popular in Germany over the last five years. To gain a customer’s loyalty, the vendor grants a certain discount. We observe two types of customer loyalty programs on the German market:

Two party relationship programs, where we have a direct contract between customer and vendor
Three party relationship programs, where we have a more complicated relationship between customer, vendor and system operator granting the discount.

Figure 4 describes the three party relationship:

Figure 4: Three party relationship

In most cases, additional personal data are collected beyond what is needed for discount purposes. Typical examples are:

Date of birth
Several contact addresses (telephone numbers, e-mail etc.)
Which goods were purchased, when and where
Information on personal circumstances of life (e.g. family status, number of children, income etc.)

These data are mainly used for market research and advertising purposes, using profiling techniques. In a study commissioned by the “Bundesverband der Verbraucherzentralen e.V.” and carried out by ICPP in December 2003, 16 customer loyalty programs were investigated. Measured against the German Federal Data Protection Act (BDSG), numerous major and minor weaknesses and infringements were found. A central weakness of all the programs investigated is that, on grounds of trade secrecy, the place and time of storage of the data and the manner and purpose of processing were not described sufficiently. Where information is insufficient, a declaration of consent, which has to be based on free will, is not legally effective.

See the Appendix, section B, for further elaboration of the legal implications.

5.2 Employment

(Martin Meints, ICPP)

German supermarkets use profiling to detect unusual cash flows, which are often caused by embezzlement by cashiers. They pay particular attention to cash refund transactions. There are some well-known techniques for fraudulently taking money out of a cash till. One example is the use of false bottle-deposit receipts, usually for small amounts of money. In the profiles, cashiers using this method stand out through a higher-than-average rate of refund transactions. Further investigation is still necessary, but it can be carried out in a targeted fashion. In addition, data mining is used to gain insight into fraudulent techniques that are as yet unknown.
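A minimal sketch of such a refund-rate profile is given below. It is an assumption-laden illustration, not the method actually used by any supermarket: the cashier identifiers, the figures, the use of the median as the "average" and the cut-off `factor=3.0` are all hypothetical.

```python
from statistics import median

def flag_high_refund_rates(stats, factor=3.0):
    """Flag cashiers whose refund rate exceeds `factor` times the median rate.

    `stats` maps a cashier id to (refund_transactions, total_transactions).
    The median is used as a robust stand-in for the "average" rate, so that
    a single extreme cashier does not shift the baseline.
    """
    rates = {
        cashier: refunds / total
        for cashier, (refunds, total) in stats.items()
        if total > 0
    }
    baseline = median(rates.values())
    return sorted(c for c, rate in rates.items() if rate > factor * baseline)

# Hypothetical monthly figures: (refund transactions, all transactions).
stats = {
    "c1": (20, 1000),
    "c2": (30, 1000),
    "c3": (25, 1000),
    "c4": (20, 1000),
    "c5": (30, 1000),
    "c6": (200, 1000),
}

print(flag_high_refund_rates(stats))
# → ['c6']
```

A flag of this kind is only a starting point for the targeted investigation mentioned above; it does not by itself establish fraud.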

A retail chain from Switzerland claims to have caught 50 of its cashiers fraudulently taking money from the cash till, and to have saved €200,000 by using profiling techniques. No data are available for Germany.

5.3 Financial Sector

5.3.1 Anti-money laundering profiling

(Ana Canhoto, LSE)

Anti-money laundering (AML) regulation has gradually increased in scope and depth in recent years, and all major jurisdictions require businesses located within their boundaries to play their part in the prevention and detection of money laundering. As well as banking and finance, other sectors such as accountancy and legal services are required to establish procedures that facilitate the reporting of suspicious activity to the relevant law enforcement authorities. A number of spectacular fines in countries such as the USA, the UK and Spain within the last year have emphasised the degree to which financial regulators are concerned about this particular crime. More recently, terrorist outrages have concentrated attention on how the financial system, in particular, might detect and prevent the funding of such criminal activities.

A critical tool in AML is the Suspicious Activity Report (SAR). A regulated institution must prepare a SAR when it suspects that a customer (either an individual or an organisation) is trying to process financial proceeds from criminal activities through that institution. This report is channelled to the appropriate Financial Intelligence Unit (FIU), a specialised governmental agency created in every state with the purpose of identifying and reducing money laundering activity. The FIU analyses the reports received and forwards a number of cases for further investigation and, eventually, prosecution by the competent law enforcement agencies. This process is illustrated in Figure 5.

Figure 5: Basic model of FIU’s role in AML

The use of automated monitoring systems is often seen as a powerful ally in detecting suspicious activity, justified by the wholesale increase in the size of the typical transactional database and by a desire to keep compliance costs under control. These systems usually consist of powerful algorithms that sweep through the records stored in transaction databases, looking for patterns of financial behaviour that deviate from the norm. Such algorithms tend to be based on rationalist approaches which assume that human behaviour can be modelled through positivist relations of transaction data. This implies that there is a fixed, immutable entity-structure and that behaviour is bivalent (i.e., it can only be considered right or wrong, true or false, etc.). Such models take semantics as given and do not question the fundamental notions of individuality and identity. As a result, on top of heavy investments in technology, the organisations involved in anti-money laundering still need to employ large numbers of people to eliminate the false positives from the large number of transaction patterns that are deemed unusual after automated analysis. Such inefficiency hinders the performance of the system and, ultimately, contributes to the ability of money launderers to operate undiscovered.
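The bivalent character of such rules can be illustrated with a deliberately simple sketch. It is not taken from any real monitoring product: the function name `is_unusual`, the three-standard-deviation cut-off and the sample amounts are assumptions chosen for illustration.

```python
from statistics import mean, stdev

def is_unusual(history, amount, k=3.0):
    """Bivalent rule: True if `amount` lies more than k sample standard
    deviations from the mean of the customer's past amounts, else False."""
    if len(history) < 2:
        return False  # too little history to define a "norm"
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > k

# Hypothetical past transaction amounts for one customer.
history = [120.0, 95.0, 130.0, 110.0, 105.0]

print(is_unusual(history, 115.0))   # → False
print(is_unusual(history, 5000.0))  # → True
```

The rule returns only True or False. A legitimate one-off payment and a laundering attempt of the same size are flagged identically, which is why flagged transactions still have to be reviewed by people.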

An additional problem with the use of automated monitoring tools in AML is that the profiles typically rely on tried and tested money laundering typologies, which usually lag behind the ever-changing and increasingly complex methods of laundering money. Additionally, organisations tend to focus on the “usual suspects” and give more attention to anomalous activity coming from individuals with a given demographic profile. This is well illustrated by the case of a personal assistant who stole nearly £4.5m over two years from her bosses at Goldman Sachs by forging signatures on cheques.

5.3.2 Fraud prevention

(Martin Meints, ICPP)

In Germany, profiling techniques are used to minimise the risk a bank incurs when granting credit. For that purpose, German banks and other financial service providers (e.g. insurance companies) have founded the so-called “Schutzgemeinschaft für allgemeine Kreditsicherung (SCHUFA)”. With the consent of customers, which is required by the Federal Data Protection Act (BDSG) and given through the so-called “SCHUFA-Klausel”, banks and financial service providers transfer data about all bank accounts and the financial behaviour of German citizens to the “SCHUFA”. German citizens cannot avoid the “SCHUFA-Klausel”; no respectable bank or financial service provider offers services to customers who do not consent to it.

The general financial behaviour of reference groups is analysed using massive volumes of data about customers of various banks. The profiling yields a so-called scoring value derived from those reference group profiles. Its value lies between 0 and 1000 points. This value is assumed to express the risk based on past personal behaviour. Banks therefore use it together with other “SCHUFA” information (e.g. person-related information such as account information) to determine the risk of default and the conditions under which someone can obtain credit (e.g. interest rate, maximum amount of credit etc.). The scoring value is used by the banks as one instrument to meet the Basel II requirements.

The method of calculating those scoring values is not published in its entirety (trade secrecy), and a bank customer cannot obtain all the information stored at the “SCHUFA”, including the scoring value. In addition, the SCHUFA scoring value is claimed to violate the German Federal Data Protection Act (BDSG), especially § 6a BDSG. This method carries a risk of excluding people from financial activity for no obvious reason. Exclusion can be caused, for example, by mistaken assignment to a profile with a low scoring value. This in turn can lead to severe consequences such as not being able to open any bank account at all.

5.4 Forensics

(Zeno Geradts, NFI)

5.4.1 Current situation

In forensic science, many different databases currently exist that can be used to link cases and suspects:

Firearm: cartridge cases, bullets
Fingerprint
DNA
Face
Tool mark (e.g. screwdriver)
Shoe print
Handwriting
Paint and glass
Voice

In practice, there is experience with combining those databases in order to combine evidence; however, searching across databases is often not easy, since the data, the data models and the way data are entered may be at odds with one another.

If we consider digital evidence on the internet, for example in internet hacking cases, one needs to examine logs and other files. Here too, some cases have been submitted. A question that always arises in these cases is who was really behind the keyboard at a given moment. If biometric devices come into wider use (and the biometrics are not spoofed), it also becomes possible to follow persons. The logs of the mobile service provider’s antennas can also be used to determine the position of a person at a given time.

5.4.2 Expectation

We expect that in future the databases and data models will become more standardised, so that they can be combined with other databases such as:

Face and 3D images and other biometrics of everyone (ear, iris, fingerprints, DNA etc.)
Banking and insurance transactions: money laundering
Telecommunication traffic and interception (location GSM and internet)
All computer actions and storage
Records of toll gates / public transportation
On-board computers in private transportation (cars etc.)
GPS
Customer loyalty programs (air miles etc.)
Surveillance cameras (also satellite images)
Digital traces in domestic applications (e.g. coffee maker, microwave, heater)
Ambient intelligence

Examination and combination of data is currently possible in Dutch law if there is a severe crime involved; a court order is needed (depending on the kind of information).

For the passport, for example, it will be possible to track someone if the ICAO standard is implemented without any protection. The passport will contain a wireless chip, from which information concerning face and fingerprint can be extracted remotely. In current trials in the Netherlands, protection is being developed in such a way that information from the machine-readable zone of the passport is needed to read the chip. However, if countries implement these systems without any protection, information about the passport a person carries could be extracted remotely.

5.4.3 Discussion

The question arises whether building evidence from a combination of many different databases, including surveillance systems with unstructured data, is feasible. Moreover, the amount of data collected is growing very rapidly, and the question is whether it is feasible to store these data in an appropriate way.

Furthermore, it is expected that there will be more false positives when combining different databases. If a ‘cold’ hit is found in the database, which means that there was no prior information that a certain suspect would be involved in the case, false positives are possible. For example, if DNA were to be collected from all citizens of the world, and the search were against this database, then with current methods around 6 suspects would be found, perhaps more, as family relationships are not accounted for.
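The order of magnitude of this figure can be reproduced with a back-of-the-envelope calculation. The inputs are assumptions that the text does not state: a world population of roughly N = 6 × 10^9 and a random match probability of p = 10^-9 for an unrelated person. The expected number of coincidental matches is then:

```latex
\[
  E[\text{coincidental matches}] \approx N \cdot p
  = \left(6 \times 10^{9}\right) \cdot 10^{-9} = 6
\]
```

Relatives share more of their DNA than unrelated persons, so their match probability is higher than p, which is why the actual number could exceed this estimate.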

The question also arises whether the data in the databases have been entered correctly. Most databases contain data entry errors. For this reason, the databases need to be standardised before they are searched routinely. In the end, the evidence in some cases might be stronger, since the fact that other data were not found earlier can itself be used as evidence in certain cases. How far society wants to go with profiling in (forensic) databases depends, of course, on current legislation.

See the Appendix, section A, for further elaboration on forensic use of RFID and biometric profiling.

5.5 E-learning

(Thierry Nabeth, INSEAD)

5.5.1 Personalised profiling and e-Learning

Profiling represents a central element of the traditional world of education: student performance is systematically assessed through series of exams or similar processes, which aim at evaluating the student’s level of proficiency in a given domain, but also at validating capabilities that are to be officially certified by a diploma. The discipline of education (and learning) also has a long tradition of investigating the less tangible factors (such as motivation, desires, learning style, previous experience, or personality) that influence a student’s learning performance and/or the likelihood of completing a particular task well. In the latter case, different theories and tools (personality tests or intelligence tests) have been developed in order to “profile” the student, and in particular to identify the student’s characteristics and to assess his ability to “fit” a particular job (for instance, the characteristics of a student may make him unsuited to a job in which he would have to use manual skills, sustain his attention over long periods, and manage a lot of stress).

In the e-learning world, however, profiling appears to be considered in a significantly different way. Profiling in e-learning systems appears mainly in the following two forms:

Student modelling.

Adaptive learning systems.

5.5.2 Student modelling & profiling in LMS (Learning Management Systems)

We have previously seen how user modelling (student modelling in this case) represents an essential element of adaptive systems. However, student modelling is also, on its own, a critical component in the design of more traditional LMS (Learning Management Systems).

An LMS can be defined as an electronic learning environment that supports the online management, delivery and tracking of learning. In practice, an LMS gives a group of educators the possibility to deliver a series of online courses to students, to test and track the students’ achievement, and more generally to manage the educational process.

The student profile represents one of the important components of the LMS, since it is used to centralise all the information associated with a particular student, such as his name and contact details, but also information related to his educational background (such as his diplomas) and the progress of his work (what his assignments are, how well he performed in previous assignments, etc.). Interestingly, the representation of the student model has been the object of standardisation, which aims at facilitating the development of standard components and the interoperability between e-learning systems. More concretely, the IMS LIP (Learner Information Package) specification defines the student according to the following eleven dimensions: accessibility, activity, affiliation, competency, goal, identification, interest, QCL (certifications), relationship (relationship between the attributes), security key and transcript (performance of the learner).
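As an illustration only, the eleven dimensions can be pictured as the fields of a simple record. The class name `LearnerProfile`, the Python field names and the sample values below are hypothetical and are not the element names or data types of the IMS LIP specification, which defines an XML binding rather than a Python class.

```python
from dataclasses import dataclass, field, fields

@dataclass
class LearnerProfile:
    """Illustrative container mirroring the eleven dimensions listed above."""
    accessibility: list = field(default_factory=list)
    activity: list = field(default_factory=list)
    affiliation: list = field(default_factory=list)
    competency: list = field(default_factory=list)
    goal: list = field(default_factory=list)
    identification: dict = field(default_factory=dict)
    interest: list = field(default_factory=list)
    qcl: list = field(default_factory=list)           # certifications
    relationship: list = field(default_factory=list)  # links between attributes
    security_key: list = field(default_factory=list)
    transcript: list = field(default_factory=list)    # performance of the learner

# Hypothetical student record.
profile = LearnerProfile(identification={"name": "A. Student"})
profile.transcript.append({"course": "Statistics 101", "grade": "B"})

# List the dimensions that currently hold data.
filled = [f.name for f in fields(profile) if getattr(profile, f.name)]
print(filled)
# → ['identification', 'transcript']
```

Agreeing on a fixed set of dimensions in this way is what allows a profile to be exchanged between systems.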

To our knowledge, profiling in LMS is currently not particularly sophisticated, and relies on two ideas that will still need to be fully operationalised in the future: (1) Universal student management & interoperability: LMS are able to manage more centrally a student model and to interoperate with other systems. The profile of the student will increasingly be more complete and will result from a better capture of the actions and the achievements of the students, and from the connections to other databases (for instance between schools) that will be achieved more easily because of increased interoperability. (2) Better global exploitation of the students’ information as a whole in order to identify and to exploit some trends in the student population.

5.5.3 Intelligent e-learning applications

User adaptive (or personalised) systems, such as intelligent tutoring systems, represent another field of e-learning application in which user profiling plays an important role.

We have already described in this document how the user model (and the profiling of this information) represents a major component for the design of adaptive applications.

Although adaptivity in e-learning applications does not differ fundamentally from adaptivity in other categories of application, it is important to mention that research on adaptive e-learning applications has attracted considerable attention from the research community. Indeed, adaptive systems promise to revolutionise education by providing each student with a personal tutor, thereby addressing the problems of overcrowded classrooms and of students who do not get enough attention from the teaching staff.
