D3.8: Study on protocols with respect to identity and identification – an insight on network protocols and privacy-aware communication

Transport layer protocols

The third layer in the protocol hierarchy contains protocols like TCP and UDP. These two protocols are the foundation for many other protocols from the higher layers. TCP is used for reliable delivery of data streams, whereas UDP is mainly used for the fast, but unreliable transport of single data packets.

TCP

The Transmission Control Protocol (TCP) is one of the core protocols of the Internet protocol suite and underlies many important Internet protocols such as HTTP, FTP and POP3. TCP guarantees reliable delivery (the receiver acknowledges received packets; unacknowledged packets are sent again) and in-order delivery (by means of sequence numbers), which makes it the protocol of choice for applications where loss of data is not acceptable. Applications send a stream of data via so-called stream sockets to the TCP layer, which divides the stream into appropriately sized segments. Each of these “data chunks” is extended by a checksum, which allows the receiver to check for corrupted packets. Note, though, that the TCP checksum is in no way cryptographically secure: it detects data corruption caused by the network, not manipulation by an active attacker.
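The weakness of the checksum can be illustrated with a minimal sketch of the 16-bit ones' complement sum that TCP relies on. The function name and the sample message are illustrative assumptions; the real TCP checksum additionally covers the TCP header and a pseudo-header taken from the IP layer, which is left out here.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum, the arithmetic used by TCP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


original = b"PAY 0100 TO BOB "
# Swap the first two aligned 16-bit words: the content changes,
# but the sum (and therefore the checksum) does not.
tampered = original[2:4] + original[0:2] + original[4:]

print(original != tampered)
print(internet_checksum(original) == internet_checksum(tampered))
```

Because addition is commutative, reordering 16-bit words goes undetected. An active attacker can moreover simply recompute the checksum after changing the data, since no secret key is involved.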

TCP dynamically adapts the rate at which packets are sent; this is called congestion control. TCP packets are passed to the next lower layer, the IP layer.
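The text does not name a particular congestion control algorithm. As an illustration only, the following sketch simulates additive increase / multiplicative decrease (AIMD), one well-known strategy; the function name, the parameters and the loss pattern are assumptions made for this example.

```python
def aimd(loss_rounds, rounds, start=1.0, increase=1.0, decrease=0.5):
    """Return the sending window after each round of a simple AIMD scheme.

    loss_rounds: set of round indices in which a loss is detected (assumed input).
    """
    window = start
    history = []
    for r in range(rounds):
        if r in loss_rounds:
            window = max(1.0, window * decrease)  # back off sharply on loss
        else:
            window += increase                    # probe for more bandwidth
        history.append(window)
    return history


print(aimd({3}, 5))
```

The window grows by one unit per round and is halved when a loss is detected in round 3, which produces the sawtooth pattern typical of such schemes.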

Figure 7: Three-way handshake in TCP

In order to send data via TCP, a connection between the sending and the receiving computer has to be established. A three-way handshake is used for this, as illustrated in Figure 7 which shows connection establishment, data exchange and two-sided connection termination. The termination of the session is realised by a handshake too (Wikipedia: Transmission Control Protocol 2007).
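The connection establishment can be sketched as a small simulation of the sequence and acknowledgement numbers exchanged. This is not a network implementation; the function name and the tuple layout (flags, sequence number, acknowledgement number) are assumptions for illustration.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three segments of TCP connection setup as (flags, seq, ack)."""
    modulus = 2 ** 32  # sequence numbers are 32-bit values and wrap around
    syn = ("SYN", client_isn, None)
    syn_ack = ("SYN+ACK", server_isn, (client_isn + 1) % modulus)
    ack = ("ACK", (client_isn + 1) % modulus, (server_isn + 1) % modulus)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

Each side acknowledges the other's initial sequence number plus one, which is why both initial values become visible to anyone observing the handshake.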

Table 6 shows the header (bits 0 to 160/192) of a TCP packet.

Table 6: TCP header

The source and destination ports specify the ports used for the data transfer. The sequence number is used to control the order of the sent packets. The acknowledgement number serves as a receipt for received data. The checksum is used to check for corrupted packets, and the urgent pointer indicates data with high priority. The options field can contain non-mandatory information; its usage is application dependent.
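To make the header fields concrete, the following minimal sketch decodes the fixed 20-byte (160-bit) part of a TCP header with Python's `struct` module. The function name, the dictionary keys and the sample segment are assumptions for illustration.

```python
import struct


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed part of a TCP header; options and payload stay raw."""
    (src, dst, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_length = (off_flags >> 12) * 4  # data offset is given in 32-bit words
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_length": header_length,
        "flags": off_flags & 0x01FF,
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
        "options": segment[20:header_length],
        "payload": segment[header_length:],
    }


# Hand-built SYN segment (flag bit 0x002), no options, no payload.
sample = struct.pack("!HHIIHHHH", 49152, 80, 1000, 0, (5 << 12) | 0x002, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header["src_port"], header["dst_port"], header["header_length"])
```

All of these fields travel unencrypted unless a lower layer protects them, so an observer can read them without touching the payload.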

Identifiers and their uniqueness

The TCP protocol itself contains no data which can be used to identify a user (unless such information is contained within the unencrypted data part of the packet). In combination with the lower-level IP protocol, however, this no longer holds: the source and destination ports, together with the IP addresses of the sender and receiver, can identify both participating parties.

TCP is a good example of identifying information introduced by the “protocol obscurities” described in Section 2.2. Although the meaning of the Sequence number field is defined, its initial value is not. It is up to the implementer how the initial sequence number is chosen (e.g., randomly). The same is true, for instance, for the Window field used for flow control. As flow and congestion control have a key influence on the overall performance of TCP, many attempts are made to optimise them, e.g., by operating system manufacturers.

All this implementation-dependent information can be used for so-called TCP/IP stack fingerprinting. The goal is to guess which TCP/IP stack and operating system is running on a remote machine. TCP/IP fingerprinting (like many other fingerprinting methods) can be done actively or passively, and by observing or modifying. A passive observing attacker would just listen on the network links and try to guess the TCP/IP stack from the eavesdropped data packets. An active observing attacker would also send data packets, choosing them intelligently in order to obtain as much information as possible from the remote host. As this is still an observing attack, the attacker would behave according to the protocol. Of course he could also violate the protocol rules. This typically gives him more possibilities for guessing the TCP/IP stack, but it would also make his attack more obvious.
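A passive observing attempt can be sketched as a table lookup on implementation-dependent values. In the sketch below, the signature table is purely hypothetical (the entries are not real measurements and the stack names are placeholders), and the initial TTL, an IP-level field, is used alongside the TCP window size.

```python
# Hypothetical signatures: (initial TTL, initial window size) -> stack label.
HYPOTHETICAL_SIGNATURES = {
    (64, 5840): "stack A",
    (128, 65535): "stack B",
    (255, 4128): "stack C",
}


def guess_initial_ttl(observed_ttl: int) -> int:
    """Round an observed TTL up to the next common initial value.

    Every router on the path decrements the TTL, so the observed value
    is at most the initial one.
    """
    for initial in (32, 64, 128, 255):
        if observed_ttl <= initial:
            return initial
    raise ValueError("TTL out of range")


def passive_guess(observed_ttl: int, window: int) -> str:
    """Guess the stack from values seen in a single eavesdropped packet."""
    key = (guess_initial_ttl(observed_ttl), window)
    return HYPOTHETICAL_SIGNATURES.get(key, "unknown")


print(passive_guess(57, 5840))
print(passive_guess(57, 1234))
```

Real tools combine many more observations, but the principle is the same: values that the specification leaves open act as a signature of the implementation.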

Nmap (“Network Mapper”) is a well-known example of a tool which can perform TCP/IP fingerprinting. According to the developers, Nmap currently has more than 1500 fingerprints in its database.

Of course the identifying information described above can be used not only to guess the TCP/IP stack and operating system but also to link different TCP connections, especially if the guessed TCP/IP stack or operating system is a rather unusual one. Note that the information mentioned above can also be used to decide whether TCP packets belong to the same TCP stream. Imagine, for instance, that an attacker can monitor network traffic at different locations of the network but cannot easily “follow” the TCP packets because the communication partners use an anonymisation service operating at the IP layer. Nevertheless, chances are high that the attacker can still conclude who is communicating with whom just by looking at the TCP information.

Personal data

The ports can leak information about the application used without looking at the data content. This is possible since there are standard ports used in the Internet, like port 21 for FTP or port 80 for HTTP requests. If an application uses the Options field, this can contain personal information too, but this is very unlikely, since most applications will use the Data field for such information. The Data field can therefore contain the most sensitive information, i.e., the application data.

Linkability: identifiability and profiling

By monitoring and tracing TCP data (with tools like tcpdump), it is possible to profile a communication or Internet traffic in general. A profile can contain header information and/or the data sent. If the data are not encrypted, an analysis of their content is quite easy.

Even if the connection is protected by low-level methods like IPSec, a traffic analysis can take place, as Bissias et al. have shown (Bissias et al. 2005). This can be used for profiling. To identify a user out of many, either the data fields or the lower-level IP address have to be examined.

Avoidance or circumvention of information disclosure

The data contents can be encrypted to prevent eavesdropping. A secure tunnel can be used in order to reduce the possibility of traffic analysis, but this must be designed and implemented carefully since traffic analysis can still be accomplished despite a secure tunnel.

The usage of TLS can protect the data sent with the TCP protocol. The TLS protocol operates above the TCP protocol, but beneath application protocols like HTTP. Thus, TLS can be easily integrated into available products, since it does not require any changes to the application and transport protocols used. TLS has been developed from the SSL protocol. Its most common usage is the protection of sensitive data sent by the HTTP protocol over the Internet. TLS has the following security features:

Peer entity authentication;
Data confidentiality;
Data integrity;
Key generation and distribution;
Security parameter negotiation.

TLS is a two-layer protocol consisting of the TLS Handshake Layer and the TLS Record Layer. Figure 8 shows an illustration of the handshake protocol, which establishes a secured connection between server and client. The handshake protocol has multiple purposes. First, it is needed for the exchange of certificates, to ensure the identity of the communication partner (note, though, that client authentication is optional). Then, the client and server use the handshake to exchange their crypto-preferences, i.e., which cryptographic algorithms to use for the encryption and the integrity protection of messages. Lastly, the protocol is used for the exchange of the keys which are used for the cryptographic functions (Molva 1999).

Figure 8: TLS handshake protocol (“*” marks optional fields)

After a successful handshake the TLS Record Layer is instantiated. This layer is responsible for the actual usage of the agreed security methods. This means that, within the Record Layer, the data is encrypted and integrity-protected with a message authentication code, thus protecting it against eavesdroppers and against active attacks which try to change the communicated data. Furthermore, the record layer has built-in replay protection and functionality for deriving keys from the exchanged master key.
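From an application's point of view, both layers are usually hidden behind a library call. The following minimal sketch uses Python's standard `ssl` module; the helper names `make_client_context` and `open_tls_connection` are assumptions for illustration, and the host to connect to is left to the caller.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client-side settings: verify the server certificate and host name."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older versions
    return context


def open_tls_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection and run the TLS handshake on top of it."""
    raw = socket.create_connection((host, port), timeout=10)
    try:
        # The handshake (certificates, parameter negotiation, key exchange)
        # runs inside wrap_socket; afterwards the record layer protects
        # everything sent over the returned socket.
        return make_client_context().wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

After a successful call, `version()` and `cipher()` on the returned socket report the negotiated protocol version and cipher suite. The application protocol on top remains unchanged, which matches the observation that TLS requires no changes to the application and transport protocols.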

As described above, TLS provides data integrity and confidentiality, as well as peer authentication, although the latter is optional. TLS does not protect against traffic analysis, though. The protection level depends on the chosen cryptographic algorithms as well as on the keys used. It has to be noted that TLS only provides integrity and confidentiality on a hop-by-hop basis, e.g., directly from client to server, or from server to server. It does not provide end-to-end security, e.g., from client to client where data must pass through several servers. Figure 9 shows that the connection from client to server1 is secured by TLS, as is the connection from server1 to server2. But there is no directly secured connection from client to server2.

Figure 9: Illustration of the TLS hop-by-hop protection

UDP

UDP is the abbreviation of User Datagram Protocol and is defined in RFC 768. UDP offers fast but unreliable data transmission: packets may get lost without automatic retransmission, and no guarantee of order is given either. Because of these features, UDP is a lightweight protocol which delivers each packet independently of other packets. UDP keeps no state, cf. Figure 10, which shows request and response being stateless and independent of each other. Thus it is often used for services where small packets have to be sent quickly to many clients (or servers). Services which use UDP include DNS, streaming media like IPTV, Voice over IP and online games.

Figure 10: UDP data transfer
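The stateless request/response exchange can be sketched with two datagram sockets on the loopback interface. The function name and the upper-casing "service" are assumptions for illustration; no connection is set up, and each datagram is handled on its own.

```python
import socket


def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram to a local responder and return its reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.settimeout(2)
        client.settimeout(2)
        # No handshake: the request is simply sent.
        client.sendto(payload, server.getsockname())
        request, client_address = server.recvfrom(2048)
        # The reply goes to the source address and port of the request.
        server.sendto(request.upper(), client_address)
        response, _ = client.recvfrom(2048)
        return response


print(udp_round_trip(b"ping"))
```

Note that the responder learns the client's source port from the request itself, which is exactly the information that allows an observer to pair requests with responses.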

UDP, like TCP, uses ports to allow “parallel” sending and receiving of data in application-to-application scenarios. It has to be noted that UDP and TCP ports exist independently of each other, meaning that a port of a computer can be used at the same time for a TCP socket (stream socket) and a UDP socket (datagram socket). The UDP header has a very simple layout as shown in Table 7.

Table 7: UDP packet structure

As can be seen, the header of a UDP packet is noticeably smaller than that of a TCP packet, and so a UDP packet can be processed faster. But it is also evident that the lack of most of the TCP control fields leads to a loss of functionality.
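The compactness of the UDP header can be made concrete with a short sketch that builds a datagram by hand. The helper name is an assumption for illustration, and the checksum is left at zero, which in UDP over IPv4 means that no checksum was computed.

```python
import struct

# Source port, destination port, length, checksum: four 16-bit fields.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend an 8-byte UDP header (checksum left at 0) to the payload."""
    length = UDP_HEADER.size + len(payload)  # the length field covers header + data
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload


datagram = build_udp_datagram(40000, 53, b"hello")
print(UDP_HEADER.size, len(datagram))
print(UDP_HEADER.unpack(datagram[:8]))
```

The whole header occupies 8 bytes, compared with at least 20 bytes (160 bits) for a TCP header without options.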

Identifiers and their uniqueness

If UDP is used in the simplest possible case (sending just one packet), it does not reveal identifying information. In practice, however, a useful application often requires sending multiple packets and receiving answers from the communication partner(s). Through the port numbers (in conjunction with the address information of the underlying IP protocol) it is possible to link all these UDP packets.

It might also be possible to decide whether two UDP packets were sent from the same host even if the underlying IP information is anonymised by means of an IP-level anonymisation service. This linkage can be done with the help of the source port number, because the TCP/IP stacks of different operating systems use different algorithms for choosing it. Suppose, for instance, that the TCP/IP stack of a suspected sender is known to increase the UDP source port number one by one. If the attacker eavesdrops on a sequence of UDP packets arriving at a certain recipient and their source port numbers look random rather than consistently increasing, then, from the attacker’s point of view, it becomes less likely that these packets came from that sender.
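The attacker's reasoning can be sketched as a simple consistency check on observed source ports. The function name and the `max_gap` tolerance (which allows for ports consumed by other traffic of the same host) are assumptions for illustration, and wrap-around of the port range is ignored.

```python
from typing import Sequence


def consistent_with_incrementing(ports: Sequence[int], max_gap: int = 16) -> bool:
    """True if every port is slightly larger than the one observed before it."""
    return all(0 < later - earlier <= max_gap
               for earlier, later in zip(ports, ports[1:]))


print(consistent_with_incrementing([50000, 50001, 50003]))
print(consistent_with_incrementing([50000, 31337, 61000]))
```

A sequence that fails the check makes a sender with an incrementing port allocator less likely; a sequence that passes does not prove anything on its own, but it narrows down the set of candidates.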

Moreover, the data field may contain information which can be used for identifying participating parties.

Personal data

If the Data part of a UDP packet is not encrypted, personal data can potentially be read out. Among the header fields, the only one which could be considered personal is the Source Port, because it identifies an open port to which answers should be sent. But this is more a security problem than a privacy problem.

Linkability: identifiability and profiling

The combination of Source Port and Destination Port can identify packets which contain data belonging to the same application-to-application session. If the data parts of the packets are not encrypted, identifiable properties can be read out by an eavesdropper.

Avoidance or circumvention of information disclosure

Sending the content in clear-text can be avoided by using encryption, either application level encryption or network layer encryption like IPSec. Traffic analysis can take place even if network layer encryption is used.

The usage of the Source Port is optional. Therefore, if the source port is not needed for a reply, it should be omitted (RFC 768 specifies a value of zero for an unused source port) or set to a random value.

TLS, described above for TCP, cannot be used for UDP without changes, because TLS requires a reliable transport protocol like TCP or SCTP. To overcome this shortcoming, DTLS (Datagram Transport Layer Security) has been created. DTLS is based on TLS, meaning that it provides confidentiality and integrity as well as authentication. It is specified in RFC 4347 and is now waiting for widespread deployment.

SCTP

SCTP is a transport layer protocol and thus operates analogously to UDP and TCP. In fact, SCTP provides some services similar to those of TCP, particularly reliable and in-sequence delivery with congestion control. SCTP was initially developed because call setup for Voice over IP faced some severe problems when using TCP. The main problem is that the in-sequence delivery of TCP packets could disrupt the call establishment of independent calls. The IETF working group SIGTRAN, then responsible for designing a mechanism for reliably transporting call control signalling over the Internet, decided to create a different protocol to overcome these problems. The development of SCTP began.

The main difference between SCTP and TCP is that SCTP uses message streams instead of byte streams. SCTP allows multiple streams of messages within one association. The message streams are independent of each other, but each stream provides reliable and in-sequence data delivery. This is useful for applications in which multiple streams are related but not dependent on each other. An example is a video conference in which the video and audio data are related but not dependent. If the video is slowing down, the audio should continue to run smoothly if possible.
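The benefit of independent streams can be illustrated with a small simulation of what the receiver may hand to the application while one lost message is still awaiting retransmission. This is a toy model, not SCTP itself; the function name, the stream labels and the loss pattern are assumptions for illustration.

```python
def deliverable(sent, lost, per_stream):
    """Return the labels the receiver can deliver before any retransmission.

    sent: list of (stream, label) in send order.
    lost: set of indices into sent that did not arrive.
    per_stream: True for per-stream ordering (SCTP-like),
                False for one global order (TCP-like).
    """
    blocked = set()  # orderings that are waiting for a missing message
    delivered = []
    for index, (stream, label) in enumerate(sent):
        ordering = stream if per_stream else "single byte stream"
        if index in lost:
            blocked.add(ordering)
        elif ordering not in blocked:
            delivered.append(label)
    return delivered


sent = [("audio", "a0"), ("video", "v0"), ("audio", "a1"),
        ("video", "v1"), ("audio", "a2")]
lost = {1}  # the first video message is lost

print(deliverable(sent, lost, per_stream=False))
print(deliverable(sent, lost, per_stream=True))
```

With a single global order, the lost video message holds back everything sent after it; with per-stream ordering, only the video stream waits and the audio keeps flowing.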

Further noteworthy features of SCTP are (Stewart, Amer 2004):

Message Orientation: a message is delivered as a whole, like in UDP. This means, if the sender sends a 100 byte message, the receiver will get this 100 byte message in one read, no more, no less. Message boundaries are preserved.
Unordered Delivery: in addition to the in-sequence delivery of messages, SCTP offers unordered delivery for applications in which the order of messages is not important.
Keep-alive function: a “heartbeat” is sent regularly in order to keep an association/connection alive when idling.
Message time-to-live: a message can be tagged with a time-to-live (ttl) value, indicating how long a message is useful.

The packet structure of SCTP is shown in Table 8.

Table 8: SCTP packet structure

An SCTP packet can contain several so-called “chunks”. A chunk has a chunk header and a data body. The chunk header describes the type (e.g., DATA, INIT, ERROR), the flags for special type-dependent properties and the length of the data field. The data field has header fields too, which depend on the selected type. It is evident that SCTP is a more complex protocol than TCP or the simple UDP protocol.
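The chunk structure can be illustrated with a minimal parser. It assumes the layout of the SCTP specification: a 12-byte common header (source port, destination port, verification tag, checksum) followed by chunks that each start with a type byte, a flags byte and a 16-bit length, padded to a multiple of four bytes. The function name, the partial type table and the sample packet are assumptions for illustration, and the checksum is not verified.

```python
import struct

# Partial table of chunk type numbers, enough for this example.
CHUNK_TYPES = {0: "DATA", 1: "INIT", 2: "INIT ACK", 3: "SACK",
               4: "HEARTBEAT", 9: "ERROR"}


def parse_sctp_packet(packet: bytes) -> dict:
    """Split an SCTP packet into its common header and a list of chunks."""
    src, dst, tag, checksum = struct.unpack("!HHII", packet[:12])
    chunks = []
    offset = 12
    while offset + 4 <= len(packet):
        ctype, flags, length = struct.unpack("!BBH", packet[offset:offset + 4])
        if length < 4:
            break  # malformed chunk: the length must include the 4-byte header
        value = packet[offset + 4:offset + length]
        chunks.append((CHUNK_TYPES.get(ctype, "type %d" % ctype), flags, value))
        offset += (length + 3) & ~3  # skip padding up to a 4-byte boundary
    return {"src_port": src, "dst_port": dst,
            "verification_tag": tag, "chunks": chunks}


# Hand-built packet: a HEARTBEAT chunk with 5 bytes of data (3 bytes padding)
# followed by an empty ERROR chunk.
sample = (struct.pack("!HHII", 5000, 5001, 0xDEADBEEF, 0)
          + struct.pack("!BBH", 4, 0, 9) + b"hello" + b"\x00" * 3
          + struct.pack("!BBH", 9, 0, 4))
print(parse_sctp_packet(sample)["chunks"])
```

Because one packet bundles several typed chunks, an observer has to inspect each chunk type separately to see which identifying information it carries.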

Identifiers and their uniqueness

An SCTP packet can contain identifiers in its headers, but this depends on the chunk type used. The INIT chunk, for example, contains all the IP addresses which can be used for the communication. A DATA chunk does not contain any identifying properties, except those included inside the payload.

Personal data

An SCTP packet does not contain any personal information per se. Exceptions are the data carried as payload, which may contain personal data, and the IP addresses which must be provided in order to support multihoming. Multihoming is an SCTP service to increase the reliability of an association. It enables the two participating parties in an association to define multiple IP addresses, between which they can switch if required.

Linkability: identifiability and profiling

The combination of Source Port and Destination Port can identify packets which contain data belonging to the same application-to-application session. If the data part of the packet is not encrypted, identifiable properties can be read out by an eavesdropper. The additional IP addresses which can be specified in the INIT chunk can be used to link them to one entity.

Avoidance or circumvention of information disclosure

The provision of multiple IP addresses is optional. If the additional reliability provided by several IP addresses is not essential, it should be avoided.

TLS, described above for TCP, can be used with SCTP, too. TLS requires a reliable transport protocol, which SCTP is.
