A Layman's Guide to Networking Cryptography
This article explains common network cryptographic concepts in plain English.
- Primary Features of Secure Communications
- Common Authentication Methods
- Common Encryption Methods
- Common Key Exchange Agreements
Primary Features of Secure Communications
What exactly is a "secure" communication method? It is a means of transferring data that provides the following characteristics:
Confidentiality pertains to ensuring only the sender and receiver of a message can view the content of the message.
Integrity means ensuring a protected message has not been altered in transit. In other words, the data a network peer receives is exactly what the sending peer transmitted, with no tampering along the way.
Authentication validates the message originated from the network peer the receiving peer believes was the sender.
Common Authentication Methods
The object of authentication is to confirm the legitimacy of the sender of a data packet. It allows the receiver to verify the sender is who was expected. Authentication is closely tied to integrity, and in some cases authentication methods are able to perform both functions.
The most common methods of authentication in networking are passwords and cryptographic keys. Passwords are not a particularly good method, but they do perform authentication provided the password is unique to a particular transmitting (sending) host. You can think of a password as a shared secret. Passwords authenticate the identity of a user, and they presume only that the user has access to the password (effectively a key in and of itself). Cryptographic keys offer several improvements over passwords. Most importantly, they can be shared securely, and cryptographic methods may be employed in which asymmetric (different) keys are used to encode and decode the same content.
Authenticating with PKI
Public Key Infrastructure (PKI) may be used for both encryption and authentication, and is described in greater detail under the Asymmetric Keys section: Public Key Infrastructure (PKI). For the purpose of authentication, PKI verifies a host is legitimate.
Take a public-facing web server for example. How do you know you are communicating with that web server and not a fake host pretending to be the web server? You know the server's private key is needed to decode messages encrypted with its corresponding public key. Therefore, if the web server is the only entity with its private key, and a network peer sends it a message encrypted with the server's public key, only the target web server will be capable of decrypting the message (with its private key). The web server then sends a return message to your host encrypted with the web server's private key (in effect, a digital signature). That message can only be decrypted with the server's public key - which your host device has - and therefore the decryption will only work properly if the web server is legitimate and used its private key to encrypt the return message. So, how does your device know that process has occurred? The return message from the web server, once decrypted, will match the original message sent by your host. Note this process also works in reverse, where a web server wants to authenticate your host device.
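The exchange described above can be sketched with textbook RSA. This is only a toy: it assumes deliberately tiny primes (61 and 53) and no padding, so it shows the flow of the challenge and nothing more. The function names are hypothetical, and `pow(E, -1, ...)` assumes Python 3.8 or later.

```python
import secrets

# Toy RSA key pair for the "web server". Assumption: tiny textbook primes and
# no padding. Illustration only, never secure.
P, Q = 61, 53
N = P * Q                            # public modulus
E = 17                               # public exponent, known to everyone
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, known only to the server

def client_make_challenge():
    """Client picks a random number and encrypts it with the server's public key."""
    challenge = secrets.randbelow(N - 2) + 2
    return challenge, pow(challenge, E, N)

def server_answer(encrypted_challenge):
    """Server decrypts with its private key, then encodes the result with the
    private key so that anyone holding the public key can read it back."""
    recovered = pow(encrypted_challenge, D, N)
    return pow(recovered, D, N)

def client_verify(challenge, answer):
    """Client decodes the answer with the public key and compares it to what it sent."""
    return pow(answer, E, N) == challenge

challenge, encrypted = client_make_challenge()
print(client_verify(challenge, server_answer(encrypted)))   # True
```

A host that does not hold the private exponent cannot produce an answer that decodes back to the challenge, which is what makes the echo meaningful.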
IPsec Authentication Header (AH)
Authentication Header (AH) is an IPsec security protocol. It signs the entire IPsec data packet with a keyed, one-way hash function. AH is a part of the IPsec protocol/framework and cannot be used outside of an IPsec tunnel.
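As a rough illustration of a keyed, one-way hash, the sketch below uses HMAC-SHA256 from Python's standard library. It does not model AH's actual packet layout or which header fields are covered; the key and packet values are made up for the example.

```python
import hashlib
import hmac

def sign_packet(key: bytes, packet: bytes) -> bytes:
    """Return a keyed one-way hash (HMAC-SHA256) over the packet bytes."""
    return hmac.new(key, packet, hashlib.sha256).digest()

def verify_packet(key: bytes, packet: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign_packet(key, packet), tag)

shared_key = b"example-key-known-to-both-peers"   # hypothetical value
packet = b"payload from peer A"
tag = sign_packet(shared_key, packet)

print(verify_packet(shared_key, packet, tag))                  # True
print(verify_packet(shared_key, b"payload from peer X", tag))  # False
```

Because the tag depends on both the key and the content, a single check covers authentication (only a key holder could have produced it) and integrity (any change to the packet changes the tag).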
IPsec Encapsulating Security Payload (ESP)
Encapsulating Security Payload (ESP) is a security protocol designed primarily to encrypt IPsec packets. However, it is also capable of performing authentication.
Passwords are the oldest form of authentication, and are generally considered the weakest. While technically any sort of shared secret is - functionally - a password, the term password is generally understood as a reference to a plain-text token (normally 16 characters or fewer in length) that a human is capable of memorizing and repeating back to a device in order to confirm the person is the owner of a username. Therefore, passwords are substantially less complex than the other authentication methods mentioned in this article.
Common Encryption Methods
Cryptographic keys may be symmetric (the same) or asymmetric (not the same). An asymmetric key means each peer in a communication has a different key. A symmetric key is an identical key shared by both peers. Asymmetric keys typically involve a public key, which may be known to all parties involved, and a private key, which is known only to one party.
Symmetric keys may be negotiated or they may be known to both peers prior to attempting to establish a connection between the peers.
- Asymmetric Keys
- Symmetric Keys
Asymmetric keys are cryptographic keys that are different from each other. They are used to securely transmit messages between two (2) network peers. A single host creates a set of two (2) different keys (a key pair), normally referred to as the public and private keys. The public key may be shared with anyone. It is used to decrypt content encrypted by the private key, and to encrypt content that only the private key can decrypt. The private key - also known as a secret key - is retained only by the original host that created the key pair.
With asymmetric keys, the opposite key must be used to decrypt a message. The private or secret key decrypts messages from the public or shared key, and the shared key decrypts messages from the private key.
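The "opposite key" rule can be checked by hand with a well-known textbook RSA example. The numbers below are assumptions chosen for readability (primes 61 and 53, public exponent 17); real keys are thousands of bits long and use padding.

```python
# Textbook RSA with tiny numbers. Illustration only.
p, q = 61, 53
n = p * q                            # 3233, part of both keys
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (2753), Python 3.8+

def apply_key(value: int, exponent: int) -> int:
    """Encode or decode a number with one half of the key pair."""
    return pow(value, exponent, n)

message = 65
locked_with_public = apply_key(message, e)
locked_with_private = apply_key(message, d)

print(apply_key(locked_with_public, d) == message)    # True: private undoes public
print(apply_key(locked_with_private, e) == message)   # True: public undoes private
print(apply_key(locked_with_public, e) == message)    # False: a key cannot undo itself
```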
Now, you might be wondering, what is the point in all this? What is the point in having a secret key on one end of the communication? Two reasons. First, obviously it is useful for encrypting messages, but it also makes it possible to authenticate the web server. This is accomplished by your host sending a message to the web server, where it requests the web server repeat what your host sent it. If the web server returns the identical message payload to your host, it validates the fact you are communicating with the web server and not an imposter. This is a form of authentication.
The most common example of asymmetric keys in use is Public Key Infrastructure (PKI), which is frequently used to secure website traffic. In a client/server relationship, PKI keys validate the server end of the connection. PKI is also sometimes used in both directions, where each host (network peer) creates a one-way connection to the other that is secured with PKI. This concept is one technique that allows both hosts to validate one another and validate each message has not been tampered with.
Asymmetric keys are used to secure messages traveling in one direction. Therefore, in a 2-way communication, two (2) sets of asymmetric key pairs are required in order for messages to be encrypted in both directions (between both network peers). For example, a 2-way Peer A to Peer B communication will require two (2) sets of two (2) keys, or four (4) keys in total. For messages from Peer A to Peer B, Peer A uses the encryption key and Peer B uses the decryption key. When Peer B wants to send a return message to Peer A securely, the process is reversed: Peer B to Peer A communication requires a different set of dedicated keys, with the encryption key used by Peer B and the decryption key used by Peer A. By applying two (2) pairs of asymmetric cryptographic keys, messages in both directions are protected.
Public Key Infrastructure (PKI) utilizes asymmetric key pairs (pairs of keys which are different). Better known as public/private key encryption, PKI is behind most consumer-facing IT infrastructure and is the most common form of encryption for SSL and TLS.
X.509 is a standard that defines the creation and use of public-private key certificates (part of PKI).
X.509 vs PKI
The manner in which many articles about X.509 certificates are written can muddy the waters between the two. They are not the same thing. X.509 is a public-key certificate standard, and a type of PKI. However, PKI itself is a method or practice, it is not a protocol.
A properly formed X.509 certificate includes:
- A public key
- An identity (e.g. hostname, person’s name, or organization’s name)
- A signature (self-signed or from a 3rd party certificate authority)
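To make the three parts concrete, here is a minimal sketch of a certificate-like record and its signature check. It is not the X.509 encoding: `ToyCertificate`, `issue`, and `verify` are hypothetical names, and the issuer's key pair is textbook RSA built from two small known primes, which is far too weak for real use.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical issuer key pair: textbook RSA from two Mersenne primes, no padding.
ISSUER_P, ISSUER_Q = 2**61 - 1, 2**89 - 1
ISSUER_N = ISSUER_P * ISSUER_Q
ISSUER_E = 65537                                                     # issuer's public exponent
ISSUER_D = pow(ISSUER_E, -1, (ISSUER_P - 1) * (ISSUER_Q - 1))        # issuer's private exponent

@dataclass(frozen=True)
class ToyCertificate:
    identity: str      # e.g. a hostname
    public_key: int    # the subject's public key (any integer in this sketch)
    signature: int     # issuer's signature over identity + public key

def digest(identity: str, public_key: int) -> int:
    """Hash the signed fields and reduce the result to fit the toy modulus."""
    data = f"{identity}|{public_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ISSUER_N

def issue(identity: str, public_key: int) -> ToyCertificate:
    """Issuer signs the hash of the fields with its private key."""
    return ToyCertificate(identity, public_key, pow(digest(identity, public_key), ISSUER_D, ISSUER_N))

def verify(cert: ToyCertificate) -> bool:
    """Anyone with the issuer's public key can check the signature."""
    return pow(cert.signature, ISSUER_E, ISSUER_N) == digest(cert.identity, cert.public_key)

cert = issue("www.example.com", 12345)
print(verify(cert))                                                        # True
print(verify(ToyCertificate("evil.example.com", 12345, cert.signature)))   # False
```

Reusing a valid signature under a different identity fails the check, which is why the identity and the public key are signed together.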
How does X.509 compare to Public Key Infrastructure (PKI)?
Though it is quite often more strongly associated with encryption (security) certificates, X.509 is actually a standard with an accompanying set of Internet specifications. These are defined by RFC 5280, with updates specified in RFC 6818, RFC 8398, and RFC 8399. An X.509 certificate encodes the server's public key and a signature. The signature is used when the certificate is checked to confirm the certificate is authentic. Additionally, the certificate includes metadata used by the Certificate Authority (CA) to track the certificate and provide guidelines on how the public key can be used.
A Certificate Authority (CA) is a trusted 3rd party that vouches for the authenticity of an X.509 certificate. Both client and server must trust the same authority. In order to ensure that’s the case, the issuing CA is named in each X.509 certificate.
How Certificate Authorities (CAs) Validate X.509 Certificates
Certificate Authorities act as trusted third-party authentication brokers.
To preserve the authenticity of the certificate, the CA generates a hash of the X.509 certificate’s contents (including the public key) and encrypts that hash using the CA’s private key. The result is the certificate's signature. When a peer needs to authenticate the other peer’s certificate, it decrypts the signature with the CA’s public key, which it already trusts, and compares the result to a hash it computes over the certificate itself. If the two values match, the certificate is authentic and unaltered; if not, the test fails.
Symmetric keys are the opposite of asymmetric keys.
Diffie-Hellman (often abbreviated as D-H or DH) is an anonymous key agreement cryptographic algorithm. D-H is referenced frequently throughout this website due to its ubiquity. It is mentioned here (under encryption methods) because it is used to create encryption keys between two communicating parties. What is special about the D-H algorithm is it allows two (2) remote network peers to generate a shared key without securely sharing one another's keying material (also known as a "shared secret"); a problem known colloquially in the cryptography world as the inherent "chicken and the egg problem" with symmetric keys.
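A small worked example may help show how both peers arrive at the same key without ever sending it. The sketch assumes a toy group (modulus 23, generator 5) so the arithmetic can be followed by hand; real deployments use groups of 2048 bits or more, or elliptic curves.

```python
import secrets

# Public parameters both peers agree on in the clear (toy-sized on purpose).
P = 23   # prime modulus
G = 5    # generator

def make_keypair():
    """Pick a secret exponent and compute the public value sent to the peer."""
    private = secrets.randbelow(P - 3) + 2   # secret exponent in [2, P-2]
    public = pow(G, private, P)
    return private, public

def shared_secret(own_private, peer_public):
    """Combine our own secret with the peer's public value."""
    return pow(peer_public, own_private, P)

a_priv, a_pub = make_keypair()   # Peer A
b_priv, b_pub = make_keypair()   # Peer B

# Only a_pub and b_pub cross the network, yet both sides compute the same value.
print(shared_secret(a_priv, b_pub) == shared_secret(b_priv, a_pub))   # True
```

With fixed secrets 6 and 15, the public values are 8 and 19, and both peers compute the shared secret 2. Note that nothing here proves who the other peer is, which is why D-H is described as anonymous.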
Pre-shared keys are exactly what their name sounds like. They are cryptographic keys that have been created and shared between network peers who will use them to authenticate and/or encrypt future secure communications. Pre-shared keys offer the best protection from Man-in-the-Middle (MiTM) attacks, but are challenging to manage properly. The integrity and privacy of the initial key exchange must be assured. If the sanctity of the exchange process cannot be guaranteed, they should not be used.
Pre-shared keys are most likely to be used in a corporate environment, such as on a private LAN where their use and sharing can be controlled. They may also be utilized in limited peer-to-peer exchanges, such as between friends or other circumstances where the sharing process can be dealt with off-line.
In cryptography, a shared secret simply means the same secret key is used by both network peers attempting to establish a secure communication channel. It means the two (2) peers have somehow ensured they are using the same key, and that fact is due to some process whereby both peers either obtain from one another or generate an identical security key via some sort of shared information. The key is not necessarily exchanged or shared online. Rather, it is the key used for secure communications - the result of a process - that is identical. The term can be confusing at times. The point is both peers in a secure network transmission are using identical security keys for encryption and decryption.
Hybrid systems are capable of both authentication and encryption using a single security protocol. A good example is IPsec's ESP. Other hybrid methods utilize more than one authentication method: for example, a PKI key exchange followed by a username/password challenge that a human must complete in order to gain access to a website. Hybrid systems are very common in systems that require human interaction, such as an e-commerce website or remote access to a corporate LAN.
Most hybrid authentication/encryption protocols follow these steps:
- Establish PKI
- Exchange secret key material (shared secret)
- Use secret key in symmetric-key cryptography system (e.g. AES)
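The three steps above can be sketched end to end. Everything here is a stand-in chosen so the example runs with only the standard library: the "PKI" step is a textbook RSA key pair built from two small known primes, and the symmetric cipher is a toy SHA-256 keystream rather than a real cipher such as AES. All names are hypothetical.

```python
import hashlib
import secrets

# Step 1 (stand-in for PKI): the server's key pair. Toy-sized, no padding.
P, Q = 2**61 - 1, 2**89 - 1
SERVER_N, SERVER_E = P * Q, 65537                    # public key
SERVER_D = pow(SERVER_E, -1, (P - 1) * (Q - 1))      # private key

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 based keystream.
    The same call both encrypts and decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Step 2: the client picks a random 128-bit secret and sends it wrapped
# with the server's public key. Only the private key can unwrap it.
client_secret = secrets.token_bytes(16)
wrapped = pow(int.from_bytes(client_secret, "big"), SERVER_E, SERVER_N)
server_secret = pow(wrapped, SERVER_D, SERVER_N).to_bytes(16, "big")

# Step 3: both sides use the same secret for fast symmetric encryption.
ciphertext = keystream_xor(client_secret, b"order #1: two tickets")
print(keystream_xor(server_secret, ciphertext))      # b'order #1: two tickets'
```

The asymmetric step is used once, to move a small secret; the bulk of the traffic is then protected with the cheaper symmetric key.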
Key Exchange Agreements
A Key Exchange Agreement is a process whereby two (2) network peers agree on how they will exchange cryptographic keys. This process may take place over an insecure or secure network connection. If it occurs over an insecure connection, one of the following practices should be followed in order to prevent the possibility of a third party intercepting their key agreement information (e.g. via a Man-in-the-Middle attack vector):
- Using a PKI methodology, both peers create a public and private key combination. Each peer provides its public key to the other peer. The peers then establish secure connections with one another. From there, they negotiate the key agreement to be used.
- A key agreement method such as Diffie-Hellman is used to establish a shared secret, where neither party reveals it during the key negotiation process.
As a last resort, another method is possible. It is generally frowned upon, but it is better than a plain-text key negotiation process: immediately after a secure connection type is agreed to, a secure connection is established and the peers negotiate another key exchange agreement inside the secure connection.
Diffie-Hellman is an anonymous key agreement algorithm used to create encryption keys between two (2) communicating parties.
Diffie-Hellman (D-H) is an algorithm named after its inventors, Whitfield Diffie and Martin Hellman. It allows the generation of symmetric keys without explicitly sharing them (though the key is often referred to as a "shared secret"). D-H does not encrypt data itself; it produces the keys that encryption relies on, and thus it is also mentioned in this article under the encryption section.
IKEv2 is a security protocol that facilitates cryptographic symmetric key exchanges between endpoints.
Some Internet forums and articles describe IKE/IKEv2 as a VPN protocol. This is a misnomer.
Built into IPsec, IKEv2 is defined as a secure key exchange protocol (a type of security protocol). IPsec uses IKEv2 to establish a temporary secure tunnel for the purpose of securely exchanging Security Association parameters and Diffie-Hellman cryptographic session keys between two IPsec endpoints. However, it should really be treated as a security framework. IKEv2 is fundamentally an amalgamation of three (3) other security protocols: ISAKMP, OAKLEY, and SKEME.
IKEv2 is not a tunneling protocol for exchanging data. It is a security protocol that establishes an ephemeral secure tunnel for the sole purpose of securely exchanging encryption keys. IKE/IKEv2 was conceived and designed specifically for implementation with IPsec. It defines how endpoints using IPsec will authenticate one another through Security Associations (SAs), and executes the parameters of those SAs by facilitating security key computations, assignments, exchanges, replacements, and revocations.
IKE's methodologies have been implemented in open source IPsec software (e.g. Openswan, strongSwan).
OpenIKE, OpenIKEv2, Racoon, and Racoon2 are examples of alternative, open source IKE/IKEv2 solutions that may be implemented by other processes to take advantage of IKE/IKEv2's robust key exchange architecture.
Selected characteristics of IKE:
- Manages IPsec Authentication, Keys, and Security Associations (SAs)
- 2 phase operation
- Phase 1: Mutual authentication using pre-shared X.509 keys (encryption and integrity) and setup of Phase 2 communication
- Phase 2: Negotiates methods used to encrypt information from both IPsec endpoints; Security Association (SA) for IPsec tunnel is established
- Supports MOBIKE (Mobility and Multihoming IKE) protocol
- NAT traversal
- SCTP support (TCP only over ports 1021, 1022)
- Uses port 500 (UDP)
IKEv2 is an IPsec protocol that derives its behavior from other, non-IPsec specific protocols: ISAKMP, OAKLEY, and SKEME
MOBIKE (MOBile IKE)
MOBIKE is an IPsec protocol designed for client-side mobile device usage.
MOBIKE stands for IKEv2 Mobility and Multihoming Protocol. Even though it's tightly coupled to IKEv2, it has its own RFC (4555). As its name implies, MOBIKE is an IKE-compliant platform built for mobile IPsec connection use. Why is this important and how is it relevant?
MOBIKE only works in IPsec's tunnel mode.
MOBIKE is highly resilient towards network changes, including source IP address changes. This makes it useful for mobile platforms due to their frequent hand-off of connections from site-to-site (which sometimes causes the client device's IP address to change).
As with IKE/IKEv2, MOBIKE is sometimes discussed online as if it were a VPN technology in and of itself. It is not.
VPN service providers are particularly bad about making such innuendos.
KINK (Kerberized Internet Negotiation of Keys)
Kerberized Internet Negotiation of Keys (KINK) is an IPsec key agreement protocol. It is an alternative to IKE.
KINK is another key agreement negotiation protocol, though it applies to IPsec only. Just like IKE/IKEv2, the KINK protocol sets up Security Associations in IPsec. The difference is KINK eschews X.509 certificates in favor of Kerberos, a network authentication protocol.
Kerberized Internet Negotiation of Keys or KINK is a protocol for managing IPsec Security Associations (SAs). It behaves in a similar way to IKE/IKEv2 in that it requires a two-phase process of authenticating the user. Of the two (2) phases, the second contains the real keys to the kingdom. Just as with IKE/IKEv2, the first phase exists simply to set up the second phase.
Why would anyone want to use KINK instead of IKE? Most IPsec implementations favor IKE. That’s especially true for connections traversing the Internet. Kerberos (and therefore KINK) starts from a position of requiring a password, making it conducive to circumstances where a user must enter a password to kick off the data exchange process for some other reason to begin with. This means if a solution requires a client device and server to communicate without user intervention, you virtually must use IKE. However, even if KINK/Kerberos could be used, there’s no reason you can’t still use IKEv2 and simply send the credentials manually entered by the user after the IKE-driven connection is established.
KINK is suitable in corporate environments or other private networks. It's not a good method for a platform connecting devices across the Internet, such as a VPN.
To understand why an IPsec Security Association (SA) management process might use KINK, one needs to understand how Kerberos works. KINK is an alternative to IKE/IKEv2. Both are protocols that manage IPsec SAs.
At a high level, Kerberos:
- Is an authentication protocol
- Uses symmetric (shared) keys
- Requires passwords, but never transmits them between peers
- Destroys password on local machine after one-time use
- Refers to authentication key exchanges as “tickets”
A thorough explanation and walk-through of Kerberos may be found here.
Kerberos is a network authentication protocol. Its purpose is to allow a client to prove its identity to a server. In a nutshell, Kerberos uses a shared secret model of authentication. A Kerberos client will have a password that has previously been shared with the server. When the client attempts to connect to the server, it identifies itself to the server. The server then creates a challenge phrase and encrypts it using the client’s password, and returns the encrypted challenge phrase to the client device.
Why/When to Use Kerberos
Kerberos is a great tool when it’s prudent to have some sort of constant authentication process, where every message between a client device and a server needs encryption and authentication, but it’s not running over another service that would provide them (e.g. a VPN protocol such as IPsec). It is particularly useful when a user must enter a password and/or complete multi-part authentication, after which communications go back and forth between the client and server.
Let’s say you have an application on a server that needs to verify the client device and user on every message received from the client device. One way to accomplish this is for the client device to store the user’s password during the communications session and send that password with every message to the server. The problem with this approach is the user’s password is being transmitted frequently and it must be stored locally on the client device until the communication session is terminated. This creates two (2) points of exposure; the client device (e.g. its memory) and any device along the path between client and server.
Just looking at the potential for a MitM (Man-in-the-Middle) attack vector, the more times the password is transmitted with the messages between client and server, the more likely an adversary will be able to mathematically deduce the user’s password as it will be the same byte pattern transmitted in every message. If the user’s password is compromised, then obviously an attacker may then be able to impersonate the user.
How Does Kerberos Solve This Problem?
Kerberos never transmits the user’s password, and it is destroyed on the client device (i.e. not stored in memory or elsewhere) after it is used. The password is only needed for a very short time, during the initial authentication process.
First off, the user’s password is never transmitted. Second, when the connection is initiated, the server returns a session key that is encrypted with the user’s password as the server already knows the user’s password ahead of time. The client device utilizes the user password to decrypt the message and obtain a session key. After that, the password may be discarded by the client device as it is no longer required. Now, the session key is used for subsequent communications during the current session between client and server. This often involves repeating the process such that the session key just created acts as the password to create a new session key. It’s possible multiple session keys may be used during the communication session. Regardless, at some pre-determined point each session expires and the negotiation process starts all over again.
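The idea of delivering a session key under the password, without the password itself ever being sent, can be sketched as follows. This is not the Kerberos wire protocol (there is no KDC or ticket here); the password value, the PBKDF2 parameters, and the XOR wrapping are all assumptions made to keep the example short and dependent only on the standard library.

```python
import hashlib
import hmac
import secrets

def password_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from the password (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000, dklen=32)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings (toy wrapping of the session key)."""
    return bytes(x ^ y for x, y in zip(a, b))

# --- Server side: it already knows the user's password ---
salt = secrets.token_bytes(16)
session_key = secrets.token_bytes(32)
wrapped_session_key = xor_bytes(session_key, password_key("correct horse", salt))
# Only `salt` and `wrapped_session_key` cross the network.

# --- Client side: the user types the password once ---
client_key = xor_bytes(wrapped_session_key, password_key("correct horse", salt))
# The password can be discarded at this point; the session key takes over.

# Later messages are authenticated with the session key, not the password.
message = b"GET /account"
client_tag = hmac.new(client_key, message, hashlib.sha256).digest()
server_tag = hmac.new(session_key, message, hashlib.sha256).digest()
print(hmac.compare_digest(client_tag, server_tag))   # True
```

A client that types the wrong password still receives the wrapped value, but unwraps it to a different key, so every later message fails the server's check.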
The bottom line is Kerberos is a great tool when a user must login to an application. It is an excellent tool for something like a smart phone app or website where a user needs to login to use a product or service. Kerberos is more secure than traditional client/server identity credential exchanges because it protects the user’s password, never transmitting it.
Additional protection may be added through methods such as multi-factor authentication using short TTL (Time-To-Live) shared keys. For example, texting a code to a user’s cell phone where the server knows the code and the cell phone is not the same device as the one trying to login to the server, and the code has a very short lifespan. The other device (in this case a cell phone) and the server are the only devices that know the code. This strategy is primarily designed to defeat scenarios where the client device has been compromised with malware and a malicious hacker is aware of the user’s password. Without the secondary key (code texted to the cell phone), the client will fail the login credential requirements.
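A short-lived code of this kind might be handled on the server roughly as below. The 60-second lifespan, the 6-digit format, and the function names are assumptions for illustration; delivery of the code to the phone is outside the sketch.

```python
import hmac
import secrets
import time

CODE_TTL_SECONDS = 60   # assumed lifespan; choose what the deployment needs

def issue_code(now=None):
    """Server creates a random 6-digit code and records when it expires."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"
    return code, now + CODE_TTL_SECONDS

def check_code(submitted, expected, expires_at, now=None):
    """Accept the code only if it has not expired and it matches."""
    now = time.time() if now is None else now
    return now <= expires_at and hmac.compare_digest(submitted, expected)

code, expires_at = issue_code(now=1000.0)
print(check_code(code, code, expires_at, now=1030.0))   # True: correct and in time
print(check_code(code, code, expires_at, now=1061.0))   # False: expired
```

Passing `now` explicitly is only a convenience for demonstrating expiry; in normal use both functions fall back to the system clock.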
Perfect Forward Secrecy
Perfect Forward Secrecy (PFS) refers to a security practice governing cryptographic key exchanges in a manner that compartmentalizes risk.
Perfect Forward Secrecy is a process. A new and unique private encryption key is generated for each session (ephemeral keys).
- Must be supported by both peers of the VPN
- PFS requires more processing power, and takes slightly longer for IKEv2 Phase 1 and 2 to complete
- Known as a session key
- Managed by SKEME
How PFS Works
As its name implies, a session key is a cryptographic key that is created for a particular session. When that session is brought down, the key is destroyed and not used again. The next time a session is initiated, a new and completely different session key is created.
When PFS is turned on, for every negotiation of a new phase 2 SA the two gateways must generate a new set of phase 1 keys. This ensures if the phase 2 SA’s have expired, the keys used for new phase 2 SA’s have not been generated from the current phase 1 keying material.
If PFS is not active, the current keying material already derived from phase 1 will be used again to generate new phase 2 SA’s.
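The difference can be sketched with a simplified key derivation. This is an assumption-heavy illustration, not IKE's actual derivation function: `derive_phase2_key` is a hypothetical name, HMAC-SHA256 stands in for the negotiated PRF, and random bytes stand in for the result of a fresh Diffie-Hellman exchange.

```python
import hashlib
import hmac
import secrets

def derive_phase2_key(phase1_secret: bytes, nonce: bytes, fresh_dh_secret: bytes = b"") -> bytes:
    """Derive keying material for a new phase 2 SA.
    Without PFS, only the phase 1 secret and a nonce feed the derivation.
    With PFS, the result of a brand-new key exchange is mixed in as well."""
    return hmac.new(phase1_secret, fresh_dh_secret + nonce, hashlib.sha256).digest()

phase1_secret = secrets.token_bytes(32)
nonce = secrets.token_bytes(16)          # sent in the clear during negotiation

no_pfs = derive_phase2_key(phase1_secret, nonce)
with_pfs = derive_phase2_key(phase1_secret, nonce, fresh_dh_secret=secrets.token_bytes(32))

# An attacker who later learns phase1_secret (and recorded the nonce) can
# recompute the non-PFS key, but not the key that includes fresh material.
attacker_guess = derive_phase2_key(phase1_secret, nonce)
print(attacker_guess == no_pfs)      # True
print(attacker_guess == with_pfs)    # False
```

This is the compartmentalization PFS is after: compromising one set of keying material does not expose sessions whose keys depended on separate, discarded secrets.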