Internet Privacy

How the NSA Monitors VPNs

Did you know that the NSA (National Security Agency) routinely monitors most internet traffic coming in or out of the United States?1 Were you aware it is capable of monitoring domestic internet traffic as well?2

USA and UK Government VPN Monitoring

Multiple sources indicate that many encryption methods are not secure, either in general or specifically against government intelligence agencies. Of course, the vast majority of people don't care. If you do, be aware it is only a matter of time until large government agencies such as the NSA in the U.S. have the means to decrypt all intercepted internet communications. It may take years, but there will come a time when computing power makes brute-force decryption possible against current ciphers.

In the meantime, it makes sense for these organizations to store anything they cannot currently decrypt. They do not know what nuggets of useful information could be uncovered in the future, once they are capable of decrypting all of it, and storage is cheap.

Naturally, we're talking about an extreme amount of data. Even the NSA has its limits. The NSA likely does not save literally everything (although it could). The Agency is able to collect metadata and analyze some of that traffic, enough to make educated guesses about whether much of the data has any chance of being important.

The point is that the NSA will eventually be capable of cracking all current cipher algorithms in a short period of time. This is precisely why the U.S. government has been vacuuming up virtually all message data on the Internet for years. First, much of it can be decoded right now. Second, storage is cheap. Third, only a few technologies are currently beyond the NSA's reach, and as time progresses there will be fewer and fewer ciphers built on today's methodologies that it cannot readily break.

For a glimpse into how and why Quantum Computing is a game changer in the intelligence community, you may wish to read this blog post. Once the quantum computing barrier-to-entry is crossed and it becomes viable for private companies, the only way to counter near real-time decryption via quantum computers will be to re-invent how encryption algorithms are architected in the first place. They will need to become vastly more complex to provide sufficient resistance to brute-force decryption attacks.

Known Compromised VPN Protocols

PPTP should be avoided, period. It is common knowledge it has been compromised for some time by private malicious actors.

The NSA has been attempting to crack VPNs in earnest since at least early 2006.3 It is known the NSA succeeded in hacking the Tunnel mode of IPsec in 2007, compromising IKEv1 in the process.4 It is unknown whether IKEv2 has been cracked by the NSA. IKEv2 originated in 2005 (RFC 4306, since obsoleted); the current specification is RFC 7296, with updates in RFCs 7427, 7670, and 8247 (the last published in 2017).

I recommend avoiding L2TP, L2TP/IPsec, and SSTP if you are concerned about the NSA or its equivalents monitoring your Internet traffic. Use a reliable open source VPN, such as OpenVPN or StrongSwan, implement Perfect Forward Secrecy, and use key exchange ciphers that are 2048 bits or higher and data encryption keys of 384 bits or higher. If you are only concerned with ordinary bad guys, avoid PPTP and L2TP; the others should be fine. In general, you are better off sticking with open source products as they are much, much less likely to have a backdoor or similar clever subterfuge introduced as they would likely be detected.

Diffie-Hellman (DH) and the Logjam Attack Vector

Diffie-Hellman (DH) is an anonymous key exchange algorithm used to create encryption keys between two communicating parties. It is a staple of numerous cryptographic protocols, including IKE. DH allows the generation of symmetric keys and is designed for short-term shared key creation. Mathematically, DH allows both parties to derive a symmetric secret key without explicitly sharing it. Instead, public values derived from each party's private value are exchanged, and through a series of calculations, both parties arrive at the same result independently. The resulting symmetric key is sometimes referred to as a "shared secret."
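To make the exchange concrete, here is a minimal sketch of a DH key agreement in Python. The prime P, base G, and the helper names are assumptions chosen for illustration; a 64-bit prime is far too small for real use and only keeps the numbers readable.

```python
import secrets

# Toy parameters (assumptions for illustration, NOT secure):
P = 0xFFFFFFFFFFFFFFC5  # 2**64 - 59, a small prime modulus
G = 5                   # public base


def dh_keypair(p, g):
    """Return (private, public) where public = g^private mod p."""
    private = secrets.randbelow(p - 3) + 2  # random value in [2, p-2]
    public = pow(g, private, p)
    return private, public


def dh_shared(their_public, my_private, p):
    """Combine the peer's public value with our private value."""
    return pow(their_public, my_private, p)


# Each party keeps its private value and sends only the public value.
alice_priv, alice_pub = dh_keypair(P, G)
bob_priv, bob_pub = dh_keypair(P, G)

# Both sides arrive at the same shared secret independently.
alice_secret = dh_shared(bob_pub, alice_priv, P)
bob_secret = dh_shared(alice_pub, bob_priv, P)
print(alice_secret == bob_secret)
```

Only the two public values cross the wire. An eavesdropper who wants the shared secret has to recover one of the private values, which is the discrete logarithm problem.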

Diffie-Hellman ciphers are divided into groups, described in several RFCs (RFC 3526, 5114, 5903, and 7296). Each DH group contains one or more algorithms. As of this writing, there are 24 DH groups. The distinguishing characteristics between DH groups are:

  1. Type of algorithm (e.g. modulus MODP or Elliptic Curve)
  2. Number of bits used to calculate the key
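As a rough illustration of those two characteristics, the sketch below lists a subset of the commonly referenced IKE group numbers with their type and size as I understand them from the RFCs named above. The table layout and the helper function are my own; treat it as a convenience sketch, not a complete or authoritative registry.

```python
from typing import NamedTuple


class DHGroup(NamedTuple):
    kind: str   # "MODP" (modular exponentiation) or "ECP" (elliptic curve)
    bits: int   # size of the modulus or curve
    rfc: str    # where the group is described


# Partial list only; group numbers not shown here are omitted on purpose.
DH_GROUPS = {
    1: DHGroup("MODP", 768, "RFC 7296"),
    2: DHGroup("MODP", 1024, "RFC 7296"),
    5: DHGroup("MODP", 1536, "RFC 3526"),
    14: DHGroup("MODP", 2048, "RFC 3526"),
    15: DHGroup("MODP", 3072, "RFC 3526"),
    16: DHGroup("MODP", 4096, "RFC 3526"),
    19: DHGroup("ECP", 256, "RFC 5903"),
    20: DHGroup("ECP", 384, "RFC 5903"),
    21: DHGroup("ECP", 521, "RFC 5903"),
    22: DHGroup("MODP", 1024, "RFC 5114"),
    23: DHGroup("MODP", 2048, "RFC 5114"),
    24: DHGroup("MODP", 2048, "RFC 5114"),
}


def weak_modp_groups(min_bits=2048):
    """Return the MODP group numbers whose modulus is below min_bits."""
    return [
        number
        for number, group in DH_GROUPS.items()
        if group.kind == "MODP" and group.bits < min_bits
    ]


print(sorted(weak_modp_groups()))
```

A check like this is one way to audit a VPN configuration against the 2048-bit minimum recommended earlier in this article.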

Diffie-Hellman key exchanges provide key agreement only; the exchange itself is not authenticated. This makes them vulnerable to sophisticated Man-in-the-Middle (MitM) attacks. Yet prior to 2015, the prevailing view was that if the DH key were robust enough, it was impossible for anyone to crack the key before a later stage of security key generation would make knowledge of the original key obsolete. Unless an attacker could reasonably expect to decipher the first DH key in a key exchange agreement process in real time, DH was considered sufficient and basically unbreakable with modern technology and into the foreseeable future.
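The lack of authentication can be shown with a toy example. In the sketch below, a hypothetical attacker ("Mallory") swaps her own public value into both directions of the exchange. The parameters and names are assumptions for illustration, and the 64-bit prime is not secure.

```python
import secrets

P = 0xFFFFFFFFFFFFFFC5  # toy 64-bit prime (2**64 - 59), NOT secure
G = 5                   # public base (assumption)


def keypair():
    """Return (private, public) for the toy group above."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


alice_priv, alice_pub = keypair()
bob_priv, bob_pub = keypair()
mallory_priv, mallory_pub = keypair()

# Mallory intercepts both public values and forwards her own instead.
alice_secret = pow(mallory_pub, alice_priv, P)  # Alice thinks this is shared with Bob
bob_secret = pow(mallory_pub, bob_priv, P)      # Bob thinks this is shared with Alice

# Mallory can compute both secrets, so she can read and re-encrypt everything.
mallory_with_alice = pow(alice_pub, mallory_priv, P)
mallory_with_bob = pow(bob_pub, mallory_priv, P)

print(alice_secret == mallory_with_alice, bob_secret == mallory_with_bob)
```

Neither peer can detect the substitution from the DH math alone, which is why protocols such as IKE and TLS layer authentication (certificates, pre-shared keys) on top of the exchange.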

That viewpoint changed in 2015 when a group of cryptographic scientists made public a white paper that described exactly how to accomplish this task. They labeled their unique attack vector Logjam.

Logjam

Logjam is a vulnerability in DH keys up to 1024 bits with a very specific - yet surprisingly common - target profile. It is primarily directed at TLS-based websites, though it is conceivable that VPN protocols relying on a DH key exchange, such as IPsec with IKE, are also vulnerable.5

Though PFS is not required by IPsec, the resulting DH symmetric key may be used to facilitate PFS.

Although Logjam is a legitimate threat, its scope is limited. However, when it is enacted it is potentially devastating as it can defeat Perfect Forward Secrecy (PFS). The 2015 white paper that made this vulnerability public notes that nation-state actors in particular are likely to have the capability to implement the attack in near real-time. Notably, this means actors such as the NSA are very likely to be exploiting this attack in the wild. I mention the NSA specifically because it is known to maintain databases of security vulnerabilities, including the ability to analyze Internet connections in real-time and compare IP packet signatures to known vulnerability profiles. That is the type of situation where Logjam is particularly useful and effective, since a database of precomputations dramatically increases the scope of key signatures the attack is capable of defeating. Furthermore, the researchers found many commercial websites recycle the same key between client sessions, at times repeating the same key sequentially between client connections. This makes the prospect of compromising these connections even easier and swifter since all the data for calculations is almost undoubtedly still readily available on the MitM attacker's device.

Logjam is only possible when the following conditions are true:

  • Specific Diffie-Hellman groups in use by the target
  • SSL 3 or TLS 1.0
  • 1024-bit key
  • Server allows downgrading key requirement to 512-bits when export flag is set
  • Precomputations are performed ahead of time that accomplish most of the work

Logjam requires substantial preparation ahead of time. It works by first forcing a downgrade from the 1024-bit encryption level to 512 bits. Diffie-Hellman relies on large prime numbers, and because many servers share the same small set of primes, a database of possible outcomes for those primes can be pre-calculated. When a MitM attacker detects DH usage, it forces a downgrade to the 512-bit cipher to make the task of cracking the key easier. Populating the database takes a considerable period of time, but once that is done the matching and subsequent decrypting is very quick. Naturally, all these details make use of the attack highly unlikely unless the attacker is recording the transmission for decrypting later. If that is the case, with unlimited time, the attacker could likely decrypt the messages over a period of weeks, perhaps less. A nation-state actor would be the ideal attacker profile, with sufficient resources and patience.

The MitM attack is easily defeated provided a second phase of key negotiation is implemented after the initial key exchange. Even if the initial key exchange is compromised, the MitM attacker would need time to calculate the corresponding private keys used by either network peer of the VPN, and then monitor their connection to decrypt the next phase of key exchange before the peers changed to their 2nd phase keys. To ensure adequate protection, Forward Secrecy should be utilized, including applying a short time frame between key re-negotiations. Even if a MitM attacker is able to record all data transmissions (e.g. a nation-state actor), the shorter the reset period of the key exchange, the more difficult it will be for a 3rd party actor to decrypt the underlying communication due to the time required to decrypt each layer.
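A quick back-of-the-envelope calculation shows why a short re-key interval raises the attacker's cost. The sketch assumes each re-key performs a fresh DH exchange (forward secrecy), so every interval is protected by an independent key; the function name and the example durations are mine.

```python
import math


def keys_to_break(session_seconds, rekey_seconds):
    """Number of independent keys protecting a recorded session,
    assuming a fresh DH exchange at every re-key."""
    return math.ceil(session_seconds / rekey_seconds)


eight_hours = 8 * 3600

# One key for the whole session versus hourly or ten-minute re-keying.
print(keys_to_break(eight_hours, eight_hours))  # 1
print(keys_to_break(eight_hours, 3600))         # 8
print(keys_to_break(eight_hours, 600))          # 48
```

Breaking one key exposes only the traffic from its own interval, so the work needed to recover an entire recorded session grows in proportion to the number of re-keys.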

The Diffie-Hellman vulnerabilities described here are realistically only within the realm of nation-state actors due to the resources required to make use of them within a suitable period of time. If Perfect Forward Secrecy is also applied, even a nation-state actor - and even using a known compromised protocol such as IPsec - would have to expend substantial resources to decrypt all associated messages.

Logjam's Attack Vector

How does the implementation of Logjam work? It leverages a Man-in-the-Middle (MitM) style attack like so:

  1. Preparation: Perform precomputations for the two most popular 512-bit primes on the Web, to allow ability to quickly compute discrete log for any key-exchange message that uses one of them
  2. Setup or compromise existing server to create MitM vector
  3. Wait for SSL 3 or TLS 1.0 traffic
  4. MitM server uses a TLS protocol flaw to attempt to downgrade the connection to export-strength and recover the session key
  5. If the server complies and downgrades the connection, we know export-grade Diffie-Hellman is now in use, limiting the key size to 512-bits
  6. Attempt to compromise the DH key using precomputations
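The downgrade in steps 4 and 5 can be modeled with a toy handshake. This is a simplified sketch, not real TLS: the suite names, the function names, and the message handling are assumptions that only capture the idea that the client accepts whatever DH parameters come back because it does not check that they match the suite it requested.

```python
EXPORT_BITS = 512    # export-grade DH size
NORMAL_BITS = 1024   # ordinary DHE size in this toy model


def server_select(offered, supports_export):
    """Toy server: pick a suite and DH size from the offered list."""
    if "DHE_EXPORT" in offered and supports_export:
        return "DHE_EXPORT", EXPORT_BITS
    if "DHE" in offered:
        return "DHE", NORMAL_BITS
    return None


def handshake(client_offer, supports_export, mitm_present):
    """Return what the client ends up with, or None if negotiation fails."""
    offer = list(client_offer)
    if mitm_present:
        offer = ["DHE_EXPORT"]  # attacker rewrites the client's request
    choice = server_select(offer, supports_export)
    if choice is None:
        return None
    suite, bits = choice
    if mitm_present:
        suite = "DHE"  # attacker rewrites the reply to match the original request
    return {"suite_seen_by_client": suite, "dh_bits": bits}


print(handshake(["DHE"], supports_export=True, mitm_present=False))
print(handshake(["DHE"], supports_export=True, mitm_present=True))
print(handshake(["DHE"], supports_export=False, mitm_present=True))
```

In the second call the client believes it negotiated ordinary DHE but is actually using a 512-bit group. In the third call the server has export support disabled, so the downgrade attempt simply fails, which is the basic server-side fix.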

The Logjam attack takes advantage of Web server support for a 1990s-era federal export restriction law. Though no longer required, many web servers continue to implement this feature through legacy code. The process downgrades encryption to a 512-bit maximum in order to comply with the old regulation. The researchers who uncovered the Logjam attack vector figured out the most common primes used in DH 512-bit keys for a particular, common DH group. As of 2015 this tactic worked on ~7.8% of HTTPS servers among Alexa Top Million domains - a significant and quite surprising statistic.5

What is Precomputation?

Precomputation is the act of performing repetitive calculations ahead of time pertaining to solving a computational algorithm. You may think of it as "pre-solving" portions of a puzzle. It is a useful practice when portions of an algorithm depend on solving time-consuming mathematical operations not related to the input of the algorithm. For example, a mathematical operation performed on two (2) prime numbers. Because the prime numbers may be known ahead of time, a calculation using both of them may be precomputed to facilitate faster real-time processing of the actual algorithm. A lookup table is generated with the precomputations. When a known equation is called for in the algorithm, rather than running the computation, the precomputation lookup table is consulted for a match. If known, the result is retrieved from the lookup table and applied to the algorithm. This process can substantially reduce the time required to run the algorithm.
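The idea can be demonstrated with a small discrete logarithm solver. The sketch below uses the baby-step giant-step method, which is NOT the algorithm the Logjam researchers used (they used the number field sieve), but it shows the same pattern: a table that depends only on the prime and base is built once, then reused to recover many different private values quickly. The prime, base, and function names are assumptions, and the code needs Python 3.8 or later for the modular inverse via pow().

```python
import math


def precompute_baby_steps(p, g):
    """Build the lookup table once per (p, g). Independent of any target key."""
    m = math.isqrt(p - 1) + 1
    table = {}
    value = 1
    for j in range(m):
        table.setdefault(value, j)  # maps g^j mod p -> j
        value = (value * g) % p
    giant = pow(g, -m, p)  # g^(-m) mod p
    return m, table, giant


def discrete_log(target, p, precomputed):
    """Find x with g^x = target (mod p) using only table lookups."""
    m, table, giant = precomputed
    gamma = target % p
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = (gamma * giant) % p
    return None


P = 1_000_003  # toy prime, far too small for real use
G = 2
TABLE = precompute_baby_steps(P, G)  # the expensive step, done once

# The same table cracks any number of public values for this prime.
for secret in (12345, 999_999, 7):
    public = pow(G, secret, P)
    recovered = discrete_log(public, P, TABLE)
    print(pow(G, recovered, P) == public)
```

This is why sharing one prime across many servers is dangerous: the attacker pays the precomputation cost once per prime, not once per connection.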

Faults That Help Attackers

A number of common practices among web servers and VPN servers make the job of a hacker easier. In fact, it's shocking how common certain poor practices are in the IT industry. When one begins to realize certain patterns exist across the world, it becomes less surprising how frequently data exposures and successful hacking events occur. No wonder they are commonplace. If a bad actor is open to opportunities based on ease of compromise versus particular targets, their probability of success rises dramatically.

Here is a small sample of common security mistakes in the wild that lead to higher probabilities of attack vector success by malicious actors:

  • Many web servers are programmed to re-use keys between sessions, even between different handshakes (17% of all web servers)5
  • Some servers automatically re-use the same handshake for several hours or longer, exposing other clients to the same attack when vectors are already computed
  • Particularly with public VPN services, PKI public keys are sometimes openly advertised on websites or blast emails, allowing anyone to discover them and perform related precomputations

How Does the NSA Do It?

The NSA deploys a variety of methods to infiltrate encrypted communications and decrypt them. These include:

  • Leveraging common vulnerabilities due to human error, such as those listed above
  • Redirecting legitimate Web traffic and inserting malware into a browser (e.g. SECONDDATE6)
  • Brute force attacks to discover symmetric keys7
  • Scanning and discovery of public keys, which may then be brute-force attacked to uncover the private key
  • Embedding malicious code (aka "backdoors") into device hardware and/or software through agreements with manufacturers (e.g. Cisco PIX firewall product circa pre-2009)
  • Taking advantage of known cryptographic or security protocol weaknesses (e.g. DH and Logjam)

How the NSA Cracks IPsec

Based on leaked documentation, the NSA has been able to compromise IPsec transmissions since at least 2007, though at that time their capability was almost certainly limited to implementations using IKEv1. It is not known whether the NSA has since compromised IPsec + IKEv2. I suspect that if the peers properly implement Perfect Forward Secrecy and choose a key-reset timer that is sufficiently short (e.g. 1 hour or less), breaking IKEv2 is likely still beyond the NSA's scope even today. However, given the NSA's capability for recording and long-term storage, any communications with long-term reconnaissance value will eventually be broken.

As of 2007, the NSA could only decrypt IPsec transmissions that met the following conditions:

  • Tunnel mode only
  • IKEv1
  • PKI key found in agency database (cross-referenced based on IP address and/or signature)
  • Relied on collecting metadata from raw packets and using that metadata to programmatically determine the other data points their system needed to make eavesdropping successful
  • Specialized processing by independent servers/processes of various portions of the packets

At that time (and probably still true now), the NSA used a divide-and-conquer approach to breaking the IPsec/IKEv1 key encryption: it chopped up the IKEv1 key negotiations and IPsec framework into compartmentalized chunks and sent those chunks to highly specialized pattern recognition processes, each focused on a very specific portion of raw data or metadata. As you can see, the NSA also relied upon a form of precomputation by implementing a lookup table of known (cracked) PKI private keys. Note this latter point underscores one of my warnings on VPN service providers in How VPNs Work - Part 3: Encryption and Authentication.

Some Standards Make Hacking Easier

One of the conundrums of the IPsec and IKE security standards is their implementation of Diffie-Hellman groups. Both IPsec and IKE standards call for specific DH groups from which keys may be generated. This is ostensibly by design to ensure sufficiently robust protection and cipher strength. However, this practice has the (unintentional?) side-effect of making a malicious actor's job easier. By virtue of substantially limiting the scope of DH groups available, a shorter analysis process is required to analyze a cipher under attack. Rather than a repertoire of 24 groups, narrowing the possible group ID to just a few means the precomputations an attacker is likely to prepare are not as numerous. Likewise, filtering all the possible matches when implementing an attack vector means an attack can be carried out more rapidly. Even though this may seem an academic argument to some, the fact is there is a real time savings here. When quantum computing becomes a mainstream reality for cracking encryption, it will underscore this fact.

Imagine if IPsec and IKE could use any of the 24 DH groups. Regardless of one's opinion on which group is better than another, an equal distribution of group numbers implemented across the Internet would confound hackers substantially more than the current situation, which makes their task easier. Is this by design? Or an oversight? One needs look no further than the Logjam attack (explained above) to draw support for my conclusions here. Logjam leverages this exact issue. While concentrated on only one (1) DH group, it proves the concept that the fewer the number of DH groups involved in an attack vector, the more plausible the chance of success in defeating the encryption method.
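The argument can be put in rough numbers. The sketch below counts how many per-group precomputations an attacker would need to cover a given fraction of connections, first when usage is concentrated in a few groups and then when it is spread evenly across 24. The usage shares are hypothetical values chosen for illustration, not measurements.

```python
def precomputations_needed(shares, target_coverage):
    """Count the groups (largest share first) needed to reach the target
    fraction of connections. Assumes one precomputation per group."""
    covered = 0.0
    count = 0
    for share in sorted(shares, reverse=True):
        if covered >= target_coverage:
            break
        covered += share
        count += 1
    return count


# Hypothetical: three groups carry nearly all traffic.
concentrated = [0.6, 0.3, 0.1]

# Hypothetical: usage spread evenly over 24 groups.
uniform = [1 / 24] * 24

print(precomputations_needed(concentrated, 0.8))  # 2
print(precomputations_needed(uniform, 0.8))       # 20
```

Under these assumed shares, covering 80 percent of connections takes two precomputations in the concentrated case and twenty in the evenly distributed case.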

Footnotes

1 NSA surveillance covers 75 percent of U.S. Internet traffic: WSJ

2 PRISM (surveillance program)

3 Efforts Against Virtual Private Networks Bear Fruit. 23 March 2006. SIDtoday.

4 Inside the NSA's War on Internet Security. 28 December 2014. Der Spiegel.

5 According to the Logjam study.

6 This article: Biddle, Sam. 19 August 2016. The NSA Leak Is Real, Snowden Documents Confirm explains SECONDDATE is a tool designed to intercept web requests and redirect browsers on target computers to an NSA web server. That server in turn infects the client device with malware. SECONDDATE’s existence was first reported by The Intercept in 2014, as part of a look at a global computer exploitation effort code-named TURBINE. The malware server, known as FOXACID, has also been described in various documents released by former spy Edward Snowden. A previously leaked operating manual for SECONDDATE puts its operational use at no earlier than 2010.

7 Logjam is a vulnerability in DH keys up to 1024-bits, reported by a group of computer scientists in 2015.