Friday, July 10, 2015

Slow Java SSL HttpURLConnection Performance When Using TLS 1.0 with CBC

The fix Oracle implemented in the JVM to combat the BEAST attack can have a significant performance impact when using TLS 1.0 with a CBC cipher suite.  This is particularly noticeable when performing large streaming uploads with HttpURLConnection in fixed-length streaming mode (setFixedLengthStreamingMode), rather than in its default mode, where it buffers the request payload in full.

When writing to HttpURLConnection's OutputStream in fixed-length streaming mode through a BufferedOutputStream with the default 8k buffer [OutputStream out = new BufferedOutputStream(uc.getOutputStream())], you can see a pattern like the one below when running with the system property -Djavax.net.debug=ssl,handshake set.

Java 6 1.6.0_91
%% Cached client session: [Session-1, TLS_RSA_WITH_AES_128_CBC_SHA]
...
main, WRITE: TLSv1 Application Data, length = 32
main, WRITE: TLSv1 Application Data, length = 16416
main, WRITE: TLSv1 Application Data, length = 32
main, WRITE: TLSv1 Application Data, length = 16416
...


Java 7 1.7.0_15
%% Cached client session: [Session-1, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA]
...
main, WRITE: TLSv1 Application Data, length = 32
main, WRITE: TLSv1 Application Data, length = 16416
main, WRITE: TLSv1 Application Data, length = 32
main, WRITE: TLSv1 Application Data, length = 16416
...

With Java 8 and TLS 1.2, the 32 byte packets do not appear in the output …

Java 8 1.8.0_40
%% Cached client session: [Session-1, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA]
...
main, WRITE: TLSv1.2 Application Data, length = 16432

main, WRITE: TLSv1.2 Application Data, length = 16432
main, WRITE: TLSv1.2 Application Data, length = 16432
main, WRITE: TLSv1.2 Application Data, length = 16432
...

If I set the system property "-Djsse.enableCBCProtection=false" with Java 6 (disabling the BEAST attack fix), the 32 byte packets disappear ...

%% Cached client session: [Session-1, TLS_RSA_WITH_AES_128_CBC_SHA]
...
main, WRITE: TLSv1 Application Data, length = 16416

main, WRITE: TLSv1 Application Data, length = 16416
main, WRITE: TLSv1 Application Data, length = 16416
main, WRITE: TLSv1 Application Data, length = 16416
...

As disabling the CBC protection is not viable in production, I looked at what could be done to minimize the occurrence of the 32 byte packets when using TLS 1.0 with CBC.  It turns out that increasing the buffer size of the BufferedOutputStream wrapping HttpURLConnection’s OutputStream from the default 8k to something much larger, e.g. 256k, significantly reduces the number of 32 byte packets, resulting in a significant performance increase.

Java 7 1.7.0_15 with 32k buffer
%% Cached client session: [Session-1, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA]
...
main, WRITE: TLSv1 Application Data, length = 32
main, WRITE: TLSv1 Application Data, length = 16416
main, WRITE: TLSv1 Application Data, length = 16416
main, WRITE: TLSv1 Application Data, length = 32

main, WRITE: TLSv1 Application Data, length = 16416
main, WRITE: TLSv1 Application Data, length = 16416
...
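
The buffer change described above can be sketched roughly as follows. This is a minimal illustration rather than the code used for the measurements in this post: the class and method names (LargeBufferUpload, copyBuffered, upload), the PUT method and the 8k read chunk are all assumptions made for the example.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

/** Sketch: streaming upload through a large BufferedOutputStream (names are hypothetical). */
class LargeBufferUpload {

    // 256k, the size mentioned in the post; tune for your own workload.
    static final int BUFFER_SIZE = 256 * 1024;

    /** Copies 'in' to 'rawOut' through a BufferedOutputStream of the given size; returns bytes copied. */
    static long copyBuffered(InputStream in, OutputStream rawOut, int bufferSize) throws IOException {
        OutputStream out = new BufferedOutputStream(rawOut, bufferSize);
        byte[] chunk = new byte[8192]; // read chunk size, independent of the write buffer
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
            total += n;
        }
        out.flush(); // push the final, partially filled buffer to the connection
        return total;
    }

    /** Assumed usage: upload 'length' bytes from 'body' in fixed-length streaming mode (Java 7+). */
    static int upload(URL url, InputStream body, long length) throws IOException {
        HttpURLConnection uc = (HttpURLConnection) url.openConnection();
        uc.setDoOutput(true);
        uc.setRequestMethod("PUT"); // assumption: the server accepts PUT
        uc.setFixedLengthStreamingMode(length); // the long overload exists since Java 7
        try (OutputStream raw = uc.getOutputStream()) {
            copyBuffered(body, raw, BUFFER_SIZE);
        }
        return uc.getResponseCode();
    }
}
```

The only line that matters for the record pattern is the BufferedOutputStream constructor: the second argument controls how much data reaches the connection's stream per write.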

As expected, however, the larger buffer had minimal (or no) impact with Java 1.8, since it negotiates a TLS 1.2 connection.  Java 1.7 can support TLS 1.2, but by default negotiates TLS 1.0 unless explicitly instructed otherwise:

http://docs.oracle.com/javase/7/docs/technotes/guides/security/SunProviders.html#tlsprotonote

Footnote 1 - Although SunJSSE in the Java SE 7 release supports TLS 1.1 and TLS 1.2, neither version is enabled by default for client connections. Some servers do not implement forward compatibility correctly and refuse to talk to TLS 1.1 or TLS 1.2 clients.
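
One way to instruct a Java 7 client explicitly is to hand the connection a socket factory built from a TLS 1.2 SSLContext. The sketch below uses only standard JSSE calls, but the class and method names (Tls12Client, tls12Context, useTls12) are made up for the example, and it assumes the server actually supports TLS 1.2.

```java
import java.security.GeneralSecurityException;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;

/** Sketch: opting a single HttpsURLConnection in to TLS 1.2 (names are hypothetical). */
class Tls12Client {

    /** Builds an SSLContext for TLS 1.2 using the JRE's default key and trust managers. */
    static SSLContext tls12Context() {
        try {
            SSLContext ctx = SSLContext.getInstance("TLSv1.2");
            ctx.init(null, null, null); // null = default key managers, trust managers, randomness
            return ctx;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("TLSv1.2 is not available in this JRE", e);
        }
    }

    /** Must be called before the connection is opened (before connect() or getOutputStream()). */
    static void useTls12(HttpsURLConnection conn) {
        conn.setSSLSocketFactory(tls12Context().getSocketFactory());
    }
}
```

Setting the system property -Dhttps.protocols=TLSv1.2 is an alternative for HttpsURLConnection that needs no code change, though it applies to the whole JVM rather than to one connection.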

Oracle’s acknowledgement of the BEAST exploit when using TLS 1.0 with CBC (Cipher Block Chaining) is part of CVE-2011-3389:

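CVE#: CVE-2011-3389
Component: Java Runtime Environment
Protocol: SSL/TLS
Sub-component: JSSE
Remote exploit without authentication: Yes
CVSS base score: 4.3
Access vector: Network
Access complexity: Medium
Authentication: None
Confidentiality: Partial
Integrity: None
Availability: None
Supported versions affected: JDK and JRE 7, 6 Update 27 and before, 5.0 Update 31 and before, 1.4.2_33 and before; JRockit R28.1.4 and before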

This is a vulnerability in the SSLv3/TLS 1.0 protocol. Exploitation of this vulnerability requires a man-in-the-middle and the attacker needs to be able to inject chosen plaintext.

The links below describe the attack:
https://blog.torproject.org/blog/tor-and-beast-ssl-attack
http://blogs.cisco.com/security/beat-the-beast-with-tls

To combat the exploit, Oracle's fix splits each write() to the underlying OutputStream into at least two separate TLS records, the first carrying a single byte of plaintext, so that the bulk of the data is encrypted with an initialization vector the attacker cannot predict.  TLS itself caps the maximum record size at 16384 bytes (this is the size of the raw unencrypted data): http://blog.fourthbit.com/2014/12/23/traffic-analysis-of-an-ssl-slash-tls-session
So with the fix above, a single write of 16k of client data to the underlying OutputStream results in two TLS records: one containing the first byte encrypted, and a second containing the remaining 16383 bytes encrypted. A write of 32k at a time results in three TLS records: one containing the first byte, the second containing the next 16384 bytes, and the third containing the remaining 16383 bytes. The one byte records are the 32 byte entries in the debug output above: with an AES-128-CBC/SHA-1 suite, one byte of data plus a 20 byte MAC is padded up to two 16 byte blocks. So when using TLS 1.0 with CBC, the bigger the buffer associated with each write, the fewer one byte TLS records you are going to see.
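
The splitting described above can be modelled in a few lines. This is a simplified sketch of the behaviour as described in this post, not the JSSE source; the class and method names (RecordSplit, recordSizes) are invented for the example, and it only tracks plaintext sizes, ignoring MAC and padding.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: plaintext sizes of the TLS records produced by one write (names are hypothetical). */
class RecordSplit {

    // Maximum plaintext bytes in a single TLS record (2^14).
    static final int MAX_PLAINTEXT = 16384;

    /**
     * Models the split described in the post: the first record carries one byte,
     * the remaining bytes are cut into records of at most MAX_PLAINTEXT bytes.
     */
    static List<Integer> recordSizes(int writeLen) {
        List<Integer> sizes = new ArrayList<Integer>();
        if (writeLen <= 0) {
            return sizes; // nothing to send
        }
        sizes.add(1); // the one byte record added by the BEAST countermeasure
        int remaining = writeLen - 1;
        while (remaining > 0) {
            int n = Math.min(remaining, MAX_PLAINTEXT);
            sizes.add(n);
            remaining -= n;
        }
        return sizes;
    }
}
```

For example, recordSizes(16384) yields [1, 16383] and recordSizes(32768) yields [1, 16384, 16383], matching the two cases worked through in the text.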

To give you an idea of the effect that buffer size has with TLS 1.0 and CBC when the JVM has the fix for BEAST applied:
Assuming a file size of 31527359 bytes (~30 megabytes):
with 16k buffer: 16384 = 1 + 16383; 31527359 / 16384 = ~1924; so 1924 one byte ssl records and 1924 x 16383 byte ssl records
with 32k buffer: 32768 = 1 + 16384 + 16383; 31527359 / 32768 = ~962; so 962 one byte ssl records, 962 x 16384 byte records, and 962 x 16383 byte records
with 64k buffer: 65536 = 1 + 3 x 16384 + 16383; 31527359 / 65536 = ~481; so 481 one byte ssl records, 3 x 481 = 1443 x 16384 byte records, and 481 x 16383 byte records
with 256k buffer: 262144 = 1 + 15 x 16384 + 16383; 31527359 / 262144 = ~120; so 120 one byte ssl records, 15 x 120 = 1800 x 16384 byte records, and 120 x 16383 byte records
...
To summarize, for the ~30 megabyte file, the buffer sizes and the resulting numbers of one byte ssl records are:
16k: 1924 one byte ssl records
32k: 962 one byte ssl records
64k: 481 one byte ssl records
256k: 120 one byte ssl records
512k: 60 one byte ssl records
1024k: 30 one byte ssl records
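
The arithmetic above can be reproduced with a small helper. The sketch assumes that every write to the underlying stream is one full buffer (plus a final partial one) and that each write produces exactly one one byte record; the class and method names (OneByteRecordEstimate, oneByteRecords) are hypothetical.

```java
/** Sketch: estimated one byte TLS records for a file uploaded with a given buffer size. */
class OneByteRecordEstimate {

    /** One one byte record per write; the number of writes is fileSize / bufferSize, rounded up. */
    static long oneByteRecords(long fileSize, int bufferSize) {
        if (fileSize < 0 || bufferSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and bufferSize > 0");
        }
        return (fileSize + bufferSize - 1) / bufferSize; // ceiling division
    }

    public static void main(String[] args) {
        long fileSize = 31527359L; // the example file from the post
        int[] bufferSizesKb = {16, 32, 64, 256, 512, 1024};
        for (int kb : bufferSizesKb) {
            System.out.println(kb + "k: " + oneByteRecords(fileSize, kb * 1024) + " one byte ssl records");
        }
    }
}
```

Because the helper rounds up to count the final partial write, it reports 1925, 963, 482, 121, 61 and 31, one more than each of the rounded-down figures in the list above.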

Each SSL record carries a non-trivial processing cost: on the client to encrypt and hash it, on the network in TCP terms, and on the server to validate and decrypt the payload.
So ideally, going forward, Java 1.8 using TLS 1.2 is what you want to strive for.  If you are stuck with TLS 1.0, a large buffer should help noticeably with performance.