In January, I asked whether TLS channel binding with strong authentication was the solution to defend against MITM or proxy style phishing attacks. The answer was “yes, but also no”. I will look beyond SCRAM soon, but first I want to fulfil a promise to go into more detail about how SCRAM works, especially with channel binding.

Previously, I lamented how both descriptions and demonstrations of SCRAM were somewhat thin on the ground, especially where channel binding is concerned. However, it would have been too heavy a topic to cover in the previous post. Instead, this post is dedicated to that topic.

While working on this, a subject that admittedly I wasn’t particularly familiar with until a couple of months ago, I’ve also unearthed some issues with the process. Some of these are covered online already, but esoteric as the subject is, I think there’s value in collecting those issues here, as well as adding my own thoughts.

It’s entirely possible that some of these observations are faults with me rather than faults with the protocol, so while I’ve made a best effort to understand the RFCs and implementation sources, I will gladly receive any corrections from those better versed in the matter.

## A SCRAM refresh

SCRAM, or Salted Challenge Response Authentication Mechanism, is a way for a client and server to mutually prove that they both possess knowledge of the password belonging to an identity, without having to exchange the password itself. There are some benefits to this:

• Even over an unencrypted channel, the password is never revealed
• Not only must the client provide a valid password, but the server must prove to the client that it knows it, too
• Multi-round hashing is used to increase brute-force and dictionary attack complexity
• The hashing algorithm used is extensible, so supporting stronger future hash types is straightforward

In addition to the above, SCRAM can be combined with TLS channel binding, yielding the following additional benefit:

• The client can prove to the server that the TLS channel it is using is uninterrupted (i.e. not subject to a MITM proxy)

While this doesn’t fix all phishing issues, it’s a step in the right direction.

### Demonstration code, a WIP

I’ve put together some demonstration code that shows both the server and client steps needed for an imagined TLS channel bound SCRAM-SHA-256-PLUS authentication exchange. At the time of writing it only reflects TLSv1.2, but I do intend to create a version that depicts the proposed channel binding approach for SCRAM over TLSv1.3 as well.

# SCRAM in one big diagram, step by step

What follows is a diagram that I created to try to depict the steps performed in SCRAM, complete with channel binding. It’s based primarily on the steps documented in RFC 5802, but presented in a more visually traceable manner.

## Legend

Data elements are rectangles, processes or operations are rectangles with double vertical borders, outcomes are rounded rectangles, persistent storage is shown as cylinders and decisions are diamonds.

%% Created by Steve Kerrison CC BY-SA 4.0 graph TD; subgraph "Legend" subgraph "Client" Decision{Decision} Outcome1(Outcome 1) Outcome2(Outcome 2) Data[Data] Process[[Process]] Decision -->|Option 1| Outcome1 Decision -->|Option 2| Outcome2 Process --> Data Data --> Decision end subgraph Server D1[More data] P2[[Another process]] DB[(Persistent data)] P3[[Another process]] end Process -->|Message contents| P2 D1 --> P2 P2 -->|Data treatment| DB DB -->|Data selection| P3 end

Now, let’s do some SCRAMming.

Before an authentication is performed, the client and server need to agree on a password. We assume that the server decides the salt and iteration count and shares them with the client, as this is what will happen later during authentication. With that information, both client and server can generate the keys or hashes that they can store. These keys, and the operations performed with them, will later prove prior knowledge of the password without needing direct access to it. Of course, these keys can be re-generated from the original password, salt and iteration count if needed.

The password, salt and iteration count are used with a PBKDF2 hashing process to produce the “salted password”. Hash-based Message Authentication Codes (HMAC), keyed with the salted password, are then computed over the constant strings “Client Key” and “Server Key” to derive the client and server keys. Additionally, the client key is hashed, producing a “stored key”.

We then have the following data objects that can be stored by client and server:

• Server
  • Server key
  • Stored key
• Client
  • Server key
  • Client key
Note that the salted password doesn’t need to be stored if the client instead stored both client and server keys; those are everything it needs to complete a SCRAM handshake later. For the rest of these example steps we assume that we keep only the keys that we need, and avoid touching the original password or salted password again.
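As a rough illustration of the derivation just described, the following Python sketch uses only the standard library. It assumes SCRAM-SHA-256, skips the SASLprep password normalisation that RFC 5802 calls for, and the function name `derive_scram_keys` is my own invention rather than anything from the RFC.

```python
import hashlib
import hmac


def derive_scram_keys(password: str, salt: bytes, iterations: int) -> dict:
    """Derive the SCRAM-SHA-256 key set (hypothetical helper, no SASLprep)."""
    # PBKDF2 with HMAC-SHA-256 produces the "salted password".
    salted_password = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    # The salted password is the HMAC key; the constant strings are the data.
    client_key = hmac.new(salted_password, b"Client Key", hashlib.sha256).digest()
    server_key = hmac.new(salted_password, b"Server Key", hashlib.sha256).digest()
    # The stored key is a plain hash of the client key.
    stored_key = hashlib.sha256(client_key).digest()
    return {
        "client_key": client_key,  # kept by the client only
        "server_key": server_key,  # kept by both sides
        "stored_key": stored_key,  # kept by the server only
    }
```

The server-side record would hold `server_key` and `stored_key` alongside the salt and iteration count, while the client can hold `client_key` and `server_key`.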

%% Created by Steve Kerrison CC BY-SA 4.0 graph TD; subgraph Server S_Salt[Salt] S_Iter[Iterations] S_UserDB[("Server-side
credentials")] end C_Password --> C_PKBF2 S_Salt --> C_PKBF2 S_Iter --> C_PKBF2 C_PKBF2 --> C_SaltPass C_SaltPass --> C_HMACClientKey C_SaltPass --> C_HMACServerKey C_HMACClientKey --> C_HashCK C_HashCK --> S_UserDB C_HMACServerKey --> S_UserDB C_User ------> S_UserDB C_User --> C_UserDB S_Salt --> C_UserDB S_Iter --> C_UserDB S_Salt --> S_UserDB S_Iter --> S_UserDB C_HMACClientKey ---> C_UserDB C_HMACServerKey ---> C_UserDB

RFC 5802 doesn’t specify how these keys are agreed. For example, does the client compute and send the server key and stored key to the server? Or do they both take the salted password and use it to derive the keys they need before discarding it? An untrustworthy server could ask the client to do a very low number of hash iterations, allowing the server to receive a salted hash that was easier to brute force. However, regardless of exactly what is done, the client should be mindful of the salt and iteration strength, defending against short salts and low iterations.

In the above example, the key calculations are left to the client, which then sends the server the keys that it needs to store. That transfer must take place over a secure channel.

## 1. First message from client

Now we can do some actual authentication. The first message sent is from the client to server, which in our example will request that channel binding be enabled, using the tls-unique method. The message will also contain the username and a client-generated nonce. The channel binding token exists at this point, too, because communication is happening over a TLS session. However, we don’t use it yet.

%% Created by Steve Kerrison, CC BY-SA 4.0 graph TD; subgraph Server S_RecvFirstMsg[[Receive client first message]] end subgraph Client C_UserDB[("Client-side
credentials")] C_GenNonce[[Generate client nonce]] C_CBOpt[[Channel binding options]] C_SendFirstMsg[[Send client first message]] end C_UserDB -->|Username| C_SendFirstMsg C_GenNonce -->|Nonce| C_SendFirstMsg C_CBOpt -->|tls-unique| C_SendFirstMsg C_SendFirstMsg -->|p=tls-unique,,n=$USER,r=$CLIENT_NONCE| S_RecvFirstMsg
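A minimal sketch of how the client's first message could be assembled follows. The helper names are hypothetical, the nonce length of 18 random bytes is an arbitrary choice of mine, and the username escaping follows the `=3D` / `=2C` rule from RFC 5802.

```python
import base64
import secrets


def escape_username(name: str) -> str:
    """Escape '=' and ',' in a username; '=' must be replaced first."""
    return name.replace("=", "=3D").replace(",", "=2C")


def client_first_message(username: str, cb_type: str = "tls-unique"):
    """Return (gs2_header, client_first_bare, client_nonce)."""
    # Base64 output never contains a comma, so it is safe as a nonce.
    client_nonce = base64.b64encode(secrets.token_bytes(18)).decode("ascii")
    # "p=<type>,," requests channel binding, with no authorisation identity.
    gs2_header = f"p={cb_type},,"
    client_first_bare = f"n={escape_username(username)},r={client_nonce}"
    return gs2_header, client_first_bare, client_nonce
```

The header and the bare message are returned separately because the bare part is reused later, when the authentication message is built. What goes on the wire is simply `gs2_header + client_first_bare`.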

## 2. First message from server

The server now knows the requested username, so can look up the salt and iteration count that were used for that user. It will send those, and extend the client nonce with additional server-generated nonce material. If the client is in possession of the original password, it can use the salt and iteration count to re-compute the PBKDF2 salted password and other required keys. It would also be possible to check that the server is still expecting the same salt and iteration count as first agreed and use the previously retained keys instead.

%% Created by Steve Kerrison, CC BY-SA 4.0 graph TD; subgraph Server S_GenNonce[[Generate server nonce]] S_UserDB[("Server-side
credentials")] S_SendFirstMsg[[Send server first message]] S_ClientFirstMsg[Client's first message] end subgraph Client C_RecvFirstMsg[[Receive server first message]] end S_ClientFirstMsg --> |Client_Nonce| S_SendFirstMsg S_UserDB -->|Iterations, Salt| S_SendFirstMsg S_GenNonce -->|Nonce| S_SendFirstMsg S_SendFirstMsg -->|r=$CLIENT_NONCE$SERVER_NONCE,s=$SALT,i=$ITERATIONS| C_RecvFirstMsg
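The server's side of this step might look something like the sketch below. It assumes the salt and iteration count have already been looked up for the user, and the function name and the naive attribute parsing are illustrative only.

```python
import base64
import secrets


def server_first_message(client_first_bare: str, salt: bytes, iterations: int) -> str:
    """Build the server's first message from the client's bare first message."""
    # Naive parse of "n=...,r=..." into a dict; real parsers should be stricter.
    attrs = dict(item.split("=", 1) for item in client_first_bare.split(","))
    client_nonce = attrs["r"]
    # The server appends its own random material to the client's nonce.
    server_nonce = base64.b64encode(secrets.token_bytes(18)).decode("ascii")
    salt_b64 = base64.b64encode(salt).decode("ascii")
    return f"r={client_nonce}{server_nonce},s={salt_b64},i={iterations}"
```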

## 3. Auth message generation

The client now has enough data to produce the authentication message. This is the object that both client and server will operate on to prove knowledge of the password. The authentication message is the concatenation of the first messages from the client and server, along with the channel binding token and a final copy of the combined client/server nonce. A SCRAM authentication without channel binding would, of course, not include the CBT. The client’s proof will be appended to the client’s final message in the next step, but the proof itself is not part of the authentication message.

%% Created by Steve Kerrison, CC BY-SA 4.0 graph TD; subgraph Client C_ClientFirstMsg[Client's first message] C_ServerFirstMsg[Server's first message] C_TLS[TLS session data] C_ClientFinalMsg[Client's final message no-proof] C_AuthMsg[Authentication message] end C_ClientFirstMsg ---> C_AuthMsg C_ServerFirstMsg ---> C_AuthMsg C_TLS -->|CBT| C_ClientFinalMsg C_ServerFirstMsg -->|Full nonce| C_ClientFinalMsg C_ClientFinalMsg --> C_AuthMsg

So far the only data contained within exchanged messages and the authentication message is:

• Nonce (in parts and repeated)
• Iteration count
• Salt
• Channel binding options
• Channel binding token value

The accumulated nonce data will ensure that the proofs provided are freshly computed and the CBT will ensure that the client is using the expected communication channel.
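Here is a sketch of how the authentication message could be put together. Per RFC 5802, the `c=` attribute carries the base64 of the GS2 header followed by the raw channel binding token, and the client's first message is included without that header. The function name and parsing are my own assumptions.

```python
import base64


def build_auth_message(gs2_header: str, client_first_bare: str,
                       server_first: str, cbt: bytes):
    """Return (client_final_without_proof, auth_message)."""
    # Pull the combined client/server nonce out of the server's first message.
    attrs = dict(item.split("=", 1) for item in server_first.split(","))
    full_nonce = attrs["r"]
    # Channel binding input: GS2 header bytes followed by the raw CBT.
    cbind_input = gs2_header.encode("ascii") + cbt
    c_attr = base64.b64encode(cbind_input).decode("ascii")
    client_final_without_proof = f"c={c_attr},r={full_nonce}"
    # The three parts are joined with commas.
    auth_message = ",".join(
        [client_first_bare, server_first, client_final_without_proof]
    )
    return client_final_without_proof, auth_message
```

On a real connection, the `cbt` bytes for tls-unique could come from something like Python's `ssl.SSLSocket.get_channel_binding("tls-unique")`; here it is just a parameter.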

## 4. Proof generation

The client must now prove that it has prior knowledge of the password. To do this, it reproduces the stored key from the client key, then performs an HMAC with the stored key against the authentication message. This produces the client signature. The signature is transformed into a client proof by XOR’ing the signature with the client key. Note that the server is not in possession of the client key, so we’ll see later how the server actually verifies the message.

The client now sends its final message to the server. This is the last part of the authentication message (the channel binding data and the full nonce) with the proof appended.
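The proof calculation can be sketched in a few lines of Python. This assumes SHA-256 and that the client kept its client key from the setup phase. The function names are hypothetical.

```python
import base64
import hashlib
import hmac


def client_proof(client_key: bytes, auth_message: str) -> bytes:
    """Compute the client proof: ClientKey XOR HMAC(StoredKey, AuthMessage)."""
    # Reproduce the stored key by hashing the client key.
    stored_key = hashlib.sha256(client_key).digest()
    # The client signature is an HMAC over the authentication message.
    client_signature = hmac.new(
        stored_key, auth_message.encode("utf-8"), hashlib.sha256
    ).digest()
    # XOR the signature with the client key, byte by byte.
    return bytes(k ^ s for k, s in zip(client_key, client_signature))


def client_final_message(client_final_without_proof: str, proof: bytes) -> str:
    """Append the base64-encoded proof as the p= attribute."""
    proof_b64 = base64.b64encode(proof).decode("ascii")
    return f"{client_final_without_proof},p={proof_b64}"
```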

%% Created by Steve Kerrison, CC BY-SA 4.0 graph TD; subgraph Client C_ClientFinalMsg[Client's final message no-proof] C_AuthMsg[Authentication message] C_UserDB[("Client-side

## 7. Client verification of server

The client is in possession of the server key as well, so this verification step is relatively simple: use the authentication message and server key to reproduce the server signature, and compare it to the one that the server sent. If they match, authentication is complete.

%% Created by Steve Kerrison, CC BY-SA 4.0 graph TD; subgraph Client C_ServerSig[Server signature] C_CmpServerSig{Compare signature} C_UserDB[("Client-side
credentials")] C_AuthMsg[Authentication message] C_HMACServerVerif[[HMAC]] C_ServerSigOK([Pass]) C_ServerSigFail([Fail]) end C_AuthMsg --> C_HMACServerVerif C_UserDB -->|Server key| C_HMACServerVerif C_ServerSig ---> C_CmpServerSig C_HMACServerVerif --> C_CmpServerSig C_CmpServerSig -->|Match| C_ServerSigOK C_CmpServerSig -->|Different| C_ServerSigFail C_ServerSigOK --> AOK([Authentication complete!])
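The client's check of the server signature could be sketched as below, again assuming SHA-256 and a hypothetical function name. I use `hmac.compare_digest` for the comparison so that it runs in constant time, which is my own choice rather than something the RFC mandates.

```python
import hashlib
import hmac


def verify_server_signature(server_key: bytes, auth_message: str,
                            received_signature: bytes) -> bool:
    """Recompute the server signature and compare it with the one received."""
    expected = hmac.new(
        server_key, auth_message.encode("utf-8"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, received_signature)
```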

For an extra treat, see this article’s appendix for an attempt at combining all steps into one.

## Message sequence

Hopefully, you spotted the messages exchanged between client and server? For completeness, here they are combined into a sequence diagram. This is a single example, with some variables used for easier reading. Other forms of SCRAM authentication will sometimes have other fields and options present and, of course, there are potential error responses as well, which are not shown here.

%% Created by Steve Kerrison, CC BY-SA 4.0 sequenceDiagram; participant C as Client participant S as Server C ->> S: Client first message:
p=tls-unique,,n=$USER,r=$CLIENT_NONCE S ->> C: Server first message:
r=$CLIENT_NONCE$SERVER_NONCE,s=$SALT,i=$ITERATIONS C ->> S: Client final message:
c=$CHANNEL_BINDING,r=$NONCE,p=$CLIENT_PROOF S ->> C: Server final message: v=$SERVER_SIGNATURE

What I find particularly interesting is that the client part of the nonce is exchanged three times, and the server part twice, by virtue of how it’s assembled and then ping-ponged between the two participants. The presence of the channel binding token in the sent message is interesting too - more on that in the section on issues, below.

# Issues

Having covered the mechanics of SCRAM with channel binding, I now want to look at the issues that it has. There are three main areas of concern:

• SLOTH and tls-unique, where truncated hashes could lead to collisions
• TLSv1.3 drops tls-unique
• SCRAM protocol breaks two-way bindings, I think.

## SLOTH

SLOTH stands for Security Losses from Obsolete and Truncated Transcript Hashes. The premise is that tls-unique uses truncated material as its channel binding token, and that known or potential weaknesses in the hashing algorithms can be used more easily against systems that have such truncated material in them.

This type of attack can be launched in exactly the kind of scenario where we’d want to use SCRAM with channel binding - where a MITM may be present. As described in the miTLS article on SLOTH, the MITM has knowledge of (but not control over) the master secret of both TLS channels, along with sufficient control of the transcript of TLS messages such that it can generate a colliding channel binding token across the two channels. With tls-unique, the HMAC output that forms the CBT is 96 bits (12 bytes) long, so only 2^48 HMAC attempts are needed to find a collision that breaks the binding.
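To make the arithmetic behind that figure explicit: by the birthday bound, a collision among $n$-bit values is expected after roughly $2^{n/2}$ attempts, so for the 96-bit tls-unique token:

$$
2^{96/2} = 2^{48} \approx 2.8 \times 10^{14}
$$

By the same rough estimate, an untruncated 256-bit output would put the bound at $2^{128}$ attempts.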

This attack is feasible even if SHA-256 is used for the HMAC, as would be the case with SCRAM-SHA-256-PLUS, thanks to the truncation it’s subjected to in tls-unique. SLOTH also addresses the use of weaker algorithms, such as MD5 and SHA-1, which can be vulnerable regardless of whether their output is truncated. This has some implications for SCRAM-SHA-1 and of course SCRAM-SHA-1-PLUS, but I suggest that SHA-256 should be the minimum acceptable mode for most authentication systems at the time of writing (although I doubt it is).

## TLSv1.3 and channel binding

I discussed this issue in a previous article, so won’t get into much detail here. For completeness, however, it must be noted that TLSv1.3 does not include tls-unique. This is mostly driven by the SLOTH issue described above. Unfortunately, the replacement for tls-unique, dubbed tls-exporter, is still under review.

With tls-exporter, Exported Key Material (EKM) is derived from the TLS master secret using an HMAC-based Key Derivation Function (HKDF), using the label EXPORTER-Channel-Binding as input to the HKDF alongside the master secret. The output is 32 bytes in length, so maintains the same strength as the master secret, unlike the truncated 12 bytes of the tls-unique output.

If published as a proposed standard, this would overcome SLOTH type issues, but would require TLS libraries to support exporting this data for use by applications for channel binding.

## Mutual channel binding verification

One potential deficiency I identified in step 5 is that the client has to trust that the server verified the channel binding. SCRAM avoids a situation where the server simply accepts any provided proof, because it must also produce verification to the client that it shares some knowledge of the password. However, SCRAM-PLUS doesn’t give the same mutual guarantees for the channel binding as it does for the password.

The server may simply accept a SCRAM-PLUS authentication without checking the channel binding token. The server would be at fault for deficient binding checks and the client has no way of knowing this.

### A fix?

I don’t think it’s worth dwelling on TLSv1.2 and the tls-unique binding method. Instead, I’d like to turn my attention to TLSv1.3 and the proposed tls-exporter method. The current proposal is to export a token by using an HMAC-based Key Derivation Function (HKDF) against the TLS session’s unique master secret with the label EXPORTER-Channel-Binding. We can go further than that.

I would like to see client and server derive and share their own Exported Key Material (EKM) by using unique labels, for example EXPORTER-Channel-Binding-Client and EXPORTER-Channel-Binding-Server. The client and server can share the EKM during SCRAM and include them in the authentication message, similar to the combined client/server nonce. The client can reproduce the server’s EKM and check it against the authentication message contents, with the server reproducing the client’s EKM and then also checking. This proves that both are operating with the same TLS unique master secret, with two-way verification of this fact.

This closes the loop in terms of verification by both parties.
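To illustrate the idea only, here is a toy sketch. It uses a plain HMAC as a stand-in for the TLS exporter, which is emphatically not the real TLSv1.3 derivation, and both the function names and the `session_secret` parameter are hypothetical. The two labels are the ones suggested above.

```python
import hashlib
import hmac

CLIENT_LABEL = b"EXPORTER-Channel-Binding-Client"
SERVER_LABEL = b"EXPORTER-Channel-Binding-Server"


def toy_exporter(session_secret: bytes, label: bytes) -> bytes:
    """Stand-in for a TLS exporter: NOT the real HKDF-based derivation."""
    return hmac.new(session_secret, label, hashlib.sha256).digest()


def peer_binding_ok(session_secret: bytes, peer_label: bytes,
                    peer_ekm: bytes) -> bool:
    """Reproduce the peer's EKM from our own session and compare."""
    expected = toy_exporter(session_secret, peer_label)
    return hmac.compare_digest(expected, peer_ekm)
```

Each side would call `peer_binding_ok` with the other side's label. If a proxy terminates TLS in the middle, client and server hold different session secrets, so neither check can pass.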

### Alternative

Another way of resolving the issue, which would work for tls-exporter without generating any extra material, would be to remove the binding data from the client’s final message, but retain it in the authentication message. With a bound channel, both sides have access to the channel binding data, so can construct equivalent authentication messages for their proof/signature generation, but the client doesn’t give the server an advantage by revealing the binding data.
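A small sketch of this variant follows. It is not part of RFC 5802 or any published SCRAM mechanism. Each party builds the authentication message from its own local view of the channel binding data, so the signatures only agree when both ends see the same TLS channel. All names here are hypothetical.

```python
import base64
import hashlib
import hmac


def local_auth_message(client_first_bare: str, server_first: str,
                       full_nonce: str, gs2_header: str,
                       local_cbt: bytes) -> str:
    """Build the auth message using binding data taken from the local TLS session."""
    cbind_input = gs2_header.encode("ascii") + local_cbt
    c_attr = base64.b64encode(cbind_input).decode("ascii")
    # c= is included here but, in this variant, never sent on the wire.
    return f"{client_first_bare},{server_first},c={c_attr},r={full_nonce}"


def sign(key: bytes, auth_message: str) -> bytes:
    """HMAC-SHA-256 over the auth message, as used for proofs and signatures."""
    return hmac.new(key, auth_message.encode("utf-8"), hashlib.sha256).digest()
```

A server that sits behind a MITM proxy would compute its signature over a different authentication message, and the client's comparison would then fail.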

# Closing thoughts

In this post I’ve focused on three main issues:

1. The steps involved in SCRAM authentication
2. How channel binding tokens are used in SCRAM-PLUS
3. A proposal for allowing mutual verification of channel binding

This gets us a step closer to understanding and applying techniques for stronger mutual authentication with defenses against MITM. In future articles I’ll be looking at how browsers should be stricter about the authentication methods that they allow, how people are nearly always the weakest link in the authentication process, and what to do about it.