In January, I asked whether TLS channel binding with strong authentication was the solution to defend against MITM or proxy style phishing attacks. The answer was “yes, but also no”. I will look beyond SCRAM soon, but first I want to fulfil a promise to go into more detail about how SCRAM works, especially with channel binding.
Previously, I lamented how both descriptions and demonstrations of SCRAM were somewhat thin on the ground, especially where channel binding is concerned. However, it would have been too heavy a topic to cover in the previous post. Instead, this post is dedicated to that topic.
While working on this, a subject that admittedly I wasn’t particularly familiar with until a couple of months ago, I’ve also unearthed some issues with the process. Some of these are covered online already, but esoteric as the subject is, I think there’s value in collecting those issues here, as well as adding my own thoughts.
It’s entirely possible that some of these observations are faults with me rather than faults with the protocol, so while I’ve made a best effort to understand the RFCs and implementation sources, I will gladly receive any corrections from those better versed in the matter.
A SCRAM refresh
SCRAM, or Salted Challenge Response Authentication Mechanism, is a way for a client and server to mutually prove that they both possess knowledge of the password belonging to an identity, without having to exchange the password itself. There are some benefits to this:
- Even over an unencrypted channel, the password is never revealed
- Not only must the client provide a valid password, but the server must prove to the client that it knows it, too
- Multi-round hashing is used to increase brute-force and dictionary attack complexity
- The hashing algorithm used is extensible, so supporting stronger future hash types is straightforward
In addition to the above, SCRAM can be combined with TLS channel binding, yielding the following additional benefit:
- The client can prove to the server that the TLS channel it is using is uninterrupted (i.e. not subject to a MITM proxy)
While this doesn’t fix all phishing issues, it’s a step in the right direction.
Demonstration code, a WIP
I’ve put together some demonstration code that shows both the server and client steps needed for an imagined TLS channel bound `SCRAM-SHA-256-PLUS` authentication exchange. At the time of writing it only reflects TLSv1.2, but I do intend to create a version that depicts the proposed channel binding approach for SCRAM over TLSv1.3 as well.
SCRAM in one big diagram, step by step
What follows is a diagram that I created to try to depict the steps performed in SCRAM, complete with channel binding. It’s based primarily on the steps documented in RFC 5802, but presented in a more visually traceable manner.
License
The diagrams I’ve made are Mermaid diagrams. View the source of this page to see how they’re defined. They are licensed under CC BY-SA 4.0. The rest of the post is all rights reserved.
Legend
Data elements are rectangles, processes or operations are rectangles with double vertical borders, outcomes are rounded rectangles, persistent stores are cylinders and decisions are diamonds.
Now, let’s do some SCRAMming.
0. Password creation
Before an authentication is performed, the client and server need to agree on a password. We assume that the server decides the salt and iteration count and shares that with the client, as this is what will happen later during authentication. With that information, both client and server can generate the keys or hashes that they can store. These keys, and the operations performed with them, will later prove prior knowledge of the password without needing direct access to it. Of course, these keys can be re-generated from the original password, salt and iteration count if needed.
The password, salt and iteration count are used with a PBKDF2 hashing process to produce the “salted password”. Hash-based Message Authentication Codes (HMAC), keyed with the salted password, are then computed over the fixed strings “Client Key” and “Server Key” to derive the client and server keys. Additionally, the client key is hashed, producing a “stored key”.
We then have the following data objects that can be stored by client and server:
- Server
  - Server key
  - Stored key
- Client
  - Server key
  - Client key
  - Salted password
Note that the salted password doesn’t need to be stored if the client instead stored both client and server keys; those are everything it needs to complete a SCRAM handshake later. For the rest of these example steps we assume that we keep only the keys that we need, and avoid touching the original password or salted password again.
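The derivation above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration rather than the demonstration code mentioned earlier: the function name, the example password, the 16-byte salt and the iteration count of 4096 are all assumptions, and the SASLprep normalisation of the password that RFC 5802 asks for is skipped.

```python
import hashlib
import hmac
import os


def derive_scram_keys(password: str, salt: bytes, iterations: int) -> dict:
    """Derive SCRAM-SHA-256 key material from a password (SASLprep omitted)."""
    # Salted password: PBKDF2 with HMAC-SHA-256, output length = hash length.
    salted_password = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    # The salted password is the HMAC key; the fixed strings are the messages.
    client_key = hmac.new(salted_password, b"Client Key", hashlib.sha256).digest()
    server_key = hmac.new(salted_password, b"Server Key", hashlib.sha256).digest()
    # The stored key is a plain hash of the client key.
    stored_key = hashlib.sha256(client_key).digest()
    return {
        "client_key": client_key,
        "server_key": server_key,
        "stored_key": stored_key,
    }


salt = os.urandom(16)  # assumption: salt chosen by the server
iterations = 4096      # assumption: example iteration count
keys = derive_scram_keys("correct horse battery staple", salt, iterations)

# What each side keeps, following the lists above (plus salt and iterations).
server_record = {
    "salt": salt,
    "iterations": iterations,
    "stored_key": keys["stored_key"],
    "server_key": keys["server_key"],
}
client_record = {
    "salt": salt,
    "iterations": iterations,
    "client_key": keys["client_key"],
    "server_key": keys["server_key"],
}
```

Note that the server record never contains the client key, only its hash.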
[Diagram: password creation. The client feeds the password, together with the server's salt and iteration count, into PBKDF2 to produce the salted password, then derives the client key and server key with HMAC. The hashed client key (stored key), server key, username, salt and iteration count go to the server-side credentials; the username, salt, iteration count, client key and server key go to the client-side credentials.]
RFC 5802 doesn’t specify how these keys are agreed. For example, does the client compute and send the server key and stored key to the server? Or do they both take the salted password and use it to derive the keys they need before discarding it? An untrustworthy server could ask the client to do a very low number of hash iterations, allowing the server to receive a salted hash that was easier to brute force. However, regardless of exactly what is done, the client should be mindful of the salt and iteration strength, defending against short salts and low iterations.
In the above example, the key calculations are left to the client, which then sends the server the keys that it needs to store. That transfer must happen over a secure channel.
1. First message from client
Now we can do some actual authentication. The first message sent is from the client to server, which in our example will request that channel binding be enabled, using the `tls-unique` method. The message will also contain the username and a client-generated nonce. The channel binding token exists at this point, too, because communication is happening over a TLS session. However, we don’t use it yet.
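As a rough sketch, the client's first message could be assembled like this. The function name and the 18-byte nonce length are assumptions for illustration; the `=3D` and `=2C` escapes for `=` and `,` in usernames come from RFC 5802.

```python
import base64
import secrets


def client_first_message(username: str, cb_type: str = "tls-unique"):
    """Return (gs2_header, client_first_bare, full_message)."""
    # "p=<type>" requests channel binding; the empty field between the
    # two commas is the unused authorisation identity.
    gs2_header = f"p={cb_type},,"
    # Printable nonce; base64 output never contains a comma.
    client_nonce = base64.b64encode(secrets.token_bytes(18)).decode("ascii")
    # "=" and "," are reserved characters in SCRAM attribute values.
    safe_user = username.replace("=", "=3D").replace(",", "=2C")
    bare = f"n={safe_user},r={client_nonce}"
    return gs2_header, bare, gs2_header + bare


gs2_header, client_first_bare, first_message = client_first_message("alice")
```

The "bare" part, without the leading channel binding options, is kept separately because it is what goes into the authentication message later.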
[Diagram: first message from client. The username from the client-side credentials, a freshly generated client nonce and the tls-unique channel binding option are combined and sent to the server as `p=tls-unique,,n=$USER,r=$CLIENT_NONCE`.]
2. First message from server
The server now knows the requested username, so it can look up the salt and iteration count that were used for that user. It will send those, and extend the client nonce with additional server-generated nonce material. If the client is in possession of the original password, it can use the salt and iteration count to re-compute the PBKDF2 salted password and other required keys. It would also be possible to check that the server is still expecting the same salt and iteration count as first agreed and use the previously retained keys instead.
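A minimal sketch of both sides of this message follows. The function names are hypothetical, and the client-side floor of 4096 iterations is an assumed policy value, there to illustrate the earlier point about defending against weak parameters.

```python
import base64
import secrets


def server_first_message(client_first_bare: str, salt: bytes, iterations: int) -> str:
    """Server side: extend the client's nonce and attach salt and iterations."""
    fields = dict(part.split("=", 1) for part in client_first_bare.split(","))
    server_nonce = base64.b64encode(secrets.token_bytes(18)).decode("ascii")
    salt_b64 = base64.b64encode(salt).decode("ascii")
    return f"r={fields['r']}{server_nonce},s={salt_b64},i={iterations}"


def parse_server_first(message: str, client_nonce: str, min_iterations: int = 4096):
    """Client side: check the nonce was extended and the parameters are sane."""
    fields = dict(part.split("=", 1) for part in message.split(","))
    combined_nonce = fields["r"]
    if not combined_nonce.startswith(client_nonce) or combined_nonce == client_nonce:
        raise ValueError("server did not extend the client nonce")
    iterations = int(fields["i"])
    if iterations < min_iterations:
        raise ValueError("iteration count too low")
    return combined_nonce, base64.b64decode(fields["s"]), iterations


example_salt = bytes(16)  # assumption: placeholder salt for the example
server_first = server_first_message("n=alice,r=abc123", example_salt, 4096)
combined_nonce, parsed_salt, parsed_iterations = parse_server_first(server_first, "abc123")
```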
[Diagram: first message from server. The server takes the client nonce from the client's first message, adds its own generated nonce, looks up the salt and iteration count in the server-side credentials, and sends `r=$CLIENT_NONCE$SERVER_NONCE,s=$SALT,i=$ITERATIONS` to the client.]
3. Auth message generation
The client now has enough data to produce the authentication message. This is the object that both client and server will operate on to prove knowledge of the password. The authentication message is the concatenation of the first messages from the client and server, along with the channel binding token and a final copy of the combined client/server nonce. A SCRAM authentication without channel binding would, of course, not include the CBT. The client’s proof will be appended to this message in the next step, but that part is not used within the authentication message.
So far the only data contained within exchanged messages and the authentication message is:
- Username
- Nonce (in parts and repeated)
- Iteration count
- Salt
- Channel binding options
- Channel binding token value
The accumulated nonce data will ensure that the proofs provided are freshly computed and the CBT will ensure that the client is using the expected communication channel.
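To make the concatenation concrete, here is a small sketch of how the authentication message could be assembled, following the layout in RFC 5802. The function and variable names are my own, and the 12-byte token in the example is a made-up stand-in for a real `tls-unique` value.

```python
import base64


def build_auth_message(gs2_header: str, client_first_bare: str,
                       server_first: str, combined_nonce: str, cbt: bytes):
    """Return (client_final_without_proof, auth_message)."""
    # The c= attribute carries base64(channel binding options + token).
    c_value = base64.b64encode(gs2_header.encode("ascii") + cbt).decode("ascii")
    client_final_without_proof = f"c={c_value},r={combined_nonce}"
    # Three comma-joined parts: both first messages, then the final
    # message minus its proof.
    auth_message = ",".join(
        [client_first_bare, server_first, client_final_without_proof]
    )
    return client_final_without_proof, auth_message


fake_cbt = b"\x01" * 12  # assumption: stand-in for the real TLS token
final_without_proof, auth_message = build_auth_message(
    "p=tls-unique,,",
    "n=alice,r=abc",
    "r=abcdef,s=AAAA,i=4096",
    "abcdef",
    fake_cbt,
)
```

Without channel binding, the options would be `n,,` and the token empty, which gives the familiar `c=biws`.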
4. Proof generation
The client must now prove that it has prior knowledge of the password. To do this, it reproduces the stored key from the client key, then performs an HMAC with the stored key against the authentication message. This produces the client signature. The signature is transformed into a client proof by XOR’ing the signature with the client key. Note that the server is not in possession of the client key, so we’ll see later how the server actually verifies the message.
The client now sends its final message, containing the channel binding data and the combined nonce, along with its proof, to the server.
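The proof calculation is short enough to sketch directly. The names here are hypothetical, and the client key in the example is a dummy value instead of one derived from a real password.

```python
import base64
import hashlib
import hmac


def client_proof(client_key: bytes, auth_message: str) -> bytes:
    """Client proof = client key XOR HMAC(stored key, auth message)."""
    stored_key = hashlib.sha256(client_key).digest()
    client_signature = hmac.new(
        stored_key, auth_message.encode("utf-8"), hashlib.sha256
    ).digest()
    return bytes(k ^ s for k, s in zip(client_key, client_signature))


def client_final_message(client_final_without_proof: str, proof: bytes) -> str:
    """Append the base64 proof to the final message."""
    return f"{client_final_without_proof},p={base64.b64encode(proof).decode('ascii')}"


demo_client_key = hashlib.sha256(b"demo client key").digest()  # dummy value
demo_proof = client_proof(demo_client_key, "example-auth-message")
demo_final = client_final_message("c=biws,r=abcdef", demo_proof)
```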
[Diagram: proof generation. The client hashes the client key to get the stored key, HMACs the authentication message with it to produce the client signature, then XORs the signature with the client key to produce the client proof. The proof is appended to the client's final message and sent to the server as `c=$CHANNEL_BINDING,r=$NONCE,p=$CLIENT_PROOF`.]
5. Verification by server
The server now needs to verify the client’s proof. With receipt of the final client message, it also has all data required to create its own copy of the authentication message. It possesses the stored key, so it can perform the same HMAC as the client did in order to reproduce the client signature. The proof that was sent by the client is XOR’d with the server-generated client signature. Remember that the client XOR’d the client signature and client key to generate the client proof, so the server’s XOR should produce the client key, which I call the candidate client key here.
The server isn’t storing the client key, so how does it verify that the candidate client key is correct? The stored key, which the server does keep, is simply a hash of the client key. So the server hashes the candidate client key to produce a candidate stored key and compares it to the stored key that it… stored. If they match, then the client’s proof is valid; they are both in possession of adequate material derived from the same original password.
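The server's side of that check might look something like the following sketch. The function name is hypothetical, and the example fabricates a client key and proof locally just to exercise the logic.

```python
import hashlib
import hmac


def verify_client_proof(stored_key: bytes, auth_message: str, proof: bytes) -> bool:
    """Recover a candidate client key from the proof and check its hash."""
    client_signature = hmac.new(
        stored_key, auth_message.encode("utf-8"), hashlib.sha256
    ).digest()
    if len(proof) != len(client_signature):
        return False
    # XOR undoes the client's XOR, yielding the candidate client key.
    candidate_client_key = bytes(p ^ s for p, s in zip(proof, client_signature))
    candidate_stored_key = hashlib.sha256(candidate_client_key).digest()
    # Constant-time comparison against the key the server actually stored.
    return hmac.compare_digest(candidate_stored_key, stored_key)


# Fabricated example values, standing in for a real exchange.
example_client_key = hashlib.sha256(b"example client key").digest()
example_stored_key = hashlib.sha256(example_client_key).digest()
example_signature = hmac.new(
    example_stored_key, b"example-auth-message", hashlib.sha256
).digest()
example_proof = bytes(k ^ s for k, s in zip(example_client_key, example_signature))

proof_accepted = verify_client_proof(
    example_stored_key, "example-auth-message", example_proof
)
```

Changing a single character of the authentication message on either side yields a different signature, so the recovered candidate key no longer hashes to the stored key.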
[Diagram: verification by server. The server compares the CBT from its own TLS session with the CBT in the client's final message (pass or fail). It builds the authentication message from the client's first message, the server's first message and the client's final message without the proof, HMACs it with the stored key to reproduce the client signature, XORs that with the client proof to obtain a candidate client key, hashes it into a candidate stored key and compares that with the stored key (pass or fail).]
Channel binding check
When a CBT is present, both the key comparison and CBT check must pass in order for authentication to be successful. With the CBT included in the authentication message, the server must access its own TLS data to verify that the token matches it. If the client is not using the same TLS channel directly, it will not be able to produce a matching token. If a MITM tries to tamper with the token in the client’s message, then the server may accept the token, but still fail to authenticate the client, as the client proof will be invalid due to the authentication message being different.
There’s no way for the client to know for sure that the server isn’t affected by a MITM, because the client already revealed the channel binding token to it. We have to trust that the server adequately verified the channel binding.
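A sketch of the server-side token comparison is below. The function name is my own. On a live TLSv1.2 connection, Python's `ssl.SSLSocket.get_channel_binding("tls-unique")` is one way to obtain the server's copy of the token; here a fixed dummy value is used so the example runs without a connection.

```python
import base64
import hmac


def channel_binding_ok(c_value: str, gs2_header: str, server_cbt: bytes) -> bool:
    """Compare the client's c= value with what this server expects to see."""
    expected = base64.b64encode(gs2_header.encode("ascii") + server_cbt)
    # Constant-time comparison of the two base64 strings.
    return hmac.compare_digest(c_value.encode("utf-8"), expected)


server_cbt = b"\x11" * 12  # dummy token as seen on the server's TLS session
client_c_value = base64.b64encode(b"p=tls-unique,," + server_cbt).decode("ascii")
mitm_c_value = base64.b64encode(b"p=tls-unique,," + b"\x22" * 12).decode("ascii")

direct_ok = channel_binding_ok(client_c_value, "p=tls-unique,,", server_cbt)
mitm_ok = channel_binding_ok(mitm_c_value, "p=tls-unique,,", server_cbt)
```

Nothing in the protocol forces a server to call a function like this, which is the deficiency discussed later.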
6. Server verification message
With the server satisfied that the client is authenticated, the server must report this success, along with proof that it, too, has adequate knowledge of the password. Once again, the authentication message is used for this, but this time, the server performs an HMAC of the authentication message with the server key instead. This produces a server signature, which can then be sent as a final message to the client.
[Diagram: server verification message. The server HMACs the authentication message with the server key from the server-side credentials to produce the server signature, and sends it to the client as `v=$SERVER_SIGNATURE`.]
7. Client verification of server
The client is in possession of the server key as well, so this verification step is relatively simple: use the authentication message and server key to reproduce the server signature, and compare it to the one that the server sent. If they match, authentication is complete.
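Both halves of this final step can be sketched together. The function names and the dummy server key are assumptions for illustration only.

```python
import base64
import hashlib
import hmac


def server_signature(server_key: bytes, auth_message: str) -> bytes:
    """Server signature = HMAC(server key, auth message)."""
    return hmac.new(server_key, auth_message.encode("utf-8"), hashlib.sha256).digest()


def verify_server_final(server_final: str, server_key: bytes, auth_message: str) -> bool:
    """Client side: recompute the signature and compare it with v=..."""
    if not server_final.startswith("v="):
        return False  # for example an error response instead of a signature
    try:
        received = base64.b64decode(server_final[2:], validate=True)
    except ValueError:
        return False
    return hmac.compare_digest(received, server_signature(server_key, auth_message))


demo_server_key = hashlib.sha256(b"demo server key").digest()  # dummy value
demo_server_final = "v=" + base64.b64encode(
    server_signature(demo_server_key, "example-auth-message")
).decode("ascii")
server_verified = verify_server_final(
    demo_server_final, demo_server_key, "example-auth-message"
)
```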
[Diagram: client verification of server. The client HMACs the authentication message with the server key from the client-side credentials and compares the result with the server signature it received. A match means authentication is complete; a difference means failure.]
For an extra treat, see this article’s appendix for an attempt at combining all steps into one.
Message sequence
Hopefully, you spotted the messages exchanged between client and server? For completeness, here they are combined into a sequence diagram. This is a single example with some variables used for easier reading. Other forms of SCRAM authentication will sometimes have other fields and options present, and of course, there are potential error responses as well, which are not shown here.
- Client to server, client first message: `p=tls-unique,,n=$USER,r=$CLIENT_NONCE`
- Server to client, server first message: `r=$CLIENT_NONCE$SERVER_NONCE,s=$SALT,i=$ITERATIONS`
- Client to server, client final message: `c=$CHANNEL_BINDING,r=$NONCE,p=$CLIENT_PROOF`
- Server to client, server final message: `v=$SERVER_SIGNATURE`
What I find particularly interesting is that the client part of the nonce is exchanged three times, and the server part twice, by virtue of how it’s assembled and then ping-ponged between the two participants. The presence of the channel binding token in the sent message is interesting too - more on that in the section on issues, below.
Issues
Having covered the mechanics of SCRAM with channel binding, I now want to look at the issues that it has. There are three main areas of concern:
- SLOTH and `tls-unique`, where truncated hashes could lead to collisions
- TLSv1.3 drops `tls-unique`
- SCRAM protocol breaks two-way bindings, I think.
SLOTH
SLOTH stands for Security Losses from Obsolete and Truncated Transcript Hashes. The premise is that `tls-unique` uses truncated material as its channel binding token, and that known or potential weaknesses in the hashing algorithms can be used more easily against systems that have such truncated material in them.
This type of attack can be launched in exactly the kind of scenario where we’d want to use SCRAM with channel binding - where a MITM may be present. As described in the miTLS article on SLOTH, the MITM has knowledge of (but not control over) the master secret of both TLS channels, along with sufficient control of the transcript of TLS messages such that it can generate a colliding channel binding token across the two channels. With `tls-unique`, the `HMAC` output that forms the CBT is 96 bits (12 bytes) long, so only 2^48 HMAC attempts are needed to find a collision that breaks the binding.
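The 2^48 figure comes from the birthday bound: a collision in an n-bit value is expected after roughly 2^(n/2) attempts. A quick back-of-the-envelope check, with a 32-byte token included purely for comparison:

```python
def collision_attempts(token_bytes: int) -> int:
    """Approximate birthday-bound work to collide a token of this size."""
    bits = token_bytes * 8
    return 2 ** (bits // 2)


tls_unique_work = collision_attempts(12)    # 96-bit token
full_length_work = collision_attempts(32)   # 256-bit token, for comparison

print(f"12-byte token: about 2^48 = {tls_unique_work:,} attempts")
print(f"32-byte token: about 2^128 = {full_length_work:.3e} attempts")
```

This is only an order-of-magnitude estimate; it ignores the cost of each attempt and any further weakness in the underlying hash.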
This attack is feasible even if SHA-256 is used for the HMAC, as would be the case with `SCRAM-SHA-256-PLUS`, thanks to the truncation it’s subjected to in `tls-unique`. SLOTH also addresses the use of weaker algorithms, such as `MD5` and `SHA-1`, which can be vulnerable regardless of whether their output is truncated. This has some implications for `SCRAM-SHA-1` and of course `SCRAM-SHA-1-PLUS`, but I suggest that `SHA-256` should be the minimum acceptable mode for most authentication systems at the time of writing (although I doubt it is).
TLSv1.3 and channel binding
I discussed this issue in a previous article, so won’t get into much detail here. For completeness, however, it must be noted that TLSv1.3 does not include `tls-unique`. This is mostly driven by the SLOTH issue described above. Unfortunately, the replacement for `tls-unique`, dubbed `tls-exporter`, is still under review.
With `tls-exporter`, Exported Key Material (EKM) is derived from the TLS master secret using an HMAC-based Key Derivation Function (HKDF), using the label `EXPORTER-Channel-Binding` as input to the HKDF alongside the master secret. The output is 32 bytes in length, so maintains the same strength as the master secret, unlike the truncated 12 bytes of the `tls-unique` output.
If published as a proposed standard, this would overcome SLOTH type issues, but would require TLS libraries to support exporting this data for use by applications for channel binding.
Mutual channel binding verification
One potential deficiency I identified in step 5 is that the client has to trust that the server verified the channel binding. SCRAM avoids a situation where the server simply accepts any provided proof, because it must also produce verification to the client that it shares some knowledge of the password. However, SCRAM-PLUS doesn’t give the same mutual guarantees for the channel binding as it does for the password.
The server may simply accept a SCRAM-PLUS authentication without checking the channel binding token. The server would be at fault for deficient binding checks and the client has no way of knowing this.
A fix?
I don’t think it’s worth dwelling on TLSv1.2 and the `tls-unique` binding method. Instead, I’d like to turn my attention to TLSv1.3 and the proposed `tls-exporter` method. The current proposal is to export a token by using an HMAC-based Key Derivation Function (HKDF) against the TLS session’s unique master secret with the label `EXPORTER-Channel-Binding`. We can go further than that.
I would like to see client and server derive and share their own Exported Key Material (EKM) by using unique labels, for example `EXPORTER-Channel-Binding-Client` and `EXPORTER-Channel-Binding-Server`. The client and server can share the EKM during SCRAM and include them in the authentication message, similar to the combined client/server nonce. The client can reproduce the server’s EKM and check it against the authentication message contents, with the server reproducing the client’s EKM and then also checking. This proves that both are operating with the same TLS unique master secret, with two-way verification of this fact.
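To illustrate the idea only, here is a toy sketch. The `toy_exporter` function is a plain HMAC used as a stand-in; it is not the real TLS 1.3 exporter, and the random session secrets are placeholders for whatever a TLS library would expose. All names are hypothetical.

```python
import hashlib
import hmac
import os

CLIENT_LABEL = b"EXPORTER-Channel-Binding-Client"
SERVER_LABEL = b"EXPORTER-Channel-Binding-Server"


def toy_exporter(session_secret: bytes, label: bytes) -> bytes:
    """Stand-in for a TLS exporter; NOT the real TLS 1.3 HKDF construction."""
    return hmac.new(session_secret, label, hashlib.sha256).digest()


def peer_ekm_matches(local_secret: bytes, peer_ekm: bytes, peer_label: bytes) -> bool:
    """Reproduce the peer's EKM from our own session and compare."""
    return hmac.compare_digest(toy_exporter(local_secret, peer_label), peer_ekm)


# Direct connection: both ends share one session secret.
shared_secret = os.urandom(32)
client_ekm = toy_exporter(shared_secret, CLIENT_LABEL)
server_ekm = toy_exporter(shared_secret, SERVER_LABEL)
direct_both_ok = (
    peer_ekm_matches(shared_secret, server_ekm, SERVER_LABEL)
    and peer_ekm_matches(shared_secret, client_ekm, CLIENT_LABEL)
)

# MITM: the client and the server each have a different session secret.
client_side_secret, server_side_secret = os.urandom(32), os.urandom(32)
mitm_server_ekm = toy_exporter(server_side_secret, SERVER_LABEL)
client_detects_mitm = not peer_ekm_matches(
    client_side_secret, mitm_server_ekm, SERVER_LABEL
)
```

In the MITM case, the client's reproduction of the server's EKM fails, so the client no longer has to take the server's check on trust.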
This closes the loop in terms of verification by both parties.
Alternative
Another way of resolving the issue, which would work for `tls-exporter` without generating any extra material, would be to remove the binding data from the client’s final message, but retain it in the authentication message. With a bound channel, both sides have access to the channel binding data, so can construct equivalent authentication messages for their proof/signature generation, but the client doesn’t give the server an advantage by revealing the binding data.
Closing thoughts
In this post I’ve focused on three main issues:
- The steps involved in SCRAM authentication
- How channel binding tokens are used in SCRAM-PLUS
- A proposal for allowing mutual verification of channel binding
This gets us a step closer to understanding and application of techniques for stronger mutual authentication with defenses against MITM. In future articles I’ll be looking at how browsers should be stricter about authentication methods that they allow and how people are nearly always the weakest link in the authentication process, and what to do about it.
Join the discussion
Visit my LinkedIn post to contribute your comments.
Appendix: SCRAM-PLUS in one giant diagram
Mermaid is nice, but it doesn’t work so well with very large or complex diagrams. For fun, though, let’s see what happens when all of the (already complicated) steps are combined into a single diagram. To spare your screen, renderer, etc, it’s included as a pre-rendered SVG image. See the mermaid live editor for the original.