This past spring, Juliano Rizzo (@julianor) and I came up with a cryptographic attack on Telegram's MTProto "secret" chat communications which can be performed in roughly 264 operations. The attack happens from an active MITM position on Telegram's servers.
By default, messages sent by users which are not part of a secret-chat are logged and stored on Telegram's servers in a way that allows Telegram to view the message contents and hand them out to intended parties. This always holds true when a conversation can move across devices (from a phone to computer for example). Those chats are not private, so users should be very careful to not send incriminating information or pictures without starting a secret chat. Group chats additionaly do not feature any end-to-end encryption. Furthermore, the next time someone logs in they'll have access to previously sent non-secret-chat messages. We'll get back to this point later.
The Secret Chat Feature
The main backing to Telegram's great claims for "Taking Back Our Right To Privacy" is the secret-chat / end-to-end communication feature. This is the part that was trivially breakable during the first Telegram contest with a server-supplied XOR-key.
The end-to-end encryption is essentially Diffie-Hellman for key negotiation followed by a custom AES encryption scheme. The authenticity of the end-to-end-encryption derives from a truncated SHA-1 hash of the DH shared secret key, which gets turned into a visual fingerprint that users must figure out how to transmit, and then compare by eye. If you're a cryptographer, I'm sorry you had to read this, I know that was painful, and goes against many well-established best practices. At least Telegram's engineers make cryptographically strong recommendations and requirements about the strengths of the Diffie-Hellman parameters.
The fingerprint that two users compare is derived from a hash of the Diffie Hellman secret. So the shared secret will be in the space of 22048 and the hash function used is SHA-1, which produces a 160-bit hash. This 160-bit hash is truncated to 128 bits, which are then used to produce a telegram fingerprint to visually verify as shown in the picture.
What's important to realize is that it's a security-usability nightmare. In practice, when I find someone who uses Telegram, and I ask them how they do their secret-chat authentication, I am usually very disappointed. What they show me is that they will send a screenshot of the fingerprint over the secret-chat they are trying to authenticate. It would be trivial for a man-in-the-middle attacker to auto-replace an image of the fingerprint.
A 264 Attack
Okay, so suppose that users are doing the right thing, and comparing the visual fingerprints in person. And they do not make any mistakes while visually comparing the difference, because they are extremely careful.
The first observation is that although the shared secret is in the space of a strong 2048-bit prime, the authentication that prevents the man-in-the-middle is the 128-bit fingerprint, which is what is visually verified. If there was a way to efficiently cycle through DH parameters, an active man-in-the-middle attacker could spoof the fingerprint.
The second observation is that in the standard flow of the end-to-end protocol, an active MITM attacker only has a few parameters to work with. There's not much flexibility in controlling the resulting shared secret for the second peer.
In the typical Diffie Hellman MITM scenario, an attacker simply negotiates their own shared secret with each peer.
However, if the MITM attacker socially engineers the second peer to start a secret chat out of their own initiative, then there's more freedom. Instead, the MITM attacker now sits on two DH public keys A, and B. A MITM attacker can produce two different shared secrets M1A and M2B with different exponents m1 and m2, whereas previously there was only flexibility for picking multiple possible shared secrets with the resulting shared secret for the first peer.
This is the attack. The attacker searches through values of 'm1' and 'm2' which will produce the same 128-bit fingerprint for both M1A and M2B. The flexibility to search for values of both 'm1' and 'm2' for two different resulting shared secrets allows a birthday-attack to be performed to find the collision. Instead of having to search through 2128 values, the attacker searchers through 264 values of 'm1' and 264 values of 'm2' with nearly 50% probability of success.
Complexity wise, this attack is a roughly a 265) step operation. In 2015, this difficulty is unacceptably weak. For some perspective, SHA-1 was deprecated by NIST at the end of 2012, with SHA-256 being recommended instead. A birthday attack on SHA-256 would require 2128 computations. And despite the attack being expensive, it is feasible and within the means of a very well financed adversary.
Implementation wise, one might at first glance think this attack still requires prohibitive resources.
One misconception may be that such large exponentiation in the search for colliding hashes will be too computationally intensive, since very large numbers in a group of a 2048-bit prime are being exponentiated. However, an attacker only needs to perform a square and modulo operation to achieve a new candidate to hash at each step.
Another common misconception may be that such birthday attacks are memory constrained, and 264 memory is needed. Again, this is incorrect, as parallel and memoryless birthday attacks are a relatively well known attack technique and various memory-time trade off strategies exist. We've implemented code which uses a rho-phollard method for cycle finding to detect hash collisions without the huge memory overhead.
Overall, I would estimate that a full attack costs in the tens of millions USD in infrastructure and electricity to pull off and get a full fingerprint collision in reasonable time. Attackers may also be able to steal or borrow existing infrastructure, like a botnet or supercomputer system. In other words, this should be within some Super Villain's budget.
260 attack cost estimation - https://www.schneier.com/blog/archives/2012/10/when_will_we_se.html
Telegram has responded on this attack, with an article where they estimate this attack to cost over one trillion dollars
Why A Super Villain Doesn't Need The Attack
Sure, an attacker could spend a bunch of money and effort to gain access to telegram's servers and even more effort and money to pull off this attack, but realistically there are much simpler and equally profitable attacks to perform.
All those contact lists and default chats will be sitting on the Telegram's servers, with plenty of metadata information about who is talking to who, how often they communicate, with phone-numbers and more than enough communications logged by default to look through. If an attacker did gain access to Telegram's data, those results in themselves would be pretty good without spying on secret chats. Seriously, doesn't that sound like the kind of private information that super villains are after?
Now, if a super villain really wanted to intercept a secret chat, they can bet that their target just sends the fingerprint in-band over the newly established channel. That is what many users do, unfortunately. And because they're a super-villain, they'll intercept the other channel that their target may try to use, SMS for example, since the visual fingerprint can't be verbally communicated unless the client also shows the fingerprint in readable form (most clients do not). The fingerprints are simply replaced with the super villain's, and users would have to wait to meet in person to discover their communication was compromised.
Another thing. Presumably, people use private messengers on their phones to avoid the vulnerable communications telco carriers provide consumers with. However, Telegram's only user authentication scheme is an SMS code.
We know that SMS is weak and we know that adversaries exploit this weakness in the real world. SMS can be sniffed and cracked, targets can be connected to fake base stations, and carriers can be compromised (think belgacom) or coerced into cooperating. So if SMS is the authentication method, you can bet that telegram accounts can and will be hijacked.
Attacking SMS can also be lo-tech. If the Super Villain is targeting a frenemy, they can simply borrow their frenemy's phone (or SIM card if the phone is locked) for a few minutes and hijack the account. And best of all, Telegram logs and displays previous default-chat messages. From the stolen account, a super villain can also social engineer account contacts and start new secret-chats to trick contacts into spilling secrets.
On December 1st, Telegram did introduce a "forward-secrecy" feature which adds key-rotation inside of the established channel. However, it will not prevent the active MITM attack described here.
What is needed to fix Telegram's privacy are the following critical things:
First, chats (including group chats) must be end-to-end encrypted by default. This avoids human error and information leaks. End-to-end by default is the case with Threema and TextSecure already.
Second, the end-to-end encryption must move away from authentication per secret-chat conversation and go over to public-key cryptography for user identities. It is nonsense to authenticate a secret-chat session key. Instead, each user should have one or a set of public keys with which to perform key negotiation with. Users verify each other once, and that's it. OTR, Threema, and TextSecure all solved this problem ages ago as well.
Third, the user authentication is much, much too weak for a private messenger and adversaries who can afford SMS-interception can trivially hijack accounts and perform social engineering as well as view default chats. Another authentication scheme for accounts is needed as soon as possible, even passwords with two-factor authentication. This vector is by far the most likely real-world attack scenario where Telegram's servers are not compromised and it can be performed by any villain, not just super villains.
Finally, to honor privacy, Telegram must enable communications decoupled from the requirement for address books and a phone number so that people can use Telegram anonymously, which is not currently possible
The Telegram Contest
There is currently an active $300,000 contest for cracking Telegram's End-to-End. https://telegram.org/blog/cryptocontest. Does this attack qualify? Actually, the way the contest is set-up, there's no way to have two peers initiate a chat with a MITM attacker. If the contest allowed this, the attack may have a chance. The computation will also likely cost more than the winnings without a massively crowd-sourced effort.
Readers, go and see if you can find more bugs, design flaws or otherwise. There should be at least a few more in the MTProto soup.
Recommendations for Telegram Users Looking for Privacy
As a general messenger, Telegram seems just fine. If secrecy and privacy are at all important however, I recommend to users that they look elsewhere at this time because of the usability problems. Some viable alternatives I can recommend are Threema or TextSecure.
Special thanks to @odiumeh for much collaboration, tips, and encouragement as well as all the other people I talked their ears off of about this. Also thanks to Dhiru Kholia and Marsh Ray for the helpful feedback on this post.
Example Partial Collision
p = 0xc71caeb9c6b1c9048e6c522f70f13f73980d40238e3e21c14934d037563d930f48198a0aa7c14058229493d22530f4dbfa336f6e0ac925139543aed44cce7c3720fd51f69458705ac68cd4fe6b6b13abdc9746512969328454f18faf8c595f642477fe96bb2a941d5bcd1d4ac8cc49880708fa9b378e3c4f3a9060bee67cf9a4a4a695811051907e162753b56b0f6b410dba74d8a84b2a14b3144e0ef1284754fd17ed950d5965b4b9dd46582db1178d169c6bc465b0d6ff9ca3928fef5b9ae4e418fc15e83ebea0f87fa9ff5eed70050ded2849f47bf959d956850ce929851f0d8115f635b105ee2e4e15d04b2454bf6f4fadf034b10403119cd8e3b92fcc5b
A = 31255283409627973717223000370765106922189476428506306750544086446550550889396
B = 112771494640069988074840580093805523312740220076402790935086113796930956685865
m1 = 1524456385
m2 = 2871128407
M1A = A2 * m1 mod p
M2B = B2 * m2 mod p
M1A = 0x894c450b654f0fba7eb0baf97f08cc0aa448a17a2773d5dfcd6827d119d5d910b9f8fa75e25cc23ad30aa819cdcae8e4b4a9f3551afbe37061f3a768b62507c1eb2d017a67d3433f69acec8a5cd2a73008c2f520c71a7cea269325dc8e5bf77778d6349d8db0ac64d7b362218d04bd0a0a0d2a8f8187c4df8f8aa2b177268a505fc8892bda722449a60fff1647601a505e0480af05b14ee5a613414ad3a4ea06d37c03d856e20c5156199bd382048f45e5c150321b8c354fb541946b501aeaf6af039acda9ecb871095f726cc20a920c4b3361cb5f40142fb6319e07667a0e587e49be5f0c7f13960fbc138c9b2c81ae0fff4364d041bfbbf66842d82aca7604
M2B = 0x6722b48e670f4d3c2a24019a9d18211ebbefa407fb94fa3e91bff416cf344d1a2e3bd62c43c92bb4b633586b8e11853faa609c2f474e92033bc0fc97e047e7ed4c2af54a75f7e4bbf08cf9854a478e485b358232aa886b701c5e5dc488335b757cfaf0eda87d5299b5385ca69bdbe758a0b99aa45d371bed4a576cfff4147d384e78ca245cd778d983168a3ebc55950c0203db61c67d903fe77b4879bc35e9529ff51dcbf24add97771a0538ed45d3ee7a269674ded42dc0c85546601485388b045f5196a18355d5a8579df69a06df1519f5bada66270691574b425b22cefa0f6d522394c60f6c5c568f3a16458c64d6dde8cea917cdacbc7fdf552dd7872ef7
SHA1(M1A) = 0xe20bf6722b0a44b7e0fca07bd6c55a74b378ad74
SHA1(M2B) = 0x5813a3521aa452a4a2fffc3ed6c55a74b378ad74
64-bits in common, d6c55a74b378ad74
The O() notation was bothering people since it was inappropriate, also I've added a link to Telegram's response on attack cost.