Some weeks ago, I rented the 2007 adventure film National Treasure: Book of Secrets on DVD, in which treasure hunter Benjamin Franklin Gates (played by Nicolas Cage) sets out to discover the truth behind the assassination of Abraham Lincoln. The movie’s first scene takes place in a tavern in Washington, D.C., five days after the end of the Civil War. Ben Gates’ great-great-grandfather Thomas Gates is approached by John Wilkes Booth and another member of the Knights of the Golden Circle, who ask him to decipher a secret message. The message has obviously been encrypted using the Playfair cipher and might lead them to a mythological city of gold called Cíbola.
As the Playfair cipher was state-of-the-art at the end of the Civil War in 1865, I wondered how someone (even if portrayed as a well-known puzzle solver) would be able to perform a successful ciphertext-only attack within just one or two hours, not having any frequency tables at hand and given a ciphertext consisting of only 22 digraphs (= pairs of letters). The following article will explain the basic concepts (encryption, decryption and cryptanalysis) of the Playfair cipher using the example from National Treasure: Book of Secrets.
The Playfair cipher was the first digraph substitution cipher in history, that is, letters are encrypted and decrypted sequentially in pairs. The scheme was invented in 1854 by Charles Wheatstone, but bears the name of Lord Playfair, who promoted its use. The digraph substitution makes frequency-based cryptanalysis significantly harder, as one has to deal with 600 possible digraphs (25 × 24, because the square holds 25 letters and a pair never consists of the same letter twice) rather than the 26 possible monographs. As a result, larger ciphertexts are necessary to perform a successful cryptanalysis compared to conventional monograph substitution ciphers. Due to this characteristic, the Playfair cipher was superior to many contemporary ciphers, and as it was also relatively easy to use, the British forces even employed it as a field cipher during World War I, about 50 years after the American Civil War.
So how did the fictional character Thomas Gates manage to decrypt the following (rather short) ciphertext?
ME IK QO TX CQ
TE ZX CO MW QC
TE HN FB IK ME HA
KR QC UN GI KM AV
Most likely, he simply guessed the correct keyword “DEATH” by using the given hint “The debt that all men pay.” In that case, he would have constructed the corresponding 5×5 Playfair square by entering “DEATH” in the first row and filling the square up with the remaining letters of the alphabet.
D E A T H
B C F G I
K L M N O
P Q R S U
V W X Y Z
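The construction of the square can be sketched in a few lines of Python. This is only a minimal illustration, not code from the movie or from any particular tool: the function name `build_square` is my own, and folding J into I is the usual convention assumed here.

```python
import string


def build_square(keyword: str) -> list[str]:
    """Return a 5x5 Playfair square as five 5-letter row strings."""
    # Assumption: I and J share one cell, so J is folded into I.
    alphabet = string.ascii_uppercase.replace("J", "")
    seen = []
    # Keyword letters come first, then the rest of the alphabet in order;
    # letters that already appear in the square are skipped.
    for ch in keyword.upper().replace("J", "I") + alphabet:
        if ch in alphabet and ch not in seen:
            seen.append(ch)
    return ["".join(seen[i:i + 5]) for i in range(0, 25, 5)]


for row in build_square("DEATH"):
    print(" ".join(row))
```

For the keyword “DEATH” this prints exactly the five rows shown above.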
The ciphertext can now be easily decrypted using the above Playfair square by applying the following four rules to all ciphertext digraphs:
- If the two letters appear on the same row of the Playfair square, replace them with the letters to their immediate left respectively (wrapping around to the right side of the row if a letter in the original pair was on the left side of the row).
- If the two letters appear on the same column of the Playfair square, replace them with the letters immediately above respectively (wrapping around to the bottom side of the column if a letter in the original pair was on the top side of the column).
- If the letters are not on the same row or column of the Playfair square, replace them with the letters on the same row respectively but at the other pair of corners of the rectangle defined by the original pair. (The first decrypted letter of the pair is the one that lies on the same row as the first ciphertext letter.)
- Drop any extra “X” characters which don’t make sense in the final message.
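The first three rules are mechanical enough to be sketched in Python. The names `build_square` and `transform` are my own choices for this illustration, and J is assumed to be folded into I. Encryption uses the same rules with the direction reversed (right instead of left, below instead of above), so a single `decrypt` flag covers both. The fourth rule is deliberately left out, because deciding which “X” is filler requires reading the message.

```python
import string


def build_square(keyword: str) -> list[str]:
    """Build the 5x5 square as five row strings (J folded into I)."""
    alphabet = string.ascii_uppercase.replace("J", "")
    seen = []
    for ch in keyword.upper().replace("J", "I") + alphabet:
        if ch in alphabet and ch not in seen:
            seen.append(ch)
    return ["".join(seen[i:i + 5]) for i in range(0, 25, 5)]


def transform(text: str, square: list[str], decrypt: bool = True) -> str:
    """Apply the row, column and rectangle rules to each letter pair."""
    pos = {ch: (r, c) for r, row in enumerate(square) for c, ch in enumerate(row)}
    step = -1 if decrypt else 1  # left/up when decrypting, right/down when encrypting
    letters = [ch for ch in text.upper().replace("J", "I") if ch.isalpha()]
    out = []
    # A trailing unpaired letter, if any, is silently ignored by zip().
    for a, b in zip(letters[0::2], letters[1::2]):
        (ra, ca), (rb, cb) = pos[a], pos[b]
        if ra == rb:      # same row: shift along the row, wrapping around
            out.append(square[ra][(ca + step) % 5] + square[rb][(cb + step) % 5])
        elif ca == cb:    # same column: shift along the column, wrapping around
            out.append(square[(ra + step) % 5][ca] + square[(rb + step) % 5][cb])
        else:             # rectangle: keep the row, take the other letter's column
            out.append(square[ra][cb] + square[rb][ca])
    return " ".join(out)


CIPHERTEXT = "ME IK QO TX CQ TE ZX CO MW QC TE HN FB IK ME HA KR QC UN GI KM AV"
print(transform(CIPHERTEXT, build_square("DEATH")))
```

Calling `transform(plaintext, square, decrypt=False)` on the decrypted digraphs should reproduce the ciphertext, which is a handy sanity check for the square.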
Hence, the plaintext of the message which Thomas Gates successfully decrypted reads as follows:
LA BO UL AY EL
AD YW IL LX LE
AD TO CI BO LA TE
MP LE SO FG OL DX
If the superfluous “X” characters are dropped and the spacing is adjusted, the resulting message is “Laboulaye lady will lead to Cibola temples of gold”. In the movie, this hint refers to the French Statue of Liberty, which is actually the sister statue of the American Statue of Liberty, whose intellectual creator was the French politician Édouard René de Laboulaye. Cíbola is one of the fantastic Seven Cities of Gold existing only in a myth that originated around the year 1150 when the Moors conquered Mérida, Spain. The legend of the seven cities of gold survived for many centuries and even drew the Conquistadors northward until they encountered the French colonists, who successfully resisted their further advance. In the movie National Treasure: Book of Secrets, Thomas Gates’ descendant Ben Gates finally manages to rediscover the mythological temples of gold in a huge cave under Mount Rushmore.
However, the final question remains whether a ciphertext-only attack on the short ciphertext given in the movie would be feasible. The usual entry point for cryptanalysis of the Playfair system is a frequency analysis of the ciphertext’s digraphs. Unluckily, the above plaintext doesn’t contain even one of the ten most frequent English digraphs: th, he, in, er, an, re, nd, at, on, nt. Another way of attacking the Playfair cipher exploits the fact that if a letter pair AB is encrypted to CD, then the pair BA is always encrypted to DC. Thus finding such pairs in the ciphertext (e.g., “CQ … QC … QC”) may prove highly fruitful. But again, the corresponding plaintext digraphs “EL” and “LE” have relatively low frequencies in ordinary English texts, whereas frequent reversible pairs such as “ER” and “RE” don’t appear in the plaintext at all. An obvious weakness of the Playfair cipher (especially if the keyword is relatively short) is the fact that in many cases the Playfair square ends with “XYZ”. In the given example, the situation is even worse, as the last row equals the end of the alphabet, “VWXYZ”. Another way of breaking the above ciphertext might be the so-called shotgun hill climbing method in combination with the massive computing power of modern computers. This method takes an educated guess as the basis for the initial square (e.g., “VWXYZ” as the last row) and employs suitable metrics (e.g., frequency counts) to find promising mutations of the Playfair square, which ultimately leads to an approximate solution. However, the unusual nature of the given plaintext “Laboulaye lady will lead to Cibola temples of gold” makes it very hard to choose good metrics, and even promising ciphertext fragments like “ME IK … IK ME …” probably won’t lead anywhere if the cryptanalyst doesn’t have any initial idea about the cleartext’s content (e.g., “LA BO ul ay [e]” and “ci BO LA”).
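The digraph observations above are easy to reproduce. The following Python sketch is only an illustration with helper names of my own (`repeated`, `reversed_pairs`): it lists the digraphs that occur more than once in the ciphertext, the pairs that also occur reversed, and checks the plaintext against the ten frequent digraphs named above.

```python
from collections import Counter

CIPHERTEXT = "ME IK QO TX CQ TE ZX CO MW QC TE HN FB IK ME HA KR QC UN GI KM AV"
PLAINTEXT = "LA BO UL AY EL AD YW IL LX LE AD TO CI BO LA TE MP LE SO FG OL DX"
# The ten frequent English digraphs as listed in the article.
TOP_TEN = {"TH", "HE", "IN", "ER", "AN", "RE", "ND", "AT", "ON", "NT"}


def repeated(digraphs: list[str]) -> dict[str, int]:
    """Digraphs that occur more than once, with their counts."""
    return {d: n for d, n in Counter(digraphs).items() if n > 1}


def reversed_pairs(digraphs: list[str]) -> list[tuple[str, str]]:
    """Pairs (AB, BA) where both orientations occur in the text."""
    present = set(digraphs)
    return sorted({
        tuple(sorted((d, d[::-1])))
        for d in present
        if d[::-1] in present and d != d[::-1]
    })


cipher = CIPHERTEXT.split()
print("repeated digraphs:", repeated(cipher))
print("reversed pairs:   ", reversed_pairs(cipher))
print("frequent digraphs in plaintext:", TOP_TEN & set(PLAINTEXT.split()))
```

On the movie’s ciphertext this reports ME, IK, TE and QC as repeated, CQ/QC as the only reversed pair, and an empty set for the frequent digraphs, which matches the discussion above.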
In conclusion, the short length of the given ciphertext would have made it virtually impossible for the fictional character Thomas Gates to break the encryption by classical means in 1865. Nowadays, the possibility of running attacks like the shotgun hill climbing method on powerful computers allows for feasible attacks even in the case of short and unusual plaintexts. If you want to give it a try yourself, I recommend using the free software CrypTool, which is not only a great learning environment for cryptographic concepts but also provides many useful tools to attack classical ciphers.
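For readers who would rather experiment in code, here is a rough Python sketch of shotgun (random-restart) hill climbing. Everything in it is an assumption made for illustration: the function names, the swap-two-cells mutation, and especially the toy score, which merely counts the ten frequent digraphs mentioned earlier. Real tools rely on much larger n-gram statistics, and for the reasons given above this toy version should not be expected to recover the movie’s message.

```python
import random

ALPHABET = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # 25 letters, J folded into I
COMMON = ("TH", "HE", "IN", "ER", "AN", "RE", "ND", "AT", "ON", "NT")


def clean(text: str) -> str:
    """Keep letters only, upper-cased, with J mapped to I."""
    return "".join(ch for ch in text.upper() if ch.isalpha()).replace("J", "I")


def decrypt(cipher: str, key: str) -> str:
    """Decrypt with a square given as a 25-letter string read row by row."""
    pos = {ch: divmod(i, 5) for i, ch in enumerate(key)}
    out = []
    for a, b in zip(cipher[0::2], cipher[1::2]):
        (ra, ca), (rb, cb) = pos[a], pos[b]
        if ra == rb:      # same row: letters to the left
            out.extend((key[ra * 5 + (ca - 1) % 5], key[rb * 5 + (cb - 1) % 5]))
        elif ca == cb:    # same column: letters above
            out.extend((key[((ra - 1) % 5) * 5 + ca], key[((rb - 1) % 5) * 5 + cb]))
        else:             # rectangle: other pair of corners
            out.extend((key[ra * 5 + cb], key[rb * 5 + ca]))
    return "".join(out)


def score(text: str) -> int:
    """Toy metric (assumption): occurrences of a few frequent digraphs."""
    return sum(text.count(d) for d in COMMON)


def mutate(key: str, rng: random.Random) -> str:
    """Swap two randomly chosen cells of the square."""
    i, j = rng.sample(range(25), 2)
    cells = list(key)
    cells[i], cells[j] = cells[j], cells[i]
    return "".join(cells)


def shotgun_hill_climb(cipher, restarts=20, steps=2000, seed=0, last_row="VWXYZ"):
    """Random-restart hill climbing; returns (best_key, best_score)."""
    rng = random.Random(seed)
    cipher = clean(cipher)
    best_key, best_score = None, -1
    for _ in range(restarts):
        # Educated guess: start every restart with the last row fixed;
        # later mutations are still free to change it.
        head = [c for c in ALPHABET if c not in last_row]
        rng.shuffle(head)
        key = "".join(head) + last_row
        current = score(decrypt(cipher, key))
        for _ in range(steps):
            candidate = mutate(key, rng)
            s = score(decrypt(cipher, candidate))
            if s >= current:  # accept improvements and sideways moves
                key, current = candidate, s
        if current > best_score:
            best_key, best_score = key, current
    return best_key, best_score


CIPHERTEXT = "ME IK QO TX CQ TE ZX CO MW QC TE HN FB IK ME HA KR QC UN GI KM AV"
key, value = shotgun_hill_climb(CIPHERTEXT, restarts=5, steps=500)
print(key, value, decrypt(clean(CIPHERTEXT), key))
```

Swapping in a better `score` function is where most of the real work lies; the search loop itself stays the same.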
Finally, it might be interesting to know that officials found a Vigenère tableau in the room of the historical John Wilkes Booth after he had shot Abraham Lincoln. As the Confederacy used the Vigenère cipher in conjunction with cipher disks during the Civil War, the prosecution in the trial against eight Southern sympathizers sought to show that Booth’s Vigenère tableau proved the Confederate government’s involvement in Lincoln’s assassination.
How come there isn’t a letter ‘J’ in the playfair square?
because if there was a j then it wouldn’t fit into a 5×5 square and j is one of the least used letters… it makes sense if you think about it…
Never mind. I get it… you’re supposed to leave the ‘J’ out.
That’s right. As stated in footnote 6, I and J are treated as one letter.
I love the Playfair cypher and when I watched the movie I paused it and wrote down the letters. I tried to solve it.
First I tried a trigraph brute-forcer program (fairplay.exe) It didn’t work. The message is too short and the distribution of letters is all wrong.
Then I tried DEATH as the key word. I solved the cypher in 15 minutes.
Not sure that this is true:), but thanks for a post.
Truden