1. If the two letters of the pair are found in the same column in the key-square, replace each letter with the one directly beneath it; and if one letter stands at the bottom of the column, use the one standing at the top of the same column. With the key of the figure, ha becomes OH; wa becomes UH.
2. If the two letters of the pair are found in the same row in the key-square, replace each letter with the one immediately to its right; and if one letter stands at the extreme right end of the row, use the one standing at the extreme left end of the same row (os becomes QT; st becomes TN).
3. If the two letters of the pair have a diagonal relationship in the key-square (and these are usually in the majority), consider them to be standing at the diagonally opposite corners of an imaginary small rectangle, and substitute for each letter that letter of the other diagonal which stands on the same row with itself (bu becomes AL, not LA).
The decipherment rules, as usual, are the same rules in reverse.
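The three rules can be sketched in a few lines of Python. This is only an illustrative sketch: the key-square of the figure is not reproduced in this text, so the square below is a hypothetical one built from the keyword MONARCHY (an assumption, with I and J merged and the square filled by straight horizontals), and the names `build_square`, `encipher_pair` and `decipher_pair` are our own.

```python
import string

def build_square(keyword):
    """Fill a 5 x 5 square row by row: keyword letters first, then the rest.
    Hypothetical filling; I and J share one cell."""
    alphabet = string.ascii_uppercase.replace("J", "")
    ordered = []
    for ch in keyword.upper().replace("J", "I") + alphabet:
        if ch in alphabet and ch not in ordered:
            ordered.append(ch)
    return [ordered[i:i + 5] for i in range(0, 25, 5)]

def encipher_pair(square, a, b, step=1):
    """Apply rules 1-3 to one pair; step=-1 runs the same rules in reverse."""
    a, b = a.upper(), b.upper()
    pos = {ch: (r, c) for r, row in enumerate(square) for c, ch in enumerate(row)}
    (ra, ca), (rb, cb) = pos[a], pos[b]
    if a == b:
        raise ValueError("the three rules make no provision for a doubled letter")
    if ca == cb:   # rule 1: same column, take the letter beneath (wrapping to the top)
        return square[(ra + step) % 5][ca] + square[(rb + step) % 5][cb]
    if ra == rb:   # rule 2: same row, take the letter to the right (wrapping to the left)
        return square[ra][(ca + step) % 5] + square[rb][(cb + step) % 5]
    # rule 3: rectangle, each letter is replaced by the corner on its own row
    return square[ra][cb] + square[rb][ca]

def decipher_pair(square, a, b):
    return encipher_pair(square, a, b, step=-1)

SQUARE = build_square("MONARCHY")
for pair in ("me", "st", "in"):   # one pair for each of the three rules
    print(pair, "becomes", encipher_pair(SQUARE, *pair))
```

With a different key-square the substitutes will of course differ; only the three rules are taken from the text.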
Notice that this encipherment is cyclical. So long as the order 1-2-3-4-5
is maintained in both columns and rows, it makes no difference whatever
how many columns are transferred from one side to the other, or how many
rows are transferred from top to bottom. This may be investigated in the
three equivalent squares of the figure. Notice, too, that our three rules
do not make any provision for the case in which the two letters of a pair
are the same. If, in marking off the plaintext into pairs, we encounter
a pair which is a double, it becomes necessary to dispose of this, usually
by inserting a null which will throw the second letter into the next pair.
Occasionally we find a sequence such as LESS SEVEN, in which it is necessary
to do this twice in succession:
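A minimal sketch of this marking-off, assuming that the null is X and that a trailing single letter is also completed with the null (both are assumptions of ours; the text fixes neither):

```python
def to_pairs(plaintext, null="X"):
    """Mark plaintext off into pairs, inserting a null whenever a pair would be a double.
    Limitation: a plaintext double of the null letter itself is not handled."""
    letters = [ch for ch in plaintext.upper() if ch.isalpha()]
    pairs = []
    i = 0
    while i < len(letters):
        first = letters[i]
        if i + 1 < len(letters) and letters[i + 1] != first:
            pairs.append(first + letters[i + 1])   # ordinary pair
            i += 2
        else:
            pairs.append(first + null)             # double (or lone last letter)
            i += 1                                 # second letter moves to the next pair
    return pairs

print(to_pairs("LESS SEVEN"))
```

In LESS SEVEN the null is needed twice in succession, since the three S's would otherwise fall together whichever way the first double is broken.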
The foregoing description and rules are those of the original Playfair cipher. Many encipherers, however, will vary the rules, especially the one concerning doubles; perhaps one letter will be omitted or replaced with a null; sometimes one double is replaced with another; occasionally an encipherer will separate every doubled letter in the message whether or not this is necessary. We meet, too, with variant forms. A 24-letter alphabet will be used in a 4 x 6 rectangle, or a 27-letter alphabet (with character &) will be used in a 3 x 9 rectangle. One variation, attributed to W. W. Rouse Ball, uses the standard key-square with the standard rule 3, but varies the two rules for lineal encipherment. Rule 1: If the two letters of the pair stand in a column, use the two letters immediately to their right. Rule 2: If they stand in a row, use the two letters immediately beneath them. In all of these cases, presuming the method to be known, the degree of difficulty would be the same as if the standard system had been used; otherwise, it is only necessary to keep in mind the fact that variations occasionally occur. We will give our attention, then, to the standard encipherment. But before entering into the subject of decryptment, let us look carefully at the system itself.
Primarily, we have a fixed substitution. No plaintext pair ever has
more than one substitute pair; and no substitute pair ever changes its
original. We might say that the Playfair is, in effect, a "simple substitution"
based on an "alphabet" of 600 pairs; and, just as in simple substitution
proper, the Playfair cryptograms will very often contain long repeated
sequences which represent whole words. Again, the reversal of a plaintext
pair means the reversal of its substitute pair, and vice versa, so that
the discovery of any one equation (as th = OM) always means
the discovery of another (ht = MO).
Naturally, then, those letters which, in the key-square, are standing on the same row or in the same column with the normally frequent letters will have high frequencies in the cryptograms; in fact, the two or three which predominate in a given cryptogram will practically always be letters which, in the key-square, were standing in the same row or column with E or T (in English). Moreover, if any letter has been identified once as the substitute for E, there is a most excellent chance that it can be identified again as the substitute for E. Say, for instance, that CF has been identified, or assumed, as the substitute for er. This means that C is individually the substitute for e, and when another pair CT is found to be of some frequency, it can be tried as the substitute for en, es, et, and so on. Single-letter frequencies, then, will play an important part in the decryptment of the Playfair. But the process will rest fundamentally upon the frequencies of digrams, and will follow, in general, three steps repeated over and over in the same rotation:
1. Certain pairs are identified, or assumed, as the substitutes for certain digrams.
2. These pairs and their supposed originals are set together in such a way as to start the reconstruction of the key-square.
3. Substitutions are made on the cryptogram and further pairs are identified.
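Two of the properties on which these steps rest can be checked mechanically: a plaintext letter has only a handful of possible substitutes (its four row-companions and the letter beneath it), and reversing a pair reverses its substitute. The sketch below assumes a hypothetical square built from the keyword MONARCHY, not the key-square of the figure, and the function names are our own.

```python
def build_square(keyword):
    """Hypothetical 5 x 5 square: keyword first, remaining letters after, I/J merged."""
    alphabet = "ABCDEFGHIKLMNOPQRSTUVWXYZ"
    ordered = []
    for ch in keyword.upper().replace("J", "I") + alphabet:
        if ch in alphabet and ch not in ordered:
            ordered.append(ch)
    return [ordered[i:i + 5] for i in range(0, 25, 5)]

def encipher_pair(square, a, b):
    """Standard rules 1-3; callers are expected never to pass a doubled letter."""
    pos = {ch: (r, c) for r, row in enumerate(square) for c, ch in enumerate(row)}
    (ra, ca), (rb, cb) = pos[a], pos[b]
    if ca == cb:
        return square[(ra + 1) % 5][ca] + square[(rb + 1) % 5][cb]
    if ra == rb:
        return square[ra][(ca + 1) % 5] + square[rb][(cb + 1) % 5]
    return square[ra][cb] + square[rb][ca]

def substitutes_for(square, letter):
    """Every cipher letter that can stand for `letter` as the first member of a pair."""
    others = [ch for row in square for ch in row if ch != letter]
    return {encipher_pair(square, letter, other)[0] for other in others}

def reversal_holds(square):
    """True if enciphering ba always gives the reverse of enciphering ab."""
    letters = [ch for row in square for ch in row]
    return all(encipher_pair(square, b, a) == encipher_pair(square, a, b)[::-1]
               for a in letters for b in letters if a != b)

SQUARE = build_square("MONARCHY")
print(sorted(substitutes_for(SQUARE, "E")), reversal_holds(SQUARE))
```

Because every plaintext letter is so restricted, a cipher letter once identified as a substitute for e is a strong candidate for e wherever else it occurs.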
When probable words exist, the work of solution becomes more or less
mechanical, as we shall see. At worst, we may begin at the beginning of
the cryptogram and work straight through until we find the word. But very
often, a really probable word is repeated, and even repeated more than
once. In the latter case, we are sure to find the long repeated sequence
in the cryptogram; while a word repeated only once may have been divided
into two different sets of pairs, as: ex-ec-ut-io-n and e-xe-cu-ti-on.
But notice, here, what the two encipherments would be, using our key-square
of the figure:
Granting an absence of probable words, the difficulties of solution are almost entirely dependent upon the amount of material available. A pair-count will be made in the usual chart-form (but only on the divided pairs, and not "straddling" from one pair to another), and pairs will be identified by frequency, by the frequency with which they are found reversed, by the possibility of their letter-combinations in a key-square, and so on. We will not attempt, here, to go into a detailed demonstration, since every case is individual in its details, and success, in all of them, is dependent largely upon the decryptor's own persistence. But in order to see sketchily what some of the routine might be, we will make use of the very short example shown in Fig. 167.
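The pair-count itself is simple bookkeeping. A minimal sketch follows, counting only the divided pairs (no straddling) and listing the pairs that also occur reversed; the function names and the toy cryptogram are our own inventions, not the example of Fig. 167.

```python
from collections import Counter

def pair_counts(cryptogram):
    """Count the divided pairs only: letters 1-2, 3-4, 5-6, and so on."""
    letters = [ch for ch in cryptogram.upper() if ch.isalpha()]
    return Counter("".join(letters[i:i + 2]) for i in range(0, len(letters) - 1, 2))

def reversal_counts(counts):
    """Map each (pair, reversed pair) found in the count to their two frequencies."""
    found = {}
    for pair, n in counts.items():
        rev = pair[::-1]
        if pair < rev and rev in counts:   # report each reversal once
            found[(pair, rev)] = (n, counts[rev])
    return found

counts = pair_counts("GATLM ZCLRQ TXAGG A")   # toy text for illustration only
print(counts.most_common(3))
print(reversal_counts(counts))
```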
In the usual case, there has been a preliminary frequency count on single letters in order to find out what the cipher is. The appearance of this frequency count has more or less negatived the possibility of simple substitution, and the next step has been a Kasiski tabulation in the hope of finding a period. This tabulation, in any pair-system, will bring out a predominant factor 2, and, since many of the supposed digram systems actually do produce periods, the two supposed alphabets would have been examined for that possibility. But pair-systems, as a rule, will leave a wide-open trail: Repeated sequences, in the majority of cases, will include an even number of letters (that is, an exact number of pairs), and will begin largely at the odd serial positions (that is, at the beginnings of pairs). The Playfair shows this a little less distinctly than some of the others, because of the fact that substitutes for single letters are so limited in number.
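The "wide-open trail" can be looked for directly: list the repeated sequences with their serial positions and see how many begin at odd positions. The following is a rough sketch under our own assumptions (a fixed sequence length of four letters, invented function names, and a toy cryptogram).

```python
from collections import defaultdict

def repeated_sequences(cryptogram, length=4):
    """Return each repeated sequence of the given length with its 1-based serial positions."""
    letters = "".join(ch for ch in cryptogram.upper() if ch.isalpha())
    where = defaultdict(list)
    for i in range(len(letters) - length + 1):
        where[letters[i:i + length]].append(i + 1)
    return {seq: pos for seq, pos in where.items() if len(pos) > 1}

def odd_start_share(repeats):
    """Fraction of repeated-sequence occurrences that begin at an odd serial position."""
    starts = [p for pos in repeats.values() for p in pos]
    return sum(p % 2 for p in starts) / len(starts) if starts else 0.0

repeats = repeated_sequences("ABCDXYABCDQQRSABCD")   # toy text for illustration only
print(repeats, odd_start_share(repeats))
```

A share well above one-half, together with intervals between repetitions that are all even, points to a pair-system rather than to a true period of 2.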
It is sometimes said of the Playfair that it can be distinguished from other ciphers by (1) the fact that cryptograms contain an even number of letters, (2) the fact that only 25 letters are represented in its general frequency count, (3) the fact that when the cryptogram is marked into pairs, no pair will be a doubled letter, and (4) the presence of long repeated sequences at irregular intervals. As conclusive evidence, these are debatable points, but all are good supporting evidence, provided a proper confession can be extracted from the pair-chart: (5) When the cryptogram has been marked off into pairs, and the pairs counted, the result should bear much resemblance to a count made on the same number of normal digrams. Even on an extremely long cryptogram, over half of the cells will be blank, since a normal text never uses more than about 300 of the possible 676 combinations; there will be a certain group of predominant pairs followed by a group of moderate frequencies; and, with any appreciable length, there will be a generous sprinkling of reversals. In preparing the cryptogram, a great deal of convenience may be had by placing frequency figures beside their digrams, by marking long repeated sequences, noticeable reversals, and so on; and many persons like to list the most prominent pairs and the most prominent reversals.
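Points (1) to (3) are easily tabulated, together with the number of distinct pairs for point (5). A small sketch, with an invented function name and toy input; the results are supporting evidence only, as remarked above.

```python
def playfair_indicators(cryptogram):
    """Tabulate the simple supporting tests for a suspected Playfair cryptogram."""
    letters = [ch for ch in cryptogram.upper() if ch.isalpha()]
    pairs = ["".join(letters[i:i + 2]) for i in range(0, len(letters) - 1, 2)]
    return {
        "even_length": len(letters) % 2 == 0,                   # point (1)
        "distinct_letters": len(set(letters)),                  # point (2): expect 25 or fewer
        "no_doubled_pair": all(p[0] != p[1] for p in pairs),    # point (3)
        "distinct_pairs": len(set(pairs)),                      # point (5): cells used in the chart
    }

print(playfair_indicators("GATLM ZCLRQ TX"))
```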
The Playfair has also a rather characteristic frequency count. Notice in the figure, where the general count has been rearranged in the order of decreasing frequencies, that the gradation from high to low is somewhat less even than in a periodic; frequency 8, for instance, is skipped altogether, and we have a sort of modified high-frequency group. Sometimes we find from one to three letters of great prominence before the downward gradation begins.
Concerning the "chart of probable position," most solvers prefer simply to keep this in mind, while others will actually set it down and make it the basis of their solution. With 176 letters of text, the average frequency of letters is about 7 (176 divided by 25). Any letter whose frequency is above that average is very likely to have been standing on the same row or in the same column of the key-square as E or T, and the two or three which lead the list are practically sure to have been substitutes for one of these two letters.
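The arithmetic is trivial, but for a longer cryptogram it may be convenient to have the above-average letters listed automatically. A sketch under our own naming, with a toy input:

```python
from collections import Counter

def above_average(cryptogram, alphabet_size=25):
    """Return the average letter frequency and the letters exceeding it, highest first."""
    letters = [ch for ch in cryptogram.upper() if ch.isalpha()]
    average = len(letters) / alphabet_size   # e.g. 176 / 25 = 7.04
    counts = Counter(letters)
    leaders = [(ch, n) for ch, n in counts.most_common() if n > average]
    return average, leaders

average, leaders = above_average("A" * 10 + "B" * 15)   # toy text for illustration only
print(average, leaders)
```

The letters returned are merely candidates for the rows and columns of E and T; the identification still rests on the digrams.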
With cryptograms of the present length, or even with those of 400 to
600 letters, it is very uncertain as to whether or not the leading pair
will represent th, or the leading reversal
Having placed frequency figures beside their digrams, find those points
at which two pairs of high frequency are consecutive (not necessarily a
repeated sequence), and attempt to identify these tetragrams as frequent
tetragrams of the language: ther, ered, ened, tion,
atio, ment, beca, and so on. We have one here, provided
a frequency of 3 can be considered important:
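The search for such points can be sketched as follows; the threshold of 3 follows the remark above, while the function name and the toy cryptogram are our own.

```python
from collections import Counter

def tetragram_candidates(cryptogram, min_freq=3):
    """List (pair index, four letters) wherever two consecutive pairs are both frequent."""
    letters = [ch for ch in cryptogram.upper() if ch.isalpha()]
    pairs = ["".join(letters[i:i + 2]) for i in range(0, len(letters) - 1, 2)]
    counts = Counter(pairs)
    hits = []
    for i in range(len(pairs) - 1):
        if counts[pairs[i]] >= min_freq and counts[pairs[i + 1]] >= min_freq:
            hits.append((i, pairs[i] + pairs[i + 1]))
    return hits

print(tetragram_candidates("ABCD EF ABCD GH AB IK CD"))   # toy text for illustration only
```

Each hit is then compared with the frequent tetragrams of the language; the comparison itself remains a matter of judgment.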
Another good demonstration, provided the student has access to it in
his public library, is found in Colonel Parker Hitt's "Manual for the Solution
of Military Ciphers." This manual is an elementary work intended for the
preliminary instruction of soldiers, and the attack is made on the assumption
of a key-square filled by straight horizontals. With a square of the kind
we are using, most of the vowels and high-frequency letters will be standing
on the upper two rows, and letters on the first two or three rows will
have a much higher frequency than those of the last two or three. In fact,
it can often be detected that the letters
Colonel Hitt's demonstration begins with the usual pair-count, made on a chart. He selects from this chart the (approximately) ten letters having the widest variety of contact, including, if necessary, the vowel or so which would have to be present in a key-word, and these letters are assumed to have stood on the upper rows of the key-square. The remaining (approximately) fifteen letters are then set up in their alphabetical sequence and are assumed to have stood on the lower rows in about that order. They are not, of course, known to be correctly placed; the set-up merely gives a concrete idea as to where letters ought to have stood. Then, following the military case of abundant material, it is assumed that the leading pair will represent th (sure to be followed often by e), or, if th is not the leader, then he (sure to be preceded often by t). With a few obvious identifications made in the usual way, letters begin to arrange themselves on the upper rows, and a gradual adjustment takes place which corrects the few wrong assumptions of the lower rows, so that the key-square is restored far in advance of solution. When a short keyword has been used, it is not impossible, by following Colonel Hitt's suggestions, to pick out all of the key-letters, guess the word, and decipher with the key-square. Other demonstrations, based, respectively, on French and Italian language characteristics, can be found in General Givierge's Cours de cryptographie and in General Sacco's Manuale di crittografia. (In the French work, the cipher is referred to simply as "orthogonal and diagonal substitution.")
It will be seen from the foregoing that the initial difficulty lies in the correct identification of the first few pairs, and this, in a short cryptogram, is no small difficulty. By whatever means it is found possible to make these first tentative identifications, the operation which is to admit or disprove their correctness is step No. 2, in which we set them up as equations and then attempt to replace them into their connected relationships in the key-square. If this cannot be done, they cannot be correct; and, on the other hand, it would be an extremely rare case indeed in which we could combine as many as five or six such equations into one framework and then find them incorrectly matched. To understand "equations," suppose we look at Fig. 168.
Assuming that the beginning pairs of our cryptogram,
To learn whether or not the word "condemnation" does (or could) occur
here, we proceed as in Fig.
We are safe, now, in making substitutions on the cryptogram. This means
not only the five pairs originally identified, together with their reversals
and possible reciprocals, but all others which can be derived from combination
16, such as
Once a beginning is made, the cipher is broken, though just how rapidly we may proceed with the solution depends chiefly upon the manner in which the square has been filled. The presence of alphabetical sequences (either horizontal or vertical) will often enable us to complete the key-square independently of the cryptogram; but the badly mixed square must usually be built up to the very end, and we must sometimes be satisfied with one of the "equivalents" in place of the square originally used. If the student cares to make a fresh beginning of his own, this same cryptogram contains the word RECONSTRUCT.
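That an "equivalent" square serves as well as the original can be verified by shifting rows and columns cyclically and comparing every encipherment. The sketch below assumes a hypothetical square built from the keyword MONARCHY (not the key-square of the figure); the function names are our own.

```python
def build_square(keyword):
    """Hypothetical 5 x 5 square: keyword first, remaining letters after, I/J merged."""
    alphabet = "ABCDEFGHIKLMNOPQRSTUVWXYZ"
    ordered = []
    for ch in keyword.upper().replace("J", "I") + alphabet:
        if ch in alphabet and ch not in ordered:
            ordered.append(ch)
    return [ordered[i:i + 5] for i in range(0, 25, 5)]

def encipher_pair(square, a, b):
    """Standard rules 1-3; callers are expected never to pass a doubled letter."""
    pos = {ch: (r, c) for r, row in enumerate(square) for c, ch in enumerate(row)}
    (ra, ca), (rb, cb) = pos[a], pos[b]
    if ca == cb:
        return square[(ra + 1) % 5][ca] + square[(rb + 1) % 5][cb]
    if ra == rb:
        return square[ra][(ca + 1) % 5] + square[rb][(cb + 1) % 5]
    return square[ra][cb] + square[rb][ca]

def shift_square(square, rows=0, cols=0):
    """Transfer `rows` rows from top to bottom and `cols` columns from left to right."""
    moved = square[rows:] + square[:rows]
    return [row[cols:] + row[:cols] for row in moved]

def equivalent(square_a, square_b):
    """True if the two squares encipher every pair of distinct letters alike."""
    letters = [ch for row in square_a for ch in row]
    return all(encipher_pair(square_a, a, b) == encipher_pair(square_b, a, b)
               for a in letters for b in letters if a != b)

SQUARE = build_square("MONARCHY")
print(equivalent(SQUARE, shift_square(SQUARE, rows=2, cols=3)))
```

A reconstructed square may therefore differ from the encipherer's by such a shift and still decipher the cryptogram correctly.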
The Playfair has been, in its day, a very effective cipher, and is still
good for many purposes. It can be rendered much safer if subjected to the
process called seriation. This process may be examined in Fig.
171. Here, the text is "Send diamonds to Amsterdam Monday," and the
agreed seriation index is 5. The text is written in pairs of five-letter
lines, so that each ten-letter segment forms five vertical pairs, SI,
EA, NM, etc., and these are the pairs which undergo the digram
encipherment (notice the treatment of the doubled S in the second
group). If the key-square is that of Fig.
166, the first ten-letter segment is enciphered QK, UF,
TG, SA, RS, and the cryptogram may be taken off in
that order; or by taking the upper and lower lines separately:
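The preparation of the text for seriation can be sketched as below. Only the formation of the vertical pairs is shown; the encipherment with the key-square of Fig. 166 is not attempted, since that square is not reproduced here. The treatment of a doubled letter (a null X placed in the lower line, so that the offending letter moves one place to the right) is an assumption of ours, as are the function names.

```python
def seriate(plaintext, index=5, null="X"):
    """Write the text in pairs of `index`-letter lines, returning (upper, lower) lines.
    Assumed rule: when a lower letter would match the one above it, insert a null.
    Limitation: a plaintext X meeting the null itself is not handled."""
    letters = [ch for ch in plaintext.upper() if ch.isalpha()]
    segments = []
    i = 0
    while i < len(letters):
        upper = letters[i:i + index]
        i += len(upper)
        upper += [null] * (index - len(upper))   # complete a short final line
        lower = []
        for top in upper:
            if i < len(letters) and letters[i] != top:
                lower.append(letters[i])
                i += 1
            else:
                lower.append(null)               # double avoided, or text exhausted
        segments.append(("".join(upper), "".join(lower)))
    return segments

def vertical_pairs(segments):
    """Read each segment downward to obtain the pairs which undergo encipherment."""
    return [[u + l for u, l in zip(upper, lower)] for upper, lower in segments]

segments = seriate("Send diamonds to Amsterdam Monday", index=5)
for upper, lower in segments:
    print(upper)
    print(lower)
    print()
print(vertical_pairs(segments)[0])
```

The first segment yields the pairs SI, EA, NM, and so on, as in the text; the handling of the second group depends on the assumed rule for the doubled S.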