A Null cipher, at least in American Cryptogram Association (ACA) usage, is a type of concealment cipher. The idea is to write normal-sounding text that hides the true message, which the recipient extracts using a secret key or “rule.” The rules can be imaginative and, ironically, there are no real rules about what the rule can be. As a simple example, if the key is 123, the word ‘jog’ could be enciphered as ‘just too big’, taking the first letter of ‘just’, the second letter of ‘too’, and the third letter of ‘big’. The key repeats for the length of the plaintext.
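To make the numeric-key rule concrete, here is a minimal sketch of the recipient's side in Python. The function name `decode_null` is my own invention for illustration, and it assumes every word is at least as long as its key digit requires.

```python
from itertools import cycle

def decode_null(ciphertext: str, key: str) -> str:
    """Recover the hidden message from a numeric-key null cipher.

    The i-th word contributes the letter at the (1-based) position
    given by the i-th key digit; the key repeats as needed.
    """
    # Keep letters only, so that "focused." is treated as "focused".
    words = ["".join(ch for ch in w if ch.isalpha())
             for w in ciphertext.lower().split()]
    words = [w for w in words if w]  # drop tokens that were pure punctuation
    return "".join(word[int(digit) - 1]
                   for word, digit in zip(words, cycle(key)))

print(decode_null("just too big", "123"))  # jog
```

A word shorter than its key digit raises an `IndexError` here, which is a reasonable signal that the text does not fit the key.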
Just for fun I decided to try to write a program that would generate a null cipher given a plaintext and a numeric key like the one in the example. My first attempt was to load a large list of English words and randomly choose words that fit the rule, followed by several rounds of substituting words that fit more naturally with their neighbors. When I tested it, the first step worked instantly, but it produced a meaningless jumble of words. It tended to choose long words simply because there are more long words than short ones, whereas in natural speech or writing we use many more short words. This method might produce ‘justification polemic signatory’ to encipher the above example. The subsequent rounds did tend to replace these words with shorter, more common ones, but they took forever because testing how frequently pairs of words occur is a very time-consuming computing task. I never let it run to the end, and the intermediate ciphertext was still not natural-sounding.
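The first step of that attempt, picking a random word that fits each letter, might look something like the sketch below. This is not the original program: the tiny `WORDS` list is a hypothetical stand-in for a large English word list, the names `fits` and `random_null` are made up for illustration, and the later smoothing rounds are left out entirely.

```python
import random
from itertools import cycle

# Hypothetical stand-in for a large English word list.
WORDS = ["just", "jam", "too", "polemic", "big",
         "signatory", "justification", "hog", "sag"]

def fits(word: str, letter: str, pos: int) -> bool:
    """True if `word` has `letter` at 1-based position `pos`."""
    return len(word) >= pos and word[pos - 1] == letter

def random_null(plaintext: str, key: str, words=WORDS, rng=random) -> str:
    """Choose, for each plaintext letter, a random word that fits the key."""
    letters = [c for c in plaintext.lower() if c.isalpha()]
    chosen = []
    for letter, digit in zip(letters, cycle(key)):
        candidates = [w for w in words if fits(w, letter, int(digit))]
        if not candidates:
            raise ValueError(f"no word has {letter!r} at position {digit}")
        chosen.append(rng.choice(candidates))
    return " ".join(chosen)

print(random_null("jog", "123"))  # output varies from run to run
```

Because every fitting word is equally likely, a realistic word list will mostly yield long, uncommon words, which is the jumble described above.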
So I changed my strategy. Instead of using word lists, I sought a source that already had common words in a natural-sounding order: literature. I had a large file of plaintext books, mostly classic novels downloaded from Project Gutenberg. This file had already been processed to have no punctuation, exactly one space between words, and all lower-case letters, to facilitate computer searches and comparisons. The second version of my program reads a few dozen lines at a time and scans them to find N words in a row that meet the criteria. I found that it was usually easy to find passages that would satisfy a four-word stretch at a time, and often a five-word stretch, but no more. Keys using smaller digits were more productive than keys with eights or nines in them.
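The scanning idea can be sketched as a sliding window over the words of the text. This is only an illustration under my own assumptions: the function name `find_passages` is hypothetical, the one-line `sample` stands in for the large book file, and reading the file in chunks of a few dozen lines is omitted.

```python
def find_passages(corpus_words, letters, positions):
    """Return every run of consecutive words that spells `letters`,
    taking from the k-th word the letter at 1-based positions[k]."""
    n = len(letters)
    hits = []
    for start in range(len(corpus_words) - n + 1):
        window = corpus_words[start:start + n]
        if all(len(w) >= p and w[p - 1] == c
               for w, c, p in zip(window, letters, positions)):
            hits.append(" ".join(window))
    return hits

# Stand-in for the normalized corpus: lower case, no punctuation, single spaces.
sample = "the average individual likes stories of what he reads"
print(find_passages(sample.split(), "hail", [2, 1, 4, 1]))
# ['the average individual likes']
```

The `len(w) >= p` test hints at why large key digits are probably less productive: a word must have at least eight or nine letters even to be considered, and such words are comparatively rare in ordinary prose.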
For example, when I enciphered ‘hail to the chief’ with the key 2141, my program produced, “What a delightful lazy stream of that history we could gather if we focused.” This is actually a patchwork of three 5-word outputs, and I had to modify a couple of the words. It’s quite normal-sounding, but it doesn’t make much sense. If I were to submit it, I’d look for better words, most likely for “lazy stream”. The program produced 13 passages for the first five letters, ‘hailt’. I extended one of those to “The average individual likes stories of what he….” The program runs very fast and can be modified to fit other rules, not just numeric keys.
I don’t plan to submit any Null ciphers from this program, but I wanted to share it to show how ciphers in general, and the ACA in particular, are a rich playground for recreational computing. I invite others to write a better null cipher generator and share their results here.