# Tridigital cipher solving algorithm continued

In my last post, I mentioned key pruning. What I mean by that is halting the recursion (and allowing it to go back a level and continue with another word) based on the nature of the keyblock. In particular, the block always has three rows. This means any given digit may have at most three letter equivalents. Thus I put a test in the recursion routine to look for conflicts of this sort. If the word being tested passes the first conflict test, e.g. THEIRSOME in the last post, it then goes through a routine that counts how many letters are represented by each digit. THEIRSOME is represented by 107735607. The 7 enciphers E and I while the 0 enciphers H and M. Since no digit represents four or more letters, this is a possible solution so far. This is not surprising since we’re only on level 2. If any digit had represented four different letters, the recursion would have ended at that level, the test word SOME would have been rejected, and the next word in the four-letter word list would have been tried.
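To make the counting step concrete, here is a minimal Python sketch of that check. It is not my actual program; the function names `letters_per_digit` and `passes_keyblock_test` are made up for illustration, and it assumes the candidate letters and the cipher digits are passed in as two strings of equal length.

```python
from collections import defaultdict


def letters_per_digit(digits, letters):
    """Collect the distinct letters each cipher digit stands for."""
    mapping = defaultdict(set)
    for digit, letter in zip(digits, letters):
        mapping[digit].add(letter)
    return mapping


def passes_keyblock_test(digits, letters, rows=3):
    """Return False as soon as any digit needs more letters than the block has rows."""
    if len(digits) != len(letters):
        raise ValueError("digits and letters must be the same length")
    mapping = letters_per_digit(digits, letters)
    return all(len(group) <= rows for group in mapping.values())


# The level-2 example from the text: 7 stands for E and I, 0 for H and M.
example = letters_per_digit("107735607", "THEIRSOME")
still_possible = passes_keyblock_test("107735607", "THEIRSOME")
```

In a recursive solver, a `False` result is the signal to drop the current test word and move on to the next one in the word list.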

This example is so short that there are thousands of valid solutions (which is why Tridigital is a hobby cipher, not a real-life one). One that my program found was “had some by their,” which even makes sense as a phrase. But with more words, especially longer ones, incorrect combinations soon force one or more digits to represent more than three letters. This typically happens at recursion level three or four.

The other technique I use is scoring. Since all solutions that pass the conflict testing consist of valid words, the usual scoring techniques such as tetragram frequency or word list scoring don’t work. So I put each potential solution array back into its original word order and test pairs of adjacent words for frequency. I use Google N-gram data to determine this. The better the score, the higher in the display the solution is placed. Although this doesn’t speed up processing per se, it makes it much easier for me to spot a correct solution, or at least a likely correct segment of the solution, early on. I don’t have to keep scrolling through dozens or even thousands of possible solutions as the program runs. The best ones are right up at the top.
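The word-pair scoring can be sketched roughly as follows. This is an illustration under stated assumptions, not my actual code: the counts in `BIGRAM_COUNTS` are invented placeholders rather than real Google N-gram figures, the function names are hypothetical, and summing the logarithms of the pair counts is just one reasonable way to combine them.

```python
import math

# Invented counts for illustration only; real values would come from
# Google N-gram data.
BIGRAM_COUNTS = {
    ("had", "some"): 5000,
    ("some", "by"): 300,
    ("by", "their"): 8000,
}


def restore_order(words, positions):
    """Put words found in search order back into their plaintext positions."""
    restored = [None] * len(words)
    for word, position in zip(words, positions):
        restored[position] = word
    return restored


def pair_score(words, counts, floor=1):
    """Sum log counts of adjacent word pairs; unseen pairs count as `floor`."""
    return sum(math.log(counts.get(pair, floor)) for pair in zip(words, words[1:]))


def rank_solutions(solutions, counts):
    """Sort candidate solutions so the best-scoring ones come first."""
    return sorted(solutions, key=lambda words: pair_score(words, counts), reverse=True)


# Words as the recursion might find them, with their original positions.
candidate = restore_order(["their", "some", "had", "by"], [3, 1, 0, 2])
ranked = rank_solutions([["had", "sole", "by", "there"], candidate], BIGRAM_COUNTS)
```

Here the second candidate list is made up to show the sorting; any solution whose adjacent pairs are common in the frequency data floats to the top of `ranked`.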

As an example, the demo problem produces these possible solutions immediately: