Cryptography for everybody: Let’s Create Our Own Homophonic Substitution Cipher

In our newest video on “Cryptography for everybody”, we create a homophonic substitution cipher using CrypTool 2.


The “Substitution component” of CrypTool 2 lets you create substitution ciphers. For that, we implemented an easy-to-use syntax based on plaintext and ciphertext alphabets. An “alphabet” is just a string (some text) that consists of our “symbols”. A symbol can be one or more UTF-8 characters.

Example (simple shift cipher):
– Plaintext alphabet = “ABCDEFG…Z”
– Ciphertext alphabet = “BCDEFGH…A”

Providing these two alphabets to the substitution component creates a simple shift cipher: the ciphertext alphabet is the plaintext alphabet rotated one position to the left, so A becomes B, B becomes C, and Z wraps around to A. In the substitution component, letters are substituted based on their positions in the given alphabets. The first letter of the plaintext alphabet is substituted by the first letter of the ciphertext alphabet, the second by the second, and so on.
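
The position-based substitution described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of the CrypTool 2 component; the function name `substitute` and the choice to pass unknown characters through unchanged are assumptions of this sketch.

```python
import string

def substitute(text, plain_alphabet, cipher_alphabet):
    """Replace each symbol by the symbol at the same position in the other alphabet."""
    if len(plain_alphabet) != len(cipher_alphabet):
        raise ValueError("alphabets must have the same length")
    table = dict(zip(plain_alphabet, cipher_alphabet))
    # Assumption of this sketch: characters outside the plaintext alphabet
    # are copied unchanged.
    return "".join(table.get(ch, ch) for ch in text)

plain = string.ascii_uppercase      # A to Z
cipher = plain[1:] + plain[:1]      # B to Z, followed by A

ciphertext = substitute("HELLO", plain, cipher)
print(ciphertext)                              # IFMMP
print(substitute(ciphertext, cipher, plain))   # HELLO
```

Swapping the two alphabets, as in the last line, decrypts the text again.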

But the substitution component is much more powerful. It also lets you create alphabets consisting of “words”, and it supports alternative substitutions to create “homophones”.

Example (homophonic substitution cipher):
– Plaintext alphabet = “ABCDEFG…Z”
– Ciphertext alphabet = “[01|02][03|04]…[999|555]”

In this example, the letter A can be substituted by either 01 or 02. The brackets tell the substitution component that it should use everything inside the brackets as a single ciphertext symbol. The pipe symbol tells the component that we want to create alternatives. Using this syntax, we are able to create a homophonic substitution cipher, where one plaintext letter will be replaced by one of the defined homophones.
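
To make the bracket and pipe syntax concrete, here is a minimal Python sketch that parses such an alphabet string and encrypts with a random choice among the homophones. It only mimics the syntax as described in this post; the function names `parse_alphabet` and `encrypt`, the three-letter example alphabet, and the restriction to single-character plaintext symbols are assumptions made for illustration.

```python
import random
import re

def parse_alphabet(spec):
    """Split an alphabet string into symbols; each symbol is a list of alternatives.

    "[01|02]" becomes ["01", "02"]; a character outside brackets
    becomes a one-element list.
    """
    symbols = []
    for group, single in re.findall(r"\[([^\]]+)\]|(.)", spec, flags=re.DOTALL):
        symbols.append(group.split("|") if group else [single])
    return symbols

def encrypt(text, plain_spec, cipher_spec, rng=random):
    """Map each plaintext letter to one randomly chosen homophone."""
    plain = parse_alphabet(plain_spec)
    cipher = parse_alphabet(cipher_spec)
    if len(plain) != len(cipher):
        raise ValueError("alphabets must contain the same number of symbols")
    # Simplification: the plaintext side uses only its first alternative.
    table = {p[0]: c for p, c in zip(plain, cipher)}
    return [rng.choice(table[ch]) for ch in text]

# Shortened alphabets, for illustration only
result = encrypt("ABCA", "ABC", "[01|02][03|04][05|06]")
print(result)   # for example ['02', '03', '06', '01']; the output varies per run
```

Because the homophone is chosen at random, encrypting the same text twice usually gives two different ciphertexts.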

But we are not limited to simple two- or three-digit combinations. We can also create mappings like [MAXIMILIAN] in the plaintext alphabet and [1001] in the ciphertext alphabet. This way, we can create so-called nomenclators. How this can be done in CrypTool 2 is shown in the linked YouTube video. So if you are interested in more details, you should have a look at it 🙂
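
A nomenclator can be sketched in Python as a table that contains both single letters and whole words. The sketch below uses greedy longest-match tokenization of the plaintext, which is an assumption of this example and not necessarily how CrypTool 2 resolves overlaps. Apart from MAXIMILIAN → 1001 and A → 01|02, all code values in the table are made up.

```python
import random

def encrypt_nomenclator(text, table, rng=random):
    """Greedy longest-match encoding with a table of plaintext symbols."""
    # Try longer plaintext symbols first so whole words win over single letters.
    symbols = sorted(table, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for sym in symbols:
            if text.startswith(sym, i):
                out.append(rng.choice(table[sym]))
                i += len(sym)
                break
        else:
            raise ValueError(f"no symbol matches at position {i}")
    return out

# Hypothetical table: one code word for a name plus homophones for letters
table = {
    "MAXIMILIAN": ["1001"],
    "A": ["01", "02"],
    "I": ["17", "18"],
    "L": ["23"],
    "M": ["21", "22"],
    "N": ["27", "28"],
    "X": ["47"],
}

print(encrypt_nomenclator("MAXIMILIANAN", table))
# The name is encoded as one symbol, followed by one homophone each for A and N
```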

If you are interested in downloading the newest version of CrypTool 2 (I always recommend the nightly build, since it contains the newest components) go to https://www.cryptool.org/en/ct2/downloads

Nils

5 thoughts on “Cryptography for everybody: Let’s Create Our Own Homophonic Substitution Cipher”

  1. What is a good procedure to generate the codes? Using codes of fixed length for every symbol, or using a prefix code? Thanks.

    1. Hiho,
      From the perspective of a cryptanalyst, a prefix code is much harder to break than a code with fixed-length ciphertext symbols. Especially with historical ciphers from the Vatican, the main problem was the tokenization of the ciphers :-). So if you want to create a very difficult code, use symbols of different lengths and create a prefix code.

      When you create your codewords (ciphertext symbols), you should take care that the assignments are performed randomly. Non-random assignments of ciphertext symbols help the cryptanalysts 🙂

      My friend and colleague George Lasry created such a cipher for a challenge (with a prefix code), which was very difficult to break. Have a look at it here: https://scienceblogs.de/klausis-krypto-kolumne/the-friedman-ring-challenge-by-george-lasry/

      For Vatican ciphers have a look at this article here: https://www.tandfonline.com/doi/full/10.1080/01611194.2020.1755915

      Nils

      1. Thanks for the links. I have one more question.

        What if we develop a cryptosystem along these lines? We use a key to generate a table for homophonic substitution. Then we use an external source of randomness to choose between the different codes that can encipher the same letter.

        Now we can generate different versions of the same ciphertext. We think we have a secure system and get cocky. We send the same message to N people, enciphered with the same key but with different sources of randomness. Let’s imagine a cryptanalyst who has access to all these different versions of the ciphertext and knows that they encrypt the same plaintext. How much easier does the cryptanalyst’s work become?

        1. Hiho,
          I tried to fully understand your cipher construction; I hope I got it right.

          Having two different random number generators (RNGs) for assigning the homophones (one for each ciphertext) would not make it more secure. Let’s assume we have 4 homophones for the letter “E”. We also assume that we have perfect RNGs (let them even be true cryptographic RNGs). This means they will select each homophone for “E” with a probability of 1/4. So in both ciphertexts, the distribution of the homophones among all letters will probably be the same (when we have long ciphertexts). The ciphertexts will still be different, but many ciphertext letters may be the same by chance. You can test this in CT2 by encrypting a text twice with a homophonic substitution cipher. When the substitution component is set to “random choice”, the ciphertexts will always differ, but they will still have many equal ciphertext letters at the same positions. You can lower the number of these “collisions” of ciphertext letters by introducing MANY homophones per letter 🙂

          Btw, having more than one ciphertext encrypted with the same key is referred to as “ciphertexts in depth”. This always helps us, since our algorithms (and even humans :-)) perform better when they have more material to analyze.

          I hope this answers your question,
          Nils
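
The collision argument from the last reply can be checked with a small simulation. The Python sketch below is a simplified model, not CrypTool 2: it assumes every letter has the same number of homophones and that each one is chosen uniformly at random, and the function name `collision_rate` is made up. Under these assumptions, two encryptions of the same plaintext should agree at roughly 1/k of the positions when each letter has k homophones.

```python
import random

def collision_rate(plaintext, homophones_per_letter, rng):
    """Encrypt the same plaintext twice; return the fraction of equal positions."""
    def encrypt():
        # A ciphertext symbol is modelled as (letter, index of the chosen homophone).
        return [(ch, rng.randrange(homophones_per_letter)) for ch in plaintext]

    first, second = encrypt(), encrypt()
    same = sum(a == b for a, b in zip(first, second))
    return same / len(plaintext)

rng = random.Random(2024)
text = "E" * 100_000
for k in (2, 4, 16, 64):
    # Each printed rate should be close to 1/k
    print(k, round(collision_rate(text, k, rng), 3))
```

The rate drops as k grows, which matches the advice to use many homophones per letter.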
