Cryptology - Part 1 - Phrost Byte
- Introduction -
Cryptology came from the need to hide and conceal information from prying
eyes, be it war tactics, army commands, secrets, or even directions to hidden
treasure (see the famous Beale ciphers).
The name cryptology is a combination of the Greek words kryptos (hidden) and
logos (study, science). Cryptology comprises both the enciphering (turning
readable text into text that is unreadable to those who don't know the key)
and the deciphering (turning the unreadable text back into readable text for
those with the key) of data. Cryptology can be split into two separate areas:
cryptography - dealing with techniques of concealing data based on a key, and
cryptanalysis - the deciphering of data into readable text without knowing
the key.
When party A wants to send a message to party B without party C knowing,
they hide the message by means of encryption (also called enciphering /
encipherment). When party B receives the message they decrypt it (also
called deciphering / decipherment) to read its contents. The message before
encryption is known as the plaintext, and the message after encryption is
known as the ciphertext.
Plaintext              Ciphertext              Original Plaintext
----------> Encryption -----------> Decryption ------------------>
The method of encryption and decryption is carried out using a cryptographic
algorithm, which is also known as a cipher. A cipher is a mathematical
function that both encrypts and decrypts a message with the known (secret)
key.
- Classical Ciphers -
Classical ciphers have been used throughout history, and were most
popular during the Second World War. With the invention of the computer,
their effectiveness and usefulness diminished, and they were replaced with
far superior number-based ciphers. Classical ciphers are character based.
They involve the substitution of one character with another, or the
transposition of characters with one another. Even with the advent of
computers, classical techniques can still be used effectively. They are often
incorporated in more modern cryptosystems, or applied in succession to the
same data.
- Substitution Ciphers -
As mentioned before, a substitution cipher is one in which each character in
the plaintext is replaced with another character in the ciphertext. There are
four basic types of substitution ciphers:
Monoalphabetic Substitution - each character in the plaintext is substituted
with the corresponding character from a single cipher alphabet. The cipher
alphabet is fixed throughout encryption. e.g. Caesar Cipher, ROT13.
Homophonic Substitution - a single character in the plaintext can be
represented by any one of several characters in the ciphertext.
e.g. 'e' in the plaintext could be represented by any of six different
characters in the ciphertext. This is a type of monoalphabetic cipher.
Polyalphabetic Substitution - the cipher alphabet changes
during encryption. The alphabet used can depend on the position of each
character in the plaintext, and on the key. e.g. Vigenere Cipher.
Polygram Substitution - blocks of characters are encrypted as groups. e.g.
in the Playfair Cipher, characters are grouped into pairs, and each pair is
then encrypted.
- Monoalphabetic Substitution Ciphers -
Monoalphabetic ciphers are the easiest to implement and to cryptanalyse. One
of the simplest and most common is the Caesar cipher, which was named after
Julius Caesar. The whole alphabet is simply shifted a few positions, and in
the case of the Caesar cipher it is shifted by three places:
Plain Alphabet:     A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher Alphabet:    D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
Sample Plaintext:   may the force be with you
Sample Ciphertext:  PDB WKH IRUFH EH ZLWK BRX
(The common standard is to have ciphertext in uppercase.)
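As a rough illustration, the three-place shift above can be sketched in a few
lines of Python. The function names (caesar_shift, caesar_encrypt,
caesar_decrypt) are made up for this example, and the sketch assumes plain
A-Z text, passing spaces and punctuation through untouched.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def caesar_shift(text, shift):
    """Shift every letter by `shift` places; other characters pass through."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            # Wrap around the end of the alphabet with modulo 26.
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)
    return "".join(result)

def caesar_encrypt(plaintext, shift=3):
    # Ciphertext is returned in uppercase, following the usual convention.
    return caesar_shift(plaintext, shift)

def caesar_decrypt(ciphertext, shift=3):
    # Decryption is the same shift in the opposite direction.
    return caesar_shift(ciphertext, -shift).lower()

print(caesar_encrypt("may the force be with you"))  # PDB WKH IRUFH EH ZLWK BRX
print(caesar_decrypt("PDB WKH IRUFH EH ZLWK BRX"))  # may the force be with you
```

Calling the same functions with shift=13 gives ROT13, where encrypting twice
returns the original text.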
ROT13 is similar to the Caesar cipher, but the letters are shifted (ROTated)
thirteen places. The order of the letters in a shifted cipher alphabet does
not change, so there are only 25 possible keys. Due to the small number of
possible keys, the Caesar cipher is open to a brute force attack. It would
not take long to cycle through the 25 keys until intelligible text
is produced. A superior method is to create a random cipher alphabet. This
increases the number of possible keys to more than 400 000 000 000 000 000
000 000 000 (26 factorial). For example:
Plain Alphabet:     A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher Alphabet:    S C E J T I Q L P D V B N W Z O X K Y U A F H G M R
Sample Plaintext:   may the force be with you
Sample Ciphertext:  NSM ULT IZKET CT HPUL MZA
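The brute force attack on a shifted alphabet, and the size of the key space
for a random alphabet, can both be checked with a short sketch. This is only
an illustration: shift_back and brute_force are names invented here, and a
human still has to pick the readable line out of the 25 candidates.

```python
import math
import string

ALPHABET = string.ascii_uppercase

def shift_back(ciphertext, shift):
    """Undo a Caesar-style shift of `shift` places."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - shift) % 26] if ch in ALPHABET else ch
        for ch in ciphertext.upper()
    )

def brute_force(ciphertext):
    """Return a (shift, candidate plaintext) pair for every shift 1..25."""
    return [(s, shift_back(ciphertext, s).lower()) for s in range(1, 26)]

for shift, candidate in brute_force("PDB WKH IRUFH EH ZLWK BRX"):
    print(f"{shift:2d}: {candidate}")
# The line for shift 3 reads "may the force be with you".

# A random cipher alphabet is one of 26! orderings of the letters.
print(math.factorial(26))  # 403291461126605635584000000
```

Trying 25 shifts is trivial. Trying every one of the 26! random alphabets
is not practical, which is why a smarter attack is needed for those.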
A random cipher alphabet like this is hard to remember, and encryption keys
should never be written down. An easier method is to use a key-sentence. A
simple phrase or even a word is used, and each time a new letter appears in
the key-sentence, it is written down. Once all letters in the phrase or word
have been used, the remaining letters of the alphabet are added onto the
cipher alphabet. For example:
Key-Phrase: I am Queeg. Red Dwarf backup computer.
Removing Duplicates: IAMQUEGRDWFBCKPOT
Cipher Alphabet: I A M Q U E G R D W F B C K P O T H J L N S V X Y Z
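The key-phrase construction can be sketched as follows. The helper names
key_phrase_alphabet and substitute are hypothetical, and the sketch assumes
that anything other than the letters A-Z in the phrase is simply ignored.

```python
import string

def key_phrase_alphabet(phrase):
    """Build a cipher alphabet from a key-phrase."""
    seen = []
    # Keep each letter the first time it appears in the phrase.
    for ch in phrase.upper():
        if ch in string.ascii_uppercase and ch not in seen:
            seen.append(ch)
    # Append the letters that were never used, in alphabetical order.
    for ch in string.ascii_uppercase:
        if ch not in seen:
            seen.append(ch)
    return "".join(seen)

def substitute(plaintext, cipher_alphabet):
    """Encrypt by mapping A..Z onto the given cipher alphabet."""
    table = str.maketrans(string.ascii_uppercase, cipher_alphabet)
    return plaintext.upper().translate(table)

alphabet = key_phrase_alphabet("I am Queeg. Red Dwarf backup computer.")
print(alphabet)  # IAMQUEGRDWFBCKPOTHJLNSVXYZ
print(substitute("may the force be with you", alphabet))
# CIY LRU EPHMU AU VDLR YPN
```

Note that with this phrase the letters G, X, Y and Z end up enciphered as
themselves. A key-phrase that uses some letters from the end of the alphabet
avoids leaving the tail unchanged.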
Despite the large number of possible keys that can be used, a monoalphabetic
cipher is relatively easy to break, and even easier if the plaintext
language is known. A technique known as frequency analysis is used, and it is
the foundation on which most ciphers are broken.
- Frequency Analysis -
Simply put, the most frequently occurring letter in the ciphertext will
probably represent one of the most frequently occurring letters in the
plaintext's language. The second most frequent letter in the ciphertext will
probably represent another of the most frequent letters in that language,
and so on.
Table of relative frequencies for English
(compiled by H. Beker and F. Piper, using various passages)
Letter  Percentage      Letter  Percentage
------------------      ------------------
  a        8.2            n        6.7
  b        1.5            o        7.5
  c        2.8            p        1.9
  d        4.3            q        0.1
  e       12.7            r        6.0
  f        2.2            s        6.3
  g        2.0            t        9.1
  h        6.1            u        2.8
  i        7.0            v        1.0
  j        0.2            w        2.4
  k        0.8            x        0.2
  l        4.0            y        2.0
  m        2.4            z        0.1
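To compare a ciphertext against a table like this, count its letters first.
The following is a minimal sketch (letter_frequencies is a name made up for
this example). It only counts the letters A-Z and reports each as a
percentage of the total.

```python
from collections import Counter

def letter_frequencies(ciphertext):
    """Return (letter, percentage) pairs, most frequent first."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    counts = Counter(letters)
    total = len(letters)
    return [(letter, 100.0 * n / total) for letter, n in counts.most_common()]

for letter, pct in letter_frequencies("PDB WKH IRUFH EH ZLWK BRX"):
    print(f"{letter} {pct:4.1f}")
# H comes out on top with 15.0, and H is indeed the cipher letter for 'e'.
```

A message this short is far too small for the percentages to be trusted. The
counts only start to resemble the table once the ciphertext runs to a few
hundred letters or more.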
The letters should not be matched up blindly (i.e. by assuming that the most
frequent letter in the ciphertext IS the most frequent letter in the
plaintext language). The surroundings of the letters in question should be
examined. This can be done by looking at how letters interact with one
another. For example, in English the letter Q is pretty much guaranteed to be
followed by a U, and the letter H frequently comes before the letter E (the,
then, they), but rarely after it. The most commonly occurring letter
combinations are also worth looking at, such as repeated letters, digrams
(two letter combinations) and trigrams (three letter combinations).
Repeats Order: SS, EE, TT, FF, LL, MM, OO
Digram Order: TH, HE, AN, IN, ER, RE, ES, ON, EA, TI, AT, ST, EN, ND, OR
Trigram Order: THE, AND, THA, ENT, ION, TIO, FOR, NDE
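Counting repeats, digrams and trigrams in a ciphertext can be sketched the
same way as counting single letters. The names ngram_counts and
repeated_letters are invented for this example, and the sample ciphertext is
a made-up sentence shifted three places. The sketch assumes that word spaces
have been kept, and does not count combinations across a space.

```python
from collections import Counter

def ngram_counts(ciphertext, n):
    """Count n-letter sequences that occur inside words."""
    counts = Counter()
    for word in ciphertext.upper().split():
        letters = "".join(ch for ch in word if ch.isalpha())
        for i in range(len(letters) - n + 1):
            counts[letters[i:i + n]] += 1
    return counts

def repeated_letters(ciphertext):
    """Count doubled letters (the cipher equivalents of SS, EE and so on)."""
    digrams = ngram_counts(ciphertext, 2)
    return Counter({g: c for g, c in digrams.items() if g[0] == g[1]})

sample = "WKH WKLQJ WKDW WKHB VHHN"  # hypothetical sample ciphertext
print(ngram_counts(sample, 2).most_common(1))  # [('WK', 4)]
print(ngram_counts(sample, 3).most_common(1))  # [('WKH', 2)]
print(repeated_letters(sample))                # Counter({'HH': 1})
```

Here WK is by far the most common digram and WKH the most common trigram,
which suggests TH and THE, and the doubled HH fits a doubled E.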
If the cipher text contains spaces between words, plaintext words can easily
be obtained using frequency analysis, and knowledge of common words.
Single Letter Words: A and I (these are the only two in English)
Two Letter Words: OF, TO, IN, IT, IS, BE, AS, AT, SO, WE, HE, BY, OR...
Three Letter Words: THE, AND
Once a few letters have been picked out, and partial words start to form,
decipherment proceeds rapidly.
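One way to watch partial words form is to fill in only the letters guessed
so far and leave a placeholder for the rest. This is a small sketch, with
apply_partial_key being a hypothetical helper and the three guesses below
assumed purely for illustration.

```python
def apply_partial_key(ciphertext, known):
    """Show guessed letters in lowercase and '_' for letters still unknown.

    `known` maps uppercase ciphertext letters to lowercase plaintext letters.
    """
    out = []
    for ch in ciphertext.upper():
        if ch in known:
            out.append(known[ch])
        elif ch.isalpha():
            out.append("_")
        else:
            out.append(ch)  # keep spaces and punctuation as they are
    return "".join(out)

guess = {"W": "t", "K": "h", "H": "e"}  # assumed from a likely THE
print(apply_partial_key("PDB WKH IRUFH EH ZLWK BRX", guess))
# ___ the ____e _e __th ___
```

From here "_e" is most likely "be", "he", "me" or "we", and "__th" suggests
"with" or "both", so each new guess narrows down the next.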
- Conclusion -
That ends the first part of Cryptology. Next issue I will explain Homophonic
ciphers and how to crack them, and possibly Polyalphabetic ciphers. Until
then try your hand at cracking the ciphers at the end of this ezine, which
incorporate various tricks to make them progressively harder.
- References -
Applied Cryptography - Bruce Schneier
Basic Methods of Cryptography - Jan C. A. van der Lubbe
The Code Book - Simon Singh
Xenos - Lanaki