Cryptology - Part 1 - Phrost Byte

- Introduction -

Cryptology came from the need to hide and conceal information from prying eyes, be it war tactics, army commands, secrets, or even directions to hidden treasure (see the famous Beale ciphers). The name cryptology is a combination of the Greek words kryptos (hidden) and logos (study, science). Cryptology comprises both the enciphering (turning readable text into text that is unreadable to those who don't know the key) and the deciphering (turning the unreadable text back into readable text for those with the key) of data.

Cryptology can be split into two separate areas: cryptography - dealing with techniques of concealing data based on a key, and cryptanalysis - the deciphering of data into readable text without knowing the key.

When party A wants to send a message to party B without party C being able to read it, they hide the message by means of encryption (also called enciphering or encipherment). When party B receives the message they decrypt it (also called deciphering or decipherment) to read its contents. The message before encryption is known as the plaintext, and the message after encryption is known as the ciphertext.

   Plaintext                Ciphertext                Original Plaintext
  ----------> Encryption ------------> Decryption -------------------->

The method of encryption and decryption is carried out using a cryptographic algorithm, which is also known as a cipher. A cipher is a mathematical function that both encrypts and decrypts a message with the known (secret) key.

- Classical Ciphers -

Classical ciphers have been used throughout history, and were most popular during the Second World War. With the invention of the computer their effectiveness and usefulness diminished, and they were replaced with far superior number-based ciphers.

Classical ciphers are character based. They involve the substitution of one character with another, or the transposition of characters with one another. Even with the advent of computers, classical ciphers can still be used effectively.
They are often incorporated in more modern cryptosystems, or several are applied in succession to the same data.

- Substitution Ciphers -

As mentioned before, a substitution cipher is one in which each character in the plaintext is replaced with another character in the ciphertext. There are four basic types of substitution ciphers:

Monoalphabetic Substitution - each character in the plaintext is replaced by the corresponding character of a cipher alphabet. The cipher alphabet is fixed throughout encryption. e.g. Caesar Cipher, ROT13.

Homophonic Substitution - a single character in the plaintext can be represented by one of several characters in the ciphertext. e.g. 'e' in the plaintext can be represented by any of six different characters in the ciphertext. This is a type of monoalphabetic cipher.

Polyalphabetic Substitution - one in which the cipher alphabet changes during encryption. The alphabet used can depend on the position of each character in the plaintext, and on the key. e.g. Vigenere Cipher.

Polygram Substitution - characters are encrypted in blocks rather than one at a time. e.g. in the Playfair Cipher, characters are grouped into pairs, and each pair is then encrypted.

- Monoalphabetic Substitution Ciphers -

Monoalphabetic ciphers are the easiest to implement and to cryptanalyse. One of the simplest and most common is the Caesar cipher, which was named after Julius Caesar. The whole alphabet is simply shifted a few positions, and in the case of the Caesar cipher it was by three places:

  Plain Alphabet:    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  Cipher Alphabet:   D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

  Sample Plaintext:  may the force be with you
  Sample Ciphertext: PDB WKH IRUFH EH ZLWK BRX

(The common standard is to have ciphertext in uppercase.)

ROT13 is similar to the Caesar cipher, but the letters are shifted (ROTated) thirteen places. The order of the characters in the Caesar and ROT13 cipher alphabets does not change, so there are only 25 possible keys.
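
The shifted alphabet described above can be sketched in a few lines of Python. This is only an illustration, not part of the original article: the function name caesar_shift is made up for this example, and it assumes plain A-Z text, passing spaces and punctuation through untouched.

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Shift each letter of `text` by `shift` places, wrapping past Z.

    Hypothetical helper for illustration. Output is uppercase (the usual
    convention for ciphertext); anything that is not A-Z is left unchanged.
    """
    alphabet = string.ascii_uppercase
    k = shift % 26
    # Build the cipher alphabet by rotating the plain alphabet k places.
    shifted = alphabet[k:] + alphabet[:k]
    return text.upper().translate(str.maketrans(alphabet, shifted))


ciphertext = caesar_shift("may the force be with you", 3)
print(ciphertext)                    # PDB WKH IRUFH EH ZLWK BRX
print(caesar_shift(ciphertext, -3))  # MAY THE FORCE BE WITH YOU
print(caesar_shift("HELLO", 13))     # URYYB (ROT13)
```

Decryption is the same operation with the shift negated. With ROT13 the shift is half the alphabet, so applying it twice returns the original text.
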
Due to the small number of possible keys, the Caesar cipher is open to a brute force attack. It would not take long to cycle through all 25 keys until intelligible text appears. A superior method is to create a random cipher alphabet. This increases the number of possible keys to more than 400 000 000 000 000 000 000 000 000 (26 factorial). For example:

  Plain Alphabet:    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
  Cipher Alphabet:   S C E J T I Q L P D V B N W Z O X K Y U A F H G M R

  Sample Plaintext:  may the force be with you
  Sample Ciphertext: NSM ULT IZKET CT HPUL MZA

A random cipher alphabet like this is hard to remember, and encryption keys should never be written down. An easier method is to use a key-sentence. A simple phrase or even a single word is chosen, and each letter is written down the first time it appears in the key-sentence. Once all the letters in the phrase or word have been used, the remaining letters of the alphabet are added, in order, to the end of the cipher alphabet. For example:

  Key-Phrase:          I am Queeg. Red Dwarf backup computer.
  Removing Duplicates: IAMQUEGRDWFBCKPOT
  Cipher Alphabet:     I A M Q U E G R D W F B C K P O T H J L N S V X Y Z

Even with the large number of possible keys that can be used, a monoalphabetic cipher is relatively easy to break, and even easier if the language of the plaintext is known. A technique known as frequency analysis is used, and it is the foundation on which most ciphers are broken.

- Frequency Analysis -

Simply put, the most frequently occurring letter in the ciphertext is likely to represent one of the most frequently occurring letters in the language of the plaintext. The second most frequent letter in the ciphertext is likely to represent one of the next most frequent letters in that language, and so on.

Table of relative frequencies for English (compiled by H. Beker and F.
Piper, using various passages):

  Letter  Percentage      Letter  Percentage
  ------------------      ------------------
    a        8.2            n        6.7
    b        1.5            o        7.5
    c        2.8            p        1.9
    d        4.3            q        0.1
    e       12.7            r        6.0
    f        2.2            s        6.3
    g        2.0            t        9.1
    h        6.1            u        2.8
    i        7.0            v        1.0
    j        0.2            w        2.4
    k        0.8            x        0.2
    l        4.0            y        2.0
    m        2.4            z        0.1

The table should not be applied mechanically (i.e. do not assume that the most frequent letter in the ciphertext IS the most frequent letter in the plaintext language). The surroundings of the letters in question should also be examined. This can be done by looking at how letters interact with one another. For example, in English the letter Q is pretty much guaranteed to be followed by a U, and the letter H frequently comes before the letter E (the, then, they), but rarely after it.

The most commonly occurring letter combinations are also worth looking at, such as repeated letters, digrams (two letter combinations) and trigrams (three letter combinations), listed here in order of frequency:

  Repeats Order: SS, EE, TT, FF, LL, MM, OO
  Digram Order:  TH, HE, AN, IN, ER, RE, ES, ON, EA, TI, AT, ST, EN, ND, OR
  Trigram Order: THE, AND, THA, ENT, ION, TIO, FOR, NDE

If the ciphertext contains spaces between words, plaintext words can easily be obtained using frequency analysis and knowledge of common words.

  Single Letter Words: A and I (these are the only two in English)
  Double Letter Words: OF, TO, IN, IT, IS, BE, AS, AT, SO, WE, HE, BY, OR...
  Three Letter Words:  THE, AND

Once a few letters have been picked out and partial words start to form, decipherment proceeds rapidly.

- Conclusion -

That ends the first part of Cryptology. Next issue I will explain Homophonic ciphers and how to crack them, and possibly Polyalphabetic ciphers. Until then, try your hand at cracking the ciphers at the end of this ezine, which incorporate various tricks to make them progressively harder.

- References -

Applied Cryptography - Bruce Schneier
Basic Methods of Cryptography - Van der Lubbe
The Code Book - Simon Singh
Xenos - Lanaki