Projects - Code Breaking ----------------------- Note: In the text of the second task 4th should be 5th and 5th should be 4th. Frequency tables are given in the files Freq_space - for English with spaces Freq_no_space - for English where the spaces between words have been closed up. Programmes ---------- To run these programmes copy them into your file space. To compile them in Linux (E208, E217, A412, A413) open up a terminal window and use g77 -o Prog_Name Prog_Name.f where Prog_Name is replaced by the appropriate programme name. This compiles the programme. You can then run it by just typing Prog_Name It is also possible to run these on the Windows PCs in other labs, but a bit more work needs to be done. These programmes are run using Fortran77, but the compilers on the Windows service only recognises Fortran95. They are not too different. If you change the suffix of the programme to .f95 and load the programmes into Plato IDE then click on 'File Options ...' and type /FIXED_FORMAT in the box 'Local Compiler Options' then some of the programmes (e.g. Caesar.f) will run. Others need a bit of modification as the continuation line convention changed. See Dr Bowtell's Fortran95 Notes available at http://www.staff.city.ac.uk/~ra933/X2NumMaths/FORTRAN95_Notes.pdf Caesar.f - for the Caesar cypher -------- Copy the text you want to encypher or decipher into the file Caesar.txt and run the programme by typing Caesar You will be prompted for the shift you want to use and whether you want the output in UPPER or lower case letters. Obviously you can use all 25 non-trivial shifts and see which one you can read - but try and use some deductive work to get your answer in fewer goes. MonoAlph.f - for a monoalphabetic cypher ---------- The key for this code is stored in the file MonoAlph.key. This file consists of a single line of 26 letters (upper or lower case). A or a will be replaced by the first letter, B or b by the second and so on. I suggest you start with a key ABCDEF...XYZ and replace each guessed letter in lower case. So if you think C is encrypted by B change your key file to AeCDE...XYZ and run you programme again. The programme assumes it is working on a plain text, so when decoding use the option (1) encoding, and use the option for mixed upper and lower case. You shouldn't need to do a frequency analysis for Code_2. If you need to do a frequency analysis of a text then use the programme Analysis.f which reads from a file Analysis.in. HomoDecode.f - for decoding a monoalphabetic cypher with homophones. ------------ This reads the text to be decoded from the file Homophones.code. It reads a guess for the key from the file HomoGuess.key. The format of this file is a file with 27 lines. The first line contains letters and characters that you think correspond to A, the second to B, and so on up to 26th for Z and 27 for spaces. If you have no guesses for a given letter leave the corresponding line blank. The output on running the programme consists of alternate lines of the enciphered text and the (partial) decipherment. Unknown letters are replaced by X in the decipherment. Frequency analysis of pairs is very useful here. For example, spaces are common, but double spaces are not. So if you see some set of relatively common letters that are never seen together then they may be spaces! VigCount.f and Vienere.f ------------------------ VigCount.f takes some code from the file Vigenere.code and looks for repeated 3, 4 , 5 and six letter repeats. It then writes them out with the difference in positions, and the position of the first and second occurrences. It is the difference in positions that matters most, as the highest common factor of these should give you the length of the key word (plus a few random matches to throw you off). Once you have the key word length, you have to split the text up appropriately and use frequency analysis to find the letters of the keyword. The guess at the keyword can be entered into Vigenere.f which reads its source text from the file Vigenere.txt. This will decode your text based on this guess. PlayfairBreak.f and PlayfairCuess.f ----------------------------------- In PlayfairBreak you are given a frequency count of digraphs. Use this to make some guesses as to suitable substitutions. Don't forget that if th is encoded as, say, JK then ht will be encoded as KJ, so you would expect JK to be common and KJ scarce. The guesses are entered in the form of pairs DE st where DE is the pair in the code and st is what you think it is in the plain text. Don't forget to enter other guesses/deductions such as ED ts and if you think the grid looks like ..E.T ..... ..S.D ..... ..... you can also enter TS ed ST de The last line of your guesses should be ZZ ZZ I suggest you have a list of your guesses in a file and just copy and paste them to the programme after each modification. Once you have established some sense of what part of the key looks like transfer to PlayfairGuess.f. In this programme you put your partial guess into the file Playfair.key in the from ..E.T ..AFG ..SID ..... ..... You must have a 5x5 grid, use capital letters and full stops for unknown entries. Every time you change the file just hit return and the guess should be updated with you attempts at decoding in lower case letters with "." for one of a pair of letters that cannot be deduced. Original code in uppercase letters. RotMono2.f ---------- This programme is for helping decipher a text that has had a rotation type cipher, where the shift of the ith letter is by a+N*i (mod 26), followed my a monoalpbetic substitution cipher. This programme reads the cipher from the file Cipher.7.ABC. They then split the text up into 26 columns and look for the most frequent letters in each column. If you have a very large sample and/or are lucky these will correspond to e being enciphered. From this and other information you should try and guess what the value of N is. For example if you have identified a letter you think corresponds to e in one column, then the frequency of the same letter in other columns puts constraints on possible values of N. Applying this to several columns will give you an idea what N is. You can assume that a=-N (why?), and so the first column is unaffected by the a+N*i shift. This may tell you what e is enciphered as in the monoalphabetics substitution cipher. If you assume the most frequent letter in each of the other columns originally came from e, then your knowledge of N will give you guesses for other elements of the monoalphabetic key. If is even or 13 this may be a bit restrictive and you may need to look for some more information to help. Put your guesses for the letters corresponding to e into the file RotMono2.key (same as for the monoalphabetic substitution cipher) in the order of the columns, so in the original example the content of this files would be YKVBLDFSKQBAHESRVDFGRTMWIO Then use RotMono2.f. If N is even then the two halves of the key should be the same. Where there are differences look more closely at the distribution in both corresponding columns to get a better guess. If every thing was perfect and N was odd then you would have a clear text. However, things are often not. For example by using the most frequent letter you may get (as is the case here) repeated letters. Here K is repeated, but if N is odd they can't both be right. RotMono2 will print out its first attempt. Then you have to try and sort out your errors. Look at the partial decipherment, spot letters that you think are wrong, enter the letter's position (remember - each line is 80 characters long) and what it should be and it will work out the correction to the monoalphabetic key. The latest version of the monoalphabetic key is given numerically just above the latest decipherment. This will not look like your key file. If N is even you may get a lot of blanks. Scan through your text and with luck you should spot something that you can work on. Once you get a few corrections you will find things becoming clearer quite quickly. (A few of you have even N, and I have successfully deciphered them). Other uses: If the key file has all the letters in alphabetic order, then this like a rotation with N=1. Try different values of N in the programme and the one that gives the most uneven distribution of the letters is probably that for the rotation step (with the adjustment of 1 due to the key files having an rotation of 1). You can use this for ciphers 7 and 8.