Lab 1 - Math 173A#
Note. This is not the first lab, be sure you do Lab 0 first. (Lab 0 is also due one week earlier.)
You are encouraged to work in groups of up to 3 total students, but each student should make their own submission on Canvas. (It’s fine for everyone in the group to have the same upload.)
Put the full names of everyone in your group (even if you’re working alone) here. (This makes grading easier.)
Names:
Suggestion#
Follow the instructions from the Lab 0 worksheet to make an Education Workspace for this lab.
Preliminaries#
Our goal in this lab is to implement encryption and (more difficult) automatic decryption of Vigenère ciphertext, as described in Section 5.2 of Hoffstein, Pipher, and Silverman.
In this Preliminaries section, we introduce some concepts and tools that will be useful below.
Import some libraries and functions we will need by executing the following code. Some of these functions are new. Some you might have to review from lecture or Lab 0. (The
Lab1_Helper.pyfile is already attached to this Deepnote project, so there is no need to download it.)
import string
import numpy as np
from Lab1_Helper import (
only_letters,
weave,
shift_string,
kasiski_diffs,
ind_co,
mut_ind_co,
get_freq,
english_freq,
add_spaces
)
When we encrypt using the Vigenère cipher with a key length of 5 (for example), we shift all the characters at integer positions 0 modulo 5 by the same amount, all characters at integer positions 1 modulo 5 by the same amount, and so on. Assume
Xis the string we want to encrypt using the Vigenère cipher with a key length of5. Notice you can get all the characters inXat integer positions 2 modulo 5 (for example) usingX[2::5]. (You can think of this as shorthand forX[2:len(X):5], which means, start at position2, go until reachinglen(X), go up by5.) Test this out by settingX = string.ascii_uppercaseand then printingXandX[2::5].
We need a way of “re-assembling” these strings. We can do that using the imported
weavefunction. Evaluate the following code to get a sense for how it works. (HereXshould be the same as above. If you know list comprehension, try to recreate this list using list comprehension.)
weave([X[0::5], X[1::5], X[2::5], X[3::5], X[4::5]])
As described on page 219 of Hoffstein, Pipher, and Silverman, the Kasiski method first finds the gaps between repeated trigrams in Vigenère ciphertext. (The idea is that these repeated trigrams will usually correspond to the same English plaintext, and so the gaps will usually be divisible by the key length.) The
kasiski_diffsfunction will find these gaps for us. Try it out by evaluating the following code, and notice that the numbers match those in Table 5.3 in Hoffstein, Pipher, Silverman, and that many of them are divisible by 7.
Y = '''zpgdl rjlaj kpylx zpyyg lrjgd lrzhz qyjzq repvm swrzy rigzh
zvreg kwivs saolt nliuw oldie aqewf iiykh bjowr hdogc qhkwa
jyagg emisr zqoqh oavlk bjofr ylvps rtgiu avmsw lzgms evwpc
dmjsv jqbrn klpcf iowhv kxjbj pmfkr qthtk ozrgq ihbmq sbivd
ardym qmpbu nivxm tzwqv gefjh ucbor vwpcd xuwft qmoow jipds
fluqm oeavl jgqea lrkti wvext vkrrg xani'''
kasiski_diffs(Y)
In lecture on Monday of Week 1, we used (an estimate for) the Index of Coincidence to distinguish random characters from shift ciphertext. One strength of the Vigenère cipher is that it looks very much like random text from this perspective. Compute
ind_co(Y)forYas above, and compare it with1/26, which is the value expected for randomly chosen English characters. (Comment. We’ve changedind_cosince Monday. It used to take a dictionary of letter frequencies as input, likemut_ind_co. Now it takes a string as input.)
On the other hand, the above was encrypted using a Vigenère cipher with a key length of
7. Compute the following and compare the values with the expected Index of Coincidence value for true English of approximately0.068. (All of the values will be close to, but slightly less than, this0.068value. I’m not sure if there is a theoretical reason for why they are less than0.068. The important thing is they’re all significantly closer to the true English value than to the random characters value.)
Z = only_letters(Y) # we remove spaces from `Y`
for i in range(7):
print(ind_co(Z[i::7]))
Vigenère encryption#
Writing the vigenere function#
Write a function to perform Vigenère encryption using the following template. (If you want to make the code more elegant, replace
for i in range(key_length):in our template withfor i, k in enumerate(key):.)
Input: a string
Xand a list of shift amountskeyOutput: a stringYExample: If
Xis"The rain in Spain stays mainly in the plain"andkeyis[5, 11, 0, 12, 8, 13, 6, 14], then the output should be the ciphertext written on page 217 of our textbook, beginning"YSEDIVTWSDPM...".
def vigenere(X, key, case="upper"):
X = only_letters(X, case=case)
key_length = len(key)
string_list = []
for i in range(key_length):
s = # Get the characters in `X` at integer positions ??? modulo ???
shifted_s = # Shift the characters in s. Use the imported `shift_string` function.
string_list.append(shifted_s)
output = # Use the imported `weave` function
return output
Test your function by evaluating the following and checking that the result matches what is on page 217 of the textbook (aside from ours being in upper-case letters).
X = "The rain in Spain stays mainly in the plain"
key = [5, 11, 0, 12, 8, 13, 6, 14]
vigenere(X, key)
Posting Vigenère ciphertext on Ed Discussion#
Encrypt a piece of English text using the above
vigenerefunction with a random key of length 7, 8, 9, 10, or 11.You can generate a random key using the following, where
key_lengthis your chosen length. (The resultingkeyis a NumPy array rather than a list, but it should still work.)
rng = np.random.default_rng()
key = rng.integers(0, 26, size=key_length)
key
Your chosen text should be quite long, maybe 300 characters. Just copy and paste from some source, and use triple quotation marks. (Using triple quotation marks means that Python won’t complain about line breaks or apostrophes or double quotation marks in your pasted text.)
plaintext = ''' Your text here '''
Encrypt your chosen plaintext using the above key and the
vigenerefunction. Store the resulting ciphertext using the variable nameY.
Make the ciphertext less overwhelming to look at by evaluating the following.
print(add_spaces(Y))
Go to the “Lab 1 Ciphertexts” thread on Ed Discussion. Post the displayed ciphertext as a new response. Type triple backticks (on my keyboard, backtick is the key above tab) before pasting the ciphertext, or alternatively, change from “Paragraph” to “Code” in the dropdown menu. This will let Ed Discussion know that all characters should be displayed with the same width.
Vigenère decryption#
There are two main steps to decrypting Vigenère ciphertext: Find the key length and find the individual shift amounts. Finding the individual shift amounts is essentially the same as what we did in Lab 0, so the main work is finding the key length.
We will implement two separate methods for finding the key length, both of which are described in Section 5.2 of Hoffstein, Pipher, and Silverman.
Finding the key-length using the Kasiski method#
Write a function
prop_divusing the following template.
Input: a list of integers
Xand a modulusmOutput: the proportion of integers inXwhich are divisible bymFor example,
prop_div([3,5,8,9,10], 3)should be equal to0.4, because 2 out of 5 of these integers are divisible by 3.
def prop_div(X, m):
n = len(X)
t = 0 # Eventually t will be the number of integers divisible by m in X
for x in X:
if x%??? == ???:
t = t+1
return ??? # Return the proportion, not t.
Write a function
key_length_kasiskito predict the key length using the following template. (Recall that we introduced thekasiski_diffsfunction above.) The idea is to find for which key length value, fromkey_start(inclusive) tokey_end(exclusive), is the highest proportion of gaps between repeated trigrams divisible by that key length. The parameterstop_valueandtop_keyrepresent the current highest proportion and the current best key length, respectively.
Feel free to rewrite this if you would like to use a more elegant/Pythonic approach to find the best key length, as long as it is still using the Kasiski method. (If I were doing this in Math 10, I think I would make a pandas Series prop_ser with values equal to the proportions and keys equal to the key lengths, and then I would find the best key length by using prop_ser.idxmax().)
def key_length_kasiski(Y, key_start=7, key_end=12):
diffs = kasiski_diffs(Y)
top_value = 0
top_key = 0
for k in range(key_start, key_end):
prop = prop_div(diffs, k)
if ??? > ???:
top_value = prop
top_key = k
return ??? # Return the key length which yields the highest value
Test your function by evaluating the following. The output should be
7.
Y = '''zpgdl rjlaj kpylx zpyyg lrjgd lrzhz qyjzq repvm swrzy rigzh
zvreg kwivs saolt nliuw oldie aqewf iiykh bjowr hdogc qhkwa
jyagg emisr zqoqh oavlk bjofr ylvps rtgiu avmsw lzgms evwpc
dmjsv jqbrn klpcf iowhv kxjbj pmfkr qthtk ozrgq ihbmq sbivd
ardym qmpbu nivxm tzwqv gefjh ucbor vwpcd xuwft qmoow jipds
fluqm oeavl jgqea lrkti wvext vkrrg xani'''
key_length_kasiski(Y)
Finding the key-length using the Index of Coincidence#
Write a function
coincidence_meanusing the following template. (Note. Base Python does not have a built-in mean or average function, but it does have a built insumfunction.)
Inputs: a string
Yand an integerm. Output: a real number
def coincidence_mean(Y, m):
coinc_list = []
for i in range(m):
s = # Get the characters in `Y` at integer positions ??? modulo ???
c = # Compute the Index of Coincidence for `s` using the `ind_co` function
coinc_list.append(c)
return ??? # Return the mean value of `coinc_list`
Write a function
key_length_coincidenceto predict the key length using the following template.
Again, feel free to rewrite this if you would like to use a more elegant approach, as long as it is still using the coincidence_mean function from above.
def key_length_coincidence(Y, key_start=7, key_end=12):
top_value = 0
top_key = 0
for k in range(key_start, key_end):
mean = coincidence_mean(???, ???)
if ??? > ???:
top_value = ???
top_key = ???
return ??? # Return the key length which yields the highest value
Test your function by evaluating the following. The output should again be
7.
Y = '''zpgdl rjlaj kpylx zpyyg lrjgd lrzhz qyjzq repvm swrzy rigzh
zvreg kwivs saolt nliuw oldie aqewf iiykh bjowr hdogc qhkwa
jyagg emisr zqoqh oavlk bjofr ylvps rtgiu avmsw lzgms evwpc
dmjsv jqbrn klpcf iowhv kxjbj pmfkr qthtk ozrgq ihbmq sbivd
ardym qmpbu nivxm tzwqv gefjh ucbor vwpcd xuwft qmoow jipds
fluqm oeavl jgqea lrkti wvext vkrrg xani'''
key_length_coincidence(Y)
Finding the shift amounts#
This portion should be easier, because you essentially already did this in Lab 0.
Paste in and execute the code for your
shift_decryptfunction from Lab 0.
Write a function
find_shiftsusing the following template.
Inputs: a Vigenère ciphertext string
Yand a key-lengthkOutput: a list of shift amountsshifts
def find_shifts(Y, k):
shifts = []
for i in range(k):
s = # Get the characters in `Y` at integer positions ??? modulo ???
X, m = shift_decrypt(s) # We only care about the shift amount `m`
shifts.append(m)
return shifts
Putting it all together#
Write a function
vigenere_decryptusing the following template. Themethodkeyword argument allows the user to specify whether they wish to use the Kasiski method or the Index of Coincidence method to find the key length. If no method is specified, the Kasiski method is used by default.
def vigenere_decrypt(Y, key_start=7, key_end=12, method="kasiski"):
Y = only_letters(Y, case="upper") # Remove spaces
if method == "kasiski":
key_length_fn = key_length_kasiski
elif method == "coincidence":
key_length_fn = key_length_coincidence
else:
raise ValueError('''key_length should be either "kasiski" or "coincidence"''')
k = key_length_fn(Y, key_start=key_start, key_end=key_end) # `k` is the predicted key length
shifts = ??? # The predicted shift amounts
X = ??? # The predicted plaintext. Use your `vigenere` function from high above
return X.lower()
Try to decrypt the above ciphertext from Section 5.2 of Hoffstein, Pipher, and Silverman.
Y = '''zpgdl rjlaj kpylx zpyyg lrjgd lrzhz qyjzq repvm swrzy rigzh
zvreg kwivs saolt nliuw oldie aqewf iiykh bjowr hdogc qhkwa
jyagg emisr zqoqh oavlk bjofr ylvps rtgiu avmsw lzgms evwpc
dmjsv jqbrn klpcf iowhv kxjbj pmfkr qthtk ozrgq ihbmq sbivd
ardym qmpbu nivxm tzwqv gefjh ucbor vwpcd xuwft qmoow jipds
fluqm oeavl jgqea lrkti wvext vkrrg xani'''
vigenere_decrypt(Y)
Decrypting Vigenère ciphertext from Ed Discussion#
Select one of your classmate’s Vigenère ciphertexts that were posted on Ed Discussion and attempt to decrypt it using your
vigenere_decryptfunction. Assign the ciphertext to the variable nameY(use triple quotation marks to allow line breaks). Try using both key length methods.
vigenere_decrypt(Y, method="kasiski")
and
vigenere_decrypt(Y, method="coincidence")
Do they both wok?
Submission#
Using the
Sharebutton at the top right, enable public sharing, and enable Comment privileges. Then submit the created link on Canvas.
(Don’t just copy the browser URL. Copy the link that is provided after you click the “Share” button. It won’t be available until you enable public sharing.)
Reminder: Everyone in the group needs to submit this link on Canvas.