Worksheet 3


This worksheet is due Monday night of Week 3. You are encouraged to work in groups of up to 3 total students, but each student should submit their own file. (It’s fine for everyone in the group to upload the same file.)

These questions refer to the attached vending machines CSV file, vend.csv.

Load the file as a pandas DataFrame using pd.read_csv and store it as the variable df. (You will need to import pandas first.)
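As a minimal sketch of how pd.read_csv behaves, the example below reads a small made-up CSV from an in-memory string instead of from vend.csv. The column names and values are hypothetical and chosen only for illustration; with the real file you would pass the file name instead of the StringIO object.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a file on disk (not the real vend.csv).
csv_text = """Location,Product,RPrice
Library,Water,1.5
Mall,Chips,2.0
"""

# pd.read_csv accepts a file path or a file-like object such as StringIO.
toy = pd.read_csv(io.StringIO(csv_text))
print(toy)
```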

How many rows are there in this DataFrame? How many columns? Use the shape attribute. (When we refer to something as an attribute, it usually means we will not be using parentheses with it. Methods are like functions and attributes are like variables. Both methods and attributes are attached to an object and are accessed with a period (.) after the object's name.)
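To make the attribute versus method distinction concrete, here is a small sketch on a made-up DataFrame (the name toy and its contents are hypothetical, not taken from vend.csv). The shape attribute is read without parentheses, while head is a method and is called with them.

```python
import pandas as pd

# Hypothetical toy data, not the contents of vend.csv.
toy = pd.DataFrame({
    "Location": ["Library", "Mall", "Library", "Gym"],
    "Product": ["Water", "Chips", "Soda", "Water"],
    "RPrice": [1.5, 2.0, 1.5, 1.0],
})

n_rows, n_cols = toy.shape   # attribute: no parentheses
first_two = toy.head(2)      # method: called with parentheses

print(n_rows, n_cols)
```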

Using the dtypes attribute of this DataFrame, check how the data type of the “Location” column is represented. Among all the columns, what different data types are listed?
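The sketch below shows one way to inspect dtypes on a made-up DataFrame (hypothetical data, not vend.csv). How pandas labels a text column can depend on the pandas version, so the example prints that result without assuming it.

```python
import pandas as pd

# Hypothetical toy data, not the contents of vend.csv.
toy = pd.DataFrame({
    "Location": ["Library", "Mall", "Gym"],
    "RQty": [1, 2, 1],
    "RPrice": [1.5, 2.0, 1.0],
})

col_types = toy.dtypes                     # a Series: column name -> dtype
print(col_types)
print(col_types["RPrice"])                 # dtype of a single column

distinct = {str(t) for t in col_types}     # the different dtypes, as strings
print(distinct)
```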

Access the row at integer location 2420 using iloc and square brackets. Store this in the variable x.

Using the Python built-in function type, what is the data type of x?

What is the value of x.loc["Location"]? Is there any difference if you use x["Location"]? What about x("Location")?
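The three access styles in this question can be tried safely on a made-up DataFrame first. This is only a sketch with hypothetical data; the names toy and row are not part of the worksheet. The call with parentheses is wrapped in try/except so the example keeps running if it fails.

```python
import pandas as pd

# Hypothetical toy data, not the contents of vend.csv.
toy = pd.DataFrame({
    "Location": ["Library", "Mall", "Gym"],
    "Product": ["Water", "Chips", "Soda"],
})

row = toy.iloc[1]            # the row at integer position 1
print(type(row))

print(row.loc["Location"])   # label-based lookup with loc
print(row["Location"])       # plain square brackets, same label

try:
    row("Location")          # parentheses try to call the row like a function
except TypeError as err:
    print("TypeError:", err)
```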

What is the type of x.loc["Location"]? (Notice how this type was not directly reported to us by pandas when we used the dtypes attribute. When something is reported as having “object” as its dtype, I usually assume it is a string, but it could also be something else, like a list.)

Using Boolean indexing, define df_sub to be the sub-DataFrame containing all the transactions from this same location.
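As a reminder of how Boolean indexing works, here is a sketch on made-up data (hypothetical values, not vend.csv). Comparing a column to a value gives a Series of True/False values, and passing that Series inside square brackets keeps only the rows marked True.

```python
import pandas as pd

# Hypothetical toy data, not the contents of vend.csv.
toy = pd.DataFrame({
    "Location": ["Library", "Mall", "Library", "Gym"],
    "Product": ["Water", "Chips", "Soda", "Water"],
})

mask = toy["Location"] == "Library"   # Boolean Series, one entry per row
toy_sub = toy[mask]                   # keeps only the rows where mask is True

print(mask)
print(toy_sub)
```

Notice that toy_sub keeps the original index labels of the rows it selected; it is not renumbered from 0.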

How many rows in the original DataFrame correspond to this location? Set the variable a to be equal to this integer. (As a check, it should be between 600 and 700.)

What values of b and c are such that df_sub.loc[13, "Transaction"] is equal to df_sub.iloc[b,c]? (Remember that counting in Python starts at 0. I don’t intend you to have a computer code way of finding these values. Just look at df_sub and check.) Store these values.
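The difference between loc (labels) and iloc (integer positions) shows up most clearly on a sub-DataFrame, where the index labels no longer match the row positions. The sketch below uses made-up data; the names and values are hypothetical.

```python
import pandas as pd

# Hypothetical toy data, not the contents of vend.csv.
toy = pd.DataFrame({
    "Location": ["Library", "Mall", "Library", "Gym", "Library"],
    "Product": ["Water", "Chips", "Soda", "Water", "Gum"],
})
toy_sub = toy[toy["Location"] == "Library"]   # keeps index labels 0, 2, 4

by_label = toy_sub.loc[2, "Product"]   # row labeled 2, column named "Product"
by_position = toy_sub.iloc[1, 1]       # second row, second column (counting from 0)

print(by_label, by_position)
```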

There was exactly one transaction in df_sub where the “RPrice” was 1.5 and where “RQty” was 2 (meaning two items were sold in the same transaction). What was the name of that product (i.e., the value in the “Product” column)? Store that string with the variable d. (Be sure your answer is exact, including spacing and capitalization.)
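One common way to require two conditions at once is to combine two Boolean Series with &, putting each comparison in its own parentheses. The following is a sketch on made-up data (hypothetical products and prices), not the answer for vend.csv.

```python
import pandas as pd

# Hypothetical toy data, not the contents of vend.csv.
toy = pd.DataFrame({
    "Product": ["Water", "Chips", "Soda", "Gum"],
    "RPrice": [1.5, 1.5, 2.0, 1.0],
    "RQty": [1, 2, 2, 1],
})

# Each comparison needs its own parentheses because & binds more tightly than ==.
both = (toy["RPrice"] == 1.5) & (toy["RQty"] == 2)
match = toy[both]

name = match["Product"].iloc[0]   # first (here, only) matching product name
print(name)
```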

There is exactly one row in df where the "RPrice" is not equal to the "MPrice". What is the index of that row? Set e to be equal to that index. (The index is the number that’s displayed all the way on the left. You can access the index by using the index attribute. To check whether two elements are not equal, you can use !=. Another option is to check for equality and then to negate it using tilde ~.)
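Both approaches mentioned here, != and negating an equality check with ~, can be compared on a small made-up DataFrame. The data and the index labels 10, 11, 12 are hypothetical and chosen so that the labels differ from the row positions.

```python
import pandas as pd

# Hypothetical toy data, not the contents of vend.csv.
toy = pd.DataFrame(
    {"RPrice": [1.5, 2.0, 1.0], "MPrice": [1.5, 2.5, 1.0]},
    index=[10, 11, 12],
)

differs = toy["RPrice"] != toy["MPrice"]         # True where the prices differ
differs_alt = ~(toy["RPrice"] == toy["MPrice"])  # same result via negation

# The index attribute holds the row labels; convert to a plain Python int.
label = int(toy[differs].index[0])
print(label)
```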

Put these five values (four integers and one string) into a tuple, my_tuple = (a,b,c,d,e).
Save my_tuple in a pickle file named "wkst3-ans.pickle" using the following code. Submit that file on Canvas as your submission for Worksheet 3.

import pickle

with open("wkst3-ans.pickle", 'wb') as f:
    pickle.dump(my_tuple, f)

If you want to double-check that this "wkst3-ans.pickle" pickle file really contains your answer, you can run the following code. If you then evaluate or print x, you should see your original my_tuple values. (If you’re in a new notebook, you also need to import the pickle module again.)

with open("wkst3-ans.pickle", 'rb') as f:
    x = pickle.load(f)
