Worksheet 4#
This worksheet is due Monday night of Week 3. You are encouraged to work in groups of up to 3 total students, but each student should submit their own file. (It’s fine for everyone in the group to upload the same file.)
These questions refer to the attached vending machines csv file, vend.csv.
Goal#
The goal of this worksheet is to make a 8-by-3 pandas DataFrame. The columns for this DataFrame will be named “Month”, “transactions” and “total price”. The entries will correspond to the number of transactions in that month, as well as the combined price for all those transactions.
- Load the attached - vend.csvdataset using- pd.read_csv, and store it with the variable name- df.
- Look at the first few rows of - dfusing the- headmethod.
- Which of the columns contains date-like values? Convert this column to actual date values (i.e., values that are recognized as dates by pandas) using the pandas function - to_datetime. (You shouldn’t need to use a for loop or anything like that, just use the entire column as an input to the pandas- to_datetimefunction.) Save the resulting pandas Series as a new column in- df, named- "Date".
- Get the corresponding month name for each date, by using the - dtaccessor and the method- month_nameon the new “Date” column. Save this as a new column in- df, named- "Month".
- Write a function - month_countwhich takes as input a month name like “March” and as output returns how many transactions occurred in this month (use Boolean indexing… there are other ways, but we’ll use a similar approach below to get the total pr).
For example, the output for “March” should be 633.
- Write a function - total_pricewhich takes as input a month name like “March” and as output returns the sum of the “RPrice” value for all transactions occuring in that month.
For example, the output for “March” should be 1117.75.
- Make an empty DataFrame, that will eventually be the submission for this worksheet. Use - pd.DataFrame(), nothing inside the parentheses. Name this empty DataFrame- df_out.
- Put a “Month” column in - df_outcontaining the months from- df, each listed one time. (Use- df["Month"].unique()to get each value one time.)
- Apply the - month_countfunction to each value in- df_out["Month"]by calling- df_out["Month"].map(month_count). Store the result as a new column in- df_outnamed- "transactions".
- Using the same strategy, make a column named - "total price"in- df_out, obtained by applying the- total_pricefunction to each entry in the- "Month"column of- df_out.
- The resulting DataFrame - df_outshould have 8 rows and 3 columns. Save it to a file named- df_out.csvusing the pandas DataFrame method- to_csv. (You need to give the desired file name as the first argument to the- to_csvmethod.) Specify the keyword argument- index=Falseso that the numbers- 0through- 7from the index are not stored.
- Submit this csv file on Canvas. 
