Week 7, Tuesday Discussion

Week 7, Tuesday Discussion

Today:

  • Pass back Quiz #3

  • Review for Quiz #4

  • Take Quiz #4

Reminders and Announcements:

  • Homework #5 due tonight 11:59pm (don’t forget about the opportunity for bonus points!)

  • This Thursday we will spend the first half of discussion working on Homework #6 all together; this will give you a head start on the homework, so be sure to join!

  • Midterm #2 next week during Thursday discussion

    • There will be no quiz Tuesday of Week 8, instead we will review for the midterm

    • I will pass out notecards during Thursday discussion this week

Question 1:

Suppose df is the DataFrame shown below. Describe in words what the result of the following code will be.

df["Monday"] = 0

df.loc[df["Date].dt.day_name() == "Monday", "Monday"] = 1

import pandas as pd 

df = pd.read_csv("../data/Fremont.csv")
df["Date"] = pd.to_datetime(df["Date"]).dt.day_name()
df.head()
Date Fremont Bridge Total Fremont Bridge East Sidewalk Fremont Bridge West Sidewalk
0 Friday 12.0 7.0 5.0
1 Friday 7.0 0.0 7.0
2 Friday 1.0 0.0 1.0
3 Friday 6.0 6.0 0.0
4 Friday 6.0 5.0 1.0

Question 2:

Imagine we have a set of data points that we want to fit to a polynomial. What would be the potential drawbacks of fitting to a polynomial of a “large” degree? Of a “small” degree?

polynomial

Source

Question 3:

Imagine you are given the code below. What would be the result of this afterwards?

df2 = pd.DataFrame({"X":[1,2],"Y":[3,4],"Z":[5,6]})

reg.predict(df2)

df = pd.DataFrame({"W":[1,9,17,12],"X":[4,1,3,3],"Y":[1,6,7,23],"Z":[9,1,1,19]})
from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(df[["X","Y","Z"]],df["W"])
LinearRegression()
coefs = pd.Series(reg.coef_, ["X","Y","Z"])
coefs
X    3.346939
Y    1.306122
Z   -1.438776
dtype: float64
reg.intercept_
-0.7448979591836835