Week 7, Tuesday Discussion
Week 7, Tuesday Discussion¶
Today:
Pass back Quiz #3
Review for Quiz #4
Take Quiz #4
Reminders and Announcements:
Homework #5 due tonight 11:59pm (don’t forget about the opportunity for bonus points!)
This Thursday we will spend the first half of discussion working on Homework #6 all together; this will give you a head start on the homework, so be sure to join!
Midterm #2 next week during Thursday discussion
There will be no quiz Tuesday of Week 8, instead we will review for the midterm
I will pass out notecards during Thursday discussion this week
Question 1:
Suppose df
is the DataFrame shown below. Describe in words what the result of the following code will be.
df["Monday"] = 0
df.loc[df["Date].dt.day_name() == "Monday", "Monday"] = 1
import pandas as pd
df = pd.read_csv("../data/Fremont.csv")
df["Date"] = pd.to_datetime(df["Date"]).dt.day_name()
df.head()
Date | Fremont Bridge Total | Fremont Bridge East Sidewalk | Fremont Bridge West Sidewalk | |
---|---|---|---|---|
0 | Friday | 12.0 | 7.0 | 5.0 |
1 | Friday | 7.0 | 0.0 | 7.0 |
2 | Friday | 1.0 | 0.0 | 1.0 |
3 | Friday | 6.0 | 6.0 | 0.0 |
4 | Friday | 6.0 | 5.0 | 1.0 |
Question 2:
Imagine we have a set of data points that we want to fit to a polynomial. What would be the potential drawbacks of fitting to a polynomial of a “large” degree? Of a “small” degree?
Question 3:
Imagine you are given the code below. What would be the result of this afterwards?
df2 = pd.DataFrame({"X":[1,2],"Y":[3,4],"Z":[5,6]})
reg.predict(df2)
df = pd.DataFrame({"W":[1,9,17,12],"X":[4,1,3,3],"Y":[1,6,7,23],"Z":[9,1,1,19]})
from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(df[["X","Y","Z"]],df["W"])
LinearRegression()
coefs = pd.Series(reg.coef_, ["X","Y","Z"])
coefs
X 3.346939
Y 1.306122
Z -1.438776
dtype: float64
reg.intercept_
-0.7448979591836835