NFL offense performance and analysis
NFL offense performance and analysis#
Author: Shengkai Yang
Course Project, UC Irvine, Math 10, F22
Introduction#
In this project, I use machine learning to analyze how specific offensive statistics affect NFL football games. I also want to see how these statistics relate to team performance and use them to make predictions.
Main part of project#
import pandas as pd
import numpy as np
import altair as alt
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
import seaborn as sns
w4 = pd.read_csv("nfloffenseweek4.csv")
w4
rank | team | games | points_scored | total_yards | offensive_plays | yards_per_play | turnovers_lost | fumbles_lost | 1st_downs | ... | rushing_yards | rushing_touchdowns | rushing_yards_per_attempt | rushing_1st_downs | penalties | penalty_yards | 1st_down_penalties | percentage_scoring_drives | percentage_turnover_drives | expected_points | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Detroit Lions | 4 | 140 | 1747 | 269 | 6.5 | 4 | 1 | 90 | ... | 656 | 7 | 5.9 | 26 | 23 | 188 | 7 | 45.8 | 8.3 | 53.17 |
1 | 2 | Kansas City Chiefs | 4 | 129 | 1539 | 257 | 6.0 | 4 | 2 | 95 | ... | 468 | 4 | 4.5 | 23 | 19 | 156 | 10 | 50.0 | 7.1 | 57.85 |
2 | 3 | Baltimore Ravens | 4 | 119 | 1437 | 230 | 6.2 | 5 | 1 | 77 | ... | 568 | 3 | 5.4 | 31 | 17 | 114 | 7 | 42.2 | 11.1 | 32.97 |
3 | 4 | Philadelphia Eagles | 4 | 115 | 1742 | 285 | 6.1 | 2 | 0 | 98 | ... | 661 | 10 | 4.3 | 40 | 26 | 193 | 9 | 40.0 | 4.4 | 44.88 |
4 | 5 | Buffalo Bills | 4 | 114 | 1650 | 275 | 6.0 | 7 | 4 | 99 | ... | 462 | 2 | 4.8 | 29 | 24 | 167 | 7 | 47.5 | 17.5 | 49.56 |
5 | 6 | Cleveland Browns | 4 | 105 | 1539 | 281 | 5.5 | 3 | 1 | 96 | ... | 749 | 7 | 5.0 | 46 | 26 | 184 | 7 | 46.3 | 7.3 | 40.78 |
6 | 7 | Jacksonville Jaguars | 4 | 105 | 1346 | 250 | 5.4 | 6 | 4 | 83 | ... | 441 | 3 | 4.0 | 23 | 22 | 157 | 8 | 40.9 | 13.6 | 17.69 |
7 | 8 | Atlanta Falcons | 4 | 103 | 1396 | 236 | 5.9 | 8 | 4 | 86 | ... | 672 | 6 | 5.1 | 36 | 16 | 114 | 12 | 44.2 | 18.6 | 30.77 |
8 | 9 | Miami Dolphins | 4 | 98 | 1444 | 227 | 6.4 | 4 | 0 | 80 | ... | 277 | 2 | 3.5 | 15 | 22 | 132 | 9 | 42.1 | 10.5 | 46.01 |
9 | 10 | Las Vegas Raiders | 4 | 96 | 1425 | 256 | 5.6 | 5 | 1 | 87 | ... | 452 | 2 | 5.0 | 26 | 23 | 148 | 5 | 50.0 | 12.5 | 23.26 |
10 | 11 | Seattle Seahawks | 4 | 95 | 1444 | 228 | 6.3 | 6 | 3 | 83 | ... | 459 | 3 | 5.2 | 26 | 32 | 306 | 9 | 44.1 | 11.8 | 42.94 |
11 | 12 | Los Angeles Chargers | 4 | 92 | 1487 | 264 | 5.6 | 4 | 2 | 77 | ... | 258 | 2 | 2.7 | 17 | 21 | 161 | 5 | 36.4 | 6.8 | 17.02 |
12 | 13 | Cincinnati Bengals | 4 | 91 | 1387 | 290 | 4.8 | 6 | 2 | 87 | ... | 358 | 1 | 3.1 | 18 | 22 | 166 | 11 | 37.5 | 12.5 | 6.37 |
13 | 14 | Arizona Cardinals | 4 | 88 | 1398 | 292 | 4.8 | 2 | 0 | 89 | ... | 448 | 4 | 4.1 | 33 | 30 | 280 | 6 | 36.6 | 4.9 | 8.95 |
14 | 15 | Minnesota Vikings | 4 | 86 | 1376 | 254 | 5.4 | 5 | 1 | 92 | ... | 392 | 3 | 4.4 | 22 | 16 | 95 | 11 | 38.6 | 11.4 | 20.12 |
15 | 16 | Tampa Bay Buccaneers | 4 | 82 | 1268 | 245 | 5.2 | 6 | 5 | 77 | ... | 261 | 1 | 3.1 | 13 | 24 | 206 | 9 | 34.8 | 10.9 | -3.64 |
16 | 17 | Carolina Panthers | 4 | 78 | 1049 | 214 | 4.9 | 6 | 3 | 56 | ... | 385 | 2 | 4.5 | 22 | 23 | 250 | 7 | 31.1 | 11.1 | -25.26 |
17 | 18 | New Orleans Saints | 4 | 76 | 1457 | 244 | 6.0 | 11 | 6 | 74 | ... | 446 | 4 | 5.1 | 23 | 34 | 319 | 5 | 27.1 | 20.8 | -11.00 |
18 | 19 | New York Giants | 4 | 76 | 1328 | 256 | 5.2 | 5 | 2 | 80 | ... | 770 | 4 | 5.7 | 43 | 31 | 219 | 9 | 35.6 | 8.9 | 6.34 |
19 | 20 | New York Jets | 4 | 76 | 1458 | 289 | 5.0 | 9 | 4 | 82 | ... | 350 | 1 | 4.1 | 15 | 25 | 257 | 8 | 32.6 | 19.6 | -10.79 |
20 | 21 | Green Bay Packers | 4 | 75 | 1510 | 259 | 5.8 | 7 | 4 | 85 | ... | 580 | 3 | 5.0 | 34 | 17 | 140 | 5 | 28.3 | 15.2 | 16.07 |
21 | 22 | Tennessee Titans | 4 | 75 | 1150 | 220 | 5.2 | 6 | 3 | 69 | ... | 409 | 4 | 3.8 | 19 | 29 | 262 | 9 | 31.0 | 9.5 | 2.39 |
22 | 23 | Pittsburgh Steelers | 4 | 74 | 1115 | 234 | 4.8 | 7 | 2 | 69 | ... | 389 | 4 | 4.0 | 25 | 24 | 159 | 4 | 30.4 | 13.0 | -15.59 |
23 | 24 | New England Patriots | 4 | 74 | 1365 | 241 | 5.7 | 9 | 4 | 74 | ... | 514 | 5 | 4.5 | 33 | 20 | 161 | 3 | 28.6 | 21.4 | -0.64 |
24 | 25 | Houston Texans | 4 | 73 | 1208 | 237 | 5.1 | 5 | 1 | 68 | ... | 380 | 2 | 4.5 | 19 | 26 | 226 | 11 | 31.3 | 10.4 | -12.95 |
25 | 26 | Washington Commanders | 4 | 73 | 1323 | 287 | 4.6 | 7 | 2 | 84 | ... | 402 | 2 | 4.1 | 24 | 23 | 264 | 8 | 21.6 | 13.7 | -16.38 |
26 | 27 | Dallas Cowboys | 4 | 71 | 1251 | 241 | 5.2 | 2 | 1 | 69 | ... | 416 | 2 | 4.0 | 22 | 27 | 208 | 7 | 35.6 | 4.4 | 1.65 |
27 | 28 | San Francisco 49ers | 4 | 71 | 1298 | 238 | 5.5 | 5 | 3 | 68 | ... | 541 | 4 | 4.4 | 29 | 24 | 190 | 8 | 27.3 | 11.4 | -1.21 |
28 | 29 | Los Angeles Rams | 4 | 70 | 1176 | 248 | 4.7 | 9 | 3 | 79 | ... | 274 | 3 | 3.3 | 17 | 14 | 131 | 6 | 35.9 | 23.1 | -8.08 |
29 | 30 | Denver Broncos | 4 | 66 | 1343 | 247 | 5.4 | 4 | 3 | 66 | ... | 438 | 2 | 4.2 | 20 | 37 | 286 | 7 | 31.8 | 9.1 | -6.92 |
30 | 31 | Chicago Bears | 4 | 64 | 1099 | 219 | 5.0 | 7 | 3 | 59 | ... | 709 | 4 | 5.2 | 34 | 21 | 171 | 8 | 31.8 | 13.6 | -5.87 |
31 | 32 | Indianapolis Colts | 4 | 57 | 1359 | 270 | 5.0 | 9 | 4 | 82 | ... | 351 | 1 | 3.5 | 22 | 23 | 203 | 5 | 25.0 | 20.5 | -18.74 |
32 rows × 28 columns
First, I need to prepare the data. The original dataset has separate columns for rushing touchdowns and passing touchdowns, so I add them together to get the total number of touchdowns each team scored.
w4["touchdowns"] = w4["passing_touchdowns"]+w4["rushing_touchdowns"]
w4
rank | team | games | points_scored | total_yards | offensive_plays | yards_per_play | turnovers_lost | fumbles_lost | 1st_downs | ... | rushing_touchdowns | rushing_yards_per_attempt | rushing_1st_downs | penalties | penalty_yards | 1st_down_penalties | percentage_scoring_drives | percentage_turnover_drives | expected_points | touchdowns | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Detroit Lions | 4 | 140 | 1747 | 269 | 6.5 | 4 | 1 | 90 | ... | 7 | 5.9 | 26 | 23 | 188 | 7 | 45.8 | 8.3 | 53.17 | 18 |
1 | 2 | Kansas City Chiefs | 4 | 129 | 1539 | 257 | 6.0 | 4 | 2 | 95 | ... | 4 | 4.5 | 23 | 19 | 156 | 10 | 50.0 | 7.1 | 57.85 | 15 |
2 | 3 | Baltimore Ravens | 4 | 119 | 1437 | 230 | 6.2 | 5 | 1 | 77 | ... | 3 | 5.4 | 31 | 17 | 114 | 7 | 42.2 | 11.1 | 32.97 | 14 |
3 | 4 | Philadelphia Eagles | 4 | 115 | 1742 | 285 | 6.1 | 2 | 0 | 98 | ... | 10 | 4.3 | 40 | 26 | 193 | 9 | 40.0 | 4.4 | 44.88 | 14 |
4 | 5 | Buffalo Bills | 4 | 114 | 1650 | 275 | 6.0 | 7 | 4 | 99 | ... | 2 | 4.8 | 29 | 24 | 167 | 7 | 47.5 | 17.5 | 49.56 | 12 |
5 | 6 | Cleveland Browns | 4 | 105 | 1539 | 281 | 5.5 | 3 | 1 | 96 | ... | 7 | 5.0 | 46 | 26 | 184 | 7 | 46.3 | 7.3 | 40.78 | 11 |
6 | 7 | Jacksonville Jaguars | 4 | 105 | 1346 | 250 | 5.4 | 6 | 4 | 83 | ... | 3 | 4.0 | 23 | 22 | 157 | 8 | 40.9 | 13.6 | 17.69 | 11 |
7 | 8 | Atlanta Falcons | 4 | 103 | 1396 | 236 | 5.9 | 8 | 4 | 86 | ... | 6 | 5.1 | 36 | 16 | 114 | 12 | 44.2 | 18.6 | 30.77 | 9 |
8 | 9 | Miami Dolphins | 4 | 98 | 1444 | 227 | 6.4 | 4 | 0 | 80 | ... | 2 | 3.5 | 15 | 22 | 132 | 9 | 42.1 | 10.5 | 46.01 | 11 |
9 | 10 | Las Vegas Raiders | 4 | 96 | 1425 | 256 | 5.6 | 5 | 1 | 87 | ... | 2 | 5.0 | 26 | 23 | 148 | 5 | 50.0 | 12.5 | 23.26 | 8 |
10 | 11 | Seattle Seahawks | 4 | 95 | 1444 | 228 | 6.3 | 6 | 3 | 83 | ... | 3 | 5.2 | 26 | 32 | 306 | 9 | 44.1 | 11.8 | 42.94 | 9 |
11 | 12 | Los Angeles Chargers | 4 | 92 | 1487 | 264 | 5.6 | 4 | 2 | 77 | ... | 2 | 2.7 | 17 | 21 | 161 | 5 | 36.4 | 6.8 | 17.02 | 11 |
12 | 13 | Cincinnati Bengals | 4 | 91 | 1387 | 290 | 4.8 | 6 | 2 | 87 | ... | 1 | 3.1 | 18 | 22 | 166 | 11 | 37.5 | 12.5 | 6.37 | 9 |
13 | 14 | Arizona Cardinals | 4 | 88 | 1398 | 292 | 4.8 | 2 | 0 | 89 | ... | 4 | 4.1 | 33 | 30 | 280 | 6 | 36.6 | 4.9 | 8.95 | 9 |
14 | 15 | Minnesota Vikings | 4 | 86 | 1376 | 254 | 5.4 | 5 | 1 | 92 | ... | 3 | 4.4 | 22 | 16 | 95 | 11 | 38.6 | 11.4 | 20.12 | 9 |
15 | 16 | Tampa Bay Buccaneers | 4 | 82 | 1268 | 245 | 5.2 | 6 | 5 | 77 | ... | 1 | 3.1 | 13 | 24 | 206 | 9 | 34.8 | 10.9 | -3.64 | 7 |
16 | 17 | Carolina Panthers | 4 | 78 | 1049 | 214 | 4.9 | 6 | 3 | 56 | ... | 2 | 4.5 | 22 | 23 | 250 | 7 | 31.1 | 11.1 | -25.26 | 6 |
17 | 18 | New Orleans Saints | 4 | 76 | 1457 | 244 | 6.0 | 11 | 6 | 74 | ... | 4 | 5.1 | 23 | 34 | 319 | 5 | 27.1 | 20.8 | -11.00 | 9 |
18 | 19 | New York Giants | 4 | 76 | 1328 | 256 | 5.2 | 5 | 2 | 80 | ... | 4 | 5.7 | 43 | 31 | 219 | 9 | 35.6 | 8.9 | 6.34 | 7 |
19 | 20 | New York Jets | 4 | 76 | 1458 | 289 | 5.0 | 9 | 4 | 82 | ... | 1 | 4.1 | 15 | 25 | 257 | 8 | 32.6 | 19.6 | -10.79 | 8 |
20 | 21 | Green Bay Packers | 4 | 75 | 1510 | 259 | 5.8 | 7 | 4 | 85 | ... | 3 | 5.0 | 34 | 17 | 140 | 5 | 28.3 | 15.2 | 16.07 | 9 |
21 | 22 | Tennessee Titans | 4 | 75 | 1150 | 220 | 5.2 | 6 | 3 | 69 | ... | 4 | 3.8 | 19 | 29 | 262 | 9 | 31.0 | 9.5 | 2.39 | 9 |
22 | 23 | Pittsburgh Steelers | 4 | 74 | 1115 | 234 | 4.8 | 7 | 2 | 69 | ... | 4 | 4.0 | 25 | 24 | 159 | 4 | 30.4 | 13.0 | -15.59 | 6 |
23 | 24 | New England Patriots | 4 | 74 | 1365 | 241 | 5.7 | 9 | 4 | 74 | ... | 5 | 4.5 | 33 | 20 | 161 | 3 | 28.6 | 21.4 | -0.64 | 8 |
24 | 25 | Houston Texans | 4 | 73 | 1208 | 237 | 5.1 | 5 | 1 | 68 | ... | 2 | 4.5 | 19 | 26 | 226 | 11 | 31.3 | 10.4 | -12.95 | 7 |
25 | 26 | Washington Commanders | 4 | 73 | 1323 | 287 | 4.6 | 7 | 2 | 84 | ... | 2 | 4.1 | 24 | 23 | 264 | 8 | 21.6 | 13.7 | -16.38 | 10 |
26 | 27 | Dallas Cowboys | 4 | 71 | 1251 | 241 | 5.2 | 2 | 1 | 69 | ... | 2 | 4.0 | 22 | 27 | 208 | 7 | 35.6 | 4.4 | 1.65 | 6 |
27 | 28 | San Francisco 49ers | 4 | 71 | 1298 | 238 | 5.5 | 5 | 3 | 68 | ... | 4 | 4.4 | 29 | 24 | 190 | 8 | 27.3 | 11.4 | -1.21 | 7 |
28 | 29 | Los Angeles Rams | 4 | 70 | 1176 | 248 | 4.7 | 9 | 3 | 79 | ... | 3 | 3.3 | 17 | 14 | 131 | 6 | 35.9 | 23.1 | -8.08 | 7 |
29 | 30 | Denver Broncos | 4 | 66 | 1343 | 247 | 5.4 | 4 | 3 | 66 | ... | 2 | 4.2 | 20 | 37 | 286 | 7 | 31.8 | 9.1 | -6.92 | 6 |
30 | 31 | Chicago Bears | 4 | 64 | 1099 | 219 | 5.0 | 7 | 3 | 59 | ... | 4 | 5.2 | 34 | 21 | 171 | 8 | 31.8 | 13.6 | -5.87 | 6 |
31 | 32 | Indianapolis Colts | 4 | 57 | 1359 | 270 | 5.0 | 9 | 4 | 82 | ... | 1 | 3.5 | 22 | 23 | 203 | 5 | 25.0 | 20.5 | -18.74 | 6 |
32 rows × 29 columns
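# Combined giveaway count. Caution: if turnovers_lost already includes fumbles lost, this sum counts fumbles twice.
w4["lost"] = w4["turnovers_lost"]+w4["fumbles_lost"]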
w4
rank | team | games | points_scored | total_yards | offensive_plays | yards_per_play | turnovers_lost | fumbles_lost | 1st_downs | ... | rushing_yards_per_attempt | rushing_1st_downs | penalties | penalty_yards | 1st_down_penalties | percentage_scoring_drives | percentage_turnover_drives | expected_points | touchdowns | lost | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | Detroit Lions | 4 | 140 | 1747 | 269 | 6.5 | 4 | 1 | 90 | ... | 5.9 | 26 | 23 | 188 | 7 | 45.8 | 8.3 | 53.17 | 18 | 5 |
1 | 2 | Kansas City Chiefs | 4 | 129 | 1539 | 257 | 6.0 | 4 | 2 | 95 | ... | 4.5 | 23 | 19 | 156 | 10 | 50.0 | 7.1 | 57.85 | 15 | 6 |
2 | 3 | Baltimore Ravens | 4 | 119 | 1437 | 230 | 6.2 | 5 | 1 | 77 | ... | 5.4 | 31 | 17 | 114 | 7 | 42.2 | 11.1 | 32.97 | 14 | 6 |
3 | 4 | Philadelphia Eagles | 4 | 115 | 1742 | 285 | 6.1 | 2 | 0 | 98 | ... | 4.3 | 40 | 26 | 193 | 9 | 40.0 | 4.4 | 44.88 | 14 | 2 |
4 | 5 | Buffalo Bills | 4 | 114 | 1650 | 275 | 6.0 | 7 | 4 | 99 | ... | 4.8 | 29 | 24 | 167 | 7 | 47.5 | 17.5 | 49.56 | 12 | 11 |
5 | 6 | Cleveland Browns | 4 | 105 | 1539 | 281 | 5.5 | 3 | 1 | 96 | ... | 5.0 | 46 | 26 | 184 | 7 | 46.3 | 7.3 | 40.78 | 11 | 4 |
6 | 7 | Jacksonville Jaguars | 4 | 105 | 1346 | 250 | 5.4 | 6 | 4 | 83 | ... | 4.0 | 23 | 22 | 157 | 8 | 40.9 | 13.6 | 17.69 | 11 | 10 |
7 | 8 | Atlanta Falcons | 4 | 103 | 1396 | 236 | 5.9 | 8 | 4 | 86 | ... | 5.1 | 36 | 16 | 114 | 12 | 44.2 | 18.6 | 30.77 | 9 | 12 |
8 | 9 | Miami Dolphins | 4 | 98 | 1444 | 227 | 6.4 | 4 | 0 | 80 | ... | 3.5 | 15 | 22 | 132 | 9 | 42.1 | 10.5 | 46.01 | 11 | 4 |
9 | 10 | Las Vegas Raiders | 4 | 96 | 1425 | 256 | 5.6 | 5 | 1 | 87 | ... | 5.0 | 26 | 23 | 148 | 5 | 50.0 | 12.5 | 23.26 | 8 | 6 |
10 | 11 | Seattle Seahawks | 4 | 95 | 1444 | 228 | 6.3 | 6 | 3 | 83 | ... | 5.2 | 26 | 32 | 306 | 9 | 44.1 | 11.8 | 42.94 | 9 | 9 |
11 | 12 | Los Angeles Chargers | 4 | 92 | 1487 | 264 | 5.6 | 4 | 2 | 77 | ... | 2.7 | 17 | 21 | 161 | 5 | 36.4 | 6.8 | 17.02 | 11 | 6 |
12 | 13 | Cincinnati Bengals | 4 | 91 | 1387 | 290 | 4.8 | 6 | 2 | 87 | ... | 3.1 | 18 | 22 | 166 | 11 | 37.5 | 12.5 | 6.37 | 9 | 8 |
13 | 14 | Arizona Cardinals | 4 | 88 | 1398 | 292 | 4.8 | 2 | 0 | 89 | ... | 4.1 | 33 | 30 | 280 | 6 | 36.6 | 4.9 | 8.95 | 9 | 2 |
14 | 15 | Minnesota Vikings | 4 | 86 | 1376 | 254 | 5.4 | 5 | 1 | 92 | ... | 4.4 | 22 | 16 | 95 | 11 | 38.6 | 11.4 | 20.12 | 9 | 6 |
15 | 16 | Tampa Bay Buccaneers | 4 | 82 | 1268 | 245 | 5.2 | 6 | 5 | 77 | ... | 3.1 | 13 | 24 | 206 | 9 | 34.8 | 10.9 | -3.64 | 7 | 11 |
16 | 17 | Carolina Panthers | 4 | 78 | 1049 | 214 | 4.9 | 6 | 3 | 56 | ... | 4.5 | 22 | 23 | 250 | 7 | 31.1 | 11.1 | -25.26 | 6 | 9 |
17 | 18 | New Orleans Saints | 4 | 76 | 1457 | 244 | 6.0 | 11 | 6 | 74 | ... | 5.1 | 23 | 34 | 319 | 5 | 27.1 | 20.8 | -11.00 | 9 | 17 |
18 | 19 | New York Giants | 4 | 76 | 1328 | 256 | 5.2 | 5 | 2 | 80 | ... | 5.7 | 43 | 31 | 219 | 9 | 35.6 | 8.9 | 6.34 | 7 | 7 |
19 | 20 | New York Jets | 4 | 76 | 1458 | 289 | 5.0 | 9 | 4 | 82 | ... | 4.1 | 15 | 25 | 257 | 8 | 32.6 | 19.6 | -10.79 | 8 | 13 |
20 | 21 | Green Bay Packers | 4 | 75 | 1510 | 259 | 5.8 | 7 | 4 | 85 | ... | 5.0 | 34 | 17 | 140 | 5 | 28.3 | 15.2 | 16.07 | 9 | 11 |
21 | 22 | Tennessee Titans | 4 | 75 | 1150 | 220 | 5.2 | 6 | 3 | 69 | ... | 3.8 | 19 | 29 | 262 | 9 | 31.0 | 9.5 | 2.39 | 9 | 9 |
22 | 23 | Pittsburgh Steelers | 4 | 74 | 1115 | 234 | 4.8 | 7 | 2 | 69 | ... | 4.0 | 25 | 24 | 159 | 4 | 30.4 | 13.0 | -15.59 | 6 | 9 |
23 | 24 | New England Patriots | 4 | 74 | 1365 | 241 | 5.7 | 9 | 4 | 74 | ... | 4.5 | 33 | 20 | 161 | 3 | 28.6 | 21.4 | -0.64 | 8 | 13 |
24 | 25 | Houston Texans | 4 | 73 | 1208 | 237 | 5.1 | 5 | 1 | 68 | ... | 4.5 | 19 | 26 | 226 | 11 | 31.3 | 10.4 | -12.95 | 7 | 6 |
25 | 26 | Washington Commanders | 4 | 73 | 1323 | 287 | 4.6 | 7 | 2 | 84 | ... | 4.1 | 24 | 23 | 264 | 8 | 21.6 | 13.7 | -16.38 | 10 | 9 |
26 | 27 | Dallas Cowboys | 4 | 71 | 1251 | 241 | 5.2 | 2 | 1 | 69 | ... | 4.0 | 22 | 27 | 208 | 7 | 35.6 | 4.4 | 1.65 | 6 | 3 |
27 | 28 | San Francisco 49ers | 4 | 71 | 1298 | 238 | 5.5 | 5 | 3 | 68 | ... | 4.4 | 29 | 24 | 190 | 8 | 27.3 | 11.4 | -1.21 | 7 | 8 |
28 | 29 | Los Angeles Rams | 4 | 70 | 1176 | 248 | 4.7 | 9 | 3 | 79 | ... | 3.3 | 17 | 14 | 131 | 6 | 35.9 | 23.1 | -8.08 | 7 | 12 |
29 | 30 | Denver Broncos | 4 | 66 | 1343 | 247 | 5.4 | 4 | 3 | 66 | ... | 4.2 | 20 | 37 | 286 | 7 | 31.8 | 9.1 | -6.92 | 6 | 7 |
30 | 31 | Chicago Bears | 4 | 64 | 1099 | 219 | 5.0 | 7 | 3 | 59 | ... | 5.2 | 34 | 21 | 171 | 8 | 31.8 | 13.6 | -5.87 | 6 | 10 |
31 | 32 | Indianapolis Colts | 4 | 57 | 1359 | 270 | 5.0 | 9 | 4 | 82 | ... | 3.5 | 22 | 23 | 203 | 5 | 25.0 | 20.5 | -18.74 | 6 | 13 |
32 rows × 30 columns
Using an Altair chart to show the relationship between offensive performance and rank#
c1 = alt.Chart(w4).mark_circle().encode(
    x="yards_per_play",
    y="points_scored",
    color=alt.Color("rank", scale=alt.Scale(scheme="goldgreen")),
    tooltip=["team", "yards_per_play", "points_scored"]
)
c1
w4.columns
Index(['rank', 'team', 'games', 'points_scored', 'total_yards',
'offensive_plays', 'yards_per_play', 'turnovers_lost', 'fumbles_lost',
'1st_downs', 'passes_completed', 'passes_attempted', 'passing_yards',
'passing_touchdowns', 'passing_interceptions',
'net_yards_per_pass_attempt', 'passing_1st_downs', 'rushing_attempts',
'rushing_yards', 'rushing_touchdowns', 'rushing_yards_per_attempt',
'rushing_1st_downs', 'penalties', 'penalty_yards', '1st_down_penalties',
'percentage_scoring_drives', 'percentage_turnover_drives',
'expected_points', 'touchdowns', 'lost'],
dtype='object')
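In the Altair chart above, lighter yellow marks better-ranked teams (rank closer to 1) and darker green marks worse-ranked teams, with intermediate shades in between. Because yellow and green are easy to tell apart, the relationship between rank and offensive performance is visible at a glance: teams with more yards per play tend to score more points. There are exceptions. The Miami Dolphins have the second-highest yards_per_play in the league (6.4, behind only Detroit's 6.5), yet they rank only 9th.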
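To put a number on what the chart shows, the sketch below computes the Pearson correlation between yards_per_play and points_scored, using the 32 values printed in the table above. The names `sample` and `ypp_rank` are introduced here for illustration only and are not part of the original notebook.

```python
import pandas as pd

# Values copied from the w4 table above, in rank order (index 0 = rank 1).
sample = pd.DataFrame({
    "points_scored": [140, 129, 119, 115, 114, 105, 105, 103, 98, 96, 95,
                      92, 91, 88, 86, 82, 78, 76, 76, 76, 75, 75, 74, 74,
                      73, 73, 71, 71, 70, 66, 64, 57],
    "yards_per_play": [6.5, 6.0, 6.2, 6.1, 6.0, 5.5, 5.4, 5.9, 6.4, 5.6,
                       6.3, 5.6, 4.8, 4.8, 5.4, 5.2, 4.9, 6.0, 5.2, 5.0,
                       5.8, 5.2, 4.8, 5.7, 5.1, 4.6, 5.2, 5.5, 4.7, 5.4,
                       5.0, 5.0],
})

# Pearson correlation: values near +1 mean the two columns rise together.
r = sample["yards_per_play"].corr(sample["points_scored"])
print(f"Pearson r between yards_per_play and points_scored: {r:.2f}")

# Rank teams by yards_per_play (1 = most yards per play) to spot teams
# whose scoring rank differs from their efficiency rank.
sample["ypp_rank"] = sample["yards_per_play"].rank(ascending=False, method="min")
print(sample.loc[8])  # index 8 is the Miami Dolphins row
```

A single correlation coefficient summarizes the overall trend, while the per-team rank comparison highlights individual outliers such as Miami.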
Using a decision tree classifier to predict touchdowns#
I want to find the relationship between yards per play and a team's touchdowns, so I use a decision tree to predict touchdowns from offensive statistics.
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from pandas.api.types import is_numeric_dtype
w4_1 = w4[["points_scored","yards_per_play","offensive_plays","passes_completed","passes_attempted","touchdowns"]]
w4_1
points_scored | yards_per_play | offensive_plays | passes_completed | passes_attempted | touchdowns | |
---|---|---|---|---|---|---|
0 | 140 | 6.5 | 269 | 93 | 152 | 18 |
1 | 129 | 6.0 | 257 | 97 | 147 | 15 |
2 | 119 | 6.2 | 230 | 76 | 117 | 14 |
3 | 115 | 6.1 | 285 | 82 | 123 | 14 |
4 | 114 | 6.0 | 275 | 113 | 170 | 12 |
5 | 105 | 5.5 | 281 | 82 | 127 | 11 |
6 | 105 | 5.4 | 250 | 88 | 134 | 11 |
7 | 103 | 5.9 | 236 | 57 | 98 | 9 |
8 | 98 | 6.4 | 227 | 94 | 140 | 11 |
9 | 96 | 5.6 | 256 | 95 | 155 | 8 |
10 | 95 | 6.3 | 228 | 102 | 133 | 9 |
11 | 92 | 5.6 | 264 | 111 | 166 | 11 |
12 | 91 | 4.8 | 290 | 101 | 157 | 9 |
13 | 88 | 4.8 | 292 | 115 | 177 | 9 |
14 | 86 | 5.4 | 254 | 100 | 158 | 9 |
15 | 82 | 5.2 | 245 | 106 | 155 | 7 |
16 | 78 | 4.9 | 214 | 64 | 117 | 6 |
17 | 76 | 6.0 | 244 | 93 | 143 | 9 |
18 | 76 | 5.2 | 256 | 68 | 108 | 7 |
19 | 76 | 5.0 | 289 | 111 | 193 | 8 |
20 | 75 | 5.8 | 259 | 93 | 134 | 9 |
21 | 75 | 5.2 | 220 | 68 | 105 | 9 |
22 | 74 | 4.8 | 234 | 79 | 129 | 6 |
23 | 74 | 5.7 | 241 | 79 | 118 | 8 |
24 | 73 | 5.1 | 237 | 88 | 142 | 7 |
25 | 73 | 4.6 | 287 | 107 | 172 | 10 |
26 | 71 | 5.2 | 241 | 76 | 131 | 6 |
27 | 71 | 5.5 | 238 | 62 | 108 | 7 |
28 | 70 | 4.7 | 248 | 106 | 150 | 7 |
29 | 66 | 5.4 | 247 | 80 | 131 | 6 |
30 | 64 | 5.0 | 219 | 34 | 67 | 6 |
31 | 57 | 5.0 | 270 | 102 | 154 | 6 |
num_cols = [c for c in w4_1.columns if is_numeric_dtype(w4[c])]
num_cols
['points_scored',
'yards_per_play',
'offensive_plays',
'passes_completed',
'passes_attempted',
'touchdowns']
features = [c for c in w4_1 if c != "touchdowns"]
features
['points_scored',
'yards_per_play',
'offensive_plays',
'passes_completed',
'passes_attempted']
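x = w4_1[features]
y = w4_1["touchdowns"]
# Note: the split below is given all of w4_1, so the "touchdowns" target column is also among the features (x and y above are not used).
X_train, X_test, y_train, y_test = train_test_split(w4_1, w4_1["touchdowns"], test_size=0.2, random_state=0)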
clf = DecisionTreeClassifier(max_depth=6)
X_test
y_test
11 11
22 6
10 9
2 14
16 6
14 9
28 7
Name: touchdowns, dtype: int64
X_train
y_train
26 6
20 9
13 9
24 7
5 11
17 9
8 11
30 6
25 10
23 8
1 15
31 6
6 11
4 12
18 7
29 6
19 8
9 8
7 9
27 7
3 14
0 18
21 9
15 7
12 9
Name: touchdowns, dtype: int64
clf.fit(X_train, y_train)
DecisionTreeClassifier(max_depth=6)
clf.score(X_train, y_train)
0.96
clf.score(X_test, y_test)
0.8571428571428571
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(9,15))
plot_tree(
    clf,
    feature_names=clf.feature_names_in_,
    filled=True
);
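With max_depth set to 6, the training accuracy is 0.96 and the test accuracy is about 0.86. The gap is modest, so the model does not appear to overfit badly. These scores should be read with caution, though: the feature matrix passed to train_test_split included the touchdowns column itself, which is why the tree plot shows splits such as touchdowns <= 9.5, and the test set contains only 7 teams. The decision tree plot above shows the rule applied at each node.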
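One way to check for overfitting without the target among the features is sketched below. It rebuilds the w4_1 table from the values printed above, keeps only the five feature columns in X, and compares training and test accuracy at several tree depths. The list of depths and the `random_state` passed to the classifier are assumptions made for this illustration, not settings from the original notebook.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

cols = ["points_scored", "yards_per_play", "offensive_plays",
        "passes_completed", "passes_attempted", "touchdowns"]
# Rows copied from the w4_1 table above.
rows = [
    (140, 6.5, 269, 93, 152, 18), (129, 6.0, 257, 97, 147, 15),
    (119, 6.2, 230, 76, 117, 14), (115, 6.1, 285, 82, 123, 14),
    (114, 6.0, 275, 113, 170, 12), (105, 5.5, 281, 82, 127, 11),
    (105, 5.4, 250, 88, 134, 11), (103, 5.9, 236, 57, 98, 9),
    (98, 6.4, 227, 94, 140, 11), (96, 5.6, 256, 95, 155, 8),
    (95, 6.3, 228, 102, 133, 9), (92, 5.6, 264, 111, 166, 11),
    (91, 4.8, 290, 101, 157, 9), (88, 4.8, 292, 115, 177, 9),
    (86, 5.4, 254, 100, 158, 9), (82, 5.2, 245, 106, 155, 7),
    (78, 4.9, 214, 64, 117, 6), (76, 6.0, 244, 93, 143, 9),
    (76, 5.2, 256, 68, 108, 7), (76, 5.0, 289, 111, 193, 8),
    (75, 5.8, 259, 93, 134, 9), (75, 5.2, 220, 68, 105, 9),
    (74, 4.8, 234, 79, 129, 6), (74, 5.7, 241, 79, 118, 8),
    (73, 5.1, 237, 88, 142, 7), (73, 4.6, 287, 107, 172, 10),
    (71, 5.2, 241, 76, 131, 6), (71, 5.5, 238, 62, 108, 7),
    (70, 4.7, 248, 106, 150, 7), (66, 5.4, 247, 80, 131, 6),
    (64, 5.0, 219, 34, 67, 6), (57, 5.0, 270, 102, 154, 6),
]
df = pd.DataFrame(rows, columns=cols)

# Features exclude the target, so the tree cannot simply read the answer.
features = [c for c in cols if c != "touchdowns"]
X = df[features]
y = df["touchdowns"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Compare train and test accuracy as the tree is allowed to grow deeper.
scores = {}
for depth in (1, 2, 3, 6):
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0)
    clf.fit(X_train, y_train)
    scores[depth] = (clf.score(X_train, y_train), clf.score(X_test, y_test))
    print(f"max_depth={depth}: train={scores[depth][0]:.2f}, "
          f"test={scores[depth][1]:.2f}")
```

A training score that keeps climbing with depth while the test score stalls or drops is the usual sign of overfitting. With only 32 teams, any single split is noisy, so the numbers are best treated as a rough indication.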
Using K-Neighbors to predict#
This section goes beyond the Math 10 material. K-nearest neighbors is a supervised classifier that predicts the label of an individual data point from the labels of the training points closest to it, so I think it is a reasonable tool for analyzing NFL data, even though people call it a “lazy” learner.
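The idea behind the classifier can be sketched in a few lines of NumPy: measure the distance from a new point to every training point, keep the k closest, and take a majority vote of their labels. The function `knn_predict` and the toy arrays below are hypothetical and only illustrate the technique. They are not how scikit-learn implements KNeighborsClassifier internally.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k=3):
    """Predict the label of x_new by majority vote of its k nearest neighbors."""
    # Euclidean distance from x_new to every training point.
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k smallest distances.
    nearest = np.argsort(distances)[:k]
    # Majority vote among the labels of those neighbors.
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data (made up): two well-separated groups of points.
X_toy = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                  [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_toy = np.array(["low", "low", "low", "high", "high", "high"])

print(knn_predict(X_toy, y_toy, np.array([0.15, 0.10])))  # prints "low"
print(knn_predict(X_toy, y_toy, np.array([0.95, 1.00])))  # prints "high"
```

The method is called "lazy" because nothing is learned up front: all the work happens at prediction time, when the distances are computed. Because it relies on distances, scaling the features first matters.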
from sklearn.neighbors import KNeighborsClassifier
# Define the features and target first, then scale the features
X = w4[["rushing_attempts","passes_attempted"]]
y = w4["points_scored"]
scaler = StandardScaler()
scaler.fit(X)
X_scaled = scaler.transform(X)
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, train_size=0.6, random_state=4)
clf2 = KNeighborsClassifier()
clf2.fit(X_train, y_train)
KNeighborsClassifier()
w4_1["pred"] = clf2.predict(X_scaled)
clf2.fit(X_train, y_train)
/shared-libs/python3.7/py-core/lib/python3.7/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
"""Entry point for launching an IPython kernel.
KNeighborsClassifier()
w4_1
points_scored | yards_per_play | offensive_plays | passes_completed | passes_attempted | touchdowns | pred_1 | predict | pred | |
---|---|---|---|---|---|---|---|---|---|
0 | 140 | 6.5 | 269 | 93 | 152 | 18 | 57 | 57 | 57 |
1 | 129 | 6.0 | 257 | 97 | 147 | 15 | 57 | 57 | 57 |
2 | 119 | 6.2 | 230 | 76 | 117 | 14 | 66 | 66 | 66 |
3 | 115 | 6.1 | 285 | 82 | 123 | 14 | 105 | 105 | 105 |
4 | 114 | 6.0 | 275 | 113 | 170 | 12 | 57 | 57 | 57 |
5 | 105 | 5.5 | 281 | 82 | 127 | 11 | 74 | 74 | 74 |
6 | 105 | 5.4 | 250 | 88 | 134 | 11 | 66 | 66 | 66 |
7 | 103 | 5.9 | 236 | 57 | 98 | 9 | 64 | 64 | 64 |
8 | 98 | 6.4 | 227 | 94 | 140 | 11 | 73 | 73 | 73 |
9 | 96 | 5.6 | 256 | 95 | 155 | 8 | 57 | 57 | 57 |
10 | 95 | 6.3 | 228 | 102 | 133 | 9 | 66 | 66 | 66 |
11 | 92 | 5.6 | 264 | 111 | 166 | 11 | 57 | 57 | 57 |
12 | 91 | 4.8 | 290 | 101 | 157 | 9 | 57 | 57 | 57 |
13 | 88 | 4.8 | 292 | 115 | 177 | 9 | 57 | 57 | 57 |
14 | 86 | 5.4 | 254 | 100 | 158 | 9 | 57 | 57 | 57 |
15 | 82 | 5.2 | 245 | 106 | 155 | 7 | 73 | 73 | 73 |
16 | 78 | 4.9 | 214 | 64 | 117 | 6 | 66 | 66 | 66 |
17 | 76 | 6.0 | 244 | 93 | 143 | 9 | 73 | 73 | 73 |
18 | 76 | 5.2 | 256 | 68 | 108 | 7 | 64 | 64 | 64 |
19 | 76 | 5.0 | 289 | 111 | 193 | 8 | 73 | 73 | 73 |
20 | 75 | 5.8 | 259 | 93 | 134 | 9 | 66 | 66 | 66 |
21 | 75 | 5.2 | 220 | 68 | 105 | 9 | 66 | 66 | 66 |
22 | 74 | 4.8 | 234 | 79 | 129 | 6 | 66 | 66 | 66 |
23 | 74 | 5.7 | 241 | 79 | 118 | 8 | 66 | 66 | 66 |
24 | 73 | 5.1 | 237 | 88 | 142 | 7 | 73 | 73 | 73 |
25 | 73 | 4.6 | 287 | 107 | 172 | 10 | 57 | 57 | 57 |
26 | 71 | 5.2 | 241 | 76 | 131 | 6 | 66 | 66 | 66 |
27 | 71 | 5.5 | 238 | 62 | 108 | 7 | 66 | 66 | 66 |
28 | 70 | 4.7 | 248 | 106 | 150 | 7 | 73 | 73 | 73 |
29 | 66 | 5.4 | 247 | 80 | 131 | 6 | 66 | 66 | 66 |
30 | 64 | 5.0 | 219 | 34 | 67 | 6 | 64 | 64 | 64 |
31 | 57 | 5.0 | 270 | 102 | 154 | 6 | 57 | 57 | 57 |
c3 = alt.Chart(w4_1).mark_circle().encode(
    x="passes_completed",
    y="passes_attempted",
    # "pred" holds the classifier's predicted points_scored, not the rank
    color=alt.Color("pred", title="predicted points_scored"),
    tooltip=["passes_attempted", "touchdowns", "points_scored", "yards_per_play", "passes_completed"]
).properties(
    title="Passing",
    width=400,
    height=400,
)
c3
This chart plots passes_completed against passes_attempted for each team, colored by the KNeighborsClassifier prediction. Most teams complete a similar share of their passes (roughly 55% to 70%), so the points fall close to a straight line. Two teams stand out. On the left side, the Chicago Bears have by far the fewest attempts and complete only about 50% of them (34 of 67). On the right side, the New York Jets have the most attempts in the league, 193, with 111 completed (about 58%).
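The completion rates discussed above can be computed directly. The sketch below uses five rows taken from the tables earlier in this notebook. The DataFrame name `passing` and the `completion_rate` column are introduced here for illustration.

```python
import pandas as pd

# A few teams copied from the tables above (completions, attempts).
passing = pd.DataFrame({
    "team": ["Detroit Lions", "Kansas City Chiefs", "Seattle Seahawks",
             "New York Jets", "Chicago Bears"],
    "passes_completed": [93, 97, 102, 111, 34],
    "passes_attempted": [152, 147, 133, 193, 67],
})

# Completion rate = completed passes divided by attempted passes.
passing["completion_rate"] = (
    passing["passes_completed"] / passing["passes_attempted"]
)

# Sort from lowest to highest rate to see which teams sit off the line.
print(passing.sort_values("completion_rate").round(3))
```

Plotting this ratio instead of the two raw counts would make the outliers easier to see than reading them off the diagonal.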
Summary#
In this project, I used an Altair chart, a decision tree, and a K-nearest neighbors classifier to analyze NFL teams' offensive performance. I also compared the decision tree's training and test accuracy to check whether it is overfitting, which is very important in machine learning.
References#
Source of the dataset: https://www.kaggle.com/datasets/kendallgillies/nflstatistics
Other references that I found helpful:
https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html
https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html