Week 2 Videos
List comprehension, Part 1#
Make a length 8 list of all 6s using list comprehension.
mylist = []
for i in range(8):
    mylist.append(6)
mylist
[6, 6, 6, 6, 6, 6, 6, 6]
[6 for i in range(8)]
[6, 6, 6, 6, 6, 6, 6, 6]
Let mylist = [3,1,-2,10,-5,3,6,2,8]. Square each element in mylist.
mylist = [3,1,-2,10,-5,3,6,2,8]
[x**2 for x in mylist]
[9, 1, 4, 100, 25, 9, 36, 4, 64]
Get the sublist of mylist containing only those numbers \(x\) satisfying \(-2 < x \leq 3\).
[x for x in mylist if (x > -2) and (x <= 3)]
[3, 1, 3, 2]
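Python also allows the two comparisons to be chained, which mirrors the notation \(-2 < x \leq 3\) more closely. The sketch below reuses the same mylist and is an equivalent way to write the filter; the names explicit and chained are only for illustration.

```python
mylist = [3, 1, -2, 10, -5, 3, 6, 2, 8]

# Explicit version, as in the example above.
explicit = [x for x in mylist if (x > -2) and (x <= 3)]

# Chained comparison: -2 < x <= 3 means (-2 < x) and (x <= 3).
chained = [x for x in mylist if -2 < x <= 3]

print(chained)              # [3, 1, 3, 2]
print(chained == explicit)  # True
```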
Replace each negative number in mylist with 0.
[0 if x < 0 else x for x in mylist]
[3, 1, 0, 10, 0, 3, 6, 2, 8]
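The last two examples place if in different positions, and the position changes the meaning. A trailing if filters elements out, so the result can be shorter; an if/else before for is a conditional expression that chooses a value for every element, so the length is unchanged. A minimal sketch contrasting the two on the same list (the variable names are illustrative):

```python
mylist = [3, 1, -2, 10, -5, 3, 6, 2, 8]

# Trailing "if": keeps only the non-negative numbers (negatives are dropped).
kept = [x for x in mylist if x >= 0]

# "if/else" before "for": every element yields a value (negatives become 0).
replaced = [0 if x < 0 else x for x in mylist]

print(kept)      # [3, 1, 10, 3, 6, 2, 8]
print(replaced)  # [3, 1, 0, 10, 0, 3, 6, 2, 8]
print(len(kept), len(replaced))  # 7 9
```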
List comprehension, Part 2#
Make the length-8 list of lists [[0,1,2], [0,1,2], ..., [0,1,2]], then convert it to a pandas DataFrame.
mylist = [[0,1,2] for _ in range(8)]
mylist
[[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2],
[0, 1, 2]]
import pandas as pd
pd.DataFrame(mylist)
 | 0 | 1 | 2 |
---|---|---|---|
0 | 0 | 1 | 2 |
1 | 0 | 1 | 2 |
2 | 0 | 1 | 2 |
3 | 0 | 1 | 2 |
4 | 0 | 1 | 2 |
5 | 0 | 1 | 2 |
6 | 0 | 1 | 2 |
7 | 0 | 1 | 2 |
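A tempting shortcut for the list of lists is [[0,1,2]] * 8, but that repeats a reference to one inner list rather than creating eight separate lists. The sketch below illustrates the difference; it is not from the video, and the names shared and independent are illustrative.

```python
# Eight references to the same inner list.
shared = [[0, 1, 2]] * 8
shared[0][0] = 99
print(shared[1])  # [99, 1, 2]  (every "row" changed)

# The comprehension evaluates [0, 1, 2] anew on each iteration.
independent = [[0, 1, 2] for _ in range(8)]
independent[0][0] = 99
print(independent[1])  # [0, 1, 2]
```

This matters as soon as individual rows are modified; for a read-only list that is passed straight to pd.DataFrame, both forms produce the same table.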
Make the length-24 list [0,1,2,0,1,2,...,0,1,2].
mylist = []
for i in range(8):
    for j in range(3):
        mylist.append(j)
mylist
[0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
[j for i in range(8) for j in range(3)]
[0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
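In a comprehension with two for clauses, the clauses are read left to right in the same order as the nested loops: the first clause is the outer loop and the second is the inner loop. A short sketch of what happens when the order is swapped (the names cycled and grouped are illustrative):

```python
# Same order as the nested loops: i is the outer loop, j the inner loop.
cycled = [j for i in range(8) for j in range(3)]

# Swapping the clauses makes j the outer loop, so each value repeats 8 times.
grouped = [j for j in range(3) for i in range(8)]

print(cycled[:6])   # [0, 1, 2, 0, 1, 2]
print(grouped[:9])  # [0, 0, 0, 0, 0, 0, 0, 0, 1]

# For this particular list, list repetition gives the same result.
print(cycled == [0, 1, 2] * 8)  # True
```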
Capitalize each word in the Math 10 course catalogue description:
Introduction to Python for data science. Selecting appropriate data types; functions and methods; plotting; the libraries NumPy, pandas, scikit-learn. Foundations of machine learning.
desc = "Introduction to Python for data science. Selecting appropriate data types; functions and methods; plotting;
the libraries NumPy, pandas, scikit-learn. Foundations of machine learning."
Cell In [23], line 1
desc = "Introduction to Python for data science. Selecting appropriate data types; functions and methods; plotting;
^
SyntaxError: EOL while scanning string literal
The double-quoted string above is cut off by the line break, which Python does not allow inside a single- or double-quoted literal. Triple quotes let the string span several lines:
desc = '''Introduction to Python for data science. Selecting appropriate data types; functions and methods; plotting;
the libraries NumPy, pandas, scikit-learn. Foundations of machine learning.'''
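With triple quotes, the line break becomes a newline character inside the string (it appears as '\n' when the characters are listed). If the break is only there to keep the source line short, adjacent string literals inside parentheses are an alternative, since Python concatenates them without inserting anything. A small sketch using a shortened piece of the description (the names triple and joined are illustrative):

```python
# Triple-quoted: the line break becomes part of the string.
triple = '''functions and methods; plotting;
the libraries NumPy, pandas, scikit-learn.'''

# Adjacent literals in parentheses are concatenated; no newline is added.
joined = (
    "functions and methods; plotting; "
    "the libraries NumPy, pandas, scikit-learn."
)

print("\n" in triple)  # True
print("\n" in joined)  # False
print(triple.split() == joined.split())  # True
```

Since split with no arguments splits on any whitespace, the newline makes no difference to the word list.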
[c for c in desc]
['I',
'n',
't',
'r',
'o',
'd',
'u',
'c',
't',
'i',
'o',
'n',
' ',
't',
'o',
' ',
'P',
'y',
't',
'h',
'o',
'n',
' ',
'f',
'o',
'r',
' ',
'd',
'a',
't',
'a',
' ',
's',
'c',
'i',
'e',
'n',
'c',
'e',
'.',
' ',
'S',
'e',
'l',
'e',
'c',
't',
'i',
'n',
'g',
' ',
'a',
'p',
'p',
'r',
'o',
'p',
'r',
'i',
'a',
't',
'e',
' ',
'd',
'a',
't',
'a',
' ',
't',
'y',
'p',
'e',
's',
';',
' ',
'f',
'u',
'n',
'c',
't',
'i',
'o',
'n',
's',
' ',
'a',
'n',
'd',
' ',
'm',
'e',
't',
'h',
'o',
'd',
's',
';',
' ',
'p',
'l',
'o',
't',
't',
'i',
'n',
'g',
';',
'\n',
't',
'h',
'e',
' ',
'l',
'i',
'b',
'r',
'a',
'r',
'i',
'e',
's',
' ',
'N',
'u',
'm',
'P',
'y',
',',
' ',
'p',
'a',
'n',
'd',
'a',
's',
',',
' ',
's',
'c',
'i',
'k',
'i',
't',
'-',
'l',
'e',
'a',
'r',
'n',
'.',
' ',
'F',
'o',
'u',
'n',
'd',
'a',
't',
'i',
'o',
'n',
's',
' ',
'o',
'f',
' ',
'm',
'a',
'c',
'h',
'i',
'n',
'e',
' ',
'l',
'e',
'a',
'r',
'n',
'i',
'n',
'g',
'.']
Iterating over the string directly yields individual characters, not words, so first split the string on whitespace.
desc.split()
['Introduction',
'to',
'Python',
'for',
'data',
'science.',
'Selecting',
'appropriate',
'data',
'types;',
'functions',
'and',
'methods;',
'plotting;',
'the',
'libraries',
'NumPy,',
'pandas,',
'scikit-learn.',
'Foundations',
'of',
'machine',
'learning.']
[word for word in desc.split()]
['Introduction',
'to',
'Python',
'for',
'data',
'science.',
'Selecting',
'appropriate',
'data',
'types;',
'functions',
'and',
'methods;',
'plotting;',
'the',
'libraries',
'NumPy,',
'pandas,',
'scikit-learn.',
'Foundations',
'of',
'machine',
'learning.']
[word.capitalize() for word in desc.split()]
['Introduction',
'To',
'Python',
'For',
'Data',
'Science.',
'Selecting',
'Appropriate',
'Data',
'Types;',
'Functions',
'And',
'Methods;',
'Plotting;',
'The',
'Libraries',
'Numpy,',
'Pandas,',
'Scikit-learn.',
'Foundations',
'Of',
'Machine',
'Learning.']
'Christopher'.join([word.capitalize() for word in desc.split()])
'IntroductionChristopherToChristopherPythonChristopherForChristopherDataChristopherScience.ChristopherSelectingChristopherAppropriateChristopherDataChristopherTypes;ChristopherFunctionsChristopherAndChristopherMethods;ChristopherPlotting;ChristopherTheChristopherLibrariesChristopherNumpy,ChristopherPandas,ChristopherScikit-learn.ChristopherFoundationsChristopherOfChristopherMachineChristopherLearning.'
' '.join([word.capitalize() for word in desc.split()])
'Introduction To Python For Data Science. Selecting Appropriate Data Types; Functions And Methods; Plotting; The Libraries Numpy, Pandas, Scikit-learn. Foundations Of Machine Learning.'
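Note that capitalize lowercases everything after the first character, which is why NumPy became Numpy in the output above. If the remaining letters should be left alone, one option is to uppercase only the first character by slicing. A sketch of that variant on a short sample (the helper name cap_first is hypothetical):

```python
def cap_first(word):
    """Uppercase the first character and leave the rest unchanged."""
    return word[:1].upper() + word[1:]

sample = "the libraries NumPy, pandas, scikit-learn."

print(" ".join(word.capitalize() for word in sample.split()))
# The Libraries Numpy, Pandas, Scikit-learn.

print(" ".join(cap_first(word) for word in sample.split()))
# The Libraries NumPy, Pandas, Scikit-learn.
```

The join method accepts any iterable of strings, so the square brackets around the comprehension are optional here.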
List comprehension to get row labels#
import pandas as pd
df = pd.read_csv("spotify_dataset.csv")
df.head()
 | Index | Highest Charting Position | Number of Times Charted | Week of Highest Charting | Song Name | Streams | Artist | Artist Followers | Song ID | Genre | ... | Danceability | Energy | Loudness | Speechiness | Acousticness | Liveness | Tempo | Duration (ms) | Valence | Chord |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 1 | 8 | 2021-07-23--2021-07-30 | Beggin' | 48,633,449 | Måneskin | 3377762 | 3Wrjm47oTz2sjIgck11l5e | ['indie rock italiano', 'italian pop'] | ... | 0.714 | 0.8 | -4.808 | 0.0504 | 0.127 | 0.359 | 134.002 | 211560 | 0.589 | B |
1 | 2 | 2 | 3 | 2021-07-23--2021-07-30 | STAY (with Justin Bieber) | 47,248,719 | The Kid LAROI | 2230022 | 5HCyWlXZPP0y6Gqq8TgA20 | ['australian hip hop'] | ... | 0.591 | 0.764 | -5.484 | 0.0483 | 0.0383 | 0.103 | 169.928 | 141806 | 0.478 | C#/Db |
2 | 3 | 1 | 11 | 2021-06-25--2021-07-02 | good 4 u | 40,162,559 | Olivia Rodrigo | 6266514 | 4ZtFanR9U6ndgddUvNcjcG | ['pop'] | ... | 0.563 | 0.664 | -5.044 | 0.154 | 0.335 | 0.0849 | 166.928 | 178147 | 0.688 | A |
3 | 4 | 3 | 5 | 2021-07-02--2021-07-09 | Bad Habits | 37,799,456 | Ed Sheeran | 83293380 | 6PQ88X9TkUIAUIZJHW2upE | ['pop', 'uk pop'] | ... | 0.808 | 0.897 | -3.712 | 0.0348 | 0.0469 | 0.364 | 126.026 | 231041 | 0.591 | B |
4 | 5 | 5 | 1 | 2021-07-23--2021-07-30 | INDUSTRY BABY (feat. Jack Harlow) | 33,948,454 | Lil Nas X | 5473565 | 27NovPIUIRrOZoCHxABJwK | ['lgbtq+ hip hop', 'pop rap'] | ... | 0.736 | 0.704 | -7.409 | 0.0615 | 0.0203 | 0.0501 | 149.995 | 212000 | 0.894 | D#/Eb |
5 rows × 23 columns
Get the names of the 10 most frequent artists in the attached Spotify dataset.
df["Artist"].value_counts()
Taylor Swift 52
Justin Bieber 32
Lil Uzi Vert 32
Juice WRLD 30
Pop Smoke 29
..
Bing Crosby 1
Lele Pons 1
Hippie Sabotage 1
Anne-Marie, KSI, Digital Farm Animals 1
Damso 1
Name: Artist, Length: 716, dtype: int64
top_artists = df["Artist"].value_counts().index[:10]
type(top_artists)
pandas.core.indexes.base.Index
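value_counts returns a Series whose index holds the distinct values and whose values are the counts, sorted from most to least frequent, which is why slicing the index gives the most frequent artists. A toy sketch with made-up data (toy and its contents are assumptions, not the Spotify dataset):

```python
import pandas as pd

# Made-up data standing in for the "Artist" column.
toy = pd.Series(["A", "B", "A", "C", "A", "B", "D", "A", "B", "C"], name="Artist")

counts = toy.value_counts()
print(counts.iloc[0])   # 4, the count for the most frequent value "A"

# The index of the counts holds the artist names, most frequent first.
top2 = counts.index[:2]
print(list(top2))       # ['A', 'B']
```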
Using list comprehension, make a list of the row labels for those artists.
row_labels = [i for i in df.index if df.loc[i, "Artist"] in top_artists]
Get the sub-DataFrame for those artists.
df.loc[row_labels]
 | Index | Highest Charting Position | Number of Times Charted | Week of Highest Charting | Song Name | Streams | Artist | Artist Followers | Song ID | Genre | ... | Danceability | Energy | Loudness | Speechiness | Acousticness | Liveness | Tempo | Duration (ms) | Valence | Chord |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
8 | 9 | 3 | 8 | 2021-06-18--2021-06-25 | Yonaguni | 25,030,128 | Bad Bunny | 36142273 | 2JPLbjOn0wPCngEot2STUS | ['latin', 'reggaeton', 'trap latino'] | ... | 0.644 | 0.648 | -4.601 | 0.118 | 0.276 | 0.135 | 179.951 | 206710 | 0.44 | C#/Db |
12 | 13 | 5 | 3 | 2021-07-09--2021-07-16 | Permission to Dance | 22,062,812 | BTS | 37106176 | 0LThjFY2iTtNdd4wviwVV2 | ['k-pop', 'k-pop boy group'] | ... | 0.702 | 0.741 | -5.33 | 0.0427 | 0.00544 | 0.337 | 124.925 | 187585 | 0.646 | A |
13 | 14 | 1 | 19 | 2021-04-02--2021-04-09 | Peaches (feat. Daniel Caesar & Giveon) | 20,294,457 | Justin Bieber | 48504126 | 4iJyoBOLtHqaGxP12qzhQI | ['canadian pop', 'pop', 'post-teen pop'] | ... | 0.677 | 0.696 | -6.181 | 0.119 | 0.321 | 0.42 | 90.03 | 198082 | 0.464 | C |
14 | 15 | 2 | 10 | 2021-05-21--2021-05-28 | Butter | 19,985,713 | BTS | 37106176 | 2bgTY4UwhfBYhGT4HUYStN | ['k-pop', 'k-pop boy group'] | ... | 0.759 | 0.459 | -5.187 | 0.0948 | 0.00323 | 0.0906 | 109.997 | 164442 | 0.695 | G#/Ab |
17 | 18 | 5 | 14 | 2021-04-23--2021-04-30 | Save Your Tears (with Ariana Grande) (Remix) | 18,053,141 | The Weeknd | 35305637 | 37BZB0z9T8Xu7U3e65qxFy | ['canadian contemporary r&b', 'canadian pop', ... | ... | 0.65 | 0.825 | -4.645 | 0.0325 | 0.0215 | 0.0936 | 118.091 | 191014 | 0.593 | C |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1499 | 1500 | 100 | 1 | 2020-01-17--2020-01-24 | Alfred - Interlude | 8,030,151 | Eminem | 46814751 | 4EmunTy7kNBYQivOa8F6b8 | ['detroit hip hop', 'hip hop', 'rap'] | ... | 0.429 | 0.231 | -20.43 | 0.402 | 0.878 | 0.279 | 74.545 | 30133 | 0.914 | F |
1500 | 1501 | 102 | 1 | 2020-01-17--2020-01-24 | Little Engine | 7,913,461 | Eminem | 46814751 | 4qNWEOMyexn7b8Icyk29t9 | ['detroit hip hop', 'hip hop', 'rap'] | ... | 0.769 | 0.811 | -4.162 | 0.228 | 0.0234 | 0.0451 | 155.081 | 177293 | 0.76 | A#/Bb |
1501 | 1502 | 113 | 1 | 2020-01-17--2020-01-24 | I Will (feat. KXNG Crooked, Royce Da 5'9" & Jo... | 7,115,414 | Eminem | 46814751 | 3CJbxqRQ0JNCqboWDNUUeX | ['detroit hip hop', 'hip hop', 'rap'] | ... | 0.635 | 0.543 | -5.941 | 0.067 | 0.0454 | 0.272 | 98.743 | 303000 | 0.036 | G#/Ab |
1549 | 1550 | 187 | 1 | 2019-12-27--2020-01-03 | Let Me Know (I Wonder Why Freestyle) | 4,701,532 | Juice WRLD | 19102888 | 3wwo0bJvDSorOpNfzEkfXx | ['chicago rap', 'melodic rap'] | ... | 0.635 | 0.537 | -7.895 | 0.0832 | 0.172 | 0.418 | 125.028 | 215381 | 0.383 | G |
1555 | 1556 | 199 | 1 | 2019-12-27--2020-01-03 | Lover (Remix) [feat. Shawn Mendes] | 4,595,450 | Taylor Swift | 42227614 | 3i9UVldZOE0aD0JnyfAZZ0 | ['pop', 'post-teen pop'] | ... | 0.448 | 0.603 | -7.176 | 0.064 | 0.433 | 0.0862 | 205.272 | 221307 | 0.422 | G |
295 rows × 23 columns
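The comprehension above looks up df.loc[i, "Artist"] once per row label. An equivalent formulation iterates over the column's (label, value) pairs with Series.items, which avoids the repeated lookups. A sketch on made-up data (toy_df and top are illustrative stand-ins for df and top_artists):

```python
import pandas as pd

# Made-up stand-in for the Spotify DataFrame.
toy_df = pd.DataFrame(
    {"Artist": ["A", "B", "C", "A", "D", "B", "A"],
     "Streams": [70, 60, 50, 40, 30, 20, 10]}
)
top = toy_df["Artist"].value_counts().index[:2]  # "A" (3 rows) and "B" (2 rows)

# Same pattern as in the text: look up each row's artist by label.
labels_loc = [i for i in toy_df.index if toy_df.loc[i, "Artist"] in top]

# Variant: iterate over (row label, artist) pairs directly.
labels_items = [i for i, artist in toy_df["Artist"].items() if artist in top]

print(labels_loc)                  # [0, 1, 3, 5, 6]
print(labels_loc == labels_items)  # True
```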
The sub-DataFrame of top artists using isin and Boolean indexing#
top_artists
Index(['Taylor Swift', 'Justin Bieber', 'Lil Uzi Vert', 'Juice WRLD',
'Pop Smoke', 'BTS', 'Bad Bunny', 'Eminem', 'The Weeknd',
'Ariana Grande'],
dtype='object')
df["Artist"]
0 Måneskin
1 The Kid LAROI
2 Olivia Rodrigo
3 Ed Sheeran
4 Lil Nas X
...
1551 Dua Lipa
1552 Jorge & Mateus
1553 Camila Cabello
1554 Dadá Boladão, Tati Zaqui, OIK
1555 Taylor Swift
Name: Artist, Length: 1556, dtype: object
df["Artist"].isin(top_artists)
0 False
1 False
2 False
3 False
4 False
...
1551 False
1552 False
1553 False
1554 False
1555 True
Name: Artist, Length: 1556, dtype: bool
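The result of isin is a Boolean Series with the same index as the column. Because True counts as 1, summing it gives the number of matching rows, and ~ negates it to select the complement. A toy sketch with made-up values (artists and mask are illustrative names):

```python
import pandas as pd

artists = pd.Series(["A", "B", "C", "A", "D"])
mask = artists.isin(["A", "B"])

print(mask.tolist())            # [True, True, False, True, False]
print(int(mask.sum()))          # 3 rows match
print(artists[~mask].tolist())  # ['C', 'D']
```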
df[df["Artist"].isin(top_artists)]
 | Index | Highest Charting Position | Number of Times Charted | Week of Highest Charting | Song Name | Streams | Artist | Artist Followers | Song ID | Genre | ... | Danceability | Energy | Loudness | Speechiness | Acousticness | Liveness | Tempo | Duration (ms) | Valence | Chord |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
8 | 9 | 3 | 8 | 2021-06-18--2021-06-25 | Yonaguni | 25,030,128 | Bad Bunny | 36142273 | 2JPLbjOn0wPCngEot2STUS | ['latin', 'reggaeton', 'trap latino'] | ... | 0.644 | 0.648 | -4.601 | 0.118 | 0.276 | 0.135 | 179.951 | 206710 | 0.44 | C#/Db |
12 | 13 | 5 | 3 | 2021-07-09--2021-07-16 | Permission to Dance | 22,062,812 | BTS | 37106176 | 0LThjFY2iTtNdd4wviwVV2 | ['k-pop', 'k-pop boy group'] | ... | 0.702 | 0.741 | -5.33 | 0.0427 | 0.00544 | 0.337 | 124.925 | 187585 | 0.646 | A |
13 | 14 | 1 | 19 | 2021-04-02--2021-04-09 | Peaches (feat. Daniel Caesar & Giveon) | 20,294,457 | Justin Bieber | 48504126 | 4iJyoBOLtHqaGxP12qzhQI | ['canadian pop', 'pop', 'post-teen pop'] | ... | 0.677 | 0.696 | -6.181 | 0.119 | 0.321 | 0.42 | 90.03 | 198082 | 0.464 | C |
14 | 15 | 2 | 10 | 2021-05-21--2021-05-28 | Butter | 19,985,713 | BTS | 37106176 | 2bgTY4UwhfBYhGT4HUYStN | ['k-pop', 'k-pop boy group'] | ... | 0.759 | 0.459 | -5.187 | 0.0948 | 0.00323 | 0.0906 | 109.997 | 164442 | 0.695 | G#/Ab |
17 | 18 | 5 | 14 | 2021-04-23--2021-04-30 | Save Your Tears (with Ariana Grande) (Remix) | 18,053,141 | The Weeknd | 35305637 | 37BZB0z9T8Xu7U3e65qxFy | ['canadian contemporary r&b', 'canadian pop', ... | ... | 0.65 | 0.825 | -4.645 | 0.0325 | 0.0215 | 0.0936 | 118.091 | 191014 | 0.593 | C |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1499 | 1500 | 100 | 1 | 2020-01-17--2020-01-24 | Alfred - Interlude | 8,030,151 | Eminem | 46814751 | 4EmunTy7kNBYQivOa8F6b8 | ['detroit hip hop', 'hip hop', 'rap'] | ... | 0.429 | 0.231 | -20.43 | 0.402 | 0.878 | 0.279 | 74.545 | 30133 | 0.914 | F |
1500 | 1501 | 102 | 1 | 2020-01-17--2020-01-24 | Little Engine | 7,913,461 | Eminem | 46814751 | 4qNWEOMyexn7b8Icyk29t9 | ['detroit hip hop', 'hip hop', 'rap'] | ... | 0.769 | 0.811 | -4.162 | 0.228 | 0.0234 | 0.0451 | 155.081 | 177293 | 0.76 | A#/Bb |
1501 | 1502 | 113 | 1 | 2020-01-17--2020-01-24 | I Will (feat. KXNG Crooked, Royce Da 5'9" & Jo... | 7,115,414 | Eminem | 46814751 | 3CJbxqRQ0JNCqboWDNUUeX | ['detroit hip hop', 'hip hop', 'rap'] | ... | 0.635 | 0.543 | -5.941 | 0.067 | 0.0454 | 0.272 | 98.743 | 303000 | 0.036 | G#/Ab |
1549 | 1550 | 187 | 1 | 2019-12-27--2020-01-03 | Let Me Know (I Wonder Why Freestyle) | 4,701,532 | Juice WRLD | 19102888 | 3wwo0bJvDSorOpNfzEkfXx | ['chicago rap', 'melodic rap'] | ... | 0.635 | 0.537 | -7.895 | 0.0832 | 0.172 | 0.418 | 125.028 | 215381 | 0.383 | G |
1555 | 1556 | 199 | 1 | 2019-12-27--2020-01-03 | Lover (Remix) [feat. Shawn Mendes] | 4,595,450 | Taylor Swift | 42227614 | 3i9UVldZOE0aD0JnyfAZZ0 | ['pop', 'post-teen pop'] | ... | 0.448 | 0.603 | -7.176 | 0.064 | 0.433 | 0.0862 | 205.272 | 221307 | 0.422 | G |
295 rows × 23 columns
df1 = df.loc[row_labels]
df2 = df[df["Artist"].isin(top_artists)]
df1.index == df2.index
array([ True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True])
(df1.index == df2.index).all()
True
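Besides comparing the indexes elementwise and calling all, pandas provides equals methods that return a single Boolean. The sketch below, on made-up data, checks both the row labels and the full contents of the two sub-DataFrames (toy_df and top are illustrative stand-ins for df and top_artists).

```python
import pandas as pd

toy_df = pd.DataFrame(
    {"Artist": ["A", "B", "C", "A", "D", "B", "A"],
     "Streams": [70, 60, 50, 40, 30, 20, 10]}
)
top = ["A", "B"]

# Method 1: row labels from a list comprehension, then .loc.
row_labels = [i for i in toy_df.index if toy_df.loc[i, "Artist"] in top]
df1 = toy_df.loc[row_labels]

# Method 2: Boolean indexing with isin.
df2 = toy_df[toy_df["Artist"].isin(top)]

print(df1.index.equals(df2.index))  # True: same row labels in the same order
print(df1.equals(df2))              # True: same labels, columns, and values
```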