Week 2 Videos#

List comprehension, Part 1#

  • Make a length 8 list of all 6s using list comprehension.

mylist = []

for i in range(8):
    mylist.append(6)

mylist
[6, 6, 6, 6, 6, 6, 6, 6]
[6 for i in range(8)]
[6, 6, 6, 6, 6, 6, 6, 6]
  • Let mylist = [3,1,-2,10,-5,3,6,2,8]. Square each element in mylist.

mylist = [3,1,-2,10,-5,3,6,2,8]
[x**2 for x in mylist]
[9, 1, 4, 100, 25, 9, 36, 4, 64]
  • Get the sublist of mylist containing only those numbers \(x\) satisfying \(-2 < x \leq 3\).

[x for x in mylist if (x > -2) and (x <= 3)]
[3, 1, 3, 2]
  • Replace each negative number in mylist with 0.

[0 if x < 0 else x for x in mylist]
[3, 1, 0, 10, 0, 3, 6, 2, 8]
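The last two examples put `if` in different places, and the position changes the meaning. The sketch below reuses `mylist` from above to contrast the two forms; the variable names `in_range`, `nonnegative`, and `clipped` are chosen here for illustration only.

```python
mylist = [3, 1, -2, 10, -5, 3, 6, 2, 8]

# Trailing `if` is a filter: elements failing the test are dropped,
# so the result can be shorter than the input.
# Python also allows the chained comparison -2 < x <= 3.
in_range = [x for x in mylist if -2 < x <= 3]
nonnegative = [x for x in mylist if x >= 0]

# Leading `a if cond else b` is a conditional expression: every element
# produces exactly one output, so the length is unchanged.
clipped = [0 if x < 0 else x for x in mylist]

print(in_range)     # [3, 1, 3, 2]
print(nonnegative)  # [3, 1, 10, 3, 6, 2, 8]
print(clipped)      # [3, 1, 0, 10, 0, 3, 6, 2, 8]
```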

List comprehension, Part 2#

  • Make the length-8 list of lists [[0,1,2], [0,1,2], ..., [0,1,2]], then convert it to a pandas DataFrame.

mylist = [[0,1,2] for _ in range(8)]
mylist
[[0, 1, 2],
 [0, 1, 2],
 [0, 1, 2],
 [0, 1, 2],
 [0, 1, 2],
 [0, 1, 2],
 [0, 1, 2],
 [0, 1, 2]]
import pandas as pd
pd.DataFrame(mylist)
   0  1  2
0  0  1  2
1  0  1  2
2  0  1  2
3  0  1  2
4  0  1  2
5  0  1  2
6  0  1  2
7  0  1  2
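A tempting shortcut for the list of lists is `[[0,1,2]] * 8`. The sketch below (not from the video; the names `rows_independent` and `rows_shared` are made up here) shows one reason the comprehension is usually the safer choice: repetition reuses a single inner list.

```python
# Comprehension: the expression [0, 1, 2] is evaluated once per iteration,
# so each row is a separate list object.
rows_independent = [[0, 1, 2] for _ in range(8)]

# Repetition: the outer list holds eight references to one inner list.
rows_shared = [[0, 1, 2]] * 8

# Change the first entry of the first row in each version.
rows_independent[0][0] = 99
rows_shared[0][0] = 99

print(rows_independent[1])  # [0, 1, 2]
print(rows_shared[1])       # [99, 1, 2]
```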
  • Make the length-24 list [0,1,2,0,1,2,...,0,1,2].

mylist = []

for i in range(8):
    for j in range(3):
        mylist.append(j)

mylist
[0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
[j for i in range(8) for j in range(3)]
[0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
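In a comprehension with two `for` clauses, the clauses are read left to right in the same order as the nested loops above. The following sketch illustrates what happens when the order is swapped; `flat`, `swapped`, and `repeated` are names introduced here.

```python
# The leftmost `for` clause is the outer loop, as in the nested loops above.
flat = [j for i in range(8) for j in range(3)]

# Swapping the clauses makes j the outer loop, which gives a different order.
swapped = [j for j in range(3) for i in range(8)]

# For a flat list of integers, list repetition builds the same result.
repeated = [0, 1, 2] * 8

print(flat == repeated)  # True
print(swapped[:10])      # [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
```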
  • Capitalize each word in the Math 10 course catalogue description:

Introduction to Python for data science. Selecting appropriate data types; functions and methods; plotting; the libraries NumPy, pandas, scikit-learn. Foundations of machine learning.

desc = "Introduction to Python for data science. Selecting appropriate data types; functions and methods; plotting;
the libraries NumPy, pandas, scikit-learn. Foundations of machine learning."
  Cell In [23], line 1
    desc = "Introduction to Python for data science. Selecting appropriate data types; functions and methods; plotting;
                                                                                                                       ^
SyntaxError: EOL while scanning string literal
desc = '''Introduction to Python for data science. Selecting appropriate data types; functions and methods; plotting;
the libraries NumPy, pandas, scikit-learn. Foundations of machine learning.'''
[c for c in desc]
['I',
 'n',
 't',
 'r',
 'o',
 'd',
 'u',
 'c',
 't',
 'i',
 'o',
 'n',
 ' ',
 't',
 'o',
 ' ',
 'P',
 'y',
 't',
 'h',
 'o',
 'n',
 ' ',
 'f',
 'o',
 'r',
 ' ',
 'd',
 'a',
 't',
 'a',
 ' ',
 's',
 'c',
 'i',
 'e',
 'n',
 'c',
 'e',
 '.',
 ' ',
 'S',
 'e',
 'l',
 'e',
 'c',
 't',
 'i',
 'n',
 'g',
 ' ',
 'a',
 'p',
 'p',
 'r',
 'o',
 'p',
 'r',
 'i',
 'a',
 't',
 'e',
 ' ',
 'd',
 'a',
 't',
 'a',
 ' ',
 't',
 'y',
 'p',
 'e',
 's',
 ';',
 ' ',
 'f',
 'u',
 'n',
 'c',
 't',
 'i',
 'o',
 'n',
 's',
 ' ',
 'a',
 'n',
 'd',
 ' ',
 'm',
 'e',
 't',
 'h',
 'o',
 'd',
 's',
 ';',
 ' ',
 'p',
 'l',
 'o',
 't',
 't',
 'i',
 'n',
 'g',
 ';',
 '\n',
 't',
 'h',
 'e',
 ' ',
 'l',
 'i',
 'b',
 'r',
 'a',
 'r',
 'i',
 'e',
 's',
 ' ',
 'N',
 'u',
 'm',
 'P',
 'y',
 ',',
 ' ',
 'p',
 'a',
 'n',
 'd',
 'a',
 's',
 ',',
 ' ',
 's',
 'c',
 'i',
 'k',
 'i',
 't',
 '-',
 'l',
 'e',
 'a',
 'r',
 'n',
 '.',
 ' ',
 'F',
 'o',
 'u',
 'n',
 'd',
 'a',
 't',
 'i',
 'o',
 'n',
 's',
 ' ',
 'o',
 'f',
 ' ',
 'm',
 'a',
 'c',
 'h',
 'i',
 'n',
 'e',
 ' ',
 'l',
 'e',
 'a',
 'r',
 'n',
 'i',
 'n',
 'g',
 '.']
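The `'\n'` entry in the character list above comes from the line break inside the triple-quoted string. As a small sketch on a shortened string (an assumption made to keep the example brief), here are two ways to write a long literal; the names `triple` and `joined` are illustrative.

```python
# Triple quotes let a literal span several lines; the line break is kept as '\n'.
triple = '''plotting;
the libraries'''

# Adjacent string literals inside parentheses are concatenated into one string,
# so no newline is introduced.
joined = (
    "plotting; "
    "the libraries"
)

print('\n' in triple)  # True
print('\n' in joined)  # False

# split() with no argument splits on any whitespace, so both give the same words.
print(triple.split() == joined.split())  # True
```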
desc.split()
['Introduction',
 'to',
 'Python',
 'for',
 'data',
 'science.',
 'Selecting',
 'appropriate',
 'data',
 'types;',
 'functions',
 'and',
 'methods;',
 'plotting;',
 'the',
 'libraries',
 'NumPy,',
 'pandas,',
 'scikit-learn.',
 'Foundations',
 'of',
 'machine',
 'learning.']
[word for word in desc.split()]
['Introduction',
 'to',
 'Python',
 'for',
 'data',
 'science.',
 'Selecting',
 'appropriate',
 'data',
 'types;',
 'functions',
 'and',
 'methods;',
 'plotting;',
 'the',
 'libraries',
 'NumPy,',
 'pandas,',
 'scikit-learn.',
 'Foundations',
 'of',
 'machine',
 'learning.']
[word.capitalize() for word in desc.split()]
['Introduction',
 'To',
 'Python',
 'For',
 'Data',
 'Science.',
 'Selecting',
 'Appropriate',
 'Data',
 'Types;',
 'Functions',
 'And',
 'Methods;',
 'Plotting;',
 'The',
 'Libraries',
 'Numpy,',
 'Pandas,',
 'Scikit-learn.',
 'Foundations',
 'Of',
 'Machine',
 'Learning.']
'Christopher'.join([word.capitalize() for word in desc.split()])
'IntroductionChristopherToChristopherPythonChristopherForChristopherDataChristopherScience.ChristopherSelectingChristopherAppropriateChristopherDataChristopherTypes;ChristopherFunctionsChristopherAndChristopherMethods;ChristopherPlotting;ChristopherTheChristopherLibrariesChristopherNumpy,ChristopherPandas,ChristopherScikit-learn.ChristopherFoundationsChristopherOfChristopherMachineChristopherLearning.'
' '.join([word.capitalize() for word in desc.split()])
'Introduction To Python For Data Science. Selecting Appropriate Data Types; Functions And Methods; Plotting; The Libraries Numpy, Pandas, Scikit-learn. Foundations Of Machine Learning.'
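Notice that `NumPy` became `Numpy` in the output: `capitalize` uppercases the first character and lowercases the rest. If the original casing should be kept, one possible alternative (a sketch on a shortened string, not the approach used in the video) is to change only the first character with slicing.

```python
desc = "the libraries NumPy, pandas, scikit-learn."

# str.capitalize uppercases the first character and lowercases all the others.
with_capitalize = " ".join([word.capitalize() for word in desc.split()])

# Slicing changes only the first character, so "NumPy" keeps its inner "P".
keep_case = " ".join([word[:1].upper() + word[1:] for word in desc.split()])

print(with_capitalize)  # The Libraries Numpy, Pandas, Scikit-learn.
print(keep_case)        # The Libraries NumPy, Pandas, Scikit-learn.
```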

List comprehension to get row labels#

import pandas as pd
df = pd.read_csv("spotify_dataset.csv")
df.head()
Index Highest Charting Position Number of Times Charted Week of Highest Charting Song Name Streams Artist Artist Followers Song ID Genre ... Danceability Energy Loudness Speechiness Acousticness Liveness Tempo Duration (ms) Valence Chord
0 1 1 8 2021-07-23--2021-07-30 Beggin' 48,633,449 Måneskin 3377762 3Wrjm47oTz2sjIgck11l5e ['indie rock italiano', 'italian pop'] ... 0.714 0.8 -4.808 0.0504 0.127 0.359 134.002 211560 0.589 B
1 2 2 3 2021-07-23--2021-07-30 STAY (with Justin Bieber) 47,248,719 The Kid LAROI 2230022 5HCyWlXZPP0y6Gqq8TgA20 ['australian hip hop'] ... 0.591 0.764 -5.484 0.0483 0.0383 0.103 169.928 141806 0.478 C#/Db
2 3 1 11 2021-06-25--2021-07-02 good 4 u 40,162,559 Olivia Rodrigo 6266514 4ZtFanR9U6ndgddUvNcjcG ['pop'] ... 0.563 0.664 -5.044 0.154 0.335 0.0849 166.928 178147 0.688 A
3 4 3 5 2021-07-02--2021-07-09 Bad Habits 37,799,456 Ed Sheeran 83293380 6PQ88X9TkUIAUIZJHW2upE ['pop', 'uk pop'] ... 0.808 0.897 -3.712 0.0348 0.0469 0.364 126.026 231041 0.591 B
4 5 5 1 2021-07-23--2021-07-30 INDUSTRY BABY (feat. Jack Harlow) 33,948,454 Lil Nas X 5473565 27NovPIUIRrOZoCHxABJwK ['lgbtq+ hip hop', 'pop rap'] ... 0.736 0.704 -7.409 0.0615 0.0203 0.0501 149.995 212000 0.894 D#/Eb

5 rows × 23 columns

  • Get the names of the 10 most frequent artists in the attached Spotify dataset.

df["Artist"].value_counts()
Taylor Swift                             52
Justin Bieber                            32
Lil Uzi Vert                             32
Juice WRLD                               30
Pop Smoke                                29
                                         ..
Bing Crosby                               1
Lele Pons                                 1
Hippie Sabotage                           1
Anne-Marie, KSI, Digital Farm Animals     1
Damso                                     1
Name: Artist, Length: 716, dtype: int64
top_artists = df["Artist"].value_counts().index[:10]
type(top_artists)
pandas.core.indexes.base.Index
  • Using list comprehension, make a list of the row labels for those artists.

row_labels = [i for i in df.index if df.loc[i, "Artist"] in top_artists]
  • Get the sub-DataFrame for those artists.

df.loc[row_labels]
Index Highest Charting Position Number of Times Charted Week of Highest Charting Song Name Streams Artist Artist Followers Song ID Genre ... Danceability Energy Loudness Speechiness Acousticness Liveness Tempo Duration (ms) Valence Chord
8 9 3 8 2021-06-18--2021-06-25 Yonaguni 25,030,128 Bad Bunny 36142273 2JPLbjOn0wPCngEot2STUS ['latin', 'reggaeton', 'trap latino'] ... 0.644 0.648 -4.601 0.118 0.276 0.135 179.951 206710 0.44 C#/Db
12 13 5 3 2021-07-09--2021-07-16 Permission to Dance 22,062,812 BTS 37106176 0LThjFY2iTtNdd4wviwVV2 ['k-pop', 'k-pop boy group'] ... 0.702 0.741 -5.33 0.0427 0.00544 0.337 124.925 187585 0.646 A
13 14 1 19 2021-04-02--2021-04-09 Peaches (feat. Daniel Caesar & Giveon) 20,294,457 Justin Bieber 48504126 4iJyoBOLtHqaGxP12qzhQI ['canadian pop', 'pop', 'post-teen pop'] ... 0.677 0.696 -6.181 0.119 0.321 0.42 90.03 198082 0.464 C
14 15 2 10 2021-05-21--2021-05-28 Butter 19,985,713 BTS 37106176 2bgTY4UwhfBYhGT4HUYStN ['k-pop', 'k-pop boy group'] ... 0.759 0.459 -5.187 0.0948 0.00323 0.0906 109.997 164442 0.695 G#/Ab
17 18 5 14 2021-04-23--2021-04-30 Save Your Tears (with Ariana Grande) (Remix) 18,053,141 The Weeknd 35305637 37BZB0z9T8Xu7U3e65qxFy ['canadian contemporary r&b', 'canadian pop', ... ... 0.65 0.825 -4.645 0.0325 0.0215 0.0936 118.091 191014 0.593 C
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1499 1500 100 1 2020-01-17--2020-01-24 Alfred - Interlude 8,030,151 Eminem 46814751 4EmunTy7kNBYQivOa8F6b8 ['detroit hip hop', 'hip hop', 'rap'] ... 0.429 0.231 -20.43 0.402 0.878 0.279 74.545 30133 0.914 F
1500 1501 102 1 2020-01-17--2020-01-24 Little Engine 7,913,461 Eminem 46814751 4qNWEOMyexn7b8Icyk29t9 ['detroit hip hop', 'hip hop', 'rap'] ... 0.769 0.811 -4.162 0.228 0.0234 0.0451 155.081 177293 0.76 A#/Bb
1501 1502 113 1 2020-01-17--2020-01-24 I Will (feat. KXNG Crooked, Royce Da 5'9" & Jo... 7,115,414 Eminem 46814751 3CJbxqRQ0JNCqboWDNUUeX ['detroit hip hop', 'hip hop', 'rap'] ... 0.635 0.543 -5.941 0.067 0.0454 0.272 98.743 303000 0.036 G#/Ab
1549 1550 187 1 2019-12-27--2020-01-03 Let Me Know (I Wonder Why Freestyle) 4,701,532 Juice WRLD 19102888 3wwo0bJvDSorOpNfzEkfXx ['chicago rap', 'melodic rap'] ... 0.635 0.537 -7.895 0.0832 0.172 0.418 125.028 215381 0.383 G
1555 1556 199 1 2019-12-27--2020-01-03 Lover (Remix) [feat. Shawn Mendes] 4,595,450 Taylor Swift 42227614 3i9UVldZOE0aD0JnyfAZZ0 ['pop', 'post-teen pop'] ... 0.448 0.603 -7.176 0.064 0.433 0.0862 205.272 221307 0.422 G

295 rows × 23 columns
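The same steps can be checked by hand on a small example. The DataFrame `toy` below is hypothetical data standing in for the Spotify file, with row labels 10 through 15 chosen so that labels are easy to tell apart from positions.

```python
import pandas as pd

# Hypothetical toy data, not taken from spotify_dataset.csv.
toy = pd.DataFrame(
    {"Artist": ["A", "B", "A", "C", "B", "A"]},
    index=[10, 11, 12, 13, 14, 15],
)

# value_counts sorts by frequency (A: 3, B: 2, C: 1), so the first two
# entries of its index are the two most frequent artists.
top2 = toy["Artist"].value_counts().index[:2]

# Same pattern as above: keep each row label whose artist is in top2.
labels = [i for i in toy.index if toy.loc[i, "Artist"] in top2]

print(list(top2))  # ['A', 'B']
print(labels)      # [10, 11, 12, 14, 15]
```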

The sub-DataFrame of top artists using isin and Boolean indexing#

top_artists
Index(['Taylor Swift', 'Justin Bieber', 'Lil Uzi Vert', 'Juice WRLD',
       'Pop Smoke', 'BTS', 'Bad Bunny', 'Eminem', 'The Weeknd',
       'Ariana Grande'],
      dtype='object')
df["Artist"]
0                            Måneskin
1                       The Kid LAROI
2                      Olivia Rodrigo
3                          Ed Sheeran
4                           Lil Nas X
                    ...              
1551                         Dua Lipa
1552                   Jorge & Mateus
1553                   Camila Cabello
1554    Dadá Boladão, Tati Zaqui, OIK
1555                     Taylor Swift
Name: Artist, Length: 1556, dtype: object
df["Artist"].isin(top_artists)
0       False
1       False
2       False
3       False
4       False
        ...  
1551    False
1552    False
1553    False
1554    False
1555     True
Name: Artist, Length: 1556, dtype: bool
df[df["Artist"].isin(top_artists)]
Index Highest Charting Position Number of Times Charted Week of Highest Charting Song Name Streams Artist Artist Followers Song ID Genre ... Danceability Energy Loudness Speechiness Acousticness Liveness Tempo Duration (ms) Valence Chord
8 9 3 8 2021-06-18--2021-06-25 Yonaguni 25,030,128 Bad Bunny 36142273 2JPLbjOn0wPCngEot2STUS ['latin', 'reggaeton', 'trap latino'] ... 0.644 0.648 -4.601 0.118 0.276 0.135 179.951 206710 0.44 C#/Db
12 13 5 3 2021-07-09--2021-07-16 Permission to Dance 22,062,812 BTS 37106176 0LThjFY2iTtNdd4wviwVV2 ['k-pop', 'k-pop boy group'] ... 0.702 0.741 -5.33 0.0427 0.00544 0.337 124.925 187585 0.646 A
13 14 1 19 2021-04-02--2021-04-09 Peaches (feat. Daniel Caesar & Giveon) 20,294,457 Justin Bieber 48504126 4iJyoBOLtHqaGxP12qzhQI ['canadian pop', 'pop', 'post-teen pop'] ... 0.677 0.696 -6.181 0.119 0.321 0.42 90.03 198082 0.464 C
14 15 2 10 2021-05-21--2021-05-28 Butter 19,985,713 BTS 37106176 2bgTY4UwhfBYhGT4HUYStN ['k-pop', 'k-pop boy group'] ... 0.759 0.459 -5.187 0.0948 0.00323 0.0906 109.997 164442 0.695 G#/Ab
17 18 5 14 2021-04-23--2021-04-30 Save Your Tears (with Ariana Grande) (Remix) 18,053,141 The Weeknd 35305637 37BZB0z9T8Xu7U3e65qxFy ['canadian contemporary r&b', 'canadian pop', ... ... 0.65 0.825 -4.645 0.0325 0.0215 0.0936 118.091 191014 0.593 C
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1499 1500 100 1 2020-01-17--2020-01-24 Alfred - Interlude 8,030,151 Eminem 46814751 4EmunTy7kNBYQivOa8F6b8 ['detroit hip hop', 'hip hop', 'rap'] ... 0.429 0.231 -20.43 0.402 0.878 0.279 74.545 30133 0.914 F
1500 1501 102 1 2020-01-17--2020-01-24 Little Engine 7,913,461 Eminem 46814751 4qNWEOMyexn7b8Icyk29t9 ['detroit hip hop', 'hip hop', 'rap'] ... 0.769 0.811 -4.162 0.228 0.0234 0.0451 155.081 177293 0.76 A#/Bb
1501 1502 113 1 2020-01-17--2020-01-24 I Will (feat. KXNG Crooked, Royce Da 5'9" & Jo... 7,115,414 Eminem 46814751 3CJbxqRQ0JNCqboWDNUUeX ['detroit hip hop', 'hip hop', 'rap'] ... 0.635 0.543 -5.941 0.067 0.0454 0.272 98.743 303000 0.036 G#/Ab
1549 1550 187 1 2019-12-27--2020-01-03 Let Me Know (I Wonder Why Freestyle) 4,701,532 Juice WRLD 19102888 3wwo0bJvDSorOpNfzEkfXx ['chicago rap', 'melodic rap'] ... 0.635 0.537 -7.895 0.0832 0.172 0.418 125.028 215381 0.383 G
1555 1556 199 1 2019-12-27--2020-01-03 Lover (Remix) [feat. Shawn Mendes] 4,595,450 Taylor Swift 42227614 3i9UVldZOE0aD0JnyfAZZ0 ['pop', 'post-teen pop'] ... 0.448 0.603 -7.176 0.064 0.433 0.0862 205.272 221307 0.422 G

295 rows × 23 columns

df1 = df.loc[row_labels]
df2 = df[df["Artist"].isin(top_artists)]
df1.index == df2.index
array([ True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True])
(df1.index == df2.index).all()
True
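The elementwise `==` comparison above relies on the two indexes having the same length. As a sketch on hypothetical toy data, `Index.equals` and `DataFrame.equals` each return a single Boolean directly; the names `toy` and `top2` are made up for this example.

```python
import pandas as pd

# Hypothetical toy data, not taken from spotify_dataset.csv.
toy = pd.DataFrame({"Artist": ["A", "B", "A", "C", "B", "A"]})
top2 = ["A", "B"]

# Approach 1: list comprehension of row labels, then .loc.
labels = [i for i in toy.index if toy.loc[i, "Artist"] in top2]
df1 = toy.loc[labels]

# Approach 2: Boolean indexing with isin.
df2 = toy[toy["Artist"].isin(top2)]

# Each call returns one bool rather than an array of comparisons.
print(df1.index.equals(df2.index))  # True
print(df1.equals(df2))              # True
```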