Week 1 Tuesday Discussion

Meeting Times TuTh: 14:00-14:50 in ALP 3600

Office Hours: TBA but will be over Zoom

TA: Yasmeen Baki (ybaki@uci.edu)

Plan for Today:

- Introductions and time to find people to work with (~10 minutes)
- Overview of discussion policies (~5 minutes)
- Getting comfortable with Deepnote (e.g. markdown versus code cells)
- Practice with uploading data, pandas, and getting started on Homework 1

Overview of Our Discussion Sections

Purpose: Discussion sections are a time for you to reinforce the material that you have been learning in lecture throughout the week. My general plan is for us to try some exercises and go through some homework all together as a class, but also to leave time for individual work where Chupeng, Yufei, and I can go around answering specific questions you may have.

Quizzes: Quizzes will typically be during the last 20 minutes of our Tuesday discussions. I will give a review of the quiz material during the first 30 minutes of our discussion on these days.

Office Hours: Office hours (and Ed Discussion!) are some of the best places to get fast help. Please do not ask detailed questions about your code right before or after our discussion section; this can cause serious delays for our class and for the class that meets right afterwards.

Email Policy: Email should be reserved for personal/private concerns (e.g. illness, family emergency, etc.), and not for homework or lecture related questions (this is what Ed Discussion, office hours, and discussion are for). Further, please be patient and allow me about 24 hours to get back to your email; in particular, do not send me the same email multiple times.

General Advice and Style Guidelines:

- Be as organized as possible when saving files on your computer; it helps to have a folder dedicated to this class. Don’t save everything in your Downloads folder!
- Use descriptive names for variables and files.
- Use comments to make your code more readable to yourself and others.
- Start early, start often.
- Ask for help!

Getting Comfortable with Deepnote

All of your work in Deepnote will be done in cells. This is an example of a markdown cell. Markdown cells are used for displaying text, and in our class are an important part of answering homework questions each week.

To create a markdown cell below this cell, we can first use the shortcut ⌘ + j on Mac, or ctrl + j on PC to create a new code cell. Then, we can convert this new cell to a markdown cell by using the command ⌘+shift+m, or ctrl+shift+m.

Exercise 1: Using only keyboard shortcuts, create a new markdown cell below this one. Write a short self-introduction. Using the code from this exercise, change the font color of your self-introduction to blue.

Remember: Markdown is subtly different on different sites, so what might work in Jupyter or GitHub, for example, might not work in Deepnote.

It is worth taking a look at this list of keyboard shortcuts for working in Deepnote. Spending the time to learn at least a few of these shortcuts now will make your life much easier going forward.

Exercise 2: Use the link above to learn the keyboard shortcut for deleting a cell. Using only keyboard shortcuts, create a new cell and then delete it.

# This is an example of a comment inside of a code cell
# Comments can be used to help people reading your code understand it better...
# they can also be used to remove portions of code from being evaluated (think debugging!)

2**3

Exercise 3: Create a new code cell and evaluate 2^3. Is the result different from what you would expect?
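As a hint for Exercise 3, here is a minimal sketch comparing the two operators. In Python, `**` is exponentiation, while `^` on integers is bitwise XOR, so the two expressions give different results. The variable names are chosen only for illustration.

```python
# ** raises to a power; ^ is bitwise XOR on integers.
power = 2 ** 3   # 2 * 2 * 2
xor = 2 ^ 3      # binary 10 XOR 11 = 01

print(power)  # 8
print(xor)    # 1
```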

Uploading files, pandas, and getting started on Homework 1

Exercise 4: Import pandas. Practice uploading a dataset by downloading the csv file found at this link. This is a good time to practice giving your csv file a descriptive name. Load it into this notebook using df = pd.read_csv(...). Explore what df.head(), df.columns, and df.shape return.

import pandas as pd

# na_values=" " tells pandas to treat cells containing a single space as missing (NaN)
df = pd.read_csv("../data/spotify_dataset.csv", na_values=" ")
df.head()

	Index	Highest Charting Position	Number of Times Charted	Week of Highest Charting	Song Name	Streams	Artist	Artist Followers	Song ID	Genre	...	Danceability	Energy	Loudness	Speechiness	Acousticness	Liveness	Tempo	Duration (ms)	Valence	Chord
0	1	1	8	2021-07-23--2021-07-30	Beggin'	48,633,449	Måneskin	3377762.0	3Wrjm47oTz2sjIgck11l5e	['indie rock italiano', 'italian pop']	...	0.714	0.800	-4.808	0.0504	0.1270	0.3590	134.002	211560.0	0.589	B
1	2	2	3	2021-07-23--2021-07-30	STAY (with Justin Bieber)	47,248,719	The Kid LAROI	2230022.0	5HCyWlXZPP0y6Gqq8TgA20	['australian hip hop']	...	0.591	0.764	-5.484	0.0483	0.0383	0.1030	169.928	141806.0	0.478	C#/Db
2	3	1	11	2021-06-25--2021-07-02	good 4 u	40,162,559	Olivia Rodrigo	6266514.0	4ZtFanR9U6ndgddUvNcjcG	['pop']	...	0.563	0.664	-5.044	0.1540	0.3350	0.0849	166.928	178147.0	0.688	A
3	4	3	5	2021-07-02--2021-07-09	Bad Habits	37,799,456	Ed Sheeran	83293380.0	6PQ88X9TkUIAUIZJHW2upE	['pop', 'uk pop']	...	0.808	0.897	-3.712	0.0348	0.0469	0.3640	126.026	231041.0	0.591	B
4	5	5	1	2021-07-23--2021-07-30	INDUSTRY BABY (feat. Jack Harlow)	33,948,454	Lil Nas X	5473565.0	27NovPIUIRrOZoCHxABJwK	['lgbtq+ hip hop', 'pop rap']	...	0.736	0.704	-7.409	0.0615	0.0203	0.0501	149.995	212000.0	0.894	D#/Eb

5 rows × 23 columns
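The `na_values=" "` argument in the `read_csv` call above is easy to overlook, so here is a small sketch of what it does. The three-row CSV text and the names `csv_text`, `raw`, and `clean` are hypothetical; the sketch assumes that blank entries in a file appear as a single space, as the call above suggests.

```python
import io
import pandas as pd

# Hypothetical CSV: the second row has a single space where a number should be.
csv_text = "Song Name,Popularity\nsong_a,70\nsong_b, \nsong_c,85\n"

# Without na_values, the lone space is kept as a value in the column.
raw = pd.read_csv(io.StringIO(csv_text))

# With na_values=" ", the lone space is read as a missing value (NaN).
clean = pd.read_csv(io.StringIO(csv_text), na_values=" ")

print(raw["Popularity"].dtype)
print(clean["Popularity"].dtype)
print(clean["Popularity"].isna().sum())
```

Comparing the two printed dtypes shows whether pandas was able to store the column numerically once the blank entry is marked as missing.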

df.columns

Index(['Index', 'Highest Charting Position', 'Number of Times Charted',
       'Week of Highest Charting', 'Song Name', 'Streams', 'Artist',
       'Artist Followers', 'Song ID', 'Genre', 'Release Date', 'Weeks Charted',
       'Popularity', 'Danceability', 'Energy', 'Loudness', 'Speechiness',
       'Acousticness', 'Liveness', 'Tempo', 'Duration (ms)', 'Valence',
       'Chord'],
      dtype='object')

df.shape

(1556, 23)

Exercise 5: Use info() to see which columns are stored as numeric types; then use describe() to find the average number of times a song in the dataset has charted. Write your answers to these questions in a markdown cell.

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1556 entries, 0 to 1555
Data columns (total 23 columns):
 #   Column                     Non-Null Count  Dtype  
---  ------                     --------------  -----  
 0   Index                      1556 non-null   int64
 1   Highest Charting Position  1556 non-null   int64
 2   Number of Times Charted    1556 non-null   int64
 3   Week of Highest Charting   1556 non-null   object
 4   Song Name                  1556 non-null   object
 5   Streams                    1556 non-null   object
 6   Artist                     1556 non-null   object
 7   Artist Followers           1545 non-null   float64
 8   Song ID                    1545 non-null   object
 9   Genre                      1545 non-null   object
 10  Release Date               1545 non-null   object
 11  Weeks Charted              1556 non-null   object
 12  Popularity                 1545 non-null   float64
 13  Danceability               1545 non-null   float64
 14  Energy                     1545 non-null   float64
 15  Loudness                   1545 non-null   float64
 16  Speechiness                1545 non-null   float64
 17  Acousticness               1545 non-null   float64
 18  Liveness                   1545 non-null   float64
 19  Tempo                      1545 non-null   float64
 20  Duration (ms)              1545 non-null   float64
 21  Valence                    1545 non-null   float64
 22  Chord                      1545 non-null   object
dtypes: float64(11), int64(3), object(9)
memory usage: 279.7+ KB
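In the info() output, Streams has dtype object even though it looks like a number, because its values contain commas. The following sketch shows one possible way to list the numeric columns programmatically and to convert such a column. The two-row DataFrame `toy` is a hypothetical stand-in: the Streams strings are copied from the df.head() output, while the Popularity values are made up.

```python
import pandas as pd

# Hypothetical two-row stand-in for the Spotify data.
toy = pd.DataFrame({
    "Index": [1, 2],
    "Streams": ["48,633,449", "47,248,719"],  # text, because of the commas
    "Popularity": [100.0, 99.0],              # made-up values
})

# Columns that pandas stores as numbers (int64 or float64 here).
numeric_cols = toy.select_dtypes(include="number").columns.tolist()
print(numeric_cols)

# Strip the thousands separators, then convert the strings to integers.
streams_as_int = toy["Streams"].str.replace(",", "", regex=False).astype("int64")
print(streams_as_int.tolist())
```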

df.describe()

	Index	Highest Charting Position	Number of Times Charted	Artist Followers	Popularity	Danceability	Energy	Loudness	Speechiness	Acousticness	Liveness	Tempo	Duration (ms)	Valence
count	1556.000000	1556.000000	1556.000000	1.545000e+03	1545.000000	1545.000000	1545.000000	1545.000000	1545.000000	1545.000000	1545.000000	1545.000000	1545.000000	1545.000000
mean	778.500000	87.744216	10.668380	1.471690e+07	70.089320	0.689997	0.633495	-6.348474	0.123656	0.248695	0.181202	122.811023	197940.816828	0.514704
std	449.322824	58.147225	16.360546	1.667579e+07	15.824034	0.142444	0.161577	2.509281	0.110383	0.250326	0.144071	29.591088	47148.930420	0.227326
min	1.000000	1.000000	1.000000	4.883000e+03	0.000000	0.150000	0.054000	-25.166000	0.023200	0.000025	0.019700	46.718000	30133.000000	0.032000
25%	389.750000	37.000000	1.000000	2.123734e+06	65.000000	0.599000	0.532000	-7.491000	0.045600	0.048500	0.096600	97.960000	169266.000000	0.343000
50%	778.500000	80.000000	4.000000	6.852509e+06	73.000000	0.707000	0.642000	-5.990000	0.076500	0.161000	0.124000	122.012000	193591.000000	0.512000
75%	1167.250000	137.000000	12.000000	2.269875e+07	80.000000	0.796000	0.752000	-4.711000	0.165000	0.388000	0.217000	143.860000	218902.000000	0.691000
max	1556.000000	200.000000	142.000000	8.333778e+07	100.000000	0.980000	0.970000	1.509000	0.884000	0.994000	0.962000	205.272000	588139.000000	0.979000
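Rather than reading the mean off the table by eye, the value can also be looked up by row and column label, since describe() returns a DataFrame. This is a sketch on a hypothetical four-row DataFrame `toy`; only the column name "Number of Times Charted" is taken from the dataset.

```python
import pandas as pd

# Hypothetical stand-in data; the charted counts are made up.
toy = pd.DataFrame({
    "Song Name": ["a", "b", "c", "d"],
    "Number of Times Charted": [1, 4, 12, 3],
})

# describe() summarizes only the numeric columns by default.
summary = toy.describe()

# Look up the "mean" row for one column, and compare with calling mean() directly.
avg_from_describe = summary.loc["mean", "Number of Times Charted"]
avg_direct = toy["Number of Times Charted"].mean()

print(avg_from_describe, avg_direct)
```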

Exercise 6: Using slicing techniques from Monday’s lecture, create a new dataframe which has just the “Song Name” column from the original dataframe.

# A list of labels keeps the result a DataFrame; df.loc[:, "Song Name"] would return a Series
df2 = df.loc[:, ["Song Name"]]
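Whether a column selection comes back as a pandas Series or a DataFrame depends on whether a single label or a list of labels is passed to loc. The sketch below illustrates the difference on a hypothetical two-row DataFrame `toy`.

```python
import pandas as pd

# Hypothetical stand-in data.
toy = pd.DataFrame({"Song Name": ["a", "b"], "Artist": ["x", "y"]})

as_series = toy.loc[:, "Song Name"]    # single label -> Series
as_frame = toy.loc[:, ["Song Name"]]   # list of labels -> DataFrame

print(type(as_series).__name__, as_series.shape)
print(type(as_frame).__name__, as_frame.shape)
```

The shape makes the difference visible: the Series is one-dimensional, while the DataFrame has rows and a single column.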

Exercise 7: Using value_counts(), determine how many times each artist appears in the dataset. Then pick an artist and use boolean indexing to find all songs by that artist in the original dataframe.

df["Artist"].value_counts()

Taylor Swift                     52
Lil Uzi Vert                     32
Justin Bieber                    32
Juice WRLD                       30
Pop Smoke                        29
                                 ..
Chris Brown, Young Thug           1
Rauw Alejandro, J Balvin          1
347aidan                          1
Migrantes, Alico                  1
Dadá Boladão, Tati Zaqui, OIK     1
Name: Artist, Length: 716, dtype: int64

df3 = df[df["Artist"] == "Taylor Swift"]["Song Name"]
df3

   Mr. Perfectly Fine (Taylor’s Version) (From Th...
                       Love Story (Taylor’s Version)
                                              willow
               You Belong With Me (Taylor’s Version)
                         Fearless (Taylor’s Version)
                          Fifteen (Taylor’s Version)
              The Way I Loved You (Taylor’s Version)
   You All Over Me (feat. Maren Morris) (Taylor’s...
                      Hey Stephen (Taylor’s Version)
                      White Horse (Taylor’s Version)
                 Forever & Always (Taylor’s Version)
   Breathe (feat. Colbie Caillat) (Taylor’s Version)
   That’s When (feat. Keith Urban) (Taylor’s Vers...
                      Tell Me Why (Taylor’s Version)
                 You’re Not Sorry (Taylor’s Version)
       Don’t You (Taylor’s Version) (From The Vault)
   We Were Happy (Taylor’s Version) (From The Vault)
                                  champagne problems
                      no body, no crime (feat. HAIM)
                                ‘tis the damn season
                                           gold rush
                                 Christmas Tree Farm
                                         tolerate it
                                           happiness
                                                 ivy
                                            dorothea
                   coney island (feat. The National)
                           evermore (feat. Bon Iver)
                                    long story short
                                      cowboy like me
                                            marjorie
                                             closure
                                            cardigan
                              exile (feat. Bon Iver)
                                               the 1
                                              august
                     the last great american dynasty
                                   my tears ricochet
                                    invisible string
                                          mirrorball
                                               seven
                                   this is me trying
                                               betty
                                     illicit affairs
                                           mad woman
                                            epiphany
                                               peace
                                                hoax
                              You Need To Calm Down
        Only The Young - Featured in Miss Americana
    ME! (feat. Brendon Urie of Panic! At The Disco)
                 Lover (Remix) [feat. Shawn Mendes]
Name: Song Name, dtype: object
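The boolean indexing used above can be broken into two steps, which may make it easier to see what is happening: first build a True/False mask, then use it to pick rows. This sketch uses a hypothetical three-row DataFrame `toy` with song and artist names taken from the outputs above, and uses loc to select the rows and the column in a single step.

```python
import pandas as pd

# Hypothetical three-row stand-in for the Spotify data.
toy = pd.DataFrame({
    "Artist": ["Taylor Swift", "Ed Sheeran", "Taylor Swift"],
    "Song Name": ["willow", "Bad Habits", "cardigan"],
})

# Step 1: a boolean Series, True where the artist matches exactly.
is_taylor = toy["Artist"] == "Taylor Swift"

# Step 2: keep the rows where the mask is True, and only the "Song Name" column.
songs = toy.loc[is_taylor, "Song Name"]

print(is_taylor.sum())   # number of matching rows
print(songs.tolist())
```

Note that == compares the whole string, so a row whose Artist entry is "Chris Brown, Young Thug" would not match "Chris Brown".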

Getting Started on Homework 1

Remember that you can work in groups of 2-3 students on the homework, and you can all submit the same work. Just remember to include the names of your collaborators. Let’s quickly see how to add collaborators to a project.

On Thursday, we will work on Homework 1 together. It helps if you come to discussion having already found a dataset you would like to use from Kaggle (you will need to create an account). When picking a dataset, here are a few things to keep in mind:
- Find a dataset that interests you, but spend the majority of your time working on the homework questions. It can be easy to waste time trying to find the perfect dataset.
- The data you use for this homework should be relatively “clean” already (I will show you an example of a dataset that would be a bad choice to use for this homework). We will have opportunities later in the quarter to work on data cleaning.

UC Irvine Math 10 S22
