Week 1 Thursday Discussion
import pandas as pd
Make the list [0,-1,-4,-9,...,-81,-100] using list comprehension.
[-x**2 for x in range(11)]
[0, -1, -4, -9, -16, -25, -36, -49, -64, -81, -100]
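For comparison, the comprehension above can be unrolled into an explicit for loop with append. This is only a sketch to show the equivalence; the names neg_squares_loop and neg_squares_comp are illustrative and do not appear in the worksheet.

```python
# Build the same list two ways and confirm they agree.
neg_squares_loop = []
for x in range(11):                  # x runs over 0, 1, ..., 10
    neg_squares_loop.append(-x**2)   # ** binds tighter than the minus sign, so this is -(x**2)

neg_squares_comp = [-x**2 for x in range(11)]

print(neg_squares_loop == neg_squares_comp)
```

Note that writing (-x)**2 instead would give the positive squares, which is not the list asked for.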
How can you print out the following lines using f-strings?
the square of 1 is 1
the square of 2 is 4
...
the square of 10 is 100
for x in range(1, 11):
    print(f"the square of {x} is {x**2}")
the square of 1 is 1
the square of 2 is 4
the square of 3 is 9
the square of 4 is 16
the square of 5 is 25
the square of 6 is 36
the square of 7 is 49
the square of 8 is 64
the square of 9 is 81
the square of 10 is 100
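The same lines can also be collected in a list first, which makes them easy to inspect before printing. The sketch below is one possible variation, not part of the original solution; the names lines and aligned are illustrative, and the width specifiers (the part after the colon) are an optional extra that pads the numbers so the columns line up.

```python
# Build the lines for x = 1, ..., 10 with a list comprehension, then print them.
lines = [f"the square of {x} is {x**2}" for x in range(1, 11)]
print("\n".join(lines))

# Optional: ">2" right-aligns in a field of width 2, ">3" in a field of width 3.
aligned = [f"the square of {x:>2} is {x**2:>3}" for x in range(1, 11)]
print("\n".join(aligned))
```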
indexList = ['NYA', 'IXIC', 'HSI', '000001.SS', 'N225', 'N100', '399001.SZ',
'GSPTSE', 'NSEI', 'KS11', 'SSMI', 'TWII', 'J203.JO']
Using list comprehension, find the sublist of indexList containing all the indexes which have the number “0” in their abbreviation. (Hint. You can use the in operator, just make sure “0” is a str and not an int.)
[x for x in indexList if "0" in x]
['000001.SS', 'N100', '399001.SZ', 'J203.JO']
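To see why the hint insists on a str, here is a small sketch of what happens with each type. The variable names name and result are illustrative only.

```python
name = "N100"

# With a str on the left, `in` performs a substring test.
print("0" in name)

# With an int on the left, Python raises a TypeError instead of returning False.
try:
    result = 0 in name
except TypeError as err:
    result = None
    print(f"TypeError: {err}")
```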
Define df to be the pandas DataFrame from the attached indexData.csv file. For how many values of i in df.index is df.loc[i,"Name"] equal to “HSI”?
Compute your answer using list comprehension of the form temp_list = [??? for i in ??? if ???] and then len(temp_list). (Warning. This is not nearly as efficient as the next method.)
Compute the same value instead using Boolean indexing from Monday’s lecture and then taking the length of the resulting DataFrame using len.
Compute the same value again using Boolean indexing, but this time using ???.shape[0].
Compute the same value by evaluating df.Name.value_counts().
df = pd.read_csv("../data/indexData.csv")
temp_list = [i for i in df.index if df.loc[i,"Name"] == "HSI"]
len(temp_list)
8750
df2 = df[df["Name"] == "HSI"]
len(df2)
8750
df2.shape[0]
8750
df.Name.value_counts()
N225 14500
NYA 13948
IXIC 12690
GSPTSE 10776
HSI 8750
GDAXI 8606
SSMI 7830
KS11 6181
TWII 6010
000001.SS 5963
399001.SZ 5928
N100 5507
NSEI 3381
J203.JO 2387
Name: Name, dtype: int64
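The four approaches can be checked against each other on a small DataFrame without needing the CSV file. In the sketch below, toy is a hypothetical stand-in for df with made-up rows, and the final mask.sum() line is an extra option not used in the solutions above.

```python
import pandas as pd

# Hypothetical toy data standing in for indexData.csv.
toy = pd.DataFrame({"Name": ["HSI", "NYA", "HSI", "N225", "HSI"]})

mask = toy["Name"] == "HSI"     # Boolean Series with one entry per row

count_listcomp = len([i for i in toy.index if toy.loc[i, "Name"] == "HSI"])
count_len = len(toy[mask])                      # Boolean indexing, then len
count_shape = toy[mask].shape[0]                # Boolean indexing, then number of rows
count_value_counts = toy["Name"].value_counts()["HSI"]
count_sum = mask.sum()                          # True counts as 1, False as 0

print(count_listcomp, count_len, count_shape, count_value_counts, count_sum)
```

The list comprehension calls toy.loc once per row, while Boolean indexing compares the whole column in one vectorized step, which is why the worksheet warns that the first method is less efficient.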
For each index listed in the “Name” column from the indexData.csv file, print out the index’s abbreviation together with how many times it occurs in the file. Use f-strings, Boolean indexing, and the unique method. For example, one line might be “The index HSI occurs 8750 times”. Check your answer using df["Name"].value_counts().
df = pd.read_csv("../data/indexData.csv")
for name in df.Name.unique():
    print(f"The index {name} occurs {len(df[df.Name==name])} times")
The index NYA occurs 13948 times
The index IXIC occurs 12690 times
The index HSI occurs 8750 times
The index 000001.SS occurs 5963 times
The index GSPTSE occurs 10776 times
The index 399001.SZ occurs 5928 times
The index NSEI occurs 3381 times
The index GDAXI occurs 8606 times
The index KS11 occurs 6181 times
The index SSMI occurs 7830 times
The index TWII occurs 6010 times
The index J203.JO occurs 2387 times
The index N225 occurs 14500 times
The index N100 occurs 5507 times
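As an independent cross-check that does not rely on pandas, the same kind of tally can be produced with collections.Counter from the standard library. This swaps in a different technique from the one the exercise asks for, and the list names below is a hypothetical stand-in for the values of df["Name"].

```python
from collections import Counter

# Hypothetical stand-in for the values in the "Name" column.
names = ["HSI", "NYA", "HSI", "N225", "NYA", "HSI"]

counts = Counter(names)            # maps each name to its number of occurrences
for name, n in counts.items():     # keys appear in first-seen order
    print(f"The index {name} occurs {n} times")
```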
listA = range(30)
listB = range(0,100,3)
Using a for loop and append, make a list containing all the elements in listA which are not in listB. (Hint. You can use not in.)
Make the same list using list comprehension.
list_diff = []
for x in listA:
    if x not in listB:
        list_diff.append(x)
list_diff
[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29]
[x for x in listA if x not in listB]
[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29]
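Membership tests with not in are fast for range objects, but on an ordinary list each test has to scan the list. A common variant, sketched here as an assumption about what one might do for larger inputs, converts the second collection to a set first. The name setB is illustrative.

```python
listA = range(30)
listB = range(0, 100, 3)

setB = set(listB)   # set lookups take constant time on average
list_diff = [x for x in listA if x not in setB]

print(list_diff)
```

The comprehension still iterates over listA in order, so the result keeps the same ordering as the loop version.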
The above list indexList contains the abbreviation for every stock exchange in the indexData.csv file except one. What is the missing stock exchange? (Use the technique from the previous example.)
[x for x in df.Name.unique() if x not in indexList]
['GDAXI']