Alternating list#

The two most important parts of this section are that we will see how to define a function in Python, and we will get our first introduction to NumPy (which is the most important Python library for Math 9). We will also see how to use an if statement in Python.

Our overall goal in this section is the following.

Goal: Write a function which takes input n and as output returns the length n list [3,8,3,8,...].

Writing a function that returns a constant list#

Before getting to our goal of defining a function that produces an alternating list, we will see how to define a function that returns a constant list of all 3s. (Later in this course we will see a more elegant way to do this, using what is called “list comprehension”, but here we will use a for loop.)

Preliminary Goal: Write a function which takes input n and as output returns the length n list [3,3,3,...].

Here is a first attempt to define a function f. The overall syntax is correct (especially the indentations), but it does not return anything, as we can see by trying to call f(5).

def f(n):
    mylist = []
    for i in range(n):
        mylist.append(3)
f(5)

In the next definition we do correctly return something, but the indentation is wrong, the return statement should not be indented twice. The second indentation is forcing the return statement to occur within the for loop, so we actually exit the function during the first step in the for loop.

def f(n):
    mylist = []
    for i in range(n):
        mylist.append(3)
        return mylist
f(5)
[3]

Here is a correct solution to our “preliminary goal” from above.

def f(n):
    mylist = []
    for i in range(n):
        mylist.append(3)
    return mylist
f(5)
[3, 3, 3, 3, 3]
f(6)
[3, 3, 3, 3, 3, 3]

In the above, you should think of n as getting assigned to 5 or to 6, but that assignment is only happening locally within the function itself. The value of n is not accessible to us outside of the function.

n
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
/var/folders/8j/gshrlmtn7dg4qtztj4d4t_w40000gn/T/ipykernel_54375/1249512285.py in <module>
----> 1 n

NameError: name 'n' is not defined

Similarly, if we define n outside of the function (for example, n = 100), that does not have any influence on the n in the definition of the function. Nor does the n inside of the function have any influence on the n that we have defined to be 100 outside of the function.

n = 100
f(4)
[3, 3, 3, 3]
n
100

First version of the alternating function#

Now we will actually create the list that alternates between 3s and 8s. Let’s start by giving our function defined above a more descriptive name. We will use const instead of f.

def const(n):
    mylist = []
    for i in range(n):
        mylist.append(3)
    return mylist
const(5)
[3, 3, 3, 3, 3]

We’re definitely not done yet, but let’s just get started by using our const function above. It is definitely better to use the name const in the code, rather than copying the code from above.

def alt1(n):
    mylist = const(n)
    return mylist
alt1(4)
[3, 3, 3, 3]

Think of the next function as first making a constant list of all 3s, and then replacing all the odd-indexed elements by 8. Using range(1,n,2) is a nice way to get access to those odd indices.

def alt1(n):
    mylist = const(n)
    for i in range(1,n,2):
        mylist[i] = 8
    return mylist

Let’s check that the code works for both an even input as well as for an odd input.

alt1(10)
[3, 8, 3, 8, 3, 8, 3, 8, 3, 8]
alt1(11)
[3, 8, 3, 8, 3, 8, 3, 8, 3, 8, 3]

Checking if the index is even or odd#

Here we’ll write a different version of the function, which will use an if statement to check if the index is even or odd. Notice that the above alt1 code is longer than it looks, since it is also using the const function that we wrote earlier.

Let’s first see how to check if an integer is even or odd. We use the % operator. If you’ve seen modular arithmetic (for example in Math 13), i%2 is like reducing \(i\) modulo 2. Another way to think about it is that the result is the remainder obtained when dividing \(i\) by 2.

i = 7
i%2
1
6%2
0

We can check if an integer i is even by checking i%2 == 0. We use an if statement, which works very similarly to in Matlab.

def alt2(n):
    mylist = []
    for i in range(n):
        if i%2 == 0:
            mylist.append(3)
        else:
            mylist.append(8)
    return mylist
alt2(5)
[3, 8, 3, 8, 3]
alt2(6)
[3, 8, 3, 8, 3, 8]

Here is a very similar version, where instead of else we use elif. When using elif, you need to specify a condition. These two versions do the same thing, because if the integer i is not even, it is always going to be odd. You can have as many elif statements as you want, although in this case, i%2 == 0 and i%2 == 1 exhaust all the possibilities, so in this case, no later elif statements (nor else statements) would ever be reached.

def alt2(n):
    mylist = []
    for i in range(n):
        if i%2 == 0:
            mylist.append(3)
        elif i%2 == 1:
            mylist.append(8)
    return mylist
alt2(5)
[3, 8, 3, 8, 3]
alt2(6)
[3, 8, 3, 8, 3, 8]

Using NumPy#

The most important Python library for Math 9 is NumPy. Because this is an external library, it needs to be installed on your computer before it can be used. Once it is installed, it can be imported as in the following cell. The convention (which we will always follow) is to abbreviate the library by np.

import numpy as np

Many parts of NumPy are very similar to Matlab. For example, NumPy has a zeros function, just like Matlab. We write np.zeros rather than just zeros, so that Python knows this function is defined in the NumPy library.

A = np.zeros((3,5))

This A is yet another type of object in Python. It is a type defined in the NumPy library, called ndarray. The “nd” portion stands for “n-dimensional”. In this case, A is two dimensional (it has a dimension of rows and a dimension of columns).

type(A)
numpy.ndarray
A
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])
A.shape
(3, 5)

We are going to make a new version of our alternating function, this time using NumPy. As a first step, here is a function that returns a length n NumPy array of all zeros.

def alt3(n):
    A = np.zeros(n)
    return A
alt3(7)
array([0., 0., 0., 0., 0., 0., 0.])
alt3(7)[0]
0.0

The decimal points are a hint that these zeros are floats rather than integers. Just for practice with data types, let’s specify that A should hold integers, not floats.

type(alt3(7)[0])
numpy.float64
A = alt3(7)
A.dtype
dtype('float64')

To change that data type from floats to integers, we will use the dtype argument.

help(np.zeros)
Help on built-in function zeros in module numpy:

zeros(...)
    zeros(shape, dtype=float, order='C', *, like=None)
    
    Return a new array of given shape and type, filled with zeros.
    
    Parameters
    ----------
    shape : int or tuple of ints
        Shape of the new array, e.g., ``(2, 3)`` or ``2``.
    dtype : data-type, optional
        The desired data-type for the array, e.g., `numpy.int8`.  Default is
        `numpy.float64`.
    order : {'C', 'F'}, optional, default: 'C'
        Whether to store multi-dimensional data in row-major
        (C-style) or column-major (Fortran-style) order in
        memory.
    like : array_like
        Reference object to allow the creation of arrays which are not
        NumPy arrays. If an array-like passed in as ``like`` supports
        the ``__array_function__`` protocol, the result will be defined
        by it. In this case, it ensures the creation of an array object
        compatible with that passed in via this argument.
    
        .. versionadded:: 1.20.0
    
    Returns
    -------
    out : ndarray
        Array of zeros with the given shape, dtype, and order.
    
    See Also
    --------
    zeros_like : Return an array of zeros with shape and type of input.
    empty : Return a new uninitialized array.
    ones : Return a new array setting values to one.
    full : Return a new array of given shape filled with value.
    
    Examples
    --------
    >>> np.zeros(5)
    array([ 0.,  0.,  0.,  0.,  0.])
    
    >>> np.zeros((5,), dtype=int)
    array([0, 0, 0, 0, 0])
    
    >>> np.zeros((2, 1))
    array([[ 0.],
           [ 0.]])
    
    >>> s = (2,2)
    >>> np.zeros(s)
    array([[ 0.,  0.],
           [ 0.,  0.]])
    
    >>> np.zeros((2,), dtype=[('x', 'i4'), ('y', 'i4')]) # custom dtype
    array([(0, 0), (0, 0)],
          dtype=[('x', '<i4'), ('y', '<i4')])

Here we specify that the dtype should be 64-bit integers.

def alt3(n):
    A = np.zeros(n, dtype=np.int64)
    return A
alt3(8)
array([0, 0, 0, 0, 0, 0, 0, 0])
A = alt3(8)
A
array([0, 0, 0, 0, 0, 0, 0, 0])

NumPy supports many types of elementwise operations, just like Matlab.

A+3
array([3, 3, 3, 3, 3, 3, 3, 3])

Most standard types defined in Python do not support this sort of elementwise operation. For example, list does not.

mylist = [0,0,0]
mylist+3
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/var/folders/8j/gshrlmtn7dg4qtztj4d4t_w40000gn/T/ipykernel_54885/4015182290.py in <module>
----> 1 mylist+3

TypeError: can only concatenate list (not "int") to list
def alt3(n):
    A = np.zeros(n, dtype=np.int64)+3
    return A
A = alt3(5)

We now have a version of the alternating function that returns a NumPy array of all 3s.

A
array([3, 3, 3, 3, 3])
A[1::2]
array([3, 3])

Here is a convenient syntax (again, it does not work for lists) for setting the odd-indexed elements of A to be 8.

A[1::2] = 8
A
array([3, 8, 3, 8, 3])
def alt3(n):
    A = np.zeros(n, dtype=np.int64)+3
    A[1::2] = 8
    return A

It’s always a good idea, when writing a function like this, to test it on both an even input and on an odd input.

alt3(5)
array([3, 8, 3, 8, 3])
alt3(6)
array([3, 8, 3, 8, 3, 8])
mylist = [3,1,4,1,5,9]
mylist[::2]
[3, 4, 5]
mylist[::2] = -17
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/var/folders/8j/gshrlmtn7dg4qtztj4d4t_w40000gn/T/ipykernel_54885/4194879934.py in <module>
----> 1 mylist[::2] = -17

TypeError: must assign iterable to extended slice

Here we convert mylist to a NumPy array.

myarray = np.array(mylist)

Now, because myarray is a NumPy array, we can use the slice syntax to set the even-indexed entries to be equal to -17.

myarray[::2] = -17
myarray
array([-17,   1, -17,   1, -17,   9])

If you really want to make this assignment using mylist, you need the right-hand side to be the same length as the number of slots you are trying to fill in. (So the NumPy version is much easier to use.)

mylist[::2] = [-17, -17, -17]
mylist
[-17, 1, -17, 1, -17, 9]

If you use the wrong number of terms on the right side, you will get an error.

mylist[::2] = [-17, -17]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/var/folders/8j/gshrlmtn7dg4qtztj4d4t_w40000gn/T/ipykernel_54885/777026901.py in <module>
----> 1 mylist[::2] = [-17, -17]

ValueError: attempt to assign sequence of size 2 to extended slice of size 3
mylist[::2] = [-17, -17, -17, -17]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/var/folders/8j/gshrlmtn7dg4qtztj4d4t_w40000gn/T/ipykernel_54885/502164326.py in <module>
----> 1 mylist[::2] = [-17, -17, -17, -17]

ValueError: attempt to assign sequence of size 4 to extended slice of size 3

Timing different strategies#

Here I have restarted this Jupyter notebook. Let’s compare some of the functions we have defined.

def alt3(n):
    A = np.zeros(n, dtype=np.int64)+3
    A[1::2] = 8
    return A

We need to remember to re-import NumPy each time we restart the notebook, otherwise we’ll get an error like in the following.

alt3(10)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
/var/folders/8j/gshrlmtn7dg4qtztj4d4t_w40000gn/T/ipykernel_57482/3247321496.py in <module>
----> 1 alt3(10)

/var/folders/8j/gshrlmtn7dg4qtztj4d4t_w40000gn/T/ipykernel_57482/3223429259.py in alt3(n)
      1 def alt3(n):
----> 2     A = np.zeros(n, dtype=np.int64)+3
      3     A[1::2] = 8
      4     return A

NameError: name 'np' is not defined
import numpy as np
def alt3(n):
    A = np.zeros(n, dtype=np.int64)+3
    A[1::2] = 8
    return A
alt3(10)
array([3, 8, 3, 8, 3, 8, 3, 8, 3, 8])

Here is another version, which is very similar to our original function from the top of this notebook (the only real difference is that I’m not using a separate const function in this case).

def alt4(n):
    mylist = []
    for i in range(n):
        mylist.append(3)
    for i in range(1,n,2):
        mylist[i] = 8
    return mylist
alt4(10)
[3, 8, 3, 8, 3, 8, 3, 8, 3, 8]

The functions alt3 and alt4 basically produce the same output, but alt3 returns a NumPy array whereas alt4 returns a list.

Let’s evaluate how long alt4 takes in the case of input 10. Here is an example of the unit of time being µs, which stands for microsecond, which is \(10^{-6}\) seconds.

%%timeit
alt4(10)
1.18 µs ± 21.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

We have already used for loops and if statements. A third type of fundamental programming construction is a while loop. Just for practice with while loops (this will be our first time seeing them), let’s try to find a value of n such that alt4(n) takes over 5 seconds to complete. I don’t know how to do this using the %%timeit Jupyter magic function, so instead, let’s use another Python library, the time library. This is a standard library in Python, so you don’t need to install it.

import time

This number is not very meaningful by itself. It becomes meaningful if we subtract another time from it, in which case we will get the number of seconds elapsed between the two calls.

time.time()
1657444015.338846
time.time()
1657444029.606127
start = time.time()
alt4(10**3)
end = time.time()
t = end-start
t
0.0003650188446044922

Here we keep multiplying n by 2 and checking the new time. We will exit the while loop once we have a value of n which takes over 5 seconds to complete. Try going through the following code, step-by-step, to convince yourself that this really does produce the first value of n we’ve found for which alt4(n) takes over 5 seconds to complete.

t = 0
n = 100
while t < 5:
    n = n*2 # n *= 2
    start = time.time()
    alt4(n)
    end = time.time()
    t = end-start

Here we check that the time really is bigger than 5.

t
5.250777959823608

The following value of n is the length we were looking for. So it took about 5.25 seconds to produce a length 52,428,800 list, alternating between 3s and 8s.

n
52428800
def alt3(n):
    A = np.zeros(n, dtype=np.int64)+3
    A[1::2] = 8
    return A

Notice how much faster it is using the NumPy version of our function. For the same input n, the alt3 function takes about a quarter of a second instead of over 5 seconds.

print(n)
start = time.time()
alt3(n)
end = time.time()
t = end-start
print(t)
52428800
0.26982712745666504

If you want the output to be a list instead of a NumPy array, you can convert to a list.

def alt3b(n):
    A = np.zeros(n, dtype=np.int64)+3
    A[1::2] = 8
    return list(A)
alt3b(10)
[3, 8, 3, 8, 3, 8, 3, 8, 3, 8]

Surprisingly (to me), making this conversion to a list really slows down the function.

print(n)
start = time.time()
alt3b(n)
end = time.time()
t = end-start
print(t)
52428800
4.124554872512817

Here is a mix between the two methods. It uses a for loop to assign the 8 values, instead of using the slicing syntax. The main lesson is that, when possible, we should always try to avoid for loops. The assignment using A[1::2] = 8 was much faster, over 10 times faster, than in the following where we make the same assignment using a for loop.

def alt3c(n):
    A = np.zeros(n, dtype=np.int64)+3
    for i in range(1,n,2):
        A[i] = 8
    return A
print(n)
start = time.time()
alt3c(n)
end = time.time()
t = end-start
print(t)
52428800
2.886206865310669