Boolean arrays in NumPy

A Boolean array by itself is not very interesting; it’s just a NumPy array whose entries are either True or False.

import numpy as np

bool_arr = np.array([True,True,False,True])
bool_arr

array([ True,  True, False,  True])

The reason Boolean arrays are important is that they are often produced by other operations.

arr = np.array([3,1,4,1])
arr < 3.5

array([ True,  True, False,  True])
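As a minimal sketch of what the comparison above is doing, the following checks that the result is an ordinary NumPy array of Booleans with the same shape as arr, and that it matches a comparison done one entry at a time. The names mask and by_hand are introduced here for illustration only.

```python
import numpy as np

arr = np.array([3, 1, 4, 1])

# Comparing an array with a number compares every entry separately.
mask = arr < 3.5

print(mask.dtype)   # bool
print(mask.shape)   # (4,)

# The same result, built entry by entry with a list comprehension.
by_hand = np.array([x < 3.5 for x in arr])
print(np.array_equal(mask, by_hand))  # True
```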

The number of Trues in a Boolean array can be counted very efficiently using np.count_nonzero.
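For instance, here is a small sketch that counts the Trues in the comparison arr < 3.5 from earlier. The names mask and n_true are just for illustration, and the call to sum is shown only as a cross-check of the count.

```python
import numpy as np

arr = np.array([3, 1, 4, 1])
mask = arr < 3.5            # array([ True,  True, False,  True])

# count_nonzero treats True as nonzero and False as zero.
n_true = np.count_nonzero(mask)
print(n_true)               # 3

# Summing gives the same count, since True behaves like 1 and False like 0.
print(mask.sum())           # 3
```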

From a small example, it might seem like the NumPy method is slower:

my_list = [3,1,4,3,5]
my_array = np.array(my_list)

my_list.count(3)

%%timeit
my_list.count(3)

75.9 ns ± 0.305 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

np.count_nonzero(my_array==3)

%%timeit
np.count_nonzero(my_array==3)

1.55 µs ± 7.77 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

But for a longer example, the NumPy method is clearly faster: in the timings below, it is more than 300 times faster. In this example, our array and list each have length ten million.

rng = np.random.default_rng()
my_array = rng.integers(1,6,size=10**7)
my_list = list(my_array)

my_list.count(3)

np.count_nonzero(my_array==3)

%%timeit
my_list.count(3)

985 ms ± 5.81 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%%timeit
np.count_nonzero(my_array==3)

3.04 ms ± 9.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
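The %%timeit lines above only work inside Jupyter or IPython. As a rough sketch of how the same comparison could be run in a plain Python script, the following uses the standard timeit module instead. The seed 0, the shorter length of one million, and the choice of 5 repetitions are assumptions made here to keep the run short; the actual times will depend on your machine.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)               # seed 0 is an arbitrary choice
my_array = rng.integers(1, 6, size=10**6)    # random integers from 1 to 5
my_list = list(my_array)

# Both methods should report the same number of 3s.
list_count = my_list.count(3)
array_count = np.count_nonzero(my_array == 3)
print(list_count == array_count)             # True

# Average time per call over 5 repetitions of each method.
list_time = timeit.timeit(lambda: my_list.count(3), number=5) / 5
array_time = timeit.timeit(lambda: np.count_nonzero(my_array == 3), number=5) / 5

print(f"list:  {list_time:.6f} s per call")
print(f"array: {array_time:.6f} s per call")
```

Checking that the two counts agree before timing is a useful habit: a speed comparison only means something if both methods compute the same answer.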

UC Irvine Math 10 W22