Pillow image processing

In this notebook, we will see examples of image processing using the Pillow fork of the PIL library. Image processing is interesting in its own right, but probably more importantly for Math 9, it provides a way to think about NumPy arrays in a visual way, including three-dimensional NumPy arrays. (So far, we have only worked with one-dimensional and two-dimensional NumPy arrays, which we think of like vectors and matrices, respectively.)

Pillow installation

Before you can import NumPy, you need to install NumPy. Sounds obvious enough, but for the image processing library we will use, the procedure is slightly different. See the following video, but briefly, we install the library using the name Pillow, and import the library using the name PIL.

Opening an image

In addition to importing Pillow, we will also import the os module (which is part of the Python standard library, so it does not need to be installed).

import os
from PIL import Image

We won’t do much with the Pillow library in this section, but here is a quick example of using Pillow to open an image file. In order for this command to work as written, the penguins.jpg file needs to be in the same folder as your Jupyter notebook file.

Image.open("penguins.jpg")
../_images/Pillow_6_0.png

If there are many files you need to access, it is inconvenient to need to have them all in the same folder. For the rest of this section, we will see some ways to use the os module to locate files. (Later we will briefly see a more modern, more object-oriented, approach to locating files, using the pathlib module.)

As a first example involving os, we can use its getcwd function to learn the location of the current folder (i.e., the location of the folder that your Jupyter Notebook file is in).

os.getcwd()
'/Users/christopherdavis/Documents/GitHub/UCI-Math-9-F22/Week4'

Another useful os function is listdir, which takes as input a string indicating a directory, and which returns a list of every file and directory in the input directory. (I use the words “folder” and “directory” interchangeably.) In the following, the string "images" doesn’t have any special meaning related to images; instead "images" is just the name of a subfolder of my current folder. In this case, we can see that the images folder contains 8 files.

os.listdir("images")
['penguins.jpg',
 'Seurat-var.png',
 'Seurat-orig.jpg',
 'altair.gif',
 'Davis-Square2017.png',
 'alphabet.png',
 'flowchart.jpeg',
 'test_grid.png']

If we don’t pass an argument to listdir, then the output is a list containing all the files and folders in the current folder. For example, we see the name of this notebook, Pillow.ipynb. We also see the images folder mentioned above. We also see some “hidden” files (the ones that start with a period).

os.listdir()
['Draft-Week4.ipynb',
 'Draft-Pillow.ipynb',
 'penguins.jpg',
 '.DS_Store',
 'images',
 'Pillow.ipynb',
 '.ipynb_checkpoints',
 'ObjectOriented.ipynb']

One of the files in the images folder was named “alphabet.png”. Let’s try to open it.

Image.open("alphabet.png")
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Input In [6], in <cell line: 1>()
----> 1 Image.open("alphabet.png")

File ~/miniconda3/envs/math9f22/lib/python3.10/site-packages/PIL/Image.py:3092, in open(fp, mode, formats)
   3089     filename = fp
   3091 if filename:
-> 3092     fp = builtins.open(filename, "rb")
   3093     exclusive_fp = True
   3095 try:

FileNotFoundError: [Errno 2] No such file or directory: 'alphabet.png'

An error is raised because Pillow does not know where to find the alphabet.png file. On my Mac, I could solve this by inputting "images/alphabet.png". On a PC, the conventional separator is instead a backslash, which has to be written as "images\\alphabet.png" in a Python string (a single backslash followed by the letter a would be read as an escape character). Hard-coding either separator is especially annoying if you are sharing code with someone on a different system. Luckily, there is a function in os.path which will automatically choose the correct separator.

os.path.join("images", "alphabet.png")
'images/alphabet.png'
s = os.path.join("images", "alphabet.png")

This variable s now specifies the path to the “alphabet.png” file, relative to our current location. This path is what is needed to open the file using Pillow.

Image.open(s)
../_images/Pillow_19_0.png
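
The text above mentions pathlib as a more object-oriented way to locate files. As a minimal sketch (the variable name p is only for illustration), the same relative path can be built with either module, and both use the separator of the current operating system:

```python
import os
from pathlib import Path

# Build the relative path with os.path.join, as in the text above.
s = os.path.join("images", "alphabet.png")

# Build the same path with pathlib; the / operator joins path components.
p = Path("images") / "alphabet.png"

# Both spell the path with the separator of the current operating system.
print(s)
print(str(p))
print(s == str(p))

# A Path object also knows its own parts.
print(p.name)     # alphabet.png
print(p.suffix)   # .png
```

Neither line checks whether the file actually exists; they only build the path.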

As a last example, let’s see two ways to find all the png files in the images folder. Here is a reminder of all the files in that folder.

os.listdir("images")
['penguins.jpg',
 'Seurat-var.png',
 'Seurat-orig.jpg',
 'altair.gif',
 'Davis-Square2017.png',
 'alphabet.png',
 'flowchart.jpeg',
 'test_grid.png']

Both our approaches will involve list comprehension. The first approach uses slicing, to indicate that we want all the strings whose last 3 characters are "png".

[x for x in os.listdir("images") if x[-3:] == "png"]
['Seurat-var.png', 'Davis-Square2017.png', 'alphabet.png', 'test_grid.png']

The second approach uses the string method endswith. I think both of these approaches have their advantages. This endswith approach is nice because you don’t need to indicate the length of "png". The slicing approach is nice because you are more likely to know about slicing than to know about this very specific method endswith.

[x for x in os.listdir("images") if x.endswith("png")]
['Seurat-var.png', 'Davis-Square2017.png', 'alphabet.png', 'test_grid.png']
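
Here is a small variation on the two approaches above, written as a sketch that uses a hard-coded copy of the file names rather than calling os.listdir. It shows three optional refinements: including the dot in the suffix, passing a tuple of suffixes to endswith, and using os.path.splitext so that the extension can be compared without regard to upper or lower case.

```python
import os

# The file names reported by os.listdir("images") in the text above.
names = ['penguins.jpg', 'Seurat-var.png', 'Seurat-orig.jpg', 'altair.gif',
         'Davis-Square2017.png', 'alphabet.png', 'flowchart.jpeg', 'test_grid.png']

# Including the dot avoids matching a name that merely ends in the letters "png".
png_files = [x for x in names if x.endswith(".png")]

# endswith also accepts a tuple, which matches any of the listed suffixes.
jpeg_files = [x for x in names if x.endswith((".jpg", ".jpeg"))]

# os.path.splitext separates the extension, so it can be lowercased first.
png_any_case = [x for x in names if os.path.splitext(x)[1].lower() == ".png"]

print(png_files)
print(jpeg_files)
```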

We haven’t done any image processing yet. So far, we only used the Pillow library as an excuse to talk about working with files in Python. In the rest of this notebook, we’ll work with the Pillow library much more closely.

Images and NumPy arrays

The Pillow library provides us with a visual way to work with NumPy arrays, which may help make some of the earlier topics more concrete. For example, in this section we will see slicing in the context of Pillow images.

from PIL import Image
import numpy as np

Above, we used Image.open to display an image in this Jupyter notebook. That isn’t usually how we will use Image.open. Usually we will save the resulting object, in this case, a JpegImageFile, so that we can perform image processing on it.

img = Image.open("penguins.jpg")
type(img)
PIL.JpegImagePlugin.JpegImageFile

The object img itself is not a NumPy array, but it can be converted to a NumPy array. There are a few different ways to make that conversion. We might talk about the differences later, but for now we will use the recommended way, which is to use np.asarray.

arr = np.asarray(img)

The resulting object, arr, is a NumPy array.

type(arr)
numpy.ndarray

The actual img variable itself is unchanged; it’s still a Pillow Image object.

img
../_images/Pillow_37_0.png

Because arr is a NumPy array, arr has all the usual attributes and methods of NumPy arrays. For example, here is its shape attribute. Notice that three different numbers are listed. I believe this is our first example in Math 9 of a three-dimensional NumPy array. The number 225 refers to the number of rows, the number 399 refers to the number of columns (notice that the image is wider than it is tall), and the number 3 refers to the RGB (Red-Green-Blue) values which specify the color of a given pixel.

arr.shape
(225, 399, 3)

Think of arr like a matrix with 225 rows and 399 columns. Each entry in this “matrix” is not a single number, but is instead a triple of numbers representing the RGB values. For example, the upper-left-most pixel has Red value of 178, Green value of 190, and Blue value of 206. The bigger the number, the more of that color is present in the given pixel. (For future reference, also notice that the dtype is specified as uint8, which stands for unsigned 8-bit integers. Later we will have to convert our own arrays to dtype=np.uint8 so that Pillow can handle the data type.)

arr[0,0]
array([178, 190, 206], dtype=uint8)

What does a number like 178 mean in this case? Is that a big value? Different normalizations are possible (using real numbers between 0 and 1 is another common choice of normalization), but the normalization Pillow uses is integer values from 0 (inclusive) to 256 (exclusive). That means there are exactly 256 options for each value, which corresponds to 8 bits or to 1 byte.

When we evaluate arr.min(), NumPy goes through all \(225 \cdot 399 \cdot 3\) numbers in the array and checks what is the minimum value which occurs.

arr.min()
0
arr.max()
255
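
The range of values for uint8 can also be checked without any image. This short sketch uses np.iinfo, which reports the limits of an integer dtype; the array pixels is a made-up example.

```python
import numpy as np

# np.iinfo reports the limits of an integer dtype.
info = np.iinfo(np.uint8)
print(info.min, info.max)   # 0 255

# 8 bits give 2**8 = 256 possible values, namely 0 through 255.
print(2**8)                 # 256

# One byte per value: an array of 6 uint8 entries occupies 6 bytes.
pixels = np.array([[178, 190, 206], [0, 0, 255]], dtype=np.uint8)
print(pixels.itemsize, pixels.nbytes)  # 1 6
```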

Here is an example of slicing in terms of the image. Let’s get every row, and the columns from index 300 onward. (And because we don’t specify a slice in the third dimension, the RGB dimension, all of the RGB values will be included.)

(arr[:, 300:]).shape
(225, 99, 3)

Here is a reminder of how our original image looked.

img
../_images/Pillow_48_0.png

We can convert from a NumPy array (as long as it has a suitable shape and dtype) to a Pillow Image using Image.fromarray. Here we make a Pillow Image out of our sliced NumPy array. Notice how it is the right-most portion of the image, but with every row, and with the colors unchanged.

Image.fromarray(arr[:, 300:])
../_images/Pillow_50_0.png
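
To experiment with slicing without needing the image file, here is a sketch that uses an all-zeros stand-in array (called fake here, a hypothetical name) with the same shape as the penguins array. Only the shapes are of interest.

```python
import numpy as np

# A stand-in array with the same shape as the penguins array (all zeros).
fake = np.zeros((225, 399, 3), dtype=np.uint8)

right = fake[:, 300:]        # every row, columns 300 and up
top = fake[:100]             # first 100 rows, every column
red = fake[:, :, 0]          # red channel only: a two-dimensional array
box = fake[50:100, 20:80]    # a 50-row by 60-column rectangle

print(right.shape)  # (225, 99, 3)
print(top.shape)    # (100, 399, 3)
print(red.shape)    # (225, 399)
print(box.shape)    # (50, 60, 3)
```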

Setting colors

We’ve seen above how to get a portion of a Pillow Image. Here we’ll see how to change part of that Image.

from PIL import Image
import numpy as np
img = Image.open("penguins.jpg")
arr = np.asarray(img)
img
../_images/Pillow_54_0.png

Let’s say we want to change the horizontal band from row 100 up to (but not including) row 150 to the color black, which is represented in RGB values as [0,0,0].

Image.fromarray(arr[100:150])
../_images/Pillow_56_0.png

The following almost works, but we get an error which says that arr is “read-only”.

arr[100:150] = 0
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [5], in <cell line: 1>()
----> 1 arr[100:150] = 0

ValueError: assignment destination is read-only

To fix that, we can make a copy of arr, using the array’s copy method.

B = arr.copy()
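
The read-only behavior can be imitated without an image file. In this sketch, the small array a is a made-up stand-in that we mark as read-only ourselves using setflags; the writeable flag then shows the difference between the array and its copy.

```python
import numpy as np

# A small stand-in array, marked read-only to imitate the array from the image.
a = np.zeros((4, 5, 3), dtype=np.uint8)
a.setflags(write=False)
print(a.flags.writeable)   # False

try:
    a[1:3] = 0
except ValueError as err:
    print("ValueError:", err)

# copy() returns a new array that owns its data and is writeable.
b = a.copy()
print(b.flags.writeable)   # True
b[1:3] = 255               # no error
```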

To understand a fundamental difference between arr and its copy B, let’s look at the size of these two objects. Not the size in terms of pixels, but the size in terms of how much space they take up in memory. We will use the sys module, which is part of the Python standard library.

import sys

The following represents the size of img in bytes. Notice how small it is, even though the picture itself looks relatively high-quality. The size of img is significantly less than even a kilobyte.

sys.getsizeof(img)
48

The variable arr also is very small in relation to how much data it seems to hold; it is only 144 bytes.

sys.getsizeof(arr)
144

For comparison, the copy we made, B, is over two hundred thousand bytes. The reason is that sys.getsizeof reports only the memory that belongs to the object itself. The objects img and arr refer to pixel data that is stored elsewhere, so that data is not counted, whereas B is actually holding its own copy of all of that data.

sys.getsizeof(B)
269469
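
The number 269469 reported above is consistent with the shape of the array: there are \(225 \cdot 399 \cdot 3 = 269325\) entries of one byte each, and the remaining 144 bytes match the size reported for arr. Here is a sketch using an all-zeros stand-in with the same shape and dtype; the exact overhead may differ on other versions of Python and NumPy, so it is printed rather than assumed.

```python
import sys
import numpy as np

# Stand-in with the shape and dtype of the penguins array.
B = np.zeros((225, 399, 3), dtype=np.uint8)

# nbytes counts the array data: one byte for each uint8 entry.
print(B.nbytes)            # 269325
print(225 * 399 * 3)       # 269325

# getsizeof adds the array object's own overhead; the exact amount
# depends on the NumPy and Python versions.
print(sys.getsizeof(B) - B.nbytes)

# A slice is a view that does not own its data, so only the overhead is counted.
view = B[:, :, 1]
print(view.base is B)
print(sys.getsizeof(view) < B.nbytes)
```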

So B takes up nearly two thousand times as much space as arr, but the benefit of B is that B is not read-only. Because B is not read-only, the following assignment works without raising an error.

B[100:150] = 0

We have now changed many of the pixels in B to correspond to black. We can view the image corresponding to B using Image.fromarray.

Image.fromarray(B)
../_images/Pillow_72_0.png

Setting the color to white is almost identical. White corresponds to the maximum value in each of the red, the green, and the blue slots. Under Pillow’s normalization, this maximum value is 255.

B[100:150] = 255
Image.fromarray(B)
../_images/Pillow_75_0.png

Let’s next try to set the color of this horizontal band to green, which corresponds in RGB values to [0,255,0]. In order to make this assignment, let’s recall the rules of broadcasting.

B[100:150].shape
(50, 399, 3)
color = np.array([0, 255, 0])
color.shape
(3,)

The shapes of B[100:150] and color are compatible with respect to broadcasting if, starting from the right-most dimension, the dimension sizes are either equal, or one of those dimensions is equal to 1. In this case, the right-most dimensions of B[100:150] and color are both equal to 3, and because color has no further dimensions, these shapes are compatible.

The following command is setting each of the 50 times 399 pixels in this horizontal band to be green. The only reason this assignment works is because the shapes of B[100:150] and color are compatible with respect to broadcasting.

B[100:150] = color
Image.fromarray(B)
../_images/Pillow_82_0.png
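
Here is the same broadcasting assignment on a tiny made-up array, so the result can be inspected directly. The sketch also uses np.broadcast_shapes (available in recent versions of NumPy) to check compatibility, and shows what happens with an incompatible shape.

```python
import numpy as np

B = np.zeros((6, 4, 3), dtype=np.uint8)   # small stand-in image, all black
color = np.array([0, 255, 0])             # green

# Shapes (2, 4, 3) and (3,) are compatible: the trailing dimensions agree.
B[2:4] = color
print(B[2, 0].tolist())    # [0, 255, 0]
print(B[0, 0].tolist())    # [0, 0, 0]

# np.broadcast_shapes reports the shape that results from broadcasting.
print(np.broadcast_shapes((2, 4, 3), (3,)))   # (2, 4, 3)

# A length-4 array does not match the trailing dimension 3, so this fails.
try:
    B[2:4] = np.array([0, 255, 0, 0])
except ValueError as err:
    print("ValueError:", err)
```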

Making an image from scratch

So far we’ve gone from an image to a NumPy array, and then we’ve made changes to the NumPy array and displayed the resulting image. Here we are going to start with a NumPy array and try to turn that into a Pillow Image object.

from PIL import Image
import numpy as np

View the following as corresponding to a 2-pixel by 2-pixel image. The top-left pixel will be red, the top-right pixel will be green, and the lower-left pixel will be blue. It’s not as clear what color the lower-right pixel should be, but that pixel will be maximum red and maximum green combined.

arr = np.array([[[255,0,0], [0,255,0]],
                [[0,0,255], [255,255,0]]])

When we try to convert arr to a Pillow Image, we get the following error. I get this error all the time when I try to make a Pillow image from scratch (as opposed to starting with a Pillow image and editing it). The error is telling us that the dtype is incorrect.

Image.fromarray(arr)
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/miniconda3/envs/math9f22/lib/python3.10/site-packages/PIL/Image.py:2953, in fromarray(obj, mode)
   2952 try:
-> 2953     mode, rawmode = _fromarray_typemap[typekey]
   2954 except KeyError as e:

KeyError: ((1, 1, 3), '<i8')

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
Input In [3], in <cell line: 1>()
----> 1 Image.fromarray(arr)

File ~/miniconda3/envs/math9f22/lib/python3.10/site-packages/PIL/Image.py:2955, in fromarray(obj, mode)
   2953         mode, rawmode = _fromarray_typemap[typekey]
   2954     except KeyError as e:
-> 2955         raise TypeError("Cannot handle this data type: %s, %s" % typekey) from e
   2956 else:
   2957     rawmode = mode

TypeError: Cannot handle this data type: (1, 1, 3), <i8

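We saw above that when we load an image from a file and convert it to a NumPy array, the resulting dtype is unsigned 8-bit integers. By contrast, the <i8 in the error message indicates that our array holds 8-byte (64-bit) signed integers. We’ll explicitly make our NumPy array have dtype uint8.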

arr = np.array([[[255,0,0], [0,255,0]],
                [[0,0,255], [255,255,0]]], dtype=np.uint8)

We can now successfully convert this NumPy array to a Pillow Image, but because it is only four total pixels, the resulting image is tiny. If you zoom in on the following output, maybe you can see the four pixels.

Image.fromarray(arr)
../_images/Pillow_92_0.png
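
If the array already exists, an alternative to specifying dtype at creation time is to convert it afterwards with astype. This is only a sketch of that alternative; the name arr8 is made up for illustration.

```python
import numpy as np

arr = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 0]]])

# Without dtype=..., NumPy picks its default integer type (int64 on many systems).
print(arr.dtype)

# astype returns a converted copy; the original array is unchanged.
arr8 = arr.astype(np.uint8)
print(arr8.dtype)    # uint8
print(arr8.shape)    # (2, 2, 3)

# Checking the range first guards against values that would not fit in uint8.
print(arr.min() >= 0 and arr.max() <= 255)   # True
```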

Instead of just displaying the Image, let’s assign it to the variable name img.

img = Image.fromarray(arr)

Pillow Image objects have a resize method that can be used in this case. We call resize with the argument (100,100), which will be the size of the resulting Image.

img.resize((100, 100))
../_images/Pillow_96_0.png

The only thing I don’t like about the above is the blurry transitions between the colors. I would rather it just use 50-by-50 pixels of red, 50-by-50 pixels of green, and so on. If we check the documentation for img.resize, we see that there is a resample keyword argument which can be used to influence these transitions.

help(img.resize)
Help on method resize in module PIL.Image:

resize(size, resample=None, box=None, reducing_gap=None) method of PIL.Image.Image instance
    Returns a resized copy of this image.
    
    :param size: The requested size in pixels, as a 2-tuple:
       (width, height).
    :param resample: An optional resampling filter.  This can be
       one of :py:data:`PIL.Image.Resampling.NEAREST`,
       :py:data:`PIL.Image.Resampling.BOX`,
       :py:data:`PIL.Image.Resampling.BILINEAR`,
       :py:data:`PIL.Image.Resampling.HAMMING`,
       :py:data:`PIL.Image.Resampling.BICUBIC` or
       :py:data:`PIL.Image.Resampling.LANCZOS`.
       If the image has mode "1" or "P", it is always set to
       :py:data:`PIL.Image.Resampling.NEAREST`.
       If the image mode specifies a number of bits, such as "I;16", then the
       default filter is :py:data:`PIL.Image.Resampling.NEAREST`.
       Otherwise, the default filter is
       :py:data:`PIL.Image.Resampling.BICUBIC`. See: :ref:`concept-filters`.
    :param box: An optional 4-tuple of floats providing
       the source image region to be scaled.
       The values must be within (0, 0, width, height) rectangle.
       If omitted or None, the entire source is used.
    :param reducing_gap: Apply optimization by resizing the image
       in two steps. First, reducing the image by integer times
       using :py:meth:`~PIL.Image.Image.reduce`.
       Second, resizing using regular resampling. The last step
       changes size no less than by ``reducing_gap`` times.
       ``reducing_gap`` may be None (no first step is performed)
       or should be greater than 1.0. The bigger ``reducing_gap``,
       the closer the result to the fair resampling.
       The smaller ``reducing_gap``, the faster resizing.
       With ``reducing_gap`` greater or equal to 3.0, the result is
       indistinguishable from fair resampling in most cases.
       The default value is None (no optimization).
    :returns: An :py:class:`~PIL.Image.Image` object.

In the above documentation, notice that the resampling options all have names like PIL.Image.Resampling.LANCZOS. This means that for example LANCZOS is an attribute of PIL.Image.Resampling. (We have already imported Image from PIL, so we would just call this Image.Resampling instead of PIL.Image.Resampling.) We can view all of the attributes using Python’s dir function.

It’s hard to predict how many attributes there will be, but in this case, there are just the values we saw in the documentation, together with some so-called “dunder” attributes (short for “double underscore”, because the names have two underscores at the beginning and at the end).

dir(Image.Resampling)
['BICUBIC',
 'BILINEAR',
 'BOX',
 'HAMMING',
 'LANCZOS',
 'NEAREST',
 '__class__',
 '__doc__',
 '__members__',
 '__module__']

The output of dir is a list, and here we iterate through the elements in the list.

for s in dir(Image.Resampling):
    print(s)
BICUBIC
BILINEAR
BOX
HAMMING
LANCZOS
NEAREST
__class__
__doc__
__members__
__module__

Here we are only interested in the attributes that do not start with an underscore.

for s in dir(Image.Resampling):
    if s[0] != "_":
        print(s)
BICUBIC
BILINEAR
BOX
HAMMING
LANCZOS
NEAREST

It would be tempting to use notation like resample=Image.Resampling.s, but Python reads this as a request for an attribute that is literally named s, not for the attribute whose name is stored in the variable s. We want something like f-strings, which substitute the value of a variable, but Image.Resampling.s is not a string, so f-strings will not work with this approach.

for s in dir(Image.Resampling):
    if s[0] != "_":
        print(s)
        display(img.resize((100, 100), resample=Image.Resampling.s))
BICUBIC
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [12], in <cell line: 1>()
      2 if s[0] != "_":
      3     print(s)
----> 4     display(img.resize((100, 100), resample=Image.Resampling.s))

File ~/miniconda3/envs/math9f22/lib/python3.10/enum.py:437, in EnumMeta.__getattr__(cls, name)
    435     return cls._member_map_[name]
    436 except KeyError:
--> 437     raise AttributeError(name) from None

AttributeError: s

One option, which works whenever we want to look up an attribute from its name as a string, is to use Python’s built-in getattr function (short for “get attribute”). Here we can see the effect of each of these resampling options.

for s in dir(Image.Resampling):
    if s[0] != "_":
        print(s)
        display(img.resize((100, 100), resample=getattr(Image.Resampling, s)))
BICUBIC
../_images/Pillow_108_1.png
BILINEAR
../_images/Pillow_108_3.png
BOX
../_images/Pillow_108_5.png
HAMMING
../_images/Pillow_108_7.png
LANCZOS
../_images/Pillow_108_9.png
NEAREST
../_images/Pillow_108_11.png

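Another approach, that I think is more natural in this case, is to use indexing notation, as in resample=Image.Resampling[s]. This will not always work: it works here because Image.Resampling is an Enum (notice enum.py in the error message above), and Enum classes allow their members to be looked up by name using square brackets.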

for s in dir(Image.Resampling):
    if s[0] != "_":
        print(s)
        display(img.resize((100, 100), resample=Image.Resampling[s]))
BICUBIC
../_images/Pillow_110_1.png
BILINEAR
../_images/Pillow_110_3.png
BOX
../_images/Pillow_110_5.png
HAMMING
../_images/Pillow_110_7.png
LANCZOS
../_images/Pillow_110_9.png
NEAREST
../_images/Pillow_110_11.png
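
The two lookup styles can be compared on a small enum of our own. In this sketch, Filter is a hypothetical stand-in for Image.Resampling, and its member values are made up; only the lookup behavior is the point.

```python
from enum import IntEnum

# A hypothetical stand-in for Image.Resampling; the member values are made up.
class Filter(IntEnum):
    NEAREST = 0
    BOX = 1
    BILINEAR = 2

s = "NEAREST"

# getattr looks up an attribute by name on any object.
print(getattr(Filter, s) is Filter.NEAREST)   # True

# Square brackets look up a member by name; this is specific to Enum classes.
print(Filter[s] is Filter.NEAREST)            # True

# Iterating over an Enum class yields its members, so no underscore filter is needed.
print([member.name for member in Filter])     # ['NEAREST', 'BOX', 'BILINEAR']
```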

Swapping two color channels

Here is our goal in this section:

  • Swap the green and blue color channels of the penguins image.

from PIL import Image
import numpy as np
img = Image.open("penguins.jpg")
arr = np.asarray(img)

Here is a reminder of what arr contains.

arr
array([[[178, 190, 206],
        [178, 190, 206],
        [178, 190, 206],
        ...,
        [121, 137, 171],
        [121, 137, 171],
        [121, 137, 171]],

       [[179, 191, 207],
        [179, 191, 207],
        [179, 191, 207],
        ...,
        [120, 136, 170],
        [120, 136, 170],
        [119, 135, 169]],

       [[179, 191, 207],
        [179, 191, 207],
        [179, 191, 207],
        ...,
        [119, 135, 169],
        [118, 134, 168],
        [117, 133, 167]],

       ...,

       [[225, 226, 231],
        [224, 225, 230],
        [222, 223, 228],
        ...,
        [ 46,  46,  58],
        [ 64,  67,  82],
        [ 80,  85, 104]],

       [[222, 223, 228],
        [222, 223, 228],
        [218, 219, 224],
        ...,
        [ 54,  56,  69],
        [ 55,  58,  75],
        [108, 115, 134]],

       [[215, 218, 225],
        [209, 212, 219],
        [196, 199, 206],
        ...,
        [ 75,  77,  89],
        [100, 108, 121],
        [127, 139, 155]]], dtype=uint8)

Let’s just think about the upper-left pixel, and try to swap its green and blue channels. (For simplicity we will consider the list version of this length-3 NumPy array.) Here is a common beginner’s programming mistake.

mylist = [178, 190, 206]
mylist[1] = mylist[2]
mylist[2] = mylist[1]

The problem of course is that 190, corresponding to mylist[1], gets lost when we execute the line mylist[1] = mylist[2], and there is no way to recover it. So we wind up with the number 206 repeated.

mylist
[178, 206, 206]

Here is what I think of as the standard solution to that error: we store mylist[1] in a temporary variable, so the value doesn’t get lost when we execute the line mylist[1] = mylist[2].

mylist = [178, 190, 206]
temp = mylist[1]
mylist[1] = mylist[2]
mylist[2] = temp

Here we can see that we did successfully swap the green and blue channels for this pixel.

mylist
[178, 206, 190]
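
For a plain Python list like this one, there is also a shorter idiom that avoids naming a temporary variable. This is an alternative to the approach above, not a replacement for it: Python evaluates the right-hand side completely before making either assignment.

```python
mylist = [178, 190, 206]

# The right-hand side is evaluated first, so neither value is lost.
mylist[1], mylist[2] = mylist[2], mylist[1]
print(mylist)   # [178, 206, 190]
```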

Let’s try to emulate that strategy for the entire NumPy array. If we want the green channel from every row and every column, we can use arr[:,:,1]. We store those green values in a temporary variable, and use the same procedure as above.

The first mistake is trying to make assignments to arr. The line temp = arr[:,:,1] is fine, but the next line arr[:,:,1] = arr[:,:,2] raises an error. Because arr is read-only, we need to use its copy B.

B = arr.copy()
temp = arr[:,:,1]
arr[:,:,1] = arr[:,:,2]
arr[:,:,2] = temp
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [9], in <cell line: 3>()
      1 B = arr.copy()
      2 temp = arr[:,:,1]
----> 3 arr[:,:,1] = arr[:,:,2]
      4 arr[:,:,2] = temp

ValueError: assignment destination is read-only

We now replace arr with B, but there is a very subtle error. It would be very difficult to predict this error in advance (it is related to how NumPy handles slices of an array).

B = arr.copy()
temp = B[:,:,1]
B[:,:,1] = B[:,:,2]
B[:,:,2] = temp

If we look at the upper-left-most pixel, we see the exact same issue as we saw above in the list version, even though it seems like we already applied the fix.

B[0,0]
array([178, 206, 206], dtype=uint8)

As a hint of what’s wrong, let’s look at the size in memory of temp.

from sys import getsizeof

We see that temp is 128 bytes.

getsizeof(temp)
128

Here are the contents of temp.

temp
array([[206, 206, 206, ..., 171, 171, 171],
       [207, 207, 207, ..., 170, 170, 169],
       [207, 207, 207, ..., 169, 168, 167],
       ...,
       [231, 230, 228, ...,  58,  82, 104],
       [228, 228, 224, ...,  69,  75, 134],
       [225, 219, 206, ...,  89, 121, 155]], dtype=uint8)

And here is the shape of temp. Does it really seem like a 225-by-399 array of integers can be stored in 128 bytes?

temp.shape
(225, 399)

The issue is that, when we evaluate temp = B[:,:,1], the contents of B[:,:,1] do not get copied over to temp. Instead, temp is what NumPy calls a view: it only remembers where in B it should get its contents. This can be convenient when working with very large NumPy arrays, but in this case, it breaks our code, because the assignment B[:,:,1] = B[:,:,2] not only changes B[:,:,1], but also changes temp.

One flexible work-around is to call the copy method on B[:,:,1], so that its contents get copied over to temp.

B = arr.copy()
temp = B[:,:,1].copy()
B[:,:,1] = B[:,:,2]
B[:,:,2] = temp

Notice how, in this case, B[0,0], representing the upper-left-most pixel, really does have three different channel values.

B[0,0]
array([178, 206, 190], dtype=uint8)
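
The difference between a slice and its copy can be checked directly. This sketch uses a tiny made-up array with two pixels, together with np.shares_memory and the base attribute, to show that the slice shares B's data while the copy does not.

```python
import numpy as np

# A small stand-in for B: one row of two pixels.
B = np.array([[[178, 190, 206], [121, 137, 171]]], dtype=np.uint8)

view = B[:, :, 1]          # basic slicing returns a view
copy = B[:, :, 1].copy()   # copy() makes independent data

print(np.shares_memory(view, B))   # True
print(np.shares_memory(copy, B))   # False
print(view.base is B)              # True

# Changing B changes the view, but not the copy.
B[:, :, 1] = B[:, :, 2]
print(view.tolist())   # [[206, 171]]
print(copy.tolist())   # [[190, 137]]
```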

I think the above solution is the most general and most enlightening. In our very particular situation, there is probably a simpler solution, due to the fact that we already have a copy of all the values saved, in the NumPy array arr.

B = arr.copy()
B[:,:,1] = arr[:,:,2]
B[:,:,2] = arr[:,:,1]

Here we get the same upper-left-most pixel as in our copy solution.

B[0,0]
array([178, 206, 190], dtype=uint8)
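
A third possibility, offered here only as a sketch and not as the method used in this notebook, is to reorder the channels in a single step by indexing the last axis with a list. Indexing with a list returns a new array rather than a view, so no read-only or shared-memory issue arises. The array below is a made-up stand-in with two pixels.

```python
import numpy as np

arr = np.array([[[178, 190, 206], [121, 137, 171]]], dtype=np.uint8)

# Indexing the last axis with a list reorders the channels: R, B, G.
# Indexing with a list (unlike slicing) returns a new array, not a view.
swapped = arr[:, :, [0, 2, 1]]

print(swapped[0, 0].tolist())          # [178, 206, 190]
print(np.shares_memory(swapped, arr))  # False
print(arr[0, 0].tolist())              # [178, 190, 206]
```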

To end this section, we will see how the image changes when its green and blue color channels are swapped.

Here is the original image.

img
../_images/Pillow_147_0.png

Here is the new image. Notice how the blue colors have been replaced by green colors. What other changes do you notice?

Image.fromarray(B)
../_images/Pillow_149_0.png

Unique colors

In this section, we will count the number of unique colors in an image. As usual, the point is not the importance of the image processing technique. Instead, the point is to practice with NumPy (in this case, the NumPy function unique together with an axis keyword argument).

from PIL import Image
import numpy as np

The penguins image we were working with above has thousands of colors. Here we will work with a simpler image, that has just three colors.

img = Image.open("images/test_grid.png")
arr = np.asarray(img)

It looks like there are only three colors in this image, but sometimes there are many more colors than you would expect, especially at the boundary between colors. In this case, we’ll verify that there are indeed exactly three colors.

img
../_images/Pillow_155_0.png

Even though this is a simpler image than the penguins photograph above, this grid is a significantly bigger image, in terms of pixels.

arr.shape
(2000, 2000, 3)

If we look at the corresponding NumPy array, we see that the very top row of pixels begins with three pixels with RGB values [180, 153, 12] and ends with three pixels with RGB values [12, 153, 152].

arr
array([[[180, 153,  12],
        [180, 153,  12],
        [180, 153,  12],
        ...,
        [ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152]],

       [[180, 153,  12],
        [180, 153,  12],
        [180, 153,  12],
        ...,
        [ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152]],

       [[180, 153,  12],
        [180, 153,  12],
        [180, 153,  12],
        ...,
        [ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152]],

       ...,

       [[ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152],
        ...,
        [180, 153,  12],
        [180, 153,  12],
        [180, 153,  12]],

       [[ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152],
        ...,
        [180, 153,  12],
        [180, 153,  12],
        [180, 153,  12]],

       [[ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152],
        ...,
        [180, 153,  12],
        [180, 153,  12],
        [180, 153,  12]]], dtype=uint8)

The function we want to use is np.unique, but if we call that function directly on arr, we will learn how many unique numbers appear in arr, whereas we want to know how many unique RGB triples of numbers occur.

np.unique(arr)
array([ 12, 152, 153, 180], dtype=uint8)

The first step to applying np.unique successfully in this context is to reshape the array so it is two-dimensional, with each row holding the RGB triple of a single pixel in the image.

B = arr.reshape(-1,3)

For example, at the top of B, we see the same RGB values that we noticed in the original three-dimensional NumPy array.

B
array([[180, 153,  12],
       [180, 153,  12],
       [180, 153,  12],
       ...,
       [180, 153,  12],
       [180, 153,  12],
       [180, 153,  12]], dtype=uint8)

Recall that arr was size \(2000 \times 2000 \times 3\). After reshaping, B has size \(4000000 \times 3\). (One way to think about it is, we don’t care about what rows and columns the pixels are in; all we care about are what are the different colors of pixels that occur.)

B.shape
(4000000, 3)

If we call np.unique(B), we get the exact same result as np.unique(arr). If we don’t specify an axis keyword argument to np.unique, then the input array is flattened, so its shape does not matter.

np.unique(B)
array([ 12, 152, 153, 180], dtype=uint8)

The key is to call the unique function with axis=0. This is saying, compare all of the rows to each other, and keep one copy of each unique row. (The way I think about it is that axis=0 names the axis whose length changes, in this case the rows axis.) In this case, we are going from 4 million rows to 3 rows (corresponding to 3 colors). We can see that the 0-dimension is changing, because B.shape[0] is 4000000 while C.shape[0] is 3, whereas B.shape[1] and C.shape[1] are both 3, so the 1-dimension or the columns dimension is not changing.

C = np.unique(B, axis=0)
C.shape
(3, 3)

From C we can confirm our suspicion that there were exactly three colors in our grid image. We had already seen two of these RGB triples, and the last one reported, [12, 12, 12], is very close to black, [0, 0, 0], so that is the dark color that is used for writing the letters and numbers.

C
array([[ 12,  12,  12],
       [ 12, 153, 152],
       [180, 153,  12]], dtype=uint8)
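
The same steps can be followed on a tiny made-up image, where the answer is easy to check by eye. The sketch also passes return_counts=True, an optional keyword argument of np.unique that reports how many times each row occurs.

```python
import numpy as np

# A tiny 2-by-3 "image" that uses two colors.
small = np.array([[[180, 153, 12], [180, 153, 12], [12, 153, 152]],
                  [[12, 153, 152], [180, 153, 12], [12, 153, 152]]], dtype=np.uint8)

pixels = small.reshape(-1, 3)    # one row per pixel
print(pixels.shape)              # (6, 3)

# Without axis, the array is flattened and individual numbers are compared.
print(np.unique(small).tolist())           # [12, 152, 153, 180]

# With axis=0, whole rows (RGB triples) are compared.
colors, counts = np.unique(pixels, axis=0, return_counts=True)
print(colors.tolist())           # [[12, 153, 152], [180, 153, 12]]
print(counts.tolist())           # [3, 3]
```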

We can tell that the colors change between [12, 153, 152] and [180, 153, 12] at very regular intervals. For the rest of this section, our goal is to determine where these changes occur.

img
../_images/Pillow_176_0.png

It suffices to just look at the very top-most row. If we know where the colors change in the top row, we will also know where they change in every row.

arr[0]
array([[180, 153,  12],
       [180, 153,  12],
       [180, 153,  12],
       ...,
       [ 12, 153, 152],
       [ 12, 153, 152],
       [ 12, 153, 152]], dtype=uint8)

The shape of the top row is \(2000 \times 3\), which represents 2000 pixels.

arr[0].shape
(2000, 3)

A first guess for how to check the colors is to evaluate the following, but the == is getting broadcast. For example, here we intended that [180, 153, 12] == [12, 153, 152] would evaluate to False, since the triples are not equal, but because of broadcasting, this is evaluating to [False, True, False].

arr[0] == [ 12, 153, 152]
array([[False,  True, False],
       [False,  True, False],
       [False,  True, False],
       ...,
       [ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])

The trick is to then apply all(axis=1). For example, this will convert [False, True, False] to False and it will convert [True, True, True] to True. (You might have expected us to use axis=2, because the RGB channels usually live in the third dimension, but remember that we are only looking at the top row of pixels, so we have cut down the dimension by one.)

(arr[0] == [ 12, 153, 152]).all(axis=1)
array([False, False, False, ...,  True,  True,  True])

As a sign that this is working correctly, if we check the shape of this Boolean array, we get that it is a one-dimensional NumPy array of length 2000. This is as we expected, because it represents the 2000 pixels in the top row of the image.

(arr[0] == [ 12, 153, 152]).all(axis=1).shape
(2000,)
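The same two steps can be tried on a tiny example. This is only a sketch: the array row below is a made-up stand-in for arr[0], with four pixels instead of 2000.

```python
import numpy as np

# A made-up "top row" of four pixels, so the shape is (4, 3).
row = np.array([
    [180, 153, 12],
    [180, 153, 12],
    [12, 153, 152],
    [12, 153, 152],
], dtype=np.uint8)
target = [12, 153, 152]

# Entry-by-entry comparison: the result has the same shape as row.
entrywise = (row == target)

# One Boolean per pixel: True only if all three channels match.
per_pixel = entrywise.all(axis=1)

print(entrywise.shape)
print(per_pixel)
```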

Let’s save this length-2000 Boolean array as BM, for “Boolean Mask”.

BM = (arr[0] == [ 12, 153, 152]).all(axis=1)

Where does BM switch from False to True? As a first step, let’s just check where it is True. We can find those indices by using np.nonzero. Recall that np.nonzero returns a tuple.

np.nonzero(BM)
(array([ 200,  201,  202,  203,  204,  205,  206,  207,  208,  209,  210,
         211,  212,  213,  214,  215,  216,  217,  218,  219,  220,  221,
         222,  223,  224,  225,  226,  227,  228,  229,  230,  231,  232,
         233,  234,  235,  236,  237,  238,  239,  240,  241,  242,  243,
         244,  245,  246,  247,  248,  249,  250,  251,  252,  253,  254,
         255,  256,  257,  258,  259,  260,  261,  262,  263,  264,  265,
         266,  267,  268,  269,  270,  271,  272,  273,  274,  275,  276,
         277,  278,  279,  280,  281,  282,  283,  284,  285,  286,  287,
         288,  289,  290,  291,  292,  293,  294,  295,  296,  297,  298,
         299,  300,  301,  302,  303,  304,  305,  306,  307,  308,  309,
         310,  311,  312,  313,  314,  315,  316,  317,  318,  319,  320,
         321,  322,  323,  324,  325,  326,  327,  328,  329,  330,  331,
         332,  333,  334,  335,  336,  337,  338,  339,  340,  341,  342,
         343,  344,  345,  346,  347,  348,  349,  350,  351,  352,  353,
         354,  355,  356,  357,  358,  359,  360,  361,  362,  363,  364,
         365,  366,  367,  368,  369,  370,  371,  372,  373,  374,  375,
         376,  377,  378,  379,  380,  381,  382,  383,  384,  385,  386,
         387,  388,  389,  390,  391,  392,  393,  394,  395,  396,  397,
         398,  399,  600,  601,  602,  603,  604,  605,  606,  607,  608,
         609,  610,  611,  612,  613,  614,  615,  616,  617,  618,  619,
         620,  621,  622,  623,  624,  625,  626,  627,  628,  629,  630,
         631,  632,  633,  634,  635,  636,  637,  638,  639,  640,  641,
         642,  643,  644,  645,  646,  647,  648,  649,  650,  651,  652,
         653,  654,  655,  656,  657,  658,  659,  660,  661,  662,  663,
         664,  665,  666,  667,  668,  669,  670,  671,  672,  673,  674,
         675,  676,  677,  678,  679,  680,  681,  682,  683,  684,  685,
         686,  687,  688,  689,  690,  691,  692,  693,  694,  695,  696,
         697,  698,  699,  700,  701,  702,  703,  704,  705,  706,  707,
         708,  709,  710,  711,  712,  713,  714,  715,  716,  717,  718,
         719,  720,  721,  722,  723,  724,  725,  726,  727,  728,  729,
         730,  731,  732,  733,  734,  735,  736,  737,  738,  739,  740,
         741,  742,  743,  744,  745,  746,  747,  748,  749,  750,  751,
         752,  753,  754,  755,  756,  757,  758,  759,  760,  761,  762,
         763,  764,  765,  766,  767,  768,  769,  770,  771,  772,  773,
         774,  775,  776,  777,  778,  779,  780,  781,  782,  783,  784,
         785,  786,  787,  788,  789,  790,  791,  792,  793,  794,  795,
         796,  797,  798,  799, 1000, 1001, 1002, 1003, 1004, 1005, 1006,
        1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017,
        1018, 1019, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028,
        1029, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039,
        1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1050,
        1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061,
        1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072,
        1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083,
        1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094,
        1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105,
        1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116,
        1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127,
        1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138,
        1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149,
        1150, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160,
        1161, 1162, 1163, 1164, 1165, 1166, 1167, 1168, 1169, 1170, 1171,
        1172, 1173, 1174, 1175, 1176, 1177, 1178, 1179, 1180, 1181, 1182,
        1183, 1184, 1185, 1186, 1187, 1188, 1189, 1190, 1191, 1192, 1193,
        1194, 1195, 1196, 1197, 1198, 1199, 1400, 1401, 1402, 1403, 1404,
        1405, 1406, 1407, 1408, 1409, 1410, 1411, 1412, 1413, 1414, 1415,
        1416, 1417, 1418, 1419, 1420, 1421, 1422, 1423, 1424, 1425, 1426,
        1427, 1428, 1429, 1430, 1431, 1432, 1433, 1434, 1435, 1436, 1437,
        1438, 1439, 1440, 1441, 1442, 1443, 1444, 1445, 1446, 1447, 1448,
        1449, 1450, 1451, 1452, 1453, 1454, 1455, 1456, 1457, 1458, 1459,
        1460, 1461, 1462, 1463, 1464, 1465, 1466, 1467, 1468, 1469, 1470,
        1471, 1472, 1473, 1474, 1475, 1476, 1477, 1478, 1479, 1480, 1481,
        1482, 1483, 1484, 1485, 1486, 1487, 1488, 1489, 1490, 1491, 1492,
        1493, 1494, 1495, 1496, 1497, 1498, 1499, 1500, 1501, 1502, 1503,
        1504, 1505, 1506, 1507, 1508, 1509, 1510, 1511, 1512, 1513, 1514,
        1515, 1516, 1517, 1518, 1519, 1520, 1521, 1522, 1523, 1524, 1525,
        1526, 1527, 1528, 1529, 1530, 1531, 1532, 1533, 1534, 1535, 1536,
        1537, 1538, 1539, 1540, 1541, 1542, 1543, 1544, 1545, 1546, 1547,
        1548, 1549, 1550, 1551, 1552, 1553, 1554, 1555, 1556, 1557, 1558,
        1559, 1560, 1561, 1562, 1563, 1564, 1565, 1566, 1567, 1568, 1569,
        1570, 1571, 1572, 1573, 1574, 1575, 1576, 1577, 1578, 1579, 1580,
        1581, 1582, 1583, 1584, 1585, 1586, 1587, 1588, 1589, 1590, 1591,
        1592, 1593, 1594, 1595, 1596, 1597, 1598, 1599, 1800, 1801, 1802,
        1803, 1804, 1805, 1806, 1807, 1808, 1809, 1810, 1811, 1812, 1813,
        1814, 1815, 1816, 1817, 1818, 1819, 1820, 1821, 1822, 1823, 1824,
        1825, 1826, 1827, 1828, 1829, 1830, 1831, 1832, 1833, 1834, 1835,
        1836, 1837, 1838, 1839, 1840, 1841, 1842, 1843, 1844, 1845, 1846,
        1847, 1848, 1849, 1850, 1851, 1852, 1853, 1854, 1855, 1856, 1857,
        1858, 1859, 1860, 1861, 1862, 1863, 1864, 1865, 1866, 1867, 1868,
        1869, 1870, 1871, 1872, 1873, 1874, 1875, 1876, 1877, 1878, 1879,
        1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 1888, 1889, 1890,
        1891, 1892, 1893, 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901,
        1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909, 1910, 1911, 1912,
        1913, 1914, 1915, 1916, 1917, 1918, 1919, 1920, 1921, 1922, 1923,
        1924, 1925, 1926, 1927, 1928, 1929, 1930, 1931, 1932, 1933, 1934,
        1935, 1936, 1937, 1938, 1939, 1940, 1941, 1942, 1943, 1944, 1945,
        1946, 1947, 1948, 1949, 1950, 1951, 1952, 1953, 1954, 1955, 1956,
        1957, 1958, 1959, 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967,
        1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978,
        1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989,
        1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999]),)

We are not so interested in this tuple of length one; instead we are interested in the NumPy array it holds. That NumPy array contains the indices where BM is True.

np.nonzero(BM)[0]
array([ 200,  201,  202,  203,  204,  205,  206,  207,  208,  209,  210,
        211,  212,  213,  214,  215,  216,  217,  218,  219,  220,  221,
        222,  223,  224,  225,  226,  227,  228,  229,  230,  231,  232,
        233,  234,  235,  236,  237,  238,  239,  240,  241,  242,  243,
        244,  245,  246,  247,  248,  249,  250,  251,  252,  253,  254,
        255,  256,  257,  258,  259,  260,  261,  262,  263,  264,  265,
        266,  267,  268,  269,  270,  271,  272,  273,  274,  275,  276,
        277,  278,  279,  280,  281,  282,  283,  284,  285,  286,  287,
        288,  289,  290,  291,  292,  293,  294,  295,  296,  297,  298,
        299,  300,  301,  302,  303,  304,  305,  306,  307,  308,  309,
        310,  311,  312,  313,  314,  315,  316,  317,  318,  319,  320,
        321,  322,  323,  324,  325,  326,  327,  328,  329,  330,  331,
        332,  333,  334,  335,  336,  337,  338,  339,  340,  341,  342,
        343,  344,  345,  346,  347,  348,  349,  350,  351,  352,  353,
        354,  355,  356,  357,  358,  359,  360,  361,  362,  363,  364,
        365,  366,  367,  368,  369,  370,  371,  372,  373,  374,  375,
        376,  377,  378,  379,  380,  381,  382,  383,  384,  385,  386,
        387,  388,  389,  390,  391,  392,  393,  394,  395,  396,  397,
        398,  399,  600,  601,  602,  603,  604,  605,  606,  607,  608,
        609,  610,  611,  612,  613,  614,  615,  616,  617,  618,  619,
        620,  621,  622,  623,  624,  625,  626,  627,  628,  629,  630,
        631,  632,  633,  634,  635,  636,  637,  638,  639,  640,  641,
        642,  643,  644,  645,  646,  647,  648,  649,  650,  651,  652,
        653,  654,  655,  656,  657,  658,  659,  660,  661,  662,  663,
        664,  665,  666,  667,  668,  669,  670,  671,  672,  673,  674,
        675,  676,  677,  678,  679,  680,  681,  682,  683,  684,  685,
        686,  687,  688,  689,  690,  691,  692,  693,  694,  695,  696,
        697,  698,  699,  700,  701,  702,  703,  704,  705,  706,  707,
        708,  709,  710,  711,  712,  713,  714,  715,  716,  717,  718,
        719,  720,  721,  722,  723,  724,  725,  726,  727,  728,  729,
        730,  731,  732,  733,  734,  735,  736,  737,  738,  739,  740,
        741,  742,  743,  744,  745,  746,  747,  748,  749,  750,  751,
        752,  753,  754,  755,  756,  757,  758,  759,  760,  761,  762,
        763,  764,  765,  766,  767,  768,  769,  770,  771,  772,  773,
        774,  775,  776,  777,  778,  779,  780,  781,  782,  783,  784,
        785,  786,  787,  788,  789,  790,  791,  792,  793,  794,  795,
        796,  797,  798,  799, 1000, 1001, 1002, 1003, 1004, 1005, 1006,
       1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017,
       1018, 1019, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028,
       1029, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039,
       1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 1050,
       1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 1060, 1061,
       1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 1070, 1071, 1072,
       1073, 1074, 1075, 1076, 1077, 1078, 1079, 1080, 1081, 1082, 1083,
       1084, 1085, 1086, 1087, 1088, 1089, 1090, 1091, 1092, 1093, 1094,
       1095, 1096, 1097, 1098, 1099, 1100, 1101, 1102, 1103, 1104, 1105,
       1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116,
       1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125, 1126, 1127,
       1128, 1129, 1130, 1131, 1132, 1133, 1134, 1135, 1136, 1137, 1138,
       1139, 1140, 1141, 1142, 1143, 1144, 1145, 1146, 1147, 1148, 1149,
       1150, 1151, 1152, 1153, 1154, 1155, 1156, 1157, 1158, 1159, 1160,
       1161, 1162, 1163, 1164, 1165, 1166, 1167, 1168, 1169, 1170, 1171,
       1172, 1173, 1174, 1175, 1176, 1177, 1178, 1179, 1180, 1181, 1182,
       1183, 1184, 1185, 1186, 1187, 1188, 1189, 1190, 1191, 1192, 1193,
       1194, 1195, 1196, 1197, 1198, 1199, 1400, 1401, 1402, 1403, 1404,
       1405, 1406, 1407, 1408, 1409, 1410, 1411, 1412, 1413, 1414, 1415,
       1416, 1417, 1418, 1419, 1420, 1421, 1422, 1423, 1424, 1425, 1426,
       1427, 1428, 1429, 1430, 1431, 1432, 1433, 1434, 1435, 1436, 1437,
       1438, 1439, 1440, 1441, 1442, 1443, 1444, 1445, 1446, 1447, 1448,
       1449, 1450, 1451, 1452, 1453, 1454, 1455, 1456, 1457, 1458, 1459,
       1460, 1461, 1462, 1463, 1464, 1465, 1466, 1467, 1468, 1469, 1470,
       1471, 1472, 1473, 1474, 1475, 1476, 1477, 1478, 1479, 1480, 1481,
       1482, 1483, 1484, 1485, 1486, 1487, 1488, 1489, 1490, 1491, 1492,
       1493, 1494, 1495, 1496, 1497, 1498, 1499, 1500, 1501, 1502, 1503,
       1504, 1505, 1506, 1507, 1508, 1509, 1510, 1511, 1512, 1513, 1514,
       1515, 1516, 1517, 1518, 1519, 1520, 1521, 1522, 1523, 1524, 1525,
       1526, 1527, 1528, 1529, 1530, 1531, 1532, 1533, 1534, 1535, 1536,
       1537, 1538, 1539, 1540, 1541, 1542, 1543, 1544, 1545, 1546, 1547,
       1548, 1549, 1550, 1551, 1552, 1553, 1554, 1555, 1556, 1557, 1558,
       1559, 1560, 1561, 1562, 1563, 1564, 1565, 1566, 1567, 1568, 1569,
       1570, 1571, 1572, 1573, 1574, 1575, 1576, 1577, 1578, 1579, 1580,
       1581, 1582, 1583, 1584, 1585, 1586, 1587, 1588, 1589, 1590, 1591,
       1592, 1593, 1594, 1595, 1596, 1597, 1598, 1599, 1800, 1801, 1802,
       1803, 1804, 1805, 1806, 1807, 1808, 1809, 1810, 1811, 1812, 1813,
       1814, 1815, 1816, 1817, 1818, 1819, 1820, 1821, 1822, 1823, 1824,
       1825, 1826, 1827, 1828, 1829, 1830, 1831, 1832, 1833, 1834, 1835,
       1836, 1837, 1838, 1839, 1840, 1841, 1842, 1843, 1844, 1845, 1846,
       1847, 1848, 1849, 1850, 1851, 1852, 1853, 1854, 1855, 1856, 1857,
       1858, 1859, 1860, 1861, 1862, 1863, 1864, 1865, 1866, 1867, 1868,
       1869, 1870, 1871, 1872, 1873, 1874, 1875, 1876, 1877, 1878, 1879,
       1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 1888, 1889, 1890,
       1891, 1892, 1893, 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901,
       1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909, 1910, 1911, 1912,
       1913, 1914, 1915, 1916, 1917, 1918, 1919, 1920, 1921, 1922, 1923,
       1924, 1925, 1926, 1927, 1928, 1929, 1930, 1931, 1932, 1933, 1934,
       1935, 1936, 1937, 1938, 1939, 1940, 1941, 1942, 1943, 1944, 1945,
       1946, 1947, 1948, 1949, 1950, 1951, 1952, 1953, 1954, 1955, 1956,
       1957, 1958, 1959, 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967,
       1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978,
       1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989,
       1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999])

Let’s save that NumPy array with the variable name inds, for “indices”.

inds = np.nonzero(BM)[0]

It’s worth thinking slowly about why the following works. We want to know all the indices x in inds such that x-1 is not in inds, because that means that at index x, we switched between colors. Here is how we can get those indices using a list comprehension. If you look through the previous array of indices, you’ll notice that 600 is in the array, but 599 is not in the array, so the pixels at index 599 and 600 have different colors.

[x for x in inds if x-1 not in inds]
[200, 600, 1000, 1400, 1800]

If you look at img, knowing that its total width is 2000 pixels (so each square might have width 200), it makes sense that the colors switch from the yellow-ish color to the green-ish color at positions 200, 600, 1000, 1400, 1800. The above list of numbers confirms this guess.
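The list comprehension checks x-1 not in inds once for every index, which can get slow on longer arrays. As an alternative (not the approach used above), here is a sketch that compares the mask with a shifted copy of itself. The array bm is a made-up, shorter stand-in for BM, with blocks of length 2 instead of 200.

```python
import numpy as np

# Made-up Boolean mask with the same pattern as BM, but much shorter.
bm = np.array([False, False, True, True, False, False, True, True])

# Approach from the text: indices x in inds such that x-1 is not in inds.
inds = np.nonzero(bm)[0]
starts_loop = [int(x) for x in inds if x - 1 not in inds]

# Vectorized alternative: index i+1 is a switch from False to True
# exactly when bm[i] is False and bm[i+1] is True.
starts_vec = np.nonzero(~bm[:-1] & bm[1:])[0] + 1

print(starts_loop)
print(starts_vec.tolist())
```

One difference to keep in mind: if the mask started with True, the list comprehension would report index 0, while the shifted comparison would not, because there is no earlier entry to compare with.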

img
../_images/Pillow_198_0.png

Changing colors using a Boolean mask#

Changing triples of values in a NumPy array is significantly harder than changing individual values. Here is our goal in this section.

  • Replace all [12, 12, 12] RGB triples with [0, 255, 255] using a Boolean mask.

from PIL import Image
import numpy as np

We will use the same grid as in the previous section.

img = Image.open("images/test_grid.png")
arr = np.asarray(img)

To motivate our approach, we are going to start with a smaller example.

A = np.array([9,2,4,5,1,1,1,0,2,10,5,8])

Recall how Boolean masking (which I also call Boolean indexing) works. For example, if we want to get exactly the values of A which are strictly greater than 5, we can first form the following Boolean array.

A > 5
array([ True, False, False, False, False, False, False, False, False,
        True, False,  True])

And then we can “apply this Boolean mask” by evaluating A[A > 5]. This will keep those values of A which are in slots containing True in our Boolean mask.

A[A > 5]
array([ 9, 10,  8])

If we instead want to keep pairs satisfying a certain condition, the situation is a little more complicated.

I don’t like the way the array B, defined below, is displayed by NumPy. It would be better to think of it as a 2-row by 3-column matrix, each entry of which is a pair of numbers. (In our actual example, the “pair of numbers” portion will be replaced by an RGB triple.)

Here is how I would visualize B:

\[\begin{split} \begin{pmatrix} [9,2] & [4,5] & [1,1] \\ [1,0] & [2,10] & [5,8] \end{pmatrix} \end{split}\]

B = A.reshape((2,3,2))
B
array([[[ 9,  2],
        [ 4,  5],
        [ 1,  1]],

       [[ 1,  0],
        [ 2, 10],
        [ 5,  8]]])

Say we want to find all the pairs of numbers whose sum is strictly greater than 5. If we just call B.sum(), that adds together all the numbers in B, which isn’t what we want.

B.sum()
48

As usual, we can include an axis keyword argument. In this case, we want to add up the last axis, axis=2.

B.sum(axis=2)
array([[11,  9,  2],
       [ 1, 12, 13]])

From here we can create a Boolean array just like before.

B.sum(axis=2) > 5
array([[ True,  True, False],
       [False,  True,  True]])

But what should be the result of indexing using this Boolean mask? It turns out that the 2-by-3 shape of B is completely lost, and the pairs satisfying the condition are listed as rows, one on top of the other. (This surprised me at first, but if you think about it, what better convention can you come up with? Keeping the shape the same and leaving the False slots blank would not be an allowable NumPy array.)

Anyway, here is the result of applying this Boolean mask. Be sure you understand how it relates to B.

B[B.sum(axis=2) > 5]
array([[ 9,  2],
       [ 4,  5],
       [ 2, 10],
       [ 5,  8]])

If we want to change the elements satisfying this condition, as opposed to displaying the elements, then the shape can be retained. (The old values are left unchanged in the False slots.) For example, here we replace all the pairs which have sum strictly greater than 5 with the pair [-4,-5].

B[B.sum(axis=2) > 5] = [-4,-5]

The shape is a little concealed by the way NumPy displays it. Here is a clearer visualization of the shape:

\[\begin{split} \begin{pmatrix} [-4,-5] & [-4,-5] & [1,1] \\ [1,0] & [-4,-5] & [-4,-5] \end{pmatrix} \end{split}\]

B
array([[[-4, -5],
        [-4, -5],
        [ 1,  1]],

       [[ 1,  0],
        [-4, -5],
        [-4, -5]]])

Once the above example makes perfect sense, our image processing example will be much easier to follow. Recall our goal:

  • Replace all [12, 12, 12] RGB triples with [0, 255, 255] using a Boolean mask.

In the following print-out, we can’t see any [12, 12, 12] triples, so none of the following displayed pixels should be changed.

arr
array([[[180, 153,  12],
        [180, 153,  12],
        [180, 153,  12],
        ...,
        [ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152]],

       [[180, 153,  12],
        [180, 153,  12],
        [180, 153,  12],
        ...,
        [ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152]],

       [[180, 153,  12],
        [180, 153,  12],
        [180, 153,  12],
        ...,
        [ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152]],

       ...,

       [[ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152],
        ...,
        [180, 153,  12],
        [180, 153,  12],
        [180, 153,  12]],

       [[ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152],
        ...,
        [180, 153,  12],
        [180, 153,  12],
        [180, 153,  12]],

       [[ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152],
        ...,
        [180, 153,  12],
        [180, 153,  12],
        [180, 153,  12]]], dtype=uint8)

We create a two-dimensional Boolean mask using all(axis=2), similar to the previous section. Again, all of the displayed values are False, so none of the displayed pixels should be changed.

(arr == [12, 12, 12]).all(axis=2)
array([[False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ...,
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]])

If we use Boolean indexing, we get the triples which satisfy the condition listed one on top of each other, just like above when we used Boolean indexing with B.

arr[(arr == [12, 12, 12]).all(axis=2)]
array([[12, 12, 12],
       [12, 12, 12],
       [12, 12, 12],
       ...,
       [12, 12, 12],
       [12, 12, 12],
       [12, 12, 12]], dtype=uint8)

How many pixels have RGB values [12, 12, 12]? A lot, nearly 500,000.

arr[(arr == [12, 12, 12]).all(axis=2)].shape
(467679, 3)

Our usual “read-only” error shows up in this context, but our NumPy strategy itself is correct.

arr[(arr == [12, 12, 12]).all(axis=2)] = [0, 255, 255]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [17], in <cell line: 1>()
----> 1 arr[(arr == [12, 12, 12]).all(axis=2)] = [0, 255, 255]

ValueError: assignment destination is read-only

As usual, we create a NumPy array which is not read-only by using the copy method.

C = arr.copy()
C[(C == [12, 12, 12]).all(axis=2)] = [0, 255, 255]
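To experiment with the read-only behavior without the image file, here is a small sketch. It assumes a made-up 2-by-2 array arr_small and marks it read-only by hand with setflags, as a stand-in for the array obtained from the image above.

```python
import numpy as np

# Stand-in for the image array: a 2-by-2 grid of made-up RGB triples.
arr_small = np.array([
    [[12, 12, 12], [180, 153, 12]],
    [[12, 153, 152], [12, 12, 12]],
], dtype=np.uint8)

# Simulate a read-only array (an assumption for this sketch).
arr_small.setflags(write=False)
print(arr_small.flags.writeable)

try:
    arr_small[0, 0] = [0, 255, 255]
except ValueError as err:
    print("ValueError:", err)

# A copy owns its own data and is writeable.
C_small = arr_small.copy()
print(C_small.flags.writeable)

dark = (C_small == [12, 12, 12]).all(axis=2)
C_small[dark] = [0, 255, 255]
print(C_small)
```

Checking the flags.writeable attribute is a quick way to find out in advance whether an assignment will raise this error.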

The fun thing about using image processing as our example is that we can display the results at the end in a visual manner. Here is the result of replacing all of the [12, 12, 12] pixels, which are close to black, with [0, 255, 255] pixels.

Image.fromarray(C)
../_images/Pillow_237_0.png

Changing colors using np.where#

In the previous section, we replaced the black text with a single blue-ish color. Here we are going to replace the square backgrounds with different random colors. In the previous section, we used Boolean indexing (which I also call a Boolean mask). I don’t know how to accomplish the goal of this section using Boolean indexing; in this section we will use np.where.

Here is the overall goal for this section.

  • Replace the square backgrounds with random colors.

from PIL import Image
import numpy as np
img = Image.open("images/test_grid.png")
arr = np.asarray(img)

We are going to use the function np.where to replace the square background colors. This function is a little complicated, but it is also very powerful. Let’s start out with seeing a basic application of this function, applied on a one-dimensional NumPy array.

A = np.array([9,2,4,5,1,1,1,0,2,10,5,8])

Consider the following example. The first argument to np.where is a Boolean array, and the next two arguments are numbers, -20 and 3. What this does is replace each True in the Boolean array with -20 and replace each False in the Boolean array with 3.

np.where(A > 5, -20, 3)
array([-20,   3,   3,   3,   3,   3,   3,   3,   3, -20,   3, -20])

That seems easy enough; here is a more sophisticated example. Instead of passing -20 as our True replacement, we pass the entire array A. Notice that A > 5 and A have the same shape. In this case, the values of True get replaced by the corresponding value in A.

np.where(A > 5, A, 3)
array([ 9,  3,  3,  3,  3,  3,  3,  3,  3, 10,  3,  8])

There was nothing special about using A itself in the previous example. Any array of that shape would have worked exactly the same. For example, here we use 100*A, and so the initial True gets replaced by \(900 = 100 \cdot 9\).

np.where(A > 5, 100*A, 3)
array([ 900,    3,    3,    3,    3,    3,    3,    3,    3, 1000,    3,
        800])

We can also replace the False replacement with an array. Here we use an np.arange. Notice how we ensure it has the same length as A.

np.where(A > 5, 100*A, np.arange(len(A)))
array([ 900,    1,    2,    3,    4,    5,    6,    7,    8, 1000,   10,
        800])
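One way to relate np.where to the Boolean indexing from the previous section: in the simple case where the True replacement is the array itself, np.where gives the same values as making a copy and assigning into the False slots. The following sketch checks this on A; the names via_where and via_mask are introduced only for this illustration.

```python
import numpy as np

A = np.array([9, 2, 4, 5, 1, 1, 1, 0, 2, 10, 5, 8])

# np.where builds a new array: keep A where A > 5, otherwise use 3.
via_where = np.where(A > 5, A, 3)

# The same values using a copy and Boolean-mask assignment.
via_mask = A.copy()
via_mask[~(A > 5)] = 3

print(via_where)
print(np.array_equal(via_where, via_mask))
```

A difference in practice is that np.where never modifies its inputs, so no copy is needed.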

That is most (but not all) of the information we need about np.where to replace the square background colors. Let’s remind ourselves what the original image looks like.

img
../_images/Pillow_252_0.png

To use np.where, the first argument should be some Boolean array. It seems easier to find the black colors than to find the background colors (because there is only one black color but there are two background colors). Since np.where replaces both the True values and the False values, it doesn’t really matter if we are using mask or ~mask in the following; we would just need to swap the following inputs.

mask = (arr == [12, 12, 12]).all(axis=2)

In NumPy’s preview display of mask, all of the values are False. That is because there are no black pixels in the top 3 rows nor in the bottom 3 rows of img.

mask
array([[False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       ...,
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]])

Let’s try something simple (no replacing of colors). Let’s just pass arr for both the True replacement as well as the False replacement. Try reading the error message we get, and see if you can tell what the problem is.

np.where(mask, arr, arr)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [11], in <cell line: 1>()
----> 1 np.where(mask, arr, arr)

File <__array_function__ internals>:180, in where(*args, **kwargs)

ValueError: operands could not be broadcast together with shapes (2000,2000) (2000,2000,3) (2000,2000,3) 

In our earlier example, A > 5 had the exact same shape as 100*A and as np.arange(len(A)). In our example here, mask does not have the same shape as arr. There is no hope of using reshape on mask to get it into shape (2000, 2000, 3), but we don’t actually need them to have the same shape. Instead, we just need them to be broadcastable to the same shape.

As it stands, the final (right-most) dimensions (2000 and 3) are not compatible in terms of broadcasting, but we can fix that easily by adding a new dimension of size 1 at the end of mask. We add this third dimension using reshape. (Aside: In the following, typing 2000 twice does not feel very robust. I would rather type something involving arr. Can you find a better approach?)

mask = (arr == [12, 12, 12]).all(axis=2).reshape(2000, 2000, 1)
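For readers who want to avoid hard-coding the sizes, here is a sketch of a few possible ways to add the trailing axis of size 1. It uses a made-up 4-by-5 array arr_small in place of arr; the names m1, m2, m3 are for illustration only.

```python
import numpy as np

# Stand-in for arr: a 4-by-5 image with 3 channels and one dark pixel.
arr_small = np.zeros((4, 5, 3), dtype=np.uint8)
arr_small[1, 2] = [12, 12, 12]

mask2d = (arr_small == [12, 12, 12]).all(axis=2)  # shape (4, 5)

# Three ways to reach shape (4, 5, 1) without typing 4 and 5 by hand.
m1 = mask2d.reshape(arr_small.shape[0], arr_small.shape[1], 1)
m2 = mask2d[:, :, np.newaxis]
m3 = (arr_small == [12, 12, 12]).all(axis=2, keepdims=True)

print(m1.shape, m2.shape, m3.shape)

# With the extra axis, np.where broadcasts without raising an error,
# and passing the same array twice returns the same values.
same = np.where(m2, arr_small, arr_small)
print((same == arr_small).all())
```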

The output of the following is not so important. (The output should be identical to arr; can you check that using all?) What I mainly wanted to see is that the following does not raise an error, so the shapes are indeed compatible.

np.where(mask, arr, arr)
array([[[180, 153,  12],
        [180, 153,  12],
        [180, 153,  12],
        ...,
        [ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152]],

       [[180, 153,  12],
        [180, 153,  12],
        [180, 153,  12],
        ...,
        [ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152]],

       [[180, 153,  12],
        [180, 153,  12],
        [180, 153,  12],
        ...,
        [ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152]],

       ...,

       [[ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152],
        ...,
        [180, 153,  12],
        [180, 153,  12],
        [180, 153,  12]],

       [[ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152],
        ...,
        [180, 153,  12],
        [180, 153,  12],
        [180, 153,  12]],

       [[ 12, 153, 152],
        [ 12, 153, 152],
        [ 12, 153, 152],
        ...,
        [180, 153,  12],
        [180, 153,  12],
        [180, 153,  12]]], dtype=uint8)

Now that we have our template, let’s work on the False replacements (i.e., the background color replacements). We want random background colors, so let’s start by instantiating a NumPy random number generator.

rng = np.random.default_rng()

Our colors correspond to RGB triples, with integers between 0 and 255 (inclusive). We want a 10-by-10 grid of random colors, so we will use shape (10,10,3); think of this as a 10-by-10 matrix of random RGB values.

colors = rng.integers(256, size=(10,10,3))

We saw a few sections ago that every square has side-length 200, so we are going to repeat each color 200 times. We will use the repeat method, which accepts an axis argument to say which axis to repeat along; we will repeat along both axis=0 and axis=1.

Another way to think about this is that we eventually want something broadcastable to shape (2000, 2000, 3), and our colors variable has shape (10, 10, 3), so it is natural to repeat 200 times in the rows and the columns dimensions.

Y = colors.repeat(200, axis=0).repeat(200, axis=1)

Notice how we can see that colors are now getting repeated.

Y
array([[[182, 207, 208],
        [182, 207, 208],
        [182, 207, 208],
        ...,
        [ 69, 118, 212],
        [ 69, 118, 212],
        [ 69, 118, 212]],

       [[182, 207, 208],
        [182, 207, 208],
        [182, 207, 208],
        ...,
        [ 69, 118, 212],
        [ 69, 118, 212],
        [ 69, 118, 212]],

       [[182, 207, 208],
        [182, 207, 208],
        [182, 207, 208],
        ...,
        [ 69, 118, 212],
        [ 69, 118, 212],
        [ 69, 118, 212]],

       ...,

       [[231, 209, 153],
        [231, 209, 153],
        [231, 209, 153],
        ...,
        [111, 105,  86],
        [111, 105,  86],
        [111, 105,  86]],

       [[231, 209, 153],
        [231, 209, 153],
        [231, 209, 153],
        ...,
        [111, 105,  86],
        [111, 105,  86],
        [111, 105,  86]],

       [[231, 209, 153],
        [231, 209, 153],
        [231, 209, 153],
        ...,
        [111, 105,  86],
        [111, 105,  86],
        [111, 105,  86]]])

Recall that colors had shape (10, 10, 3); we passed that tuple explicitly to rng.integers.

colors.shape
(10, 10, 3)

After our repetitions, Y now has shape (2000, 2000, 3).

Y.shape
(2000, 2000, 3)
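The effect of the two repeat calls is easier to see on a tiny array. In this sketch, small is a made-up 2-by-2 grid that uses single numbers instead of RGB triples, and each entry is repeated 3 times instead of 200. The np.tile call is included only as a contrast; it is not used in this section.

```python
import numpy as np

# Made-up 2-by-2 grid; single numbers keep the output readable.
small = np.array([[1, 2],
                  [3, 4]])

rows_repeated = small.repeat(3, axis=0)          # shape (6, 2)
both_repeated = rows_repeated.repeat(3, axis=1)  # shape (6, 6)
print(both_repeated)

# Contrast: np.tile repeats the whole array as a pattern,
# so it does not produce solid blocks.
tiled = np.tile(small, (3, 3))
print(tiled)
```

With repeat, each entry of small becomes a solid 3-by-3 block, which is what we want for solid squares of color.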

The NumPy array Y contains the values we are going to use to replace the False values (True corresponds to the black color, and False corresponds to the other two colors). So Y is the final argument to np.where. The only other thing we need to do, to get an image out of this, is to make sure the result has the correct dtype of unsigned 8-bit integers. We do that here using the astype method. (Another option, which we used before, would be to do this in two steps, where the second step is setting the dtype.)

Here is the resulting image.

Image.fromarray(np.where(mask, arr, Y).astype(np.uint8))
../_images/Pillow_276_0.png
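The whole procedure can also be tried without the image file. The following is a sketch under stated assumptions: it builds a made-up 6-by-6 "image" with 3 squares per side of side length 2, and one dark pixel per square standing in for the text. The names side, n, and small_arr are hypothetical, and the seed is there only to make the run reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only for reproducibility

side = 2  # side length of each square (200 in the text)
n = 3     # number of squares per side (10 in the text)

# Made-up image: one background color everywhere,
# plus one dark "text" pixel in the corner of each square.
small_arr = np.zeros((n * side, n * side, 3), dtype=np.uint8)
small_arr[:, :] = [180, 153, 12]
small_arr[::side, ::side] = [12, 12, 12]

# Boolean mask with a trailing axis of size 1, for broadcasting.
mask = (small_arr == [12, 12, 12]).all(axis=2)[:, :, np.newaxis]

# One random color per square, repeated to fill the square.
colors = rng.integers(256, size=(n, n, 3))
Y = colors.repeat(side, axis=0).repeat(side, axis=1)

# Keep the dark pixels, replace everything else, then fix the dtype.
result = np.where(mask, small_arr, Y).astype(np.uint8)
print(result.shape, result.dtype)
```

As a side note, rng.integers also accepts a dtype keyword argument, so passing dtype=np.uint8 when creating colors would be another way to end up with unsigned 8-bit integers.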