In this blog, we’ll explore how to solve problems with NumPy for image processing by leveraging concepts such as broadcasting, vectorization, advanced indexing and more. We’ll also show that by using the right NumPy functions for the right tasks, the result is a performance improvement (speed-wise) that is much better than basic inefficient solutions.

NumPy is a great library that provides optimized functionality for a large number of array-based operations. Though it’s fairly straightforward to leverage the full power of NumPy for simple tasks, it is not always readily evident how to implement certain functionality for some applications optimally.

Though we will not be providing an introduction to the concepts listed above, we *will* be demonstrating why they are important and how they can be used practically. For that purpose, we have designed a few **toy problems** which we will solve below. The optimal techniques discussed for these toy problems can be easily extended to issues that you might face while processing data for practical computer vision applications such as semantic segmentation and object detection.

And don’t worry if you are unsure about the concepts mentioned above; we’ve included an abundance of visual and textual explanations to help. If you’re after a refresher, there are plenty of resources available.

Alright, let’s get started by defining our toy problems.

## The Toy Problems for Numpy Image Processing

We will be looking at two toy problems in our exploration of Numpy for image processing. They are:

### Problem #1: Full Replacement

Given the input image in Figure 1, convert all the pixels with color

to the color **[255, 0, 0]**** [255, 255, 255]**.

**Note:** All colors in this blog are represented as [R, G, B] values.

### Problem #2: Conditional Replacement

Given the input image in Figure 2, convert all the pixels **within** the specified boxes **and** with color ** [255, 0, 0]** to the color

**. The boxes are visualized in Figure 2 (middle) with stars representing the top-left and bottom-right coordinates of the boxes.**

`[255, 255, 255]`

As you can see, problem #2 is an extension of problem #1.

At first glance the problems may seem simple. But while the core problems are simple, the implementation details will show us why it is typically recommended to leverage the power of vectorization using NumPy rather than iterating over pixels using python loops.

In the following sections we’ll explore a few solutions to these problems. One solution uses explicit python loops to iterate over pixels to solve the problem. The other solution(s) use NumPy functions. Following these solutions, we will compare **how long it takes** for **each method** to solve our problems.

We will also measure the time taken by each method for **different** image sizes (or **scales**). This will show us how the speed-up obtained from an optimal method varies across multiple scales. The input image for both problems is available in **five scales** (or **sizes**), and the height of each image is half its width. For this reason, we’ll refer to the image in each scale solely by its **width**. The width of the images in the five scales are: 2000px, 1000px, 500px, 250px and 100px.

**Note:** The image was originally created with width 2000px and height 1000px and was resized to the other scales while maintaining the aspect ratio.

Please note that all the time comparisons and plots presented here are based on measurements taken on this colab notebook. You can replicate the experiment on your own device by cloning and following the instructions in the above notebook link or my GitHub repository.

**Disclaimer**: The measurements were taken on 2nd August 2020 on the colab notebook mentioned in the GitHub repository. The exact timings and speed-ups **may vary** based on several factors such as software/hardware versions and so on. It might also be affected by some extraneous factors such cpu/memory usage and so on. Hence, refer to these measurements only to get an **approximate** sense of the differences in performance. Refer to the colab notebook for more information about the versions of some of the software used.

Now that the problems are set, let’s begin.

## Solution — Problem #1: Full Replacement

Let’s analyze the input image from Figure 1 in more detail. Figure 3 below includes some helpful annotations to help us understand the problem better.

The input image is represented in the form of a 3D NumPy array named `img`

which has the shape `[H, W, 3]`

. Let’s focus on just the first two dimensions (height and width).

Let’s denote row-indices by `r`

and column-indices by `c`

. From Figure 3, we can see that `r`

can take any integer value in the range [0, H-1] (inclusive at both ends) and `c`

can take any integer value in the range [0, W-1] (inclusive at both ends). We can represent one pixel in the image by its **coordinate **`(r, c)`

. Moreover, `img[r, c, :]`

will give the RGB value of the image at `(r, c)`

.

This should give you some ideas as to what variables and tricks you can leverage to solve this problem. Now let’s look at a couple of solutions.

### Method I: Simple Python For Loops

Naturally, a simple for-loop based solution can be applied to this problem. Assuming that we loaded the image to the NumPy array `img`

, the below code-block provides a solution to the problem:

`output = img.copy()`

`H, W = img.shape[:2]`

`for r in range(H):`

` for c in range(W):`

` if np.all(img[r, c, :] == [255, 0, 0]):`

` output[r, c, :] = [255, 255, 255]`

In the above code we just looped over every single pixel and checked if its RGB value was equal to `[255, 0, 0]`

. If so, we overwrite the value of that pixel location in the `output`

NumPy array to be `[255, 255, 255]`

.

Pretty simple and concise, right? However, as we will soon observe this method is inefficient.

### Method II: Using NumPy Image Processing Functions

So the question is, how do we make this more efficient? Let’s try to build a solution that leverages NumPy tricks and methods to harness the power of vectorization.

As before, we’ll assume that we loaded the image to the NumPy array `img`

. The following code-block provides an optimized solution:

`output = img.copy()`

`valid = np.all(img == [255, 0, 0], axis = -1)`

`rs, cs = valid.nonzero()`

`output[rs, cs, :] = [255, 255, 255]`

The code still looks very concise, but is also a bit more complex. If you are familiar with how broadcasting and advanced indexing works, you’ll be able to see what’s going on here. If you aren’t, here’s an explanation:

- Instead of manually looping over every pixel and checking for equality, we create a “mask” for the entire image using NumPy functions.
- The “mask” (represented by
`valid`

) is a NumPy array with the same height and width as of the original image. It has a**boolean**value at**each coordinate**. If the value is**True**, then that coordinate in the image has the color`[255, 0, 0]`

. If the value is**False**, then that coordinate does**not**have the color`[255, 0, 0]`

. - We can now use
`valid.nonzero()`

to get the indices of the elements of`valid`

that are non-zero (here,**not False**). Figure 4 provides a neat visual example of how it works. You can also refer to its documentation for more information and examples.

- Now, we can directly use
`rs`

and`cs`

to access the pixels with value`[255, 0, 0]`

of the`output`

array and assign them the value`[255, 255, 255]`

.

Figure 5 below shows an approximate depiction of the full workflow.

By leveraging the concepts of advanced indexing and broadcasting, we can remove the need for explicit python loops. This enables us to harness the power of vectorization.

But was this worth the additional effort? Let’s find out.

### Performance Comparison

Lets apply Methods I and II to the image with width 2000px (and height 1000px) to solve Problem #1. We’ll use the timeit module from Python to measure the time taken to solve the problem.

From the image above we can see how much more efficient Method II is compared to Method I.

The time taken to solve the problem using both methods in all the five scales is presented in the bar plots below. The plot on the right is a zoomed in version of the plot on the left. The zoomed in version was included to highlight the performance difference.

The zoomed in version also includes the **speedup** obtained using Method II (For Method II, speedup is the time taken by Method I divided by time taken by Method II).

## Solution — Problem #2: Conditional Replacement

Let’s analyze the input image from Figure 2 in more detail. Figure 8 below has some helpful annotations added to the input image to help us better understand the problem.

The axis, indices, height and width parameters of the image are the same as Problem #1. The only difference is that we have to perform replacement only inside the dotted boxes. The dotted boxes are represented by the top-left and bottom-right coordinates (marked by stars in the figure below). Note that the boxes and stars are **imaginary** (i.e. they don’t exist on the input image) and are only displayed for visualization purposes.

We are given the top-right and bottom-left coordinates of all the boxes in a NumPy array called `boxes`

. The coordinates have been cast to integers to simplify our problem. In the code-block, `w`

refers to 1/10th of the width of the image and `h`

refers to 1/10th of the height of the image.

**height** = **width** // 2

**h, w** = height / 10, width / 10

**boxes** = np.array([

` [[h, w], [9*h, 2*w]],`

` [[6.5*h, 2.5*w], [9*h, 8*w]],`

` [[h, 2.5*w], [5.5*h, 6*w]],`

` [[h, 6.5*w], [5.5*h, 9*w]],`

` [[6.5*h, 8.5*w], [9*h, 9*w]]`

`]).astype(np.int32)`

Since `w`

and `h`

both depend on the image dimensions, the size of our boxes are proportional to the size of the image. This makes it easier for us to analyze the performance at different scales. Now let’s look at a couple of solutions.

### Method I: Simple Python For Loops

The simple Python loop based solution is very straightforward. Essentially, it has the same structure as Method I in Problem #1, but some minor changes were made so we only traverse the pixels within the imaginary boxes.

The code-block below displays the solution. Since it is similar to Method I of Problem #1, we will not analyze this further.

`output = img.copy()`

`H, W = img.shape[:2]`

**## Looping over all boxes**

`for box in boxes:`

` `

**## Getting the box coordinates**

` top_h, top_w = box[0]`

` bot_h, bot_w = box[1]`

` `

**## Similar to Method I of Problem #1**

` for r in range(top_h, bot_h + 1):`

` for c in range(top_w, bot_w + 1):`

` if np.all(img[r, c, :] == [255, 0, 0]):`

` output[r, c, :] = [255, 255, 255]`

### Method II: NumPy Image Processing Functions (Type 1)

The NumPy based solution can be approached in a variety of ways. One simple solution is to extract the portion of the image within each imaginary box into sub-images and repeat the same solution that we used for Method II of Problem #1. You can see that in the code-block below:

`output = img.copy()`

**## Looping over all boxes**

`for box in boxes:`

` `

**## Getting the box coordinates**

` top_h, top_w = box[0]`

` bot_h, bot_w = box[1]`

` `

**## Extracting sub-images**

` sub_img_inp = img[top_h: bot_h + 1, top_w: bot_w + 1]`

` sub_img_out = output[top_h: bot_h + 1, top_w: bot_w + 1]`

` `

**## Similar to Method II of Problem #1**

` valid = np.all(sub_img_inp == [255, 0, 0], axis = -1)`

` rs, cs = valid.nonzero()`

` sub_img_out[rs, cs, :] = [255, 255, 255]`

In the above code-block, we are looping over the `boxes`

array. The variable `box`

represents one box. We can directly use the coordinate information available in `box`

to slice the NumPy arrays `img`

and `output`

to get `sub_img_inp`

and `sub_img_out`

respectively.

Both the newly created NumPy arrays (`sub_img_inp`

and `sub_img_out`

) are views of the original NumPy arrays (`img`

and `output`

respectively). This means that if we modify values in the new arrays, the change will be reflected in the respective old arrays.

This trick makes our problem much simpler. Now we can just use Method II of Problem #1 on `sub_img_inp`

and `sub_img_out`

for each `box`

. The diagram below provides an approximate visual representation of the workflow.

Please note that we did use a python loop to iterate over boxes. However, it will not have a profound effect because there are only five boxes and the code executed at each iteration is rather fast.

Similar to Method II of Problem #1, we can expect this method to be much faster than using Python loops to traverse every pixel.

### Method III: NumPy Image Processing Functions (Type 2)

Let’s look at one more approach that can be used to solve this problem. It’s a little more complex than the previous method and also a little less efficient, but I want to include this method to demonstrate a few tricks that could come handy for other problems, which I’ll give examples of further down.

The following code-block demonstrates the method. I’ll explain it in greater detail below.

`output = img.copy()`

**## Getting valid coordinates of color [255,0,0]**

`valid = np.all(img == [255, 0, 0], axis = -1)`

`rs, cs = valid.nonzero()`

**## Initializing rcs mask.**

`all_valid_rcs = np.full(rs.shape, False)`

**## Looping over all boxes**

`for box in boxes:`

` `

**## Getting the box coordinates**

` top_h, top_w = box[0]`

` bot_h, bot_w = box[1]`

` `

**## Finding valid coordinates that are also**

` ## inside the current box.`

` cur_valid_rs = ((rs >= top_h) & (rs <= bot_h))`

` cur_valid_cs = ((cs >= top_w) & (cs <= bot_w))`

` cur_valid_rcs = cur_valid_rs & cur_valid_cs `

` `

**## Updating the rcs mask with valid coordinates**

` ## that are inside the current box.`

` all_valid_rcs |= cur_valid_rcs`

**## Selecting indices that belong to any box**

`rs = rs[all_valid_rcs]`

`cs = cs[all_valid_rcs]`

`output[rs, cs, :] = [255, 255, 255]`

Here’s an explanation as to how it works:

- Similar to Method II of Problem #1, we create the mask array
`valid`

and the indices`rs`

and`cs`

for the**whole**image. - Now, we want to
**select**those indices from`rs`

and`cs`

that lie within**any**of the**five**boxes. - To do this, we create an array named
`all_valid_rcs`

which has the same shape of`rs`

(Note:`rs`

and`cs`

have the same shape) but is filled with the value`False`

. - Now, if any coordinate
`[r, c]`

lies within a box, then its corresponding index in`all_valid_rcs`

is set to`True`

. - For instance, if
`rs[i]`

and`cs[i]`

are**both**inside a box, then`all_valid_rcs[i]`

will be set to`True`

. The code within the for-loop of the above code-block performs the vectorized version of this logic. - Once this process has been completed for all elements in
`rs`

and`cs`

(and for all the boxes), we can use`all_valid_rcs`

to select those indices from`rs`

and`cs`

which lie inside**any**box. The below diagram gives an approximate representation of this process.

- Once we are done selecting the
`rs`

and`cs`

values that lie within any box, we can use them to change the color to`[255, 255, 255]`

. The below diagram gives an approximate representation of the full method.

Through this method you can see that with the use of a few NumPy image processing tricks we can select the correct indices and perform replacement just once rather than performing replacement for each box separately (i.e. Method II).

While for our particular problem this workflow may not offer any benefits, this may be useful for other practical problems. For instance, instead of “replacement” you might deal with some costly operation which takes a lot of time to execute. In such a case this method might be more useful.

In the case of overlapping boxes, techniques from this method can also be used to ensure that coordinates belonging to multiple boxes are not operated on multiple times. This may not be a concern for all problems, but it is good to know that you can handle such a problem if it arises.

### Performance Comparison

As before, let’s apply Methods I, II and III to the image with width 2000px (and height 1000px) to solve Problem #2 and measure the time.

From the image above we see that Method I is not as efficient as Methods II and III. We can also see that Method III is slightly slower than Method II.

The time taken to solve the problem using all three methods in **all** the five scales is presented in the bar plots below. The plot on the right is once again a zoomed-in version of the plot on the left. The zoomed-in version is there to highlight the performance difference.

The zoomed-in version also includes the speedup obtained by using Method II and Method III. For Method II, the speedup is the time taken by Method I divided by time taken by Method II. Similarly, for Method III, the speedup is the time taken by Method I divided by time taken by Method III.

## Conclusion

Through the help of a few toy problems, we explored some NumPy image processing tricks and tips which can greatly improve the efficiency of your code. By comparing solutions by their time to completion, we could also verify that efficiency.

As mentioned earlier, though we only focused only on toy problems, the tricks demonstrated here can be easily used in practical applications as well. For instance, dealing with bounding boxes and pixels of a certain color are scenarios you typically face in object detection and semantic segmentation problems.

The solutions presented here were written so the code would be relatively easy to read and follow. However, there may be more elegant or optimal solutions outside of those presented above; if you’re interested in trying out your own solution to these problems, please feel free to check out my GitHub repository and colab notebook.

Be sure to check the related resources below for more of Bharath’s technical articles, and sign up to the Lionbridge AI newsletter for interviews and articles delivered directly to your inbox.