๐Ÿš€ A Crash Course on NumPy for Images#

Welcome to the magical world of NumPy arrays in scikit-image! Here, every pixel ๐Ÿ–Œ๏ธ is a number, and every image ๐Ÿ–ผ๏ธ is a multidimensional treasure trove. Letโ€™s dive into the basics while exploring some fascinating engineering contexts! ๐Ÿ”งโœจ

NumPy is the Backbone ๐Ÿฆด of scikit-image#

In scikit-image, all images are represented as NumPy ndarray objects. This means you can leverage all the awesome features of NumPy to manipulate your images like a pro. Letโ€™s see an example:

import skimage as ski
import matplotlib.pyplot as plt

nala = ski.io.imread("./assets/figures/nala.jpg")
print(type(nala))

plt.imshow(nala)
plt.axis("off")
<class 'numpy.ndarray'>
(np.float64(-0.5), np.float64(1593.5), np.float64(2047.5), np.float64(-0.5))
../../_images/1ef3698c7e6350bd516b6bbe4289065962f89ed8e5577fdf430aaeb4533439f0.png

A painting of my Dog ๐Ÿถ Nala โ€“ in the likeness of the Barack Obama Presidential Portrait.

๐Ÿ’ก Fun Engineering Fact: Engineers use this structure to analyze X-ray images of airplane wings โœˆ๏ธ for micro-cracks!

Image Geometry and Intensity ๐Ÿ“๐ŸŽจ#

Letโ€™s uncover some basic stats about the image:

nala.shape  # Dimensions: rows x columns
(2048, 1594, 3)
nala.size  # Total pixels
9793536
nala.min(), nala.max()  # Intensity range
(np.uint8(0), np.uint8(255))

Note

color images are generally represented in the RGB color space, where each pixel has three values: red, green, and blue. This can be efficiently represented by an 8-bit unsigned integer (uint8) array with shape (height, width, 3). This allows for 256 levels of intensity for each color channel.

nala.mean()  # Average brightness
np.float64(141.98962999676522)

๐ŸŽฏ Use Case: Ever wonder how self-driving cars ๐Ÿš— see the road? They use such stats to differentiate between the road, pedestrians, and traffic signs!

NumPy Indexing: Your Magic Wand ๐Ÿช„#

Want to access and modify individual pixels or regions? NumPy indexing is your best friend! Letโ€™s see some examples:

Individual Pixel Access:#

# Get pixel value at row 10, column 20
nala[10, 20]
array([237, 227, 217], dtype=uint8)
# Set pixel at row 3, column 10 to black
nala[3, 10] = 0
plt.imshow(nala)
<matplotlib.image.AxesImage at 0x7f6bed31ee10>
../../_images/ebaf8635fa807436123bbd92ebd85442a88472607e1f45d045fec0d33cb6c01b.png

you cannot really see that pixel, but itโ€™s there! ๐Ÿง

# Set pixels in a region to black
nala[3:103, 10:110] = 0
plt.imshow(nala)
<matplotlib.image.AxesImage at 0x7f6bed308c90>
../../_images/25aba862956f772c36bc6016b8addca744099ca90056cc1e5f23b6253ba0aa4a.png

๐Ÿ’ก Heads-Up: Remember, in NumPy, the first dimension is rows and the second is columns. The origin (0, 0) is at the top-left corner ๐Ÿ“โ€”not bottom-left, as in Cartesian coordinates.

Masking: Select Pixels Like a Pro ๐ŸŽญ#

Masks are boolean arrays that let you pick pixels based on conditions:

mask = nala < 87
# Set all pixels where mask is True to white
nala[mask] = 255

plt.imshow(nala)
plt.axis("off")
(np.float64(-0.5), np.float64(1593.5), np.float64(2047.5), np.float64(-0.5))
../../_images/37ba5344779581a53a2345f4b457efd0e07dc32d74dd3204372e7ff54009e4a8.png

Use Case: Imagine youโ€™re a botanist ๐ŸŒฑ using satellite images to track plant health. A mask can isolate unhealthy vegetation by analyzing infrared intensity!

Fancy Indexing: Advanced Tricks ๐ŸŽฉ#

Letโ€™s create a funky pattern by modifying pixels using fancy indexing:

import numpy as np

nala = ski.io.imread("./assets/figures/nala.jpg")


inds_r = np.arange(len(nala))
inds_c = 2 * inds_r % len(nala)

inds_r = inds_r[inds_r < nala.shape[0]]
inds_c = inds_c[inds_r < nala.shape[0]]

inds_r = inds_r[inds_c < nala.shape[1]]
inds_c = inds_c[inds_c < nala.shape[1]]

nala[inds_r, inds_c] = 0

plt.imshow(nala)
plt.axis("off")
(np.float64(-0.5), np.float64(1593.5), np.float64(2047.5), np.float64(-0.5))
../../_images/0b5a1f4974691361cce446a12328f7b0d0322a8a49a219ad2c872c2254f0981c.png

Do you see it? ๐Ÿง There is a black line on the diagonal

๐ŸŽจ Fun Fact: This technique could be used to simulate โ€œscratchesโ€ on materials during durability testing in the lab. ๐Ÿ› ๏ธ

Beyond Grayscale: Multichannel (Color) Images ๐ŸŒˆ#

Color images in scikit-image are simply NumPy arrays with one extra dimension for color channels (R, G, B). Hereโ€™s an example:

nala = ski.io.imread("./assets/figures/nala.jpg")

nala.shape  # Height x Width x Channels

# Access RGB values of a pixel
nala[10, 20]
array([237, 227, 217], dtype=uint8)
# Turn a pixel green
nala[50:80, 61:80] = [0, 255, 0]  # [Red, Green, Blue]
# We did a bit more than a pixel, but you get the idea

plt.imshow(nala)
<matplotlib.image.AxesImage at 0x7f6bec2203d0>
../../_images/091bd648ddf4c2f5f09fd5b489a3935b154cbd6a4fcfb3ea332ff80d2b2b21d0.png

๐ŸŒˆ Creative Twist: Use such pixel manipulations to generate psychedelic art from your catโ€™s photo! ๐Ÿฑ๐ŸŽจ

Coordinate Conventions ๐Ÿ“#

Coordinate conventions in scikit-image match NumPyโ€™s matrix-style indexing, not Cartesian coordinates. Hereโ€™s a quick guide:

Image Type

Coordinates

2D grayscale

(row, col)

2D multichannel (RGB)

(row, col, ch)

3D grayscale

(plane, row, col)

3D multichannel

(plane, row, col, ch)

2D color video

(t, row, col, ch)

3D color video

(t, plane, row, col, ch)

Speed Matters ๐ŸŽ๏ธ#

Efficient computation is crucial in image processing. For example:

def in_order_multiply(arr, scalar):
    for plane in range(arr.shape[0]):
        arr[plane, :, :] *= scalar


def out_of_order_multiply(arr, scalar):
    for plane in range(arr.shape[2]):
        arr[:, :, plane] *= scalar

Time it:

import time

rng = np.random.default_rng()
im3d = rng.random((100, 1024, 1024))

start_time = time.time()
in_order_multiply(im3d, 5)  # Faster
print(f"in_order_multiply took {time.time() - start_time:.4f} seconds")

start_time = time.time()
out_of_order_multiply(im3d, 5)  # Slower
print(f"out_of_order_multiply took {time.time() - start_time:.4f} seconds")
in_order_multiply took 0.0546 seconds
out_of_order_multiply took 1.2885 seconds

Tip

Computers today are very fast, if you are doing simple operations on small amounts of data you might not notice the difference. But when you are working with large images or at scale the difference can be significant. Improving efficiency can be the difference between a computation being economically viable or not. Similarly, imagine if your google search took 2 seconds instead of 0.2 seconds, you would probably use it less.

๐Ÿ’ก Engineering Insight: This concept of memory locality is vital for medical imaging. Faster algorithms mean quicker diagnoses for doctors. ๐Ÿฉบ๐Ÿ’ป

Time and Space ๐ŸŒŒ#

Processing videos ๐Ÿ“น? You can represent time-series data as 5D arrays (t, pln, row, col, ch). For example:

import warnings

warnings.filterwarnings("ignore")

from moviepy import VideoFileClip

clip = VideoFileClip("./assets/figures/dragon-typing.mp4")

frames = np.ones((clip.n_frames, clip.size[1], clip.size[0], 3), dtype=np.uint8)
for i in range(clip.n_frames):
    frames[i] = clip.get_frame(i)

print("Shape of MoviePy frames:", frames.shape)

# This is one of the many ways to make subplots
f, axes = plt.subplots(nrows=3, ncols=frames.shape[0] // 3 + 1, figsize=(20, 5))

# subplots returns the figure and an array of axes
# we use `axes.ravel()` to turn these into a list
axes = axes.ravel()

# turns all of the axis off
for ax in axes:
    ax.axis("off")

# plots all of the images in the collection
for i in range(frames.shape[0]):
    axes[i].imshow(frames[i], cmap="gray")
    axes[i].set_title(f"Frame {i}")

# This cleans the layout of the image
plt.tight_layout()
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[17], line 5
      1 import warnings
      3 warnings.filterwarnings("ignore")
----> 5 from moviepy import VideoFileClip
      7 clip = VideoFileClip("./assets/figures/dragon-typing.mp4")
      9 frames = np.ones((clip.n_frames, clip.size[1], clip.size[0], 3), dtype=np.uint8)

ModuleNotFoundError: No module named 'moviepy'

๐ŸŽฏ Example: Analyzing live video feeds for robotic surgery ๐Ÿค–๐Ÿ”ช or wildfire detection ๐Ÿ”ฅ๐ŸŒฒ.

With these tips and tricks, youโ€™re now equipped to conquer image processing like a pro! ๐Ÿฆธโ€โ™€๏ธโœจ What project will you tackle next? ๐Ÿš€