A Crash Course on NumPy for Images
Welcome to the magical world of NumPy arrays in scikit-image! Here, every pixel is a number, and every image is a multidimensional treasure trove. Let's dive into the basics while exploring some fascinating engineering contexts!
NumPy is the Backbone of scikit-image
In scikit-image, all images are represented as NumPy ndarray objects. This means you can leverage all the awesome features of NumPy to manipulate your images like a pro. Let's see an example:
import skimage as ski
import matplotlib.pyplot as plt
nala = ski.io.imread("./assets/figures/nala.jpg")
print(type(nala))
plt.imshow(nala)
plt.axis("off")
<class 'numpy.ndarray'>
(np.float64(-0.5), np.float64(1593.5), np.float64(2047.5), np.float64(-0.5))

A painting of my dog Nala, in the likeness of the Barack Obama Presidential Portrait.
Fun Engineering Fact: Engineers use this structure to analyze X-ray images of airplane wings for micro-cracks!
Image Geometry and Intensity
Let's uncover some basic stats about the image:
nala.shape # Dimensions: rows x columns x channels
(2048, 1594, 3)
nala.size # Total number of array values (rows x columns x channels), not pixels
9793536
nala.min(), nala.max() # Intensity range
(np.uint8(0), np.uint8(255))
Note
Color images are generally represented in the RGB color space, where each pixel has three values: red, green, and blue. This can be efficiently represented by an 8-bit unsigned integer (uint8) array with shape (height, width, 3). This allows for 256 levels of intensity for each color channel.
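To make the note concrete, here is a minimal sketch that uses a small synthetic array in place of a real photo. The name `tiny` and its 4 x 6 size are illustrative assumptions, not part of the tutorial's data. It shows the (height, width, 3) layout, the 256 available levels, and the fact that uint8 arithmetic wraps around instead of saturating.

```python
import numpy as np

# Hypothetical 4 x 6 RGB image filled with a mid-gray value
tiny = np.full((4, 6, 3), 128, dtype=np.uint8)

n_pixels = tiny.shape[0] * tiny.shape[1]   # height * width
n_values = tiny.size                       # height * width * channels

# uint8 spans 0..255, i.e. 256 levels per channel
info = np.iinfo(np.uint8)
levels = info.max - info.min + 1

# uint8 arithmetic wraps around: 128 + 200 = 328, stored as 328 - 256 = 72
wrapped = tiny + np.full_like(tiny, 200)

# Widening the dtype first, then clipping, saturates at 255 instead
safe = np.clip(tiny.astype(np.int16) + 200, 0, 255).astype(np.uint8)

print(n_pixels, n_values, levels)
print(wrapped[0, 0], safe[0, 0])
```

Widening to int16 before the addition is only one way to avoid the wraparound; converting to floating point is another common choice.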
nala.mean() # Average brightness
np.float64(141.98962999676522)
Use Case: Ever wonder how self-driving cars see the road? They use such stats to differentiate between the road, pedestrians, and traffic signs!
NumPy Indexing: Your Magic Wand
Want to access and modify individual pixels or regions? NumPy indexing is your best friend! Let's see some examples:
Individual Pixel Access:
# Get pixel value at row 10, column 20
nala[10, 20]
array([237, 227, 217], dtype=uint8)
# Set pixel at row 3, column 10 to black
nala[3, 10] = 0
plt.imshow(nala)
<matplotlib.image.AxesImage at 0x7f6bed31ee10>

You cannot really see that pixel, but it's there!
# Set pixels in a region to black
nala[3:103, 10:110] = 0
plt.imshow(nala)
<matplotlib.image.AxesImage at 0x7f6bed308c90>

Heads-Up: Remember, in NumPy, the first dimension is rows and the second is columns. The origin (0, 0) is at the top-left corner, not the bottom-left as in Cartesian coordinates.
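The row-first convention is easy to check on a tiny array. The sketch below uses a hypothetical 3 x 5 grayscale image (the name `img` and the pixel values are made up for illustration) and marks a few corners.

```python
import numpy as np

# Hypothetical 3-row x 5-column grayscale image, all black
img = np.zeros((3, 5), dtype=np.uint8)

# Index order is (row, col): row 0 is the top row, col 0 the left column
img[0, 0] = 255      # top-left corner
img[2, 4] = 128      # bottom-right corner: last row, last column

# An (x, y) position, with x counted to the right and y counted downward,
# maps to img[y, x]
x, y = 4, 0          # rightmost column, top row
img[y, x] = 64

print(img)
```

Printing the array shows the same layout as the displayed image: the first printed line is the top row.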
Masking: Select Pixels Like a Pro
Masks are boolean arrays that let you pick pixels based on conditions:
mask = nala < 87
# Set every value where mask is True to 255 (the comparison is made per channel value)
nala[mask] = 255
plt.imshow(nala)
plt.axis("off")
(np.float64(-0.5), np.float64(1593.5), np.float64(2047.5), np.float64(-0.5))

Use Case: Imagine you're a botanist using satellite images to track plant health. A mask can isolate unhealthy vegetation by analyzing infrared intensity!
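Note that a comparison such as `nala < 87` on an RGB array is evaluated for every channel value separately. If the goal is to select whole pixels instead, one option is to reduce the mask over the channel axis. The sketch below uses a made-up 2 x 2 image; the "all channels below the threshold" rule is an assumption chosen for illustration.

```python
import numpy as np

# Hypothetical 2 x 2 RGB image: dark, bright, reddish, and gray pixels
img = np.array(
    [[[10, 20, 30], [200, 210, 220]],
     [[250, 40, 40], [90, 90, 90]]],
    dtype=np.uint8,
)

# Element-wise mask: same shape as the image, one boolean per channel value
elem_mask = img < 87

# Pixel-wise mask: True only where all three channels are below the threshold
pixel_mask = (img < 87).all(axis=-1)

out = img.copy()          # keep the original untouched
out[pixel_mask] = 255     # selected pixels turn white in every channel

print(elem_mask.shape, pixel_mask.shape)
print(out[0, 0], out[1, 0])
```

Only the dark pixel is selected by the pixel-wise mask; the reddish pixel keeps its original values because its red channel is above the threshold.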
Fancy Indexing: Advanced Tricks
Let's create a funky pattern by modifying pixels using fancy indexing:
import numpy as np
nala = ski.io.imread("./assets/figures/nala.jpg")
n_rows, n_cols = nala.shape[:2]
inds_r = np.arange(n_rows)
inds_c = 2 * inds_r % n_rows
# Keep only the (row, col) pairs whose column index lies inside the image
keep = inds_c < n_cols
inds_r, inds_c = inds_r[keep], inds_c[keep]
nala[inds_r, inds_c] = 0
plt.imshow(nala)
plt.axis("off")
(np.float64(-0.5), np.float64(1593.5), np.float64(2047.5), np.float64(-0.5))

Do you see it? There is a thin, dotted black line running diagonally across the image (it wraps around and starts again halfway down).
Fun Fact: This technique could be used to simulate "scratches" on materials during durability testing in the lab.
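Fancy indexing is easier to follow on an array small enough to print. This is a minimal sketch on a hypothetical 4 x 6 grayscale image, using the same "column = 2 x row" idea as the example above: two integer arrays of equal length are paired up element by element to address individual pixels.

```python
import numpy as np

# Hypothetical 4 x 6 grayscale image filled with ones
img = np.ones((4, 6), dtype=np.uint8)

rows = np.arange(img.shape[0])   # [0, 1, 2, 3]
cols = 2 * rows                  # [0, 2, 4, 6]

# Drop pairs whose column falls outside the image (column 6 here)
keep = cols < img.shape[1]
rows, cols = rows[keep], cols[keep]

# Paired index arrays address (0, 0), (1, 2), (2, 4) in a single assignment
img[rows, cols] = 0

print(img)
```

Filtering both index arrays with one shared boolean mask keeps the row and column entries aligned, which matters because the pairing is positional.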
Beyond Grayscale: Multichannel (Color) Images
Color images in scikit-image are simply NumPy arrays with one extra dimension for color channels (R, G, B). Here's an example:
nala = ski.io.imread("./assets/figures/nala.jpg")
nala.shape # Height x Width x Channels
# Access RGB values of a pixel
nala[10, 20]
array([237, 227, 217], dtype=uint8)
# Turn a pixel green
nala[50:80, 61:80] = [0, 255, 0] # [Red, Green, Blue]
# We did a bit more than a pixel, but you get the idea
plt.imshow(nala)
<matplotlib.image.AxesImage at 0x7f6bec2203d0>

Creative Twist: Use such pixel manipulations to generate psychedelic art from your cat's photo!
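The channel axis can be indexed like any other. Here is a small sketch on a hypothetical 2 x 3 RGB image (all names and values are illustrative) showing how a three-element color is broadcast over a region and how single channels can be pulled out as 2D arrays.

```python
import numpy as np

# Hypothetical 2 x 3 RGB image, initially black
img = np.zeros((2, 3, 3), dtype=np.uint8)

# A length-3 sequence is broadcast to every pixel in the selected region
img[:, 1:] = [0, 255, 0]        # columns 1 and 2 become green

# Index the last axis to get each channel as a 2D array
red, green, blue = img[..., 0], img[..., 1], img[..., 2]

# Channel slices are views: writing to one modifies the image itself
red[0, 0] = 255                 # top-left pixel becomes pure red

print(green)
print(img[0, 0])
```

Because the slices are views rather than copies, call `.copy()` on a channel if you want to edit it without touching the original image.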
Coordinate Conventions
Coordinate conventions in scikit-image match NumPy's matrix-style indexing, not Cartesian coordinates. Here's a quick guide:
Image Type | Coordinates |
---|---|
2D grayscale | (row, col) |
2D multichannel (RGB) | (row, col, ch) |
3D grayscale | (plane, row, col) |
3D multichannel | (plane, row, col, ch) |
2D color video | (t, row, col, ch) |
3D color video | (t, plane, row, col, ch) |
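As a quick check of the table, the sketch below allocates an empty array for each image type and prints its number of dimensions. The sizes are arbitrary values picked for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
t, pln, row, col, ch = 5, 4, 32, 48, 3

shapes = {
    "2D grayscale": (row, col),
    "2D multichannel (RGB)": (row, col, ch),
    "3D grayscale": (pln, row, col),
    "3D multichannel": (pln, row, col, ch),
    "2D color video": (t, row, col, ch),
    "3D color video": (t, pln, row, col, ch),
}

for name, shape in shapes.items():
    arr = np.zeros(shape, dtype=np.uint8)
    print(f"{name:22s} ndim={arr.ndim} shape={arr.shape}")

# Channels come last, so one pixel of an RGB image is a length-3 vector
rgb = np.zeros(shapes["2D multichannel (RGB)"], dtype=np.uint8)
pixel = rgb[10, 20]
print(pixel.shape)
```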
Speed Matters
Efficient computation is crucial in image processing. For example:
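def in_order_multiply(arr, scalar):
    # Loop over the first axis: each slice is one contiguous block of memory
    for plane in range(arr.shape[0]):
        arr[plane, :, :] *= scalar
def out_of_order_multiply(arr, scalar):
    # Loop over the last axis: each slice is scattered (strided) in memory
    for plane in range(arr.shape[2]):
        arr[:, :, plane] *= scalar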
Time it:
import time
rng = np.random.default_rng()
im3d = rng.random((100, 1024, 1024))
start_time = time.time()
in_order_multiply(im3d, 5) # Faster
print(f"in_order_multiply took {time.time() - start_time:.4f} seconds")
start_time = time.time()
out_of_order_multiply(im3d, 5) # Slower
print(f"out_of_order_multiply took {time.time() - start_time:.4f} seconds")
in_order_multiply took 0.0546 seconds
out_of_order_multiply took 1.2885 seconds
Tip
Computers today are very fast, so if you are doing simple operations on small amounts of data you might not notice the difference. But when you are working with large images or at scale, the difference can be significant. Improving efficiency can be the difference between a computation being economically viable or not. Similarly, imagine if your Google search took 2 seconds instead of 0.2 seconds: you would probably use it less.
Engineering Insight: This concept of memory locality is vital for medical imaging. Faster algorithms mean quicker diagnoses for doctors.
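The timing difference comes from how the array is laid out in memory. As a rough sketch, assuming NumPy's default C (row-major) order, the strides of a small hypothetical volume show why slicing along the first axis touches contiguous memory while slicing along the last axis does not.

```python
import numpy as np

# Hypothetical small volume with axes (plane, row, col)
vol = np.zeros((4, 8, 16), dtype=np.float64)

# NumPy arrays are C-ordered by default: the last axis is contiguous
print(vol.flags["C_CONTIGUOUS"])
print(vol.strides)                 # bytes to step along each axis

# A slice along the first axis is one contiguous block of memory...
first_axis_slice = vol[0, :, :]
# ...while a slice along the last axis picks one value every 16 elements
last_axis_slice = vol[:, :, 0]

print(first_axis_slice.flags["C_CONTIGUOUS"])
print(last_axis_slice.flags["C_CONTIGUOUS"])

# For a plain scalar multiply, no Python loop is needed at all
vol += 1.0
vol *= 5
```

For an operation this simple, the vectorized `vol *= 5` is usually the better choice; explicit loops over the leading axis are mainly useful when each slice needs separate processing.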
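Time and Space
Processing videos? Add a leading time axis: a 2D color video is a 4D array (t, row, col, ch), and a 3D color video is a 5D array (t, pln, row, col, ch). For example, this loads a 2D color video into a 4D array: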
import warnings
warnings.filterwarnings("ignore")
from moviepy import VideoFileClip
clip = VideoFileClip("./assets/figures/dragon-typing.mp4")
# clip.size is (width, height); the array is laid out as (t, row, col, ch)
frames = np.ones((clip.n_frames, clip.size[1], clip.size[0], 3), dtype=np.uint8)
for i in range(clip.n_frames):
    # get_frame expects a time in seconds, so convert the frame index
    frames[i] = clip.get_frame(i / clip.fps)
print("Shape of MoviePy frames:", frames.shape)
# This is one of the many ways to make subplots
f, axes = plt.subplots(nrows=3, ncols=frames.shape[0] // 3 + 1, figsize=(20, 5))
# subplots returns the figure and an array of axes
# we use `axes.ravel()` to flatten these into a 1D array
axes = axes.ravel()
# turns all of the axes off
for ax in axes:
    ax.axis("off")
# plots all of the images in the collection
for i in range(frames.shape[0]):
    axes[i].imshow(frames[i], cmap="gray")
    axes[i].set_title(f"Frame {i}")
# This cleans the layout of the image
plt.tight_layout()
Note
Running this cell requires the moviepy package. In an environment without it, the import fails with ModuleNotFoundError: No module named 'moviepy'.
Example: Analyzing live video feeds for robotic surgery or wildfire detection.
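If no video file or video library is at hand, the (t, row, col, ch) layout can still be explored with synthetic data. The following sketch builds a hypothetical six-frame clip from plain NumPy arrays (frame size and gray levels are made up) and computes two simple per-frame statistics.

```python
import numpy as np

# Hypothetical clip: 6 frames of 4 x 5 RGB, each frame a uniform gray level
n_frames, height, width = 6, 4, 5
frame_list = [
    np.full((height, width, 3), 40 * i, dtype=np.uint8)
    for i in range(n_frames)
]

# Stack along a new leading axis to get (t, row, col, ch)
video = np.stack(frame_list, axis=0)

# Mean brightness per frame: reduce over every axis except time
per_frame_mean = video.mean(axis=(1, 2, 3))

# Frame-to-frame change, computed in a wider dtype to avoid uint8 wraparound
diff = np.abs(np.diff(video.astype(np.int16), axis=0))

print(video.shape)
print(per_frame_mean)
print(diff.shape)
```

Keeping time as the leading axis means `video[i]` is a complete frame that can be passed straight to `plt.imshow`.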
With these tips and tricks, you're now equipped to conquer image processing like a pro! What project will you tackle next?