# 📖 🌟 NumPy Guide 🌟

![](./assets/figures/numpy-box.webp)

NumPy is a Python library that provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays. It is the fundamental package for scientific computing with Python. Even if you don't explicitly use NumPy, it is likely that you are using it indirectly through other libraries like Pandas, Matplotlib, and SciPy. Nearly everything in the Python data science ecosystem relies on NumPy.



## 🛠️ How to Import NumPy

To start using NumPy, first, install it by following [these instructions](https://numpy.org/install/). After installation, import it into your Python script:


In [None]:
import numpy as np


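Why `np`? This is the widely accepted alias for NumPy. It keeps calls short (`np.array` rather than `numpy.array`), matches the convention used in the official documentation, and keeps NumPy's functions in their own namespace so they do not shadow Python built-ins such as `sum` or `max`.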



## 🤔 Why Use NumPy?


### Why not just use Python lists?
While Python lists are versatile and great for general-purpose programming, NumPy arrays offer significant performance benefits for numerical computations:

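- Memory Efficiency: NumPy arrays store values of a single type in one contiguous block, so they typically use less memory than a list of separate Python objects.
- Speed: Operations on NumPy arrays are usually much faster than the equivalent Python loop over a list because the element-by-element work runs in compiled C code.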
- Functionality: Provides a wide range of mathematical operations, such as linear algebra, Fourier transforms, and random number generation.
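
The memory and speed points above can be sketched with a small comparison. This is an illustration only: the size `n` and the variable names are assumptions, and the exact byte count for the list depends on the Python version and platform.

```python
import sys
import numpy as np

n = 1_000  # illustrative size

py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# Element-wise arithmetic: the list needs an explicit loop or comprehension,
# while the array expression runs in compiled code.
doubled_list = [2 * x for x in py_list]
doubled_array = 2 * np_array

# A list holds references to separate Python int objects; the array holds
# the raw 8-byte integers in a single buffer.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
array_bytes = np_array.nbytes  # 1000 elements * 8 bytes each

print(doubled_array[:5])
print(list_bytes, array_bytes)
```

Both approaches produce the same numbers; the array version avoids the Python-level loop and the per-element object overhead.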



## 🔢 What is an "Array"?

An array is a grid-like structure used to store data. It can have one or more dimensions:

- 1D Array (Vector):
$$
\begin{array}{|c||c|c|c|}
    \hline
    1 & 5 & 2 & 0 \\
    \hline
\end{array}
$$

- 2D Array (Matrix):
$$
\begin{array}{|c||c|c|c|}
    \hline
    1 & 5 & 2 & 0 \\
    \hline
    8 & 3 & 6 & 1 \\
    \hline
    1 & 7 & 2 & 9 \\
    \hline
    \end{array}
$$

- 3D Array (Tensor):
Think of this as a stack of 2D arrays.
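
A minimal sketch of the "stack of 2D arrays" idea, using `np.stack`. The values in `layer1` are made up for illustration.

```python
import numpy as np

layer0 = np.array([[1, 5, 2, 0],
                   [8, 3, 6, 1]])
layer1 = np.array([[1, 7, 2, 9],
                   [4, 4, 0, 2]])  # illustrative values

# Stack two 2x4 matrices along a new leading axis.
tensor = np.stack([layer0, layer1])

print(tensor.ndim)       # 3
print(tensor.shape)      # (2, 2, 4)
print(tensor[1, 0, 3])   # layer 1, row 0, column 3: 9
```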



### 🧩 Characteristics of NumPy Arrays:

A NumPy array is a custom data structure. It is similar to a Python list, but it has some restrictions that make it much more efficient for numerical computations, and it comes with several built-in methods that make it easier to work with.


1. Homogeneous Data: All elements must have the same data type.
2. Fixed Size: Once created, the array size cannot change.
3. Rectangular Shape: All rows must have the same number of columns in 2D arrays.

These restrictions make arrays more memory-efficient and faster for mathematical operations.
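
A short sketch of how the first two restrictions show up in practice. The variable names are illustrative, and the default integer dtype can differ by platform.

```python
import numpy as np

# Homogeneous data: mixing ints with one float upcasts every element.
mixed = np.array([1, 2, 3.0])
print(mixed.dtype)  # float64

# Assigning a float into an integer array truncates it to fit the dtype.
ints = np.array([1, 2, 3])
ints[0] = 2.9
print(ints[0])  # 2

# Fixed size: np.append does not grow the array in place; it builds a new one.
grown = np.append(ints, 4)
print(ints.size, grown.size)  # 3 4
```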

## 🏗️ Array Fundamentals


### 🚀 Creating an Array

You can create a NumPy array using Python lists:


In [None]:
a = np.array([1, 2, 3, 4, 5, 6])
a


- Accessing elements:

Elements in an array can be accessed using their index (starting from 0):


In [None]:
a[0]

```{note}
You can see that NumPy inferred the data type of the array from the values you passed in. Here every value is an integer, so the array gets an integer dtype (`int64` on most platforms). What do you think would happen if we made one of the values `1.0`? Try it out!
```


Here are some of the most common ways to index NumPy arrays:

![](./assets/figures/np_indexing.png)
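
The same kinds of indexing can be tried directly in code. This is a sketch with made-up arrays `a` and `m`, not a reproduction of the figure.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50, 60])

print(a[-1])      # last element: 60
print(a[1:4])     # slice, stop index excluded: [20 30 40]
print(a[::2])     # every second element: [10 30 50]
print(a[a > 30])  # boolean mask: [40 50 60]

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m[1, 2])    # row 1, column 2: 6
print(m[:, 0])    # first column: [1 4]
print(m[0])       # first row: [1 2 3]
```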

- Mutability: NumPy arrays are mutable, meaning you can modify their elements:

In [None]:
a[0] = 10
a

Here we change the first element (index 0) of the array to 10.

## 🔄 Reshaping Arrays

Arrays can be reshaped without changing their data:


In [None]:
a = np.arange(6)
b = a.reshape(3, 2)
b


![](./assets/figures/np_reshape.png)

```{note}
The total number of elements must remain constant during reshaping. For example, a 2x3 array has 6 elements, so it can be reshaped into a 3x2 array or a 6x1 array, but not a 2x2 array.
```
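
A brief sketch of the element-count rule, plus two related behaviors worth knowing: `-1` lets NumPy infer one dimension, and for a contiguous array like this one `reshape` returns a view rather than a copy.

```python
import numpy as np

a = np.arange(6)

# -1 asks NumPy to infer that dimension from the total size (6 / 2 = 3).
b = a.reshape(2, -1)
print(b.shape)  # (2, 3)

# A target shape with a different element count is rejected.
try:
    a.reshape(2, 2)
except ValueError as err:
    print("reshape failed:", err)

# Here b is a view of a, so writing through b changes a as well.
b[0, 0] = 99
print(a[0])  # 99
```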



### 🔍 Array Attributes

- Number of Dimensions: `.ndim`
- Shape: `.shape`
- Total Elements: `.size`
- Data Type: `.dtype`

Example:


In [None]:
a = np.array([[1, 2, 3], [4, 5, 6]])
a.ndim  # Number of dimensions

In [None]:
a.shape  # Shape of the array

In [None]:
a.size  # Total number of elements

In [None]:
a.dtype  # Data type of elements

## 🧮 Mathematical Operations

NumPy allows you to perform operations like addition, subtraction, and multiplication on arrays directly:


In [None]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

# Element-wise addition
a + b

In [None]:
# Element-wise multiplication
a * b

In [None]:
# Sum of all elements
a.sum()

## 📊 Broadcasting

Broadcasting allows you to perform operations between arrays of different shapes:


In [None]:
a = np.array([1, 2, 3])
a + 5  # Add 5 to every element


When you apply an operation between an array and a scalar, NumPy uses broadcasting to apply the operation to each element of the array.

![](./assets/figures/np_multiply_broadcasting.png)

```{note}
The shapes of the arrays must be compatible for broadcasting: comparing dimensions from the last one backward, each pair must either be equal or contain a 1.
```
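
A sketch of broadcasting between arrays of different shapes, including one incompatible case. The arrays `matrix`, `row`, and `col` are made up for illustration.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])      # shape (2, 3)
row = np.array([10, 20, 30])        # shape (3,)
col = np.array([[100], [200]])      # shape (2, 1)

# (2, 3) with (3,): the row is stretched across both rows.
plus_row = matrix + row
print(plus_row)   # rows: [11 22 33] and [14 25 36]

# (2, 3) with (2, 1): the column is stretched across all three columns.
plus_col = matrix + col
print(plus_col)   # rows: [101 102 103] and [204 205 206]

# (2, 3) with (2,): the trailing dimensions 3 and 2 differ and neither is 1.
try:
    matrix + np.array([1, 2])
except ValueError as err:
    print("cannot broadcast:", err)
```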

## 🏆 Finding Maximum and Minimum Values
NumPy provides efficient built-in methods for finding the maximum and minimum values in an array.

- `np.max()`: Returns the maximum value of an array or along a specific axis.
- `np.min()`: Returns the minimum value of an array or along a specific axis.
- `np.argmax()` and `np.argmin()`: Return the indices of the maximum and minimum values.


In [None]:
data = np.array([[3, 7, 1], [4, 5, 9]])

In [None]:
# Find global max and min
np.max(data)

In [None]:
np.min(data)

In [None]:
# Find max and min along rows (axis=1)
np.max(data, axis=1)

In [None]:
np.min(data, axis=1)

In [None]:
# Find the indices of max and min
np.argmax(data)  # Index of the max value in the flattened array

In [None]:
np.argmin(data)  # Index of the min value in the flattened array

```{tip}
Combine these methods for advanced analysis. For example, use `np.unravel_index()` with `argmax`/`argmin` to find the row and column of the max/min in a multi-dimensional array.
```
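
A minimal sketch of that combination, reusing the `data` array from the cells above:

```python
import numpy as np

data = np.array([[3, 7, 1], [4, 5, 9]])

flat_max = np.argmax(data)  # 5: position in the flattened array
row, col = np.unravel_index(flat_max, data.shape)
print(row, col)             # 1 2
print(data[row, col])       # 9

row_min, col_min = np.unravel_index(np.argmin(data), data.shape)
print(row_min, col_min)     # 0 2
```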


## 🧰 Preallocation of Memory
Preallocating memory for large arrays is a good practice in performance-critical applications. Instead of growing a result piece by piece, for example by calling `np.append` in a loop (which copies the whole array each time), create the array at its final size up front, either uninitialized or filled with a default value such as zeros or ones.


### Preallocation Methods:
- `np.zeros(shape)`: Creates an array filled with zeros.
- `np.ones(shape)`: Creates an array filled with ones.
- `np.empty(shape)`: Creates an uninitialized array (faster, but contains arbitrary values).
- `np.full(shape, fill_value)`: Creates an array filled with a specific value.


In [None]:
# Preallocate an array of zeros
zeros = np.zeros((3, 3))
zeros

In [None]:
# Preallocate an array of ones
ones = np.ones((2, 4))
ones

In [None]:
# Preallocate an uninitialized array
uninit = np.empty((2, 2))
uninit

In [None]:
# Preallocate an array filled with 42
filled = np.full((3, 3), 42)
filled

```{tip} When to Use Preallocation:
- `np.zeros()` and `np.ones()` are great for initializing arrays for numerical computations.
- `np.empty()` is ideal when you'll overwrite all values in the array soon after creation.
- Preallocation prevents the overhead of dynamically resizing arrays during iterative operations.
```
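
A sketch contrasting the two patterns. The loop computing squares is only a stand-in for whatever per-element work a real program would do.

```python
import numpy as np

n = 5

# Preallocate once, then fill in place.
squares = np.empty(n, dtype=np.int64)
for i in range(n):
    squares[i] = i ** 2   # every slot is overwritten, so np.empty is safe

# Appending instead copies the whole array on every iteration.
appended = np.array([], dtype=np.int64)
for i in range(n):
    appended = np.append(appended, i ** 2)

print(squares)   # [ 0  1  4  9 16]
```

Both loops give the same result; the preallocated version avoids the repeated copies, which matters as `n` grows.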

## 🔢 Sorting Arrays
NumPy makes sorting arrays simple and efficient with the `np.sort()` function and its relatives.


### Key Sorting Functions:
- `np.sort()`: Returns a sorted copy of the array.
- `np.argsort()`: Returns the indices that would sort the array.
- `np.lexsort()`: Sorts based on multiple keys.
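- `np.partition()`: Partially sorts the array: the element at a chosen index `kth` ends up in its sorted position, with smaller elements before it and larger ones after it (in no particular order).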
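
Since `np.partition()` is the least intuitive of these, here is a small sketch of what it guarantees. The choice `k = 2` is arbitrary.

```python
import numpy as np

data = np.array([3, 1, 4, 1, 5, 9])

# Ask for the element that belongs at index 2 in sorted order.
k = 2
part = np.partition(data, k)

print(part[k])            # 3, the third-smallest value
# Everything left of k is <= part[k]; everything right of it is >= part[k].
# The order within each side is not guaranteed.
print(np.sort(part[:k]))  # [1 1]
```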


In [None]:
data = np.array([3, 1, 4, 1, 5, 9])

In [None]:
# Sort the array in ascending order
np.sort(data)

In [None]:
# Get the indices that would sort the array
np.argsort(data)

In [None]:
# Sort each row of a 2D array (default axis=-1, the last axis)
matrix = np.array([[5, 2, 9], [3, 7, 1]])
np.sort(matrix)

In [None]:
# Sort along columns (axis=0)
np.sort(matrix, axis=0)

```{tip} Advanced Sorting:
Use `np.lexsort()` to sort by multiple keys. The last key in the tuple is the primary sort key; earlier keys break ties.
```

In [None]:
names = np.array(["Alice", "Bob", "Charlie"])
scores = np.array([85, 95, 85])

In [None]:
# Sort by scores, then names
idx = np.lexsort((names, scores))
idx

In [None]:
names[idx]
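
Index arrays from `np.argsort()` can reorder related arrays in the same way. A sketch for sorting from highest to lowest score, using the same `names` and `scores`:

```python
import numpy as np

names = np.array(["Alice", "Bob", "Charlie"])
scores = np.array([85, 95, 85])

# Negating the scores sorts from highest to lowest; kind="stable" keeps
# tied entries in their original order.
order = np.argsort(-scores, kind="stable")
print(order)          # [1 0 2]
print(names[order])   # ['Bob' 'Alice' 'Charlie']
print(scores[order])  # [95 85 85]
```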

## 🔗 Finding Unique Elements
NumPy provides `np.unique()` for identifying unique elements in an array. You can also retrieve additional information, such as indices or counts.


### Key Features of `np.unique()`:
- Find unique elements: Returns the sorted unique elements in the array.
- Return indices: With `return_index=True`, get the position of the first occurrence of each unique element in the original array.
- Return counts: With `return_counts=True`, count the occurrences of each unique element.


In [None]:
data = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])

In [None]:
# Find unique elements
np.unique(data)

In [None]:
# Return unique elements with their counts
unique_elements, counts = np.unique(data, return_counts=True)
unique_elements

In [None]:
counts

In [None]:
# Return indices of the first occurrences
unique_elements, indices = np.unique(data, return_index=True)
indices

### Unique in 2D Arrays
By default, `np.unique()` flattens the input array. Use the `axis` parameter to find unique rows or columns.


In [None]:
matrix = np.array([[1, 2], [3, 4], [1, 2]])

In [None]:
# Unique rows
np.unique(matrix, axis=0)

```{tip} 
Use `return_counts=True` to analyze frequency distributions in datasets, useful for exploratory data analysis.
```
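
A small sketch of that kind of frequency analysis, reusing the `data` array from above:

```python
import numpy as np

data = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4])

values, counts = np.unique(data, return_counts=True)

# Most frequent value
most_common = values[np.argmax(counts)]
print(most_common)  # 4

# Relative frequencies
freqs = counts / counts.sum()
print(freqs)  # [0.1 0.2 0.3 0.4]
```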


## Summary of Methods

| Function             | Purpose                                     |
|----------------------|---------------------------------------------|
| `np.max()` / `np.min()` | Maximum/minimum value of an array          |
| `np.argmax()` / `np.argmin()` | Indices of maximum/minimum values          |
| `np.zeros()` / `np.ones()` | Preallocate arrays with zeros or ones         |
| `np.empty()`         | Preallocate an uninitialized array         |
| `np.sort()`          | Return a sorted array                      |
| `np.argsort()`       | Return indices to sort the array           |
| `np.lexsort()`       | Sort based on multiple keys                |
| `np.unique()`        | Find unique elements, indices, and counts  |




## 🎲 Random Number Generation

Generate random numbers for simulations or initializing models:


In [None]:
rng = np.random.default_rng()
rng.random(5)  # 5 random floats in [0, 1)


For random integers:

In [None]:
rng.integers(10, size=(2, 3))  # Random integers from 0 to 9
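
For reproducible results, the generator can be seeded. A sketch, where the seed value 42 is an arbitrary choice:

```python
import numpy as np

rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

# Two generators with the same seed produce the same sequence.
draws_a = rng_a.random(3)
draws_b = rng_b.random(3)
print(np.array_equal(draws_a, draws_b))  # True

# Samples from a normal distribution with mean 0 and standard deviation 1
samples = rng_a.normal(loc=0.0, scale=1.0, size=(2, 3))
print(samples.shape)  # (2, 3)

# integers() excludes the upper bound unless endpoint=True
rolls = rng_a.integers(1, 6, size=10, endpoint=True)
print(rolls.min() >= 1, rolls.max() <= 6)  # True True
```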

## 📂 Saving and Loading Data


### Binary Format:

Binary files are faster to read and write compared to text files.

- Save:

In [None]:
np.save("array.npy", a)

- Load:

In [None]:
b = np.load("array.npy")
b

### Text Format:

Text files are human-readable and can be opened in any text editor.

- Save as CSV:

In [None]:
np.savetxt("array.csv", a, delimiter=",")

- Load from CSV:

In [None]:
c = np.loadtxt("array.csv", delimiter=",")
c
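
One practical difference between the two formats is what happens to the data type. The sketch below round-trips an array through both, writing to a temporary directory (an assumption made here so the example cleans up after itself).

```python
import os
import tempfile
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "array.npy")
    csv_path = os.path.join(tmp, "array.csv")

    # Binary round trip: shape and dtype are stored in the file.
    np.save(npy_path, a)
    from_npy = np.load(npy_path)

    # Text round trip: values come back as floats by default.
    np.savetxt(csv_path, a, delimiter=",")
    from_csv = np.loadtxt(csv_path, delimiter=",")

print(from_npy.dtype == a.dtype)  # True
print(from_csv.dtype)             # float64
```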

## 📖 Getting Help

Need help with NumPy functions? Use Python's built-in `help()` function or IPython's `?`:


In [None]:
help(np.array)  # Built-in documentation

In [None]:
np.array?


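For even more detail, use `??`, which also shows the source code when it is available (functions implemented in C, such as `np.array`, have no Python source to display):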

In [None]:
np.array??

## üßë‚Äçüè´ Working with Mathematical Formulas

NumPy simplifies mathematical operations on arrays. For example, the Mean Squared Error (MSE) formula:

$$
\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
$$

Implementation in NumPy:


In [None]:
labels = np.array([1, 2, 3])
predictions = np.array([1.1, 1.9, 3.2])
mse = np.mean((labels - predictions) ** 2)
mse
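
The same computation can be broken into steps that mirror the formula term by term. The intermediate names `errors` and `squared` are introduced here for illustration.

```python
import numpy as np

labels = np.array([1, 2, 3])
predictions = np.array([1.1, 1.9, 3.2])

errors = labels - predictions       # approximately [-0.1, 0.1, -0.2]
squared = errors ** 2               # approximately [0.01, 0.01, 0.04]
mse = squared.sum() / len(labels)   # (0.01 + 0.01 + 0.04) / 3 = 0.02

# Floating-point results are compared with a tolerance rather than ==.
print(np.isclose(mse, 0.02))                                   # True
print(np.isclose(mse, np.mean((labels - predictions) ** 2)))   # True
```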

## 🧮 Advanced Mathematical Operations in NumPy

NumPy is packed with mathematical tools for handling arrays, from basic arithmetic to more advanced mathematical operations. Here's an overview of the most useful ones:



### 1️⃣ Basic Element-Wise Operations
NumPy performs operations element-wise by default. You can add, subtract, multiply, or divide arrays easily:


In [None]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

In [None]:
# Addition
a + b

In [None]:
# Subtraction
a - b

In [None]:
# Multiplication
a * b

In [None]:
# Division
a / b

```{note}
If you combine arrays of different shapes, NumPy will attempt broadcasting.
```


### 2️⃣ Power and Exponentials
- Exponentials: Use `np.exp()` to compute $e^x$ for every element `x` in the array.
- Powers: Raise elements to a power using `np.power()` or the `**` operator.
- Logarithms: Use `np.log()` for natural log, `np.log10()` for base-10 log.


In [None]:
x = np.array([1, 2, 3, 4])

In [None]:
# Exponential
np.exp(x)

In [None]:
# Powers
np.power(x, 3)  # Cube every element

In [None]:
# Natural log
np.log(x)

In [None]:
# Base-10 log
np.log10(x)

### 3️⃣ Trigonometric Functions
NumPy supports trigonometric functions like sine, cosine, and tangent. These functions expect angles in radians.


In [None]:
angles = np.array([0, np.pi / 2, np.pi])

In [None]:
# Sine
np.sin(angles)

In [None]:
# Cosine
np.cos(angles)

In [None]:
# Tangent
np.tan(angles)


Other trigonometric methods include:
- `np.arcsin()`, `np.arccos()`, `np.arctan()` for inverse trigonometric functions.
- `np.deg2rad()` and `np.rad2deg()` for converting between degrees and radians.
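
A sketch of working in degrees by converting at the boundaries. The angle values are arbitrary.

```python
import numpy as np

degrees = np.array([0, 30, 90, 180])
radians = np.deg2rad(degrees)

sines = np.sin(radians)
# sin(180 degrees) comes out as a tiny nonzero float rather than exactly 0,
# so compare with np.allclose instead of ==.
print(np.allclose(sines, [0.0, 0.5, 1.0, 0.0]))  # True

# Inverse functions return radians too; convert back for display.
angle = np.rad2deg(np.arcsin(0.5))
print(np.isclose(angle, 30.0))  # True
```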



### 4️⃣ Statistics
NumPy provides many statistical methods for arrays:
- `np.mean()`: Mean (average).
- `np.median()`: Median.
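- `np.std()`: Standard deviation (population standard deviation by default; pass `ddof=1` for the sample version).
- `np.var()`: Variance (same `ddof` convention as `np.std()`).
- `np.min()` and `np.max()`: Minimum and maximum values.
- `np.percentile()`: Compute the q-th percentile, with `q` between 0 and 100.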


In [None]:
data = np.array([1, 2, 3, 4, 5])

In [None]:
# Mean
np.mean(data)

In [None]:
# Median
np.median(data)

In [None]:
# Standard deviation
np.std(data)

In [None]:
# Variance
np.var(data)

In [None]:
# Percentile
np.percentile(data, 50)  # Median
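
By default `np.std()` and `np.var()` divide by `n`. A sketch of the difference the `ddof` argument makes, using the same `data`:

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

population_std = np.std(data)        # divides by n:     sqrt(10 / 5)
sample_std = np.std(data, ddof=1)    # divides by n - 1: sqrt(10 / 4)

print(np.isclose(population_std, np.sqrt(2.0)))  # True
print(np.isclose(sample_std, np.sqrt(2.5)))      # True

# Several percentiles can be requested at once.
quartiles = np.percentile(data, [25, 50, 75])
print(quartiles)  # [2. 3. 4.]
```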

### 5️⃣ Linear Algebra
Linear algebra operations are critical for many engineering and scientific applications. NumPy provides `np.linalg` for this purpose:

```{tip}
If you are currently enrolled in a linear algebra course, you can use NumPy to check your answers.
```


In [None]:
# Define a 2D matrix
matrix = np.array([[1, 2], [3, 4]])

In [None]:
# Transpose
matrix.T

In [None]:
# Matrix Multiplication
np.dot(matrix, matrix)

In [None]:
# Determinant
np.linalg.det(matrix)

In [None]:
# Eigenvalues and Eigenvectors
eigvals, eigvecs = np.linalg.eig(matrix)

print("eigenvalues: ", eigvals)
print("eigenvectors: ", eigvecs)


Additional methods:
- `np.linalg.inv()`: Matrix inverse.
- `np.linalg.norm()`: Vector or matrix norm.
- `np.linalg.qr()`: QR decomposition.
- `np.linalg.svd()`: Singular Value Decomposition (SVD).
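
A sketch of a few of these in use, with checks that the results behave as expected. The right-hand side `b` is made up for illustration.

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])

# The @ operator is matrix multiplication, equivalent to np.dot for 2D arrays.
inverse = np.linalg.inv(matrix)
print(np.allclose(matrix @ inverse, np.eye(2)))  # True

# Solve matrix @ x = b without forming the inverse explicitly.
b = np.array([5, 6])
x = np.linalg.solve(matrix, b)
print(np.allclose(x, [-4.0, 4.5]))  # True

# Each column of eigvecs pairs with the eigenvalue at the same position.
eigvals, eigvecs = np.linalg.eig(matrix)
print(np.allclose(matrix @ eigvecs[:, 0], eigvals[0] * eigvecs[:, 0]))  # True
```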



### 6️⃣ Sorting and Searching
- Sorting: Use `np.sort()` to sort elements in ascending order.
- Search for Elements: Use `np.where()` to find indices of elements that match a condition.


In [None]:
data = np.array([3, 1, 4, 1, 5])

# Sort the array
np.sort(data)

In [None]:
# Find indices of elements greater than 3
np.where(data > 3)
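
`np.where()` has two forms. A sketch of both, using the same `data`:

```python
import numpy as np

data = np.array([3, 1, 4, 1, 5])

# One-argument form: returns a tuple with one index array per dimension.
(indices,) = np.where(data > 3)
print(indices)        # [2 4]
print(data[indices])  # [4 5]

# Three-argument form: pick from data where the condition holds, else use 0.
replaced = np.where(data > 3, data, 0)
print(replaced)       # [0 0 4 0 5]
```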

### 7️⃣ Aggregations
Aggregate functions operate over an entire array or along a specified axis:
- `np.sum()`: Sum of elements.
- `np.prod()`: Product of elements.
- `np.cumsum()`: Cumulative sum.
- `np.cumprod()`: Cumulative product.


In [None]:
data = np.array([1, 2, 3, 4])

In [None]:
# Sum of all elements
np.sum(data)

In [None]:
# Product of all elements
np.prod(data)

In [None]:
# Cumulative sum
np.cumsum(data)

In [None]:
# Cumulative product
np.cumprod(data)
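
The same functions accept an `axis` argument for multi-dimensional arrays. A sketch on a small made-up matrix:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

col_sums = np.sum(matrix, axis=0)     # collapse the rows:    [5 7 9]
row_sums = np.sum(matrix, axis=1)     # collapse the columns: [ 6 15]
row_prods = np.prod(matrix, axis=1)   # [  6 120]
running = np.cumsum(matrix, axis=1)   # rows: [1 3 6] and [4 9 15]

print(col_sums, row_sums, row_prods)
print(running)
```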

### 8️⃣ Clipping and Rounding
- Clipping: Restrict array values within a range using `np.clip()`.
- Rounding: Round values using `np.round()`, `np.floor()`, `np.ceil()`, etc.


In [None]:
data = np.array([1.2, 2.5, 3.7, 4.4])

In [None]:
# Clip values between 2 and 4
np.clip(data, 2, 4)

In [None]:
# Round to the nearest integer (halves go to the nearest even value, so 2.5 -> 2.0)
np.round(data)

In [None]:
# Floor and Ceil
np.floor(data)  # Round down

In [None]:
np.ceil(data)  # Round up
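
A few rounding details that often surprise people, sketched with made-up values:

```python
import numpy as np

# Values exactly halfway between integers round to the nearest even integer.
halves = np.array([0.5, 1.5, 2.5, 3.5])
rounded = np.round(halves)
print(rounded)  # [0. 2. 2. 4.]

# A second argument sets the number of decimal places.
values = np.array([1.234, 5.678])
one_decimal = np.round(values, 1)
print(one_decimal)  # [1.2 5.7]

# For negative numbers, floor moves away from zero and ceil moves toward it.
negatives = np.array([-1.5, -0.2])
floored = np.floor(negatives)   # [-2. -1.]
ceiled = np.ceil(negatives)     # [-1. -0.]
```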

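### 9️⃣ Random Sampling
Use `np.random` to generate random values for simulations. The functions below are part of NumPy's legacy interface; for new code, the `np.random.default_rng()` generator shown in the Random Number Generation section is the recommended approach:
- `np.random.random()`: Uniform random values in the range [0, 1).
- `np.random.normal()`: Random values from a normal distribution.
- `np.random.randint()`: Random integers within a range (the upper bound is excluded).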


In [None]:
# Random values between 0 and 1
np.random.random(5)

In [None]:
# Random integers from 10 to 19 (the upper bound, 20, is excluded)
np.random.randint(10, 20, size=5)



These mathematical tools make NumPy the backbone of scientific computing in Python. With these methods, you can efficiently handle numerical data for any engineering, research, or data science task! 🚀