# You must make sure to run all cells in sequence using shift + enter or you might encounter errors
from pykubegrader.initialize import initialize_assignment

responses = initialize_assignment("6_Curve_Fitting_q", "week_8", "lecturenotgraded", assignment_points=45.0, assignment_tag="week8-lecturenotgraded")

# Initialize Otter
import otter
grader = otter.Notebook("6_Curve_Fitting_q.ipynb")

💻 Activity: Curve Fitting 📊

Engineers often have to deal with large amounts of raw data.

To assist in data analysis, it is common to fit a model to the data, which makes useful insights easier to extract.

Fitting works by computing the model parameters that minimize an objective function, such as the sum of squared differences between the model and the data.

There are many optimization methods; this is a rich field of science and engineering.

SciPy is a package for scientific computing in Python that has many built-in tools for optimization and fitting.
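To make the idea of minimizing an objective concrete, here is a minimal sketch that fits a straight line by handing a sum-of-squared-errors objective to `scipy.optimize.minimize`. The data, the `sum_squared_error` helper, and the starting guess are all made up for illustration; this is one way to see the idea, not the method used later in this activity.

```python
import numpy as np
from scipy import optimize

# Hypothetical, noise-free data lying exactly on y = 2x + 1
x = np.linspace(0.0, 5.0, 20)
y = 2.0 * x + 1.0

def sum_squared_error(params):
    """Objective: sum of squared differences between the line and the data."""
    slope, intercept = params
    return np.sum((slope * x + intercept - y) ** 2)

# Start from a deliberately poor guess and let the optimizer reduce the objective
result = optimize.minimize(sum_squared_error, x0=[0.0, 0.0])
slope_fit, intercept_fit = result.x
print(slope_fit, intercept_fit)
```

The optimizer only sees the objective value for each trial parameter set, so swapping in a different model means changing only the expression inside `sum_squared_error`.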

Curve Fitting Example

Suppose we have data that follows a sine function with some added noise. Write code to generate this data.

# import numpy as np
...

# Do not change the random seed
np.random.seed(123)

# generate x_data, a linearly spaced vector from -5 to 5 with 50 points, using the np.linspace function
...

# generate y_data, which is 2.9 * np.sin(1.5 * x_data) plus random noise from the np.random.normal function
...
grader.check("generate-sine-data")
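The fixed random seed in the cell above is what makes the noisy data repeatable. The sketch below shows this on an unrelated signal: the `make_noisy_cosine` helper, its parameters, and the cosine itself are hypothetical and chosen only for illustration.

```python
import numpy as np

def make_noisy_cosine(n_points=30, noise_scale=0.3, seed=0):
    """Return x values and a noisy cosine sampled at those x values."""
    np.random.seed(seed)  # fixing the seed makes the noise repeatable
    x = np.linspace(0.0, 2.0 * np.pi, n_points)
    noise = np.random.normal(loc=0.0, scale=noise_scale, size=n_points)
    return x, np.cos(x) + noise

x1, y1 = make_noisy_cosine(seed=0)
x2, y2 = make_noisy_cosine(seed=0)  # same seed: identical noise
x3, y3 = make_noisy_cosine(seed=1)  # different seed: different noise

print(np.array_equal(y1, y2))
print(np.array_equal(y1, y3))
```

Because an automated check compares your arrays against reference values, changing the seed or drawing extra random numbers before generating the data would change the result.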

When dealing with data, it is usually helpful to visualize it as a graph.

# plot the raw data using matplotlib
...

# plot x_data and y_data using the plt.plot function with the "-s" format string (a solid line with square markers, drawn in the default blue), and save the result to the variable plot_
...
grader.check("generate-initial-plot")

We know the data lies approximately on a sine wave. We do not know the amplitude or the period.

We can estimate those by least-squares curve fitting. First we have to define the test function to fit. In this case, a sine with unknown amplitude a and angular frequency b (the period is then 2π/b):

# write a function sine_func that takes x, a, and b and returns a * np.sin(b * x), a sine with unknown amplitude a and angular frequency b
...
grader.check("make-sine-function")
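A model function intended for `scipy.optimize.curve_fit` takes the independent variable as its first argument and each fit parameter as a separate argument after it. As a sketch of that convention on a different shape, here is a hypothetical Gaussian model; the name `gaussian_model` and its parameters are illustrative and not part of this assignment.

```python
import numpy as np

def gaussian_model(x, amplitude, center, width):
    """Gaussian bump: x comes first, then one argument per fit parameter."""
    return amplitude * np.exp(-((x - center) ** 2) / (2.0 * width ** 2))

# NumPy operations keep the function vectorized: an array in, an array out
x = np.linspace(-3.0, 3.0, 7)
y = gaussian_model(x, 2.0, 0.0, 1.0)
print(y)
```

Writing the model with NumPy functions rather than the `math` module matters here, because the fitting routine evaluates it on the whole data array at once.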

We then use scipy.optimize.curve_fit() to find a and b:

# import the optimize module from scipy
...

# Do not change the random seed
np.random.seed(123)

# use the curve_fit function in optimize to fit sine_func to the data; use the p0 parameter to set the initial guess for the parameters to [2, 2]
...

# print the parameters
...
grader.check("scipy-optimize-curve-fit")
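`curve_fit` returns two things: the best-fit parameters and their estimated covariance matrix. The sketch below uses synthetic data with assumed true values (amplitude 1.8, angular frequency 0.9) and a hypothetical `sine_model`, and it reads one-standard-deviation parameter uncertainties off the diagonal of the covariance matrix.

```python
import numpy as np
from scipy import optimize

def sine_model(x, a, b):
    """Sine with amplitude a and angular frequency b."""
    return a * np.sin(b * x)

# Synthetic data with assumed true parameters a = 1.8, b = 0.9
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 80)
y = sine_model(x, 1.8, 0.9) + rng.normal(scale=0.1, size=x.size)

# p0 is the starting point of the search; a rough but sensible guess helps
popt, pcov = optimize.curve_fit(sine_model, x, y, p0=[1.5, 1.0])

# Square roots of the diagonal give one-standard-deviation uncertainties
perr = np.sqrt(np.diag(pcov))
print("parameters:", popt)
print("uncertainties:", perr)
```

For periodic models the initial guess matters more than usual: a frequency guess far from the true value can leave the optimizer stuck in a local minimum, which is why an explicit `p0` is passed.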

Question 4 (Points: 10.0): Visualizing our Results

# Create a new figure and axis, make it using plt.subplots, and set the figsize to (10, 6)
# save the figure to the variable fig, and the axis to the variable ax
...

# Plot the raw data
# save the plot to the variable raw_data_plot, make the points blue circles using the "bo" marker, add a label "Raw data" using the optional label parameter
...

# Plot the fitted function 
# save the plot to the variable fitted_function_plot, make the line red by passing the color "red", add a label "Fitted function" using the optional label parameter
...

# Label the axes "X" and "Y", using the plt.xlabel and plt.ylabel functions
...

# Set the y-axis limits to 1.2 times the minimum and maximum of y_data using the plt.ylim function
...

# Add a legend, set the loc to "upper right" using the plt.legend function
...
grader.check("sine-viz-plot")
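The same plotting steps can be sketched on unrelated data. This example uses a noisy cosine and Matplotlib's axes methods (`ax.plot`, `ax.set_xlabel`, and so on) rather than the `plt.*` functions the cell above asks for; the data and all variable names are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical noisy samples of a cosine, plus a dense grid for a smooth curve
rng = np.random.default_rng(2)
x = np.linspace(0.0, 2.0 * np.pi, 25)
y = np.cos(x) + rng.normal(scale=0.1, size=x.size)
x_smooth = np.linspace(0.0, 2.0 * np.pi, 200)

fig, ax = plt.subplots(figsize=(8, 5))

# ax.plot returns a list of Line2D objects; the trailing comma unpacks the single line
data_line, = ax.plot(x, y, "bo", label="Data")
model_line, = ax.plot(x_smooth, np.cos(x_smooth), color="red", label="Model")

ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_ylim(1.2 * y.min(), 1.2 * y.max())  # adds headroom above and below
ax.legend(loc="upper right")
```

Scaling the limits by 1.2 only widens the view when the minimum is negative and the maximum is positive, as is the case for an oscillation centered on zero.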

Question 5 (Points: 9.0): Example on an Exponential Function

# import numpy as np
# import the curve_fit function from scipy.optimize
# import matplotlib.pyplot as plt
...

# Reset the random seed and recompute to check reproducibility
np.random.seed(123)

# Define the exponential function to fit, the expression is a * np.exp(-b * x) + c
...

# Generate xdata from 0 to 4 with 50 points using the np.linspace function
...

# generate ydata which is exp_func(xdata, 2.5, 1.3, 0.5) + 0.2 * np.random.normal(size=len(xdata))
...

# Fit the data, save the fit parameters in popt, and the covariance matrix in pcov
...

# Plot the raw data, make the points blue circles using the "bo" marker, add a label "Raw data" using the optional label parameter.
# save the plot to the variable raw_data_plot
...

# plot the fitted function using matplotlib, make the line red by passing the color "red", add a label "Fitted function" using the optional label parameter.
# you can pass the parameters to the exp_func function using the * operator to unpack the parameters
# save the plot to the variable fitted_function_plot
...

# add a label for the x-axis "X" using the plt.xlabel function
...

# add a label for the y-axis "Y" using the plt.ylabel function
...

# add a legend to the plot using the plt.legend function, the legend should be in the upper right corner using the "upper right" location
...

plt.show()


# calculate the residuals, the difference between ydata and the fitted function evaluated at xdata
...

# calculate the sum of the squared residuals, which measures the variation the fit leaves unexplained, save to the variable ss_res
...

# calculate the total sum of squares, the sum of squared differences between ydata and its mean, save to the variable ss_tot
...

# calculate the r-squared value, this is a common metric for the quality of the fit, the r-squared value is 1 - (ss_res / ss_tot)
...

# print the r-squared value
...
grader.check("exponential-function-fit")
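The whole workflow (fit, evaluate with unpacked parameters, score with R-squared) can be sketched end to end as follows. The `decay_model` name, the assumed true parameters (3.0, 0.8, 1.0), the noise level, and the initial guess are illustrative choices that differ from the values in this question.

```python
import numpy as np
from scipy.optimize import curve_fit

def decay_model(x, a, b, c):
    """Hypothetical decay model: a * exp(-b * x) + c."""
    return a * np.exp(-b * x) + c

# Synthetic data with assumed true parameters (3.0, 0.8, 1.0)
rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 60)
y = decay_model(x, 3.0, 0.8, 1.0) + rng.normal(scale=0.05, size=x.size)

popt, pcov = curve_fit(decay_model, x, y, p0=[1.0, 1.0, 0.0])

# *popt unpacks the fitted array into the a, b, c arguments
y_fit = decay_model(x, *popt)

# R-squared: the fraction of the variation in y that the fit explains
residuals = y - y_fit
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - np.mean(y)) ** 2)
r_squared = 1.0 - ss_res / ss_tot
print(f"R^2 = {r_squared:.4f}")
```

An R-squared close to 1 means the residuals are small compared with the overall spread of the data. It does not by itself show that the model form is correct, so a plot of the residuals is a useful companion check.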

Submitting Assignment

Please run the following block of code using shift + enter to submit your assignment. You should then see your score.

from pykubegrader.submit.submit_assignment import submit_assignment

submit_assignment("week8-lecturenotgraded", "6_Curve_Fitting_q")