# You must make sure to run all cells in sequence using shift + enter or you might encounter errors
from pykubegrader.initialize import initialize_assignment
responses = initialize_assignment("7_zillow_q", "week_8", "readings", assignment_points = 20.0, assignment_tag = 'week8-readings')
# Initialize Otter
import otter
grader = otter.Notebook("7_zillow_q.ipynb")
Welcome to the wild world of Zillow Gone Wild! 🏡✨
In this activity, we're diving into the California housing market with the zest of a real estate mogul and the curiosity of a data scientist. Armed with Python and a sprinkle of magic from sklearn, we're about to fetch a treasure trove of housing data that's just begging to be explored.
Our mission, should we choose to accept it (and we totally do), is to unravel the mysteries hidden within this dataset. We'll start by conjuring up a dazzling data profile using ydata_profiling. Think of it as a crystal ball that reveals the secrets of our data: distributions, correlations, and maybe even a few skeletons in the closet. 🧙‍♂️🔮
But wait, there's more! We're not just stopping at data profiling. Oh no, we're going full Picasso with Bokeh, crafting an interactive scatter plot masterpiece. Picture this: housing prices dancing with population numbers, all under the spotlight of your mouse cursor. Tooltips will be our backstage pass, giving us the inside scoop on each data point. 🎨🖱️
And for those of you with a penchant for precision, we've got a filtering function that lets you sift through listings like a pro. Want to see only the homes with at least four bedrooms? No problem! Our ZillowAnalyzer class is here to make your real estate dreams come true. So buckle up, grab your virtual hard hat, and let's build some data-driven insights that would make even the most seasoned realtor green with envy. Let's get analyzing! 🏠🏡
# import the pandas library as pd
# import the fetch_california_housing dataset from sklearn.datasets
...
# Load the dataset
# fetch_california_housing is a function; calling it returns an object that holds the data
# call the function and store the result in a variable called housing
...
# Convert to a DataFrame
# We will use the pd.DataFrame() function to convert the data to a DataFrame
# The data is stored in the housing.data variable
# The feature names are stored in the housing.feature_names variable, make sure to set the column names to the feature names
# store the DataFrame in a variable called df
...
# Add the price (targets) to the DataFrame
# The price is stored in the housing.target variable
# store the price in a variable called price
# then create a new column in the DataFrame, also called price, and assign the price variable to it; this works just like assigning to a dictionary key
...
grader.check("loading-the-zillow-data")
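As a hedged illustration of the pattern described in the comments above, the sketch below builds a DataFrame from a small hand-written list of rows and a list of column names, then attaches a target column with dictionary-style assignment. The `toy_data`, `toy_feature_names`, and `toy_target` values are made up for this example and only stand in for `housing.data`, `housing.feature_names`, and `housing.target`; it is an analogy on toy data, not the assignment's solution.

```python
import pandas as pd

# Hypothetical stand-ins for housing.data, housing.feature_names, housing.target
toy_data = [[8.3, 41.0, 322.0], [7.2, 21.0, 2401.0], [5.6, 52.0, 496.0]]
toy_feature_names = ["MedInc", "HouseAge", "Population"]
toy_target = [4.526, 3.585, 3.521]

# Build the DataFrame and name the columns after the features
toy_df = pd.DataFrame(toy_data, columns=toy_feature_names)

# Add a new column the same way you would add a key to a dictionary
toy_df["price"] = toy_target

print(toy_df.shape)
print(list(toy_df.columns))
```

The same two steps (construct with `columns=`, then assign a new column by name) should carry over to the real dataset, assuming its data and target have matching row counts.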
Building the ZillowAnalyzer class
Now we will build the ZillowAnalyzer class. This class will have the following methods:
__init__: This method will initialize the class with a DataFrame of home listings.
generate_profile: This method will generate a ydata profiling report.
create_visualization: This method will create an interactive scatter plot using Bokeh.
filter_by_bedrooms: This method will filter the listings by the number of bedrooms.
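The filtering idea in the last item can be sketched on made-up data. Everything named here (`ToyListingFilter`, `at_least`, `toy_listings`) is hypothetical and chosen for illustration only; these are not the class or method names the assignment asks for. The sketch shows a class that stores a DataFrame on the instance and filters it with a boolean mask.

```python
import pandas as pd


class ToyListingFilter:
    """Hypothetical helper for illustration (not the assignment's ZillowAnalyzer)."""

    def __init__(self, df):
        # Keep a reference to the DataFrame on the instance
        self.df = df

    def at_least(self, column, minimum):
        # Comparing a column to a value yields a boolean Series (a mask);
        # indexing the DataFrame with that mask keeps only the True rows
        mask = self.df[column] >= minimum
        return self.df[mask]


# Made-up listings; the values are not taken from the real dataset
toy_listings = pd.DataFrame(
    {"AveBedrms": [0.9, 1.1, 4.2, 5.0], "price": [1.2, 2.5, 3.1, 4.8]}
)
helper = ToyListingFilter(toy_listings)
big_homes = helper.at_least("AveBedrms", 4)
print(big_homes)
```

Boolean-mask indexing returns a new DataFrame and leaves the original untouched, which is why the filtered result is assigned to its own variable.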
df["AveBedrms"].head()
# import the pandas library as pd
# import the ydata_profiling library
# from the bokeh.plotting module, import the figure and show functions
# from the bokeh.models module, import the ColumnDataSource and HoverTool classes
# from the bokeh.io module, import the output_notebook function
...
# Define the ZillowAnalyzer class
...
# The constructor method for the class
# This initializes the class with a dataframe of home listings
# Assign the dataframe to the instance variable df
...
# Make the method generate_profile, that takes itself as an input argument
# use the ydata_profiling library to generate a report, and save the report to a local variable called profile
# return the result of calling the profile's to_notebook_iframe method with no input arguments; this displays the report as an iframe
...
# Make the method create_visualization, that takes itself as an input argument
...
"""Generate an interactive scatter plot using Bokeh."""
# call the output_notebook function so that Bokeh plots render inline in the notebook
# create a ColumnDataSource object from the dataframe, using the ColumnDataSource function
...
# create a figure object `p` from the figure function, make sure to use the bokeh figure function
# set the title of the figure to "Zillow Gone Wild: Price vs. Population"
# set the x_axis_label to "Population"
# set the y_axis_label to "Price"
# add the following tooltips to the figure:
# tooltips=[
# ("Population", "@Population"),
# ("Price", "@price"),
# ("AveRooms", "@AveRooms"),
# ("AveBedrms", "@AveBedrms"),
# ("AveOccup", "@AveOccup"),
# ("Latitude", "@Latitude"),
# ("Longitude", "@Longitude"),
# ("MedInc", "@MedInc"),
# ]
...
# create a scatter plot from the figure object - this is done by calling the scatter function on the figure object
# set the x to "Population"
# set the y to "price"
# set the size to 10
# set the color to "navy"
# set the alpha to 0.6
...
# add a hover tool to the figure
# instantiate the HoverTool class
# add the tool to the figure, using the add_tools method of the figure object
...
# show the figure
...
# Make the method filter_by_bedrooms, that takes itself as an input argument
# and a minimum number of bedrooms
# return the listings with at least the minimum number of bedrooms
# you will need to look up how to filter a pandas dataframe
...
# Instantiate and Test
# Instantiate the ZillowAnalyzer class with the dataframe df, store the instance in a variable called analyzer
# generate the profile of the dataframe, store the profile in a variable called profile
# create the visualization, store the figure in a variable called p
# filter the dataframe to listings with at least 4 bedrooms, store the filtered dataframe in a variable called filtered_df
...
grader.check("zillow-gone-wild-bokeh-ydata-class")
Submitting Assignment
Please run the following block of code using Shift + Enter to submit your assignment. You should then see your score.
from pykubegrader.submit.submit_assignment import submit_assignment
submit_assignment("week8-readings", "7_zillow_q")