Open book, open internet, open questions to me + Rick (head TA) only
No communication with others or posting questions on the internet allowed
What can you do to start preparing?
Review readings, assignments, feedback returned
Organize your notes
Come to office hours with questions
Computational setup
# load packageslibrary(tidyverse) # for data wrangling and visualizationlibrary(tidymodels) # for modelinglibrary(openintro) # for the duke_forest datasetlibrary(scales) # for pretty axis labelslibrary(knitr) # for pretty tableslibrary(patchwork) # for laying out plotslibrary(GGally) # for pairwise plots# set default theme and larger font size for ggplot2ggplot2::theme_set(ggplot2::theme_minimal(base_size =20))
Considering multiple variables
House prices in Levittown
The data set contains the sales price and characteristics of 85 homes in Levittown, NY that sold between June 2010 and May 2011.
Levittown was built right after WWII and was the first planned suburban community built using mass production techniques.
Similar to simple linear regression, this model assumes that at each combination of the predictor variables, the values sale_price follow a Normal distribution.
Regression Model
Recall: The simple linear regression model assumes
The estimated coefficient \(\hat{\beta}_j\) is the expected change in the mean of \(y\) when \(x_j\) increases by one unit, holding the values of all other predictor variables constant.
Example: The estimated coefficient for living_area is 65.90. This means for each additional square foot of living area, we expect the sale price of a house in Levittown, NY to increase by $65.90, on average, holding all other predictor variables constant.
Prediction
What is the predicted sale price for a house in Levittown, NY with 3 bedrooms, 1 bathroom, 1,050 square feet of living area, 6,000 square foot lot size, built in 1948 with $6,306 in property taxes?
The predicted sale price for a house in Levittown, NY with 3 bedrooms, 1 bathroom, 1050 square feet of living area, 6000 square foot lot size, built in 1948 with $6306 in property taxes is $265,360.
Prediction, revisit
Just like with simple linear regression, we can use the predict() function in R to calculate the appropriate intervals for our predicted values:
Calculate a 95% confidence interval for the estimated mean price of houses in Levittown, NY with 3 bedrooms, 1 bathroom, 1050 square feet of living area, 6000 square foot lot size, built in 1948 with $6306 in property taxes.
predict(price_fit, new_house, type ="conf_int", level =0.95)
Calculate a 95% prediction interval for an individual house in Levittown, NY with 3 bedrooms, 1 bathroom, 1050 square feet of living area, 6000 square foot lot size, built in 1948 with $6306 in property taxes.
predict(price_fit, new_house, type ="pred_int", level =0.95)