NOTE: For your homework, download and use the template (https://math.dartmouth.edu/~m50f17/HW4.Rmd).

Read the green comments in the Rmd file to see where your answers should go.







Question-1 (Sample)

Read Example 3.1 Delivery Time Data.

  1. Graphics can be very useful in analyzing data. Produce two visualizations of the data: first, a three-dimensional scatterplot of the delivery time data; then a scatterplot matrix (an array of 2D plots in which each plot is a scatter diagram between two variables).

  2. Fit the reduced regression model relating delivery time to the number of cases. Plot the joint confidence region of the coefficients (slope and intercept), and add a point to the plot showing the estimated slope and intercept.

  3. Calculate the extra sum of squares due to the regressor variable Distance.
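
As a reminder of what the ellipse in item 2 represents, the joint \(100(1-\alpha)\%\) confidence region for all \(p\) regression coefficients is usually written as shown below. The symbols \(\hat{\beta}\), \(X\), \(MS_{Res}\), \(n\) and \(p\) are assumed to follow the book's notation; for the reduced model, \(p = 2\) (intercept and slope).

\[
\frac{(\hat{\beta}-\beta)^{\top} X^{\top} X \,(\hat{\beta}-\beta)}{p \, MS_{Res}} \le F_{\alpha,\,p,\,n-p}
\]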

Answer:

# Computation part of the answer : 

# Loading the data
delivery <- read.table("https://math.dartmouth.edu/~m50f17/delivery.csv", header = TRUE)
x1Cases <- delivery$Cases
x2Distance <- delivery$Distance
yTime <- delivery$Time

cat ("Part (a) \n")
## Part (a)
# 3D scatter diagram  
library("plot3D")
library("scatterplot3d")
sc1 <- scatterplot3d(x1Cases, x2Distance, yTime, pch = 17, type = "p", angle = 15, highlight.3d = TRUE)

# Plot scatterplot matrix (first column of the data frame dropped)
plot(delivery[,-1])

cat("Part (b) \n")
## Part (b)
library(ellipse)
# Reduced model: delivery time on number of cases only
reducedFit <- lm(Time ~ Cases, data = delivery)
plot(ellipse(reducedFit), type = "l", xlab = "Intercept", ylab = "Slope", main = "Joint Confidence Region")
# Mark the estimated (intercept, slope) pair
points(coef(reducedFit)[[1]], coef(reducedFit)[[2]])

cat("Part (c) \n")
## Part (c)
fullFit <- lm(Time ~ Cases + Distance, data = delivery)
reducedSSR <- sum((predict(reducedFit) - mean(yTime))^2)
fullSSR <- sum((predict(fullFit) - mean(yTime))^2)

cat("Extra sum of squares due to Distance is:", fullSSR - reducedSSR, "\n")
## Extra sum of squares due to Distance is: 168.4021
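
The difference of regression sums of squares used in part (c) can be cross-checked against `anova()`, which reports the same quantity when comparing two nested fits. The sketch below is only an illustration: it uses R's built-in `mtcars` data as a stand-in (not the delivery data), and the helper `ssr` and the variable names are hypothetical.

```r
# Stand-in data: R's built-in mtcars, NOT the delivery data.
reduced <- lm(mpg ~ wt, data = mtcars)        # reduced model (one regressor)
full    <- lm(mpg ~ wt + hp, data = mtcars)   # full model (adds hp)

# Regression sum of squares, computed the same way as in part (c)
ssr <- function(fit, y) sum((fitted(fit) - mean(y))^2)
extraSS <- ssr(full, mtcars$mpg) - ssr(reduced, mtcars$mpg)

# anova() on nested models reports the extra sum of squares in "Sum of Sq"
tab <- anova(reduced, full)
extraSSAnova <- tab[2, "Sum of Sq"]

cat("Extra sum of squares (by hand):", extraSS, "\n")
cat("Extra sum of squares (anova)  :", extraSSAnova, "\n")
```

Both numbers should agree, because with an intercept in the model the gain in regression sum of squares equals the drop in residual sum of squares.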



Question-2

Load the kinematic viscosity data (described in Problem 3.14 and Table B-10 of the book) from https://math.dartmouth.edu/~m50f17/kinematic.csv
Solve parts (a) to (e) of Problem 3.14 using \(\alpha=0.05\). In addition, do the following.

  1. Calculate the extra sum of squares due to the regressor variable x1.

  2. Plot a scatterplot matrix and a scatter diagram to visualize the data. Can you connect what you see in these plots to the results you found in the previous parts? Discuss.

Answer:

# Computation part of the answer : 



Question-3

Load the mortality data (described in Problem 3.15 and Table B-15 of the book) from

https://math.dartmouth.edu/~m50f17/mortality.csv

Solve parts (a) to (e) of Problem 3.15 (use \(\alpha=0.05\) where needed). In addition, do the following.

  1. You want to quantify the joint contribution of the regressors Educ, NOX and SO2 to the model. Using the partial F test (equation 3.35) with \(\alpha=0.01\), comment on this contribution. (Note the different \(\alpha\) value.)

  2. Consider the tests of individual contribution you carried out in part (c). Choose the two regressor variables with the smallest t-statistics in absolute value. Using the partial F test with \(\alpha=0.01\), comment on their contribution to the model.
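
A partial F test for a group of \(r\) regressors compares the full model with the reduced model that omits them. In the usual textbook form, \(F_0 = \frac{SS_R(\beta_2 \mid \beta_1)/r}{MS_{Res}}\), and the group is judged to contribute when \(F_0 > F_{\alpha,\,r,\,n-p}\). The following is a minimal sketch of the mechanics only: it uses the built-in `mtcars` data as a stand-in (not the mortality data), and the choice of regressors and all variable names are hypothetical.

```r
# Stand-in data: R's built-in mtcars, NOT the mortality data.
alpha   <- 0.01
full    <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
reduced <- lm(mpg ~ wt + hp, data = mtcars)   # omits the group {qsec, drat}

# Number of regressors being tested
r <- df.residual(reduced) - df.residual(full)

# Extra sum of squares = SS_Res(reduced) - SS_Res(full)
extraSS <- deviance(reduced) - deviance(full)

# Residual mean square of the full model
msRes <- deviance(full) / df.residual(full)

# Partial F statistic, critical value and p-value
F0     <- (extraSS / r) / msRes
Fcrit  <- qf(1 - alpha, r, df.residual(full))
pValue <- pf(F0, r, df.residual(full), lower.tail = FALSE)

cat("F0 =", F0, " critical value =", Fcrit, " p-value =", pValue, "\n")
cat("Reject H0 at alpha =", alpha, ":", F0 > Fcrit, "\n")

# The same statistic is reported by anova() on the nested fits
F0Anova <- anova(reduced, full)[2, "F"]
```

Comparing `F0` with `Fcrit` and comparing `pValue` with `alpha` lead to the same decision; `anova(reduced, full)` is a convenient check on the hand computation.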

Answer:

# Computation part of the answer :