## Question-1 (Sample)

Given a fixed confidence interval percentage (say 95%) at what value of x does CI on the mean response achieve its minimum width?

Write an R-chunk using the propellant data which computes the following.

1. Fit a simple linear regression model relating shear strength to age.
2. Plot scatter diagram.
3. Plot two curves (in blue color) that traces upper and lower limits of 95% confidence interval on $$E(y|x_0)$$
4. Plot two curves (in red color) that traces upper and lower limits of 95% prediction interval for $$y$$
5. Print the 95% quantile of the corresponding t distribution

The width of the interval is $2 t_{\alpha/2,n-2} \sqrt{MS_{Res} ((1/n) + (x_0 - \bar{x})^2 / S_{xx} }$ and all terms inside the square root are positive. Therefore it is minimized when $$x_0=\bar{x}$$.

# Computation part of the answer :

shearS<-prop$ShearS age<-prop$Age

plot(age, shearS, xlab = "Propellant Age (weeks)", ylab = "Shear S. (psi)", main = "Rocket Propellant")
fitted <- lm(shearS ~ age)

ageList <- seq(0,25,0.5)
cList <- predict(fitted, list(age = ageList), int = "c", level = 0.95)
pList <- predict(fitted, list(age = ageList), int = "p", level = 0.95)

matlines(ageList, pList, lty='solid' ,  col = "red")
matlines(ageList, cList, lty = 'solid', col = "blue") 

# since n=20 we look at the t_18 distribution
wantedQuantile <- qt( 0.95, 18) ;
cat("95% quantile is of t_18 is : ", wantedQuantile ) ; 
## 95% quantile is of t_18 is :  1.734064

## Question-2

Plot the same graph as in Question-1 without using R function predict, but instead directly calculating the interval limits we discussed in class. In particular, what are the limits of 95% confidence interval on $$E(y|x_0)$$?

# Computation part of the answer :
Sxx<-sum((age-mean(age))^2)
Sxy<-sum((age-mean(age))*shearS)
beta1Hat<-Sxy/Sxx
beta0Hat<-mean(shearS)-Sxy/Sxx*mean(age)
plot(age,shearS)
abline(c(beta0Hat,beta1Hat), lwd=2, col='blue')

N <- length(age) ;
dof <- N - 2 ;
muHat <- beta0Hat + beta1Hat *  ageList ;
yHat <- fitted$fitted.values ; SSres <- sum((shearS - yHat)^2); MSres <- SSres / dof ; xBar <- mean (age) tQuantile_975 <- qt( 1 - 0.05/2 , 18) ; upperCI <- muHat + tQuantile_975 * sqrt ( MSres * ( (1/N) + (ageList-xBar)^2 / Sxx ) ) lowerCI <- muHat - tQuantile_975 * sqrt ( MSres * ( (1/N) + (ageList-xBar)^2 / Sxx ) ) matlines(ageList, upperCI, lty = 'solid', col = "blue") matlines(ageList, lowerCI, lty = 'solid', col = "blue") yHatList <- beta0Hat + beta1Hat * ageList ; upperPI <- yHatList + tQuantile_975 * sqrt ( MSres * ( 1 + (1/N) + (ageList-xBar)^2 / Sxx ) ) lowerPI <- yHatList - tQuantile_975 * sqrt ( MSres * ( 1 + (1/N) + (ageList-xBar)^2 / Sxx ) ) matlines(ageList, upperPI, lty = 'solid', col = "red") matlines(ageList, lowerPI, lty = 'solid', col = "red")  ## Question-3 Load the propellant data and fit a simple linear regression model relating shear strength to age. 1. Test the hypothesis $$\beta_1 = -30$$ using confidence level 97.5%. 2. Calculate the limits of 97.5% confidence interval for $$\beta_0$$ and $$\beta_1$$ 3. Is there any relation between the answers you find in part (a) and (b) ? 4. Calculate $$R^2$$ ### Answer: # Computation part of the answer : seBeta1Hat = sqrt ( MSres / Sxx ) ; guess = -30 ; testStat = ( beta1Hat - guess ) / seBeta1Hat ; tQuantile_9875 <- qt( 1 - 0.025/2 , 18) ; cat ("Part (a) Absolute value of test statistic is " , abs(testStat) , " Since it is greater than t_quantile=" , tQuantile_9875 , " we REJECT the hypothesis \n\n") ## Part (a) ## Absolute value of test statistic is 2.476056 Since it is greater than t_quantile= 2.445006 ## we REJECT the hypothesis seBeta0Hat = sqrt ( MSres * ( (1/N) + (xBar)^2 / Sxx ) ) ; cat ("Part (b) Lower and upper bounds of 97.5% CI for beta0 are : " , beta0Hat - tQuantile_9875 * seBeta0Hat ," and ", beta0Hat + tQuantile_9875 * seBeta0Hat , " Lower and upper bounds of 97.5% CI for beta1 are : " , beta1Hat - tQuantile_9875 * seBeta1Hat ," and ", beta1Hat + tQuantile_9875 * seBeta1Hat, "\n\n") ## Part (b) ## Lower and upper bounds of 97.5% CI for beta0 are : 2519.792 and 2735.852 ## Lower and upper bounds of 97.5% CI for beta1 are : -44.21747 and -30.08971 cat ("Part (c) Yes, t-test is equivalent to testing whether -30 is in the CI for beta1. We observe that -30 is out of the above confidence interval for beta1, thus hypothesis is rejected in part b \n\n") ## Part (c) ## Yes, t-test is equivalent to testing whether -30 is in the CI for beta1. We observe that -30 is out of the above confidence interval for beta1, thus hypothesis is rejected in part b yBar <- mean (shearS) SSt <- sum((shearS - yBar)^2); SSr <- sum((yHat - yBar)^2); cat ("Part (d) R-square is : " , SSr / SSt ) ## Part (d) ## R-square is : 0.9018414 ## Question-4 Load the propellant data. This time let us consider a relation between square of shear strength and propellant age. 1. Fit a simple linear regression model relating square of shear strength to age. Plot scatter diagram and fitted line. 2. Using analysis-of-variance test for significance of regression (using the formulas we discussed in class) 3. Use t-test and check significance of regression (using the formulas we discussed in class) 4. Does the regression analysis predict a linear relationship between square of shear strength and propellant age ? ### Answer: # Computation part of the answer : prop<-read.table("https://math.dartmouth.edu/~m50f17/propellant.csv", header=T, sep=",") shearS<-prop$ShearS
age<-prop$Age shear_SQ<- (prop$ShearS)^2
plot(age, shear_SQ, xlab = "Propellant Age (weeks)", ylab = "Shear S. Squared", main = "Rocket Propellant")
fitted_SQ <- lm(shear_SQ ~ age)
abline(fitted_SQ\$coef, lwd=2, col='blue')