**NOTE: For your homework download and use the template** (https://math.dartmouth.edu/~m50f17/HW5.Rmd)

**Read the green comments in the rmd file to see where your answers should go.**

You can use qqnorm and qqline functions to plot probability plots of residuals. The functions rstandard and rstudent calculate the standardized residuals and R-student residuals, respectively.

```
prop = read.table("https://math.dartmouth.edu/~m50f17/propellant.csv", header=T, sep=",")
age <- prop$Age
shearS <- prop$ShearS
fitted = lm(shearS ~ age)
stdRes = rstandard(fitted)
rStuRes = rstudent(fitted)
qqnorm(rStuRes, main="Normal Probability Plot (residuals on vertical axis)")
qqline(rStuRes)
```

However, note that the book uses residuals on the x-axis instead of y-axis. In order to obtain that use the parameter datax as shown below. In the below graph x-axis denotes the R-student residuals and the y-axis is the theoretical quantiles ( in the book y-axis is probability instead of quantiles).

```
qqnorm(rStuRes, datax = TRUE ,
main="Normal Probability Plot")
qqline(rStuRes, datax = TRUE )
```

```
yHat <- predict(fitted)
plot (yHat, rStuRes)
abline(0,0)
```

After observing that the observation points 5 and 6 look like potential outliers, next we delete those points and compare the fitted model of the deleted data with the full data.

```
plot(age, shearS, xlim=c(0,30), ylim=c(1600,2700))
abline(fitted$coef, lwd = 2, col = "blue")
ageRem <- age[-6]
ageRem <- ageRem[-5]
shearSRem <- shearS[-6]
shearSRem <- shearSRem[-5]
fitted2 = lm(shearSRem ~ ageRem)
abline(fitted2$coef, lwd = 2, col = "red")
```

There is a dedicated library :

MPV: Data Sets from Montgomery, Peck and Vining’s Book

in order to provide an easy way to load tables from the book. To install the library type :

install.packages(“MPV”)

Below is an example how to use this library. Check https://cran.r-project.org/web/packages/MPV/MPV.pdf for table names.

`library(MPV)`

```
##
## Attaching package: 'MPV'
```

```
## The following object is masked from 'package:datasets':
##
## stackloss
```

```
data(table.b1)
y <- table.b1$y
x1 <- table.b1$x1
x3 <- table.b1$x3
x8 <- table.b1$x8
y.lm <- lm(y ~ x1 + x3 + x8)
summary(y.lm)
```

```
##
## Call:
## lm(formula = y ~ x1 + x3 + x8)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.6835 -1.5609 -0.1448 1.7188 4.0386
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.136744 9.145612 1.327 0.19698
## x1 0.001123 0.001938 0.580 0.56762
## x3 0.159582 0.296318 0.539 0.59516
## x8 -0.006496 0.002088 -3.112 0.00475 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.42 on 24 degrees of freedom
## Multiple R-squared: 0.5702, Adjusted R-squared: 0.5164
## F-statistic: 10.61 on 3 and 24 DF, p-value: 0.0001244
```

Solve the parts (a), (b) and (c) of Problem 4.1. In addition answer the following.

- Is it possible to perform lack of fit test using the steps (4.20) to (4.24) ?

```
library("MPV")
fitted = lm(y~x8, data=table.b1)
cat("Part (a).
There are some minor deviations but in general it seems OK and we can say no problem with the normality assumptions.")
```

```
## Part (a).
## There are some minor deviations but in general it seems OK and we can say no problem with the normality assumptions.
```

```
rStuRes <- rstudent(fitted)
qqnorm(rStuRes, datax = TRUE, main = "Normal Probability Plot")
qqline(rStuRes, datax = TRUE )
```

```
cat("Part (b).
Plot seems in a band and it doesn't suggest any violations of our assumptions.")
```

```
## Part (b).
## Plot seems in a band and it doesn't suggest any violations of our assumptions.
```

```
yHat <- predict(fitted)
plot(yHat, rStuRes)
abline(0, 0)
```

```
cat("Part (c).
Plot shows a structure (trend in different parts) so we conclude that adding x2 might contribute to the model.")
```

```
## Part (c).
## Plot shows a structure (trend in different parts) so we conclude that adding x2 might contribute to the model.
```

```
plot(table.b1$x2, rStuRes)
abline(0, 0)
```

```
cat("Part (d).
The lack of fit test explained in steps (4.20) to (4.24) requires repeated observations, therefore not possible to do in this case.")
```

```
## Part (d).
## The lack of fit test explained in steps (4.20) to (4.24) requires repeated observations, therefore not possible to do in this case.
```

Chapter 4, Problem 2 all parts.

```
cat("Part (a).
There are some deviations from the normality in the first part. First half above and second half seems to be below the normal distribution line (qq-line).")
```

```
## Part (a).
## There are some deviations from the normality in the first part. First half above and second half seems to be below the normal distribution line (qq-line).
```

```
fitted <- lm(y ~ x2 + x7 + x8, data=table.b1)
rStuRes <- rstudent(fitted)
qqnorm(rStuRes, datax = TRUE, main = "Normal Probability Plot")
qqline(rStuRes, datax = TRUE)
```

```
cat("Part (b).
This also seems in a band with no apparent violations in our assumptions from this plot.")
```

```
## Part (b).
## This also seems in a band with no apparent violations in our assumptions from this plot.
```

```
yhat <- predict(fitted)
plot(yhat, rStuRes)
abline(0, 0)
```

```
cat("Part (c).
The plots of residuals of x7 shows non-constant variance (variance increasing in the first half). Plot against x2 has also some narrowing in the middle and suggests non-constant variance. On the other hand plot against x8 is a little better, although it shows some narrowing towards ends there are too few points (only two points in each end). ")
```

```
## Part (c).
## The plots of residuals of x7 shows non-constant variance (variance increasing in the first half). Plot against x2 has also some narrowing in the middle and suggests non-constant variance. On the other hand plot against x8 is a little better, although it shows some narrowing towards ends there are too few points (only two points in each end).
```

```
plot(table.b1$x2, rStuRes)
abline(0, 0)
```

```
plot(table.b1$x7, rStuRes)
abline(0, 0)
```