Acknowledgment: the materials below are partially based on Montgomery, D. C., Peck, E. A., and Vining, G. G., Introduction to Linear Regression Analysis (5th edition), Wiley Series in Probability and Statistics, 2012. These materials were initiated by Yichen Qin and modified by Tianhai Zu for teaching purposes.
Let us start with the general model building process.
As a quick review, we have discussed what model adequacy checking means.
Model Adequacy Checking:
Residual analysis, lack of fit testing, determining influential observations
Checks the fit of the model to the available data
On the other hand, let us define what model validation means.
Model Validation:
Determining if the model will behave or function as it was intended in the operating environment
More specifically, what is validation? How do we know that an estimated regression model is generalizable beyond the sample data used to fit it?
Some elementary tools for model validation:
Analysis of model coefficients and predicted values
Check for “inappropriate” signs on the coefficients
Check for unusual magnitudes on the coefficients
Check for stability in the coefficient estimates (one such check is sketched after this list)
Check the predicted values
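For example, one way to probe coefficient stability is to refit the model on random subsamples and compare the estimates. Below is a minimal sketch of such a check; the built-in mtcars data and the mpg ~ wt + hp model are illustrative stand-ins, not part of the examples in this section.
set.seed(1)
coefs <- replicate(20, {
  idx <- sample(nrow(mtcars), nrow(mtcars) %/% 2)   # random half-sample
  coef(lm(mpg ~ wt + hp, data = mtcars[idx, ]))
})
apply(coefs, 1, sd)   # a large spread (relative to the estimates) flags instability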
How do we formally validate a model?
Ideally, we would obtain new independent data with which to validate our model.
In practice, however, we can rarely obtain new independent data for this purpose.
An alternative is to partition the sample data into a training (or model-building) set, which we can use to develop the model, and a validation (or prediction) set, which is used to evaluate the predictive ability of the model.
This is called cross-validation.
In this section, we focus on the following cross-validation methods:
K-fold cross-validation
Leave-one-out cross-validation (LOOCV)
K-fold cross-validation partitions the sample data into K parts of (roughly) equal size.
For each part, we estimate the model of interest on the other K - 1 parts (the training sample) and test the model's predictive ability on the held-out part (the validation sample).
We then calculate the sum of squared prediction errors in each fold and combine the K estimates of prediction error to produce the K-fold cross-validation estimate.
Usually, K=5 or 10.
The idea of K-fold CV is summarized in the figure below.
[Figure: K-fold cross-validation]
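For concreteness, here is a minimal base-R sketch of K-fold CV for a linear model; the built-in mtcars data and the mpg ~ wt + hp formula are placeholders. The cv.lm() function from the DAAG package, used later in this section, automates the same idea.
set.seed(123)
K <- 5
folds <- sample(rep(1:K, length.out = nrow(mtcars)))      # random fold labels
sse <- numeric(K)
for (k in 1:K) {
  fit <- lm(mpg ~ wt + hp, data = mtcars[folds != k, ])   # train on the other K-1 folds
  pred <- predict(fit, newdata = mtcars[folds == k, ])    # predict the held-out fold
  sse[k] <- sum((mtcars$mpg[folds == k] - pred)^2)        # fold sum of squared errors
}
sum(sse) / nrow(mtcars)                                   # K-fold CV estimate of MSPE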
When K = n, this is called leave-one-out cross-validation (LOOCV).
That is, the model is fit n times, each time on all of the data except one observation, and a prediction is then made for that held-out observation.
The resulting estimate of prediction error has low bias, but the procedure is often computationally expensive.
Note that the leave-one-out cross-validation estimate of prediction error is identical to the PRESS statistic mentioned in Chapter 4; a quick numerical check is sketched below.
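As a sketch of this identity (using R's built-in cars data, not the delivery data): the brute-force LOOCV prediction errors match the deleted residuals \(e_i/(1-h_{ii})\), whose squares sum to PRESS.
fit <- lm(dist ~ speed, data = cars)
press_resid <- residuals(fit) / (1 - lm.influence(fit)$hat)   # deleted residuals
loocv_resid <- sapply(seq_len(nrow(cars)), function(i) {
  fit_i <- lm(dist ~ speed, data = cars[-i, ])                # refit without observation i
  cars$dist[i] - predict(fit_i, newdata = cars[i, ])          # predict it back
})
all.equal(unname(press_resid), unname(loocv_resid))           # TRUE, so LOOCV SSE = PRESS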
There are several ways to split data into training and testing sets:
If the time sequence is known, the data can be split by time order.
Given other characteristics of the data, the split can be based on groups (operator, machine, location, etc.).
Random selection (the most frequently used approach; a minimal sketch follows this list).
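A minimal sketch of a random 80/20 split (the proportion is an arbitrary choice) for a data frame d such as the delivery time data used below:
set.seed(123)
train_idx <- sample(nrow(d), size = floor(0.8 * nrow(d)))   # 80% of rows for training
d_train <- d[train_idx, ]    # training (model-building) set
d_valid <- d[-train_idx, ]   # validation (prediction) set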
Prediction R squared: \[R^2_{prediction}=1-\frac{\sum_{i=1}^{n} (y_i-\hat{y}_{(i)})^2}{\sum_{i=1}^{n} (y_i-\bar{y})^2}=1-\frac{PRESS}{SST}\]
Interpretation of prediction R squared: we expect the model to explain about \(100 \times R^2_{prediction}\%\) of the variability in predicting new observations.
Note that the traditional R squared is \[R^2 = 1 - \frac{SSRes}{SST}\] where \[SSRes=\sum_{i=1}^n (y_i-\hat{y}_{i})^2.\]
\(\hat{y}_{(i)}\) is the predicted value of \(y_i\) based on cross-validation (either K-fold or leave-one-out CV), i.e., computed from a model fit without using observation \(i\).
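Putting the formula into code, prediction R squared can be computed from any vector of cross-validated predictions. A minimal sketch, where y and y_cv are hypothetical vectors of observed values and CV predictions:
pred_r2 <- function(y, y_cv) {
  press <- sum((y - y_cv)^2)      # PRESS: sum of squared CV prediction errors
  sst <- sum((y - mean(y))^2)     # total sum of squares
  1 - press / sst
}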
Example: the delivery time data set. We now perform K-fold cross-validation.
#install DAAG from archived source
if(!is.element("DAAG", installed.packages()[,1])){
packageurl <- "https://cran.r-project.org/src/contrib/Archive/DAAG/DAAG_1.22.tar.gz"
install.packages("latticeExtra")
install.packages(packageurl, repos=NULL, type="source")
}
#load DAAG package
library(DAAG)
## Loading required package: lattice
d <- read.csv("data_delivery.csv",header=TRUE)
head(d)
## Index DeliveryTime NumberofCases Distance
## 1 1 16.68 7 560
## 2 2 11.50 3 220
## 3 3 12.03 3 340
## 4 4 14.88 4 80
## 5 5 13.75 6 150
## 6 6 18.11 7 330
n = dim(d)[1]
n
## [1] 25
model1 = lm(DeliveryTime ~ NumberofCases + Distance, data=d)
summary(model1)
##
## Call:
## lm(formula = DeliveryTime ~ NumberofCases + Distance, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.7880 -0.6629 0.4364 1.1566 7.4197
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.341231 1.096730 2.135 0.044170 *
## NumberofCases 1.615907 0.170735 9.464 3.25e-09 ***
## Distance 0.014385 0.003613 3.981 0.000631 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.259 on 22 degrees of freedom
## Multiple R-squared: 0.9596, Adjusted R-squared: 0.9559
## F-statistic: 261.2 on 2 and 22 DF, p-value: 4.687e-16
KCV=cv.lm(data=d, model1, m=3, seed=123)
## Analysis of Variance Table
##
## Response: DeliveryTime
## Df Sum Sq Mean Sq F value Pr(>F)
## NumberofCases 1 5382 5382 506.6 < 2e-16 ***
## Distance 1 168 168 15.8 0.00063 ***
## Residuals 22 234 11
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Warning in cv.lm(data = d, model1, m = 3, seed = 123):
##
## As there is >1 explanatory variable, cross-validation
## predicted values for a fold are not a linear function
## of corresponding overall predicted values. Lines that
## are shown for the different folds are approximate
##
## fold 1
## Observations in test set: 8
## 1 4 6 10 11 12 13 19
## Predicted 21.71 9.96 18.400 19.12 38.09 21.593 12.5 7.71
## cvpred 21.69 9.15 17.933 19.29 37.89 20.759 12.0 6.86
## DeliveryTime 16.68 14.88 18.110 21.50 40.33 21.000 13.5 9.50
## CV residual -5.01 5.73 0.177 2.21 2.44 0.241 1.5 2.64
##
## Sum of squares = 78.1 Mean square = 9.76 n = 8
##
## fold 2
## Observations in test set: 9
## 2 5 15 16 18 20 22 23 25
## Predicted 10.4 14.194 23.329 29.66 15.55 40.89 56.01 23.36 10.963
## cvpred 10.0 14.624 24.141 30.35 16.24 43.24 60.45 24.17 10.918
## DeliveryTime 11.5 13.750 24.000 29.00 19.00 35.10 52.32 18.75 10.750
## CV residual 1.5 -0.874 -0.141 -1.35 2.76 -8.14 -8.13 -5.42 -0.168
##
## Sum of squares = 174 Mean square = 19.4 n = 9
##
## fold 3
## Observations in test set: 8
## 3 7 8 9 14 17 21 24
## Predicted 12.080 7.16 16.673 71.8 18.68 14.914 20.51 24.40
## cvpred 12.685 8.89 17.442 64.6 18.43 15.831 21.28 23.17
## DeliveryTime 12.030 8.00 17.830 79.2 19.75 15.350 17.90 19.83
## CV residual -0.655 -0.89 0.388 14.6 1.32 -0.481 -3.38 -3.34
##
## Sum of squares = 240 Mean square = 29.9 n = 8
##
## Overall (Sum over all 8 folds)
## ms
## 19.7
# MSPE can be alternatively calculated as
sum((d$DeliveryTime-KCV$cvpred)^2)/n
## [1] 19.7
# PRESS can be calculated as
sum((d$DeliveryTime-KCV$cvpred)^2)
## [1] 492
# Prediction R squared can be calculated as
1-sum((d$DeliveryTime-KCV$cvpred)^2)/sum((d$DeliveryTime-mean(d$DeliveryTime))^2)
## [1] 0.915
summary(model1)$r.squared
## [1] 0.96
Let us take the 9th observation as an example. The actual observed delivery time is d$DeliveryTime[9] = 79.24. The fitted value is model1$fitted.values[9] = 71.82. The cross-validated prediction is KCV$cvpred[9] = 64.624.
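These three values can be pulled out directly:
d$DeliveryTime[9]         # actual observation
model1$fitted.values[9]   # fitted value from the full-data model
KCV$cvpred[9]             # cross-validated prediction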
We further perform leave-one-out cross-validation.
LOOCV=cv.lm(data=d, model1, m=n,seed=123)
## Analysis of Variance Table
##
## Response: DeliveryTime
## Df Sum Sq Mean Sq F value Pr(>F)
## NumberofCases 1 5382 5382 506.6 < 2e-16 ***
## Distance 1 168 168 15.8 0.00063 ***
## Residuals 22 234 11
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Warning in cv.lm(data = d, model1, m = n, seed = 123):
##
## As there is >1 explanatory variable, cross-validation
## predicted values for a fold are not a linear function
## of corresponding overall predicted values. Lines that
## are shown for the different folds are approximate
## Warning in plot.xy(xy, type, ...): unimplemented pch value '26'
##
## fold 1
## Observations in test set: 1
## 16
## Predicted 29.663
## cvpred 29.795
## DeliveryTime 29.000
## CV residual -0.795
##
## Sum of squares = 0.63 Mean square = 0.63 n = 1
##
## fold 2
## Observations in test set: 1
## 18
## Predicted 15.55
## cvpred 15.18
## DeliveryTime 19.00
## CV residual 3.82
##
## Sum of squares = 14.6 Mean square = 14.6 n = 1
##
## fold 3
## Observations in test set: 1
## 21
## Predicted 20.51
## cvpred 21.03
## DeliveryTime 17.90
## CV residual -3.13
##
## Sum of squares = 9.81 Mean square = 9.81 n = 1
##
## fold 4
## Observations in test set: 1
## 4
## Predicted 9.96
## cvpred 9.50
## DeliveryTime 14.88
## CV residual 5.38
##
## Sum of squares = 29 Mean square = 29 n = 1
##
## fold 5
## Observations in test set: 1
## 25
## Predicted 10.963
## cvpred 10.978
## DeliveryTime 10.750
## CV residual -0.228
##
## Sum of squares = 0.05 Mean square = 0.05 n = 1
##
## fold 6
## Observations in test set: 1
## 8
## Predicted 16.67
## cvpred 16.59
## DeliveryTime 17.83
## CV residual 1.24
##
## Sum of squares = 1.53 Mean square = 1.53 n = 1
##
## fold 7
## Observations in test set: 1
## 10
## Predicted 19.12
## cvpred 18.54
## DeliveryTime 21.50
## CV residual 2.96
##
## Sum of squares = 8.74 Mean square = 8.74 n = 1
##
## fold 8
## Observations in test set: 1
## 23
## Predicted 23.36
## cvpred 23.56
## DeliveryTime 18.75
## CV residual -4.81
##
## Sum of squares = 23.1 Mean square = 23.1 n = 1
##
## fold 9
## Observations in test set: 1
## 24
## Predicted 24.4
## cvpred 25.0
## DeliveryTime 19.8
## CV residual -5.2
##
## Sum of squares = 27 Mean square = 27 n = 1
##
## fold 10
## Observations in test set: 1
## 11
## Predicted 38.09
## cvpred 37.88
## DeliveryTime 40.33
## CV residual 2.45
##
## Sum of squares = 5.99 Mean square = 5.99 n = 1
##
## fold 11
## Observations in test set: 1
## 5
## Predicted 14.19
## cvpred 14.23
## DeliveryTime 13.75
## CV residual -0.48
##
## Sum of squares = 0.23 Mean square = 0.23 n = 1
##
## fold 12
## Observations in test set: 1
## 7
## Predicted 7.16
## cvpred 7.08
## DeliveryTime 8.00
## CV residual 0.92
##
## Sum of squares = 0.85 Mean square = 0.85 n = 1
##
## fold 13
## Observations in test set: 1
## 19
## Predicted 7.71
## cvpred 7.52
## DeliveryTime 9.50
## CV residual 1.98
##
## Sum of squares = 3.94 Mean square = 3.94 n = 1
##
## fold 14
## Observations in test set: 1
## 20
## Predicted 40.89
## cvpred 41.54
## DeliveryTime 35.10
## CV residual -6.44
##
## Sum of squares = 41.5 Mean square = 41.5 n = 1
##
## fold 15
## Observations in test set: 1
## 3
## Predicted 12.0798
## cvpred 12.0852
## DeliveryTime 12.0300
## CV residual -0.0552
##
## Sum of squares = 0 Mean square = 0 n = 1
##
## fold 16
## Observations in test set: 1
## 1
## Predicted 21.7
## cvpred 22.3
## DeliveryTime 16.7
## CV residual -5.6
##
## Sum of squares = 31.3 Mean square = 31.3 n = 1
##
## fold 17
## Observations in test set: 1
## 22
## Predicted 56.01
## cvpred 58.38
## DeliveryTime 52.32
## CV residual -6.06
##
## Sum of squares = 36.7 Mean square = 36.7 n = 1
##
## fold 18
## Observations in test set: 1
## 17
## Predicted 14.914
## cvpred 14.886
## DeliveryTime 15.350
## CV residual 0.464
##
## Sum of squares = 0.22 Mean square = 0.22 n = 1
##
## fold 19
## Observations in test set: 1
## 6
## Predicted 18.400
## cvpred 18.413
## DeliveryTime 18.110
## CV residual -0.303
##
## Sum of squares = 0.09 Mean square = 0.09 n = 1
##
## fold 20
## Observations in test set: 1
## 2
## Predicted 10.35
## cvpred 10.27
## DeliveryTime 11.50
## CV residual 1.23
##
## Sum of squares = 1.52 Mean square = 1.52 n = 1
##
## fold 21
## Observations in test set: 1
## 14
## Predicted 18.68
## cvpred 18.59
## DeliveryTime 19.75
## CV residual 1.16
##
## Sum of squares = 1.34 Mean square = 1.34 n = 1
##
## fold 22
## Observations in test set: 1
## 12
## Predicted 21.593
## cvpred 21.669
## DeliveryTime 21.000
## CV residual -0.669
##
## Sum of squares = 0.45 Mean square = 0.45 n = 1
##
## fold 23
## Observations in test set: 1
## 15
## Predicted 23.3
## cvpred 23.3
## DeliveryTime 24.0
## CV residual 0.7
##
## Sum of squares = 0.49 Mean square = 0.49 n = 1
##
## fold 24
## Observations in test set: 1
## 9
## Predicted 71.8
## cvpred 64.5
## DeliveryTime 79.2
## CV residual 14.8
##
## Sum of squares = 219 Mean square = 219 n = 1
##
## fold 25
## Observations in test set: 1
## 13
## Predicted 12.47
## cvpred 12.41
## DeliveryTime 13.50
## CV residual 1.09
##
## Sum of squares = 1.2 Mean square = 1.2 n = 1
##
## Overall (Sum over all 1 folds)
## ms
## 18.4
# MSPE can be alternatively calculated as
sum((d$DeliveryTime-LOOCV$cvpred)^2)/n
## [1] 18.4
# PRESS can be calculated as
sum((d$DeliveryTime-LOOCV$cvpred)^2)
## [1] 459
# Prediction R squared can be calculated as
1-sum((d$DeliveryTime-LOOCV$cvpred)^2)/sum((d$DeliveryTime-mean(d$DeliveryTime))^2)
## [1] 0.921
summary(model1)$r.squared
## [1] 0.96
Finally, here are three functions that quickly compute the PRESS, MSPE, and prediction R squared.
### This calculates the PRESS (predictive residual sum of squares); the lower, the better
#' @title PRESS
#' @author Thomas Hopper
#' @description Returns the PRESS statistic (predictive residual sum of squares).
#' Useful for evaluating predictive power of regression models.
#' @param linear.model A linear regression model (class 'lm'). Required.
PRESS <- function(linear.model) {
#' calculate the predictive (deleted) residuals, e_i / (1 - h_ii)
pr <- residuals(linear.model)/(1-lm.influence(linear.model)$hat)
#' calculate the PRESS
PRESS <- sum(pr^2)
return(PRESS)
}
### This calculates the MSPE (mean squared prediction error); the lower, the better
#' @title MSPE
#' @author Yichen Qin
#' @description Returns the MSPE statistic (mean square prediction error).
#' @param linear.model A linear regression model (class 'lm'). Required.
MSPE <- function(linear.model) {
#' calculate the MSPE = PRESS / sample size
return(PRESS(linear.model)/length(residuals(linear.model)))
}
### This calculates the prediction R-squared
#' @title Predictive R-squared
#' @author Thomas Hopper
#' @description returns the prediction r-squared. Requires the function PRESS(), which returns
#' the PRESS statistic.
#' @param linear.model A linear regression model (class 'lm'). Required.
pred_r_squared <- function(linear.model) {
#' Use anova() to get the sum of squares for the linear model
lm.anova <- anova(linear.model)
#' Calculate the total sum of squares
tss <- sum(lm.anova$'Sum Sq')
# Calculate the predictive R^2
pred.r.squared <- 1-PRESS(linear.model)/(tss)
return(pred.r.squared)
}
MSPE(model1)
## [1] 18.4
PRESS(model1)
## [1] 459
pred_r_squared(model1)
## [1] 0.921
summary(model1)$r.squared
## [1] 0.96
End of example
If we can obtain new data, we can use them to validate the model.
Example: the delivery time data. Suppose we use the original data set to build a model.
d <- read.csv("data_delivery.csv",header=TRUE)
head(d)
## Index DeliveryTime NumberofCases Distance
## 1 1 16.7 7 560
## 2 2 11.5 3 220
## 3 3 12.0 3 340
## 4 4 14.9 4 80
## 5 5 13.8 6 150
## 6 6 18.1 7 330
n = dim(d)[1]
n
## [1] 25
In this example, we have 25 observations in total in the original data set data_delivery.csv. We build a multiple linear regression model as follows.
model1 = lm(DeliveryTime ~ NumberofCases + Distance, data=d)
summary(model1)
##
## Call:
## lm(formula = DeliveryTime ~ NumberofCases + Distance, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.788 -0.663 0.436 1.157 7.420
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.34123 1.09673 2.13 0.04417 *
## NumberofCases 1.61591 0.17073 9.46 3.3e-09 ***
## Distance 0.01438 0.00361 3.98 0.00063 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.26 on 22 degrees of freedom
## Multiple R-squared: 0.96, Adjusted R-squared: 0.956
## F-statistic: 261 on 2 and 22 DF, p-value: 4.69e-16
It seems model1 works well for the observed data set data_delivery.csv. However, how do we validate model1? In other words, how will model1 perform on a newly collected data set? How well does model1 generalize beyond the observed data?
To validate the performance of model1, we can collect more observations. Suppose the newly collected observations are stored in data_delivery_additional.csv. Then we can validate model1 as follows.
dnew <- read.csv("data_delivery_additional.csv",header=TRUE)
We first make predictions for these new observations using the fitted model model1 and calculate the prediction errors.
prediction = predict(model1, dnew, interval = c("confidence"), level = 0.95, type="response")
head(prediction)
## fit lwr upr
## 1 50.9 47.9 53.9
## 2 21.1 19.2 23.1
## 3 30.8 27.5 34.0
## 4 17.6 15.3 20.0
## 5 26.4 21.0 31.9
## 6 15.3 13.7 16.8
prediction_error = dnew$DeliveryTime-prediction[,1]
head(prediction_error)
## 1 2 3 4 5 6
## 0.0905 -4.3327 -4.5914 2.2868 -2.4215 3.2767
# put the new observations, predictions, and prediction errors side by side
head(cbind(dnew,prediction,prediction_error))
## Index DeliveryTime NumberofCases Distance fit lwr upr prediction_error
## 1 26 51.0 22 905 50.9 47.9 53.9 0.0905
## 2 27 16.8 7 520 21.1 19.2 23.1 -4.3327
## 3 28 26.2 15 290 30.8 27.5 34.0 -4.5914
## 4 29 19.9 5 500 17.6 15.3 20.0 2.2868
## 5 30 24.0 6 1000 26.4 21.0 31.9 -2.4215
## 6 31 18.6 6 225 15.3 13.7 16.8 3.2767
The whole idea is summarized in the figure below.
End of example
Once the predictions for the new observations are obtained, we can use the following criteria to validate the model.
Mean squared prediction error (MSPE) for new observations: the smaller the better \[MSPE=\sum_{i=n+1}^{n+m} (y_i-\hat{y}_{i,predict})^2/m\]
Predicted residual error sum of squares (PRESS) for new observations: the smaller the better \[PRESS=\sum_{i=n+1}^{n+m} (y_i-\hat{y}_{i,predict})^2\]
Prediction R squared for new observations: the higher the better \[R^2_{prediction}=1-\frac{\sum_{i=n+1}^{n+m} (y_i-\hat{y}_{i,predict})^2}{\sum_{i=n+1}^{n+m} (y_i-\bar{y})^2}= 1- \frac{PRESS}{\sum_{i=n+1}^{n+m} (y_i-\bar{y})^2}\]
Note that \(n\) is the original sample size and \(m\) is the sample size of the new observations. Therefore, \(\hat{y}_{i,predict}\) for \(i=n+1,...,n+m\) are the predicted values of the new observations, and \(y_{i}\) for \(i=n+1,...,n+m\) are the observed values of the new observations.
Example: For the delivery time data set, we can calculate the mean squared prediction error (MSPE) as follows.
MSPE = sum( (dnew$DeliveryTime - prediction[,1])^2 ) / dim(dnew)[1]
MSPE
## [1] 22.1
We can calculate the predicted residual error sum of squares (PRESS) statistic as follows.
PRESS = sum((dnew$DeliveryTime-prediction[,1])^2)
PRESS
## [1] 332
We can calculate prediction R squared as follows.
pred_Rsq = 1-PRESS/sum((dnew$DeliveryTime-mean(dnew$DeliveryTime))^2)
pred_Rsq
## [1] 0.897
Once we have the prediction R squared pred_Rsq = 0.897, we can compare it with the traditional R squared summary(model1)$r.squared = 0.96. Once again, we see that the least-squares model model1 does not predict the new observations dnew as well as it fits the original data d. However, the loss in R squared for prediction is slight. Collecting new data has indicated that the least-squares fit for the delivery time data results in a reasonably good prediction equation.
In addition, we can compare PRESS with the residual sum of squares (SSRes) from model1. To obtain the SSRes of model1, we use the following code.
sum(model1$residuals^2)
## [1] 234
Clearly, SSRes = 233.732 is lower than PRESS = 331.832, which means model1 performs worse on the new observations dnew than on the original observations d, but this is to be expected.
End of example