predict.gam {mgcv}    R Documentation

Prediction from fitted GAM model

Description

Takes a fitted gam object produced by gam() and produces predictions given a new set of values for the model covariates or the original values used for the model fit.

Usage

predict.gam(object,newdata,type="link",se.fit=FALSE,terms=NULL,
            block.size=1000,newdata.guaranteed=FALSE,...)

Arguments

object a fitted gam object as produced by gam().
newdata A data frame containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If newdata is provided then it should contain all the variables needed for prediction: a warning is generated if not.
type When this has the value "link" (default) the linear predictor (possibly with associated standard errors) is returned. When type="terms" each component of the linear predictor is returned separately (possibly with standard errors): this includes parametric model components, followed by each smooth component, but excludes any offset and any intercept. When type="response" predictions on the scale of the response are returned (possibly with approximate standard errors). When type="lpmatrix" a matrix is returned which yields the values of the linear predictor (minus any offset) when applied to the parameter vector (in this case se.fit is ignored). This last option is most useful for getting variance estimates for integrated quantities.
se.fit when this is TRUE (not default) standard error estimates are returned for each prediction.
terms if type=="terms" then only results for the terms given in this array will be returned.
block.size maximum number of predictions to process per call to underlying code: larger is quicker, but more memory intensive. Set to a value less than 1 to process all predictions in a single block.
newdata.guaranteed Set to TRUE to turn off all checking of newdata except for sanity of factor levels: this can speed things up for large prediction tasks, but newdata must be complete.
... other arguments.

Details

The standard errors produced by predict.gam are based on the Bayesian posterior covariance matrix of the parameters Vp in the fitted gam object.
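
As an illustration of the statement above, the following sketch recomputes the link-scale standard errors by hand from the "lpmatrix" and Vp, and compares them with the values returned when se.fit=TRUE. The simulated data and the names dat, newd and se.manual are assumptions made for this example only; they are not part of the package.

```r
library(mgcv)
set.seed(1)

## Simulated data (hypothetical, for illustration only)
n <- 200
dat <- data.frame(x0 = runif(n), x1 = runif(n))
dat$y <- 2 * sin(pi * dat$x0) + exp(2 * dat$x1) + rnorm(n)
b <- gam(y ~ s(x0) + s(x1), data = dat)

newd <- data.frame(x0 = (0:10) / 10, x1 = (0:10) / 10)

## Standard errors as returned by predict
p <- predict(b, newd, se.fit = TRUE)

## The same quantities by hand: the variance of row i of Xp %*% beta
## is the i-th diagonal element of Xp %*% Vp %*% t(Xp)
Xp <- predict(b, newd, type = "lpmatrix")
se.manual <- sqrt(rowSums((Xp %*% b$Vp) * Xp))

## Largest discrepancy between the two; expected to be at rounding-error level
max(abs(se.manual - as.numeric(p$se.fit)))
```

Computing only the diagonal via rowSums avoids forming the full nrow(newd) by nrow(newd) covariance matrix, which matters when predicting at many points.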

To facilitate plotting with termplot, if object possesses an attribute "para.only" and type=="terms" then only parametric terms of order 1 are returned (i.e. those that termplot can handle).

Note that, in common with other prediction functions, any offset supplied to gam as an argument is always ignored when predicting, unlike offsets specified in the gam model formula.
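
The difference between the two kinds of offset can be checked with a small sketch along the following lines. The data, and the names b.form, b.arg, diff.form and diff.arg, are hypothetical and introduced only for this illustration. Because the "lpmatrix" excludes any offset, the gap between the link-scale prediction and Xp %*% coef(b) shows how much offset was added at prediction time.

```r
library(mgcv)
set.seed(2)

## Simulated data (hypothetical, for illustration only)
n <- 200
dat <- data.frame(x0 = runif(n), x3 = runif(n))
dat$y <- 2 * sin(pi * dat$x0) + dat$x3 + rnorm(n, 0, 0.5)
newd <- data.frame(x0 = (0:10) / 10, x3 = (0:10) / 10)

## Offset in the model formula: evaluated on newdata and included
b.form <- gam(y ~ s(x0) + offset(x3), data = dat)
Xp.form <- predict(b.form, newd, type = "lpmatrix")
diff.form <- as.numeric(predict(b.form, newd)) -
             as.numeric(Xp.form %*% coef(b.form))

## Offset supplied as an argument: used in fitting, ignored in prediction
b.arg <- gam(y ~ s(x0), offset = dat$x3, data = dat)
Xp.arg <- predict(b.arg, newd, type = "lpmatrix")
diff.arg <- as.numeric(predict(b.arg, newd)) -
            as.numeric(Xp.arg %*% coef(b.arg))

range(diff.form - newd$x3) ## formula offset: difference equals newd$x3
range(diff.arg)            ## argument offset: nothing was added
```

If predictions that include an offset are required, specifying it in the formula is therefore the safer choice.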

Value

If type=="lpmatrix" then a matrix is returned which will give a vector of linear predictor values (minus any offset) at the supplied covariate values, when applied to the model coefficient vector. Otherwise, if se.fit is TRUE then a 2 item list is returned with items (both arrays) fit and se.fit containing predictions and associated standard error estimates, otherwise an array of predictions is returned. The dimensions of the returned arrays depend on whether type is "terms" or not: if it is then the array is 2 dimensional with each term in the linear predictor separate, otherwise the array is 1 dimensional and contains the linear predictor/predicted values (or corresponding s.e.s). The linear predictor returned termwise will not include the offset or the intercept.
newdata can be a data frame, list or model.frame: if it's a model frame then all variables must be supplied.
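
The shapes described above can be inspected with a short sketch such as the one below. The model and the object names (dat, newd, p.link, p.se, p.terms) are assumptions made for the illustration. The model has no offset, so the termwise predictions plus the intercept should reproduce the linear predictor.

```r
library(mgcv)
set.seed(3)

## Simulated data (hypothetical, for illustration only)
n <- 200
dat <- data.frame(x0 = runif(n), x1 = runif(n))
dat$y <- 2 * sin(pi * dat$x0) + exp(2 * dat$x1) + rnorm(n)
b <- gam(y ~ s(x0) + s(x1), data = dat)
newd <- data.frame(x0 = (0:10) / 10, x1 = (0:10) / 10)

p.link  <- predict(b, newd)                    ## one value per row of newd
p.se    <- predict(b, newd, se.fit = TRUE)     ## list with items fit and se.fit
p.terms <- predict(b, newd, type = "terms")    ## one column per model term
Xp      <- predict(b, newd, type = "lpmatrix") ## one column per coefficient

length(p.link)
names(p.se)
dim(p.terms)
dim(Xp)

## Termwise predictions exclude the intercept, so add it back
## before comparing with the linear predictor
lp.from.terms <- rowSums(p.terms) + coef(b)[1]
max(abs(lp.from.terms - as.numeric(p.link)))
```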

WARNING

Note that the behaviour of this function is not identical to predict.gam() in Splus.

Author(s)

Simon N. Wood simon.wood@r-project.org

The design is inspired by the S function of the same name described in Chambers and Hastie (1993) (but is not a clone).

References

Chambers and Hastie (1993) Statistical Models in S. Chapman & Hall.

Gu and Wahba (1991) Minimizing GCV/GML scores with multiple smoothing parameters via the Newton method. SIAM J. Sci. Statist. Comput. 12:383-398

Wood, S.N. (2000) Modelling and Smoothing Parameter Estimation with Multiple Quadratic Penalties. J.R.Statist.Soc.B 62(2):413-428

Wood, S.N. (2003) Thin plate regression splines. J.R.Statist.Soc.B 65(1):95-114

http://www.stats.gla.ac.uk/~simon/

See Also

gam, gamm, plot.gam

Examples

library(mgcv)
n<-200
sig <- 2
x0 <- runif(n, 0, 1)
x1 <- runif(n, 0, 1)
x2 <- runif(n, 0, 1)
x3 <- runif(n, 0, 1)
y <- 2 * sin(pi * x0)
y <- y + exp(2 * x1) 
y <- y + 0.2 * x2^11 * (10 * (1 - x2))^6 + 10 * (10 * x2)^3 * (1 - x2)^10
y <- y + x3
e <- rnorm(n, 0, sig)
y <- y + e
b<-gam(y~s(x0)+s(I(x1^2))+s(x2)+offset(x3))
rm(y,x0,x1,x2,x3)
newd <- data.frame(x0=(0:30)/30,x1=(0:30)/30,x2=(0:30)/30,x3=(0:30)/30)
pred <- predict.gam(b,newd)
## now get variance of sum of predictions using lpmatrix
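Xp <- predict(b,newd,type="lpmatrix")
## Xp %*% coef(b) yields the vector of predictions (minus the offset)
a <- rep(1,31)
Xs <- t(a) %*% Xp ## Xs %*% coef(b) gives the sum of the predictions
var.sum <- Xs %*% b$Vp %*% t(Xs) ## variance of that sum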
## Now get the variance of non-linear function of predictions
## by simulation from posterior distribution of the params
library(MASS)
br<-mvrnorm(1000,coef(b),b$Vp) ## 1000 replicate param. vectors
res <- rep(0,1000)
for (i in 1:1000)
{ pr <- Xp %*% br[i,] ## replicate predictions
  res[i] <- sum(log(abs(pr))) ## example non-linear function
}
mean(res);var(res)
## note: loop is replace-able by res <- colSums(log(abs(Xp %*% t(br))))

[Package mgcv version 1.2-3 Index]