ppr {stats}  R Documentation 
Fit a projection pursuit regression model.
ppr(x, ...) ## S3 method for class 'formula': ppr(formula, data, weights, subset, na.action, contrasts = NULL, ..., model = FALSE) ## Default S3 method: ppr(x, y, weights = rep(1,n), ww = rep(1,q), nterms, max.terms = nterms, optlevel = 2, sm.method = c("supsmu", "spline", "gcvspline"), bass = 0, span = 0, df = 5, gcvpen = 1, ...)
formula 
a formula specifying one or more numeric response variables and the explanatory variables. 
x 
numeric matrix of explanatory variables. Rows represent observations, and columns represent variables. Missing values are not accepted. 
y 
numeric matrix of response variables. Rows represent observations, and columns represent variables. Missing values are not accepted. 
nterms 
number of terms to include in the final model. 
data 
data frame from which variables specified in formula are
preferentially to be taken.

weights 
a vector of weights w_i for each case. 
ww 
a vector of weights for each response, so the fit criterion is
the sum over case i and responses j of
w_i ww_j (y_ij  fit_ij)^2 divided by the sum of w_i .

subset 
an index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.) 
na.action 
a function to specify the action to be taken if NA s are
found. The default action is given by getOption("na.action") .
(NOTE: If given, this argument must be named.)

contrasts 
the contrasts to be used when any factor explanatory variables are coded. 
max.terms 
maximum number of terms to choose from when building the model. 
optlevel 
integer from 0 to 3 which determines the thoroughness of an optimization routine in the SMART program. See the Details section. 
sm.method 
the method used for smoothing the ridge functions. The default is to
use Friedman's super smoother supsmu . The alternatives are to use
the smoothing spline code underlying smooth.spline , either with a
specified (equivalent) degrees of freedom for each ridge functions, or
to allow the smoothness to be chosen by GCV.

bass 
super smoother bass tone control used with automatic span selection
(see supsmu ); the range of values is 0 to 10, with larger values
resulting in increased smoothing.

span 
super smoother span control (see supsmu ). The default, 0 ,
results in automatic span selection by local cross validation. span
can also take a value in (0, 1] .

df 
if sm.method is "spline" specifies the smoothness of
each ridge term via the requested equivalent degrees of freedom.

gcvpen 
if sm.method is "gcvspline" this is the penalty used
in the GCV selection for each degree of freedom used.

... 
arguments to be passed to or from other methods. 
model 
logical. If true, the model frame is returned. 
The basic method is given by Friedman (1984), and is essentially the
same code used by SPLUS's ppreg
. This code is extremely
sensitive to the compiler used.
The algorithm first adds up to max.terms
ridge terms one at a
time; it will use less if it is unable to find a term to add that makes
sufficient difference. It then removes the least
“important” term at each step until nterm
terms
are left.
The levels of optimization (argument optlevel
)
differ in how thoroughly the models are refitted during this process.
At level 0 the existing ridge terms are not refitted. At level 1
the projection directions are not refitted, but the ridge
functions and the regression coefficients are.
Levels 2 and 3 refit all the terms and are equivalent for one
response; level 3 is more careful to rebalance the contributions
from each regressor at each step and so is a little less likely to
converge to a saddle point of the sum of squares criterion.
A list with the following components, many of which are for use by the method functions.
call 
the matched call 
p 
the number of explanatory variables (after any coding) 
q 
the number of response variables 
mu 
the argument nterms 
ml 
the argument max.terms 
gof 
the overall residual (weighted) sum of squares for the selected model 
gofn 
the overall residual (weighted) sum of squares against the
number of terms, up to max.terms . Will be invalid (and zero)
for less than nterms . 
df 
the argument df 
edf 
if sm.method is "spline" or "gcvspline"
the equivalent number of degrees of freedom for each ridge term used. 
xnames 
the names of the explanatory variables 
ynames 
the names of the response variables 
alpha 
a matrix of the projection directions, with a column for each ridge term 
beta 
a matrix of the coefficients applied for each response to the ridge terms: the rows are the responses and the columns the ridge terms 
yb 
the weighted means of each response 
ys 
the overall scale factor used: internally the responses are
divided by ys to have unit total weighted sum of squares. 
fitted.values 
the fitted values, as a matrix if q > 1 . 
residuals 
the residuals, as a matrix if q > 1 . 
smod 
internal work array, which includes the ridge functions evaluated at the training set points. 
model 
(only if model=TRUE ) the model frame. 
Friedman, J. H. and Stuetzle, W. (1981) Projection pursuit regression. Journal of the American Statistical Association, 76, 817–823.
Friedman, J. H. (1984) SMART User's Guide. Laboratory for Computational Statistics, Stanford University Technical Report No. 1.
Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Springer.
plot.ppr
, supsmu
, smooth.spline
# Note: your numerical values may differ attach(rock) area1 < area/10000; peri1 < peri/10000 rock.ppr < ppr(log(perm) ~ area1 + peri1 + shape, data = rock, nterms = 2, max.terms = 5) rock.ppr # Call: # ppr.formula(formula = log(perm) ~ area1 + peri1 + shape, data = rock, # nterms = 2, max.terms = 5) # # Goodness of fit: # 2 terms 3 terms 4 terms 5 terms # 8.737806 5.289517 4.745799 4.490378 summary(rock.ppr) # ..... (same as above) # ..... # # Projection direction vectors: # term 1 term 2 # area1 0.34357179 0.37071027 # peri1 0.93781471 0.61923542 # shape 0.04961846 0.69218595 # # Coefficients of ridge terms: # term 1 term 2 # 1.6079271 0.5460971 par(mfrow=c(3,2))# maybe: , pty="s") plot(rock.ppr, main="ppr(log(perm)~ ., nterms=2, max.terms=5)") plot(update(rock.ppr, bass=5), main = "update(..., bass = 5)") plot(update(rock.ppr, sm.method="gcv", gcvpen=2), main = "update(..., sm.method=\"gcv\", gcvpen=2)") cbind(perm=rock$perm, prediction=round(exp(predict(rock.ppr)), 1)) detach()