smooth.construct {mgcv} | R Documentation |
Smooth terms in a GAM formula are turned into smooth specification objects of
class xx.smooth.spec
during processing of the formula. Each of these objects is
converted to a smooth object using an appropriate smooth.construct
function. New smooth classes
can be added by writing a new smooth.construct
method function and a corresponding
Predict.matrix
method function (see example code below).
In practice, smooth.construct
is usually called via smooth.construct2
and the wrapper
function smoothCon
, in order to handle by
variables and
centering constraints (see the smoothCon
documentation if
you need to handle these things directly, for a user defined smooth class).
smooth.construct(object,data,knots) smooth.construct2(object,data,knots)
object |
is a smooth specification object, generated by an s or te term in a GAM
formula. Objects generated by s terms have class xx.smooth.spec where xx is given by the
bs argument of s (this convention allows the user to add their own smoothers).
If object is not class tensor.smooth.spec it will have the following elements:
If object is of class tensor.smooth.spec then it was generated by a te term in the GAM formula,
and specifies a smooth of several variables with a basis generated as a tensor product of lower dimensional bases.
In this case the object will be different and will have the following elements:
|
data |
For smooth.construct a data frame or list containing the evaluation of the elements of object$term ,
with names given by object$term . The last entry will be the by variable, if object$by
is not "NA" . For smooth.construct2 data need only be an object within which object$term
can be evaluated, the variables can be in any order, and there can be irrelevant variables present as well. |
knots |
an optional data frame or list containing the knots relating to object$term .
If it is NULL then the knot locations are generated automatically. The structure of knots should
be as for data , depending on whether smooth.construct or smooth.construct2 is used. |
There are built in methods for objects with the following classes:
tp.smooth.spec
(thin plate regression splines: see tprs
);
ts.smooth.spec
(thin plate regression splines with shrinkage-to-zero);
cr.smooth.spec
(cubic regression splines: see cubic.regression.spline
;
cs.smooth.spec
(cubic regression splines with shrinkage-to-zero);
cc.smooth.spec
(cyclic cubic regression splines);
ps.smooth.spec
(Eilers and Marx (1986) style P-splines: see p.spline
);
cp.smooth.spec
(cyclic P-splines)
ad.smooth.spec
(adaptive smooths of 1 or 2 variables: see adaptive.smooth
);
tensor.smooth.spec
(tensor product smooths).
There is an implicit assumption that the basis only depends on the knots and/or the set of unique covariate combinations; i.e. that the basis is the same whether generated from the full set of covariates, or just the unique combinations of covariates.
The input argument object
, assigned a new class to indicate what type of smooth it is and with at least the
following items added:
X |
The model matrix from this term. This may have an "offset"
attribute: a vector of length nrow(X) containing any contribution of
the smooth to the model offset term. by variables do not need to be
dealt with here, but if they are then an item by.done must be added to
the object . |
S |
A list of positive semi-definite penalty matrices that apply to this term. The list will be empty if the term is to be left un-penalized. |
rank |
An array giving the ranks of the penalties. |
null.space.dim |
The dimension of the penalty null space (before centering). |
C |
The matrix defining any constraints on the term. If this is NULL then
smoothCon will add an identifiability constraint that each term should
sum to zero over the covariate values. Set to a zero row matrix if no
constraints are required. Note that a supplied C is never ignored, even if any
by variables of a smooth imply that no constraint is actually needed. |
no.rescale |
if this is non-NULL then the penalty coefficient matrix of the smooth will not be
rescaled for enhaced numerical stability (rescaling is the default, because gamm requires it).
Turning off rescaling is useful if the values of the smoothing parameters should be interpretable in a model,
for example because they are inverse variance components. |
df |
the degrees of freedom associated with this term (when
unpenalized and unconstrained). If this is null then smoothCon will set it to the basis
dimension. smoothCon will reduce this by the number of constraints. |
L |
smooths may depend on fewer `underlying' smoothing parameters than there are elements of
S . In this case L is the matrix mapping the vector of underlying log smoothing
parameters to the vector of logs of the smoothing parameters actually multiplying the S[[i]] .
L=NULL signifies that there is one smoothing parameter per S[[i]] . |
Usually the returned object will also include extra information required to define the basis, and used by
Predict.matrix
methods to make predictions using the basis. See
the Details
section for links to the information included for the built in smooth classes.
tensor.smooth
returned objects will additionally have each element of
the margin
list updated in the same way. tensor.smooths
also
have a list, XP
, containing re-parameterization matrices for any 1-D marginal terms
re-parameterized in terms of function values. This list will have NULL
entries for marginal smooths that are not re-parameterized, and is only long
enough to reach the last re-parameterized marginal in the list.
User defined smooth objects should avoid having attributes names
"qrc"
or "nCons"
as these are used internally to provide
constraint free parameterizations.
Simon N. Wood simon.wood@r-project.org
Wood, S.N. (2003) Thin plate regression splines. J.R.Statist.Soc.B 65(1):95-114
Wood, S.N. (2006) Low rank scale invariant tensor product smooths for generalized additive mixed models. Biometrics 62(4):1025-1036
The P-spline code given in the example is based on those advocated in:
Ruppert, D., M.P. Wand and R.J. Carroll (2003) Semiparametric Regression. Cambridge University Press.
However if you want p-splines, rather than splines with derivative based penalties, then the built in "ps" class is probably a better bet. It's based on
Eilers, P.H.C. and B.D. Marx (1996) Flexible Smoothing with B-splines and Penalties. Statistical Science, 11(2):89-121
http://www.maths.bath.ac.uk/~sw283/
s
,get.var
, gamm
, gam
,
Predict.matrix
,
smoothCon
, PredictMat
# adding "p-spline" classes and methods smooth.construct.tr.smooth.spec<-function(object,data,knots) ## a truncated power spline constructor method function ## object$p.order = null space dimension { m <- object$p.order[1] if (is.na(m)) m <- 2 ## default if (m<1) stop("silly m supplied") if (object$bs.dim<0) object$bs.dim <- 10 ## default nk<-object$bs.dim-m-1 ## number of knots if (nk<=0) stop("k too small for m") x <- data[[object$term]] ## the data x.shift <- mean(x) # shift used to enhance stability k <- knots[[object$term]] ## will be NULL if none supplied if (is.null(k)) # space knots through data { n<-length(x) k<-quantile(x[2:(n-1)],seq(0,1,length=nk+2))[2:(nk+1)] } if (length(k)!=nk) # right number of knots? stop(paste("there should be ",nk," supplied knots")) x <- x - x.shift # basis stabilizing shift k <- k - x.shift # knots treated the same! X<-matrix(0,length(x),object$bs.dim) for (i in 1:(m+1)) X[,i] <- x^(i-1) for (i in 1:nk) X[,i+m+1]<-(x-k[i])^m*as.numeric(x>k[i]) object$X<-X # the finished model matrix if (!object$fixed) # create the penalty matrix { object$S[[1]]<-diag(c(rep(0,m+1),rep(1,nk))) } object$rank<-nk # penalty rank object$null.space.dim <- m+1 # dim. of unpenalized space ## store "tr" specific stuff ... object$knots<-k;object$m<-m;object$x.shift <- x.shift object$df<-ncol(object$X) # maximum DoF (if unconstrained) class(object)<-"tr.smooth" # Give object a class object } Predict.matrix.tr.smooth<-function(object,data) ## prediction method function for the `tr' smooth class { x <- data[[object$term]] x <- x - object$x.shift # stabilizing shift m <- object$m; # spline order (3=cubic) k<-object$knots # knot locations nk<-length(k) # number of knots X<-matrix(0,length(x),object$bs.dim) for (i in 1:(m+1)) X[,i] <- x^(i-1) for (i in 1:nk) X[,i+m+1] <- (x-k[i])^m*as.numeric(x>k[i]) X # return the prediction matrix } # an example, using the new class.... set.seed(100) dat <- gamSim(1,n=400,scale=2) b<-gam(y~s(x0,bs="tr",m=2)+s(x1,bs="ps",m=c(1,3))+ s(x2,bs="tr",m=3)+s(x3,bs="tr",m=2),data=dat) plot(b,pages=1) b<-gamm(y~s(x0,bs="tr",m=2)+s(x1,bs="ps",m=c(1,3))+ s(x2,bs="tr",m=3)+s(x3,bs="tr",m=2),data=dat) plot(b$gam,pages=1) # another example using tensor products of the new class dat <- gamSim(2,n=400,scale=.1)$data b <- gam(y~te(x,z,bs=c("tr","tr"),m=c(2,2)),data=dat) vis.gam(b)