smooth.construct package:mgcv R Documentation _C_o_n_s_t_r_u_c_t_o_r _f_u_n_c_t_i_o_n_s _f_o_r _s_m_o_o_t_h _t_e_r_m_s _i_n _a _G_A_M _D_e_s_c_r_i_p_t_i_o_n: Smooth terms in a GAM formula are turned into smooth specification objects of class 'xx.smooth.spec' during processing of the formula. Each of these objects is converted to a smooth object using an appropriate 'smooth.construct' function. New smooth classes can be added by writing a new 'smooth.construct' method function and a corresponding 'Predict.matrix' method function (see example code below). In practice, 'smooth.construct' is usually called via 'smooth.construct2' and the wrapper function 'smoothCon', in order to handle 'by' variables and centering constraints (see the 'smoothCon' documentation if you need to handle these things directly, for a user defined smooth class). _U_s_a_g_e: smooth.construct(object,data,knots) smooth.construct2(object,data,knots) _A_r_g_u_m_e_n_t_s: object: is a smooth specification object, generated by an 's' or 'te' term in a GAM formula. Objects generated by 's' terms have class 'xx.smooth.spec' where 'xx' is given by the 'bs' argument of 's' (this convention allows the user to add their own smoothers). If 'object' is not class 'tensor.smooth.spec' it will have the following elements: _t_e_r_m The names of the covariates for this smooth, in an array. _b_s._d_i_m Argument 'k' of the 's' term generating the object. This is the dimension of the basis used to represent the term (or, arguably, 1 greater than the basis dimension for 'cc' terms). 'bs.dim<0' indicates that the constructor should set this to the default value. _f_i_x_e_d 'TRUE' if the term is to be unpenalized, otherwise 'FALSE'. _d_i_m the number covariates of which this smooth is a function. _p._o_r_d_e_r the order of the smoothness penalty or 'NA' for autoselection of this. This is argument 'm' of the 's' term that generated 'object'. _b_y the name of any 'by' variable to multiply this term as supplied as an argument to 's'. '"NA"' if there is no such term. _l_a_b_e_l A suitable label for use with this term. _x_t An object containing information that may be needed for basis setup (used, e.g. by '"tp"' smooths to pass optional information on big dataset handling). _i_d Any identity associated with this term - used for linking bases and smoothing parameters. 'NULL' by default, indicating no linkage. _s_p Smoothing parameters for the term. Any negative are estimated, otherwise they are fixed at the supplied value. Unless 'NULL' (default), over-rides 'sp' argument to 'gam'. If 'object' is of class 'tensor.smooth.spec' then it was generated by a 'te' term in the GAM formula, and specifies a smooth of several variables with a basis generated as a tensor product of lower dimensional bases. In this case the object will be different and will have the following elements: _m_a_r_g_i_n is a list of smooth specification objects of the type listed above, defining the bases which have their tensor product formed in order to construct this term. _t_e_r_m is the array of names of the covariates that are arguments of the smooth. _b_y is the name of any 'by' variable, or '"NA"'. _f_x is an array, the elements of which indicate whether ('TRUE') any of the margins in the tensor product should be unpenalized. _l_a_b_e_l A suitable label for use with this term. _d_i_m is the number of covariates of which this smooth is a function. _m_p 'TRUE' if multiple penalties are to be used. _n_p 'TRUE' if 1-D marginal smooths are to be re-parameterized in terms of function values. _i_d Any identity associated with this term - used for linking bases and smoothing parameters. 'NULL' by default, indicating no linkage. _s_p Smoothing parameters for the term. Any negative are estimated, otherwise they are fixed at the supplied value. Unless 'NULL' (default), over-rides 'sp' argument to 'gam'. data: For 'smooth.construct' a data frame or list containing the evaluation of the elements of 'object$term', with names given by 'object$term'. The last entry will be the 'by' variable, if 'object$by' is not '"NA"'. For 'smooth.construct2' 'data' need only be an object within which 'object$term' can be evaluated, the variables can be in any order, and there can be irrelevant variables present as well. knots: an optional data frame or list containing the knots relating to 'object$term'. If it is 'NULL' then the knot locations are generated automatically. The structure of 'knots' should be as for 'data', depending on whether 'smooth.construct' or 'smooth.construct2' is used. _D_e_t_a_i_l_s: There are built in methods for objects with the following classes: 'tp.smooth.spec' (thin plate regression splines: see 'tprs'); 'ts.smooth.spec' (thin plate regression splines with shrinkage-to-zero); 'cr.smooth.spec' (cubic regression splines: see 'cubic.regression.spline'; 'cs.smooth.spec' (cubic regression splines with shrinkage-to-zero); 'cc.smooth.spec' (cyclic cubic regression splines); 'ps.smooth.spec' (Eilers and Marx (1986) style P-splines: see 'p.spline'); 'cp.smooth.spec' (cyclic P-splines) 'ad.smooth.spec' (adaptive smooths of 1 or 2 variables: see 'adaptive.smooth'); 'tensor.smooth.spec' (tensor product smooths). There is an implicit assumption that the basis only depends on the knots and/or the set of unique covariate combinations; i.e. that the basis is the same whether generated from the full set of covariates, or just the unique combinations of covariates. _V_a_l_u_e: The input argument 'object', assigned a new class to indicate what type of smooth it is and with at least the following items added: X: The model matrix from this term. This may have an '"offset"' attribute: a vector of length 'nrow(X)' containing any contribution of the smooth to the model offset term. 'by' variables do not need to be dealt with here, but if they are then an item 'by.done' must be added to the 'object'. S: A list of positive semi-definite penalty matrices that apply to this term. The list will be empty if the term is to be left un-penalized. rank: An array giving the ranks of the penalties. null.space.dim: The dimension of the penalty null space (before centering). C: The matrix defining any constraints on the term. If this is 'NULL' then 'smoothCon' will add an identifiability constraint that each term should sum to zero over the covariate values. Set to a zero row matrix if no constraints are required. Note that a supplied 'C' is never ignored, even if any 'by' variables of a smooth imply that no constraint is actually needed. no.rescale: if this is non-NULL then the penalty coefficient matrix of the smooth will not be rescaled for enhaced numerical stability (rescaling is the default, because 'gamm' requires it). Turning off rescaling is useful if the values of the smoothing parameters should be interpretable in a model, for example because they are inverse variance components. df: the degrees of freedom associated with this term (when unpenalized and unconstrained). If this is null then 'smoothCon' will set it to the basis dimension. 'smoothCon' will reduce this by the number of constraints. L: smooths may depend on fewer `underlying' smoothing parameters than there are elements of 'S'. In this case 'L' is the matrix mapping the vector of underlying log smoothing parameters to the vector of logs of the smoothing parameters actually multiplying the 'S[[i]]'. 'L=NULL' signifies that there is one smoothing parameter per 'S[[i]]'. Usually the returned object will also include extra information required to define the basis, and used by 'Predict.matrix' methods to make predictions using the basis. See the 'Details' section for links to the information included for the built in smooth classes. 'tensor.smooth' returned objects will additionally have each element of the 'margin' list updated in the same way. 'tensor.smooths' also have a list, 'XP', containing re-parameterization matrices for any 1-D marginal terms re-parameterized in terms of function values. This list will have 'NULL' entries for marginal smooths that are not re-parameterized, and is only long enough to reach the last re-parameterized marginal in the list. _W_A_R_N_I_N_G: User defined smooth objects should avoid having attributes names '"qrc"' or '"nCons"' as these are used internally to provide constraint free parameterizations. _A_u_t_h_o_r(_s): Simon N. Wood simon.wood@r-project.org _R_e_f_e_r_e_n_c_e_s: Wood, S.N. (2003) Thin plate regression splines. J.R.Statist.Soc.B 65(1):95-114 Wood, S.N. (2006) Low rank scale invariant tensor product smooths for generalized additive mixed models. Biometrics 62(4):1025-1036 The P-spline code given in the example is based on those advocated in: Ruppert, D., M.P. Wand and R.J. Carroll (2003) Semiparametric Regression. Cambridge University Press. However if you want p-splines, rather than splines with derivative based penalties, then the built in "ps" class is probably a better bet. It's based on Eilers, P.H.C. and B.D. Marx (1996) Flexible Smoothing with B-splines and Penalties. Statistical Science, 11(2):89-121 _S_e_e _A_l_s_o: 's','get.var', 'gamm', 'gam', 'Predict.matrix', 'smoothCon', 'PredictMat' _E_x_a_m_p_l_e_s: # adding "p-spline" classes and methods smooth.construct.tr.smooth.spec<-function(object,data,knots) ## a truncated power spline constructor method function ## object$p.order = null space dimension { m <- object$p.order[1] if (is.na(m)) m <- 2 ## default if (m<1) stop("silly m supplied") if (object$bs.dim<0) object$bs.dim <- 10 ## default nk<-object$bs.dim-m-1 ## number of knots if (nk<=0) stop("k too small for m") x <- data[[object$term]] ## the data x.shift <- mean(x) # shift used to enhance stability k <- knots[[object$term]] ## will be NULL if none supplied if (is.null(k)) # space knots through data { n<-length(x) k<-quantile(x[2:(n-1)],seq(0,1,length=nk+2))[2:(nk+1)] } if (length(k)!=nk) # right number of knots? stop(paste("there should be ",nk," supplied knots")) x <- x - x.shift # basis stabilizing shift k <- k - x.shift # knots treated the same! X<-matrix(0,length(x),object$bs.dim) for (i in 1:(m+1)) X[,i] <- x^(i-1) for (i in 1:nk) X[,i+m+1]<-(x-k[i])^m*as.numeric(x>k[i]) object$X<-X # the finished model matrix if (!object$fixed) # create the penalty matrix { object$S[[1]]<-diag(c(rep(0,m+1),rep(1,nk))) } object$rank<-nk # penalty rank object$null.space.dim <- m+1 # dim. of unpenalized space ## store "tr" specific stuff ... object$knots<-k;object$m<-m;object$x.shift <- x.shift object$df<-ncol(object$X) # maximum DoF (if unconstrained) class(object)<-"tr.smooth" # Give object a class object } Predict.matrix.tr.smooth<-function(object,data) ## prediction method function for the `tr' smooth class { x <- data[[object$term]] x <- x - object$x.shift # stabilizing shift m <- object$m; # spline order (3=cubic) k<-object$knots # knot locations nk<-length(k) # number of knots X<-matrix(0,length(x),object$bs.dim) for (i in 1:(m+1)) X[,i] <- x^(i-1) for (i in 1:nk) X[,i+m+1] <- (x-k[i])^m*as.numeric(x>k[i]) X # return the prediction matrix } # an example, using the new class.... set.seed(100) dat <- gamSim(1,n=400,scale=2) b<-gam(y~s(x0,bs="tr",m=2)+s(x1,bs="ps",m=c(1,3))+ s(x2,bs="tr",m=3)+s(x3,bs="tr",m=2),data=dat) plot(b,pages=1) b<-gamm(y~s(x0,bs="tr",m=2)+s(x1,bs="ps",m=c(1,3))+ s(x2,bs="tr",m=3)+s(x3,bs="tr",m=2),data=dat) plot(b$gam,pages=1) # another example using tensor products of the new class dat <- gamSim(2,n=400,scale=.1)$data b <- gam(y~te(x,z,bs=c("tr","tr"),m=c(2,2)),data=dat) vis.gam(b)