smooth.construct            package:mgcv            R Documentation

_C_o_n_s_t_r_u_c_t_o_r _f_u_n_c_t_i_o_n_s _f_o_r _s_m_o_o_t_h _t_e_r_m_s _i_n _a _G_A_M

_D_e_s_c_r_i_p_t_i_o_n:

     Smooth terms in a GAM formula are turned into smooth specification
     objects of  class 'xx.smooth.spec' during processing of the
     formula. Each of these objects is converted to a smooth object
     using an appropriate 'smooth.construct' function. New smooth
     classes  can be added by writing a new 'smooth.construct' method
     function and a corresponding  'Predict.matrix' method function
     (see example code below).

     In practice, 'smooth.construct' is usually called via
     'smooth.construct2' and the wrapper function 'smoothCon', in order
     to handle 'by' variables and centering constraints (see the
     'smoothCon' documentation if  you need to handle these things
     directly, for a user defined smooth class).

_U_s_a_g_e:

     smooth.construct(object,data,knots)
     smooth.construct2(object,data,knots)

_A_r_g_u_m_e_n_t_s:

  object: is a smooth specification object, generated by an 's' or 'te'
          term in a GAM  formula. Objects generated by 's' terms have
          class 'xx.smooth.spec' where 'xx' is given by the  'bs'
          argument of 's' (this convention allows the user to add their
          own smoothers).  If 'object' is not class
          'tensor.smooth.spec' it will have the following elements:


        _t_e_r_m The names of the covariates for this smooth, in an array.

        _b_s._d_i_m Argument 'k' of the 's' term generating the object. This
             is the dimension of the basis  used to represent the term
             (or, arguably, 1 greater than the basis dimension for 'cc'
             terms).  'bs.dim<0' indicates that the constructor should
             set this to the default value.

        _f_i_x_e_d 'TRUE' if the term is to be unpenalized, otherwise
             'FALSE'.

        _d_i_m the number covariates of which this smooth is a function.

        _p._o_r_d_e_r the order of the smoothness penalty or 'NA' for
             autoselection of this. This is argument  'm' of the 's'
             term that generated 'object'.

        _b_y the name of any 'by' variable to multiply this term as
             supplied as an argument to 's'.  '"NA"' if there is no
             such term.

        _l_a_b_e_l A suitable label for use with this term.

        _x_t An object containing information that may be needed for
             basis setup (used, e.g. by '"tp"' smooths to pass optional
             information on big dataset handling).

        _i_d Any identity associated with this term - used for linking
             bases and smoothing parameters. 'NULL' by default,
             indicating no linkage.

        _s_p Smoothing parameters for the term. Any negative are
             estimated, otherwise they are fixed at the supplied value.
             Unless 'NULL' (default), over-rides 'sp' argument to
             'gam'.

          If 'object' is of class 'tensor.smooth.spec' then it was
          generated by a 'te' term in the GAM formula,  and specifies a
          smooth of several variables with a basis generated as a
          tensor product of lower dimensional bases.  In this case the
          object will be different and will have the following
          elements:

        _m_a_r_g_i_n is a list of smooth specification objects of the type
             listed above, defining the bases which have  their tensor
             product formed in order to construct this term.

        _t_e_r_m is the array of names of the covariates that are arguments
             of the smooth.

        _b_y is the name of any 'by' variable, or '"NA"'.

        _f_x is an array, the elements of which indicate whether ('TRUE')
             any of the margins in the  tensor product should be
             unpenalized.

        _l_a_b_e_l A suitable label for use with this term.

        _d_i_m is the number of covariates of which this smooth is a
             function.

        _m_p 'TRUE' if multiple penalties are to be used.

        _n_p 'TRUE' if 1-D marginal smooths are to be re-parameterized in
             terms of function values.

        _i_d Any identity associated with this term - used for linking
             bases and smoothing parameters. 'NULL' by default,
             indicating no linkage.

        _s_p Smoothing parameters for the term. Any negative are
             estimated, otherwise they are fixed at the supplied value.
             Unless 'NULL' (default), over-rides 'sp' argument to
             'gam'.

    data: For 'smooth.construct' a data frame or list containing the
          evaluation of the elements of 'object$term', with names given
          by 'object$term'. The last entry will be the 'by' variable,
          if 'object$by' is not '"NA"'. For 'smooth.construct2' 'data'
          need only be an object within which 'object$term'  can be
          evaluated, the variables can be in any order, and there can
          be irrelevant variables present as well. 

   knots: an optional data frame or list containing the knots relating
          to 'object$term'.  If it is 'NULL' then the knot locations
          are generated automatically. The structure of 'knots' should
          be as for 'data', depending on whether 'smooth.construct' or
          'smooth.construct2' is used.

_D_e_t_a_i_l_s:

     There are built in methods for objects with the following classes:
     'tp.smooth.spec' (thin plate regression splines: see 'tprs'); 
     'ts.smooth.spec' (thin plate regression splines with
     shrinkage-to-zero); 'cr.smooth.spec' (cubic regression splines:
     see 'cubic.regression.spline'; 'cs.smooth.spec' (cubic regression
     splines with shrinkage-to-zero); 'cc.smooth.spec' (cyclic cubic
     regression splines); 'ps.smooth.spec' (Eilers and Marx (1986)
     style P-splines: see 'p.spline'); 'cp.smooth.spec' (cyclic
     P-splines) 'ad.smooth.spec' (adaptive smooths of 1 or 2 variables:
     see 'adaptive.smooth'); 'tensor.smooth.spec' (tensor product
     smooths).

     There is an implicit assumption that the basis only depends on the
     knots and/or the set of unique  covariate combinations; i.e. that
     the basis is the same whether generated from the full set of
     covariates, or just the unique combinations of covariates.

_V_a_l_u_e:

     The input argument 'object', assigned a new class to indicate what
     type of smooth it is and with at least the  following items added: 

       X: The model matrix from this term. This may have an '"offset"'
          attribute:  a vector of length 'nrow(X)' containing any
          contribution of the smooth to the model offset term. 'by'
          variables do not need to be dealt with here, but if they are
          then an item 'by.done' must be added to the 'object'.

       S: A list of positive semi-definite penalty matrices that apply
          to this term. The list will be empty  if the term is to be
          left un-penalized.

    rank: An array giving the ranks of the penalties.

null.space.dim: The dimension of the penalty null space (before
          centering).

       C: The matrix defining any constraints on the term. If this is
          'NULL' then 'smoothCon' will add an identifiability
          constraint that each term should sum to zero over the
          covariate values. Set to a zero row matrix if no constraints
          are required. Note that a supplied 'C' is never ignored, even
          if any 'by' variables of a smooth imply that no constraint is
          actually needed. 

no.rescale: if this is non-NULL then the penalty coefficient matrix of
          the smooth will not be  rescaled for enhaced numerical
          stability (rescaling is the default, because 'gamm' requires
          it).  Turning off rescaling is useful if the values of the
          smoothing parameters should be interpretable in a model,  for
          example because they are inverse variance components.  

      df: the degrees of freedom associated with this term (when
          unpenalized and unconstrained). If this is null then
          'smoothCon' will set it to the basis  dimension. 'smoothCon'
          will reduce this by the number of constraints.

       L: smooths may depend on fewer `underlying' smoothing parameters
          than there are elements of 'S'. In this case 'L' is the
          matrix mapping the vector of underlying log smoothing 
          parameters to the vector of logs of the smoothing parameters
          actually multiplying the 'S[[i]]'.  'L=NULL' signifies that
          there is one smoothing parameter per 'S[[i]]'. 


     Usually the returned object will also include extra information
     required to define the basis, and used by  'Predict.matrix'
     methods to make predictions using the basis. See the 'Details'
     section for links to the information included for the built in
     smooth classes. 

     'tensor.smooth' returned objects will additionally have each
     element of the 'margin' list updated in the same way.
     'tensor.smooths' also have a list, 'XP', containing
     re-parameterization matrices for any 1-D marginal terms
     re-parameterized in terms of function values. This list will have
     'NULL' entries for marginal smooths that are not re-parameterized,
     and is only long enough to reach the last re-parameterized
     marginal in the list.

_W_A_R_N_I_N_G:

     User defined smooth objects should avoid having attributes names
     '"qrc"' or '"nCons"' as these are used internally to provide
     constraint free parameterizations.

_A_u_t_h_o_r(_s):

     Simon N. Wood simon.wood@r-project.org

_R_e_f_e_r_e_n_c_e_s:

     Wood, S.N. (2003) Thin plate regression splines. J.R.Statist.Soc.B
     65(1):95-114

     Wood, S.N. (2006) Low rank scale invariant tensor product smooths
     for generalized additive mixed models. Biometrics 62(4):1025-1036

     The P-spline code given in the example is based on those advocated
     in:

     Ruppert, D., M.P. Wand and R.J. Carroll (2003) Semiparametric
     Regression. Cambridge  University Press.

     However if you want p-splines, rather than splines with derivative
     based penalties, then the built in "ps" class is probably a better
     bet. It's based on

     Eilers, P.H.C. and B.D. Marx (1996) Flexible Smoothing with
     B-splines and Penalties.  Statistical Science, 11(2):89-121

     <URL: http://www.maths.bath.ac.uk/~sw283/>

_S_e_e _A_l_s_o:

     's','get.var', 'gamm', 'gam', 'Predict.matrix', 'smoothCon',
     'PredictMat'

_E_x_a_m_p_l_e_s:

     # adding "p-spline" classes and methods

     smooth.construct.tr.smooth.spec<-function(object,data,knots)
     ## a truncated power spline constructor method function
     ## object$p.order = null space dimension
     { m <- object$p.order[1]
       if (is.na(m)) m <- 2 ## default 
       if (m<1) stop("silly m supplied")
       if (object$bs.dim<0) object$bs.dim <- 10 ## default
       nk<-object$bs.dim-m-1 ## number of knots
       if (nk<=0) stop("k too small for m")
       x <- data[[object$term]]  ## the data
       x.shift <- mean(x) # shift used to enhance stability
       k <- knots[[object$term]] ## will be NULL if none supplied
       if (is.null(k)) # space knots through data
       { n<-length(x)
         k<-quantile(x[2:(n-1)],seq(0,1,length=nk+2))[2:(nk+1)]
       }
       if (length(k)!=nk) # right number of knots?
       stop(paste("there should be ",nk," supplied knots"))
       x <- x - x.shift # basis stabilizing shift
       k <- k - x.shift # knots treated the same!
       X<-matrix(0,length(x),object$bs.dim)
       for (i in 1:(m+1)) X[,i] <- x^(i-1)
       for (i in 1:nk) X[,i+m+1]<-(x-k[i])^m*as.numeric(x>k[i])
       object$X<-X # the finished model matrix
       if (!object$fixed) # create the penalty matrix
       { object$S[[1]]<-diag(c(rep(0,m+1),rep(1,nk)))
       }
       object$rank<-nk  # penalty rank
       object$null.space.dim <- m+1  # dim. of unpenalized space
       ## store "tr" specific stuff ...
       object$knots<-k;object$m<-m;object$x.shift <- x.shift
      
       object$df<-ncol(object$X)     # maximum DoF (if unconstrained)
      
       class(object)<-"tr.smooth"  # Give object a class
       object
     }

     Predict.matrix.tr.smooth<-function(object,data)
     ## prediction method function for the `tr' smooth class
     { x <- data[[object$term]]
       x <- x - object$x.shift # stabilizing shift
       m <- object$m;     # spline order (3=cubic)
       k<-object$knots    # knot locations
       nk<-length(k)      # number of knots
       X<-matrix(0,length(x),object$bs.dim)
       for (i in 1:(m+1)) X[,i] <- x^(i-1)
       for (i in 1:nk) X[,i+m+1] <- (x-k[i])^m*as.numeric(x>k[i])
       X # return the prediction matrix
     }


     # an example, using the new class....
     set.seed(100)
     dat <- gamSim(1,n=400,scale=2)
     b<-gam(y~s(x0,bs="tr",m=2)+s(x1,bs="ps",m=c(1,3))+
              s(x2,bs="tr",m=3)+s(x3,bs="tr",m=2),data=dat)
     plot(b,pages=1)
     b<-gamm(y~s(x0,bs="tr",m=2)+s(x1,bs="ps",m=c(1,3))+
              s(x2,bs="tr",m=3)+s(x3,bs="tr",m=2),data=dat)
     plot(b$gam,pages=1)
     # another example using tensor products of the new class
     dat <- gamSim(2,n=400,scale=.1)$data
     b <- gam(y~te(x,z,bs=c("tr","tr"),m=c(2,2)),data=dat)
     vis.gam(b)