summary.gam {mgcv}R Documentation

Summary for a GAM fit

Description

Takes a fitted gam object produced by gam() and produces various useful summaries from it. (See sink to divert output to a file.)

Usage

## S3 method for class 'gam':
summary(object, dispersion=NULL, freq=FALSE, alpha=0, ...)

## S3 method for class 'summary.gam':
print(x,digits = max(3, getOption("digits") - 3), 
                  signif.stars = getOption("show.signif.stars"),...)

Arguments

object a fitted gam object as produced by gam().
x a summary.gam object produced by summary.gam().
dispersion A known dispersion parameter. NULL to use estimate or default (e.g. 1 for Poisson).
freq By default p-values for individual terms are calculated using the frequentist estimated covariance matrix of the parameter estimators. If this is set to FALSE then the Bayesian covariance matrix of the parameters is used instead. See details.
alpha adjustment to reference distribution per estimated smoothing parameter.
digits controls number of digits printed in output.
signif.stars Should significance stars be printed alongside output.
... other arguments.

Details

Model degrees of freedom are taken as the trace of the influence (or hat) matrix A for the model fit. Residual degrees of freedom are taken as number of data minus model degrees of freedom. Let P_i be the matrix giving the parameters of the ith smooth when applied to the data (or pseudodata in the generalized case) and let X be the design matrix of the model. Then tr(XP_i) is the edf for the ith term. Clearly this definition causes the edf's to add up properly!

print.summary.gam tries to print various bits of summary information useful for term selection in a pretty way.

If freq=TRUE then the frequentist approximation for p-values of smooth terms described in section 4.8.5 of Wood (2006) is used. The approximation is not great. If p_i is the parameter vector for the ith smooth term, and this term has estimated covariance matrix V_i then the statistic is p_i'V_i^{k-}p_i, where V_i^{k-} is the rank k pseudo-inverse of V_i, and k is estimated rank of V_i. p-values are obtained as follows. In the case of known dispersion parameter, they are obtained by comparing the chi.sq statistic to the chi-squared distribution with k degrees of freedom, where k is the estimated rank of V_i. If the dispersion parameter is unknown (in which case it will have been estimated) the statistic is compared to an F distribution with k upper d.f. and lower d.f. given by the residual degrees of freedom for the model. Typically the p-values will be somewhat too low.

If freq=FALSE then `Bayesian p-values' are returned for the smooth terms, based on a test statistic motivated by Nychka's (1988) analysis of the frequentist properties of Bayesian confidence intervals for smooths. These appear to have better frequentist performance (in terms of power and distribution under the null) than the alternative strictly frequentist approximation. Let f denote the vector of values of a smooth term evaluated at the original covariate values and let V_f denote the corresponding Bayesian covariance matrix. Let V*_f denote the rank r pseudoinverse of V_f, where r is the EDF for the term. The statistic used is then

T = f'V*_f f

(this can be calculated efficiently without forming the pseudoinverse explicitly). T is compared to a chi-squared distribution with degrees of freedom given by the EDF for the term + alpha * (number of estimated smoothing parameters), or T is used as a component in an F ratio statistic if the scale parameter has been estimated. There is some simulation evidence that the adjustment per smoothing parameter improves p-values when using GCV or AIC type smoothing parameter selection (alpha=0.5 seems to be best). The default alpha=0 seems to be appropriate when smoothing parameters are selected by REML/ML/PQL. The non-integer rank truncated inverse is constructed to give an approximation varying smoothly between the bounding integer rank approximations, while yielding test statistics with the correct mean and variance.

The definition of Bayesian p-value used is: the probability of observing an f less probable than bf 0, under the approximation for the posterior for f implied by the truncation used in the test statistic.

Note that the p-values distributional approximations start to break down below one effective degree of freedom, and p-values are not reported below 0.5 degrees of freedom.

Also note that increasing the gamma parameter of gam tends to mean that alpha should be decreased, to get close to uniform p-value distribution under the null.

For best p-value performance it seems to be best to use ML or REML smoothness selection.

Value

summary.gam produces a list of summary information for a fitted gam object.

p.coeff is an array of estimates of the strictly parametric model coefficients.
p.t is an array of the p.coeff's divided by their standard errors.
p.pv is an array of p-values for the null hypothesis that the corresponding parameter is zero. Calculated with reference to the t distribution with the estimated residual degrees of freedom for the model fit if the dispersion parameter has been estimated, and the standard normal if not.
m The number of smooth terms in the model.
chi.sq An array of test statistics for assessing the significance of model smooth terms. See details.
s.pv An array of approximate p-values for the null hypotheses that each smooth term is zero. Be warned, these are only approximate.
se array of standard error estimates for all parameter estimates.
r.sq The adjusted r-squared for the model. Defined as the proportion of variance explained, where original variance and residual variance are both estimated using unbiased estimators. This quantity can be negative if your model is worse than a one parameter constant model, and can be higher for the smaller of two nested models! Note that proportion null deviance explained is probably more appropriate for non-normal errors.
dev.expl The proportion of the null deviance explained by the model.
edf array of estimated degrees of freedom for the model terms.
residual.df estimated residual degrees of freedom.
n number of data.
gcv minimized GCV score for the model, if GCV used.
ubre minimized UBRE score for the model, if UBRE used.
scale estimated (or given) scale parameter.
family the family used.
formula the original GAM formula.
dispersion the scale parameter.
pTerms.df the degrees of freedom associated with each parameteric term (excluding the constant).
pTerms.chi.sq a Wald statistic for testing the null hypothesis that the each parametric term is zero.
pTerms.pv p-values associated with the tests that each term is zero. For penalized fits these are approximate. The reference distribution is an appropriate chi-squared when the scale parameter is known, and is based on an F when it is not.
cov.unscaled The estimated covariance matrix of the parameters (or estimators if freq=TRUE), divided by scale parameter.
cov.scaled The estimated covariance matrix of the parameters (estimators if freq=TRUE).
p.table significance table for parameters
s.table significance table for smooths
p.Terms significance table for parametric model terms

WARNING

The p-values are approximate: do read the details section.

Author(s)

Simon N. Wood simon.wood@r-project.org with substantial improvements by Henric Nilsson.

References

Nychka (1988) Bayesian Confidence Intervals for Smoothing Splines. Journal of the American Statistical Association 83:1134-1143.

Wood S.N. (2006) Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC Press.

See Also

gam, predict.gam, gam.check, anova.gam

Examples

library(mgcv)
set.seed(0)
dat <- gamSim(1,n=200,scale=2) ## simulate data

b <- gam(y~s(x0)+s(x1)+s(x2)+s(x3),data=dat)
plot(b,pages=1)
summary(b)

## now check the p-values by using a pure regression spline.....
b.d <- round(summary(b)$edf)+1 ## get edf per smooth
b.d <- pmax(b.d,3) # can't have basis dimension less than 3!
bc<-gam(y~s(x0,k=b.d[1],fx=TRUE)+s(x1,k=b.d[2],fx=TRUE)+
        s(x2,k=b.d[3],fx=TRUE)+s(x3,k=b.d[4],fx=TRUE),data=dat)
plot(bc,pages=1)
summary(bc)

## p-value check - increase k to make this useful!
k<-20;n <- 200;p <- rep(NA,k)
for (i in 1:k)
{ b<-gam(y~te(x,z),data=data.frame(y=rnorm(n),x=runif(n),z=runif(n)),
         method="ML")
  p[i]<-summary(b)$s.p[1]
}
plot(((1:k)-0.5)/k,sort(p))
abline(0,1,col=2)
ks.test(p,"punif") ## how close to uniform are the p-values?


[Package mgcv version 1.5-5 Index]