predict.lm               package:stats               R Documentation

_P_r_e_d_i_c_t _m_e_t_h_o_d _f_o_r _L_i_n_e_a_r _M_o_d_e_l _F_i_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     Predicted values based on linear model object.

_U_s_a_g_e:

     ## S3 method for class 'lm':
     predict(object, newdata, se.fit = FALSE, scale = NULL, df = Inf, 
             interval = c("none", "confidence", "prediction"),
             level = 0.95, type = c("response", "terms"),
             terms = NULL, na.action = na.pass,
             pred.var = res.var/weights, weights = 1, ...)

_A_r_g_u_m_e_n_t_s:

  object: Object of class inheriting from '"lm"'

 newdata: An optional data frame in which to look for variables with
          which to predict.  If omitted, the fitted values are used.

  se.fit: A switch indicating if standard errors are required.

   scale: Scale parameter for std.err. calculation

      df: Degrees of freedom for scale

interval: Type of interval calculation.

   level: Tolerance/confidence level

    type: Type of prediction (response or model term).

   terms: If 'type="terms"', which terms (default is all terms)

na.action: function determining what should be done with missing values
          in 'newdata'.  The default is to predict 'NA'.

pred.var: the variance(s) for future observations to be assumed for
          prediction intervals.  See 'Details'.

 weights: variance weights for prediction.  This can be a numeric vector
          or a one-sided model formula.  In the latter case, it is
          interpreted as an expression evaluated in 'newdata'.

     ...: further arguments passed to or from other methods.

_D_e_t_a_i_l_s:

     'predict.lm' produces predicted values, obtained by evaluating the
     regression function in the frame 'newdata' (which defaults to
     'model.frame(object)').  If the logical 'se.fit' is 'TRUE',
     standard errors of the predictions are calculated.  If the numeric
     argument 'scale' is set (with optional 'df'), it is used as the
     residual standard deviation in the computation of the standard
     errors, otherwise this is extracted from the model fit.  Setting
     'interval' specifies computation of confidence or prediction
     (tolerance) intervals at the specified 'level', sometimes
     referred to as narrow vs. wide intervals.
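
     As a minimal sketch of the paragraph above, using simulated data
     (the seed and all variable names are illustrative assumptions), the
     following compares the two kinds of interval at the same 'level'
     and shows how a user-supplied 'scale' feeds into the standard
     errors:

```r
set.seed(1)                       # illustrative seed, for reproducibility only
x <- rnorm(15)
y <- x + rnorm(15)
fit <- lm(y ~ x)
new <- data.frame(x = c(-1, 0, 1))

conf <- predict(fit, new, interval = "confidence", level = 0.9)
pred <- predict(fit, new, interval = "prediction", level = 0.9)

# Interval widths at each row of 'new': confidence is the narrow one
conf_width <- conf[, "upr"] - conf[, "lwr"]
pred_width <- pred[, "upr"] - pred[, "lwr"]
all(conf_width < pred_width)

# With 'scale' set, it replaces the residual standard deviation,
# so doubling 'scale' doubles the standard errors
s1 <- predict(fit, new, se.fit = TRUE, scale = 1)
s2 <- predict(fit, new, se.fit = TRUE, scale = 2)
all.equal(s2$se.fit, 2 * s1$se.fit)
s2$residual.scale
```

     Both interval matrices share the same 'fit' column; only the
     limits differ.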

     If the fit is rank-deficient, some of the columns of the design
     matrix will have been dropped.  Prediction from such a fit only
     makes sense if 'newdata' is contained in the same subspace as the
     original data.  That cannot be checked accurately, so a warning is
     issued.
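
     The rank-deficient case can be sketched as follows, assuming
     simulated data in which 'x2' is an exact multiple of 'x1' (all
     names are illustrative).  When 'newdata' respects the same linear
     relation, the predictions agree with those from the model without
     the aliased column.  Whether a warning appears may depend on the
     version of R, so it is suppressed here:

```r
set.seed(2)                       # illustrative seed
d <- data.frame(x1 = rnorm(20))
d$x2 <- 2 * d$x1                  # exactly collinear with x1
d$y  <- 1 + d$x1 + rnorm(20)

full    <- lm(y ~ x1 + x2, data = d)   # rank-deficient: x2 is aliased
reduced <- lm(y ~ x1, data = d)
coef(full)                             # the coefficient of x2 is NA

# 'new' lies in the same subspace as the original data (x2 == 2 * x1)
new <- data.frame(x1 = c(-1, 0, 1))
new$x2 <- 2 * new$x1

p_full    <- suppressWarnings(predict(full, new))
p_reduced <- predict(reduced, new)
all.equal(as.numeric(p_full), as.numeric(p_reduced))
```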

     If 'newdata' is omitted the predictions are based on the data used
     for the fit.  In that case how cases with missing values in the
     original fit are handled is determined by the 'na.action' argument
     of that fit.  If 'na.action = na.omit' omitted cases will not
     appear in the predictions, whereas if 'na.action = na.exclude'
     they will appear (in predictions, standard errors or interval
     limits), with value 'NA'.  See also 'napredict'.
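
     A small sketch of the difference, using a made-up data frame with
     one missing predictor value and one missing response value:

```r
# Hypothetical data: rows 3 and 5 are incomplete
d <- data.frame(x = c(1, 2, NA, 4, 5, 6),
                y = c(1.1, 1.9, 3.2, 3.8, NA, 6.1))

fit_omit    <- lm(y ~ x, data = d, na.action = na.omit)
fit_exclude <- lm(y ~ x, data = d, na.action = na.exclude)

p_omit    <- predict(fit_omit)      # incomplete rows are dropped
p_exclude <- predict(fit_exclude)   # incomplete rows are padded with NA

length(p_omit)                      # number of complete cases
length(p_exclude)                   # number of rows in 'd'
which(is.na(p_exclude))             # positions of the incomplete rows
```

     The non-missing predictions are the same in both cases; only the
     padding differs.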

     The prediction intervals are for a single observation at each case
     in 'newdata' (or by default, the data used for the fit) with error
     variance(s) 'pred.var'. This can be a multiple of 'res.var', the
     estimated value of sigma^2: the default is to assume that future
     observations have the same error variance as those used for
     fitting. If 'weights' is supplied, the inverse of this is used as
     a scale factor.  For a weighted fit, if the prediction is for the
     original data frame, 'weights' defaults to the weights used for
     the model fit, with a warning since it might not be the intended
     result.  If the fit was weighted and 'newdata' is given, the
     default is to assume constant prediction variance, with a warning.
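
     To illustrate how 'pred.var' and 'weights' enter the prediction
     limits, the sketch below rebuilds the upper limit by hand from the
     components returned with 'se.fit = TRUE'.  The data, the seed and
     the weight vector 'w' are illustrative assumptions:

```r
set.seed(3)                       # illustrative seed
x <- rnorm(15)
y <- x + rnorm(15)
fit <- lm(y ~ x)
new <- data.frame(x = c(-1, 0, 1))
w   <- c(1, 2, 4)                 # hypothetical variance weights for 'new'

p <- predict(fit, new, interval = "prediction", se.fit = TRUE, weights = w)

# Variance of the predicted mean plus the assumed variance of a
# future observation, which defaults to res.var / weights
res.var  <- p$residual.scale^2
pred.var <- res.var / w
upr <- p$fit[, "fit"] + qt(0.975, p$df) * sqrt(p$se.fit^2 + pred.var)

all.equal(unname(upr), unname(p$fit[, "upr"]))
```

     Larger weights give a smaller assumed error variance and hence a
     narrower prediction interval at that case.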

_V_a_l_u_e:

     'predict.lm' produces a vector of predictions or a matrix of
     predictions and bounds with column names 'fit', 'lwr', and 'upr'
     if 'interval' is set.  If 'se.fit' is 'TRUE', a list with the
     following components is returned:  

     fit: vector or matrix as above

  se.fit: standard error of predicted means

residual.scale: residual standard deviations

      df: degrees of freedom for residual
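
     The three shapes of return value can be inspected as follows.
     This is a sketch using the 'cars' data set shipped with R; the
     object names are illustrative:

```r
fit <- lm(dist ~ speed, data = cars)
new <- data.frame(speed = c(10, 20))

v <- predict(fit, new)                                  # plain vector
m <- predict(fit, new, interval = "confidence")         # matrix
l <- predict(fit, new, interval = "confidence",
             se.fit = TRUE)                             # list

is.null(dim(v))      # TRUE: no dimensions, just a numeric vector
colnames(m)          # column names of the interval matrix
names(l)             # components of the list
all.equal(l$fit, m)  # the 'fit' component is the matrix from above
```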

_N_o_t_e:

     Variables are first looked for in 'newdata' and then searched for
     in the usual way (which will include the environment of the
     formula used in the fit).  If 'newdata' was supplied, a warning
     will be given if the variables found are not of the same length as
     those in 'newdata'.

     Notice that prediction variances and prediction intervals always
     refer to _future_ observations, possibly corresponding to the same
     predictors as used for the fit. The variance of the _residuals_
     will be smaller.

     Strictly speaking, the formula used for prediction limits assumes
     that the degrees of freedom for the fit are the same as those for
     the residual variance.  This may not be the case if 'res.var' is
     not obtained from the fit.

_S_e_e _A_l_s_o:

     The model fitting function 'lm', 'predict', 'SafePrediction'

_E_x_a_m_p_l_e_s:

     require(graphics)

     ## Predictions
     x <- rnorm(15)
     y <- x + rnorm(15)
     predict(lm(y ~ x))
     new <- data.frame(x = seq(-3, 3, 0.5))
     predict(lm(y ~ x), new, se.fit = TRUE)
     pred.w.plim <- predict(lm(y ~ x), new, interval="prediction")
     pred.w.clim <- predict(lm(y ~ x), new, interval="confidence")
     matplot(new$x,cbind(pred.w.clim, pred.w.plim[,-1]),
             lty=c(1,2,2,3,3), type="l", ylab="predicted y")

     ## Prediction intervals, special cases
     ##  The first three of these throw warnings
     w <- 1 + x^2
     fit <- lm(y ~ x)
     wfit <- lm(y ~ x, weights = w)
     predict(fit, interval = "prediction")
     predict(wfit, interval = "prediction")
     predict(wfit, new, interval = "prediction")
     predict(wfit, new, interval = "prediction", weights = (new$x)^2)
     predict(wfit, new, interval = "prediction", weights = ~x^2)