polr                  package:MASS                  R Documentation

_O_r_d_e_r_e_d _L_o_g_i_s_t_i_c _o_r _P_r_o_b_i_t _R_e_g_r_e_s_s_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Fits a logistic or probit regression model to an ordered factor
     response.  The default logistic case is _proportional odds
     logistic regression_, after which the function is named.

_U_s_a_g_e:

     polr(formula, data, weights, start, ..., subset, na.action,
          contrasts = NULL, Hess = FALSE, model = TRUE,
          method = c("logistic", "probit", "cloglog", "cauchit"))

_A_r_g_u_m_e_n_t_s:

 formula: a formula expression as for regression models, of the form
          'response ~ predictors'. The response should be a factor
          (preferably an ordered factor), which will be interpreted as
          an ordinal response, with levels ordered as in the factor.  
          The model must have an intercept: attempts to remove one will
          lead to a warning and be ignored.  An offset may be used. 
          See the documentation of 'formula' for other details. 

    data: an optional data frame in which to interpret the variables
          occurring in 'formula'. 

 weights: optional case weights in fitting.  Defaults to 1. 

   start: initial values for the parameters.  This is in the format
          'c(coefficients, zeta)': see the Value section. 

     ...: additional arguments to be passed to 'optim', most often a
          'control' argument. 

  subset: expression saying which subset of the rows of the data should
          be used in the fit.  All observations are included by
          default. 

na.action: a function to filter missing data. 

contrasts: a list of contrasts to be used for some or all of the
          factors appearing as variables in the model formula. 

    Hess: logical for whether the Hessian (the observed information
          matrix) should be returned.  Use this if you intend to call
          'summary' or 'vcov' on the fit. 

   model: logical for whether the model frame should be returned. 

  method: the link to use: one of '"logistic"' (the default),
          '"probit"', '"cloglog"' (complementary log-log) or
          '"cauchit"' (corresponding to a Cauchy latent variable). 
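
     As an illustration of how these arguments combine, the following
     is a minimal sketch using the 'housing' data from this package.
     The value 'maxit = 200' is an arbitrary choice made for the
     example, not a recommendation.

```r
library(MASS)  # provides polr() and the 'housing' data set

# Ask for the Hessian at fit time so that summary() and vcov() can
# reuse it, and forward a 'control' list to optim() through '...'.
fit <- polr(Sat ~ Infl + Type + Cont, data = housing, weights = Freq,
            Hess = TRUE, method = "logistic",
            control = list(maxit = 200))

# 'start' takes c(coefficients, zeta); here we restart from the
# solution found above, so the refit should barely move.
refit <- polr(Sat ~ Infl + Type + Cont, data = housing, weights = Freq,
              start = c(coef(fit), fit$zeta))

fit$convergence   # convergence code returned by optim()
vcov(fit)         # uses the stored Hessian
```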

_D_e_t_a_i_l_s:

     This model is what Agresti (2002) calls a _cumulative link_ model.
     The basic interpretation is as a _coarsened_ version of a latent
     variable Y_i which has a logistic or normal or extreme-value or
     Cauchy distribution with scale parameter one and a linear model
     for the mean.  The ordered factor which is observed is which bin
     Y_i falls into with breakpoints

             zeta_0 = -Inf < zeta_1 < ... < zeta_K = Inf

     This leads to the model

                  logit P(Y <= k | x) = zeta_k - eta

     with _logit_ replaced by _probit_ for a normal latent variable,
     and eta being the linear predictor, a linear function of the
     explanatory variables (with no intercept).  Note that it is quite
     common for other software to use the opposite sign for eta (and
     hence the coefficients 'beta').
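
     The display above can be checked numerically.  The sketch below
     rebuilds the cumulative probabilities from 'zeta' and the
     coefficients of a logistic fit and compares them with 'predict';
     it assumes the default treatment contrasts for the unordered
     factors in 'housing'.

```r
library(MASS)

fit <- polr(Sat ~ Infl + Type + Cont, data = housing, weights = Freq)

# Linear predictor eta = x' beta; the intercept column is dropped by
# selecting only the columns that match the fitted coefficients.
X   <- model.matrix(~ Infl + Type + Cont, data = housing)
eta <- drop(X[, names(coef(fit)), drop = FALSE] %*% coef(fit))

# Cumulative probabilities P(Y <= k | x) = plogis(zeta_k - eta).
cum <- plogis(outer(-eta, fit$zeta, "+"))   # n x (K - 1) matrix

# Category probabilities are successive differences of the cumulative ones.
probs <- cbind(cum, 1) - cbind(0, cum)
colnames(probs) <- fit$lev

# Should agree with predict() up to numerical error.
all.equal(unname(probs), unname(predict(fit, housing, type = "probs")))
```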

     In the logistic case, the left-hand side of the last display is
     the log odds of category k or less, and since these are log odds
     which differ only by a constant for different k, the odds are
     proportional.  Hence the term _proportional odds logistic
     regression_.
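
     A small numerical check of the proportionality, again as a
     sketch on the 'housing' data: for two covariate settings that
     differ only in 'Infl', the ratio of cumulative odds is the same
     for every k.  The data frame 'nd' is constructed for this
     example only.

```r
library(MASS)

fit <- polr(Sat ~ Infl + Type + Cont, data = housing, weights = Freq)

# Two covariate settings that differ only in 'Infl'.
nd <- data.frame(
  Infl = factor(c("Low", "High"), levels = levels(housing$Infl)),
  Type = factor("Tower", levels = levels(housing$Type)),
  Cont = factor("Low",   levels = levels(housing$Cont))
)

p    <- predict(fit, nd, type = "probs")        # 2 x K matrix
cum  <- t(apply(p, 1, cumsum))[, -ncol(p)]      # P(Y <= k | x), k < K
odds <- cum / (1 - cum)

# Ratio of cumulative odds, Low versus High influence, one entry per k.
ratio <- odds[1, ] / odds[2, ]
ratio

# With the 'zeta_k - eta' sign convention, each entry should equal
# exp(beta) for the 'InflHigh' coefficient.
exp(coef(fit)["InflHigh"])
```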

     In the complementary log-log case, we have a _proportional
     hazards_ model for grouped survival times.

     There are methods for the standard model-fitting functions,
     including 'predict', 'summary', 'vcov', 'anova', 'model.frame' and
     an 'extractAIC' method for use with 'stepAIC'.  There are also
     'profile' and 'confint' methods.

_V_a_l_u_e:

     An object of class '"polr"'.  This has components

coefficients: the coefficients of the linear predictor, which has no
          intercept.

    zeta: the intercepts for the class boundaries.

deviance: the residual deviance.

fitted.values: a matrix, with a column for each level of the response.

     lev: the names of the response levels.

   terms: the 'terms' structure describing the model.

df.residual: the number of residual degrees of freedom, calculated
          using the weights.

     edf: the (effective) number of degrees of freedom used by the
          model.

 n, nobs: the (effective) number of observations, calculated using the
          weights.  ('nobs' is for use by 'stepAIC'.)

    call: the matched call.

  method: the matched method used.

convergence: the convergence code returned by 'optim'.

   niter: the number of function and gradient evaluations used by
          'optim'.

 Hessian: the Hessian (the observed information matrix), present only
          if 'Hess' is true.

   model: (if 'model' is true).
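
     The components can be inspected directly on a fitted object.
     The following sketch refits the 'housing' model with
     'Hess = TRUE' so that the 'Hessian' component is present.

```r
library(MASS)

fit <- polr(Sat ~ Infl + Type + Cont, data = housing, weights = Freq,
            Hess = TRUE)

fit$coefficients         # slopes of the linear predictor (no intercept)
fit$zeta                 # increasing cut-points between adjacent levels
fit$lev                  # names of the response levels
dim(fit$fitted.values)   # one row per case, one column per level
fit$edf                  # here: number of slopes plus number of cut-points
fit$convergence          # convergence code returned by optim()
dim(fit$Hessian)         # present because Hess = TRUE
```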

_R_e_f_e_r_e_n_c_e_s:

     Agresti, A. (2002) _Categorical Data Analysis._ Second edition.
     Wiley.

     Venables, W. N. and Ripley, B. D. (2002) _Modern Applied
     Statistics with S._ Fourth edition.  Springer.

_S_e_e _A_l_s_o:

     'optim', 'glm', 'multinom'.

_E_x_a_m_p_l_e_s:

     options(contrasts = c("contr.treatment", "contr.poly"))
     house.plr <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
     house.plr
     summary(house.plr)
     ## slightly worse fit from
     summary(update(house.plr, method = "probit"))
     ## although it is not really appropriate, can fit
     summary(update(house.plr, method = "cloglog"))

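     ## predicted probabilities for each response level
     predict(house.plr, housing, type = "probs")
     ## test adding each two-way interaction in turn
     addterm(house.plr, ~.^2, test = "Chisq")
     ## stepwise selection by AIC, allowing two-way interactions
     house.plr2 <- stepAIC(house.plr, ~.^2)
     house.plr2$anova
     anova(house.plr, house.plr2)

     ## refit with the Hessian, then profile the likelihood
     house.plr <- update(house.plr, Hess = TRUE)
     pr <- profile(house.plr)
     confint(pr)
     plot(pr)
     pairs(pr)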