model.matrix              package:stats              R Documentation

_C_o_n_s_t_r_u_c_t _D_e_s_i_g_n _M_a_t_r_i_c_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     'model.matrix' creates a design matrix.

_U_s_a_g_e:

     model.matrix(object, ...)

     ## Default S3 method:
     model.matrix(object, data = environment(object),
                  contrasts.arg = NULL, xlev = NULL, ...)

_A_r_g_u_m_e_n_t_s:

  object: an object of an appropriate class.  For the default method, a
          model formula or terms object.

    data: a data frame created with 'model.frame'.  If another sort of
          object, 'model.frame' is called first.

contrasts.arg: A list, whose entries are contrasts suitable for input
          to the 'contrasts' replacement function and whose names are
          the names of columns of 'data' containing 'factor's.

    xlev: to be used as argument of 'model.frame' if 'data' has no
          '"terms"' attribute.

     ...: further arguments passed to or from other methods.

_D_e_t_a_i_l_s:

     'model.matrix' creates a design matrix from the description given
     in 'terms(object)', using the data in 'data', which must contain
     variables with the same names as would be created by a call to
     'model.frame(object)' or, more precisely, by evaluating
     'attr(terms(object), "variables")'.  If it is a data frame, there
     may be other columns and the order of columns is not important.
     Any character variables are coerced to factors, with a warning.
     After coercion, all the variables used on the right-hand side of
     the formula must be logical, integer, numeric or factor.
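
     A minimal sketch of the paragraph above, using a made-up data
     frame 'd' (its column names are illustrative only) and assuming
     the default treatment contrasts.  The character column 'g' is
     coded as a factor and the unused column is ignored:

```r
# Hypothetical data: 'g' is character, 'unused' does not appear in the formula.
d <- data.frame(y = c(1.2, 2.3, 3.1, 4.8),
                g = c("a", "b", "a", "b"),
                unused = 4:1,
                stringsAsFactors = FALSE)
mm <- model.matrix(y ~ g, d)
colnames(mm)  # intercept plus one contrast column for 'g'
dim(mm)       # one row per observation; 'unused' contributes no column
```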

     If 'contrasts.arg' is specified for a factor it overrides the
     default factor coding for that variable and any '"contrasts"'
     attribute set by 'C' or 'contrasts'.
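
     The override can be seen in a small sketch.  The data frame 'dd'
     and the choice of Helmert and sum contrasts are illustrative
     assumptions, not part of the original text:

```r
dd <- data.frame(a = gl(3, 2))
contrasts(dd$a) <- "contr.helmert"   # sets the "contrasts" attribute on the factor

m.attr <- model.matrix(~ a, dd)      # uses the attribute: Helmert coding
m.arg  <- model.matrix(~ a, dd,      # contrasts.arg takes precedence: sum coding
                       contrasts.arg = list(a = "contr.sum"))

m.attr[, "a1"]
m.arg[, "a1"]
```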

     In an interaction term, the variable whose levels vary fastest is
     the first one to appear in the formula (and not in the term), so
     in '~ a + b + b:a' the interaction will have 'a' varying fastest.
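
     As an illustration, assuming a hypothetical 3 x 3 layout 'dd' and
     the default treatment contrasts, the interaction column names show
     which factor varies fastest:

```r
dd <- data.frame(a = gl(3, 1, 9), b = gl(3, 3))  # all 9 combinations of a and b

m.ab <- model.matrix(~ a + b + b:a, dd)  # 'a' appears first in the formula
m.ba <- model.matrix(~ b + a + a:b, dd)  # 'b' appears first in the formula

colnames(m.ab)[6:9]  # interaction columns, levels of 'a' varying fastest
colnames(m.ba)[6:9]  # interaction columns, levels of 'b' varying fastest
```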

     By convention, if the response variable also appears on the
     right-hand side of the formula it is dropped (with a warning),
     although interactions involving the term are retained.
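
     A short sketch of this convention, with made-up data 'd'.  The
     warning is silenced here only to keep the output short:

```r
d <- data.frame(y = c(1, 3, 2, 5), x = c(0, 1, 2, 3))
# 'y' is the response and also a right-hand-side term; that term is dropped.
mm <- suppressWarnings(model.matrix(y ~ y + x, d))
colnames(mm)  # no column for 'y'
```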

_V_a_l_u_e:

     The design matrix for a regression model with the specified
     formula and data.

     There is an attribute '"assign"', an integer vector with an entry
     for each column in the matrix giving the term in the formula which
     gave rise to the column.  Value '0' corresponds to the intercept
     (if any), and positive values to terms in the order given by the
     'term.labels' attribute of the 'terms' structure corresponding to
     'object'.
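
     For example, with the balanced two-way layout used in the Examples
     and the default contrasts (an assumption of this sketch), the
     '"assign"' attribute maps each column back to its term:

```r
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))
mm <- model.matrix(~ a + b, dd)

asgn <- attr(mm, "assign")
asgn  # 0 for the intercept, 1 for columns from 'a', 2 for columns from 'b'

# Look up the term label for each non-intercept column.
labs <- attr(terms(~ a + b), "term.labels")
labs[asgn[asgn > 0]]
```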

     If there are any factors in terms in the model, there is an
     attribute '"contrasts"', a named list with an entry for each
     factor.  This specifies the contrasts that would be used in terms
     in which the factor is coded by contrasts (in some terms dummy
     coding may be used), either as a character vector naming a
     function or as a numeric matrix.
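
     A small sketch of the '"contrasts"' attribute, assuming the
     default 'options("contrasts")' and an illustrative one-factor data
     frame.  Without an intercept the factor is dummy coded, yet the
     attribute still names the contrast function:

```r
dd <- data.frame(a = gl(3, 2))

m1 <- model.matrix(~ a, dd)      # 'a' coded by contrasts: 2 columns
m0 <- model.matrix(~ 0 + a, dd)  # no intercept: 'a' dummy coded, 3 columns

attr(m1, "contrasts")  # named list with an entry for 'a'
attr(m0, "contrasts")  # same entry, although dummy coding was used
colnames(m1)
colnames(m0)
```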

_R_e_f_e_r_e_n_c_e_s:

     Chambers, J. M. (1992) _Data for models._ Chapter 3 of
     _Statistical Models in S_ eds J. M. Chambers and T. J. Hastie,
     Wadsworth & Brooks/Cole.

_S_e_e _A_l_s_o:

     'model.frame', 'model.extract', 'terms'

_E_x_a_m_p_l_e_s:

     ff <- log(Volume) ~ log(Height) + log(Girth)
     utils::str(m <- model.frame(ff, trees))
     mat <- model.matrix(ff, m)

     dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12)) # balanced 2-way
     options("contrasts") # the default contrasts in use
     model.matrix(~ a + b, dd)
     ## the argument is spelled out in full rather than partially matched
     model.matrix(~ a + b, dd, contrasts.arg = list(a = "contr.sum"))
     model.matrix(~ a + b, dd,
                  contrasts.arg = list(a = "contr.sum", b = "contr.poly"))
     m.orth <- model.matrix(~ a + b, dd,
                            contrasts.arg = list(a = "contr.helmert"))
     crossprod(m.orth) # m.orth is ALMOST orthogonal