model.frame               package:stats               R Documentation

Extracting the "Environment" of a Model Formula

Description:

     'model.frame' (a generic function) and its methods return a
     'data.frame' with the variables needed to use 'formula' and any
     '...' arguments.

Usage:

     model.frame(formula, ...)

     ## Default S3 method:
     model.frame(formula, data = NULL, subset = NULL, na.action = na.fail,
                 drop.unused.levels = FALSE, xlev = NULL, ...)

     ## S3 method for class 'aovlist':
     model.frame(formula, data = NULL, ...)

     ## S3 method for class 'glm':
     model.frame(formula, ...)

     ## S3 method for class 'lm':
     model.frame(formula, ...)

     get_all_vars(formula, data, ...)

Arguments:

 formula: a model 'formula' or 'terms' object or an R object.

    data: a data.frame, list or environment (or object coercible by
          'as.data.frame' to a data.frame), containing the variables
          in 'formula'.  Neither a matrix nor an array will be
          accepted.

  subset: a specification of the rows to be used: defaults to all
          rows.  This can be any valid indexing vector (see
          '[.data.frame') for the rows of 'data' or if that is not
          supplied, a data frame made up of the variables used in
          'formula'.

na.action: how 'NA's are treated.  The default is first, any
          'na.action' attribute of 'data', second a 'na.action'
          setting of 'options', and third 'na.fail' if that is unset.
          The 'factory-fresh' default is 'na.omit'.  Another possible
          value is 'NULL'.

drop.unused.levels: should factors have unused levels dropped?
          Defaults to 'FALSE'.

    xlev: a named list of character vectors giving the full set of
          levels to be assumed for each factor.

     ...: further arguments such as 'data', 'na.action', 'subset'.
          Any additional arguments such as 'offset' and 'weights'
          which reach the default method are used to create further
          columns in the model frame, with parenthesised names such
          as '"(offset)"'.

Details:

     Exactly what happens depends on the class and attributes of the
     object 'formula'.
     If this is an object of fitted-model class such as '"lm"', the
     method will either return the saved model frame used when fitting
     the model (if any, often selected by argument 'model = TRUE') or
     pass the call used when fitting on to the default method.  The
     default method itself can cope with rather standard model objects
     such as those of class '"lqs"' from package 'MASS' if no other
     arguments are supplied.

     The rest of this section applies only to the default method.

     If either 'formula' or 'data' is already a model frame (a data
     frame with a '"terms"' attribute) and the other is missing, the
     model frame is returned.  Unless 'formula' is a terms object,
     'as.formula' and then 'terms' are called on it.  (If you wish to
     use the 'keep.order' argument of 'terms.formula', pass a terms
     object rather than a formula.)

     Row names for the model frame are taken from the 'data' argument
     if present, then from the names of the response in the formula
     (or rownames if it is a matrix), if there is one.

     All the variables in 'formula', 'subset' and in '...' are looked
     for first in 'data' and then in the environment of 'formula' (see
     the help for 'formula()' for further details) and collected into
     a data frame.  Then the 'subset' expression is evaluated, and it
     is used as a row index to the data frame.  Then the 'na.action'
     function is applied to the data frame (and may well add
     attributes).  The levels of any factors in the data frame are
     adjusted according to the 'drop.unused.levels' and 'xlev'
     arguments.

     Unless 'na.action = NULL', time-series attributes will be removed
     from the variables found (since they will be wrong if 'NA's are
     removed).

     Note that _all_ the variables in the formula are included in the
     data frame, even those preceded by '-'.

     Only variables whose type is raw, logical, integer, real, complex
     or character can be included in a model frame: this includes
     classed variables such as factors (whose underlying type is
     integer), but excludes lists.
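
     The steps described above for the default method (collect the
     variables, apply 'subset' as a row index, then apply 'na.action')
     can be illustrated with a minimal sketch.  The data frame 'd' and
     its columns are made up for illustration and are not part of this
     help page.

```r
## Hypothetical data with one missing response value.
d <- data.frame(y = c(1.2, 2.3, NA, 4.1, 5.0),
                x = c(1, 2, 3, 4, 5),
                z = c(10, 20, 30, 40, 50),
                w = c(1, 1, 2, 2, 3))

## 'subset' is evaluated first and used as a row index;
## 'na.action' is then applied to the rows that remain.
## 'weights' reaches the default method via '...'.
mf <- model.frame(y ~ x - z, data = d, subset = x > 1,
                  na.action = na.omit, weights = w)

names(mf)              # 'z' is included although it is preceded by '-'
nrow(mf)               # rows left after 'subset' and 'na.omit'
attr(mf, "na.action")  # records which row 'na.omit' dropped
```

     The extra 'weights' argument appears as a column with a
     parenthesised name, as noted under '...' in the arguments.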

     'get_all_vars' returns a 'data.frame' containing the variables
     used in 'formula' plus those specified in '...'.  Unlike
     'model.frame.default', it returns the input variables and not
     those resulting from function calls in 'formula'.

Value:

     A 'data.frame' containing the variables used in 'formula' plus
     those specified in '...'.  It will have additional attributes,
     including '"terms"' for an object of class '"terms"' derived from
     'formula', and possibly '"na.action"' giving information on the
     handling of 'NA's (which will not be present if no special
     handling was done, e.g. by 'na.pass').

References:

     Chambers, J. M. (1992) _Data for models._ Chapter 3 of
     _Statistical Models in S_ eds J. M. Chambers and T. J. Hastie,
     Wadsworth & Brooks/Cole.

See Also:

     'model.matrix' for the 'design matrix', 'formula' for formulas
     and 'expand.model.frame' for model.frame manipulation.

Examples:

     data.class(model.frame(dist ~ speed, data = cars))
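
     A further sketch, not part of the original examples, contrasting
     'model.frame' with 'get_all_vars' and showing the fitted-model
     case.  It uses the built-in 'cars' data set; the object names
     'mf', 'av' and 'fit' are chosen here for illustration only.

```r
## Model frame: columns are the evaluated formula variables.
mf <- model.frame(log(dist) ~ speed, data = cars)

## get_all_vars: columns are the untransformed input variables.
av <- get_all_vars(log(dist) ~ speed, data = cars)

names(mf)
names(av)

## The model frame carries a "terms" attribute derived from the formula.
class(attr(mf, "terms"))

## For a fitted "lm" object with a saved model frame and no further
## arguments, the method returns that saved frame.
fit <- lm(dist ~ speed, data = cars)
identical(model.frame(fit), fit$model)
```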