xtabs {Matrix}    R Documentation

Cross Tabulation, Optionally Sparse

Description

Create a contingency table from cross-classifying factors, usually contained in a data frame, using a formula interface.

This is a fully compatible extension of the standard stats package xtabs() function with the added option to produce a sparse matrix result via sparse = TRUE.

Usage

xtabs(formula = ~., data = parent.frame(), subset, sparse = FALSE, na.action,
      exclude = c(NA, NaN), drop.unused.levels = FALSE)

Arguments

formula: a formula object with the cross-classifying variables (separated by +) on the right hand side (or an object which can be coerced to a formula). Interactions are not allowed. On the left hand side, one may optionally give a vector or a matrix of counts; in the latter case, the columns are interpreted as corresponding to the levels of a variable. This is useful if the data have already been tabulated, see the examples below.
data: an optional matrix or data frame (or similar: see model.frame) containing the variables in the formula 'formula'. By default the variables are taken from environment(formula).
subset: an optional vector specifying a subset of observations to be used.
sparse: logical specifying if the result should be a sparse matrix, i.e., inheriting from sparseMatrix. Only works for two factors (since there are no higher-order sparse array classes yet).
na.action: a function which indicates what should happen when the data contain NAs.
exclude: a vector of values to be excluded when forming the set of levels of the classifying factors.
drop.unused.levels: a logical indicating whether to drop unused levels in the classifying factors. If this is FALSE and there are unused levels, the table will contain zero marginals, and a subsequent chi-squared test for independence of the factors will not work.
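
As a minimal sketch of the effect of drop.unused.levels, the following compares the table dimensions with and without dropping an unused level. The data frame d and its columns f and g are made up for illustration and are not part of the package.

```r
## Hypothetical data: factor 'f' declares level "c" but never uses it.
d <- data.frame(f = factor(c("a", "b", "a"), levels = c("a", "b", "c")),
                g = factor(c("x", "y", "y")))

## Default: the unused level is kept, giving an all-zero row "c".
full <- xtabs(~ f + g, data = d)

## With drop.unused.levels = TRUE the empty row is removed.
dropped <- xtabs(~ f + g, data = d, drop.unused.levels = TRUE)

dim(full)
dim(dropped)
```

The all-zero row in the first table is the kind of zero marginal that makes a subsequent chi-squared test fail.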

Details

Non-sparse xtabs results have a summary method (the one for contingency table objects created by table or xtabs), which gives basic information and performs a chi-squared test for independence of the factors. Note that the function chisq.test currently only handles 2-d tables.

If a left hand side is given in formula, its entries are simply summed over the cells corresponding to the right hand side; this also works if the lhs does not give counts.
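
The summing of a left hand side can be illustrated with a small sketch. The data frame d, with a count column n, is hypothetical and chosen so that one cell receives two rows.

```r
## Hypothetical pre-tabulated data: 'n' holds a count for each (g, h) row.
d <- data.frame(g = factor(c("a", "a", "b", "b")),
                h = factor(c("x", "y", "x", "x")),
                n = c(2, 3, 4, 1))

## 'n' on the left hand side is summed within each (g, h) cell;
## the two ("b", "x") rows are added together into one cell.
tab <- xtabs(n ~ g + h, data = d)
tab
```

Cells with no matching rows, here ("b", "y"), are filled with zero.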

Value

By default, when sparse=FALSE, a contingency table in array representation of S3 class c("xtabs", "table"), with a "call" attribute storing the matched call.
When sparse=TRUE, a sparse numeric matrix, specifically an object of S4 class dgTMatrix.
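
A minimal sketch for inspecting the two kinds of result, assuming the Matrix package is installed. The data frame d is hypothetical. The check uses the sparseMatrix virtual class rather than a concrete class, on the assumption that the concrete sparse class may differ in other versions.

```r
library(Matrix)

## Hypothetical data with two cross-classifying factors.
d <- data.frame(r = factor(c("a", "b", "a")),
                c = factor(c("x", "y", "y")))

dense <- xtabs(~ r + c, data = d)                 # array representation
sp    <- xtabs(~ r + c, data = d, sparse = TRUE)  # sparse matrix

class(dense)              # S3 classes of the dense result
attr(dense, "call")       # the matched call stored with the table
is(sp, "sparseMatrix")    # the sparse result inherits from sparseMatrix

## Both results hold the same counts.
all(as.vector(as.matrix(sp)) == as.vector(dense))
```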

See Also

The stats package version xtabs and its references.

Examples

## For non-sparse examples, see:
example(xtabs, package = "stats")

## Similar to 'ergoStool' from package "nlme":
d.ergo <- data.frame(Type = paste("T", rep(1:4, 9*4), sep=""),
                     Subj = gl(9,4, 36*4))
xtabs(~ Type + Subj, data=d.ergo) # 4 replicates each
set.seed(15) # a subset of cases:
xtabs(~ Type + Subj, data=d.ergo[sample(36, 10),], sparse=TRUE)

## Hypothetical two-level setup:
inner <- factor(sample(letters[1:25], 100, replace = TRUE))
inout <- factor(sample(LETTERS[1:5], 25, replace = TRUE))
fr <- data.frame(inner = inner, outer = inout[as.integer(inner)])
xtabs(~ inner + outer, fr, sparse = TRUE)

[Package Matrix version 0.999375-29 Index]