voom {limma} | R Documentation |
Transform count data to log2-counts per million (logCPM), estimate the mean-variance relationship and use this to compute appropriate observation-level weights. The data are then ready for linear modelling.
voom(counts, design = NULL, lib.size = NULL, normalize.method = "none", span = 0.5, plot = FALSE, save.plot = FALSE, ...)
counts |
a numeric |
design |
design matrix with rows corresponding to samples and columns to coefficients to be estimated. Defaults to the unit vector meaning that samples are treated as replicates. |
lib.size |
numeric vector containing total library sizes for each sample.
Defaults to the normalized (effective) library sizes in |
normalize.method |
the microarray-style normalization method to be applied to the logCPM values (if any).
Choices are as for the |
span |
width of the lowess smoothing window as a proportion. |
plot |
logical, should a plot of the mean-variance trend be displayed? |
save.plot |
logical, should the coordinates and line of the plot be saved in the output? |
... |
other arguments are passed to |
This function is intended to process RNA-Seq or ChIP-Seq data prior to linear modelling in limma.
voom is an acronym for mean-variance modelling at the observational level.
The idea is to estimate the mean-variance relationship in the data, then use this to compute an appropriate precision weight for each observation.
Count data always show marked mean-variance relationships.
Raw counts show increasing variance with increasing count size, while log-counts typically show a decreasing mean-variance trend.
This function estimates the mean-variance trend for log-counts, then assigns a weight to each observation based on its predicted variance.
The weights are then used in the linear modelling process to adjust for heteroscedasticity.
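As a rough illustration of how precision weights adjust a fit, the sketch below compares weighted and unweighted least squares for a single gene in base R. The logCPM values and weights are made up for illustration; in practice the weights come from voom and are picked up by lmFit from the EList object.

```r
# Hypothetical logCPM values for one gene in six samples (two groups of three).
y <- c(5.1, 4.8, 5.3, 6.9, 7.2, 6.0)
group <- factor(rep(c("A", "B"), each = 3))
X <- model.matrix(~ group)

# Hypothetical precision weights: the last observation is treated as
# much less precise (larger predicted variance) than the others.
w <- c(4, 4, 4, 4, 4, 0.5)

# Weighted least squares down-weights the imprecise observation ...
fit.weighted <- lm.wfit(X, y, w)

# ... whereas ordinary least squares treats all observations equally.
fit.unweighted <- lm.fit(X, y)

print(fit.weighted$coefficients)
print(fit.unweighted$coefficients)
```

The estimated group difference from the weighted fit is pulled less strongly towards the low-precision observation than the unweighted estimate is.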
voom performs the following specific calculations.
First, the counts are converted to logCPM values, adding 0.5 to all the counts to avoid taking the logarithm of zero.
The matrix of logCPM values is then optionally normalized.
The lmFit function is used to fit row-wise linear models.
The lowess function is then used to fit a trend to the square-root residual standard deviations as a function of average logCPM.
The trend line is then used to predict the variance of each logCPM value as a function of its fitted value, and the inverse variances become the estimated precision weights.
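The calculations described above can be sketched in base R as follows. This is a simplified illustration, not the limma source: it uses simulated counts, a replicate (intercept-only) design, and lm.fit in place of lmFit. It also assumes, following the published method of Law et al (2014), that 1 is added to the library sizes and that the trend is evaluated on the log2-count scale, which is the average logCPM shifted by a constant.

```r
set.seed(42)
ngenes <- 500
nsamples <- 6

# Simulated counts (hypothetical data): each gene has its own mean.
mu <- rep(2^runif(ngenes, 2, 10), times = nsamples)
counts <- matrix(rnbinom(ngenes * nsamples, mu = mu, size = 10),
                 nrow = ngenes, ncol = nsamples)
lib.size <- colSums(counts)

# Step 1: logCPM, adding 0.5 to the counts to avoid log of zero.
y <- t(log2(t(counts + 0.5) / (lib.size + 1) * 1e6))

# Step 2: row-wise linear models. All samples are treated as replicates.
design <- matrix(1, nsamples, 1)
fit <- lm.fit(design, t(y))              # one response column per gene
df.residual <- nsamples - ncol(design)
sigma <- sqrt(colSums(fit$residuals^2) / df.residual)

# Step 3: lowess trend of sqrt(residual sd) against average log2-count.
sx <- rowMeans(y) + mean(log2(lib.size + 1)) - log2(1e6)
sy <- sqrt(sigma)
trend <- lowess(sx, sy, f = 0.5)
predict.sqrt.sd <- approxfun(trend, rule = 2, ties = mean)

# Step 4: predict sqrt(sd) for every observation from its fitted value,
# converted to the log2-count scale using that sample's library size.
fitted.logcpm <- t(fit$fitted.values)    # genes x samples
fitted.logcount <- t(t(fitted.logcpm) + log2(lib.size + 1) - log2(1e6))

# Weights are inverse variances: variance = (sqrt(sd))^4.
w <- 1 / predict.sqrt.sd(fitted.logcount)^4
dim(w) <- dim(fitted.logcount)
```

Because the fitted value is converted to a log2-count with each sample's own library size, two observations of the same gene can receive different weights when the library sizes differ.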
For good results, the counts matrix should be filtered to remove rows with very low counts before running voom().
The filterByExpr function in the edgeR package can be used for that purpose.
If counts is a DGEList object from the edgeR package, then voom will use the normalization factors found in the object when computing the logCPM values.
In other words, the logCPM values are computed from the effective library sizes rather than the raw library sizes.
If the DGEList object has been scale-normalized in edgeR, then it is usual to leave normalize.method="none" in voom, i.e., the logCPM values should not usually be re-normalized in the voom call.
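To make the distinction between raw and effective library sizes concrete, here is a small base R sketch. The counts and normalization factors are invented for illustration; with edgeR the factors would normally be computed by calcNormFactors and stored in the samples component of the DGEList. The logcpm helper is a hypothetical name defined only for this example.

```r
# Hypothetical counts for four genes in three samples.
counts <- matrix(c(10, 0, 250, 40,
                   20, 5, 300, 35,
                   15, 2, 900, 50),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("s", 1:3)))

# Raw library sizes are the column totals.
lib.size <- colSums(counts)

# Hypothetical scale-normalization factors, one per sample.
norm.factors <- c(1.05, 1.10, 0.866)

# Effective library sizes: raw library size times normalization factor.
eff.lib.size <- lib.size * norm.factors

# Illustrative helper: logCPM with 0.5 added to the counts and,
# by assumption, 1 added to the library sizes.
logcpm <- function(counts, lib.size) {
  t(log2(t(counts + 0.5) / (lib.size + 1) * 1e6))
}

E.raw <- logcpm(counts, lib.size)       # from raw library sizes
E.eff <- logcpm(counts, eff.lib.size)   # from effective library sizes

# The two differ by a constant shift within each sample.
print(E.raw - E.eff)
```

Since the shift is already applied through the effective library sizes, applying a further normalization to the logCPM values would normalize the data twice.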
The voom method is similar in purpose to the limma-trend method, which uses eBayes or treat with trend=TRUE.
The voom method incorporates the mean-variance trend into the precision weights, whereas limma-trend incorporates the trend into the empirical Bayes moderation.
The voom method takes into account the sequencing depths (library sizes) of the individual columns of counts and applies the mean-variance trend on an individual observation basis.
limma-trend, on the other hand, assumes that the library sizes are not wildly different and applies the mean-variance trend on a genewise basis.
As noted by Law et al (2014), voom should be more powerful than limma-trend if the library sizes are very different but, otherwise, the two methods should give similar results.
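The two approaches can be run side by side as in the sketch below. It uses simulated counts and a two-group design, both invented for illustration, and only runs the limma and edgeR steps if those packages are installed. The prior.count value passed to cpm is an illustrative choice rather than something specified on this page.

```r
set.seed(1)
ngenes <- 200
nsamples <- 6

# Simulated counts (hypothetical data) and a two-group design.
counts <- matrix(rnbinom(ngenes * nsamples, mu = 50, size = 5),
                 nrow = ngenes, ncol = nsamples)
rownames(counts) <- paste0("gene", seq_len(ngenes))
colnames(counts) <- paste0("sample", seq_len(nsamples))
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

if (requireNamespace("limma", quietly = TRUE) &&
    requireNamespace("edgeR", quietly = TRUE)) {
  dge <- edgeR::DGEList(counts = counts)
  dge <- edgeR::calcNormFactors(dge)

  # voom: the trend enters through observation-level precision weights.
  v <- limma::voom(dge, design)
  fit.voom <- limma::eBayes(limma::lmFit(v, design))

  # limma-trend: the trend enters through the empirical Bayes prior.
  logCPM <- edgeR::cpm(dge, log = TRUE, prior.count = 3)
  fit.trend <- limma::eBayes(limma::lmFit(logCPM, design), trend = TRUE)

  # Top-ranked genes for the group effect under each method.
  top.voom <- limma::topTable(fit.voom, coef = 2)
  top.trend <- limma::topTable(fit.trend, coef = 2)
}
```

With library sizes as similar as in this simulation, the two rankings would be expected to agree closely.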
An EList object with the following components:
E |
numeric matrix of normalized expression values on the log2 scale |
weights |
numeric matrix of inverse variance weights |
design |
design matrix |
lib.size |
numeric vector of total normalized library sizes |
genes |
dataframe of gene annotation extracted from |
voom.xy |
if |
voom.line |
if |
Charity Law and Gordon Smyth
Law, CW, Chen, Y, Shi, W, Smyth, GK (2014). Voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biology 15, R29. http://genomebiology.com/2014/15/2/R29
eBayes, voomWithQualityWeights.
vooma is similar to voom but for microarrays instead of RNA-seq.
A summary of functions for RNA-seq analysis is given in 11.RNAseq.
## Not run:
keep <- filterByExpr(counts, design)
v <- voom(counts[keep,], design, plot=TRUE)
fit <- lmFit(v, design)
fit <- eBayes(fit, robust=TRUE)
## End(Not run)