ebayes {limma} | R Documentation |
Given a series of related parameter estimates and standard errors, compute moderated t-statistics, moderated F-statistic, and log-odds of differential expression by empirical Bayes shrinkage of the standard errors towards a common value.
ebayes(fit, proportion=0.01, stdev.coef.lim=c(0.1,4)) eBayes(fit, proportion=0.01, stdev.coef.lim=c(0.1,4)) treat(fit, lfc=0)
fit |
an MArrayLM fitted model object produced by lmFit or contrasts.fit , or an unclassed list produced by lm.series , gls.series or mrlm containing components coefficients , stdev.unscaled , sigma and df.residual |
proportion |
numeric value between 0 and 1, assumed proportion of genes which are differentially expressed |
stdev.coef.lim |
numeric vector of length 2, assumed lower and upper limits for the standard deviation of log2-fold-changes for differentially expressed genes |
lfc |
the minimum log2-fold-change which is considered material |
These functions is used to rank genes in order of evidence for differential expression.
They use an empirical Bayes method to shrink the probe-wise sample variances towards a common value and to augmenting the degrees of freedom for the individual variances (Smyth, 2004).
The functions accept as input argument fit
a fitted model object from the functions lmFit
, lm.series
, mrlm
or gls.series
.
The fitted model object may have been processed by contrasts.fit
before being passed to eBayes
to convert the coefficients of the design matrix into an arbitrary number of contrasts which are to be tested equal to zero.
The columns of fit
define a set of contrasts which are to be tested equal to zero.
The empirical Bayes moderated t-statistics test each individual contrast equal to zero. For each probe (row), the moderated F-statistic tests whether all the contrasts are zero. The F-statistic is an overall test computed from the set of t-statistics for that probe. This is exactly analogous the relationship between t-tests and F-statistics in conventional anova, except that the residual mean squares and residual degrees of freedom have been moderated between probes.
The estimates s2.prior
and df.prior
are computed by fitFDist
.
s2.post
is the weighted average of s2.prior
and sigma^2
with weights proportional to df.prior
and df.residual
respectively.
The lods
is sometimes known as the B-statistic.
The F-statistics F
are computed by classifyTestsF
with fstat.only=TRUE
.
eBayes
doesn't compute ordinary (unmoderated) t-statistics by default, but these can be easily extracted from
the linear model output, see the example below.
ebayes
is the earlier and leaner function.
eBayes
is intended to have a more object-orientated flavor as it produces objects containing all the necessary components for downstream analysis.
treat
computes empirical Bayes moderated-t p-values relative to a minimum required fold-change threshold.
Use topTreat
to summarize output from treat
.
Instead of testing for genes which have log-fold-changes different from zero, it tests whether the log2-fold-change is greater than lfc
in absolute value (McCarthy and Smyth, 2009).
treat
is concerned with p-values rather than posterior odds, so it does not compute the B-statistic lods
.
The idea of thresholding doesn't apply to F-statistics in a straightforward way, so moderated F-statistics are also not computed.
eBayes
produces an object of class MArrayLM
with the following components, see MArrayLM-class
.
ebayes
produces an ordinary list without F
or F.p.value
.
treat
produces an MArrayLM
object, but without lods
, var.prior
, F
or F.p.value
.
t |
numeric vector or matrix of moderated t-statistics |
p.value |
numeric vector of p-values corresponding to the t-statistics |
s2.prior |
estimated prior value for sigma^2 |
df.prior |
degrees of freedom associated with s2.prior |
s2.post |
vector giving the posterior values for sigma^2 |
lods |
numeric vector or matrix giving the log-odds of differential expression |
var.prior |
estimated prior value for the variance of the log2-fold-change for differentially expressed gene |
F |
numeric vector of moderated F-statistics for testing all contrasts defined by the columns of fit simultaneously equal to zero |
F.p.value |
numeric vector giving p-values corresponding to F |
Gordon Smyth
McCarthy, D. J., and Smyth, G. K. (2009). Testing significance relative to a fold-change threshold is a TREAT. Bioinformatics. http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp053
Loennstedt, I., and Speed, T. P. (2002). Replicated microarray data. Statistica Sinica 12, 31-46.
Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, Volume 3, Article 3. http://www.bepress.com/sagmb/vol3/iss1/art3
squeezeVar
, fitFDist
, tmixture.matrix
.
An overview of linear model functions in limma is given by 06.LinearModels.
# See also lmFit examples # Simulate gene expression data, # 6 microarrays and 100 genes with one gene differentially expressed set.seed(2004); invisible(runif(100)) M <- matrix(rnorm(100*6,sd=0.3),100,6) M[1,] <- M[1,] + 1 fit <- lmFit(M) # Ordinary t-statistic par(mfrow=c(1,2)) ordinary.t <- fit$coef / fit$stdev.unscaled / fit$sigma qqt(ordinary.t,df=fit$df.residual,main="Ordinary t") abline(0,1) # Moderated t-statistic eb <- eBayes(fit) qqt(eb$t,df=eb$df.prior+eb$df.residual,main="Moderated t") abline(0,1) # Points off the line may be differentially expressed par(mfrow=c(1,1))