06.LinearModels             package:limma              R Documentation

Linear Models for Microarrays

Description:

     This page gives an overview of the LIMMA functions available to
     fit linear models and to interpret the results. This page covers
     models for two-color arrays in terms of log-ratios or for
     single-channel arrays in terms of log-intensities. If you wish to
     fit models to the individual channel log-intensities from
     two-color arrays, see 07.SingleChannel.

     The core of this package is the fitting of gene-wise linear
     models to microarray data. The basic idea is to estimate
     log-ratios between two or more target RNA samples simultaneously.
     See the LIMMA User's Guide for several case studies.

Fitting Models:

     The main function for model fitting is 'lmFit'. This is the
     recommended interface for most users. 'lmFit' produces a fitted
     model object of class 'MArrayLM' containing coefficients,
     standard errors and residual standard errors for each gene.
     'lmFit' calls one of the following three functions to do the
     actual computations:

     'lm.series' Straightforward least squares fitting of a linear
          model for each gene.

     'mrlm' An alternative to 'lm.series' using robust regression as
          implemented by the 'rlm' function in the MASS package.

     'gls.series' Generalized least squares taking into account
          correlations between duplicate spots (i.e., replicate spots
          on the same array) or related arrays. The function
          'duplicateCorrelation' is used to estimate the
          inter-duplicate or inter-block correlation before using
          'gls.series'.

     All the functions which fit linear models use 'getEAW' to extract
     data from microarray data objects, and 'unwrapdups', which
     provides a unified method for handling duplicate spots.

Forming the Design Matrix:

     'lmFit' has two main arguments, the expression data and the
     design matrix.
     The design matrix is essentially an indicator matrix which
     specifies which target RNA samples were applied to each channel
     on each array. There is considerable freedom in choosing the
     design matrix: there is always more than one choice which is
     correct, provided it is interpreted correctly.

     Design matrices for Affymetrix or single-color arrays can be
     created using the function 'model.matrix', which is part of the
     stats package distributed with R. The function 'modelMatrix' is
     provided to assist with creation of an appropriate design matrix
     for two-color microarray experiments. For direct two-color
     designs, without a common reference, the design matrix often
     needs to be created by hand.

Making Comparisons of Interest:

     Once a linear model has been fit using an appropriate design
     matrix, the command 'makeContrasts' may be used to form a
     contrast matrix to make comparisons of interest. The fit and the
     contrast matrix are used by 'contrasts.fit' to compute fold
     changes and t-statistics for the contrasts of interest. This is a
     way to compute all possible pairwise comparisons between
     treatments, for example in an experiment which compares many
     treatments to a common reference.

Assessing Differential Expression:

     After fitting a linear model, the standard errors are moderated
     with a simple empirical Bayes model using 'eBayes' or 'treat'.
     'ebayes' is an older version of 'eBayes'. A moderated t-statistic
     and a log-odds of differential expression are computed for each
     contrast for each gene. 'treat' tests whether log-fold-changes
     are greater than a threshold rather than merely different from
     zero.

     'eBayes' and 'treat' use internal functions 'squeezeVar',
     'fitFDist', 'tmixture.matrix' and 'tmixture.vector'.

     The function 'zscoreT' is sometimes used for computing z-score
     equivalents for t-statistics so as to place t-statistics with
     different degrees of freedom on the same scale.
     'zscoreGamma' is used the same way with standard deviations
     instead of t-statistics. These functions are for research
     purposes rather than for routine use.

Summarizing Model Fits:

     After the above steps the results may be displayed or further
     processed using:

     'toptable' or 'topTable' Presents a list of the genes most likely
          to be differentially expressed for a given contrast.

     'topTableF' Presents a list of the genes most likely to be
          differentially expressed for a given set of contrasts.

     'volcanoplot' Volcano plot of fold change versus the B-statistic
          for any fitted coefficient.

     'plotlines' Plots fitted coefficients or log-intensity values for
          time-course data.

     'write.fit' Writes an 'MArrayLM' object to a file. Note that if
          'fit' is an 'MArrayLM' object, either 'write.fit' or
          'write.table' can be used to write the results to a
          delimited text file.

     For multiple testing functions which operate on linear model
     fits, see 08.Tests.

Model Selection:

     'selectModel' provides a means to choose between alternative
     linear models using AIC or BIC information criteria.

Author(s):

     Gordon Smyth

References:

     Smyth, G. K. (2004). Linear models and empirical Bayes methods
     for assessing differential expression in microarray experiments.
     Statistical Applications in Genetics and Molecular Biology 3(1),
     Article 3.

     Smyth, G. K., Michaud, J., and Scott, H. (2005). The use of
     within-array replicate spots for assessing differential
     expression in microarray experiments. Bioinformatics 21(9),
     2067-2075.
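
Examples:

     A minimal sketch of the workflow described on this page, assuming
     single-channel data and an installed limma package. The simulated
     matrix 'y', the group labels 'A' and 'B' and all object names are
     hypothetical and chosen for illustration only; they are not taken
     from the package documentation.

```r
library(limma)

set.seed(1)

# Hypothetical data: log-intensities for 100 genes on 6 single-channel arrays
y <- matrix(rnorm(100 * 6), nrow = 100, ncol = 6)
rownames(y) <- paste0("gene", 1:100)

# Shift the first five genes upwards in the last three arrays (group B)
y[1:5, 4:6] <- y[1:5, 4:6] + 3

# Design matrix with one column per group, built with model.matrix
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Gene-wise linear model fit; the result is an MArrayLM object
fit <- lmFit(y, design)

# Contrast of interest: group B minus group A
contrast.matrix <- makeContrasts(BvsA = B - A, levels = design)
fitc <- contrasts.fit(fit, contrast.matrix)

# Empirical Bayes moderation of the standard errors
fit2 <- eBayes(fitc)

# Alternative: test against a log-fold-change threshold of 1 instead of 0
fit3 <- treat(fitc, lfc = 1)

# Genes ranked by evidence of differential expression for the contrast
tab <- topTable(fit2, coef = 1, number = 10)
print(tab)
```

     With real data, 'y' would be replaced by normalized expression
     values, and the design and contrast matrices would reflect the
     actual experimental layout.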