controlStatus              package:limma              R Documentation

Set Status of each Spot from List of Spot Types

Description:

     Determine the type (or status) of each spot in the gene list.

Usage:

     controlStatus(types, genes, spottypecol="SpotType", regexpcol, verbose=TRUE)

Arguments:

   types: data frame containing spot type specifiers, usually input
          using 'readSpotTypes'

   genes: data frame containing the microarray gene list, or an
          'RGList', 'MAList' or 'MArrayList' containing 'genes'

spottypecol: integer or name specifying the column of 'types' containing
          spot type names

regexpcol: vector of integers or column names specifying the columns of
          'types' containing regular expressions. Defaults to any
          column names in common between 'types' and 'genes'.

 verbose: logical, if 'TRUE' then progress on pattern matching is
          reported to the standard output channel

Details:

     This function constructs a vector of status codes by searching
     for patterns in the gene list. The data frame 'genes' contains
     gene IDs and should have as many rows as there are spots on the
     microarrays. Such a data frame is often read using 'readGAL'.

     The data frame 'types' has one row for each type of spot that you
     want to distinguish in the gene list. This data frame should
     contain a column or columns, the 'regexpcol' columns, which have
     the same names as columns in 'genes' and which contain patterns
     to match in the gene list. Another column, the 'spottypecol',
     contains the names of the spot types. Any other columns are
     assumed to contain plotting parameters, such as colors or
     symbols, to be associated with the spot types.

     The patterns in the 'regexpcol' columns are simplified regular
     expressions. For example:

     'AA*'   any string starting with 'AA'
     '*AA'   any string ending with 'AA'
     'AA'    exactly these two letters
     '*AA*'  any string containing 'AA'
     'AA.'   'AA' followed by exactly one other character
     'AA\.'  exactly 'AA' followed by a period and no other characters

     Any other regular expressions are allowed, but the codes '^' for
     beginning of string and '$' for end of string should not be
     included.

     Note that the patterns are matched sequentially from first to
     last, so more general patterns should be included first and more
     specific patterns after them. For example, it is often a good
     idea to include a default spot type as the first line in 'types',
     with pattern '*' for all 'regexpcol' columns and default plotting
     parameters.

Value:

     Character vector specifying the type (or status) of each spot on
     the array. Attributes contain plotting parameters associated with
     each spot type.

Author(s):

     Gordon Smyth

See Also:

     An overview of LIMMA functions for reading data is given in
     03.ReadingData.

Examples:

     genes <- data.frame(
       ID=c("Control","Control","Control","Control","AA1","AA2","AA3","AA4"),
       Name=c("Ratio 1","Ratio 2","House keeping 1","House keeping 2",
              "Gene 1","Gene 2","Gene 3","Gene 4"))
     types <- data.frame(
       SpotType=c("Gene","Ratio","Housekeeping"),
       ID=c("*","Control","Control"),
       Name=c("*","Ratio*","House keeping*"),
       col=c("black","red","blue"))
     status <- controlStatus(types,genes)
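
The pattern rules described under Details can be illustrated with a minimal base R sketch. This is not limma's implementation. The helper names 'wildcard_to_regex' and 'match_status' are hypothetical, and the sketch assumes that a '*' stands for any run of characters, that every pattern must match the whole string, and that a later row of 'types' overrides an earlier one. Plotting parameters and the 'verbose' report are left out.

```r
# Hypothetical helper (not part of limma): turn a simplified pattern such as
# "AA*" into an anchored regular expression such as "^AA.*$".
wildcard_to_regex <- function(pattern) {
  paste0("^", gsub("*", ".*", pattern, fixed = TRUE), "$")
}

# Hypothetical helper (not part of limma): assign a spot type to every row of
# 'genes'. Rows of 'types' are applied in order, so a later match overwrites
# an earlier one. A spot matches a row only if all 'regexpcol' columns match.
match_status <- function(types, genes, spottypecol = "SpotType",
                         regexpcol = setdiff(intersect(names(types), names(genes)),
                                             spottypecol)) {
  status <- character(nrow(genes))
  for (i in seq_len(nrow(types))) {
    hit <- rep(TRUE, nrow(genes))
    for (col in regexpcol) {
      rx <- wildcard_to_regex(as.character(types[i, col]))
      hit <- hit & grepl(rx, as.character(genes[[col]]))
    }
    status[hit] <- as.character(types[i, spottypecol])
  }
  status
}

# Same toy data as in the Examples section.
genes <- data.frame(
  ID = c("Control", "Control", "Control", "Control", "AA1", "AA2", "AA3", "AA4"),
  Name = c("Ratio 1", "Ratio 2", "House keeping 1", "House keeping 2",
           "Gene 1", "Gene 2", "Gene 3", "Gene 4"))
types <- data.frame(
  SpotType = c("Gene", "Ratio", "Housekeeping"),
  ID = c("*", "Control", "Control"),
  Name = c("*", "Ratio*", "House keeping*"),
  col = c("black", "red", "blue"))

status <- match_status(types, genes)
print(status)
```

Under these assumptions the default 'Gene' row first labels every spot, and the 'Ratio' and 'Housekeeping' rows then relabel the control spots they match, which is why the general pattern is listed first.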