fsva {sva} | R Documentation |
This function performs frozen surrogate variable analysis as described in Parker, Corrada Bravo and Leek 2013.
The approach uses a training database to create surrogate variables which are then used
to remove batch effects both from the training database and a new data set for prediction purposes.
For inferential analysis see sva
, svaseq
, with low level functionality available
through irwsva.build
and ssva
.
fsva(dbdat, mod, sv, newdat = NULL, method = c("fast", "exact"))
dbdat |
A m genes by n arrays matrix of expression data from the database/training data |
mod |
The model matrix for the terms included in the analysis for the training data |
sv |
The surrogate variable object created by running sva on dbdat using mod. |
newdat |
(optional) A set of test samples to be adjusted using the training database |
method |
If method ="fast" then the SVD is calculated using an online approach, this may introduce slight bias. If method="exact" the exact SVD is calculated, but will be slower |
db An adjusted version of the training database where the effect of batch/expression heterogeneity has been removed
new An adjusted version of the new samples, adjusted one at a time using the fsva methodology.
newsv Surrogate variables for the new samples
library(bladderbatch) library(pamr) data(bladderdata) dat <- bladderEset[1:50,] pheno = pData(dat) edata = exprs(dat) set.seed(1234) trainIndicator = sample(1:57,size=30,replace=FALSE) testIndicator = (1:57)[-trainIndicator] trainData = edata[,trainIndicator] testData = edata[,testIndicator] trainPheno = pheno[trainIndicator,] testPheno = pheno[testIndicator,] mydata = list(x=trainData,y=trainPheno$cancer) mytrain = pamr.train(mydata) table(pamr.predict(mytrain,testData,threshold=2),testPheno$cancer) trainMod = model.matrix(~cancer,data=trainPheno) trainMod0 = model.matrix(~1,data=trainPheno) trainSv = sva(trainData,trainMod,trainMod0) fsvaobj = fsva(trainData,trainMod,trainSv,testData) mydataSv = list(x=fsvaobj$db,y=trainPheno$cancer) mytrainSv = pamr.train(mydataSv) table(pamr.predict(mytrainSv,fsvaobj$new,threshold=1),testPheno$cancer)