4.7 years ago
scott.hazelhurst
I am trying to use crlmm to process some idat files. Calling genotype.Illumina gives the following error.
Error in validObject(.Object) :
invalid class “NChannelSet” object: sampleNames differ between phenoData and protocolData
In addition: Warning messages:
1: Setting row names on a tibble is deprecated.
2: In data.frame(isSnp = isSnp, position = position, chromosome = chromosome, :
NAs introduced by coercion
3: Setting row names on a tibble is deprecated.
4: Setting row names on a tibble is deprecated.
I've spent several hours trying to resolve this without success, and I would very much appreciate help.
I've tracked the problem down to this call in crlmm-illumina.R
RG = new("NChannelSet",
         R = initializeBigMatrix(name = "R", nr = nprobes, nc = narrays, vmode = "integer"),
         G = initializeBigMatrix(name = "G", nr = nprobes, nc = narrays, vmode = "integer"),
         zero = initializeBigMatrix(name = "zero", nr = nprobes, nc = narrays, vmode = "integer"),
         annotation = headerInfo$Manifest[1],
         phenoData = pd, storage.mode = "environment")
but have got no further.
The code I am calling is this
sample = read_excel("/data/testbatch.xlsx")
names = paste(sample$"Array Info.S", sample$"Sentrix ID", sep = "_")
annof = read.table("/aux/A3.annot", sep = ",", header = TRUE)
genotype.Illumina(sampleSheet = sample,
                  path = "/data/all",
                  arrayNames = names,
                  arrayInfoColNames = list(barcode = "Array Info.S", position = "Sentrix ID"),
                  highDensity = TRUE, sep = "_",
                  fileExt = list(green = "Grn.idat", red = "Red.idat"),
                  XY = NULL, anno = annof, genome = "hg19", call.method = "krlmm", trueCalls = NULL,
                  cdfName = 'nopackage', copynumber = FALSE, batch = NULL,
                  saveDate = FALSE, verbose = TRUE)
I have
> colnames(annof)
[1] "IlmnID" "featureNames" "IlmnStrand" "SNP"
[5] "AddressA_ID" "AlleleA_ProbeSeq" "AddressB_ID" "AlleleB_ProbeSeq"
[9] "GenomeBuild" "chromosome" "position" "Ploidy"
[13] "Species" "Source" "SourceVersion" "SourceStrand"
[17] "SourceSeq" "TopGenomicSeq" "BeadSetID" "Exp_Clusters"
[21] "RefStrand" "CHR" "SNP_y" "A1"
[25] "A2" "MAF" "NCHROBS" "isSnp"
with values that look reasonable and
> head(names)
[1] "202351410061_R08C01" "202351410061_R01C01" "202351410061_R06C01"
[4] "202351410061_R02C01" "202351410061_R03C01" "202351410061_R07C01"
which matches what's at the path.
Thank you for any help.
Where is your pd object created? The error indicates that the phenoData (pd) does not align with your protocolData.

It's not my pd object, at least not directly. The pd object is created inside the crlmm-illumina.R code, so I am hoping that someone who knows the crlmm code can help. I have tried to trace through the code to understand what is happening, but have not managed to. Moreover, the Biobase call that creates the NChannelSet object does not specify protocolData, and from the Biobase code it seems that in that case protocolData defaults to the same value as the phenoData object. Obviously my understanding is wrong somewhere, but I am stuck.

I doubt that many here are intimately familiar with the CRLMM code. I've used CRLMM in the past, but via other programs; that was back when I did not even call myself a bioinformatician. All I can say is to work through it line by line to find where the error is being propagated. It seems you are already doing this, though.
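
A possible lead, given the repeated "Setting row names on a tibble is deprecated" warnings: read_excel returns a tibble, and tibbles do not reliably keep assigned row names, which may be how the sample names end up differing between phenoData and protocolData. This is an assumption, not something confirmed in this thread or checked against the crlmm source. Below is a minimal sketch of converting the sheet to a plain data.frame first. The names sheet, sheet_df and array_names are hypothetical, and the two rows are stand-in values rather than the real sample sheet.

```r
# Stand-in for the sheet returned by read_excel(); values are illustrative only.
sheet <- data.frame(
  "Array Info.S" = c("202351410061", "202351410061"),
  "Sentrix ID"   = c("R08C01", "R01C01"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# as.data.frame() turns a tibble into a plain data.frame;
# on an object that is already a plain data.frame it changes nothing.
sheet_df <- as.data.frame(sheet)

# Build the array names the same way as in the question.
array_names <- paste(sheet_df[["Array Info.S"]], sheet_df[["Sentrix ID"]], sep = "_")

# A plain data.frame accepts and keeps assigned row names without a warning.
rownames(sheet_df) <- array_names
print(identical(rownames(sheet_df), array_names))
```

With the real data, the equivalent would be to pass as.data.frame(sample) as sampleSheet to genotype.Illumina. Whether that clears the validObject error is untested here.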