Question

crlmm problem: NChannelSet error

4.7 years ago

scott.hazelhurst ▴ 10

I am trying to use crlmm to process some idat files. Calling genotype.Illumina gives the following error.

Error in validObject(.Object) : 
  invalid class “NChannelSet” object: sampleNames differ between phenoData and protocolData
In addition: Warning messages:
1: Setting row names on a tibble is deprecated. 
2: In data.frame(isSnp = isSnp, position = position, chromosome = chromosome,  :
  NAs introduced by coercion
3: Setting row names on a tibble is deprecated. 
4: Setting row names on a tibble is deprecated.

I've spent several hours trying to resolve this without success and would very much appreciate help.

I've tracked the problem down to this call in crlmm-illumina.R

RG = new("NChannelSet",
    R = initializeBigMatrix(name = "R", nr = nprobes, nc = narrays, vmode = "integer"),
    G = initializeBigMatrix(name = "G", nr = nprobes, nc = narrays, vmode = "integer"),
    zero = initializeBigMatrix(name = "zero", nr = nprobes, nc = narrays, vmode = "integer"),
    annotation = headerInfo$Manifest[1],
    phenoData = pd, storage.mode = "environment")

but have got no further.

The code I am calling is this:

library(readxl)  # provides read_excel()
library(crlmm)   # provides genotype.Illumina()

# Sample sheet, and one array name per row built from the two ID columns
sample = read_excel("/data/testbatch.xlsx")
names = paste(sample$"Array Info.S", sample$"Sentrix ID", sep="_")
# SNP annotation table (comma-separated, with a header row)
annof = read.table("/aux/A3.annot", sep=",", header=TRUE)
genotype.Illumina(sampleSheet=sample,
     path="/data/all",
     arrayNames=names,
     arrayInfoColNames=list(barcode="Array Info.S", position="Sentrix ID"),
     highDensity=TRUE, sep="_",
     fileExt=list(green="Grn.idat", red="Red.idat"),
     XY=NULL, anno=annof, genome="hg19", call.method="krlmm", trueCalls=NULL,
     cdfName='nopackage', copynumber=FALSE, batch=NULL,
     saveDate=FALSE, verbose=TRUE)

I have

> colnames(annof)
 [1] "IlmnID"           "featureNames"     "IlmnStrand"       "SNP"             
 [5] "AddressA_ID"      "AlleleA_ProbeSeq" "AddressB_ID"      "AlleleB_ProbeSeq"
 [9] "GenomeBuild"      "chromosome"       "position"         "Ploidy"          
[13] "Species"          "Source"           "SourceVersion"    "SourceStrand"    
[17] "SourceSeq"        "TopGenomicSeq"    "BeadSetID"        "Exp_Clusters"    
[21] "RefStrand"        "CHR"              "SNP_y"            "A1"              
[25] "A2"               "MAF"              "NCHROBS"          "isSnp"

with values that look reasonable and

> head(names)
[1] "202351410061_R08C01" "202351410061_R01C01" "202351410061_R06C01"
[4] "202351410061_R02C01" "202351410061_R03C01" "202351410061_R07C01"

which matches what's at the path.
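
The match between the array names and the files on disk can also be checked programmatically. This is a minimal sketch in base R, assuming the IDAT files sit directly under the path and are named `<arrayName>_Grn.idat` and `<arrayName>_Red.idat`, as the `sep` and `fileExt` arguments above suggest. `missing_idat_files` is a hypothetical helper, not part of crlmm.

```r
# Hypothetical helper: return the expected IDAT files that do not exist.
# Assumes files are named <arrayName><sep><extension> directly under `path`.
missing_idat_files <- function(path, array_names,
                               file_ext = c(green = "Grn.idat", red = "Red.idat"),
                               sep = "_") {
  # Build every expected file path, green channel first, then red.
  expected <- unlist(lapply(file_ext, function(ext) {
    file.path(path, paste(array_names, ext, sep = sep))
  }), use.names = FALSE)
  # Keep only the paths that are not present on disk.
  expected[!file.exists(expected)]
}
```

With the objects above, `missing_idat_files("/data/all", names)` should return an empty character vector if every array has both files, and `anyDuplicated(names)` should be 0 if no array name is repeated.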

Thank you for any help.

crlmm biobase • 1.2k views


Where is your pd object created? The error indicates that the sample names in the phenoData (pd) do not match those in your protocolData.

ADD REPLY • link 4.7 years ago by Kevin Blighe 87k

It's not my pd object, at least not directly. The "pd" object is created inside the crlmm-illumina.R code, so I am hoping that someone who knows the crlmm code can help. I have tried to trace through the code to understand what's happening but have not managed to. Moreover, the call that creates the NChannelSet object does not specify protocolData, and from the Biobase code it seems that in that case the protocolData object's default value is derived from the phenoData object, so the two should have the same sample names. Obviously my understanding is wrong somewhere, but I am stuck.
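
The point about the default can be checked on a toy object. This is a minimal sketch, assuming the Biobase package is installed. The matrices, the `index` column and the variable names are illustrative stand-ins, not the objects crlmm builds internally.

```r
library(Biobase)

# Toy stand-ins for the real data: two arrays, three probes each.
array_names <- c("202351410061_R08C01", "202351410061_R01C01")
R <- matrix(1:6, nrow = 3, dimnames = list(NULL, array_names))
G <- matrix(6:1, nrow = 3, dimnames = list(NULL, array_names))

# A plain data.frame keeps its row names; they become the sampleNames.
pd <- AnnotatedDataFrame(
  data.frame(index = seq_along(array_names), row.names = array_names)
)

# protocolData is deliberately not supplied, as in the crlmm call.
rg <- new("NChannelSet", R = R, G = G, phenoData = pd)

# Compare the sample names held by the two slots.
pheno_names    <- sampleNames(phenoData(rg))
protocol_names <- sampleNames(protocolData(rg))
names_match    <- identical(pheno_names, protocol_names)
```

If the toy object validates while the real call does not, the difference is likely in how `pd` is built from the sample sheet. Since `read_excel` returns a tibble and the warnings mention row names on a tibble, it may be worth testing whether passing `as.data.frame(sample)` as the sample sheet changes the outcome. That is a guess to rule out, not a confirmed fix.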

ADD REPLY • link 4.7 years ago by scott.hazelhurst ▴ 10

I doubt that many here are intimately familiar with the CRLMM code. I've used CRLMM in the past, but only via other programs, back when I did not even call myself a bioinformatician. All I can suggest is to work through it line by line to find where the error originates. It seems you are already doing this, though.
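
One way to see where an error arises without editing the package source is to record the call stack at the moment the error is signalled. This is a minimal sketch in base R; `failing_step` and `outer_call` are hypothetical stand-ins for the real genotype.Illumina call.

```r
# Hypothetical stand-ins for the nested calls that end in the error.
failing_step <- function() {
  stop("sampleNames differ between phenoData and protocolData")
}
outer_call <- function() failing_step()

call_stack <- NULL
result <- tryCatch(
  withCallingHandlers(
    outer_call(),
    error = function(e) {
      # Runs before the stack unwinds, so the failing frames are still visible.
      call_stack <<- sys.calls()
    }
  ),
  # After recording the stack, return the message instead of stopping.
  error = function(e) conditionMessage(e)
)

# The recorded calls, outermost first, as plain strings for inspection.
stack_text <- vapply(as.list(call_stack),
                     function(cl) paste(deparse(cl), collapse = " "),
                     character(1))
```

In an interactive session, calling `traceback()` right after the error, or setting `options(error = recover)` before rerunning, gives similar information and lets you inspect objects such as `pd` in the failing frame.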

ADD REPLY • link 4.7 years ago by Kevin Blighe 87k