I am doing a differential binding analysis of some human histone ChIP-seq samples (bowtie2 mapping to hg19, duplicate marking, MACS2 narrow peak calling, and blacklisted-region filtering with bedtools intersect and Anshul Kundaje's BED file for hg19). The samples are H, Y and Z, and I'm comparing Y vs H and Z vs H (H is the control group). I first ran the whole pipeline with hg38 and everything went fine. Later, I needed to redo everything using the hg19 version of the genome. At the DiffBind step, the same script that had run without problems before now gives a single error, and only for the Y vs H comparison:
DiffBind: Error in `.rowNamesDF<-`(x, value = value) : invalid 'row.names' length
at the point in my script where I draw the volcano plot (this is the code I was using):
sampleInfo<-cbind.data.frame(treatmentList$SampleID,treatmentList$Condition,Species,bamReads,inputList$ControlID,bamControl,Peaks,PeakCaller)
results=dba(sampleSheet=sampleInfo)
results=dba.count(results,minOverlap=1,score=countNorm)
results$contrasts=NULL
results=dba.contrast(results,results$masks$Y,results$masks$H,unique(Condition),unique(Condition),minMembers=min(c(table(Condition))),categories=DBA_CONDITION)
results=dba.analyze(results, method=DBA_DESEQ2)
pdf(paste(diffBindDir,"volc_plot.pdf",sep = ""))
dba.plotVolcano(results)
dev.off()
(entry 2 in the Condition vector is Y and entry 1 is H). After the volcano plot code comes the MA plot code, and that plot was produced without any problems.
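For reference, the message itself is raised by base R's data frame code whenever a data frame is given row names whose length does not match its number of rows. Below is a minimal sketch that reproduces the same message outside DiffBind. The data frame and site names are made up for illustration, and it does not show where inside dba.plotVolcano the mismatch actually happens:

```r
# Toy data frame with three rows (hypothetical values, not from my data).
sites <- data.frame(Fold = c(1.2, -0.8, 2.1))

# Assigning only two row names to three rows triggers the error;
# tryCatch captures the message instead of stopping the script.
msg <- tryCatch({
  rownames(sites) <- c("site1", "site2")
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```

So, presumably, somewhere in the volcano plot code a table of sites and the names being attached to it disagree in length for the Y vs H contrast.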
The concrete problem:
How can it be that only this one comparison gives me the error? Since DiffBind objects hold so much internal data, does anyone have a hint about what could be wrong? Thank you!
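One check that might narrow this down, as a sketch and not a confirmed diagnosis: count how many sites pass the FDR threshold for the failing contrast. The volcano plot highlights those sites, so an unusually small number (zero or one) is a plausible but unverified trigger. The helper `count_significant`, the 0.05 cutoff and the toy table are assumptions for illustration. With the real analysis the table would come from dba.report, assuming Y vs H is contrast 1:

```r
# Hypothetical helper: count rows whose FDR is at or below a cutoff.
count_significant <- function(report_df, fdr = 0.05) {
  sum(report_df$FDR <= fdr, na.rm = TRUE)
}

# Toy stand-in for a DiffBind report (made-up values).
toy_report <- data.frame(
  Fold = c(2.3, -0.4, 0.1),
  FDR  = c(0.01, 0.20, 0.90)
)
n_sig <- count_significant(toy_report)
print(n_sig)

# With the real object (assumes DiffBind is loaded and Y vs H is contrast 1):
# full_report <- as.data.frame(dba.report(results, contrast = 1, th = 1))
# count_significant(full_report)
```

Running the same count for the Z vs H contrast, which plots fine, would show whether the two contrasts differ in this respect.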
macOS High Sierra v10.13.6, RStudio version 1.1.463, R version 3.5.2 (2018-12-20) -- "Eggshell Igloo", DiffBind version 2.10.0