Error in DiffBind. Please HELP!
4 days ago
a.hayat20 • 0

Hi all,

I have set up this CSV sample sheet for my differential analysis, but I keep getting an error that I do not know how to solve. This is the sample sheet for DiffBind: [image not available]

This is the error I get:

Error in Summary.factor(c(1L, 47442L, 58553L, 69664L, 80775L, 91886L,  : 
  ‘max’ not meaningful for factors

Any help would be very much appreciated!

Thanks

Analysis NGS ATAC R DiffBind • 254 views

Code?


Thanks! Sorry, here is the code:

db.object = dba(sampleSheet="diffbind_low_high_21072021.csv")

I have used this before with no problems, but it no longer works. I am not sure exactly what is going wrong.

4 days ago
Rory Stark ★ 1.1k

Looks like you are using _summits.bed files from MACS. Two issues here:

  1. The summits files only contain a single base for each peak, not a region, so they really aren't the right ones to use. The _peaks.bed or _peaks.xls files are usually used, or the narrowPeak or broadPeak files.
  2. You haven't specified the peak file format with the PeakCaller and/or PeakFormat columns in the sample sheet. The default format is raw, which has the score in the fourth column, while a bed format has the score in the fifth column.
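
To illustrate the second point, here is a minimal sketch of a sample sheet with an explicit PeakCaller column. The sample IDs and file paths are made-up placeholders, and the value narrow assumes narrowPeak files (macs would be the value for _peaks.xls files). It only builds and writes the sheet; it does not call DiffBind.

```r
# Minimal sketch of a DiffBind-style sample sheet with a PeakCaller column.
# All sample IDs and file paths below are hypothetical placeholders.
samples <- data.frame(
  SampleID   = c("Low_1", "High_1"),
  Condition  = c("Low", "High"),
  Replicate  = c(1, 1),
  bamReads   = c("reads/Low_1.bam", "reads/High_1.bam"),
  Peaks      = c("peaks/Low_1_peaks.narrowPeak",
                 "peaks/High_1_peaks.narrowPeak"),
  PeakCaller = c("narrow", "narrow"),  # tells DiffBind how to parse Peaks
  stringsAsFactors = FALSE
)

# Write the sheet to a temporary CSV and read it back to confirm the column.
sheet_path <- file.path(tempdir(), "samplesheet_example.csv")
write.csv(samples, sheet_path, row.names = FALSE)
sheet <- read.csv(sheet_path, stringsAsFactors = FALSE)
print(sheet[, c("SampleID", "Peaks", "PeakCaller")])

# With real files in place, the sheet would then be loaded as in the question:
# db.object <- dba(sampleSheet = sheet_path)
```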

Dear Rory,

Thanks for your response! I switched from the summits files to the narrowPeak and _peaks.xls files, but I still get this error:

Low_HER2 Low_HER2 72H DMSO 1 raw
Error in if (file.info(peaks)$size > 0) { : 
  missing value where TRUE/FALSE needed

I did not fully understand your second comment. Could you please explain it in a bit more detail for a novice like me?

Thanks so much!


You need to tell DiffBind the format of the peak files so it knows where to find the score. If you are using the _peaks.xls files, you should include a column in your sample sheet labelled PeakCaller with the values set to macs. If you are using the narrowPeak files, the values should be narrow. You can see an example (using bed format) as follows:

samples <- read.csv(system.file("extra", "tamoxifen.csv", package = "DiffBind"))

Based on the error you are seeing, it looks like the peak files in your sample sheet can't actually be found. Try this:

samples <- read.csv("diffbind_low_high_21072021.csv")
file.info(samples$Peaks)

If any of the rows come back as NA (file.info() reports NA for files it cannot find), you need to fix the sample sheet (or change the working directory) so the peak files can be read.
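
A variant of this check using file.exists() gives a plain list of the offending paths. This is only a sketch: missing_files() is a hypothetical helper, not a DiffBind function, and the demonstration uses temporary files rather than the real sample sheet. Relative paths are resolved against the current working directory, which getwd() will show.

```r
# Hypothetical helper (not part of DiffBind): return the paths in the given
# sample-sheet columns that do not exist on disk.
missing_files <- function(sheet, columns = "Peaks") {
  columns <- intersect(columns, names(sheet))
  # as.character() guards against factor columns from older read.csv defaults
  paths <- as.character(unlist(lapply(sheet[columns], as.character),
                               use.names = FALSE))
  paths[!file.exists(paths)]
}

# Self-contained demonstration: one file that exists, one that does not.
present <- tempfile(fileext = ".narrowPeak")
file.create(present)
absent <- file.path(tempdir(), "no_such_file.narrowPeak")

demo <- data.frame(Peaks = c(present, absent), stringsAsFactors = FALSE)
print(missing_files(demo))
```

On the real sheet, the call would be missing_files(read.csv("diffbind_low_high_21072021.csv")); an empty result means every peak file was found.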


Thank you so much for your help. I used the .xls files and labelled the column PeakCaller. This worked!

My next code in the pipeline is:

db.object = dba.count(db.object, bParallel=TRUE, fragmentSize=0, score=DBA_SCORE_RPKM_FOLD)

But I get the following error:

Error in pv.counts(DBA, peaks = peaks, minOverlap = minOverlap, defaultScore = score, : 
  Some read files could not be accessed. See warnings for details.
In addition: Warning message:
not accessible

From what I have read, it looks as if DiffBind either could not find my BAM files or they were not indexed. But I am sure the working directory is set correctly and that the files are indexed, so I am not sure what might be happening.

Thank you!
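
Both suspicions (BAM files not found, indexes missing) can be tested from R before rerunning dba.count. The sketch below is one possible check, not anything DiffBind provides: check_bams() is a hypothetical helper, and it assumes an index sits next to each BAM file as either <name>.bam.bai or <name>.bai. The demonstration uses empty temporary files in place of real BAM files.

```r
# Hypothetical helper (not part of DiffBind): for each BAM path, report whether
# the file exists and whether an index file is found next to it.
check_bams <- function(bam_paths) {
  bam_paths <- as.character(bam_paths)
  index_a <- paste0(bam_paths, ".bai")          # sample.bam.bai
  index_b <- sub("\\.bam$", ".bai", bam_paths)  # sample.bai
  data.frame(
    bam         = bam_paths,
    bam_found   = file.exists(bam_paths),
    index_found = file.exists(index_a) | file.exists(index_b),
    stringsAsFactors = FALSE
  )
}

# Self-contained demonstration with empty placeholder files.
d <- tempfile("bamcheck")
dir.create(d)
file.create(file.path(d, c("a.bam", "a.bam.bai", "b.bam")))

result <- check_bams(file.path(d, c("a.bam", "b.bam", "c.bam")))
print(result)
```

On the real data, the input would be the bamReads column of the sample sheet, for example check_bams(read.csv("diffbind_low_high_21072021.csv")$bamReads). Any FALSE in either column points to the file that dba.count cannot access.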
