Hi all,

I have set up this csv file for my differential analysis, but I keep getting an error that I do not know how to solve. This is the setup for DiffBind:

This is the error I get:

Error in Summary.factor(c(1L, 47442L, 58553L, 69664L, 80775L, 91886L,  :
‘max’ not meaningful for factors


Any help would be very much appreciated!

Thanks

Thanks! Sorry, here is the code:

db.object = dba(sampleSheet="diffbind_low_high_21072021.csv")

I have previously used this with no problem, but it seems I cannot get it to work now. Not sure exactly what is going wrong.

Looks like you are using _summits.bed files from MACS. Two issues here:

1. The summits files only contain a single base for each peak, not a region, so they really aren't the right ones to use. The _peaks.bed or _peaks.xls files are usually used, or the narrowPeak or broadPeak files.
2. You haven't specified the peak file format with the PeakCaller and/or PeakFormat columns in the sample sheet. The default format is raw, which has the score in the fourth column, while a bed format has the score in the fifth column.
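For illustration, here is a minimal sketch of adding a PeakCaller column to a sample sheet held as a data frame. The sample IDs and file names are made up, and the mapping from file extension to PeakCaller value (macs, narrow, bed) is an assumption about how your files are named, so check it against your own sheet.

```r
# Hypothetical sample sheet; the IDs and file names are placeholders.
samples <- data.frame(
  SampleID  = c("Low_HER2_1", "High_HER2_1"),
  Condition = c("Low", "High"),
  Replicate = c(1, 1),
  bamReads  = c("low_1.bam", "high_1.bam"),
  Peaks     = c("low_1_peaks.xls", "high_1_peaks.narrowPeak"),
  stringsAsFactors = FALSE
)

# Derive the PeakCaller value from the peak file extension.
ext <- tolower(tools::file_ext(samples$Peaks))
samples$PeakCaller <- ifelse(ext == "xls", "macs",
                      ifelse(ext == "narrowpeak", "narrow",
                      ifelse(ext == "bed", "bed", NA)))

print(samples[, c("SampleID", "Peaks", "PeakCaller")])

# With DiffBind installed and real files in place, the data frame can be
# passed directly instead of a csv file name:
# db.object <- dba(sampleSheet = samples)
```

Any row left as NA has an extension the sketch does not recognise and needs its PeakCaller set by hand.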
Dear Rory,

Thanks for your response! I changed the summits files to narrowPeak files and _peaks.xls files, but I still get this error:

Low_HER2 Low_HER2 72H DMSO 1 raw
Error in if (file.info(peaks)$size > 0) { : missing value where TRUE/FALSE needed

I did not fully understand your second comment. Could you please explain it in a little more detail for a novice like me? Thanks so much!

You need to tell DiffBind the format of the peak files so it knows where to find the score. If you are using the _peaks.xls files, you should include a column in your sample sheet labelled PeakCaller with the values set to macs. If you are using the narrowPeak files, the values should be narrow. You can see an example (using bed format) as follows:

samples <- read.csv(paste(system.file('extra',package='DiffBind'), "/tamoxifen.csv",sep=""))

Based on the error you are seeing, it looks like the peak files in your sample sheet can't actually be found. Try this:

samples <- read.csv("diffbind_low_high_21072021.csv")
file.info(samples$Peaks)


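If any of the files are reported as missing (file.info() returns NA for a file that does not exist, and file.exists(samples$Peaks) returns FALSE for it), you need to fix the sample sheet (or change the working directory) so the peak files can be read.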
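A small sketch of that check, in case it helps. The helper name check_files is made up for this example, and the two paths in the demo are temporary placeholders rather than real peak files. It also shows why the error above says "missing value where TRUE/FALSE needed": file.info()$size is NA for a path that does not exist.

```r
# Hypothetical helper: report which paths exist and how large they are.
check_files <- function(paths) {
  paths <- as.character(paths)  # read.csv may return factors in older R versions
  data.frame(
    path   = paths,
    exists = file.exists(paths),
    size   = file.info(paths)$size,  # NA when the file cannot be found
    stringsAsFactors = FALSE
  )
}

# Demo with one real temporary file and one path that does not exist.
present <- tempfile(fileext = ".narrowPeak")
writeLines("chr1\t100\t200", present)
missing_path <- file.path(tempdir(), "no_such_file_peaks.xls")

report <- check_files(c(present, missing_path))
print(report)
```

On your own data this would be something like check_files(samples$Peaks) after reading the sample sheet with read.csv(). Relative paths are resolved against getwd(), so run it from the same working directory you use for dba().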

Thank you so much for your help. I used the .xls files and labelled the column PeakCaller. This worked!

The next step in my pipeline is:

db.object = dba.count(db.object, bParallel=TRUE, fragmentSize=0, score=DBA_SCORE_RPKM_FOLD)

But I get the following error:

Error in pv.counts(DBA, peaks = peaks, minOverlap = minOverlap, defaultScore = score, :
  Some read files could not be accessed. See warnings for details.
In addition: Warning message:
not accessible

Based on my research, it looked to me as if it could not find my BAM files or that they were not indexed. But I am sure the working directory is set correctly and that the files are indexed, so I am not sure what might be happening.

Thank you!
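One way to narrow this down is to test the BAM paths exactly as they are written in the sample sheet. The sketch below is only a suggestion: check_bams is a made-up helper, it assumes the read files are listed in a bamReads column, and it assumes the index sits next to the BAM as either file.bam.bai or file.bai. The demo uses empty placeholder files, not real alignments.

```r
# Hypothetical helper: is each BAM readable, and is there an index beside it?
check_bams <- function(bams) {
  bams <- as.character(bams)
  bai_long  <- paste0(bams, ".bai")           # sample.bam.bai
  bai_short <- sub("\\.bam$", ".bai", bams)   # sample.bai
  data.frame(
    bam      = bams,
    readable = file.access(bams, mode = 4) == 0,
    indexed  = file.exists(bai_long) |
               (grepl("\\.bam$", bams) & file.exists(bai_short)),
    stringsAsFactors = FALSE
  )
}

# Demo with empty placeholder files in a temporary directory.
demo_dir <- tempfile("bamcheck")
dir.create(demo_dir)
ok_bam <- file.path(demo_dir, "a.bam")
file.create(ok_bam, paste0(ok_bam, ".bai"))
missing_bam <- file.path(demo_dir, "b.bam")

bam_report <- check_bams(c(ok_bam, missing_bam))
print(bam_report)
```

On the real sheet this would be check_bams(samples$bamReads), and the same for a bamControl column if you have one. Any row with readable = FALSE points to a path that R cannot open from the current working directory, even if the file exists somewhere else.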

