Hi everyone.
I’m performing an H3K27ac ChIP-seq analysis on mouse samples from an early-stage model of a neurodegenerative disease. From what I’m seeing, I’m not finding many differences between healthy and diseased mice (only two biological replicates).
In my ChIP-seq analysis with DiffBind, when I reach the dba.analyze step and choose the method, I’ve run into a situation that I don’t fully understand. When I set method = DBA_ALL_METHODS, the analysis runs correctly and reports a total of 1,599 differential peaks using dba.edgeR. However, when I use method = DBA_EDGER, I get an error: “diffbind error in qr.default(contrast): na/nan/inf in foreign function call (arg 1)”. I don’t understand why this error appears, since with DBA_ALL_METHODS the error doesn’t occur and it still reports results for edgeR.
Related to this, I also tried generating MA plots, and the outcome seems quite odd, especially considering that edgeR uses stricter normalization conditions than DESeq2. My FRiP values are very low, as I mentioned… In summary, I’m not sure whether I can trust the results I’m getting, even though I believe I’m following the methodology correctly.
Given that the differences seem so subtle, how can I reliably evaluate them in this case? Thank you very much for any insights!
DIFFBIND ANALYSIS
# BUILT METADATA
> metadata <- data.frame(SampleID = samplesID, \
Factor = factor, \
Tissue = tissue, \
Condition = condition, \
Replicate = replicate, \
bamReads = bamReads, \
ControlID = controlID, \
bamControl = bamControl, \
Peaks = peaks, \
PeakCaller = peakcaller, \
peakFormat = peakformat, \
ScoreCol = 5 \
)
# CREATE DBA OBJECT
> dbObj <- dba(sampleSheet=metadata)
# CREATE CONSENSUS PEAKSET
> consensusPeaks <- dba.peakset(dbObj, consensus = DBA_CONDITION, minOverlap = 2) \
> consensus <- dba(consensusPeaks, consensusPeaks$masks$Consensus, minOverlap=1) \
> consensus.peaks <- dba.peakset(consensus, bRetrieve=TRUE)
# COUNT
> dbObj.counted <-dba.count(dbObj, peaks = consensus.peaks, summits = 500, bParallel = TRUE, bSubControl = FALSE, minOverlap=2)
# CONTRAST
> dbObj.counted <- dba.contrast(dbObj.counted, categories = DBA_CONDITION, minMembers = 2, reorderMeta=list(Condition=c("wt","htt")))
# ANALYZE
> dbObj.counted <- dba.analyze(dbObj.counted, method = DBA_ALL_METHODS)
Applying Blacklist/Greylists... \
Genome detected: Mmusculus.UCSC.mm10 \
Applying blacklist... \
Removed: 43 of 43259 intervals. \
Counting control reads for greylist... \
Building greylist: \/mnt/Disco2/IreneCE/chipseq/results/2_bowtie2/1_filter/wt_input_no_chrM_chrUn_chrY_q20_sorted_nd_sorted_NoBlackList_sorted.bam
coverage: 288256 bp (0.01%) \
Building greylist: \/mnt/Disco2/IreneCE/chipseq/results/2_bowtie2/1_filter/htt_input_no_chrM_chrUn_chrY_q20_sorted_nd_sorted_NoBlackList_sorted.bam
coverage: 510464 bp (0.02%) \
wt_input: 110 ranges, 288256 bases \
htt_input: 167 ranges, 510464 bases \
Master greylist: 203 ranges, 585216 bases \
Removed: 42 of 43216 intervals. \
Removed 85 (of 43259) consensus peaks. \
Normalize edgeR with defaults... \
Normalize DESeq2 with defaults... \
Analyzing... \
gene-wise dispersion estimates \
mean-dispersion relationship \
final dispersion estimates
# MA PLOT
> par(mfrow=c(3,1)) \
> dba.plotMA(dbObj.counted,bNormalized=FALSE) \
> dba.plotMA(dbObj.counted, method=DBA_DESEQ2) \
> dba.plotMA(dbObj.counted, method=DBA_EDGER)
I use the broadPeak option of MACS3, and i have really big peaks, i do not know if i should use the summit option
Min. 1st Qu. Median Mean 3rd Qu. Max. 266 1213 1959 2828 3290 90541