DiffBind consensus peak set query
8 months ago
brisbio ▴ 30

I know that when making a consensus peakset you can set a threshold for how many samples a peak has to occur in. However, I want to know whether you can also set this threshold by group. Say I have 3 groups (A, B, and C) with 4 samples in each. Instead of requiring a peak to occur in, say, 4 out of 12 samples, can I specify that I only want to include peaks that occur in at least 2 out of 4 samples in each group?

8 months ago
Rory Stark ★ 1.2k

This is the consensus-of-consensus approach, where you first make a consensus peakset for each sample group, then combine these into an overall consensus peakset that you use for counting. This is discussed in Section 8.2 of the DiffBind vignette.

The workflow is as follows (assuming the group designations A, B, and C are stored as the Condition attribute):

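# Load the samples; the Condition column of the sample sheet holds the groups A, B, and C.
myDBA <- dba(sampleSheet = "mysamples.csv")
# Add one consensus peakset per Condition, keeping peaks found in at least 2 samples of that group.
myDBA.cons <- dba.peakset(myDBA, consensus = DBA_CONDITION, minOverlap = 2)
# Keep only the per-group consensus peaksets; minOverlap = 1 retains a peak present in any of them.
myDBA.cons <- dba(myDBA.cons, mask = myDBA.cons$masks$Consensus, minOverlap = 1)
# Retrieve the merged consensus peaks (a GRanges object by default).
consensus.peaks <- dba.peakset(myDBA.cons, bRetrieve = TRUE)
# Count reads for all of the original samples over the merged consensus peaks.
myDBA.counts <- dba.count(myDBA, peaks = consensus.peaks)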
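
To make the selection rule concrete, here is a minimal base-R sketch on a made-up occupancy matrix. It does not call DiffBind; the matrix, the peak names, and the sample names are all hypothetical. It only illustrates the difference between "at least 2 of 4 samples in any group" (what minOverlap = 1 in the second step corresponds to) and "at least 2 of 4 samples in every group".

```r
# Toy occupancy matrix (hypothetical data): rows are candidate peaks,
# columns are samples, 1 means the peak was called in that sample.
occupancy <- matrix(
  c(1,1,0,0, 0,0,0,0, 1,0,0,0,   # peak1: 2 of 4 in A only
    1,1,1,0, 0,1,1,0, 1,1,0,1,   # peak2: at least 2 of 4 in every group
    1,0,0,0, 0,1,0,0, 0,0,1,0,   # peak3: 1 sample per group
    0,0,0,0, 1,1,1,1, 0,0,0,0),  # peak4: all 4 samples of B only
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("peak", 1:4),
                  paste0(rep(c("A", "B", "C"), each = 4), 1:4))
)
group <- rep(c("A", "B", "C"), each = 4)

# For each peak, count how many samples of each group contain it.
per_group <- t(rowsum(t(occupancy), group))

# A peak passes within a group if it occurs in at least 2 of that group's samples.
in_group <- per_group >= 2

# Peaks passing in at least one group (union of the per-group consensus sets).
any_group <- rownames(in_group)[rowSums(in_group) >= 1]

# Peaks passing in every group (intersection of the per-group consensus sets).
every_group <- rownames(in_group)[rowSums(in_group) == ncol(in_group)]

print(any_group)
print(every_group)
```

If the stricter "every group" reading is what is wanted, the analogous change in the workflow above would presumably be to raise minOverlap in the second dba() call to the number of groups (3 here). That is an assumption worth checking against the vignette before relying on it.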


Thank you so much Rory, that is really helpful!