How to make a read count matrix from multiple bed files generated by ROSE
19 months ago
Researcher ▴ 70

Hi all, I have 10 BED files generated by ROSE, each with distinct start-end coordinates for super-enhancers (SEs), from 10 different samples: 5 from one condition and 5 from the other. To check for differential binding at these SEs between the two conditions, I passed the sample-wise BED files to DiffBind as ChIP-seq peaks, along with their BAM files, using the sample sheet and commands below.

2cond_K27Ac_SE.csv has the following info:

SampleID Condition bamReads ControlID bamControl Peaks PeakCaller

sampleA C1 sampleA.bam A_Input A_Input.bam sampleA_K27Ac_SE.bed bed

sampleB C1 sampleB.bam B_Input B_Input.bam sampleB_K27Ac_SE.bed bed


sampleF C2 sampleF.bam F_Input F_Input.bam sampleF_K27Ac_SE.bed bed

sampleG C2 sampleG.bam G_Input G_Input.bam sampleG_K27Ac_SE.bed bed
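For illustration, the same sheet could also be built directly in R as a data frame instead of being read from the CSV. This is only a minimal sketch: it contains just the four rows shown above, and the `required` column check is my own addition, not something DiffBind asks for in this form.

```r
# Sample sheet as a data frame; only the four rows shown above are included.
samples <- data.frame(
  SampleID   = c("sampleA", "sampleB", "sampleF", "sampleG"),
  Condition  = c("C1", "C1", "C2", "C2"),
  bamReads   = c("sampleA.bam", "sampleB.bam", "sampleF.bam", "sampleG.bam"),
  ControlID  = c("A_Input", "B_Input", "F_Input", "G_Input"),
  bamControl = c("A_Input.bam", "B_Input.bam", "F_Input.bam", "G_Input.bam"),
  Peaks      = c("sampleA_K27Ac_SE.bed", "sampleB_K27Ac_SE.bed",
                 "sampleF_K27Ac_SE.bed", "sampleG_K27Ac_SE.bed"),
  PeakCaller = "bed",
  stringsAsFactors = FALSE
)

# Sanity check (my own addition): all columns used in the sheet are present.
required <- c("SampleID", "Condition", "bamReads", "ControlID",
              "bamControl", "Peaks", "PeakCaller")
missing_cols <- setdiff(required, colnames(samples))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# Number of samples per condition in this reduced sheet.
print(table(samples$Condition))
```

A data frame like this can be passed to `dba(sampleSheet=samples)` in the same way as the one returned by `read.csv`, provided the BAM and BED files exist at those paths.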

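library(DiffBind)

# Load the sample sheet and build the DBA object from the per-sample SE peak sets
samples <- read.csv("2cond_K27Ac_SE.csv")
DBdata <- dba(sampleSheet=samples)

# Count reads in the consensus regions; score = TMM-normalized, control-subtracted, full library size, CPM
DBdata_count <- dba.count(DBdata,score=DBA_SCORE_TMM_MINUS_FULL_CPM)
# Retrieve the scored regions as a data frame
counts <- dba.peakset(DBdata_count,bRetrieve=TRUE,DataType=DBA_DATA_FRAME, consensus=TRUE)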
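To get from that retrieved data frame to a plain matrix (regions as rows, samples as columns), something like the sketch below might work. It uses a toy stand-in for `counts` so that it runs without DiffBind or any BAM files; the coordinate column names `CHR`, `START` and `END`, and all of the values, are assumptions for illustration only and should be checked against the real object with `head(counts)`.

```r
# Toy stand-in for the retrieved data frame; values are made up for illustration.
# Column names CHR/START/END are an assumption about the real layout.
counts <- data.frame(
  CHR     = c("chr1", "chr1", "chr2"),
  START   = c(1000L, 50000L, 2000L),
  END     = c(9000L, 62000L, 15000L),
  sampleA = c(10.5, 3.2, 7.7),
  sampleB = c(11.0, 2.9, 8.1),
  sampleF = c(4.1, 9.8, 7.5),
  sampleG = c(3.9, 10.2, 7.9),
  stringsAsFactors = FALSE
)

# Everything that is not a coordinate column is treated as a sample column.
coord_cols  <- c("CHR", "START", "END")
sample_cols <- setdiff(colnames(counts), coord_cols)

# Numeric matrix of scores, one row per region.
count_matrix <- as.matrix(counts[, sample_cols])

# Label rows as chr:start-end; sprintf with %d avoids scientific notation
# in large coordinates.
rownames(count_matrix) <- sprintf("%s:%d-%d", counts$CHR,
                                  as.integer(counts$START),
                                  as.integer(counts$END))

print(count_matrix)
```

Note that with a normalized score such as `DBA_SCORE_TMM_MINUS_FULL_CPM` the matrix holds normalized values, not raw integer read counts.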

I am not sure whether the DiffBind approach above is a recommended way of getting normalized read counts for differential binding analysis. I am looking for your suggestions; please share your thoughts.


