bedGraph for Coverage of Insert from Paired-End Data?
1
0
Entering edit mode
7 months ago
vanbelj ▴ 40

I have bam files of paired-end sequencing data. I want a bedGraph file that quantifies coverage across the genome, where each set of reads and the insert between them to be counted once. For instance, [Read1---insert---Read2] would all receive a coverage value of 1.

From what I can tell, bam files contain alignment information for the paired reads, and while the insert between the two reads can be inferred, coverage calculations exclude the insert and only use read coverage. Is this correct?

I've found that I can convert bam to bed with bedtools using the Paired-End mode to get read-pairs on the same line. I can see how I could use awk to calculate coverage that counts Read1--insert--Read2 once, but I would rather use a package if it's available.

Does anyone know if bedtools genomecov can take either the bam or the bedPE as input and quantify coverage of the the Reads+insert across the genome?

#Taken from bedtools website.  
$ bedtools bamtobed -i reads.bam -bedpe | head -3
chr7   118965072   118965122   chr7   118970079   118970129 TUPAC_0001:3:1:0:1452#0 37     +     -
chr11  46765606    46765656    chr11  46769934    46769984 TUPAC_0001:3:1:0:1472#0 37     +     -
chr20  54704674    54704724    chr20  54708987    54709037 TUPAC_0001:3:1:1:1833#0 37     +     -
genome awk coverage bed paired-end • 489 views
ADD COMMENT
1
Entering edit mode
7 months ago
shelkmike ★ 1.2k

"bedtools genomecov" can do this.
See the options "-pc" and "-fs" at https://bedtools.readthedocs.io/en/latest/content/tools/genomecov.html , and also the paragraph "Coverage by fragment" there.

ADD COMMENT

Login before adding your answer.

Traffic: 2419 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6