Question: How Do You Get Tag Counts From Paired-End Reads?
gravatar for KCC
5.7 years ago by
Cambridge, MA
KCC3.9k wrote:

I was wondering what approaches are commonly used to get tag counts from paired-end reads in ChIP-seq data. Naively, I wrote a script that takes each pair of paired-end reads and stitches them together to make a fragment. So my pipeline is:

  • Convert BAM to BED:

    bamToBed -i reads.bam > reads.bed

  • Merge reads into fragments (I wrote a custom script here):

    python reads.bed > fragments.bed


read1: chr1 5 10 +

read2: chr1 41 50 -

fragment: chr1 5 50 +

  • BED to BedGraph:

    genomeCoverageBed -bga -i fragment.bed -g chrom.sizes > fragment.bedGraph

Basically, I am creating counts between the two reads to fill out the whole fragment. I could see someone arguing that I am creating data. So, I was wondering if my approach is similar to what others do when trying to visualize paired-end data.

genome paired coverage • 1.8k views
ADD COMMENTlink modified 5.7 years ago by Istvan Albert ♦♦ 79k • written 5.7 years ago by KCC3.9k

I would vote that this is kind of "creating" data. You're not sequencing the insert, so why use it as evidence for anything? If you need tag counts, why not count only reads mapping in a proper pair (by SAM flag)?

ADD REPLYlink written 5.7 years ago by Matt Shirley8.9k
gravatar for Istvan Albert
5.7 years ago by
Istvan Albert ♦♦ 79k
University Park, USA
Istvan Albert ♦♦ 79k wrote:

In the end this all boils down to what are you using this data for. This extension method could be useful in finding out which bases are most often covered by the fragments.

On the other hand it should not be passed to methods that make use of statistical methods to infer various attributes as it would mix data that was measured with data that was 'created'.

ADD COMMENTlink written 5.7 years ago by Istvan Albert ♦♦ 79k

Yes, ultimately I am using it to provide a more refined version of tag extension which is common in ChIP-seq data.

ADD REPLYlink written 5.7 years ago by KCC3.9k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1259 users visited in the last hour