Question

How Do You Get Tag Counts From Paired-End Reads?

0

Entering edit mode

10.8 years ago

KCC ★ 4.1k

I was wondering what approaches are commonly used to get tag counts from paired-end reads in ChIP-seq data. Naively, I wrote a script that takes each pair of paired-end reads and stitches them together to make a fragment. So my pipeline is:

Convert BAM to BED:

bamToBed -i reads.bam > reads.bed
Merge reads into fragments (I wrote a custom script here):

python stitch.py reads.bed > fragments.bed

EXAMPLE:

read1: chr1 5 10 +

read2: chr1 41 50 -

fragment: chr1 5 50 +

BED to BedGraph:

genomeCoverageBed -bga -i fragment.bed -g chrom.sizes > fragment.bedGraph

Basically, I am creating counts between the two reads to fill out the whole fragment. I could see someone arguing that I am creating data. So, I was wondering if my approach is similar to what others do when trying to visualize paired-end data.

paired genome coverage • 2.8k views

ADD COMMENT • link updated 10.8 years ago by Istvan Albert 100k • written 10.8 years ago by KCC ★ 4.1k

0

Entering edit mode

I would vote that this is kind of "creating" data. You're not sequencing the insert, so why use it as evidence for anything? If you need tag counts, why not count only reads mapping in a proper pair (by SAM flag)?

ADD REPLY • link 10.8 years ago by Matt Shirley 10k

score 1 · Answer 1 · 2013-07-16

1

Entering edit mode

10.8 years ago

Istvan Albert 100k

In the end this all boils down to what are you using this data for. This extension method could be useful in finding out which bases are most often covered by the fragments.

On the other hand it should not be passed to methods that make use of statistical methods to infer various attributes as it would mix data that was measured with data that was 'created'.

ADD COMMENT • link 10.8 years ago by Istvan Albert 100k

0

Entering edit mode

Yes, ultimately I am using it to provide a more refined version of tag extension which is common in ChIP-seq data.

ADD REPLY • link 10.8 years ago by KCC ★ 4.1k