Tophat: Proper-Paired Reads
11.0 years ago
ai • 0

Hi Everyone,

I used Tophat v2.0.6 and GSNAP v2013-02-05 to align the same Illumina paired-end reads, but the numbers of properly paired reads were very different. I also ran cufflinks for assembly on both alignments and got a similar number of genes/isoforms. Is there anything wrong with my commands? Why did Tophat give such a low rate of properly paired reads? Thanks!

To count properly paired reads, I used: samtools view -f 0x2 accepted_hits.bam | cut -f1 | sort | uniq | wc -l

Tophat command:

tophat -G genes.gtf -o tophat --no-novel-juncs genome read_1 read_2

Results: 1,431,500 (6.34%) proper-paired reads; 3746 genes (cufflinks).

GSNAP command:

gsnap -A sam -N 0 -D ~/Software/gmap-2013-02-05/ -d mm10 -s mm10.splicesites.iit read_1 read_2

Results: 20,543,759 (90.9%) proper-paired reads; 3959 genes (cufflinks).

tophat • 2.7k views
11.0 years ago

Isn't cut -f1 | sort | uniq going to collapse the two mates of a properly paired fragment into a single read name?
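To illustrate the point, here is a minimal sketch using made-up, SAM-like records (the read names are hypothetical; only the name and flag columns are included). Both mates of a pair share the name in column 1, so the tail of the pipeline in the question counts fragments rather than reads:

```shell
# Four toy alignment records: two properly paired fragments, two mates each.
# Tab-separated; column 1 is the read name (QNAME), column 2 is the SAM flag.
records=$(printf 'frag1\t99\nfrag1\t147\nfrag2\t83\nfrag2\t163\n')

# Number of alignment records (one per mate in this toy input).
n_records=$(printf '%s\n' "$records" | wc -l | tr -d ' ')

# Number of unique read names, as computed by cut | sort | uniq | wc -l.
n_names=$(printf '%s\n' "$records" | cut -f1 | sort | uniq | wc -l | tr -d ' ')

echo "records: $n_records, unique names: $n_names"
```

With this input the unique-name count is half the record count, because each name appears once per mate.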

What does samtools flagstat accepted_hits.bam report as the number of properly paired reads?
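A small wrapper for that check might look like the sketch below. The function name report_proper_pairs is made up for illustration, and the script assumes samtools is on the PATH and that the BAM file sits in the current directory; it simply pulls the "properly paired" line out of the flagstat summary:

```shell
# Print the "properly paired" line from samtools flagstat for one BAM file.
# report_proper_pairs is a hypothetical helper, not part of samtools.
report_proper_pairs() {
  bam=$1
  if ! command -v samtools >/dev/null 2>&1; then
    echo "samtools not found on PATH" >&2
    return 1
  fi
  if [ ! -f "$bam" ]; then
    echo "no such file: $bam" >&2
    return 1
  fi
  samtools flagstat "$bam" | grep 'properly paired'
}

# Example call on the Tophat output from the question; do not abort if it is missing.
report_proper_pairs accepted_hits.bam || true
```

Running the same function on the GSNAP output (after converting it to BAM) would give a figure computed the same way for both aligners.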

11.0 years ago

This is not really an answer, but you should try '-c' to count. For example: samtools view -c -f 0x2 accepted_hits.bam. It should give you the number of properly paired, mapped alignment records directly, without piping through cut, sort, and uniq.
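For readers unsure what -f 0x2 selects, the sketch below mimics the filter with plain awk on made-up records (hypothetical read names; awk here is only a stand-in for samtools). A record passes when bit 0x2 of the flag in column 2 is set, which is the "properly paired" bit:

```shell
# Toy records: frag1 is a proper pair (flags 99 and 147);
# frag3 has one mapped mate (73) and one unmapped mate (133), so bit 0x2 is unset.
records=$(printf 'frag1\t99\nfrag1\t147\nfrag3\t73\nfrag3\t133\n')

# Count records whose flag has bit 0x2 set, using integer arithmetic
# so that it works in any POSIX awk (no bitwise functions needed).
n_proper=$(printf '%s\n' "$records" |
  awk -F '\t' 'int($2 / 2) % 2 == 1 { n++ } END { print n + 0 }')

echo "records with 0x2 set: $n_proper"
```

Note that this counts one record per mate, which is also what a record-level count on a BAM file gives, so the result is not directly comparable to a count of unique read names.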
