Tophat Pair-End Rnaseq Statistics
2
2
Entering edit mode
12.2 years ago
Xiaomin Yu ▴ 20

Hi, I am analyzing pair-end RNAseq with tophat, I want to know how many fragments (paired) mapped to sense/antisense of the reference sequencing, and how many single reads (not paired) mapped to sense/antisense of the reference, what's tools I could use for this purpose? Thanks.

tophat • 3.6k views
ADD COMMENT
1
Entering edit mode
12.2 years ago

You can get forward/reverse strand information by looking at the bitwise flag of the BAM/SAM file. [?]Here[?] is a good blog post on the bitwise flags. The post list some relavant flags:

0×0001 1 the read is paired in sequencing, no matter whether it is mapped in a pair
0×0002 2 the read is mapped in a proper pair (depends on the protocol, normally inferred during alignment)
0×0004 4 the query sequence itself is unmapped
0×0008 8 the mate is unmapped
0×0010 16 strand of the query (0 for forward; 1 for reverse strand)
0×0020 32 strand of the mate
0×0040 64 the read is the first read in a pair
0×0080 128 the read is the second read in a pair
0×0100 256 the alignment is not primary (a read having split hits may have multiple primary alignment records)

You can get the bitwise flag in python or perl with the bitwise operator '&'. For example in python:

if flag & 16:
   strand = 'reverse'
else
   strand = 'forward'
ADD COMMENT
0
Entering edit mode
12.1 years ago
Wen.Huang ★ 1.2k

perhaps use a pipe to filter your bam file first?

samtools view -b -f 0x0010 input.bam | samtools flagstat
ADD COMMENT
0
Entering edit mode

Because the tophat will output multiple records for one read, the "samtools view -f " will not work.

ADD REPLY

Login before adding your answer.

Traffic: 3000 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6