number of reads mapped
8.6 years ago

Hi,

I have paired-end RNA-seq data from a human cell line that I have mapped to the human genome concatenated with a viral genome. I would like to calculate FPKM values for some viral genes, so I need to know the total number of mapped reads in each of my samples. How do I retrieve that? As I understand it, samtools flagstat gives me the number of alignments in the BAM file, but without excluding multimappers or chimeric entries.

Thanks!

RNA-Seq FPKM samtools • 2.2k views

You could also filter reads by mapping quality and count them, or use the FLAG information to count different sets of reads.
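To illustrate the idea, here is a minimal sketch in plain Python that counts mapped primary alignments above a MAPQ threshold from SAM text lines. The FLAG bit values come from the SAM specification; the function name `count_primary_mapped`, the toy records, and the threshold are made up for illustration. Keep in mind that the meaning of MAPQ values depends on the aligner, so the cutoff is something you would have to choose yourself.

```python
# FLAG bits as defined in the SAM specification
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_primary_mapped(sam_lines, min_mapq=0):
    """Count mapped primary alignments with MAPQ >= min_mapq (hypothetical helper)."""
    count = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2: FLAG
        mapq = int(fields[4])  # column 5: MAPQ
        # Drop unmapped reads and secondary/supplementary alignments
        if flag & (UNMAPPED | SECONDARY | SUPPLEMENTARY):
            continue
        if mapq >= min_mapq:
            count += 1
    return count


# Toy records, invented for this example (not real data)
example = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",    # primary, first mate
    "r1\t147\tchr1\t300\t60\t50M\t=\t100\t-250\t*\t*",  # primary, second mate
    "r2\t355\tchr2\t500\t0\t50M\t=\t700\t250\t*\t*",    # secondary (0x100 set)
    "r3\t2145\tvirus\t10\t60\t20S30M\t=\t10\t0\t*\t*",  # supplementary (0x800 set)
    "r4\t77\t*\t0\t0\t*\t*\t0\t0\t*\t*",                # unmapped (0x4 set)
    "r5\t0\tchr3\t10\t3\t50M\t*\t0\t0\t*\t*",           # primary, low MAPQ
]

print(count_primary_mapped(example))
print(count_primary_mapped(example, min_mapq=10))
```

Note that this counts each mate of a pair separately, so for paired-end data it gives reads rather than fragments.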

8.6 years ago

Although I don't completely understand what value you are looking for exactly, flagstat gives you all information stored within the FLAG field of the SAM (or BAM) file. If the aligner that you used set the flags according to the SAM format specifications, you should be able to identify secondary and supplementary alignments and flagstat will count them, too.

The SAM format specification specifically mentions how chimeric reads should be represented:

Typically, one of the linear alignments in a chimeric alignment is considered the "representative" alignment, and the others are called "supplementary" and are distinguished by the supplementary alignment flag. All the SAM records in a chimeric alignment have the same QNAME and the same values for 0x40 and 0x80 flags.

In order to exclude reads with a certain flag, you can use: samtools view -F <FLAG value to exclude> <in.sam>
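As a rough sketch of how the value passed to `-F` could be put together: the FLAG is a bit field, so the bits you want to exclude can be OR-ed into a single number. The bit values below are from the SAM specification; the file name `in.bam` is a placeholder, and using `-c` (which makes `samtools view` print only the count of matching records) is my suggestion rather than something stated above.

```python
# FLAG bits as defined in the SAM specification
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800

# Combine the bits to exclude into one mask for `samtools view -F`
exclude_mask = UNMAPPED | SECONDARY | SUPPLEMENTARY
print(exclude_mask, hex(exclude_mask))  # 2308 0x904

in_bam = "in.bam"  # placeholder file name
command = f"samtools view -c -F {exclude_mask} {in_bam}"
print(command)
```

For paired-end data this still counts both mates of each pair. If you want fragments instead, one option (an assumption on my part, not tested on your data) is to additionally require the first-in-pair bit with `-f 64`.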


The value I am looking for is the one I can use to normalize counts into FPKM values.
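For reference, here is a small sketch of how that total enters the usual FPKM definition (fragments per kilobase of transcript per million mapped fragments). The function name and the numbers are made up for illustration; which reads you include in the total (for example, primary alignments only, human plus viral) is a choice you have to make and then apply consistently across samples.

```python
def fpkm(gene_fragments, gene_length_bp, total_mapped_fragments):
    """FPKM = fragments on the gene * 1e9 / (gene length in bp * total mapped fragments)."""
    return gene_fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


# Invented numbers: 500 fragments on a 2,000 bp gene, 20 million mapped fragments in total
print(fpkm(500, 2000, 20_000_000))  # 12.5
```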
