Question: Calculate percentage mapped reads from Bowtie2/RSEM
4.1 years ago by
University of Tasmania
ando.kelli40 wrote:

Hey all,

I used the following script to align my reads to a reference transcriptome (bowtie2), and calculate transcript abundance (RSEM).

$TRINITY_HOME/util/ --transcripts location_of_data/assembly.fasta --seqType fq --est_method RSEM --aln_method bowtie2 --trinity_mode --thread_count 32 --output_dir location_of_output_dir --left left_reads.fq.gz --right right_reads.fq.gz --output_prefix sample_name

It all ran without any problems. However, when I tried to calculate the percentage of reads that mapped to the reference transcriptome, the output stated that 100% of reads were mapped, which is unusual.

I used samtools flagstat to generate statistics for the output .bam file for each sample, and it said 100% of reads were mapped.

I then used the RSEM script on the contents of the .stat output directory to generate statistics for the bowtie2 mapping. It reported the same: 100% of reads mapped, with ~65% uniquely aligned and ~35% multi-aligned.

Do the .stat output directory and the output .bam file from bowtie2 contain data only for mapped reads? Or is it possible that 100% of the reads really did map, since the reference transcriptome was assembled from those reads to begin with?

How else can I generate the statistic?

Cheers :-)


I've never seen an instance in which 100% of a nontrivial library mapped to anything. But when mapping to an assembly of those reads, it's possible. Normally that would mean you have excellent library-creation and read preprocessing procedures.

That said - you should inspect the command used in "". By default bowtie2 does output unmapped reads, but perhaps they are being suppressed or redirected.
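One way to check whether unmapped reads made it into the output is to count the records carrying the SAM "unmapped" flag (bit 0x4). This is only a minimal sketch: `count_unmapped_records` is a hypothetical helper name (it is not part of samtools or Trinity), and it assumes the alignments arrive as SAM text on stdin.

```shell
# Count SAM records flagged as unmapped (bit 0x4), reading SAM text from stdin.
# count_unmapped_records is a hypothetical helper, not a samtools or Trinity command.
count_unmapped_records() {
  awk -F '\t' '
    /^@/ { next }                     # skip SAM header lines
    int($2 / 4) % 2 == 1 { n++ }      # FLAG bit 0x4 set: record is unmapped
    END { print n + 0 }               # print 0 when no record matched
  '
}
```

Assuming the BAM is called sample.bam (a placeholder name), it could be used as `samtools view sample.bam | count_unmapped_records`. If that prints 0, either every read mapped or unmapped reads were never written to the file; comparing the number of primary records with the number of reads in the FASTQ files would distinguish the two cases.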

written 4.1 years ago by Brian Bushnell

Thanks Brian, I thought it was strange but technically not impossible. I'll have a look at the script :-)

written 4.0 years ago by ando.kelli
11 months ago by
rohitsatyam102220 wrote:

I don't think samtools flagstat is the best way to quantify successfully mapped reads. As Devon Ryan has mentioned elsewhere, samtools flagstat can report more reads than are actually present in the FASTQ file, because it also counts secondary alignments. In that case the statistics are only indicative and do not give the actual percentage of reads mapped (multimapping reads are counted multiple times), which can exaggerate the mapping percentage.
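To illustrate the point, here is a rough sketch that separates primary records from secondary (0x100) and supplementary (0x800) ones and computes the mapped percentage from the primary records only. `summarise_primary_alignments` is a hypothetical helper name, and it assumes SAM text on stdin; for paired-end data each mate is a separate primary record.

```shell
# Summarise primary vs. secondary/supplementary records in SAM text from stdin.
# Prints: primary_mapped primary_total extra_records percent_mapped
# summarise_primary_alignments is a hypothetical helper, not a samtools command.
summarise_primary_alignments() {
  awk -F '\t' '
    /^@/ { next }                                   # skip SAM header lines
    {
      flag = $2
      if (int(flag / 256) % 2 == 1 || int(flag / 2048) % 2 == 1) {
        extra++                                     # secondary or supplementary record
        next
      }
      total++                                       # one primary record per read (or mate)
      if (int(flag / 4) % 2 == 0) mapped++          # bit 0x4 clear: record is mapped
    }
    END {
      pct = (total > 0) ? 100 * mapped / total : 0
      printf "%d %d %d %.2f\n", mapped, total, extra, pct
    }
  '
}
```

With a BAM called sample.bam (a placeholder name), this could be run as `samtools view sample.bam | summarise_primary_alignments`. A large third number means many extra alignment records that would inflate a naive count. The percentage is only meaningful if unmapped reads were written to the file; otherwise the total number of reads in the FASTQ files is the safer denominator.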

Please correct me if I am missing something.

For more details, see the discussion here
