Good statistics for RNA-seq alignments using RSEM
2.0 years ago
Thanh • 0

Hi, after calculating expression from my raw read files, I retrieved these statistics from the cnt file in the stat folder:

17572420 28769454 0 46341874

27465439 1304015 11918214

60305896 3

I'm quite concerned that the number of unalignable reads is about 2/3 of the number of alignable reads. However, my reference transcriptome is the one provided on the RSEM website, which only includes RefSeq transcripts with the NM prefix.

Does this mean that the unalignable reads may belong to noncoding sequences, miRNAs, etc. instead of mature RNAs? And are these alignment statistics good enough to proceed to differential expression analysis?
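For anyone checking the arithmetic, here is a minimal Python sketch that turns the three lines above into rates. The field names are my reading of the RSEM cnt file description (line 1: unalignable, alignable, filtered, total; line 2: unique, multi, uncertain; line 3: number of alignments and read type), so treat them as an assumption and confirm against the documentation for your RSEM version. `parse_rsem_cnt` is a hypothetical helper, not part of RSEM.

```python
def parse_rsem_cnt(text):
    """Parse the first three non-empty lines of an RSEM .cnt file.

    Field names are assumed from the RSEM cnt file description;
    verify them against your RSEM version.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    n0, n1, n2, n_tot = (int(x) for x in rows[0])
    n_unique, n_multi, n_uncertain = (int(x) for x in rows[1])
    n_hits, read_type = (int(x) for x in rows[2])
    return {
        "unalignable": n0,
        "alignable": n1,
        "filtered": n2,
        "total": n_tot,
        "unique": n_unique,
        "multi": n_multi,
        "uncertain": n_uncertain,
        "hits": n_hits,
        "read_type": read_type,
    }


# The numbers posted in the question.
cnt_text = """17572420 28769454 0 46341874
27465439 1304015 11918214
60305896 3
"""

stats = parse_rsem_cnt(cnt_text)
aligned_fraction = stats["alignable"] / stats["total"]
unaligned_to_aligned = stats["unalignable"] / stats["alignable"]

print(f"aligned fraction of all reads: {aligned_fraction:.1%}")
print(f"unalignable / alignable:       {unaligned_to_aligned:.2f}")
```

With the posted numbers this gives an aligned fraction of roughly 62% and an unalignable-to-alignable ratio of roughly 0.61, which matches the "about 2/3" estimate.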

cnt rsem stat rsem-calculate-expression alignment • 726 views

Is this standard RNA-seq? Did you do poly-A enrichment or ribosomal depletion?

Some of the reasons why you might get low mapping rates:

  • There is a high level of adapter-only reads
  • The inserts are very short due to RNA degradation
  • There is a high level of ribosomal RNA contamination
  • The base quality scores are very low
  • There was a mix-up with the reference genome
  • There was contamination with genomic DNA

So you should look further into the QC to eliminate the above possibilities.
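As a rough first look at the adapter and short-insert possibilities, a sketch like the following counts how many reads contain an adapter prefix and how many of those have the adapter starting very early in the read. It assumes uncompressed 4-line FASTQ records and the common Illumina adapter prefix `AGATCGGAAGAGC`; the function name, the `min_insert` cutoff, and the toy reads are all made up for illustration. A dedicated QC tool will do this far more thoroughly.

```python
def adapter_summary(fastq_lines, adapter="AGATCGGAAGAGC", min_insert=20):
    """Count reads containing `adapter`, and reads whose insert
    (sequence before the adapter) is shorter than `min_insert`.

    Assumes plain 4-line FASTQ records; only exact matches are counted.
    """
    total = with_adapter = short_insert = 0
    for i, line in enumerate(fastq_lines):
        if i % 4 != 1:  # the sequence is the 2nd line of each record
            continue
        seq = line.strip().upper()
        total += 1
        pos = seq.find(adapter)
        if pos != -1:
            with_adapter += 1
            if pos < min_insert:
                short_insert += 1
    return total, with_adapter, short_insert


# Toy records (not real data): adapter-only, adapter after 28 nt, no adapter.
adapter = "AGATCGGAAGAGC"
reads = [adapter + "A" * 37, "ACGT" * 7 + adapter, "ACGT" * 12]
toy_fastq = []
for n, seq in enumerate(reads, start=1):
    toy_fastq += [f"@read{n}", seq, "+", "I" * len(seq)]

total, with_adapter, short_insert = adapter_summary(toy_fastq)
print(f"{with_adapter}/{total} reads contain the adapter, "
      f"{short_insert} with an insert shorter than 20 nt")
```

For a real file you would pass an open file handle instead of the toy list, for example `adapter_summary(open("reads_1.fastq"))`, where the file name is a placeholder.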

