I am new to RNA-seq data analysis and need some help. I sampled 1,000,000 reads from a yeast mRNA-seq experiment (SRR1177156). The library is single-end and forward-stranded, with 50 bp reads, sequenced from the reference strain S288C grown under standard conditions. I then built a reference genome index with STAR, using the latest genome sequence and annotation from GDB. I mapped the reads to the genome, again with STAR, and then created a QA report using Qualimap (attached).
I was quite surprised to see that only ~53% of the reads were reliably mapped. As you can see in the attached report, there are many secondary and non-unique alignments. I looked at some example data sets provided by Qualimap, and there the mapping proportion is usually close to 100%. Can you help me think of possible reasons why I'm getting such a low value?
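In case it helps to cross-check the Qualimap numbers, here is a minimal sketch of how the alignments could be tallied straight from STAR's SAM output. It relies only on standard SAM fields: flag bit 0x4 (unmapped), bits 0x100/0x800 (secondary/supplementary), and the `NH:i:` tag, which STAR writes by default to report how many loci a read maps to. The function name `classify_sam_lines` and the example records are hypothetical, and treating a record without an `NH` tag as unique is an assumption.

```python
from collections import Counter


def classify_sam_lines(lines):
    """Tally single-end SAM records by mapping status.

    Hypothetical helper: assumes one primary record per read (single-end data)
    and that a missing NH tag means the read mapped to a single locus.
    """
    counts = Counter()
    for line in lines:
        # Skip blank lines and header lines (SAM read names cannot start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x900:  # 0x100 secondary or 0x800 supplementary alignment
            counts["secondary_or_supplementary"] += 1
            continue
        if flag & 0x4:  # read is unmapped
            counts["unmapped"] += 1
            continue
        nh = 1  # assumption: no NH tag means a single reported locus
        for tag in fields[11:]:
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
                break
        counts["unique" if nh == 1 else "multi"] += 1
    return counts


# Made-up records for illustration only (not taken from SRR1177156).
example = [
    "@HD\tVN:1.4",
    "r1\t0\tchrI\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
    "r2\t0\tchrI\t200\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r2\t256\tchrII\t300\t3\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:2",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(dict(classify_sam_lines(example)))
```

To run this on real output, pass an open file handle instead of the list (STAR's default SAM file name is `Aligned.out.sam`). Note that STAR does not write unmapped reads to the SAM file unless `--outSAMunmapped Within` is set, so the unmapped count may be zero otherwise.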
Thanks a lot!