I used STAR to map my RNAseq data, with 2-pass mapping and GTF files supplied. I am also interested in how the reads get aligned to the transcriptome, so I used the following command for the 2nd-pass mapping (the command for the 1st pass is similar):
STAR --runThreadN 16 --runMode alignReads --genomeDir 2nDpassGenomeDir --readFilesIn Trm_PE-2ms01e_R1.fastq Trm_PE-2ms01e_R2.fastq --outFileNamePrefix 2nDpassAlignment/2ms01e/2ms01e --outFilterMultimapNmax 10 --outSAMmapqUnique Integer0to255 --outSAMtype BAM SortedByCoordinate --outReadsUnmapped Fastx --outSAMattributes All --alignIntronMin 10 --quantMode TranscriptomeSAM GeneCounts
When I viewed the aligned BAM file from STAR in IGV, I was surprised to see that all the reads have very low mapping quality (0-3); in fact, more than 80% of the reads have mapQ 0. The rest have mapQ 1 or 3. This is the case for every gene.
I also ran awk to capture any reads with mapping quality above 5 and there were none.
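A check of this kind can be sketched as below. This is a minimal illustration, not necessarily the exact command used: the helper name mapq_hist is hypothetical, and the BAM path in the usage comment assumes STAR's usual "Aligned.sortedByCoord.out.bam" suffix appended to the prefix given above. It tallies how many records carry each MAPQ value (column 5 of the SAM record), which shows the whole distribution rather than only the reads above one cutoff.

```shell
# Tally the MAPQ column (field 5) of SAM records read from stdin.
# Header lines (starting with '@') are skipped.
# Output: one line per MAPQ value, "<mapq><TAB><count>", sorted numerically.
mapq_hist() {
  awk -F'\t' '!/^@/ { n[$5]++ } END { for (q in n) print q "\t" n[q] }' | sort -n
}

# Typical use (assumes samtools is installed; the file name is an assumption):
# samtools view 2nDpassAlignment/2ms01e/2ms01eAligned.sortedByCoord.out.bam | mapq_hist
```

If every line of the output has a MAPQ of 3 or lower, the filter for mapping quality above 5 will indeed return nothing.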
Also, the BAM file generated by aligning the RNAseq reads to the transcriptome has lots of reads with mapping quality 0 (zero).
For confirmation, I compared genome sequence data (from several samples, aligned using BWA) at the same regions for several genes, and I see that there are lots of reads with mapQ above 40. Also, the reads from the RNAseq and genome sequence data share several variants at the same sites.
What is causing the RNAseq reads aligned by STAR to have such low mapping quality at positions where the BWA-aligned genomic reads have high mapping quality?
Is something wrong with the script?
Is something different with how STAR assigns mapping quality?
Or is there some other problem?
Thanks in advance, - B