Hi everybody, I have a set of single-end RNA-seq files. After mapping them to the human genome with STAR:
STAR --genomeDir /index --readFilesCommand zcat --readFilesIn file1.fastq.gz --runThreadN 16 --outSAMtype BAM SortedByCoordinate --outWigType bedGraph
I get a very high percentage of reads mapped to multiple loci (75.23%), while only 19.98% map uniquely:
Number of input reads | 49078760
Average input read length | 74
UNIQUE READS:
Uniquely mapped reads number | 9806908
Uniquely mapped reads % | 19.98%
Average mapped length | 73.42
Number of splices: Total | 1022832
Number of splices: Annotated (sjdb) | 985106
Number of splices: GT/AG | 1005417
Number of splices: GC/AG | 6850
Number of splices: AT/AC | 496
Number of splices: Non-canonical | 10069
Mismatch rate per base, % | 0.94%
Deletion rate per base | 0.02%
Deletion average length | 2.33
Insertion rate per base | 0.01%
Insertion average length | 1.22
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 36919747
% of reads mapped to multiple loci | 75.23%
Number of reads mapped to too many loci | 186914
% of reads mapped to too many loci | 0.38%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 3.97%
% of reads unmapped: other | 0.45%
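As a sanity check, the percentages in the report can be recomputed from the raw read counts. The snippet below is only a minimal sketch: it assumes the "key | value" layout shown in the statistics above, and the helper names parse_star_log and percent are hypothetical, not part of STAR.

```python
def parse_star_log(text):
    """Parse 'key | value' lines into a dict (hypothetical helper)."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers such as "UNIQUE READS:" carry no value
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def percent(stats, key):
    """Return the count stored under `key` as a percentage of the input reads."""
    total = int(stats["Number of input reads"])
    return 100 * int(stats[key]) / total


# Read counts copied from the report above.
LOG_TEXT = """\
Number of input reads | 49078760
UNIQUE READS:
Uniquely mapped reads number | 9806908
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 36919747
Number of reads mapped to too many loci | 186914
"""

stats = parse_star_log(LOG_TEXT)
for key in (
    "Uniquely mapped reads number",
    "Number of reads mapped to multiple loci",
    "Number of reads mapped to too many loci",
):
    print(f"{key}: {percent(stats, key):.2f}%")
```

To run it on a whole report rather than the pasted counts, the text of the Log.final.out file that STAR writes next to the BAM can be passed to parse_star_log instead of LOG_TEXT.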
I would be grateful for any suggestions.