STAR multiple mapping
12 months ago

Hello Everyone.

I am running the STAR aligner on a single short read; the read and its alignments are shown below. I do not understand why it maps this way. Since my command sets --outFilterMismatchNmax 2, I expected at most 2 mismatches per alignment. I would appreciate it if the experts could explain.

Command used:

STAR --genomeDir referencegenome --runThreadN 16 --readFilesIn testE1.fastq --genomeLoad LoadAndKeep --outFilterMismatchNmax 2 --limitBAMsortRAM 75000000000 --sjdbOverhang 100 --outFileNamePrefix OutfileE1 --outSAMunmapped Within --outSAMattributes Standard


SAM output:

GGGTTCCAGGAGACCCGGGTTCGTTCCCGGCATCGGAG  16  5   9815402 0   15M1D18M1S  *   0   0   CTCCGATGCCGGGAACGAACCCGGGTCTCCTGGA
GGGTTCCAGGAGACCCGGGTTCGTTCCCGGCATCGGAG  272 5   18542980    0   15M1D18M1S  *   0   0   CTCCGATGCCGGGAACGAACCCGGGTCTCCTGGA
GGGTTCCAGGAGACCCGGGTTCGTTCCCGGCATCGGAG  272 5   26653270    0   15M1D18M1S  *   0   0   CTCCGATGCCGGGAACGAACCCGGGTCTCCTGGA
GGGTTCCAGGAGACCCGGGTTCGTTCCCGGCATCGGAG  272 4   17500807    0   15M1D18M1S  *   0   0   CTCCGATGCCGGGAACGAACCCGGGTCTCCTGGA
GGGTTCCAGGAGACCCGGGTTCGTTCCCGGCATCGGAG  256 1   2939455 0   1S18M1D15M  *   0   0   TCCAGGAGACCCGGGTTCGTTCCCGGCATCGGAG
GGGTTCCAGGAGACCCGGGTTCGTTCCCGGCATCGGAG  272 1   6995282 0   15M1D18M1S  *   0   0   CTCCGATGCCGGGAACGAACCCGGGTCTCCTGGA
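
For anyone reading the records above: the second column is the SAM FLAG, a bit field. Here is a minimal Python sketch that decodes the three values seen here (16, 272, 256). The bit meanings come from the SAM specification; `decode_flag` and the label strings are just names made up for this illustration.

```python
# SAM FLAG bits as listed in the SAM specification.
# The label strings are informal names chosen for this sketch.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


for flag in (16, 272, 256):
    print(flag, decode_flag(flag))
# 16 ['reverse_strand']
# 272 ['reverse_strand', 'secondary']
# 256 ['secondary']
```

So the first record is the primary alignment (reverse strand) and the other five are secondary alignments of the same read, i.e. the read is reported at six loci.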


Kindly let me know if you need any further details.

Thanks and regards,

Niranjan

12 months ago
swbarnes2 9.8k

Looks like two mismatches to me, a one-base deletion in the middle, and one base soft-clipped at the end.
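
To make the CIGAR reading concrete, here is a small sketch that tallies the operations in `15M1D18M1S`. The helper names (`cigar_summary`, `query_length`, `reference_span`) are invented for this example. One caveat: in the SAM specification `M` means "alignment match", which covers both matching and mismatching bases, so the CIGAR alone cannot tell you how many bases mismatch. STAR reports that number in the `nM` attribute, which is part of `--outSAMattributes Standard` (NH HI AS nM); those columns are not included in the output posted above.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_summary(cigar):
    """Sum the lengths of each CIGAR operation."""
    totals = Counter()
    for length, op in CIGAR_OP.findall(cigar):
        totals[op] += int(length)
    return totals


def query_length(totals):
    """Bases of the read covered by the CIGAR (M, I, S, =, X consume the read)."""
    return sum(totals[op] for op in "MIS=X")


def reference_span(totals):
    """Bases of the reference covered (M, D, N, =, X consume the reference)."""
    return sum(totals[op] for op in "MDN=X")


totals = cigar_summary("15M1D18M1S")
print(dict(totals))            # {'M': 33, 'D': 1, 'S': 1}
print(query_length(totals))    # 34
print(reference_span(totals))  # 34
```

The 34 read bases agree with the length of the sequence column in the posted records: 33 aligned bases, plus one soft-clipped base, with one reference base skipped by the deletion.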

Could you please explain how that comes to 2 mismatches? Maybe my assumption about what counts as a mismatch is wrong. An alignment laid out like the one below would really help.

        GGG TTCCAGG A G ACCCGGGTTCGTTCCCGGCATCGGAG
CTCC  GATGCCGGGAACGAACCCGGGT C T CC     T GGA
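
One thing that may help when lining the sequences up by hand: for records with the reverse-strand flag (16 or 272), the SAM sequence column holds the reverse complement of the read, so it will not line up directly against the read as sequenced. A quick check in Python, using only strings copied from the records posted in the question (`reverse_complement` is a helper written for this sketch):

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


# Strings copied from the SAM records in the question.
first_column = "GGGTTCCAGGAGACCCGGGTTCGTTCCCGGCATCGGAG"
seq_reverse = "CTCCGATGCCGGGAACGAACCCGGGTCTCCTGGA"  # records with flag 16 / 272
seq_forward = "TCCAGGAGACCCGGGTTCGTTCCCGGCATCGGAG"  # record with flag 256

print(reverse_complement(seq_reverse) == seq_forward)  # True
print(first_column.endswith(seq_forward))              # True
print(len(first_column), len(seq_forward))             # 38 34
```

So the forward and reverse records carry the same 34 bases, and `1S18M1D15M` is just `15M1D18M1S` read from the other end. The first column as posted is 4 bases longer than the sequence column (the leading GGGT); I cannot tell from the post alone why that is.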