I am using bowtie2 to align sequences to a reference genome. The results are quite disappointing: 48% of the reads align exactly once and 44% align more than once.
I have single-end reads 55-70 bp long. The reference genome is OreoNil2 (Oreochromis niloticus).
I am not sure about this, but I guess that each alignment of a read that aligns multiple times has its own score, reflecting how well it matches the reference genome. I would like to extract into a new SAM file the reads that align only once (48%) and, for the reads that align multiple times, the alignment with the best score.
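To make the idea concrete, here is a minimal sketch of the filter I have in mind. It rests on assumptions I have not verified on my data: that bowtie2 was run in its default mode (one reported alignment per read, no -k), that the AS:i tag holds the score of the reported alignment, and that the XS:i tag is only present when another alignment was found for the same read. The function names and the keep_ties option are my own, not part of bowtie2.

```python
def get_int_tag(fields, tag):
    """Return the integer value of an optional SAM tag (e.g. AS:i), or None."""
    prefix = tag + ":i:"
    for field in fields[11:]:  # optional tags start at column 12
        if field.startswith(prefix):
            return int(field[len(prefix):])
    return None


def keep_record(line, keep_ties=False):
    """Decide whether one SAM alignment line passes the filter."""
    fields = line.rstrip("\n").split("\t")
    if int(fields[1]) & 4:  # FLAG bit 0x4: read is unmapped
        return False
    as_score = get_int_tag(fields, "AS")
    xs_score = get_int_tag(fields, "XS")
    if as_score is None:
        return False
    if xs_score is None:  # assumed: no second alignment was found
        return True
    if as_score > xs_score:  # reported alignment is strictly the best
        return True
    return keep_ties and as_score == xs_score


def filter_sam(lines, keep_ties=False):
    """Yield header lines plus the alignment lines that pass keep_record."""
    for line in lines:
        if line.startswith("@") or keep_record(line, keep_ties):
            yield line


def filter_sam_file(in_path, out_path, keep_ties=False):
    """Apply filter_sam to a SAM file on disk (paths are placeholders)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_sam(src, keep_ties))
```

I would call it as filter_sam_file("aligned.sam", "filtered.sam"), where both file names are placeholders. By default, reads whose best and second-best scores are equal are dropped, because there is no single best alignment for them; keep_ties=True would retain whichever of the tied alignments was reported.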
Does anybody know whether this is possible and how to do it? Would I introduce any bias by picking those reads?
Thanks in advance!