How to distinguish between artifacts and real indels?
6.5 years ago
genya35 ▴ 40


I'm working with multiple sequences (FASTA format) of the same human gene (exons 2, 3, 4). Each sequence is about 5300 nucleotides long. When I import the sequences into MEGA, I can see multiple "-" deletions and "|" insertions in seemingly random places. The sequences came from software that assigned a genotype allele to each sequence. When I view multiple sequences that presumably belong to the same genotype allele, I see that they don't align exactly because of these "artifacts".

Should I assume these indels are artifacts and remove them before doing the alignment? My goal is to find new variants outside exon 2. How would I know whether they are real variants and not artifacts?
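
To make the question concrete, here is a minimal sketch of one way the gaps could be tabulated. It assumes the sequences are already aligned to equal length and grouped by their assigned allele. The function name `gap_support` and the toy sequences are hypothetical, and only "-" gaps are counted. The working assumption, which is a heuristic and not a validated method, is that a gap shared by every sequence of the same allele is more likely real, while a gap seen in only one or a few sequences is a candidate artifact.

```python
from collections import Counter

def gap_support(aligned_seqs, gap_chars="-"):
    """Return {column_index: fraction of sequences with a gap at that column}.

    Only columns where at least one sequence has a gap are reported.
    Assumes all sequences are already aligned to the same length.
    """
    if not aligned_seqs:
        return {}
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    counts = Counter()
    for seq in aligned_seqs:
        for i, ch in enumerate(seq):
            if ch in gap_chars:
                counts[i] += 1

    n = len(aligned_seqs)
    return {i: c / n for i, c in sorted(counts.items())}

# Toy data (hypothetical): four sequences assigned to the same allele.
same_allele = ["ACGT-ACGT", "ACGT-ACGT", "ACGT-AC-T", "ACGT-ACGT"]
support = gap_support(same_allele)
for column, fraction in support.items():
    print(f"column {column}: gap in {fraction:.0%} of sequences")
```

In this toy example the gap at column 4 is present in all four sequences, while the gap at column 7 appears in only one, so column 7 would be the one to inspect more closely (for example, by checking whether it sits in a homopolymer run).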

Any advice will be greatly appreciated.


alignment • 1.3k views
