Question: Are the SNPs identified in the given plot from paralogs or true variants (alleles)?
I am trying to separate paralogous alignments from true alleles. Generally, MAPQ values and/or coverage at a locus give an indication of paralogous alignment. Other than that, one needs to rely on what is seen in the alignment itself.

In the given example (screenshot), I am a little doubtful about what I am seeing.

In the following IGV screenshot there are two biological samples: the first is deduped_MA605.bam (along with two filtered versions of it, which aren't important here) and the second is deduped_Sp164.bam. These samples come from two different populations that diverged about 10-35 K years ago. The reads are aligned to the reference genome, and the screenshot (with the observed variants) shows an exon of a gene.

  • Sample MA605 has only 1 variant allele in that window, while sample Sp164 has many more variants (8 SNPs) within a 100 bp window, which is more than expected.

  • The coverage at this locus was decent (not too high relative to the expected depth) for both samples.

  • The mapping quality is 60 for both samples.
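
To make the three checks above concrete, here is a minimal sketch of the kind of screen I have in mind. It is not taken from any tool: the function name `paralog_flags`, the thresholds (more than 3 SNPs per 100 bp, depth above 1.5x the expected depth, mean MAPQ below 30) and the SNP positions in the usage lines are all assumptions made up for illustration.

```python
from typing import List


def paralog_flags(
    snp_positions: List[int],
    depth: float,
    expected_depth: float,
    mean_mapq: float,
    window: int = 100,             # window size in bp (assumed)
    max_snps: int = 3,             # SNPs tolerated per window (assumed)
    max_depth_ratio: float = 1.5,  # depth / expected depth cutoff (assumed)
    min_mapq: float = 30.0,        # mean MAPQ cutoff (assumed)
) -> List[str]:
    """Return the reasons, if any, why a locus looks like a paralogous alignment."""
    flags = []

    # SNP density: largest number of SNPs that fit inside any `window`-bp span.
    pos = sorted(snp_positions)
    densest = 0
    left = 0
    for right in range(len(pos)):
        # Shrink the span from the left until it is shorter than `window`.
        while pos[right] - pos[left] >= window:
            left += 1
        densest = max(densest, right - left + 1)
    if densest > max_snps:
        flags.append("snp_density")

    # Excess coverage: collapsed paralogs tend to pile up reads.
    if expected_depth > 0 and depth / expected_depth > max_depth_ratio:
        flags.append("excess_coverage")

    # Low mapping quality: reads that map equally well elsewhere.
    if mean_mapq < min_mapq:
        flags.append("low_mapq")

    return flags


# Hypothetical positions mimicking the two samples (not the real coordinates).
sp164_like = paralog_flags([10, 20, 30, 40, 50, 60, 70, 80], 30, 30, 60)
ma605_like = paralog_flags([50], 30, 30, 60)
print(sp164_like, ma605_like)
```

With numbers like mine, only the SNP density check would fire for Sp164, while coverage and MAPQ look normal, which is exactly why I am unsure how to interpret this locus.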

Could the observed variants come from a paralog? I know a definitive answer would need support from other sequences, but I would like to hear what people think, and why they think it is a real variant vs. a paralog.


