Hi
I am trying to understand SNP calling from Illumina paired-end sequencing. I have 30 genotypes of a species. I mapped each genotype to the reference individually with BWA and processed the output with samtools, so we now have one BAM file per genotype.
What I do not understand is the mpileup step used to call SNPs for each genotype.
For a given gene an individual can have two alleles, so there may be nucleotide differences within a single genotype. How can we then call such a difference a SNP when comparing with the reference? See position 7:
Pos   : 1 2 3 4 5 6 7 8 9
Ref   : A T A T A T G C G
for 1 : A T A T A T A C G
for 2 : A T A T A T A C G
for 3 : A T A T A T A C G
rev 4 : A T A T A T G C G
In paired-end sequencing, "for" can be a forward read and "rev" can be a reverse read from the opposite strand. I am confused.
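To make the example concrete, here is a minimal Python sketch of the per-column counting that a pileup represents. It is not what samtools mpileup does internally; it only tallies bases at each position and flags columns where a non-reference base is seen often enough. It assumes the four reads are already aligned over the full 9 bases with no indels and are written on the reference forward strand (in a BAM file, reverse-strand reads are stored that way). The function names and the `min_alt_reads` cutoff are hypothetical choices for illustration.

```python
from collections import Counter

# Toy data copied from the example above (positions 1..9).
reference = "ATATATGCG"
reads = ["ATATATACG", "ATATATACG", "ATATATACG", "ATATATGCG"]


def pileup_columns(reference, reads):
    """Yield (1-based position, reference base, base counts) for each column."""
    for i, ref_base in enumerate(reference):
        yield i + 1, ref_base, Counter(read[i] for read in reads)


def candidate_snps(reference, reads, min_alt_reads=2):
    """Return (pos, ref, alt, alt_reads, depth) where an alternate base is supported.

    min_alt_reads is an assumed cutoff, not a samtools/bcftools default.
    """
    hits = []
    for pos, ref_base, counts in pileup_columns(reference, reads):
        depth = sum(counts.values())
        for base, n in counts.items():
            if base != ref_base and n >= min_alt_reads:
                hits.append((pos, ref_base, base, n, depth))
    return hits


print(candidate_snps(reference, reads))
# prints [(7, 'G', 'A', 3, 4)]
```

Only position 7 is reported: three reads carry A and one carries the reference G, so both alleles are seen in the same sample.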
Thanks for the reply. How is a genome sequence assembled if a coding sequence is sometimes present on the opposite strand?
For example, radish has 9 chromosome pairs, 18 chromosomes in total, but only 9 chromosome sequences are treated as the whole genome sequence.
I am taking those 9 chromosome sequences as the reference for SNP detection.
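Typically, when there are multiple copies of a chromosome, only one copy is included in the reference, and SNPs are reported with respect to that canonical copy. A diploid sample can then be homozygous for the reference allele, heterozygous, or homozygous for the alternate allele at each site.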
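As a rough illustration of how a diploid sample is described against a single reference copy, the sketch below turns reference and alternate read counts at one site into a VCF-style genotype (0/0, 0/1, 1/1). This is only a simplified assumption for teaching purposes: real callers use likelihood models that account for base and mapping quality, and the function name and the `het_low`/`het_high` thresholds here are hypothetical.

```python
def call_genotype(ref_count, alt_count, het_low=0.2, het_high=0.8):
    """Return a VCF-style diploid genotype from read counts at one site.

    het_low and het_high are assumed illustrative thresholds on the
    fraction of reads carrying the alternate allele.
    """
    depth = ref_count + alt_count
    if depth == 0:
        return "./."  # no coverage, genotype unknown
    alt_fraction = alt_count / depth
    if alt_fraction < het_low:
        return "0/0"  # homozygous reference
    if alt_fraction > het_high:
        return "1/1"  # homozygous alternate
    return "0/1"      # heterozygous: both alleles present


# Position 7 of the example: 1 read with G (reference), 3 reads with A.
print(call_genotype(ref_count=1, alt_count=3))
# prints 0/1
```

With only four reads this call is very uncertain; in practice much higher depth is needed before a heterozygous site can be distinguished from sequencing error.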