I want to create Python programs with the following specifications:
Purpose: Given a DNA sequence alignment, print to the screen any positions containing a putative polymorphism (transition, transversion or indel). The threshold for the polymorphic allele is defined by the user. The screen output will have the position and the two alleles with their counts and the minor allele frequency in an easy to read format.
The program will have the following command-line structure: findSNPs.py<input MSA><minor allele frequency threshold>
• input MSA – a DNA multiple sequence alignment in FASTA format • minor allele frequency threshold – To call an SNP at a position, a second allele must be present in at least this frequency. NOTE: Only A, T, C, and G are valid alleles. All other letters should be ignored.
I tried this scripts but no SNPs found:
from Bio import AlignIO y=0 alignment = AlignIO.read("DNA_alignment.fasta", "fasta") #print(alignment.get_alignment_length()) for r in range(0,len(alignment.seq)): if alignment[0,r] != alignment[1,r]: if alignment[0,r] != "-" and alignment[1,r] != "-": y=y+1 print (r, alignment[0,r], alignment[1,r], y) else: y=0 from snps import SNPs a= SNPs(alignment) a.snp_count
How can I solve this problem?