I am trying to count the number of haplotypes in a group of sequences using Biopython. My sequences contain IUPAC ambiguity codes, which should not count as differences. E.g.:
ACTYG == ACTCG should be: True
I've tried comparing with ==, but Biopython seems to treat the sequences as plain strings for this comparison and does not take the IUPAC codes into account.
I've also tried to use
    hapcount = len(recordlist)
    for a, b in itertools.combinations(recordlist, 2):
        if calculator._pairwise(a, b) == 0:
            print("match")
            hapcount -= 1
but the pairwise comparison seems to count IUPAC codes as differences too.
Any ideas on how to do this would be much appreciated. It seems like something simple.
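To make the goal concrete, here is a minimal sketch of the behaviour I am after, in plain Python rather than through a Biopython call. The names `iupac_match` and `count_haplotypes` are hypothetical helpers, not part of Biopython, and the code table is typed out by hand (I believe Biopython ships an equivalent mapping as `Bio.Data.IUPACData.ambiguous_dna_values`). It assumes aligned, equal-length DNA sequences with no gap characters, and it takes strings or `Seq` objects (for a `SeqRecord`, pass `record.seq`).

```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
IUPAC_SETS = {code: frozenset(bases) for code, bases in IUPAC.items()}


def iupac_match(seq_a, seq_b):
    """Return True if the two sequences are compatible at every position.

    Two symbols are compatible when the sets of bases they can represent
    overlap. Sequences of different length never match. Symbols outside
    the table (e.g. gaps) raise KeyError; that is an assumption of this sketch.
    """
    a, b = str(seq_a).upper(), str(seq_b).upper()
    if len(a) != len(b):
        return False
    return all(IUPAC_SETS[x] & IUPAC_SETS[y] for x, y in zip(a, b))


def count_haplotypes(seqs):
    """Greedy count: a sequence is new only if it matches no earlier representative."""
    representatives = []
    for s in seqs:
        if not any(iupac_match(s, r) for r in representatives):
            representatives.append(s)
    return len(representatives)


print(iupac_match("ACTYG", "ACTCG"))                      # True
print(count_haplotypes(["ACTYG", "ACTCG", "ACTAG"]))      # 2
```

One caveat with this kind of matching: it is not transitive (ACTYG matches both ACTCG and ACTTG, but those two do not match each other), so the greedy count can depend on the order of the sequences. Subtracting one from the total for every matching pair, as in my loop above, also over-subtracts as soon as three or more sequences match each other.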