I'm struggling to create a sliding window that loops through each sequence (the first 30 nucleotides) and identifies the forward primer, so that I can later trim the primer from the sequences. The input is a FASTA file with around 3400 sequences.
    filename = "paired.fasta"
    min_length = 150
    mismatches = 3

    fprimer = np.asarray(list("GTGCCAGCMGCCGCGGTAA"))
    f_len = len(fprimer)
    rprimer = "ACAGCCATGCANCACCT"
    rprimer = np.asarray(list(reverse_complement(rprimer)))
    r_len = len(rprimer)

    forward_region = sequences[0:30]
    reverse_region = sequences[-30]
    winSizeF = len(forward_region) - len(fprimer) + 1
    winSizeR = len(reverse_region) - len(rprimer) + 1

    for header, sequence in good_reads.items():
        for i in range(winSizeF):
            start = i
            end = i + len(fprimer)
            target = forward_region[start:end]
            distance = hamming(fprimer, winSizeF)
            if distance == 0:
                break
When I run the for loop I get the following error:
    line 103
        """ for header, data in sequences.items():
        ^
    SyntaxError: invalid syntax
Any help with this sliding window would be appreciated.
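For reference, here is a minimal sketch of the kind of sliding window described above, written in plain Python without NumPy. It rests on several assumptions: the names `IUPAC`, `count_mismatches`, `find_primer` and `trim_forward_primer` are hypothetical, `M` and `N` in the primers are treated as IUPAC ambiguity codes, and the mismatch count is a simple position-by-position comparison rather than the `hamming` function used in the code above. Each read is sliced individually (`sequence[:30]`), and every window is compared against the primer itself.

```python
# IUPAC nucleotide codes mapped to the bases each code may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, window):
    """Count positions where the read base is not allowed by the primer code."""
    return sum(base not in IUPAC[code] for code, base in zip(primer, window))


def find_primer(sequence, primer, region_len=30, max_mismatches=3):
    """Slide the primer over the first region_len bases of one read.

    Returns (start, distance) for the best window, or None when no window
    is within max_mismatches.
    """
    region = sequence[:region_len].upper()
    best = None
    for start in range(len(region) - len(primer) + 1):
        window = region[start:start + len(primer)]
        distance = count_mismatches(primer, window)
        if best is None or distance < best[1]:
            best = (start, distance)
        if distance == 0:
            break  # exact match, no need to look further
    if best is not None and best[1] <= max_mismatches:
        return best
    return None


def trim_forward_primer(sequence, primer, region_len=30, max_mismatches=3):
    """Remove everything up to and including the primer; unchanged if not found."""
    hit = find_primer(sequence, primer, region_len, max_mismatches)
    if hit is None:
        return sequence
    start, _ = hit
    return sequence[start + len(primer):]
```

With a dict of reads such as `good_reads`, this could be applied as `trimmed = {header: trim_forward_primer(seq, "GTGCCAGCMGCCGCGGTAA") for header, seq in good_reads.items()}`. The reverse primer could be handled the same way on the last 30 bases, using the reverse-complemented primer.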