Trouble with sliding window for FASTA file in python
2.5 years ago
asidhu ▴ 10

I'm struggling to create a sliding window that loops through the first 30 nucleotides of each sequence and identifies the forward primer, so that I can later trim the primer from the sequences. The input is a FASTA file with around 3400 sequences.

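import numpy as np

# reverse_complement, hamming and good_reads are defined elsewhere in the script (not shown here)
filename = "paired.fasta"

min_length = 150
mismatches = 3

fprimer = np.asarray(list("GTGCCAGCMGCCGCGGTAA"))
f_len = len(fprimer)

rprimer = "ACAGCCATGCANCACCT"
rprimer = np.asarray(list(reverse_complement(rprimer)))
r_len = len(rprimer)

for header, sequence in good_reads.items():
    # Take the search regions from each read, not from the whole collection
    forward_region = np.asarray(list(sequence[:30]))
    # [-30:] is the last 30 bases; [-30] would be a single base
    reverse_region = np.asarray(list(sequence[-30:]))
    winSizeF = len(forward_region) - f_len + 1
    winSizeR = len(reverse_region) - r_len + 1

    for i in range(winSizeF):
        start = i
        end = i + f_len
        target = forward_region[start:end]
        # Compare the primer with the current window, not with the window count
        distance = hamming(fprimer, target)
        if distance == 0:
            break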

When I run the for loop I get the following error:

line 103
    """ for header, data in sequences.items():
          ^
SyntaxError: invalid syntax

Any help with this sliding window would be appreciated.
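
For reference, here is a minimal sliding-window sketch in plain Python. It is not the poster's script. It assumes reads are uppercase strings, that `M` and `N` in the primers are IUPAC ambiguity codes, and that up to 3 mismatches are tolerated (taken from `mismatches = 3` above). The names `IUPAC`, `mismatch_count`, `find_primer` and `trim_forward_primer` are made up for illustration.

```python
# IUPAC nucleotide codes: each primer symbol maps to the bases it may match.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatch_count(primer, window):
    """Count positions where the read base is not allowed by the primer code."""
    return sum(base not in IUPAC[code] for code, base in zip(primer, window))


def find_primer(sequence, primer, region=30, max_mismatches=3):
    """Slide the primer over the first `region` bases of `sequence`.

    Returns (start, mismatches) for the best window, or None if no window
    is within `max_mismatches`.
    """
    search = sequence[:region]
    best = None
    for start in range(len(search) - len(primer) + 1):
        window = search[start:start + len(primer)]
        distance = mismatch_count(primer, window)
        if best is None or distance < best[1]:
            best = (start, distance)
        if distance == 0:
            break  # an exact hit cannot be improved
    if best is not None and best[1] <= max_mismatches:
        return best
    return None


def trim_forward_primer(sequence, primer, region=30, max_mismatches=3):
    """Return the read with everything up to the end of the primer removed, or None."""
    hit = find_primer(sequence, primer, region, max_mismatches)
    if hit is None:
        return None
    start, _ = hit
    return sequence[start + len(primer):]


FPRIMER = "GTGCCAGCMGCCGCGGTAA"
# Made-up read: 3 leading bases, the primer with M resolved to A, then 8 bases of insert.
read = "TTT" + "GTGCCAGCAGCCGCGGTAA" + "ACGTACGT"
hit = find_primer(read, FPRIMER)
trimmed = trim_forward_primer(read, FPRIMER)
```

The sketch keeps the best window instead of stopping at the first one under the threshold, so a 3-mismatch hit early in the region does not hide an exact hit further along. The reverse primer could be handled the same way by passing `sequence[-30:]` and the reverse-complemented primer.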

python • 869 views

The code with the error is not shown.

line 103
    """ for header, data in sequences.items():
          ^
SyntaxError: invalid syntax

Line 103 is not in your example. Maybe you missed a parenthesis or a bracket somewhere before line 103.
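
To illustrate how an earlier line can produce this kind of error, here is a small sketch. It assumes the `"""` on line 103 was meant to comment out a block, which is a guess, since the surrounding code is not shown. The snippets below are invented for the demonstration, and `has_syntax_error` is a hypothetical helper built on the built-in `compile`.

```python
def has_syntax_error(source):
    """Return True if Python refuses to compile `source`."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return True
    return False


# Hypothetical snippets, not the original script.

# An earlier """ opens a string; the second """ closes it, so the
# "for ..." that follows on the same line is left dangling after a string.
stray_quote = (
    '"""\n'
    'old code that was commented out\n'
    '""" for header, data in sequences.items():\n'
    '    pass\n'
)

# A parenthesis left open on an earlier line makes the parser read the
# following lines as part of the same expression.
open_paren = (
    'total = sum((1, 2, 3)\n'
    '""" note """\n'
    'for header, data in sequences.items():\n'
    '    pass\n'
)

# With the quotes balanced and on their own lines, the code compiles.
balanced = (
    '"""\n'
    'old code that was commented out\n'
    '"""\n'
    'for header, data in {}.items():\n'
    '    pass\n'
)

results = {
    "stray_quote": has_syntax_error(stray_quote),
    "open_paren": has_syntax_error(open_paren),
    "balanced": has_syntax_error(balanced),
}
```

In both failing cases the error is reported at or after the line where the parser gives up, not where the real mistake is, so it is worth checking the lines above 103 for unbalanced quotes or brackets.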
