Hello everybody,
Using Python, I'm trying to create 30-mers from DNA sequences in a FASTA file so that I can run a local BLAST analysis on them later. The end goal is a list of lists of these 30-mer sequences that I can turn into a FASTA file. To do this, I need to keep all the FASTA record information through the subsequent code, but when I run it, I get the error "TypeError: unhashable type: 'SeqRecord'".
The weird thing is that when I run the code on my home computer, it works just fine. It's only when I run it over SSH on a Linux server that I get this error. Both use Biopython 1.67, but the server runs Python 3.5 while my home computer runs Python 2.7.
For practical reasons, I cannot run this code on my home computer, so I need to find a way to circumvent or fix this error.
Here is the code I used, followed by the error message I get when running it over SSH. Thanks!
from Bio import SeqIO

def find_kmers(string, k):
    kmers = []
    n = len(string)
    for i in range(0, n-k+1):
        kmers.append(string[i:i+k])
    return list(set(kmers))

ltr_seq = SeqIO.parse(open('HIV_Align_5\'LTR_no_gaps_nor_high_gaps.fasta'), "fasta")
all_kmer_list = []
for i in ltr_seq:
    all_kmer_list.append(find_kmers(i, 30))

  File "HIV_5'LTR_30_mer_BLAST.py", line 24, in <module>
    all_kmer_list.append(find_kmers(i, 30))
  File "HIV_5'LTR_30_mer_BLAST.py", line 18, in find_kmers
    return list(set(kmers))
TypeError: unhashable type: 'SeqRecord'
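
A possible cause, offered as a sketch rather than a confirmed diagnosis: slicing a SeqRecord returns another SeqRecord, so kmers ends up holding record objects and set() has to hash them. In Python 3, a class that defines __eq__ without also defining __hash__ is unhashable, whereas Python 2 keeps the default identity-based hash. The example below assumes SeqRecord behaves this way. FakeRecord is a hypothetical stand-in (not Biopython's class) so the snippet runs without Biopython, and find_unique_kmers is an illustrative helper that deduplicates on the sequence string instead of on the record object.

```python
from collections import OrderedDict


class FakeRecord:
    """Hypothetical stand-in for a sequence record (NOT Biopython's SeqRecord)."""

    def __init__(self, id, seq):
        self.id = id
        self.seq = seq

    def __len__(self):
        return len(self.seq)

    def __getitem__(self, index):
        # Slicing returns another record, so the ID travels with the k-mer.
        return FakeRecord(self.id, self.seq[index])

    def __eq__(self, other):
        # Defining __eq__ without __hash__ makes instances unhashable in Python 3.
        return (self.id, self.seq) == (other.id, other.seq)


def find_unique_kmers(record, k):
    """Return one sub-record per distinct k-mer sequence, in first-seen order."""
    unique = OrderedDict()
    for i in range(len(record) - k + 1):
        kmer = record[i:i + k]
        # Key on the plain sequence string, which is hashable.
        unique.setdefault(str(kmer.seq), kmer)
    return list(unique.values())


record = FakeRecord("seq1", "ACGTACGTAC")

# Reproduce the failure mode: hashing a record object raises TypeError.
try:
    set([record[0:4]])
    hashable = True
except TypeError:
    hashable = False

kmers = find_unique_kmers(record, 4)
kmer_strings = [str(r.seq) for r in kmers]
```

If the assumption holds, the same idea applied to the original function would be to key the deduplication on str(record.seq) for each slice while still storing the sliced record, so the FASTA ID and description stay attached to every 30-mer.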