Question: Can we use JELLYFISH for amino acid sequences?
16 months ago by
Boston
marksingh1982 wrote:

I have a very large file containing thousands of amino acid sequences. I would like to identify all overlapping k-mers.

I was hoping to use JELLYFISH, but it only seems to work with nucleic acid sequences. Is there any way to use the tool to read and parse a FASTA file of amino acid sequences?

sequencing jellyfish genome • 422 views
modified 10 months ago by Biostar ♦♦ 20 • written 16 months ago by marksingh1982

This is a question; you're not posting about a tool. I've made the required change now, please be more careful in the future.

written 16 months ago by RamRS 22k
16 months ago by
Sej Modha4.2k
Glasgow, UK
Sej Modha4.2k wrote:

Jellyfish cannot handle protein sequences. You can use scikit-bio's iter_kmers() for this instead. A solution that parses the FASTA file with Biopython and extracts the k-mers with scikit-bio:

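from Bio import SeqIO
from skbio import Sequence

# Parse the FASTA file with Biopython, one record at a time
for record in SeqIO.parse('test.fa', 'fasta'):
    # Wrap the residues in a generic scikit-bio Sequence (no alphabet check)
    sequence = Sequence(str(record.seq))
    # overlap=True slides the window one residue at a time
    for kmer in sequence.iter_kmers(4, overlap=True):
        print(str(kmer))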
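
If the goal is Jellyfish-style counts rather than a printed list, the same sliding window can be written with only the standard library. This is a minimal sketch, not a Jellyfish replacement: the helper names read_fasta and count_kmers and the toy records are hypothetical, and it assumes the counts fit in memory.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_kmers(records, k):
    """Count all overlapping k-mers across (header, sequence) records."""
    counts = Counter()
    for _, seq in records:
        seq = seq.upper()
        # Slide a window of width k one residue at a time;
        # sequences shorter than k contribute nothing
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


if __name__ == "__main__":
    # Toy input (made up for illustration); for a real file, pass an
    # open file handle to read_fasta instead of this list
    example = [">p1", "MKVLA", ">p2", "KVLAK"]
    for kmer, n in count_kmers(read_fasta(example), 4).most_common():
        print(kmer, n)
```

For a real file, something like `with open('test.fa') as fh: counts = count_kmers(read_fasta(fh), 4)` should work. For very large inputs the Counter may grow large, since the number of possible protein k-mers is 20^k.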
written 16 months ago by Sej Modha 4.2k

Thank you for your quick reply!

written 16 months ago by marksingh1982