FASTA to JSON via Python dictionary
2.1 years ago

Good day,

Could you please help with my code? I need to convert a FASTA file to a JSON object, and I'm doing this via a Python dictionary. I am stuck on two things:

from Bio import SeqIO
from Bio.SeqIO import parse

my_dict = {}

with open("Salm_ser_Enteritidis.fasta", 'r') as new_fasta:
    for x in SeqIO.parse(new_fasta, 'fasta'):
        dataset = x.id
        sequence = x.seq
        my_dict = {"dataset": x.id,
                   "sequence": x.seq}

print(my_dict)

The output is:

{'dataset': 'PYJS01003085.1', 'sequence': Seq('TCTCATCCGCCAAAACATCTTCGGCGTTGTAAGGTTAAGCCTCACGGTTCATTA...TTA')}

but I need only the actual sequence, without the word Seq, as the value for the key 'sequence', so that the output is:

{'dataset': 'PYJS01003085.1', 'sequence': 'TCTCATCCGCCAAAACATCTTCGGCGTTGTAAGGTTAAGCCTCACGGTTCATTA...TTA'}

The second problem:

import json
my_json = json.dumps(my_dict)

The output is: TypeError: Object of type Seq is not JSON serializable

But print(type(my_dict)) reports that my_dict is <class 'dict'>, not Seq. How can I get a dictionary that can then be converted to JSON?

Thanks in advance, Galina

dictionary fasta python json • 1.6k views
2.1 years ago
Carambakaracho ★ 3.2k

Hi, SeqIO.parse gives you SeqRecord objects, and seqrecord.seq is a Seq object, not a string. That is why json.dumps fails: my_dict itself is a dict, but one of its values is a Seq. You can convert it to a string with an explicit cast, str(seqrecord.seq), see here

from Bio import SeqIO
import json

# from Bio.SeqIO import parse  # you won't need that, you imported the whole SeqIO package above

my_dict = {}

with open("Salm_ser_Enteritidis.fasta", 'r') as new_fasta:
    for x in SeqIO.parse(new_fasta, 'fasta'):
        # dataset = x.id
        # sequence = x.seq # you won't need that, if the variables go out of scope and aren't used
        my_dict = {
            "dataset": x.id,
            "sequence": str(x.seq) # string cast to SeqRecord seq object
        }

my_json = json.dumps(my_dict)  # maybe use json.dump directly to a file (use pretty print option, eg. indent=2)?
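
To illustrate why type(my_dict) says dict while json.dumps still complains, here is a minimal sketch. It assumes a made-up FakeSeq class standing in for Biopython's Seq so that it runs without Biopython installed; the id and sequence values are invented. It also shows json.dumps(..., default=str) as a possible alternative to casting by hand.

```python
import json


class FakeSeq:
    """Hypothetical stand-in for Bio.Seq.Seq (assumption, not the real class)."""

    def __init__(self, data):
        self._data = data

    def __str__(self):
        return self._data

    def __repr__(self):
        return f"Seq({self._data!r})"


record = {"dataset": "demo_id", "sequence": FakeSeq("ACGT")}

# The container is a plain dict, but json.dumps also has to serialize its values.
print(type(record))

try:
    json.dumps(record)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Option 1: cast the value to str before it goes into the dict.
fixed = {"dataset": "demo_id", "sequence": str(record["sequence"])}
as_json = json.dumps(fixed)

# Option 2: let json call str() on anything it cannot serialize itself.
as_json_default = json.dumps(record, default=str)
print(as_json)
```

Option 2 is convenient, but it silently stringifies every unsupported object, so the explicit str() cast is probably the clearer choice here.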

json.dump reference
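
One thing to watch: the loop above reassigns my_dict on every iteration, so after the loop it only holds the last record in the file. If the FASTA file contains several records and you want all of them, a rough sketch would be to collect one dictionary per record in a list. The helper name records_to_dicts and the demo records are made up for illustration; SimpleNamespace objects stand in for SeqRecord so the snippet runs without Biopython. With Biopython you would pass SeqIO.parse(new_fasta, 'fasta') instead.

```python
import json
from types import SimpleNamespace


def records_to_dicts(records):
    """Build one dict per record; works for any object with .id and .seq."""
    return [{"dataset": rec.id, "sequence": str(rec.seq)} for rec in records]


# Invented stand-ins for SeqRecord objects (assumption for the demo only).
demo_records = [
    SimpleNamespace(id="seq1", seq="ACGT"),
    SimpleNamespace(id="seq2", seq="GGCC"),
]

all_records = records_to_dicts(demo_records)
my_json = json.dumps(all_records, indent=2)
print(my_json)
```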

Thank you very much :)

I've used dump and now it looks awesome!

with open('my_dict.json', 'w') as f: json.dump(my_dict, f, indent=2)
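
To check that the written file is valid JSON, one option is to read it back with json.load or json.loads. A small sketch, assuming invented values for my_dict and writing into a temporary directory so it does not touch your real files:

```python
import json
import os
import tempfile

my_dict = {"dataset": "demo_id", "sequence": "ACGT"}  # invented demo values

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_dict.json")

    # Write with indent=2 for a human-readable file.
    with open(path, "w") as f:
        json.dump(my_dict, f, indent=2)

    # Read the raw text back, then parse it again.
    with open(path) as f:
        text = f.read()

loaded = json.loads(text)
print(text)
```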

Yes, the other functions in the link I posted should be helpful later as well.
