How to use a GenBank assembly accession to get the FASTA file

4.8 years ago

ricordo.yan • 0

I have 100 randomly selected genomes from a dataset. I want to use the GenBank assembly accession GCA_001600695.1 to retrieve the FASTA file for this genome. Below is my code, but it did not work. Does anyone have a suggestion about which Biopython command I can use?

from Bio import Entrez
handle = Entrez.efetch(db="nucleotide", id="GCA_000021505.1", rettype="fasta", retmode="text")
print(handle.read())

Assembly • 1.4k views

updated 4.7 years ago by Ram 43k • written 4.8 years ago by ricordo.yan • 0

but did not work

Do you see an error message? What does it say? What tells you that the code does not work? Please see: Ten Simple Rules for Getting Help from Online Scientific Communities

4.8 years ago by Ram 43k

File "<ipython-input-16-df2a9646a7ee>", line 3
    print(handle.read())`
                        ^
SyntaxError: invalid syntax

updated 4.7 years ago by Ram 43k • written 4.7 years ago by ricordo.yan • 0

That was probably a leftover back-tick from formatting here on the website. Can you try again please?

4.7 years ago by Ram 43k

I want to use the GenBank assembly accession GCA_001600695.1 to retrieve the FASTA file for this genome

GCA_001600695.1 is actually the assembly accession for Ascoidea asiatica, not a nucleotide sequence ID. The actual genome sequence is contained in this link. You may want to try some of the ideas in this thread: Cannot get efetch to download genome - what is wrong?
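
For illustration, here is one way this could look in Biopython. It is a rough sketch, not a tested recipe. It assumes the accession is searched in the "assembly" database rather than "nucleotide", that the summary record exposes an `FtpPath_GenBank` field, and that the sequence file follows NCBI's usual `<directory name>_genomic.fna.gz` naming. The function names `genomic_fasta_url` and `fetch_assembly_fasta` are made up for this example.

```python
import gzip
import urllib.request


def genomic_fasta_url(ftp_path):
    """Build the URL of the *_genomic.fna.gz file from an assembly's FTP directory."""
    base = ftp_path.rstrip("/")
    name = base.rsplit("/", 1)[-1]  # directory name, i.e. accession plus assembly name
    url = f"{base}/{name}_genomic.fna.gz"
    # Assumption: the same directory tree is reachable over HTTPS.
    if url.startswith("ftp://"):
        url = "https://" + url[len("ftp://"):]
    return url


def fetch_assembly_fasta(accession, email):
    """Return the genomic FASTA text for one assembly accession (needs network access)."""
    # Imported here so the URL helper above can be used without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address

    # 1) Look the accession up in the "assembly" database.
    handle = Entrez.esearch(db="assembly", term=accession)
    ids = Entrez.read(handle)["IdList"]
    handle.close()
    if not ids:
        raise ValueError(f"No assembly found for {accession}")

    # 2) The summary record is assumed to carry the FTP directory of the assembly.
    handle = Entrez.esummary(db="assembly", id=ids[0], report="full")
    summary = Entrez.read(handle, validate=False)
    handle.close()
    doc = summary["DocumentSummarySet"]["DocumentSummary"][0]

    # 3) Download and decompress the genomic FASTA file.
    with urllib.request.urlopen(genomic_fasta_url(doc["FtpPath_GenBank"])) as resp:
        return gzip.decompress(resp.read()).decode()
```

For 100 accessions, you could call `fetch_assembly_fasta(acc, "you@example.org")` in a loop (the e-mail address is a placeholder), write each result to its own file, and pause briefly between requests to stay within NCBI's rate limits.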

4.8 years ago by GenoMax 141k
