Question: GenBank to FASTA sequence
2.0 years ago by
Manoj20
Canada
Manoj20 wrote:

Hi,

I have a large file of nucleotide sequences in GenBank format, and I need to extract the FASTA sequence for every entry in the file.

 


You can use the Sequence Manipulation Suite.

written 2.0 years ago by venu4.3k

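from Bio import SeqIO

# infile_genbank and outfile are file paths (or open handles) that you define yourself.
# convert() reads every GenBank record and returns the number of records written.
SeqIO.convert(infile_genbank, "genbank", outfile, "fasta")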

 

http://www2.warwick.ac.uk/fac/sci/moac/people/students/peter_cock/python/genbank2fasta/

written 24 months ago by Sishuo Wang50
2.0 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum98k wrote:

 

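Something like this, which prints each ACCESSION as a FASTA header and strips the position numbers and spaces from the lines between ORIGIN and //: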

$ curl -Ls "ftp://ftp.ncbi.nlm.nih.gov/genbank/gbenv80.seq.gz" | gunzip -c |\
awk '/^ACCESSION   / {printf(">%s\n",$2);next;} /^ORIGIN/ {inseq=1;next;} /^\/\// {inseq=0;} {if(inseq==0) next; gsub(/[0-9 ]/,"",$0); printf("%s\n",$0);}' |\
head -n 30

>KP304532
cgcggcctatcagcttgttggtgaggtaatggctcaccaaggcaacgacgggtagctggt
ctgagaggacgatcagccacactggaactgagacacggtccagactcctacgggaggcag
cagcagggaatcttgcgcaatgggcgaaagcctgacgcagcgacgccgcgtgggggatga
aggccttcgggttgtaaacccctttcaggagggaagaaaatgacggtacctccagaagaa
gccccggccaactacgtgccagcagccgcggtaatacgtagggggcgagcgttgtccgga
tttattgggcgtaaagggctcgtaggcggcttgacaagtcgatcgtgaaaactcagggct
caaccctgagacgccggtcgatactgtcatggctagggtccggtagaggagaatggaatt
cccggtgtagcggtgaaatgcgcagatatcgggaggaacaccagtagcgaaggcggtcct
ctgggccggtaccgacgctgaggagcgaaagcgtggggagcaaacaggattagataccct
ggtagtccacgccgtaaacgttgggtactaggtgtggcgtctttatcaacggatgccgtg
ccgaagctaacgcattaagtaccccgcctggggagtacgg
>KP304533
cgcggcctatcagcttgttggtggggtaacggcctaccaaggcatcgacgggtagctggt
ctgagaggacgatcagccacactgggactgagacacggcccagactcctacgggaggcag
cagtggggaatattgcgcaatgggcgaaagcctgacgcagcaacgccgcgtgggggatga
aggctttcgggttgtaaacccctttcagtgatgacgaaaatgacggtaatcacagaagaa
gccccggccaactacgtgccagcagccgcggtaacacgtagggggcgagcgttgtccgga
tttattgggcgtaaagagctcgtaggcggttgcgtaagtcggacgtgaaaactcagggct
caaccctgagatgccgttcgatactgcgctgactagagtccggtaggggagcatggaatt
cctggtgtagcggtgaaatgcgcagatatcaggaggaacaccagtggcgaaggcggtgct
ctgggccggaactgacgctgaggagcgaaagcatgggtagcaaacaggattagataccct
ggtagtccatgccgtaaacgttgggcactaggtgtgggacctacttaacgggttccgtgc
cgtagctaacgcattaagtgccccgcctggggagtacgg
>KP304534
cgcggcctatcagcttgttggtgaggtaacggctcaccaaggcatcgacgggtagctggt
ctgagaggacgatcagccacactgggactgagacacggcccagactcctacgggaggcag
cagtagggaatcttgcgcaatgggcgaaagcctgacgcagcaacgccgcgtgggggatga
aggccttcgggtcgtaaacccctttcagcagggacgaaaatgacggtacctgcagaagaa
ggtccggccaactacgtgccagcagccgcggtaatacgtagggaccaagcgttgtccgga
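
For readers who prefer Python to awk, the same line-by-line logic might be sketched as below. This is only an illustration using the standard library, not a full GenBank parser: the function name `genbank_to_fasta` and the sample record are made up for the example, and unlike the awk version it keeps only letters on sequence lines rather than stripping just digits and spaces.

```python
import io


def genbank_to_fasta(lines):
    """Yield FASTA lines from an iterable of GenBank flat-file lines.

    Hypothetical helper mirroring the awk one-liner: the first word after
    ACCESSION becomes the header, and lines between ORIGIN and // are
    emitted with everything except letters removed.
    """
    in_seq = False
    for line in lines:
        if line.startswith("ACCESSION"):
            fields = line.split()
            if len(fields) > 1:
                yield ">" + fields[1]
        elif line.startswith("ORIGIN"):
            in_seq = True
        elif line.startswith("//"):
            in_seq = False
        elif in_seq:
            # Drop position numbers, spaces and the trailing newline.
            seq = "".join(ch for ch in line if ch.isalpha())
            if seq:
                yield seq


# Made-up record, used only to show the call; real input would be an open file.
sample = io.StringIO(
    "LOCUS       TEST0001\n"
    "ACCESSION   TEST0001\n"
    "ORIGIN\n"
    "        1 acgtacgtac gtacgtacgt\n"
    "//\n"
)
for fasta_line in genbank_to_fasta(sample):
    print(fasta_line)
```

Because the function is a generator, a large file can be streamed through it without loading everything into memory, much like the awk pipeline.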

 

2.0 years ago by
United States
osullivanchristopher130 wrote:

Why not just use efetch to download the records directly as FASTA?

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=KP304532&rettype=fasta 

http://www.ncbi.nlm.nih.gov/books/NBK25499/
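
If the accession numbers are already known, a request like the one above could be built for several IDs at once. The sketch below is a rough illustration with the Python standard library: the helper names `efetch_fasta_url` and `fetch_fasta` are made up for the example, the endpoint is the one from the link above (over https), and `retmode=text` is added on the assumption that plain-text FASTA is wanted.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Same efetch endpoint as in the link above, over https.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accessions, db="nucleotide"):
    """Build an efetch URL asking for the given accessions in FASTA format."""
    params = {
        "db": db,
        "id": ",".join(accessions),  # efetch accepts a comma-separated ID list
        "rettype": "fasta",
        "retmode": "text",
    }
    # safe="," keeps the commas in the ID list readable instead of %2C.
    return EFETCH + "?" + urlencode(params, safe=",")


def fetch_fasta(accessions):
    """Download the FASTA text for the accessions (needs network access)."""
    with urlopen(efetch_fasta_url(accessions)) as response:
        return response.read().decode("utf-8")


# Only the URL is printed here; call fetch_fasta(...) to actually download.
print(efetch_fasta_url(["KP304532", "KP304533"]))
```

For a whole GenBank file that is already on disk, converting locally is likely simpler than extracting every accession and fetching it again.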
