Question: GenBank to fasta sequence
Manoj (Canada) wrote 19 months ago:

Hi,

I have a large file of nucleotide sequences in GenBank format, and I need to extract the FASTA sequence of every entry in the file.

 


You can use the Sequence Manipulation Suite.

(comment written 19 months ago by venu)

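from Bio import SeqIO

infile_genbank, outfile = "input.gb", "output.fasta"  # placeholder paths, change to your own files
# Converts every record in the GenBank file to FASTA and returns the number of records written.
SeqIO.convert(infile_genbank, "genbank", outfile, "fasta")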

 

http://www2.warwick.ac.uk/fac/sci/moac/people/students/peter_cock/python/genbank2fasta/

(comment written 18 months ago by Sishuo Wang)
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote 19 months ago:

 

Something like this:

$ curl -Ls "ftp://ftp.ncbi.nlm.nih.gov/genbank/gbenv80.seq.gz" | gunzip -c |\
awk '/^ACCESSION   / {printf(">%s\n",$2);next;} /^ORIGIN/ {inseq=1;next;} /^\/\// {inseq=0;} {if(inseq==0) next; gsub(/[0-9 ]/,"",$0); printf("%s\n",$0);}' |\
head -n 30

>KP304532
cgcggcctatcagcttgttggtgaggtaatggctcaccaaggcaacgacgggtagctggt
ctgagaggacgatcagccacactggaactgagacacggtccagactcctacgggaggcag
cagcagggaatcttgcgcaatgggcgaaagcctgacgcagcgacgccgcgtgggggatga
aggccttcgggttgtaaacccctttcaggagggaagaaaatgacggtacctccagaagaa
gccccggccaactacgtgccagcagccgcggtaatacgtagggggcgagcgttgtccgga
tttattgggcgtaaagggctcgtaggcggcttgacaagtcgatcgtgaaaactcagggct
caaccctgagacgccggtcgatactgtcatggctagggtccggtagaggagaatggaatt
cccggtgtagcggtgaaatgcgcagatatcgggaggaacaccagtagcgaaggcggtcct
ctgggccggtaccgacgctgaggagcgaaagcgtggggagcaaacaggattagataccct
ggtagtccacgccgtaaacgttgggtactaggtgtggcgtctttatcaacggatgccgtg
ccgaagctaacgcattaagtaccccgcctggggagtacgg
>KP304533
cgcggcctatcagcttgttggtggggtaacggcctaccaaggcatcgacgggtagctggt
ctgagaggacgatcagccacactgggactgagacacggcccagactcctacgggaggcag
cagtggggaatattgcgcaatgggcgaaagcctgacgcagcaacgccgcgtgggggatga
aggctttcgggttgtaaacccctttcagtgatgacgaaaatgacggtaatcacagaagaa
gccccggccaactacgtgccagcagccgcggtaacacgtagggggcgagcgttgtccgga
tttattgggcgtaaagagctcgtaggcggttgcgtaagtcggacgtgaaaactcagggct
caaccctgagatgccgttcgatactgcgctgactagagtccggtaggggagcatggaatt
cctggtgtagcggtgaaatgcgcagatatcaggaggaacaccagtggcgaaggcggtgct
ctgggccggaactgacgctgaggagcgaaagcatgggtagcaaacaggattagataccct
ggtagtccatgccgtaaacgttgggcactaggtgtgggacctacttaacgggttccgtgc
cgtagctaacgcattaagtgccccgcctggggagtacgg
>KP304534
cgcggcctatcagcttgttggtgaggtaacggctcaccaaggcatcgacgggtagctggt
ctgagaggacgatcagccacactgggactgagacacggcccagactcctacgggaggcag
cagtagggaatcttgcgcaatgggcgaaagcctgacgcagcaacgccgcgtgggggatga
aggccttcgggtcgtaaacccctttcagcagggacgaaaatgacggtacctgcagaagaa
ggtccggccaactacgtgccagcagccgcggtaatacgtagggaccaagcgttgtccgga
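
For anyone who prefers Python to awk, the same line-by-line logic can be sketched with the standard library only. This is a rough equivalent, not a full GenBank parser: like the one-liner above, it takes the first token after ACCESSION as the header and treats everything between ORIGIN and // as sequence. The function name `genbank_to_fasta` and the `SAMPLE` record are made up for illustration, and unlike the awk version this sketch skips sequence lines that end up empty.

```python
import io


def genbank_to_fasta(handle):
    """Yield FASTA lines (headers and sequence lines) from a GenBank flat file.

    Hypothetical helper: it follows the awk one-liner's logic and is not
    a complete GenBank parser.
    """
    in_seq = False
    for line in handle:
        if line.startswith("ACCESSION   "):
            fields = line.split()
            if len(fields) > 1:
                # First accession on the line becomes the FASTA header.
                yield ">" + fields[1]
        elif line.startswith("ORIGIN"):
            in_seq = True
        elif line.startswith("//"):
            # End-of-record marker.
            in_seq = False
        elif in_seq:
            # Drop the position numbers and the spaces between 10-base blocks.
            seq = "".join(
                ch for ch in line if not ch.isdigit() and not ch.isspace()
            )
            if seq:
                yield seq


# Made-up miniature record, only to show the expected input shape.
SAMPLE = """\
LOCUS       DEMO1
ACCESSION   DEMO1
ORIGIN
        1 acgtacgtac gtacgtacgt
       21 ttgg
//
"""

if __name__ == "__main__":
    for fasta_line in genbank_to_fasta(io.StringIO(SAMPLE)):
        print(fasta_line)
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` wrapper, for example `open("input.gb")`, where the path is a placeholder.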

 

osullivanchristopher (United States) wrote 19 months ago:

Why not just use efetch?

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=KP304532&rettype=fasta 

http://www.ncbi.nlm.nih.gov/books/NBK25499/
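
If you have several accessions, the same efetch request can be built from Python. The sketch below is only an illustration under a few assumptions: the helper names `build_efetch_url` and `fetch_fasta` are made up, the https form of the endpoint is used instead of the http link above, and the accessions are joined into one comma-separated `id` parameter with `retmode=text` added to the parameters shown in the URL.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Same endpoint as in the URL above, written with https (an assumption here).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accessions, db="nucleotide"):
    """Return an efetch URL asking for FASTA records of the given accessions.

    Hypothetical helper; the ids are sent as one comma-separated value.
    """
    params = {
        "db": db,
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def fetch_fasta(accessions):
    """Download the FASTA text for the accessions (needs network access)."""
    with urlopen(build_efetch_url(accessions)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only prints the URL; call fetch_fasta(...) to actually download.
    print(build_efetch_url(["KP304532", "KP304533"]))
```

Note that this fetches records by accession from NCBI, so it helps when you know the identifiers rather than when you only have a local GenBank file.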
