Question: GenBank to fasta sequence
Manoj (Canada) wrote, 21 months ago:

Hi,

I have a large GenBank-format file of nucleotide sequences, and I need to extract the FASTA sequence of every entry in the file.

 


You can use Sequence Manipulation Suite.

(comment by venu, 21 months ago)

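from Bio import SeqIO

# infile_genbank and outfile are file names (or open handles); convert() returns the number of records written
count = SeqIO.convert(infile_genbank, "genbank", outfile, "fasta")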

 

http://www2.warwick.ac.uk/fac/sci/moac/people/students/peter_cock/python/genbank2fasta/

(comment by Sishuo Wang, 21 months ago)
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 21 months ago:

 

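Something like this, which prints each ACCESSION as a FASTA header, then the sequence lines between ORIGIN and //, with the position numbers and spaces stripped: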

$ curl -Ls "ftp://ftp.ncbi.nlm.nih.gov/genbank/gbenv80.seq.gz" | gunzip -c |\
awk '/^ACCESSION   / {printf(">%s\n",$2);next;} /^ORIGIN/ {inseq=1;next;} /^\/\// {inseq=0;} {if(inseq==0) next; gsub(/[0-9 ]/,"",$0); printf("%s\n",$0);}' |\
head -n 30

>KP304532
cgcggcctatcagcttgttggtgaggtaatggctcaccaaggcaacgacgggtagctggt
ctgagaggacgatcagccacactggaactgagacacggtccagactcctacgggaggcag
cagcagggaatcttgcgcaatgggcgaaagcctgacgcagcgacgccgcgtgggggatga
aggccttcgggttgtaaacccctttcaggagggaagaaaatgacggtacctccagaagaa
gccccggccaactacgtgccagcagccgcggtaatacgtagggggcgagcgttgtccgga
tttattgggcgtaaagggctcgtaggcggcttgacaagtcgatcgtgaaaactcagggct
caaccctgagacgccggtcgatactgtcatggctagggtccggtagaggagaatggaatt
cccggtgtagcggtgaaatgcgcagatatcgggaggaacaccagtagcgaaggcggtcct
ctgggccggtaccgacgctgaggagcgaaagcgtggggagcaaacaggattagataccct
ggtagtccacgccgtaaacgttgggtactaggtgtggcgtctttatcaacggatgccgtg
ccgaagctaacgcattaagtaccccgcctggggagtacgg
>KP304533
cgcggcctatcagcttgttggtggggtaacggcctaccaaggcatcgacgggtagctggt
ctgagaggacgatcagccacactgggactgagacacggcccagactcctacgggaggcag
cagtggggaatattgcgcaatgggcgaaagcctgacgcagcaacgccgcgtgggggatga
aggctttcgggttgtaaacccctttcagtgatgacgaaaatgacggtaatcacagaagaa
gccccggccaactacgtgccagcagccgcggtaacacgtagggggcgagcgttgtccgga
tttattgggcgtaaagagctcgtaggcggttgcgtaagtcggacgtgaaaactcagggct
caaccctgagatgccgttcgatactgcgctgactagagtccggtaggggagcatggaatt
cctggtgtagcggtgaaatgcgcagatatcaggaggaacaccagtggcgaaggcggtgct
ctgggccggaactgacgctgaggagcgaaagcatgggtagcaaacaggattagataccct
ggtagtccatgccgtaaacgttgggcactaggtgtgggacctacttaacgggttccgtgc
cgtagctaacgcattaagtgccccgcctggggagtacgg
>KP304534
cgcggcctatcagcttgttggtgaggtaacggctcaccaaggcatcgacgggtagctggt
ctgagaggacgatcagccacactgggactgagacacggcccagactcctacgggaggcag
cagtagggaatcttgcgcaatgggcgaaagcctgacgcagcaacgccgcgtgggggatga
aggccttcgggtcgtaaacccctttcagcagggacgaaaatgacggtacctgcagaagaa
ggtccggccaactacgtgccagcagccgcggtaatacgtagggaccaagcgttgtccgga
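
For anyone who would rather not use awk, the same line-by-line logic can be sketched in plain Python. This is only a minimal sketch that mirrors the one-liner above, not a full GenBank parser: the function name `genbank_to_fasta` and the tiny `SAMPLE` record (accession `XX000001`) are made up for illustration, and it assumes every record has an ACCESSION line before its ORIGIN block.

```python
import io

# Characters the awk gsub(/[0-9 ]/, "") call removes from sequence lines.
_STRIP = str.maketrans("", "", "0123456789 ")


def genbank_to_fasta(lines):
    """Yield FASTA lines from an iterable of GenBank flat-file lines."""
    in_seq = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("ACCESSION   "):
            # Use the first accession on the line as the FASTA header.
            fields = line.split()
            yield ">" + (fields[1] if len(fields) > 1 else "")
        elif line.startswith("ORIGIN"):
            in_seq = True   # sequence lines start after ORIGIN
        elif line.startswith("//"):
            in_seq = False  # "//" terminates the record
        elif in_seq:
            # Drop position numbers and spaces, keep the bases.
            yield line.translate(_STRIP)


# Hypothetical, made-up record used only to exercise the function.
SAMPLE = """\
LOCUS       XX000001  24 bp    DNA     linear   ENV
ACCESSION   XX000001
ORIGIN
        1 acgtacgtac ggttaaccgg
       21 ttgg
//
"""

fasta = list(genbank_to_fasta(io.StringIO(SAMPLE)))
print("\n".join(fasta))
```

For a real file you would pass an open file handle instead of the `io.StringIO` wrapper (or `gzip.open(path, "rt")` for a compressed one), so the file is streamed rather than loaded into memory.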

 

osullivanchristopher (United States) wrote, 21 months ago:

Why not just use efetch? Given an accession, it returns the record as FASTA directly:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=KP304532&rettype=fasta 

http://www.ncbi.nlm.nih.gov/books/NBK25499/
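
If you already have the accessions, the efetch URL above can be built and fetched from a script. The following is a rough sketch using only the Python standard library: the helper names `efetch_fasta_url` and `fetch_fasta` are hypothetical, the `https` scheme is an assumption (the URL above uses `http`), and passing several accessions as a comma-separated `id` follows the E-utilities documentation linked above. Nothing is downloaded unless `fetch_fasta` is actually called.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: same endpoint as the URL above, but over https.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accessions, db="nucleotide"):
    """Build an efetch URL that requests the given accessions as FASTA."""
    params = {"db": db, "id": ",".join(accessions), "rettype": "fasta"}
    # safe="," keeps the comma-separated id list readable in the URL.
    return EFETCH + "?" + urlencode(params, safe=",")


def fetch_fasta(accessions):
    """Download the FASTA text for the accessions (needs network access)."""
    with urlopen(efetch_fasta_url(accessions)) as response:
        return response.read().decode("utf-8")


url = efetch_fasta_url(["KP304532", "KP304533"])
print(url)
```

Note that this downloads the sequences again from NCBI rather than converting the local file, so it only helps when the accessions are known and the records are public.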
