Question: Is there a way to obtain an EMBL file from a GenBank file (.gbk)?
Lil Potato wrote (6 months ago):

I have a GenBank file (a reference genome) and need to convert it to the corresponding EMBL file, but I have no idea how to do this. I tried the online EMBOSS seqret tool, but it returned an error saying that the GenBank file exceeds the size limit.

Any help would be greatly appreciated!


genomax replied (6 months ago):

"I have tried to use EMBOSS-SEQRET (online tool), but I get an error saying that the GenBank file exceeded the size limit."

You can use the same tool on the command line. You will need to download and install the EMBOSS package if you don't have it available locally.
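
For a local EMBOSS install, the seqret call could be wrapped roughly as below. This is only a sketch: the function names and file names are made up for illustration, and the qualifiers (-sformat, -osformat, -feature, -auto) are the ones I would expect from the EMBOSS documentation, so check them against `seqret -help` on your own installation before relying on it.

```python
import shutil
import subprocess


def seqret_command(gbk_path, embl_path):
    """Build a seqret command line for GenBank -> EMBL conversion."""
    return [
        "seqret",
        "-sequence", gbk_path,
        "-sformat", "genbank",   # input format
        "-feature",              # carry the feature table across as well
        "-outseq", embl_path,
        "-osformat", "embl",     # output format
        "-auto",                 # do not prompt interactively
    ]


def run_seqret(gbk_path, embl_path):
    """Run seqret locally; raises if EMBOSS is missing or seqret fails."""
    if shutil.which("seqret") is None:
        raise RuntimeError("seqret not found on PATH; install EMBOSS first")
    subprocess.run(seqret_command(gbk_path, embl_path), check=True)
```

Something like run_seqret("genome.gbk", "genome.embl") would then do the conversion without the size limit of the online form.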
jrj.healey wrote (6 months ago):

You can do this with Biopython.

Have a look at the SeqIO.convert function in the Biopython documentation.

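Here's a one-liner you can use. As written it converts EMBL to GenBank, matching the example below; for the GenBank-to-EMBL direction asked about, swap the two format names (read 'genbank', write 'embl') and redirect the .gbk file in instead:

$ python -c "import sys; from Bio import SeqIO; SeqIO.convert(sys.stdin, 'embl', sys.stdout, 'genbank')" < infile.embl

# Append > outfile.gbk if you want to write to a file.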

You will, of course, need Biopython. I suggest installing it via Anaconda.

Example input:

ID   AB000263 standard; RNA; PRI; 368 BP.
XX
AC   AB000263;
XX
DE   Homo sapiens mRNA for prepro cortistatin like peptide, complete cds.
XX
SQ   Sequence 368 BP;
     acaagatgcc attgtccccc ggcctcctgc tgctgctgct ctccggggcc acggccaccg        60
     ctgccctgcc cctggagggt ggccccaccg gccgagacag cgagcatatg caggaagcgg       120
     caggaataag gaaaagcagc ctcctgactt tcctcgcttg gtggtttgag tggacctccc       180
     aggccagtgc cgggcccctc ataggagagg aagctcggga ggtggccagg cggcaggaag       240
     gcgcaccccc ccagcaatcc gcgcgccggg acagaatgcc ctgcaggaac ttcttctgga       300
     agaccttctc ctcctgcaaa taaaacctca cccatgaatg ctcacgcaag tttaattaca       360
     gacctgaa                                                                368
//

Example output:

LOCUS       AB000263                 368 bp    RNA              PRI 01-JAN-1980
DEFINITION  Homo sapiens mRNA for prepro cortistatin like peptide, complete
            cds..
ACCESSION   AB000263
VERSION     AB000263
KEYWORDS    .
SOURCE      .
  ORGANISM  .
            .
FEATURES             Location/Qualifiers
ORIGIN
        1 acaagatgcc attgtccccc ggcctcctgc tgctgctgct ctccggggcc acggccaccg
       61 ctgccctgcc cctggagggt ggccccaccg gccgagacag cgagcatatg caggaagcgg
      121 caggaataag gaaaagcagc ctcctgactt tcctcgcttg gtggtttgag tggacctccc
      181 aggccagtgc cgggcccctc ataggagagg aagctcggga ggtggccagg cggcaggaag
      241 gcgcaccccc ccagcaatcc gcgcgccggg acagaatgcc ctgcaggaac ttcttctgga
      301 agaccttctc ctcctgcaaa taaaacctca cccatgaatg ctcacgcaag tttaattaca
      361 gacctgaa
//
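
For the direction asked in the question (GenBank to EMBL), the same SeqIO.convert call can be sketched as a small script rather than a one-liner. The function names and the choice to write the output next to the input with a .embl extension are my own assumptions, not anything Biopython requires.

```python
from pathlib import Path


def embl_path_for(gbk_path):
    """Return the input path with its extension replaced by .embl."""
    return Path(gbk_path).with_suffix(".embl")


def genbank_to_embl(gbk_path):
    """Convert a GenBank file to EMBL; return (output path, record count)."""
    # Imported here so the Biopython dependency is only needed when converting.
    from Bio import SeqIO

    out_path = embl_path_for(gbk_path)
    # SeqIO.convert returns the number of records it converted.
    n_records = SeqIO.convert(str(gbk_path), "genbank", str(out_path), "embl")
    return out_path, n_records
```

Calling genbank_to_embl("genome.gbk") should write genome.embl alongside the input and report how many records were converted, which is a quick sanity check for multi-record files.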