Question: Emboss seqret - problem conversion gff+fasta to EMBL
Juke-34 (Sweden) wrote, 3.7 years ago:

Hi everyone,

I am trying to use the seqret tool from EMBOSS, but I'm running into some difficulties.

I would like to create an EMBL-formatted file from a GFF3 file and a FASTA file.

I'm using the following command:

seqret -sequence genome.fasta -feature -fformat gff -fopenfile annotation.gff -osformat embl

My fasta file contains several sequences.

The problem is that the tool writes the full set of GFF3 features as many times as there are sequences in the FASTA file (once before each sequence).

Has anyone already experienced this and knows a way to avoid the problem?

Or does anyone know of another tool that can do this conversion?

 

Thank you
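One workaround that could be tried for the duplication described above is to split both inputs by sequence ID, so that each sequence is converted together with only its own features. The sketch below is a minimal, hedged illustration in Python: the function names are hypothetical, it assumes the first word of each FASTA header matches column 1 of the GFF3, and nothing in this thread confirms that running seqret on the split files avoids the problem.

```python
from collections import defaultdict
from typing import Dict, List


def split_gff3_by_seqid(gff3_text: str) -> Dict[str, List[str]]:
    """Group GFF3 feature lines by the sequence ID found in column 1."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for line in gff3_text.splitlines():
        if line.startswith("##FASTA"):
            break  # embedded sequence section starts here: no more features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        seqid = line.split("\t", 1)[0]
        groups[seqid].append(line)
    return dict(groups)


def split_fasta_by_id(fasta_text: str) -> Dict[str, str]:
    """Map each FASTA record ID (first word of the header) to its full record."""
    records: Dict[str, List[str]] = {}
    current = None
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            words = line[1:].split()
            current = words[0] if words else ""
            records[current] = [line]
        elif current is not None and line.strip():
            records[current].append(line)
    return {seqid: "\n".join(lines) + "\n" for seqid, lines in records.items()}


if __name__ == "__main__":
    # Tiny made-up inputs, only to show how the two functions pair up.
    gff = (
        "##gff-version 3\n"
        "seq1\t.\tgene\t1\t9\t.\t+\t.\tID=g1\n"
        "seq2\t.\tgene\t2\t5\t.\t-\t.\tID=g2\n"
    )
    fasta = ">seq1 first contig\nACGTACGTA\n>seq2\nACGTA\n"
    features = split_gff3_by_seqid(gff)
    for seqid, record in split_fasta_by_id(fasta).items():
        print(seqid, len(features.get(seqid, [])), "feature line(s)")
```

Each sequence/feature pair could then be written to its own pair of files and converted one at a time with the same seqret command as above, concatenating the resulting EMBL records afterwards.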
Juke-34 (Sweden) wrote, 2.2 years ago:

After spending a lot of time on this, I concluded that no tool currently handles this conversion (GFF3 to EMBL) properly. In fact, our group was not the only one facing this problem: a converter of this kind was recently released for Prokka GFF3 output: https://github.com/sanger-pathogens/gff3toembl

On our side, we also developed our own tool, but we implemented something more general that can be applied to any kind of GFF3. We hope to release it publicly in the next few weeks.


Here is the tool we developed: https://github.com/NBISweden/EMBLmyGFF3

It works for any type of GFF3 annotation.

(reply by Juke-34, written 20 months ago, modified 18 months ago)
Juke-34 (Sweden) wrote, 3.7 years ago:

Someone had already asked about this conversion; I found answers here: Gff3 + Fasta To Genbank (Augustus Training Set)

An easy way to do the conversion using BioPerl is also proposed here:

http://ratt.sourceforge.net/transform.html

Now my problem has changed: I have an issue with the locus name. BioPerl says:

--------------------- WARNING ---------------------
MSG: Bad LOCUS name?  Changing [NODE_57_length_618_cov_40.4969_ID_247618] to 'unknown' and length to NODE_57_length_618_cov_40.4969_ID_247618

Any suggestions about what kind of locus name is expected, so that it does not get replaced by "unknown"?
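One thing that might be worth trying for the warning above is to shorten the sequence IDs before conversion. The sketch below is a hedged illustration, not a confirmed fix: it assumes the long assembler-style name is what triggers the warning, and it uses a 16-character limit based on the name field of the traditional fixed-column GenBank LOCUS line. The function names and the truncation scheme are hypothetical.

```python
import re
from typing import Dict, Iterable

MAX_LOCUS_LEN = 16  # assumption: name width of the traditional LOCUS line


def make_short_ids(seq_ids: Iterable[str], max_len: int = MAX_LOCUS_LEN) -> Dict[str, str]:
    """Map each original ID to a short, unique name without odd characters."""
    mapping: Dict[str, str] = {}
    used = set()
    for original in seq_ids:
        if original in mapping:
            continue  # same ID seen twice: keep the first mapping
        # Keep letters, digits, '_', '.' and '-'; replace the rest; truncate.
        base = re.sub(r"[^A-Za-z0-9_.-]", "_", original)[:max_len]
        candidate = base
        counter = 1
        while candidate in used:
            # Resolve collisions with a numeric suffix, staying within max_len.
            suffix = f"_{counter}"
            candidate = base[: max_len - len(suffix)] + suffix
            counter += 1
        used.add(candidate)
        mapping[original] = candidate
    return mapping


def rename_fasta_headers(fasta_text: str, mapping: Dict[str, str]) -> str:
    """Replace the ID (first word) of each FASTA header using the mapping."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split(None, 1)
            if parts and parts[0] in mapping:
                parts[0] = mapping[parts[0]]
                line = ">" + " ".join(parts)
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    ids = ["NODE_57_length_618_cov_40.4969_ID_247618"]
    for old, new in make_short_ids(ids).items():
        print(old, "->", new)
```

The same mapping would also have to be applied to column 1 of the GFF3 file so that features still point at the renamed sequences.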


OK, I have now found information about the expected LOCUS format here:

Locus Field Format On Genbank

(reply by Juke-34, written 3.7 years ago)