Question: Converting GFF/GTF + Reference To EMBL Or GenBank... Any Tools Available?
JacobS (Cleveland, Ohio) wrote, 5.1 years ago:

I need to be able to convert easily between GFF/GTF + reference and the EMBL or GenBank formats, in both directions. Are there any frequently used tools for accomplishing this, or should I script something myself?

gff genbank • 13k views
Hamish (UK) wrote, 5.1 years ago:

The EMBOSS tool seqret would be a possible option. For example:

Generating an EMBL-Bank style entry from a fasta sequence and a GFF feature table:

seqret -sequence aj242600.fasta -feature -fformat gff -fopenfile aj242600.gff -osformat embl -auto

Alternatively to get a GenBank style entry:

seqret -sequence aj242600.fasta -feature -fformat gff -fopenfile aj242600.gff -osformat genbank -auto

To go the other way and get the sequence in fasta format and the features as GFF use something like:

seqret -sformat embl -sequence aj242600.dat -feature -osformat fasta -offormat gff -auto

Please note that, because these commands start from a sequence plus features, they do not create a full EMBL-Bank or GenBank style entry: that requires additional information, such as references, which is not available in the source data.
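
If many sequence/feature pairs need converting, the seqret commands above could be wrapped in a small loop. The following is only a sketch under some assumptions: the helper name `seqret_cmd` is made up for illustration, each `NAME.fasta` is assumed to have a matching `NAME.gff` next to it, and file names are assumed to contain no spaces. It uses only the seqret options already shown above and prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: print the seqret command for one FASTA/GFF pair.
# $1 = FASTA file (NAME.fasta), $2 = output format (embl or genbank).
# Assumption: the feature table is called NAME.gff and sits beside the FASTA file.
seqret_cmd() {
    fasta="$1"
    format="$2"
    gff="${fasta%.fasta}.gff"
    printf 'seqret -sequence %s -feature -fformat gff -fopenfile %s -osformat %s -auto\n' \
        "$fasta" "$gff" "$format"
}

# Dry run: print one command per FASTA file in the current directory.
for f in *.fasta; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    seqret_cmd "$f" genbank
done
```

Once the printed commands look right, the output of the loop can be piped to `sh` to run them.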

j.dolata wrote, 20 months ago:

Hi, I would like to extract data in GenBank format based on a genome FASTA file and a GFF file with coordinates. Could anybody help me?


It would be best to ask this as a separate question.

Reply written 20 months ago by WouterDeCoster

Bedtools can extract the FASTA subsequences for the feature coordinates:

Tool:    bedtools getfasta (aka fastaFromBed)
Version: v2.25.0
Summary: Extract DNA sequences into a fasta file based on feature coordinates.

Usage:   bedtools getfasta [OPTIONS] -fi <fasta> -bed <bed/gff/vcf> -fo <fasta> 

Options: 
    -fi Input FASTA file
    -bed    BED/GFF/VCF file of ranges to extract from -fi
    -fo Output file (can be FASTA or TAB-delimited)
    -name   Use the name field for the FASTA header
    -split  given BED12 fmt., extract and concatenate the sequencesfrom the BED "blocks" (e.g., exons)
    -tab    Write output in TAB delimited format.
        - Default is FASTA format.

    -s  Force strandedness. If the feature occupies the antisense,
        strand, the sequence will be reverse complemented.
        - By default, strand information is ignored.

    -fullHeader Use full fasta header.
        - By default, only the word before the first space or tab is used.

get bedtools from here
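
According to the help text above, getfasta accepts GFF directly, so no conversion is strictly needed. If you want control over what ends up in the name field used by `-name`, one option is to convert the GFF to BED6 first. This is a rough sketch, not a tested pipeline: the function name `gff_to_bed6` is made up, the input is assumed to be tab-delimited nine-column GFF, and the whole attribute column is used as the name. GFF starts are 1-based and inclusive while BED starts are 0-based, so the start is decremented by one and the end is left unchanged.

```shell
#!/bin/sh
# Hypothetical helper: convert nine-column GFF on stdin to BED6 on stdout.
# BED6 columns: chrom, start (0-based), end, name, score, strand.
gff_to_bed6() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/   { next }                      # skip comment and directive lines
         NF < 9 { next }                      # skip lines that are not full GFF records
         {
             score = ($6 == "." ? 0 : $6)     # assumption: use 0 when GFF has no score
             print $1, $4 - 1, $5, $9, score, $7
         }'
}

# Example with a single made-up feature on the minus strand.
printf 'chr1\tsrc\tgene\t11\t20\t.\t-\t.\tID=g1\n' | gff_to_bed6
```

The converted file (here assumed to be saved as `features.bed`) could then be passed to `bedtools getfasta -fi genome.fa -bed features.bed -name -s -fo features.fa`, which uses only options listed in the help text above; `genome.fa` and `features.fa` are placeholder file names.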

Reply written 14 months ago (modified 14 months ago) by Stephane Plaisance