Converting GFF to GTF for a specific non-model genome
4.2 years ago
joelepaul • 0

Hi @ll!

I am working with a recently sequenced genome of a non-model system, which is this one:

https://www.ncbi.nlm.nih.gov/assembly/GCA_004329575.1/

The assembly that I can download from there includes only a *.GFF file. A *.GTF file is available too, but it is largely empty (the GFF is not). My bioinformatics pipeline specifically requires me to specify a GTF file. It also says, "Note that the GTF file should resemble the Ensembl format.".

Hence I would like to ask for advice on how best to convert this specific GFF file to GTF. I have read that the conversion differs from case to case and that there is no general method that works on all GFF files.

So I would be very happy if someone experienced here could take a quick look at this specific GFF file and advise me on how best to convert it into a usable *.GTF file.

Thank you a lot for your time!

Cheers

Joe

gtf gff • 1.3k views
4.2 years ago
Juke34 8.5k

If you mean this GFF file, ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/004/329/575/GCA_004329575.1_ASM432957v1/GCA_004329575.1_ASM432957v1_genomic.gff.gz, it is what I call a fake. It does not contain any annotation; it only defines 'region' features. So what you need to do first is not convert the GFF but create a real annotation.

EDIT: They have now removed the file
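For anyone wanting to run the same check on their own file, here is a minimal sketch that tallies the feature types in column 3 of a GFF. The function names `count_feature_types` and `has_gene_models` are made up for illustration, and the list of types treated as "real annotation" is an assumption, not an official definition.

```python
import gzip
from collections import Counter


def count_feature_types(gff_path):
    """Count the values in column 3 (feature type) of a GFF/GTF file."""
    # Transparently handle gzip-compressed files such as *.gff.gz
    opener = gzip.open if str(gff_path).endswith(".gz") else open
    counts = Counter()
    with opener(gff_path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # an embedded sequence section follows; stop here
            if line.startswith("#") or not line.strip():
                continue  # skip comments, directives and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 9:  # a feature line has 9 tab-separated columns
                counts[fields[2]] += 1
    return counts


def has_gene_models(counts):
    """Assumption: any of these feature types indicates an actual annotation."""
    return any(counts[t] > 0 for t in ("gene", "mRNA", "transcript", "exon", "CDS"))
```

Calling `has_gene_models(count_feature_types("GCA_004329575.1_ASM432957v1_genomic.gff.gz"))` on a file that contains only 'region' lines would return `False`, which means there is nothing to convert to GTF yet.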

That's a bummer. Unfortunately I do not have the funds available to create a real annotation, so I cannot apply the pipeline that I wanted to (which wants a GTF file as a mandatory requirement). Is there maybe a way to "simulate" an annotation without actual work on the DNA, such as described here: https://galaxyproject.github.io/training-material/topics/genome-annotation/tutorials/genome-annotation/tutorial.html ?

Run Augustus with the HMM model of the species closest to the one you want to investigate. It is the quickest decent approach. You must first mask the repeats in your genome with RepeatMasker (here too, you can use the repeat library of the closest available species).

There are plenty of automated and semi-automated ways to do annotation. You could use funannotate, which is available as a container.

Here is a list of annotation tools.
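The RepeatMasker-then-Augustus route mentioned above could look roughly like the sketch below. It only assembles the two command lines and, optionally, runs them. The helper names, the `masked` output directory and both species arguments are placeholders. The flags are the ones I believe the tools accept (`-species`, `-xsmall`, `-gff`, `-dir` for RepeatMasker; `--species`, `--softmasking`, `--gff3` for Augustus), so check them against `RepeatMasker -h` and `augustus --help` for your installed versions.

```python
import os
import subprocess


def build_annotation_commands(genome_fasta, repeat_species, augustus_species,
                              out_dir="masked"):
    """Build (but do not run) the soft-masking and gene-prediction commands."""
    # -xsmall soft-masks repeats (lowercase) instead of replacing them with N
    mask_cmd = ["RepeatMasker", "-species", repeat_species, "-xsmall",
                "-gff", "-dir", out_dir, genome_fasta]
    # RepeatMasker writes <input name>.masked into the output directory
    masked_fasta = os.path.join(out_dir, os.path.basename(genome_fasta) + ".masked")
    predict_cmd = ["augustus", f"--species={augustus_species}",
                   "--softmasking=1", "--gff3=on", masked_fasta]
    return mask_cmd, predict_cmd, masked_fasta


def run_annotation(genome_fasta, repeat_species, augustus_species,
                   out_dir="masked", predictions_path="augustus.gff3"):
    """Run both steps; requires RepeatMasker and augustus on the PATH."""
    mask_cmd, predict_cmd, _ = build_annotation_commands(
        genome_fasta, repeat_species, augustus_species, out_dir)
    os.makedirs(out_dir, exist_ok=True)
    subprocess.run(mask_cmd, check=True)
    # Augustus prints its predictions to standard output
    with open(predictions_path, "w") as out:
        subprocess.run(predict_cmd, stdout=out, check=True)
```

Both species arguments have to be names the tools actually know: a clade present in the RepeatMasker library and one of the parameter sets shipped with Augustus. The resulting GFF3 is the file you would then convert to GTF.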

My main issue is that there is no hmm model of a remotely close species to this snail. Gastropods/molluscs are not vertebrates or mammals, not worms, not insects, not plants and not fungi.
