Question: Using Ensembl genomes and GTF conversion
as9309 wrote, 3.4 years ago:

Hi,

I'm trying to align my RNA-seq data to an E. coli reference genome that I downloaded from Ensembl Bacteria, but I'm stuck because I need the genome annotation in GTF format. I can only download it in GFF3, and I cannot convert GFF3 to GTF because gffread does not work ("Uncaught exception in exposed API method:").

Does anyone know how I can either align my data to the most current E. coli reference genome or convert my GFF3 to GTF without using gffread?

I have previously tried to convert an Ensembl GTF to the correct format ("https://usegalaxy.org/u/jeremy/p/transcriptome-analysis-faq"), but it gives me tabular output. Can I change this?

Thanks!

gffread conversion rna-seq genome • 1.4k views
modified 3.4 years ago by andrew.j.skelton73

GFF3 is supposed to be a backwards-compatible specification of the GFF tabular format, and GTF, as far as I understand it, is similar to GFF2. So why do you need to convert GFF3 to GTF, and why is tabular output not the correct one?
 

written 3.4 years ago by Jean-Karim Heriche
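
To make the format difference above concrete: GFF3 and GTF are both 9-column tab-delimited files and differ mainly in how the last (attributes) column is written. GFF3 uses `key=value` pairs, while GTF uses `key "value";` pairs. The sketch below only rewrites that syntax. The function name and the example IDs are made up for illustration, and this is not a full converter: a valid GTF also needs `gene_id` and `transcript_id` attributes on each line, which this does not create.

```python
def gff3_attrs_to_gtf(attr_field):
    """Rewrite a GFF3 attributes string (key=value;...) in GTF syntax (key "value"; ...)."""
    pairs = []
    for item in attr_field.strip().split(";"):
        if not item:
            continue  # tolerate a trailing semicolon
        key, _, value = item.partition("=")
        pairs.append(f'{key} "{value}";')
    return " ".join(pairs)


# Hypothetical attribute string, shaped like the transcript ID mentioned in this thread.
print(gff3_attrs_to_gtf("ID=transcript:AAC75783;Parent=gene:example_gene"))
```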

When I use Cufflinks with the GFF3 file, it provides the correct gene annotation but with incorrect gene names, e.g. instead of the RpoS gene, it gives me "transcript:AAC75783". Also, the Cufflinks programme will not recognise my .tab file. Do you know how I can rearrange the GFF3 file so that genes are identified by their names instead of transcript IDs?

written 3.4 years ago by as9309

Could it be because your file doesn't have a .gtf or .gff extension? If you're using files from Ensembl, those should be in GTF format with the proper .gtf extension. If you're using GFF3, do the features have a gene_name attribute?

written 3.4 years ago by Jean-Karim Heriche
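
One way to check whether a GFF3 file carries gene names at all is to build a transcript-to-name lookup from it. The sketch below assumes that gene features have an `ID` starting with `gene:` plus a `Name` attribute, and that transcripts have an `ID` starting with `transcript:` and point to their gene via `Parent`. Those prefixes are inferred from the "transcript:AAC75783" ID quoted in this thread and may not match every file. The example lines and the gene ID are made up.

```python
def parse_attrs(field):
    """Turn 'ID=x;Name=y' into {'ID': 'x', 'Name': 'y'}."""
    return dict(item.split("=", 1) for item in field.strip().split(";") if "=" in item)


def transcript_to_gene_name(gff3_lines):
    """Map each transcript ID to the Name of its parent gene, where one exists."""
    gene_names = {}  # gene ID -> gene Name
    parents = {}     # transcript ID -> parent gene ID
    for line in gff3_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a feature line
        attrs = parse_attrs(cols[8])
        feature_id = attrs.get("ID", "")
        if feature_id.startswith("gene:") and "Name" in attrs:
            gene_names[feature_id] = attrs["Name"]
        if feature_id.startswith("transcript:") and "Parent" in attrs:
            parents[feature_id] = attrs["Parent"]
    return {t: gene_names[g] for t, g in parents.items() if g in gene_names}


# Hypothetical two-feature file; only the transcript ID comes from the thread.
example = [
    "##gff-version 3",
    "Chromosome\texample\tgene\t1\t100\t.\t-\t.\tID=gene:example_gene;Name=rpoS",
    "Chromosome\texample\tmRNA\t1\t100\t.\t-\t.\tID=transcript:AAC75783;Parent=gene:example_gene",
]
print(transcript_to_gene_name(example))
```

If the lookup comes back empty for a real file, the names are probably stored under a different attribute key, so it is worth looking at a few raw lines first.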

Yes, worked with the gff3 file! Thanks for your help.

written 3.4 years ago by as9309
andrew.j.skelton73 (London) wrote, 3.4 years ago:

It might be worth specifying which pipeline you're trying to use. Tuxedo, maybe, considering you're using gffread? Also, just to make sure: you're using the GTF from here, right? That should be in GTF format already.


Thanks for your reply. Yes, I took the MG1655 genome from Ensembl Bacteria, but when I upload it to Galaxy, it is recognised as a GFF file.

written 3.4 years ago by as9309
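
Before uploading, it can help to check what the file content itself looks like, independent of its extension. The sketch below is a rough heuristic and not Galaxy's own detection logic: it looks for a `##gff-version` directive, then at whether the attributes column of the first feature line uses GTF-style `gene_id "..."` or GFF3-style `key=value`. The function name and the example lines are made up for illustration.

```python
import re


def guess_annotation_format(lines):
    """Guess 'gtf', 'gff3', 'gff' or 'unknown' from the first informative line."""
    for line in lines:
        if line.startswith("##gff-version"):
            parts = line.split()
            return "gff3" if len(parts) > 1 and parts[1].startswith("3") else "gff"
        if line.startswith("#") or not line.strip():
            continue  # skip other comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            return "unknown"  # neither format has anything but 9 columns
        if re.search(r'\bgene_id "[^"]*"', cols[8]):
            return "gtf"
        if "=" in cols[8]:
            return "gff3"
        return "gff"
    return "unknown"


# Hypothetical feature lines in each style.
print(guess_annotation_format(['chr\tsrc\texon\t1\t10\t.\t+\t.\tgene_id "g1"; transcript_id "t1";']))
print(guess_annotation_format(["chr\tsrc\tgene\t1\t10\t.\t+\t.\tID=gene:g1;Name=x"]))
```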

If you're doing this on Galaxy, then I'd suggest you check out the Galaxy Wiki, which lists mailing lists for support.

written 3.4 years ago by andrew.j.skelton73