I have set up a UCSC track hub that includes annotations originally in GTF or GFF format. I followed the steps described here: https://genome.ucsc.edu/goldenPath/help/bigGenePred.html (example 4: GTF (or GFF) to bigGenePred) to obtain a binary bigGenePred file for each GTF / GFF.
It works fine, except that when visualizing the result, I lose the data on introns and strand orientation. For each annotated gene, I get a solid line going from the 5' end to the 3' end of the gene.
One example of what I get (the top 3 light blue lines are from one of my GTF annotations converted to binary bigGenePred):
I really want to see a finer granularity, where exons, introns and strand orientation are visible, so I tried uploading one of my GTF files as a custom track. This works: I do not lose data from the annotation and get to see all the details. However, I could not find a way to integrate a custom track into a track hub.
Result with the custom track (the top 3 black lines are from one of my GTF annotations):
So this is my question: what is the best way to integrate a GTF file into a track hub without losing any detail?
Thanks for the help.
bigGenePred does support intron/strand information, so this should not be a problem when converting from a GTF/GFF. My initial suspicion is that the problem lies in the way the trackDb stanza is declared in the hub. Are you stating type bigGenePred, etc? Another possibility could be the input GTF file or the conversion (stating the correct bigBed 12+8), but that's less likely. Essentially, from your first screenshot it looks like the Genome Browser is displaying your file as a BED3.
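One way to rule out the conversion itself is to look at the tab-separated text file that is fed to bedToBigBed and confirm that the strand and exon block columns are populated. The following is a minimal Python sketch under that assumption: it reads only the 12 standard BED columns and ignores the extra bigGenePred fields. The function name parse_bed12 and the example line are made up for illustration and do not come from your data.

```python
def parse_bed12(line):
    """Parse the first 12 columns of a tab-separated BED12+ line.

    Returns the strand plus exon and intron intervals in genomic
    coordinates (0-based, half-open, as in BED).
    """
    fields = line.rstrip("\n").split("\t")
    chrom = fields[0]
    chrom_start = int(fields[1])
    strand = fields[5]
    block_count = int(fields[9])
    # blockSizes and blockStarts are comma-separated, often with a trailing comma
    sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
    starts = [int(x) for x in fields[11].rstrip(",").split(",")]
    if len(sizes) != block_count or len(starts) != block_count:
        raise ValueError("blockCount does not match blockSizes/blockStarts")
    # blockStarts are relative to chromStart
    exons = [(chrom_start + s, chrom_start + s + n) for s, n in zip(starts, sizes)]
    # Introns are the gaps between consecutive exons
    introns = [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]
    return {"chrom": chrom, "strand": strand, "exons": exons, "introns": introns}


# Hypothetical three-exon transcript on the minus strand
example = "chr1\t1000\t5000\ttx1\t0\t-\t1200\t4800\t0\t3\t200,300,500,\t0,1500,3500,"
rec = parse_bed12(example)
print(rec["strand"], rec["exons"], rec["introns"])
```

If lines from the converted file show more than one block and a strand of + or -, the exon/intron structure survived the conversion, and the display problem is more likely on the hub side.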
If you would like to email us a copy of the GTF file as well as a link to the hub to our private mailing list (firstname.lastname@example.org) we could take a look. Note that only internal Genome Browser staff can see the contents of the message.
Thanks for your support, it's solved! I stated "bigBed" instead of "bigGenePred" in the trackDb. I got confused by the last command, "bedToBigBed", in the conversion from GTF to bigGenePred.
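For anyone hitting the same symptom, here is a minimal sketch of what the corrected trackDb stanza could look like. The track name, labels and file name are hypothetical placeholders, not the ones from my hub; the line that matters is the type line. Even though the file is written by bedToBigBed, the type has to be declared as bigGenePred for the browser to draw exons, introns and strand.

```
# Hypothetical track name, labels and file name; adapt to your own hub.
track myGtfAnnotation
bigDataUrl myGtfAnnotation.bb
shortLabel My GTF annotation
longLabel My annotation converted from GTF to bigGenePred
type bigGenePred
visibility pack
```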
Can't you use the GFF or GTF files directly in the track hub? Some important features were probably missing from the GFF/GTF files you converted. I suggest you standardise them with agat_sp_gxf2gxf.pl from AGAT and retry the conversion. If it still does not work, you could add intron features too using
Nope, it doesn't seem so. The UCSC manual says:
By compressed binary they mean one of these (from this part of UCSC manual):
I have exactly the same input data in both cases (track hub and custom track), so nothing is missing from the GTF/GFF. The problem is the conversion to bigGenePred, which produces a loss of data (the same as converting from GTF to a BED file).
Thanks for the help and suggestions! I will try it if I cannot find a more standard solution. There are lots of detailed annotations in UCSC, so I guess there must be some UCSC-internal way of doing it.