Question: Visualize multiple GFF files
Anand Rao (United States), 12 months ago, wrote:

For each of several genomes, in addition to the already available FASTA sequence and its associated GFF3 annotation file, I have generated 5 more GFF files containing the start-stop coordinates of 5 additional types of genomic features.

My goal is four-fold in the context of these 6 GFF files and 1 genomic DNA sequence.

  1. Explore where two or more of these genomic features overlap / intersect / co-localize. I am currently doing this through text manipulation, using bedtools overlap or bedtools intersect, but a purely text-based calculation of intersections is not satisfying during the exploration phase. I want to visualize 2 or more tracks together.

  2. Generate high-detail images (with flexibility of color / shapes like that in IGB or gff2ps) for small intervals, looking at specific and most interesting cases.

  3. Generate overview images for entire chromosomes, or even a whole genome, showing overlap across these 6 types of genomic features without confusing or overwhelming the reader.

  4. Finally, I would like advice on which tool and/or statistical test to use for verifying whether the observed physical co-localization of any 2 of the 6 types of genomic features is random or non-random. If it is non-random, are there more sophisticated tests to examine the physical distribution of genomic locus types relative to one another? And are there tests that can examine more than 2 types of genomic features at a time?

The rather old thread "What Tools/Libraries Do You Use To Visualize Genomic Feature Data?" discusses answers to questions 1 - 3 above, but I am curious to know whether there are better or more up-to-date tools for my goals than the ones mentioned above or in that thread (bedtools or BEDOPS, IGB, GBrowse, gff2ps). Thanks!

bernatgel (Barcelona, Spain), 7 months ago, wrote:


If you can use R, you should be able to create these plots with karyoploteR. You would need to load the data into R (probably using rtracklayer's 'import' function) and then plot it using kpPlotGenes for genes and kpPlotRegions for everything else. You can find more information and various examples of how to use it on the karyoploteR tutorial page.
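To make the loading step concrete outside R, here is a minimal Python sketch (not rtracklayer's implementation) that reads GFF lines and groups the features by type, one group per prospective track. The function name `read_gff_intervals` and the toy records are made up for illustration. It relies only on the standard GFF column order (seqid, source, type, start, end, ...) and converts the 1-based inclusive GFF coordinates to 0-based half-open intervals.

```python
import io
from collections import defaultdict


def read_gff_intervals(handle):
    """Group GFF features by type as (seqid, start, end) tuples.

    GFF coordinates are 1-based and inclusive; they are converted here
    to 0-based half-open intervals, which are easier to compare.
    """
    by_type = defaultdict(list)
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; stop reading features
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a complete 9-column feature line
        seqid, _source, ftype, start, end = cols[:5]
        by_type[ftype].append((seqid, int(start) - 1, int(end)))
    return dict(by_type)


# Toy input standing in for one of the GFF files (hypothetical records).
example = io.StringIO(
    "##gff-version 3\n"
    "chr1\tdemo\tgene\t101\t200\t.\t+\t.\tID=g1\n"
    "chr1\tdemo\trepeat\t151\t250\t.\t.\t.\tID=r1\n"
)
tracks = read_gff_intervals(example)
print(tracks)
```

Each key of `tracks` then corresponds to one feature type that could be drawn as its own track or passed to an overlap test.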

As for point 4, you can use the Bioconductor package regioneR. Once the data is loaded into R, you can use the function overlapPermTest to run a permutation test of whether two sets of genomic regions overlap more (or less) than expected by chance.
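The idea behind such a permutation test can be sketched in a few lines of plain Python. This is not regioneR's code and makes simplifying assumptions: a single chromosome of known length, regions relocated uniformly at random with their lengths preserved, and only the "more overlap than expected" direction tested. All function names and the toy coordinates are hypothetical.

```python
import random


def count_overlaps(a, b):
    """Count regions in `a` that overlap at least one region in `b`.

    Regions are half-open (start, end) tuples on the same chromosome.
    """
    return sum(any(s < b_end and b_start < e for b_start, b_end in b)
               for s, e in a)


def randomize(regions, chrom_len, rng):
    """Move each region to a uniformly random position, keeping its length."""
    shuffled = []
    for s, e in regions:
        length = e - s
        start = rng.randrange(0, chrom_len - length + 1)
        shuffled.append((start, start + length))
    return shuffled


def overlap_perm_test(a, b, chrom_len, n_perm=1000, seed=0):
    """One-sided test: does `a` overlap `b` more often than random placement?"""
    rng = random.Random(seed)
    observed = count_overlaps(a, b)
    # Count permutations whose overlap is at least as large as the observed one.
    hits = sum(
        count_overlaps(randomize(a, chrom_len, rng), b) >= observed
        for _ in range(n_perm)
    )
    # Adding 1 to numerator and denominator keeps the p-value above zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


# Toy regions (made up): every region in set_a overlaps a region in set_b.
set_a = [(100, 200), (1000, 1100), (5000, 5100)]
set_b = [(150, 250), (1050, 1150), (5050, 5150)]
observed, p_value = overlap_perm_test(set_a, set_b, chrom_len=100_000)
print(observed, p_value)
```

A real analysis would also need to handle multiple chromosomes and masked or unmappable regions, which is the kind of detail a dedicated package takes care of.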

Hope this helps


jrj.healey (United Kingdom), 7 months ago, wrote:

You could do this (I think) with Artemis and/or the Artemis Comparison Tool.

Load the sequence, then read in the different annotation files; I believe you can view them all together in the same window.

It probably won't cover all of your requests, but it's worth a look I think.
