Question: Program To Visualize Gene Models With Highlighted Protein Features
Asked 7.4 years ago by Christian (Cambridge, US):

I have the following input:

(1) A list of protein-coding gene models in GFF3 format.
(2) A list of protein features of these genes (e.g. protein domains or transmembrane regions), with coordinates at the protein sequence level.

As output, I would like to have an image that shows all gene models drawn to scale side-by-side, with protein features mapped onto coding exons. Different feature types should be drawn in different colors (configurable), and features are allowed to span exon-exon junctions. Ideally, the generation of such an image can be completely automated.

Example image:

alt text

Can anyone suggest a program that can do this?

gene visualization • 5.1k views
Modified 7.4 years ago • written 7.4 years ago by Christian
Answer by Christian (Cambridge, US), 7.0 years ago:

I ended up implementing two new BioPerl modules myself that, when used in combination, solve this problem. I just uploaded both modules to GitHub:

Quoting from the module descriptions:

Bio::Graphics::Glyph::decorated_gene - A GFF3-compatible gene glyph with protein decorations.

This glyph extends the functionality of the Bio::Graphics::Glyph::gene glyph and allows to draw protein decorations (e.g., signal peptides, transmembrane domains, protein domains) on top of gene models. Currently, the glyph can draw decorations in form of colored or outlined boxes inside or around CDS segments. Protein decorations are specified at the 'mRNA' transcript level in protein coordinates. Protein coordinates are automatically mapped to nucleotide coordinates by the glyph. Decorations are allowed to span exon-exon junctions, in which case decorations are split between exons. By default, the glyph automatically assigns different colors to different types of protein decorations, whereas decorations of the same type are always assigned the same color.

and

Bio::Draw::FeatureStack - BioPerl module to generate GD images of stacked gene models

FeatureStack creates GD images of vertically stacked gene models to facilitate visual comparison of gene structures. Compared genes can be clusters of orthologous genes, gene family members, or any other genes of interest. FeatureStack takes an array of BioPerl feature objects as input, projects them onto a common coordinate space, flips features on the negative strand (optional), left-aligns them by start coordinates (optional), sets a fixed intron size (optional), removes unwanted transcripts/isoforms (optional), and then draws the so transformed features with a user-specified glyph. Output images can be generated in SVG (scalable vectorized image) or PNG (rastered image) format.

Here is an example output of FeatureStack, showing RFX genes across a diverse set of species.
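
To make the transformations listed in the FeatureStack description a bit more concrete, here is a rough Python sketch of three of them: flipping genes on the negative strand, left-aligning by start coordinate, and drawing every intron at a fixed size. This is not the FeatureStack implementation (that is a BioPerl module); the function name `normalize_gene`, the plain tuple input, and the default intron size of 50 are assumptions made up for illustration. Coordinates are taken to be 1-based and inclusive, as in GFF3.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (GFF3 convention)


def normalize_gene(exons: List[Interval], strand: str,
                   intron_size: int = 50) -> List[Interval]:
    """Project exons into a common drawing space (hypothetical helper).

    The result reads 5'->3' from left to right, starts at position 1,
    and separates consecutive exons by exactly `intron_size` positions.
    """
    ordered = sorted(exons)
    if strand == "-":
        # Flip: mirror coordinates around the gene end so the 5' end is leftmost.
        gene_end = ordered[-1][1]
        ordered = sorted((gene_end - e + 1, gene_end - s + 1) for s, e in ordered)

    out = []
    cursor = 1  # left-align: the first exon always starts at position 1
    for start, end in ordered:
        length = end - start + 1
        out.append((cursor, cursor + length - 1))
        cursor += length + intron_size  # fixed-width intron before the next exon
    return out


if __name__ == "__main__":
    exons = [(1000, 1099), (1500, 1549)]
    print(normalize_gene(exons, "+"))
    print(normalize_gene(exons, "-"))
```

After this step, genes from different loci or species share one coordinate space, so exon sizes can be compared visually without long introns dominating the picture.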

Modified 6.6 years ago • written 7.0 years ago by Christian

FeatureStack and decorated_gene are now available from CPAN as well: http://search.cpan.org/~chrisfr/

Reply written 6.9 years ago by Christian

A paper describing this module is now published in Bioinformatics: http://bioinformatics.oxfordjournals.org/content/early/2012/09/27/bioinformatics.bts572.short

Reply modified 6.6 years ago • written 6.7 years ago by Christian
Answer by ALchEmiXt (The Netherlands), 7.4 years ago:

You might consider using Artemis for that (developed by Sanger). It can read all kinds of formats and feature files, including GFF, GenBank, EMBL, and BAM. It is Java-based. There is also a multiple-sequence comparison version called ACT.

I think Artemis can at least show the multiple features as separate tracks, but it also has a one-line merge option. So I guess you might be able to pull it off by displaying the proteins on top of the other features...

Example of gene builder in Artemis: alt text

Example main Artemis window: alt text

Images are from the paper by Carver et al. (2008).


Great suggestion, +1. Artemis has come a long way. Just tried, it can indeed display features on top of gene models (example here http://bit.ly/wlc3d6), but it seems one has to provide nucleotide and not protein coordinates, correct?

Also, I was looking for a solution that can be completely automated (just edited my question to make that clear). With Artemis, if I wanted to compare many gene models, I would have to look them up individually and cannot compare them side-by-side. Please let me know if I am wrong on this.

Reply written 7.4 years ago by Christian

@Christian: It seems Artemis can be controlled through a command-line API, which would indeed help with automation, but I have not tried that myself yet. Maybe in the future. Dropping Tim Carver an email might help. He is usually quite responsive on the Artemis mailing lists.

Reply written 7.4 years ago by ALchEmiXt

This just popped into my mind (happens a lot lately... :-)): for comparison's sake, you can basically do a multi-track comparison using the Artemis Comparison Tool (ACT), also from Sanger.

Reply written 7.4 years ago by ALchEmiXt
Answer by Scott Cain, 7.4 years ago:

In an upcoming GBrowse release, we're getting ready to roll out new functionality that will allow transparent glyphs, and I think you'd be able to do this, though it would be easier if you transform the protein coordinates to DNA coordinates. I don't have a release schedule, but we generally get them out pretty fast.
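
For anyone who wants to do the protein-to-DNA coordinate transformation mentioned above by hand, here is a minimal Python sketch. It is not GBrowse or Bio::Graphics code; the function name `protein_to_genomic` and the tuple-based input are made up for illustration. It assumes 1-based inclusive coordinates (as in GFF3), a complete CDS whose first segment starts at the first codon position (phase 0), and no frameshifts. A feature that crosses an exon-exon junction comes back as several genomic pieces, one per CDS segment it touches.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (GFF3 convention)


def protein_to_genomic(cds: List[Interval], strand: str,
                       aa_start: int, aa_end: int) -> List[Interval]:
    """Map residues aa_start..aa_end onto genomic CDS segments (hypothetical helper)."""
    # Offsets into the spliced coding sequence, 0-based and half-open.
    nt_start = (aa_start - 1) * 3
    nt_end = aa_end * 3

    # Walk the CDS segments in transcription order (reversed on the minus strand).
    ordered = sorted(cds, reverse=(strand == "-"))
    pieces = []
    consumed = 0  # coding nucleotides located before the current segment
    for seg_start, seg_end in ordered:
        seg_len = seg_end - seg_start + 1
        lo = max(nt_start, consumed)
        hi = min(nt_end, consumed + seg_len)
        if lo < hi:  # the feature overlaps this segment
            if strand == "+":
                g_start = seg_start + (lo - consumed)
                g_end = seg_start + (hi - consumed) - 1
            else:
                g_end = seg_end - (lo - consumed)
                g_start = seg_end - (hi - consumed) + 1
            pieces.append((g_start, g_end))
        consumed += seg_len
    return sorted(pieces)


if __name__ == "__main__":
    cds = [(101, 130), (201, 230)]  # two CDS segments of 10 codons each
    print(protein_to_genomic(cds, "+", 8, 12))
    print(protein_to_genomic(cds, "-", 8, 12))
```

The resulting genomic intervals can then be written out as ordinary features and drawn by any tool that only understands nucleotide coordinates.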


I am currently tinkering around with BioPerl with some success. I would love it if Bio::Graphics could generate such annotated gene models out of the box.

One question Scott: Do I really need transparent glyphs to do this? Is there no other way in Bio::Graphics to draw glyphs on top of each other?

Reply written 7.4 years ago by Christian

No, you don't need transparent glyphs, it would just make it really easy. You could also write your own glyph, which isn't real hard.

Reply written 7.4 years ago by Scott Cain
Answer by ALchEmiXt (The Netherlands), 7.4 years ago:

An alternative that just popped into my mind might be genoPlotR. It basically lets you "program" your genome-based graphics entirely... of course, knowledge of R is required.


Definitely an interesting possibility. However, I think genoPlotR is too much of a general-purpose tool for my specific problem. Of course, if anyone has written genoPlotR code that does what I am looking for... greatly appreciated!

Reply written 7.4 years ago by Christian
Answer by Daniel Standage (Davis, California, USA), 7.4 years ago:

I use the AnnotationSketch tool that comes with the GenomeTools library for most of my gene annotation graphics. You can provide the tool with a style file that allows you to define colors, shapes, etc for different feature types, and also control how features collapse (i.e., which feature types have their own tracks and which feature types should be plotted on their "Parent" features).

AnnotationSketch can definitely do what you describe, although you would have to create a style file and make sure that feature relationships are defined properly in the GFF3 file. However, I've been very pleased with how responsive the GenomeTools mailing list is, so you shouldn't have any trouble getting the help you need.
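
As an illustration of what "feature relationships defined properly in the GFF3 file" could look like, here is a small Python sketch that writes protein features (already converted to genomic coordinates) as child features of their mRNA via the GFF3 `Parent` attribute. The function name `domain_gff3_lines`, the source label `example`, and the choice of `protein_match` as feature type are assumptions for illustration only. Whether AnnotationSketch draws such features on top of the parent depends on the style file, which is not shown here.

```python
from typing import List, Tuple


def domain_gff3_lines(seqid: str, strand: str, mrna_id: str, name: str,
                      segments: List[Tuple[int, int]],
                      source: str = "example",
                      ftype: str = "protein_match") -> List[str]:
    """Format one GFF3 line per genomic segment of a protein feature.

    Each line points to its transcript through Parent=<mrna_id>, so a
    GFF3-aware tool can attach the feature to the right gene model.
    """
    lines = []
    for start, end in sorted(segments):
        attributes = f"Parent={mrna_id};Name={name}"
        # GFF3 columns: seqid, source, type, start, end, score, strand, phase, attributes
        fields = [seqid, source, ftype, str(start), str(end),
                  ".", strand, ".", attributes]
        lines.append("\t".join(fields))
    return lines


if __name__ == "__main__":
    # A transmembrane region split across two CDS segments of mRNA1.
    for line in domain_gff3_lines("chr1", "+", "mRNA1", "TM",
                                  [(122, 130), (201, 206)]):
        print(line)
```

These lines would be appended to the GFF3 file that already contains the gene, mRNA, and CDS records, with `mRNA1` replaced by the real transcript ID.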

Nice! Setting up and designing graphs requires programming in C or Python, right? Pity there is no Perl support... or do I get its usage wrong?

Reply written 7.4 years ago by ALchEmiXt

No, I don't think there are Perl bindings, but they do have bindings for C, Ruby, Python, Lua... I have written C programs that use the C bindings, but most of the time I used the command-line tool gt sketch, which doesn't require anything other than a GFF3 file (and a style file if you don't like the default style).

Reply written 7.4 years ago by Daniel Standage