Question: Can't display protein domain data from UCSC table unipDomain using Gviz
6 weeks ago, paul.jaschke wrote:

Hi all, I am using Gviz to create a figure with the exons of a gene in one track and, below that, the protein domains mapped to the genome sequence. I can map the transcript/exon information easily by grabbing it from the UCSC site with the UcscTrack() function, but I cannot work out how to properly pull the data out of the protein domain table 'unipDomain', which is described by its table schema.

The problem seems to be how the data is organized within the table. There is no simple 1:1 mapping onto the 'chromStart' and 'chromEnd' columns, because each feature is split into several blocks. The block start positions are stored in 'chromStarts' as comma-separated offsets relative to 'chromStart', and the block widths are stored in 'blockSizes', also as comma-separated values. I don't see any way to handle this with the GeneRegionTrack, AnnotationTrack, or DataTrack classes. Any help would be greatly appreciated, either solving this problem or pointing me towards an easier way to represent domain information on the chromosome with another method. Thanks!
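To make the offset arithmetic concrete, here is a minimal base R sketch. It assumes a BED12-style layout with 0-based, half-open coordinates; the row values and the helper names `split_ints` and `expand_blocks` are hypothetical and only illustrate the calculation, they are not taken from the unipDomain table or from any package.

```r
# Hypothetical BED12-style row; the values are made up for illustration
# and are not real unipDomain data.
bed <- data.frame(
  chrom       = "chr11",
  chromStart  = 1000L,          # 0-based start of the whole feature
  chromEnd    = 1900L,
  name        = "exampleDomain",
  strand      = "+",
  blockSizes  = "100,50,200,",  # comma-separated block widths
  chromStarts = "0,300,700,",   # block offsets relative to chromStart
  stringsAsFactors = FALSE
)

# Split a comma-separated list into integers, dropping empty pieces
split_ints <- function(x) {
  parts <- strsplit(x, ",", fixed = TRUE)[[1]]
  as.integer(parts[nzchar(parts)])
}

# Expand one BED12-style row into one row per block
expand_blocks <- function(row) {
  sizes   <- split_ints(row$blockSizes)
  offsets <- split_ints(row$chromStarts)
  stopifnot(length(sizes) == length(offsets))
  data.frame(
    chromosome = row$chrom,
    start  = row$chromStart + offsets + 1L,   # 0-based to 1-based start
    end    = row$chromStart + offsets + sizes,
    strand = row$strand,
    group  = row$name,                        # keeps blocks of one domain together
    stringsAsFactors = FALSE
  )
}

blocks <- do.call(
  rbind,
  lapply(seq_len(nrow(bed)), function(i) expand_blocks(bed[i, ]))
)
print(blocks)
```

Each block then has its own start and end, which is the one-row-per-box shape the track classes expect. Whether the `+ 1L` shift is needed depends on whether the coordinates have already been converted to 1-based by the time they reach R.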

This is what I would expect it to look like, based on the track in the UCSC browser: [image: UCSC browser]

This is what mine looks like (one big box instead of being split along the exons): [image: Gviz output]

Code used to generate the image above:

library(Gviz)
library(GenomicRanges)

gen <- "hg38"
chr <- "chr11"

# Create the Ideogram track
itrack <- IdeogramTrack(genome = gen, chromosome = chr)

# Create the GenomeAxisTrack
gtrack <- GenomeAxisTrack()

## example of FKBP2
from <- 64240500
to <- 64244500
knownGenes <- UcscTrack(genome = "hg38", 
                        chromosome = "chr11",
                        track = "knownGene", 
                        from = from,
                        to = to, 
                        trackType = "GeneRegionTrack",
                        rstarts = "exonStarts", 
                        rends = "exonEnds", 
                        gene = "name",
                        symbol = "name",
                        transcript = "name",
                        strand = "strand",
                        fill = "#8282d2",
                        name = "UCSC Genes")

domains <- UcscTrack(genome = "hg38",
                     chromosome = "chr11",
                     track = "uniprot",
                     table = "unipDomain",
                     from = from,
                     to = to,
                     trackType = "GeneRegionTrack",
                     rstarts = "chromStart",
                     rends = "chromEnd",
                     gene = "name",
                     symbol = "name",
                     strand = "strand",
                     name = "domain")

plotTracks(list(itrack, gtrack, knownGenes, domains), transcriptAnnotation = "gene")
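
One possible way to get separate boxes per exon, sketched under assumptions: build an AnnotationTrack from per-block coordinates and use `group` to tie the blocks of one domain together. The coordinates and the domain name below are hypothetical placeholders, not values from unipDomain, and this is an untested illustration rather than a confirmed fix for the UcscTrack() call above.

```r
library(Gviz)

# Hypothetical 1-based block coordinates for a single domain (made-up values)
blocks <- data.frame(
  start = c(1001L, 1301L, 1701L),
  end   = c(1100L, 1350L, 1900L),
  group = "exampleDomain",
  stringsAsFactors = FALSE
)

# One range per block; blocks sharing a group are drawn as connected boxes
dtrack <- AnnotationTrack(
  start      = blocks$start,
  end        = blocks$end,
  chromosome = "chr11",
  strand     = "+",
  genome     = "hg38",
  group      = blocks$group,
  name       = "domain"
)

# Label each group with its name
plotTracks(list(GenomeAxisTrack(), dtrack), groupAnnotation = "group")
```

With real data, the block coordinates would come from expanding 'chromStarts' and 'blockSizes' for each row of the table, and the resulting track could be added to the plotTracks() list alongside the ideogram and gene tracks.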

Cross-posted: https://support.bioconductor.org/p/133785/

written 6 weeks ago by Kevin Blighe

Thanks, didn't know I needed to do that.

written 6 weeks ago by paul.jaschke

The idea is that you wouldn't normally do that. Cross-posting is considered bad form.

written 6 weeks ago by James W. MacDonald