Question: R+Gviz: Fake AlignmentsTrack() to visualize Enhancer-Promoter connections
1
gravatar for zepper
10 months ago by
zepper50
zepper50 wrote:

Hello Biostars-Community,

I would need your kind suggestions regarding an issue with the Gviz package from Bioconductor. I am attempting plot Enhancer-Promoter connections and tried to (mis)use the AlignmentsTrack() function and the "sashimi" plots display type for it.

Typically, alignment tracks are generated from BAM files and if I do that, I can nicely display the sashimi plots as shown in the manual. However I would like to visualize enhancer-promoter connections and thus attempted to create a fake alignment file from simulated reads spanning the coordinates of the enhancers and promoters and connecting them with introns.

Unfortunately for this fake file, I can't get the sashimi plots to work. It will display the fake reads in the "pileup" mode, but when I choose type="sashimi", the track remains blank and I get a strange error. I suppose the sashimi plot depends on additional columns like the mapping quality or such, that would automatically be read from a BAM file, but which I didn't provide with the simulated data. Any suggestions?

Thanks a lot Matthias

Minimal working (not working) example:

    library("Gviz")
    library("GenomicRanges")


    eppairs <-
      structure(
        list(
          ID_trscpt = c("Ppp3cb.NM_001310426", "Ppp3cb.NM_001310427"),
          ID_enh = c(
            "WTonly_chr14:21460154-21460564",
            "WTonly_chr14:21297822-21298296"
          ),
          Relevance_int = c(21.32, 10.66),
          chr = c("chr14", "chr14"),
          start = c(21460154L, 21297822L),
          stop = c(21460564L, 21298296L),
          chr_trscpt = c("chr14", "chr14"),
          start_trscpt = c(21365694L,
                           21365694L),
          stop_trscpt = c(21366295L, 21366295L),
          Gene = c("Ppp3cb",
                   "Ppp3cb")
        ),
        .Names = c(
          "ID_trscpt",
          "ID_enh",
          "Relevance_int",
          "chr",
          "start",
          "stop",
          "chr_trscpt",
          "start_trscpt",
          "stop_trscpt",
          "Gene"
        ),
        row.names = c(7981L, 7984L),
        class = "data.frame"
      )


# construct a fake alignment file for sashimi plots. For each relevance point connection, add 10 fake aligned read pairs
read_weights <- floor(eppairs[,"Relevance_int"]*10)

sashimidata <- eppairs[,c("start","stop","start_trscpt","stop_trscpt")]
sashimidata <- apply(sashimidata,1,function(x){
  y <- sort(as.numeric(x))
  data.frame("start"=y[1],"stop"=y[4],"cigar"=paste0(y[2]-y[1],"M",y[3]-y[2],"N",y[4]-y[3],"M"))
  })

sashimidata <- data.frame("chr"=rep(eppairs[,"chr"],read_weights),do.call("rbind",sashimidata[rep(names(sashimidata),read_weights)]))


track_ep_pairs <- AlignmentsTrack(range=with(sashimidata,GRanges(chr,IRanges(start,stop),"*",cigar)),genome="mm9",isPaired=FALSE)

# works, I see gapped reads
plotTracks(list(track_ep_pairs),extend.left = 0.1, extend.right = 0.1)
# Error
plotTracks(list(track_ep_pairs),extend.left = 0.1, extend.right = 0.1,type="sashimi")

#Error in GAlignments(seqnames = seqnames(range), pos = start(range), cigar = range$cigar,  : 
#                        'cigar' must be a character vector with no NAs

# BUT:
#table(sashimidata$cigar)
#601M93859N410M 474M67398N601M 
#213            106 

#anyis.na(sashimidata$cigar))
#[1] FALSE
gviz sashimi genomicranges R • 449 views
ADD COMMENTlink modified 10 months ago • written 10 months ago by zepper50
1
gravatar for zepper
10 months ago by
zepper50
zepper50 wrote:

Thanks to everybody, who read this post and tried to help. I could solve the problem now.

I was unfamiliar with the data format of Gviz tracks, but it turned out that the underlying data can be manipulated just as a regular GRanges object. After I finally found out that a simple

ranges(track_ep_pairs)

gives you access to the data used to generate a track, I could analyze it and track down issues. While mcols() won't work on tracks, ranges() does. It turns out that a simple factor to character conversion of the cigar column will do:

track_ep_pairs <- AlignmentsTrack(range=with(sashimidata,GRanges(chr,IRanges(start,stop),"*",cigar)),genome="mm9",isPaired=FALSE)
ranges(track_ep_pairs)$cigar <- as.character(ranges(track_ep_pairs)$cigar)

Now the sashimi plots render like a charm. Yeah!

ADD COMMENTlink written 10 months ago by zepper50
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 769 users visited in the last hour