Range line chart with breaks
9.8 years ago
Prasad ★ 1.6k

I have a dataset like the one below. Each row gives the start and stop positions of a range on a chromosome.

I need to draw one line per start-stop range along the x-axis, with the y-axis held at a constant value (e.g. 2).

The x-axis range should be user-defined, for example 0 to 3 Mb with an interval of 0.5 Mb.

982853    978000
983330    982868
2274704    2275209
2274717    2275209
976011    975597
976011    975600
983330    982919
1914374    1914874
1914375    1914874

How can I achieve this? Please also suggest any alternatives.

Thanks

R chromosome • 2.3k views

Thanks for the reply. What I need looks like the RE section of the figure linked below.

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3925796/figure/fig3/


I read the dataset as you described, loaded library(GenomicRanges), and ran the step below, but I get an error:

> r<-with(tab, GRanges("chr1", IRanges(pmin(start, end), pmax(start, end)), strand=ifelse(start > end,"-","+")))

Error in rep(each, length.out = l2) : 
  attempt to replicate an object of type 'closure'

Can you please clarify what is going wrong?
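One likely cause, offered as a guess since the thread does not show how the file was read: if tab has no columns literally named start and end, with() falls back to the functions of those names, and a function is what R calls a 'closure'. The sketch below uses base R only. The inline text and the names tab_unnamed, lo and hi are just for illustration.

```r
# The nine ranges pasted in the question, with no header line.
txt <- "982853 978000
983330 982868
2274704 2275209
2274717 2275209
976011 975597
976011 975600
983330 982919
1914374 1914874
1914375 1914874"

# Read without column names: read.table() falls back to V1 and V2,
# so 'start' and 'end' inside with() would resolve to functions instead.
tab_unnamed <- read.table(text = txt)
print(names(tab_unnamed))
print(is.function(start))

# Naming the columns explicitly makes with() find the data.
tab <- read.table(text = txt, col.names = c("start", "end"))
lo <- with(tab, pmin(start, end))  # left edge of each range
hi <- with(tab, pmax(start, end))  # right edge of each range
print(data.frame(lo, hi))
```

If the real file has no header line, header=TRUE would also turn the first data row into column names, so passing col.names (or adding a "start end" header line to the file) is the safer option.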


Yes, got it.


Thanks everyone for the valuable comments. I have one small follow-up: suppose I want the x-axis ticks at 950000, 960000, 970000, 980000 and so on up to the end of this sample set. How can I fix the x-axis interval?

Is it possible to do this in ggbio with IRanges(950000, 3000000)?

I tried something like this, for example:

r<-with(tab, GRanges("chr1", IRanges(969812, 101000)))
autoplot(r, stat="reduce")

But I get a black image in the plot window, and the plotted lines are not clear.

Can you please help me out?

9.8 years ago

Here is an example of using ggbio to achieve this. First, I read your pasted ranges and munge them into a formal data structure for representing genomic ranges called GRanges:

tab <- read.table("so.txt", header=TRUE)

library(GenomicRanges)
r <- with(tab, GRanges("chr1",
                       IRanges(pmin(start, end),
                               pmax(start, end)),
                       strand = ifelse(start > end, "-", "+")))

Now, ggbio will know how to plot it:

library(ggbio)
autoplot(r)


It seems like the input in the question contains nine elements, but the figure above only shows seven elements.

9.8 years ago

You can specify xlim and ylim parameters to a plot() function call. Take a look at the segments() function to draw lines between (x, y) locations, within a plot.

#!/usr/bin/env Rscript

x1 <- c(982853, 983330, 2274704, 2274717, 976011, 976011, 983330, 1914374, 1914375)
x2 <- c(978000, 982868, 2275209, 2275209, 975597, 975600, 982919, 1914874, 1914874)
y1 <- rep(2, length(x1))
y2 <- rep(2, length(x2))
df <- data.frame(x1, x2, y1, y2)

pdf("foo.pdf", width=12, height=2)
# Take the x limits over both columns: some rows have x2 < x1, so min(x1) alone would clip them
plot(NA, xlim=range(df$x1, df$x2), ylim=c(2,2), xlab="Position", ylab="y")
segments(df$x1, df$y1, df$x2, df$y2, lwd=2)
dev.off()

As mentioned, given the wide domain and narrow range, you're basically going to get dots:

And that's just using the domain around your specific dataset, not 0 to 3 Mb. You may need to consider alternative presentations of this data, or plotting subsets of it around smaller domains.
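For the user-defined axis asked about in the question (0 to 3 Mb in steps of 0.5 Mb), a minimal base R sketch could look like the following. The output file name, the Mb labelling and the variable names xlim and ticks are my own choices, not something from the question.

```r
x1 <- c(982853, 983330, 2274704, 2274717, 976011, 976011, 983330, 1914374, 1914375)
x2 <- c(978000, 982868, 2275209, 2275209, 975597, 975600, 982919, 1914874, 1914874)

xlim  <- c(0, 3e6)                            # 0 to 3 Mb
ticks <- seq(xlim[1], xlim[2], by = 0.5e6)    # one tick every 0.5 Mb

pdf("ranges_0_3Mb.pdf", width = 12, height = 2)   # hypothetical file name
# xaxt = "n" suppresses the default x-axis so axis() can draw the custom one
plot(NA, xlim = xlim, ylim = c(1, 3), xaxt = "n", xlab = "Position (Mb)", ylab = "y")
axis(1, at = ticks, labels = ticks / 1e6)
segments(x1, 2, x2, 2, lwd = 2)                   # every range at the constant y = 2
dev.off()
```

The same idea works for ticks every 10 kb starting at 950000: change xlim and the by argument of seq(). On a 3 Mb axis the ranges are still only a few hundred bases to a few kb wide, so they will remain dot-sized.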

Also, given that some of your input regions overlap, you will want to jitter the y values of those overlapping regions. One simple way to do this is set up a fifth column with some random values, say drawn from a normal distribution of mean zero and s.d. of one. For each row, perturb the row's y1 and y2 values by that random number. Adjust the ylim values accordingly. This will usually separate overlapping regions:

#!/usr/bin/env Rscript

x1 <- c(982853, 983330, 2274704, 2274717, 976011, 976011, 983330, 1914374, 1914375)
x2 <- c(978000, 982868, 2275209, 2275209, 975597, 975600, 982919, 1914874, 1914874)
y1 <- rep(2, length(x1))
y2 <- rep(2, length(x2))
rnd <- rnorm(length(x1), 0, 1)
df <- data.frame(x1, x2, y1, y2, rnd)

pdf("foo.pdf", width=12, height=3)
# Take the x limits over both columns: some rows have x2 < x1, so min(x1) alone would clip them
plot(NA, xlim=range(df$x1, df$x2), ylim=c(0,4), xlab="Position", ylab="y")
segments(df$x1, df$y1 + df$rnd, df$x2, df$y2 + df$rnd, lwd=2)
dev.off()

Now you can see the elements separate:
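Random jitter can still leave two overlapping ranges on top of each other by chance. As an alternative (a different technique from the jitter above, not part of the original suggestion), here is a sketch that stacks ranges deterministically: sort by left edge and put each range on the first row where it does not overlap anything already placed. The names lo, hi, row_end and row_of and the output file name are made up for this example.

```r
x1 <- c(982853, 983330, 2274704, 2274717, 976011, 976011, 983330, 1914374, 1914375)
x2 <- c(978000, 982868, 2275209, 2275209, 975597, 975600, 982919, 1914874, 1914874)

lo <- pmin(x1, x2)   # left edge, regardless of orientation
hi <- pmax(x1, x2)   # right edge

row_end <- numeric(0)             # right-most position used so far on each row
row_of  <- integer(length(lo))    # row assigned to each range

for (i in order(lo, hi)) {
  free <- which(row_end < lo[i])  # rows whose last range ends before this one starts
  if (length(free) == 0) {
    row_end <- c(row_end, hi[i])  # no free row: open a new one
    row_of[i] <- length(row_end)
  } else {
    row_of[i] <- free[1]          # reuse the lowest free row
    row_end[free[1]] <- hi[i]
  }
}

pdf("foo_stacked.pdf", width = 12, height = 3)   # hypothetical file name
plot(NA, xlim = range(lo, hi), ylim = c(0, max(row_of) + 1), xlab = "Position", ylab = "row")
segments(lo, row_of, hi, row_of, lwd = 2)
dev.off()
```

With the nine ranges from the question this needs only two rows. The y value is then a row index rather than the constant 2, so this only fits if the exact y value carries no meaning.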

9.8 years ago
jeremy S ▴ 30

You should look into the ggbio package; it has a very easy-to-follow tutorial. With base R alone, those lines are not going to be visible at that scale: they will be tiny dots.
