CNV Visualization tool
5.7 years ago

Hi everybody, I have some CNV data that need to be turned into a chromosomal plot. The format is like this:

chr   start      end        ploidy  loss/gain
chr1  86000000   117150000  1       loss
chr2  70250000   70500000   3       gain
chr2  203050000  204650000  3       gain

(The last column can probably be omitted, as all the information we need is in the ploidy column.) I would like to visualize them on a chromosomal plot. The only tool I've managed to make work with my data is CNAnorm (http://www.bioconductor.org/packages/devel/bioc/vignettes/CNAnorm/inst/doc/CNAnorm.pdf), but it insists on re-analyzing my already analyzed data, so I can't get the results I need. Here's my CNAnorm output, although the data points are all wrong because of the extra analysis.

[Image: CNAnorm output]

Any help would be appreciated, please keep in mind that I don't have that much R experience, and that I only need a viewer for the time being. Thanks everyone :)


Does the data have neutral regions too, with ploidy 2?


Hi! No, the data I was given is filtered, so it only contains the abnormal regions. We are thinking of making a matrix of all CNVs found across all samples, in which case we'll have neutral regions with ploidy = 2, or perhaps we'll change the encoding to neutral = 0, loss = -1 or -2, and positive values for gains. Doing this transformation shouldn't be a problem if needed!
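For what it's worth, the recoding mentioned above could be sketched roughly like this. It assumes a diploid baseline of 2 and a data frame with a ploidy column; the names cnv and copy_change are made up for illustration.

```r
# Example rows from the question
cnv <- data.frame(
  chr    = c("chr1", "chr2", "chr2"),
  start  = c(86000000, 70250000, 203050000),
  end    = c(117150000, 70500000, 204650000),
  ploidy = c(1, 3, 3)
)

# Assumption: neutral ploidy is 2, so subtracting 2 gives
# 0 for neutral, negative values for losses, positive values for gains
cnv$copy_change <- cnv$ploidy - 2
```

With this encoding, a ploidy of 0 would come out as -2 and a ploidy of 1 as -1, which matches the scheme described above.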

5.7 years ago

You can plot your data like this. CNVs are plotted as segments with a point corresponding to their midpoint. Chromosomes are separated out into facets.

library("ggplot2")

chr <- c("chr1", "chr2", "chr2")
start <- c(86000000, 70250000, 203050000)
end <- c(117150000, 70500000, 204650000)
center <- start + (end - start)/2
ploidy <- c(1, 3, 3)

ploidy_df <- data.frame(chr, start, end, center, ploidy)

ggplot(ploidy_df, aes(x=center, y=ploidy)) + 
  geom_point() +
  geom_segment(aes(x=start, y=ploidy, xend=end, yend=ploidy, colour="segment")) + 
  geom_hline(y=2, linetype=2) +  # newer ggplot2 versions need yintercept=2 here (see the replies below)
  facet_wrap(~chr) +
  xlab("Position") +
  theme_bw()

[Image: plot produced by the code above]


Also check the ggbio library, it has functions for plotting genomes and tracks.


Hi, thank you for your reply! So, if I understood correctly, I add a "center" column to my data and delete the last column. Your code gives me an error, though:

Error: Unknown parameters: y

Yes, add a "center" column. You don't need to delete the last column, unless you just want to keep things neat.

I'm not sure why you get that error, I just re-ran the code I pasted above and it works fine. If you are running the plotting code as I wrote it above with different data, then you have to make sure the column names of your data frame match what I have (chr, start, end, center, ploidy).
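In case it helps, here is a minimal sketch of adding the "center" column to a data frame that already has start and end columns. The name cnv is just a placeholder; in practice the data frame would come from read.table() on your own file.

```r
# Stand-in for data read from a file; the column names must match the plotting code
cnv <- data.frame(
  chr    = c("chr1", "chr2", "chr2"),
  start  = c(86000000, 70250000, 203050000),
  end    = c(117150000, 70500000, 204650000),
  ploidy = c(1, 3, 3)
)

# Midpoint of each CNV segment, computed the same way as in the answer
cnv$center <- cnv$start + (cnv$end - cnv$start) / 2
```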


Yes, my column names match the ones in your code. The weird thing is, I get the error even when pasting just your own code (R for Windows 3.2.3, 32 and 64 bit, up-to-date packages). When I build up the ggplot command piece by piece, the error appears as soon as I add geom_hline, if that helps. Thanks again!


Hmmm... so can you produce a plot if you remove geom_hline()? That layer just draws the dashed line at ploidy = 2, so it's not really necessary.

I just double-checked the documentation, and you should actually try this instead: geom_hline(yintercept=2, linetype=2). If that doesn't work, then I'm at a loss. Hope that helps!


It worked! Thank you so very much! :)

Edit: Just one last thing. Because the chromosomes have different lengths, the smaller the chromosome, the more everything is squeezed into the left corner. Is there a way to scale the x axis independently for each chromosome?

[Image: faceted plot with a shared x axis]


Great!! If you check out the options for facet_wrap you can adjust the scales and also the number of rows/columns. In this case you would specify facet_wrap(~ chr, scales = "free") for independent plotting on both x and y.
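If only the x axis should vary between chromosomes while the ploidy axis stays comparable across panels, scales = "free_x" does that. A minimal sketch using the example data from the answer (the object names are just for illustration):

```r
library(ggplot2)

ploidy_df <- data.frame(
  chr    = c("chr1", "chr2", "chr2"),
  start  = c(86000000, 70250000, 203050000),
  end    = c(117150000, 70500000, 204650000),
  ploidy = c(1, 3, 3)
)

# "free_x" frees only the x axis per panel; the y axis stays shared
p <- ggplot(ploidy_df) +
  geom_segment(aes(x = start, xend = end, y = ploidy, yend = ploidy)) +
  geom_hline(yintercept = 2, linetype = 2) +
  facet_wrap(~ chr, scales = "free_x") +
  xlab("Position") +
  theme_bw()
```

Printing p draws the plot. The nrow and ncol arguments of facet_wrap control the panel layout.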


Thank you, everything looks great now! The only thing that remains is to set the x axis range to the length of each chromosome. I thought about creating an extra column with the chromosome length, and using it for the x axis, but scale_x_continuous doesn't seem to accept variables. Here's the code I've used so far in case someone wants it:

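library("ggplot2")
# Whitespace-delimited table with a header row (chr, start, end, ploidy, ...)
mydata <- read.table("C:\\RData\\testBiostars.txt", header=TRUE)
ggplot(mydata, aes(x=0, y=ploidy)) +
  # One horizontal segment per CNV, drawn at its ploidy value
  geom_segment(aes(x=start, y=ploidy, xend=end, yend=ploidy, colour="CNV length")) +
  # Dashed reference line at the neutral ploidy of 2
  geom_hline(yintercept=2, linetype=2) +
  facet_wrap(~ chr, scales = "free") +
  xlab("Position") +
  ylab("Ploidy") +
  # Same fixed x range for every chromosome
  scale_x_continuous(limits = c(0, 200000000)) +
  theme_bw()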

And here's the output:

[Image: plot produced by the code above]
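One possible way to get a per-chromosome x range, sketched here as an untested-on-real-data idea rather than a definitive fix: keep the chromosome lengths in a separate table and add them as an invisible geom_blank() layer, so that each free-x panel stretches from 0 to its chromosome length. The lengths below are assumed to be the hg19 values for chr1 and chr2; replace them with the lengths for your own assembly. The names chr_len and panel_range are made up for illustration.

```r
library(ggplot2)

cnv <- data.frame(
  chr    = c("chr1", "chr2", "chr2"),
  start  = c(86000000, 70250000, 203050000),
  end    = c(117150000, 70500000, 204650000),
  ploidy = c(1, 3, 3)
)

# Assumption: hg19 chromosome lengths; substitute your assembly's values
chr_len <- data.frame(
  chr    = c("chr1", "chr2"),
  length = c(249250621, 243199373)
)

# Two invisible anchor points per chromosome: position 0 and the chromosome end
panel_range <- data.frame(
  chr    = rep(chr_len$chr, 2),
  pos    = c(rep(0, nrow(chr_len)), chr_len$length),
  ploidy = 2
)

p <- ggplot(cnv) +
  geom_segment(aes(x = start, xend = end, y = ploidy, yend = ploidy)) +
  # geom_blank draws nothing but still expands the axis range of its panel
  geom_blank(data = panel_range, aes(x = pos, y = ploidy)) +
  geom_hline(yintercept = 2, linetype = 2) +
  facet_wrap(~ chr, scales = "free_x") +
  xlab("Position") +
  ylab("Ploidy") +
  theme_bw()
```

The scale_x_continuous(limits = ...) call is dropped here because it applies one fixed range to all panels, which is what the per-chromosome anchors are meant to replace.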
