CNV Visualization tool
5.7 years ago

Hi everybody, I have some CNV data that need to be turned into a chromosomal plot. The format is like this:

chr   start      end        ploidy  loss/gain
chr1  86000000   117150000  1       loss
chr2  70250000   70500000   3       gain
chr2  203050000  204650000  3       gain


(The last column can probably be omitted, as the ploidy column already carries that information.) I would like to visualize these regions on a chromosomal plot. The only tool I've managed to make work with my data is CNAnorm (http://www.bioconductor.org/packages/devel/bioc/vignettes/CNAnorm/inst/doc/CNAnorm.pdf), but it insists on re-analyzing my already analyzed data, so I can't get the results I need. Here's my CNAnorm output, although the data points are all wrong because of the extra analysis.

Any help would be appreciated. Please keep in mind that I don't have much R experience, and that I only need a viewer for the time being. Thanks everyone :)

CNV R visualisation • 4.0k views

Does the data have neutral regions too, with ploidy 2?


Hi! No, the data I was given is filtered, so it only contains the abnormal regions. We are thinking of making a matrix of all CNVs found across all samples, in which case we'll have neutral regions with ploidy = 2, or perhaps change the coding to neutral = 0, loss = -1 or -2, and positive values for gains. Doing this transformation shouldn't be a problem if needed!
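
The recoding described here (neutral = 0, losses negative, gains positive) amounts to subtracting 2 from the ploidy column. A minimal base-R sketch, assuming a diploid baseline; the data frame name cnv and the new columns change and status are hypothetical names chosen for illustration:

```r
# Example regions from the question; `cnv` is a hypothetical name
cnv <- data.frame(
  chr    = c("chr1", "chr2", "chr2"),
  start  = c(86000000, 70250000, 203050000),
  end    = c(117150000, 70500000, 204650000),
  ploidy = c(1, 3, 3)
)

# Copy-number change relative to a diploid baseline (assumed to be 2)
cnv$change <- cnv$ploidy - 2

# Recover the loss/gain label from the sign of the change
cnv$status <- ifelse(cnv$change < 0, "loss",
              ifelse(cnv$change > 0, "gain", "neutral"))
```

This also shows why the loss/gain column is redundant: it can be rebuilt from ploidy at any time.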

5.7 years ago

You can plot your data like this. CNVs are plotted as segments with a point corresponding to their midpoint. Chromosomes are separated out into facets.

library("ggplot2")

chr <- c("chr1", "chr2", "chr2")
start <- c(86000000, 70250000, 203050000)
end <- c(117150000, 70500000, 204650000)
center <- start + (end - start)/2
ploidy <- c(1, 3, 3)

ploidy_df <- data.frame(chr, start, end, center, ploidy)

ggplot(ploidy_df, aes(x=center, y=ploidy)) +
  geom_point() +
  geom_segment(aes(x=start, y=ploidy, xend=end, yend=ploidy, colour="segment")) +
  geom_hline(y=2, linetype=2) +  # newer ggplot2 versions need yintercept=2 here (see the replies)
  facet_wrap(~chr) +
  xlab("Position") +
  theme_bw()



Also check the ggbio library, it has functions for plotting genomes and tracks.


Hi, thank you for your reply! So, if I understood correctly, I add a "center" column to my data and delete the last column. Your code gives me an error, though:

Error: Unknown parameters: y


Yes, add a "center" column. You don't need to delete the last column, unless you just want to keep things neat.

I'm not sure why you get that error. I just re-ran the code I pasted above and it works fine. If you are running the plotting code as I wrote it but with different data, make sure the column names of your data frame match mine (chr, start, end, center, ploidy).
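
For anyone starting from a text file rather than typed-in vectors, here is a rough base-R sketch of getting the table into that shape. The data frame name mydata is arbitrary, and the last header is renamed to status here (an assumption for illustration) because R would otherwise rewrite "loss/gain" into a syntactically valid name:

```r
# The example table as text; in practice pass a file path to read.table instead
cnv_text <- "chr start end ploidy status
chr1 86000000 117150000 1 loss
chr2 70250000 70500000 3 gain
chr2 203050000 204650000 3 gain"
mydata <- read.table(text = cnv_text, header = TRUE, stringsAsFactors = FALSE)

# Midpoint of each region, used as the x position of the plotted point
mydata$center <- mydata$start + (mydata$end - mydata$start) / 2

# Stop early if any column the plotting code expects is missing
stopifnot(all(c("chr", "start", "end", "center", "ploidy") %in% names(mydata)))
```

Extra columns such as status are simply ignored by the plotting code, so there is no need to drop them.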


Yes, my column names match the ones in your code. The weird thing is that I get the error even when pasting just your own code (R for Windows 3.2.3, 32 and 64 bit, up-to-date packages). When I build up the ggplot command piece by piece, the error appears as soon as I add geom_hline, if that helps. Thanks again!


Hmmm... so can you produce a plot if you remove geom_hline()? That layer just draws the dashed line at ploidy = 2, so it's not really necessary.

I just double-checked the documentation, and you should actually try this instead: geom_hline(yintercept=2, linetype=2). If that doesn't work, then I'm at a loss. Hope that helps!
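
With that change applied, the complete snippet would look roughly like this. It is a sketch of the same plot as in the answer, with the vectors folded into one data.frame() call and yintercept used in geom_hline(); nothing else is meant to differ:

```r
library(ggplot2)

# Same three example regions as in the answer
ploidy_df <- data.frame(
  chr    = c("chr1", "chr2", "chr2"),
  start  = c(86000000, 70250000, 203050000),
  end    = c(117150000, 70500000, 204650000),
  ploidy = c(1, 3, 3)
)
# Midpoint of each region, where the point is drawn
ploidy_df$center <- ploidy_df$start + (ploidy_df$end - ploidy_df$start) / 2

p <- ggplot(ploidy_df, aes(x = center, y = ploidy)) +
  geom_point() +
  geom_segment(aes(x = start, xend = end, y = ploidy, yend = ploidy,
                   colour = "segment")) +
  geom_hline(yintercept = 2, linetype = 2) +  # dashed reference line at ploidy 2
  facet_wrap(~chr) +
  xlab("Position") +
  theme_bw()
print(p)
```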


It worked! Thank you so very much! :)

Edit: Just one last thing, because of the different chromosomal lengths, the smaller the chromosome, the more everything is stuck to the left corner. Is there a way to independently plot the x axis for each chromosome?


Great!! If you check out the options for facet_wrap you can adjust the scales and also the number of rows/columns. In this case you would specify facet_wrap(~ chr, scales = "free") for independent plotting on both x and y.


Thank you, everything looks great now! The only thing that remains is to set the x axis range to the length of each chromosome. I thought about creating an extra column with the chromosome length, and using it for the x axis, but scale_x_continuous doesn't seem to accept variables. Here's the code I've used so far in case someone wants it:

library("ggplot2")
ggplot(mydata, aes(x=0, y=ploidy)) +
  geom_segment(aes(x=start, y=ploidy, xend=end, yend=ploidy, colour="CNV length")) +
  geom_hline(yintercept=2, linetype=2) +
  facet_wrap(~ chr, scales = "free") +
  xlab("Position") +
  ylab("Ploidy") +
  scale_x_continuous(limits = c(0, 200000000)) +
  theme_bw()


And here's the output:
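
One possible way to get a per-chromosome x range, since scale_x_continuous() takes a single pair of limits for the whole plot: add an invisible geom_blank() layer holding position 0 and the chromosome length for each chromosome, and let facet_wrap() free the x scale. Each panel's axis then stretches to cover those anchor points. This is only a sketch. The chr_len and anchors data frames are hypothetical helpers, and the lengths are assumed hg19/GRCh37 values, so substitute the lengths for whatever genome build your calls were made on.

```r
library(ggplot2)

# Example regions from the question
mydata <- data.frame(
  chr    = c("chr1", "chr2", "chr2"),
  start  = c(86000000, 70250000, 203050000),
  end    = c(117150000, 70500000, 204650000),
  ploidy = c(1, 3, 3)
)

# ASSUMPTION: hg19/GRCh37 chromosome lengths; replace with your build's values
chr_len <- data.frame(
  chr    = c("chr1", "chr2"),
  length = c(249250621, 243199373)
)

# Two invisible anchor points per chromosome: position 0 and the chromosome end
anchors <- data.frame(
  chr    = rep(chr_len$chr, each = 2),
  pos    = as.vector(rbind(0, chr_len$length)),
  ploidy = 2
)

p <- ggplot(mydata) +
  geom_blank(data = anchors, aes(x = pos, y = ploidy)) +  # draws nothing, only widens the axes
  geom_segment(aes(x = start, xend = end, y = ploidy, yend = ploidy,
                   colour = "CNV length")) +
  geom_hline(yintercept = 2, linetype = 2) +
  facet_wrap(~ chr, scales = "free_x") +  # x is per chromosome, y stays shared
  xlab("Position") +
  ylab("Ploidy") +
  theme_bw()
print(p)
```

Using scales = "free_x" rather than "free" keeps the ploidy axis identical across panels, which makes the chromosomes easier to compare. ggplot2 still pads each axis slightly by default; adding scale_x_continuous(expand = c(0, 0)) should remove that padding if the axis needs to end exactly at the chromosome length.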