How can I use ggplot2 to indicate chromosomes on a continuous axis?
2.1 years ago
gy494270 ▴ 20

The plot below is what I want to produce: [image]

Basically, the plot contains the log(Pi1/Pi2) values for specific positions on each chromosome. (It looks like a Manhattan plot, but with negative values.)

I have the following data frame:

Chromosome  Position  Log_Pi
1           102331    -0.552
2           220231     2.32
5           233433    -2.200
.
.
.
7           522631    -1.023
7           322512     1.2231
10          233356    -0.223
12          5666932    0.2235

I have tried:

ggplot(data = Pi_PLOT, aes(x = Chromosome, y = log_pi)) +
  geom_point(data = Pi_PLOT, aes(x = Position, y = log_pi, fill = Chromosome)) +
  labs(x = "Chromosome", y = "log_pi")

I do not know a smart solution, but you can concatenate all the chromosomes into a single coordinate system, then compute where in the new coordinates to place the labels and change the colors. I did that when plotting whole-genome pairwise alignments in https://oist.github.io/GenomicBreaks/articles/GenomicBreaks.html#oxford-dot-plots.
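For illustration, the concatenation idea could be sketched roughly as follows. This is not the GenomicBreaks code; it is a minimal sketch on made-up toy data shaped like the data frame in the question. The names `cum_pos`, `offset` and `label_pos` are invented for the example, and each chromosome's length is approximated by its largest observed position, which is an assumption (real chromosome sizes would be better if available).

```r
library(ggplot2)

# Toy data shaped like the data frame in the question (values are made up).
set.seed(1)
Pi_PLOT <- data.frame(
  Chromosome = rep(c(1, 2, 5, 7, 10, 12), each = 50),
  Position   = sample.int(6e6, 300),
  Log_Pi     = rnorm(300)
)

# Sort rows by chromosome, then by position within each chromosome.
Pi_PLOT <- Pi_PLOT[order(Pi_PLOT$Chromosome, Pi_PLOT$Position), ]

# ASSUMPTION: chromosome length is approximated by the largest observed position.
chrom_len <- tapply(Pi_PLOT$Position, Pi_PLOT$Chromosome, max)

# Offset of each chromosome = summed lengths of all chromosomes before it.
offset <- c(0, cumsum(as.numeric(chrom_len))[-length(chrom_len)])
names(offset) <- names(chrom_len)

# Position on the concatenated axis.
Pi_PLOT$cum_pos <- Pi_PLOT$Position +
  unname(offset[as.character(Pi_PLOT$Chromosome)])

# One axis label per chromosome, placed at the middle of its span.
label_pos <- offset + as.numeric(chrom_len) / 2

p <- ggplot(Pi_PLOT, aes(x = cum_pos, y = Log_Pi,
                         colour = factor(Chromosome))) +
  geom_point(size = 0.8) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_x_continuous(breaks = unname(label_pos),
                     labels = names(label_pos)) +
  labs(x = "Chromosome", y = "log(Pi1/Pi2)") +
  theme_bw() +
  theme(legend.position = "none")
p
```

With this layout the x-axis stays continuous, but its tick labels show chromosome names instead of raw coordinates.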


Nice work on this package, Charles Plessy. Love the synteny visuals.

2.1 years ago
gy494270 ▴ 20

I figured it out by using the CMplot package in R.

Thanks

2.1 years ago
Mark ★ 1.5k

Gene names are missing from your data frame. Add an extra column holding the gene names for the positions of interest. After this, you can do something like:

library(ggplot2)

# x and y are set once in ggplot() so that geom_text() inherits them;
# colour (not fill) is what colours the default point shape
p <- ggplot(data = Pi_PLOT, aes(x = Position, y = log_pi)) +
      geom_point(aes(colour = factor(Chromosome))) +
      labs(x = "Chromosome", y = "log_pi")

p + geom_text(aes(label = gene_name))

where gene_name is the name of the column mentioned above.

2.1 years ago
zx8754 11k

I don't have ready-made code for this, but when there are that many values on the x-axis, the precise positions do not matter much.

So we can sort the data by chromosome and position, use the values 1 to nrow as the x-axis, and then find the middle of each chromosome's range. For example, if chr2 runs from 2000 to 3000, we put the label "chr2" at round(2000 + (3000 - 2000)/2), and likewise for each chromosome. Then plot with those labels.
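A rough sketch of this row-index approach, on made-up toy data shaped like the data frame in the question, might look like the following. The names `idx` and `mid` are invented for the example, and the label position uses the same midpoint formula as above.

```r
library(ggplot2)

# Toy data shaped like the data frame in the question (values are made up).
set.seed(2)
Pi_PLOT <- data.frame(
  Chromosome = rep(c(1, 2, 5, 7), times = c(40, 30, 20, 10)),
  Position   = sample.int(6e6, 100),
  Log_Pi     = rnorm(100)
)

# Sort by chromosome and position, then use the row index as the x value.
Pi_PLOT <- Pi_PLOT[order(Pi_PLOT$Chromosome, Pi_PLOT$Position), ]
Pi_PLOT$idx <- seq_len(nrow(Pi_PLOT))

# Middle of each chromosome's index range: round(start + (end - start) / 2).
mid <- tapply(Pi_PLOT$idx, Pi_PLOT$Chromosome,
              function(i) round(min(i) + (max(i) - min(i)) / 2))

p <- ggplot(Pi_PLOT, aes(x = idx, y = Log_Pi,
                         colour = factor(Chromosome))) +
  geom_point() +
  scale_x_continuous(breaks = as.vector(mid),
                     labels = paste0("chr", names(mid))) +
  labs(x = "Chromosome", y = "Log_Pi") +
  theme(legend.position = "none")
p
```

Because every point takes one unit on the x-axis, each chromosome's width reflects its number of data points rather than its physical length.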
