Question: Plotting Exon Statistics In R
7.7 years ago, thecuriousbiologist (United States) wrote:


I have a table in the following format, giving counts of mapped reads for the exons of several genes across several samples.

Exon   Gene   sampleA   sampleB   sampleC
E1     A      43        52        12
E2     A      0         24        34
E3     A      19        48        32
E4     A      76        0         23
E5     A      5         87        12
E1     B      12        109       98
E2     B      32        76        11
E1     C      12        0         5
E2     C      4         8         76
E3     C      0         0         32

That is, the table holds a count for every exon of every gene. I wish to generate one plot per sample (so 3 plots) showing the counts of all exons in all 3 genes. Within each sample plot, the X-axis would be the exon number and the Y-axis would be the count, so each plot would have 3 "data series" lines (one per gene).

I am new to R and I have no clue how to go about it.

I am wondering whether I have to "factor" the gene column in some way to get the exons specific to each gene?

Any suggestions would be much appreciated.


Don't forget to normalize your read counts by sequencing depth per sample.

Also, if you want to run statistics on differential exon usage (which seems to be where you are going with this), you should look at the DEXSeq package. An added bonus is that it includes functionality to plot expression over exons.
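As a rough illustration of the depth normalization mentioned above, the sketch below rescales each sample column to counts per million. It rebuilds the table from the question as a data frame called counts. Note the assumption: the library size is taken to be the column sum of this small table, which is only a stand-in; in a real analysis you would use the total number of mapped reads for each sample.

```r
# Example table from the question.
counts <- data.frame(
  Exon    = c("E1", "E2", "E3", "E4", "E5", "E1", "E2", "E1", "E2", "E3"),
  Gene    = c("A", "A", "A", "A", "A", "B", "B", "C", "C", "C"),
  sampleA = c(43, 0, 19, 76, 5, 12, 32, 12, 4, 0),
  sampleB = c(52, 24, 48, 0, 87, 109, 76, 0, 8, 0),
  sampleC = c(12, 34, 32, 23, 12, 98, 11, 5, 76, 32)
)
sample_cols <- c("sampleA", "sampleB", "sampleC")

# ASSUMPTION: library size = sum of the counts shown here.
# Replace with the real number of mapped reads per sample.
lib_size <- colSums(counts[, sample_cols])

# Divide every column by its library size and scale to counts per million.
scaled <- sweep(as.matrix(counts[, sample_cols]), 2, lib_size, "/") * 1e6

# Keep the Exon and Gene columns, swap in the normalized values.
cpm <- counts
cpm[, sample_cols] <- scaled
```

After this, every sample column of cpm is on the same scale, so counts can be compared between samples.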

Comment written 7.7 years ago by Steve Lianoglou
7.7 years ago, Irsan wrote:
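# Install ggplot2 and reshape
install.packages(c("ggplot2", "reshape"))

# load the packages:
library(ggplot2)
library(reshape)

# melt the dataframe so that ggplot can handle it. I assume you have the data in an object called counts
# (assumption: its columns are Exon, Gene, sampleA, sampleB, sampleC, as in the question)
melt_count <- melt(counts, id.vars = c("Exon", "Gene"))
names(melt_count) <- c("exon", "gene", "sample", "count")

# and plot counts for each gene for each exon
ggplot(melt_count)+geom_point(aes(x=exon,y=count,color=sample))+facet_grid(exon ~ gene,scales="free_x")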

See resulting image here
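For the layout described in the question (one panel per sample, exon on the X-axis, one line per gene), a variant along the following lines might work. This is only a sketch: it rebuilds the example table by hand, reshapes it with base R instead of melt, and the names long and p are made up for illustration.

```r
library(ggplot2)

# Example table from the question.
counts <- data.frame(
  Exon    = c("E1", "E2", "E3", "E4", "E5", "E1", "E2", "E1", "E2", "E3"),
  Gene    = c("A", "A", "A", "A", "A", "B", "B", "C", "C", "C"),
  sampleA = c(43, 0, 19, 76, 5, 12, 32, 12, 4, 0),
  sampleB = c(52, 24, 48, 0, 87, 109, 76, 0, 8, 0),
  sampleC = c(12, 34, 32, 23, 12, 98, 11, 5, 76, 32)
)
sample_cols <- c("sampleA", "sampleB", "sampleC")

# Long format: one row per exon/gene/sample combination.
# unlist() walks the sample columns one after another, which matches
# rep(sample_cols, each = nrow(counts)).
long <- data.frame(
  exon   = rep(counts$Exon, times = length(sample_cols)),
  gene   = rep(counts$Gene, times = length(sample_cols)),
  sample = rep(sample_cols, each = nrow(counts)),
  count  = unlist(counts[, sample_cols], use.names = FALSE)
)

# One panel per sample; group = gene is needed so that lines are drawn
# per gene even though the x-axis is discrete.
p <- ggplot(long, aes(x = exon, y = count, colour = gene, group = gene)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ sample)

print(p)
```

facet_wrap puts the three sample panels in a single figure; to get three separate plots you could instead subset long by sample and build one plot per subset.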

7.7 years ago, Dan wrote:

If you're new to R, you should definitely read Chapter 1 (Introduction) of 'S Poetry':

I can't recommend it enough!

For manipulating data frames, you should look at tapply and friends. I don't quite understand what you want to do, but I'm sure you can do it with tapply ;-)
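To give a flavour of tapply, here is a small sketch that sums the read counts per gene for the table in the question. The grouping column is converted to a factor internally, so there is no need to call factor() on the gene column first. The object names (counts, per_gene, per_gene_all) are just for illustration.

```r
# Example table from the question.
counts <- data.frame(
  Exon    = c("E1", "E2", "E3", "E4", "E5", "E1", "E2", "E1", "E2", "E3"),
  Gene    = c("A", "A", "A", "A", "A", "B", "B", "C", "C", "C"),
  sampleA = c(43, 0, 19, 76, 5, 12, 32, 12, 4, 0),
  sampleB = c(52, 24, 48, 0, 87, 109, 76, 0, 8, 0),
  sampleC = c(12, 34, 32, 23, 12, 98, 11, 5, 76, 32)
)

# Total count per gene for one sample: split sampleA by Gene, apply sum.
per_gene <- tapply(counts$sampleA, counts$Gene, sum)

# The same for every sample column at once; the result is a matrix with
# one row per gene and one column per sample.
per_gene_all <- sapply(
  counts[, c("sampleA", "sampleB", "sampleC")],
  function(x) tapply(x, counts$Gene, sum)
)

print(per_gene)
print(per_gene_all)
```

Swapping sum for another function (mean, max, length) gives other per-gene summaries with the same call.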


If you're new to R, you should in general read as much documentation, including online tutorials, as you can. You can't expect to have a "clue how to go about it" with no background knowledge whatsoever.

Comment written 7.7 years ago by Neilfws