Question: How to plot count data by percentile of transcript length for a set of transcripts
Asked 3.9 years ago by Saima (United States):

I have a list of miRNA slicing sites (each 1 bp long) for a number of transcripts, and I would like to plot the total count of sites in percentile bins (25th, 50th, 75th, 100th) of transcript length, summed over all transcripts. Can anyone please suggest how to do this in R?

My input data looks like this:

#transcript    site    length
AT1G11440.1    222    1303
AT1G06580.1    1096    2052
AT1G15020.1    538    1773

 

 

Answered 3.9 years ago by Brice Sarver (United States):

Your question is somewhat ambiguous, but let me see if I can help.

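Based on your data structure, I would calculate an additional vector, the relative position of each site along its transcript (site divided by length, so a fraction between 0 and 1), and append it to the data frame (which I'm calling df) as a column named percentage: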

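# relative position of each site along its transcript (a fraction from 0 to 1)
pos <- df$site / df$length

# append it to the data frame as a new column
newDF <- cbind(df, percentage = pos)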
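
In case df is not loaded yet, here is a minimal sketch of one way to build it from the input shown in the question. It assumes the data is whitespace-delimited and that the header line starts with '#'. Since read.table() treats '#' as a comment character by default, that line is skipped and the column names are supplied by hand. The inline text is just the three example rows; for a real file you would pass its name to read.table() instead.

```r
# The three example rows from the question, inlined so the sketch runs on its own.
input <- "#transcript    site    length
AT1G11440.1    222    1303
AT1G06580.1    1096    2052
AT1G15020.1    538    1773"

# The '#' header line is dropped as a comment, so name the columns explicitly.
df <- read.table(text = input,
                 col.names = c("transcript", "site", "length"),
                 stringsAsFactors = FALSE)

print(df)
```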

All the data is now in a single spot, for reference. You don't need to do that, however; you could just query pos directly. Either way, the next step is to use comparison operators in conjunction with length() to get the counts you want. For example, to determine the number of slicing sites (rows) that fall within the first 25% of their transcript's length, do:

twentyfive <- length(which(newDF$percentage <= 0.25))
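
To get the counts for all four bins at once and plot them, one option is cut() followed by table() and barplot(). This is only a sketch: the bin edges (quarters of the transcript length), the bin labels, the axis labels, and the small example data frame are assumptions made for illustration.

```r
# Example rows from the question; replace with the full data set.
df <- data.frame(
  transcript = c("AT1G11440.1", "AT1G06580.1", "AT1G15020.1"),
  site       = c(222, 1096, 538),
  length     = c(1303, 2052, 1773)
)

# Relative position of each site along its transcript (0 to 1).
df$percentage <- df$site / df$length

# Assign each site to a quarter of its transcript.
# include.lowest = TRUE keeps a value of exactly 0 in the first bin.
df$bin <- cut(df$percentage,
              breaks = c(0, 0.25, 0.5, 0.75, 1),
              labels = c("0-25%", "25-50%", "50-75%", "75-100%"),
              include.lowest = TRUE)

# Count sites per bin across all transcripts; empty bins are kept as zeros.
counts <- table(df$bin)
print(counts)

# Bar plot of the total site count in each bin.
barplot(counts,
        xlab = "Position along transcript (fraction of length)",
        ylab = "Number of slicing sites")
```

For the three example rows this puts one site in each of the first three bins and none in the last. Note that this counts sites, so a transcript with several sites contributes to the total once per site.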

HTH.


thanks-- that helped :)

Reply written 3.9 years ago by Saima