How to plot count data by percentile of transcript length for a set of transcripts
9.5 years ago
Saima ▴ 10

I have a list of miRNA slicing sites (each 1 bp long) for a number of transcripts, and I would like to plot the total count of sites in percentile bins (25th, 50th, 75th, 100th) of transcript length, summed over all transcripts. Can anyone please suggest how to do this in R?

My input data looks like this:

#transcript    site    length
AT1G11440.1    222    1303
AT1G06580.1    1096    2052
AT1G15020.1    538    1773
R • 2.4k views
9.5 years ago
Brice Sarver ★ 3.8k

Your question is somewhat ambiguous, but let me see if I can help.

Based on your data structure, I would calculate an additional vector, percentage (the position of each site as a fraction of its transcript's length), and append it to the data frame (which I'm calling df) like this:

pos <- df$site/df$length

newDF <- cbind(df, percentage=pos)

All the data are now in one place for reference. That step is optional, however; you could just work with pos directly. Either way, the next step is to use comparison operators in conjunction with length() to get the counts you want. For example, to count the slicing sites that fall within the first 25% of their transcript's length, do:

twentyfive <- length(which(newDF$percentage <= 0.25))
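To get all four bins at once rather than one comparison per bin, something like the following might work. This is only a sketch: it rebuilds the three example rows from the question, assumes the columns are named transcript, site and length, and the names bin and counts are just placeholders. It uses base R's cut() to assign each site to a quartile bin and table() to count sites per bin.

```r
# Example rows copied from the question; the column names are assumed.
df <- data.frame(
  transcript = c("AT1G11440.1", "AT1G06580.1", "AT1G15020.1"),
  site       = c(222, 1096, 538),
  length     = c(1303, 2052, 1773)
)

# Relative position of each site along its transcript (between 0 and 1).
df$percentage <- df$site / df$length

# Assign each site to a quartile bin. Intervals are right-closed, so a site
# exactly at 0.25 lands in the first bin, matching the <= 0.25 test above.
df$bin <- cut(df$percentage,
              breaks = c(0, 0.25, 0.5, 0.75, 1),
              labels = c("0-25%", "25-50%", "50-75%", "75-100%"),
              include.lowest = TRUE)

# Total number of sites per bin, pooled over all transcripts.
counts <- table(df$bin)
print(counts)

# Bar plot of the pooled counts.
barplot(counts,
        xlab = "Position along transcript",
        ylab = "Number of sites")
```

Because cut() returns a factor with all four levels, bins with no sites still show up in the table (and in the plot) with a count of zero.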

HTH.

thanks-- that helped :)
