Question: In R, is there a way to convert the density function into a data frame?
ac20 (United States) wrote 3.8 years ago:

Is there a way to convert the output of the density() function in R to a data frame?

Let's take the easiest example possible. Say my variable contains the numbers 2, 4, 4, 6.

I want to be able to convert these numbers to a density data.frame:

Value  Density
2      0.25
4      0.50
6      0.25


I know you're probably thinking, why? But I have a good reason, I promise. Thanks for the help!

Neilfws (Sydney, Australia) wrote 3.8 years ago:

Not clear how this question relates to a bioinformatics problem. However, you may be misunderstanding what density() does.

If you try:

d1 <- density(c(2, 4, 4, 6))

you will generate a list, where d1$x contains x-values and d1$y contains y-values (density). By default there are n = 512 data points. You could assign these to a data frame. You won't see discrete values of density for x = 2, 4 and 6 because that isn't how density() works - it uses a smoothing kernel to estimate n y-values across the range of x-values provided.
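
As a minimal sketch of that last step, the x and y components can be copied straight into a data frame. The column names `value` and `density` are illustrative choices here, not something density() produces itself.

```r
# Kernel density estimate for the example values
d1 <- density(c(2, 4, 4, 6))

# d1$x and d1$y are numeric vectors of equal length (512 by default)
dens_df <- data.frame(value = d1$x, density = d1$y)

head(dens_df)
nrow(dens_df)  # 512 unless n is changed in the density() call
```

Because these are points on a smoothed curve, the density column integrates to roughly 1 over the x range rather than summing to 1.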

hist() does something a little more like what you describe.

h1 <- hist(c(2, 4, 4, 6))
h1$density
[1] 0.25 0.50 0.00 0.25
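
The histogram result can be put into a data frame in the same way. The sketch below assumes the default breaks and uses `plot = FALSE` so that no figure is drawn; the column names are illustrative. The second part is an alternative that may be closer to the table in the question: the relative frequency of each distinct value, which is not a kernel density estimate.

```r
x <- c(2, 4, 4, 6)

# Compute the histogram without drawing it
h1 <- hist(x, plot = FALSE)

# One row per bin: bin midpoint, raw count, and density
hist_df <- data.frame(mid = h1$mids, count = h1$counts, density = h1$density)
hist_df

# Alternative: relative frequency of each distinct value
freq_df <- as.data.frame(prop.table(table(Value = x)), responseName = "Density")
freq_df
```

Note that `Value` in `freq_df` is a factor; `as.numeric(as.character(freq_df$Value))` converts it back to numbers if needed.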

Also if you want helpful answers, it's best to describe fully what you are doing, rather than "trust me I have my reasons".


I'm working on a ChIP-seq project where I'm looking at the MACS2-called peak lengths of differentially expressed, H3K4me3-marked genes found in the presence of an RNAi knockdown of two methyltransferase enzymes. I am plotting a histogram of peak length for the genes affected by each RNAi knockdown to see if the methyltransferase enzymes are responding to a pattern in the chromatin mark (gene length). I want to overlay a density plot of the average peak length of all genes to show how the RNAi-affected genes deviate from the overall mean, but in ggplot this is not easy. So my data-wrangling approach is to plot a line over the histograms using the respective density parameters (for which I need the density coordinates).
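
One way this overlay could be sketched, under several assumptions: the peak lengths below are simulated placeholders (not real MACS2 output), the names `all_peaks`, `kd_peaks` and `dens_df` are hypothetical, and the plot is built only if ggplot2 (version 3.3.0 or later, for `after_stat()`) happens to be installed.

```r
set.seed(1)

# Placeholder data only: simulated peak lengths standing in for real MACS2 output
all_peaks <- data.frame(length = rlnorm(2000, meanlog = 7, sdlog = 0.5))
kd_peaks  <- data.frame(length = rlnorm(200, meanlog = 7.3, sdlog = 0.4))

# Density coordinates for the full gene set, stored as a data frame
d_all <- density(all_peaks$length)
dens_df <- data.frame(length = d_all$x, density = d_all$y)

# Build the plot only if a recent enough ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE) &&
    utils::packageVersion("ggplot2") >= "3.3.0") {
  library(ggplot2)
  p <- ggplot(kd_peaks, aes(x = length)) +
    # y = density puts the histogram on the same scale as the line
    geom_histogram(aes(y = after_stat(density)), bins = 30) +
    # Line drawn from the precomputed density coordinates
    geom_line(data = dens_df, aes(y = density))
  print(p)
}
```

Plotting the histogram on the density scale rather than as raw counts is what lets the line and the bars share a y axis.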

I didn't post this at the top because it's complicated and I didn't think it was very relevant to answering my question. I realize my example wasn't exactly what I asked; I didn't know how to explain it on a forum, so I thought I'd give the "easiest example possible". However, thank you for perfectly answering the question that I meant to ask. I'm new to Biostars, so I was under the impression that it was a forum where bioinformaticians could ask any coding questions they had, but maybe my question was better suited to Stack Overflow. I'll be more descriptive in the future.

Lastly, in a professional forum such as this, it might be better to respond with constructive criticism rather than passive-aggressively.

written 3.8 years ago by ac20

Not being passive-aggressive mate, just trying to help people better articulate their questions.

You're correct: if it's a "general coding question", Stack Overflow is good; if it's posted here, we like to see the application to the bioinformatics problem.

written 3.8 years ago by Neilfws