Question: Plotting Fold change (FC) inside genomic interval
viniciushs88 (Germany) wrote 3.5 years ago:

I would like to select rows from a data frame based on the value in an initial ("seed") row. The selected rows, together with the seed row, should form a new data frame whose name is the value of $Name in the seed row.

The logic:

1 - The seed row must have $FC >= 0.7.

2 - The rows selected for the new data frame must have the same $chr as the seed row.

3 - The selected rows must have a $Position inside a 5000-wide window centered on the $Position of the seed row.

In this example, the row with $Name = BD22 cannot be included in the BD13 data frame because its $Position (7000) is outside the window: for the seed at $Position = 3000, the 5000-wide window runs from $Position = 500 to $Position = 5500.
 
A simplified example follows:
 
My input data frame:

     Name   FC   chr   Position
     BD10   0.1  chr1    1000
     BD11   0.1  chr2    1000
     BD12   0.2  chr3    2000
     BD13   0.7  chr3    3000
     BD14   0.4  chr3    4000
     BD22   0.1  chr3    7000
     BD23   0.2  chr4    1000

As output I expect a data frame named after the seed row, in this example BD13:

    Name   FC   chr   Position
    BD12   0.2  chr3   2000
    BD13   0.7  chr3   3000
    BD14   0.4  chr3   4000

After, I would like to plot each composed dataframe like this:

    pdf("BD13.pdf")  # the file name must be a quoted string
    plot(BD13$Name, BD13$FC, main="BD13",
       xlab="Name", ylab="FC")
    dev.off()

Thank you!

 

Tags: interval plot R

Can you rewrite what you mean in "3a"? It doesn't parse in English.

BTW, what have you tried so far? This is mostly just a matter of using which() and then subsetting the dataframe.
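As a rough illustration of the which()-then-subset idea, here is a minimal sketch for the single seed in the question's toy data. The names `half_window`, `seed_idx`, `seed`, and `keep` are made up for this example, and the window is assumed to extend 2500 on either side of the seed position, as the question's 500 to 5500 range suggests.

```r
# Toy input copied from the question
input <- data.frame(
  Name     = c("BD10", "BD11", "BD12", "BD13", "BD14", "BD22", "BD23"),
  FC       = c(0.1, 0.1, 0.2, 0.7, 0.4, 0.1, 0.2),
  chr      = c("chr1", "chr2", "chr3", "chr3", "chr3", "chr3", "chr4"),
  Position = c(1000, 1000, 2000, 3000, 4000, 7000, 1000),
  stringsAsFactors = FALSE
)

half_window <- 2500  # assumption: 5000-wide window centered on the seed

# Index of the seed row(s): FC at or above the threshold
seed_idx <- which(input$FC >= 0.7)
seed <- input[seed_idx[1], ]  # the toy data contains exactly one seed

# Rows on the same chromosome and inside the window around the seed
keep <- which(input$chr == seed$chr &
              abs(input$Position - seed$Position) <= half_window)
BD13 <- input[keep, ]
print(BD13)
```

With real data there will usually be several seed rows, so the same test would be repeated once per seed rather than hard-coding `seed_idx[1]`.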

Reply written 3.5 years ago by Devon Ryan

I have tried:
out <- subset(input, FC >= 0.7)
out$startw <- (out$Position - 2500)
out$endw <- (out$Position + 2500)


library(plyr)
lvl <- dlply(out, .(Name))

for (i in 1:length(lvl)) {
  Neigh1 <- subset(input, input$Position >= lvl[i]$startw & lvl[i]$chr == input$chr)
  Neigh2 <- subset(input, input$Position <= lvl[i]$endw & lvl[i]$chr == input$chr)
}

pdf(sprintf("%s.pdf", [i]))
boxplot(Neigh$Name, Neigh$FC, xlab=[i], ylab="FC", main="[i]")
dev.off()}

But `Neigh1` and `Neigh2` are empty...

Reply modified 3.5 years ago • written 3.5 years ago by viniciushs88

Just change the structure such that it iterates through out:

for(i in c(1:nrow(out)) {

That way you needn't have a dataframe that changes names.
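One possible reading of this suggestion, sketched under assumptions: loop over the rows of `out` and collect each seed's neighbours in a named list, so that no data frame has to be created under a changing variable name. The list name `neighbours` and the `half_window` value of 2500 are assumptions for illustration, not part of the original thread.

```r
# Toy input copied from the question
input <- data.frame(
  Name     = c("BD10", "BD11", "BD12", "BD13", "BD14", "BD22", "BD23"),
  FC       = c(0.1, 0.1, 0.2, 0.7, 0.4, 0.1, 0.2),
  chr      = c("chr1", "chr2", "chr3", "chr3", "chr3", "chr3", "chr4"),
  Position = c(1000, 1000, 2000, 3000, 4000, 7000, 1000),
  stringsAsFactors = FALSE
)

half_window <- 2500  # assumption: 5000-wide window centered on each seed

# Seed rows, as in the asker's attempt
out <- subset(input, FC >= 0.7)

# One list element per seed, keyed by the seed's Name
neighbours <- list()
for (i in seq_len(nrow(out))) {
  seed <- out[i, ]
  neighbours[[seed$Name]] <- subset(
    input,
    chr == seed$chr &
      Position >= seed$Position - half_window &
      Position <= seed$Position + half_window
  )
}

print(neighbours[["BD13"]])
```

`seq_len(nrow(out))` is used instead of `1:nrow(out)` so that the loop body is skipped when no row passes the FC threshold.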

Reply written 3.5 years ago by Devon Ryan

## Error: unexpected '{' in "for(i in c(1:nrow(out))

Reply written 3.5 years ago by viniciushs88

I didn't give you the entire solution, just the idea of a different way to structure things to get what you want. Literally just throwing that line in there wouldn't work; I'm expecting you to understand and apply the concept.

Reply written 3.5 years ago by Devon Ryan
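For the plotting step asked about in the question, a hedged sketch of writing one PDF per seed might look like the following. `barplot()` is swapped in for the `plot()` and `boxplot()` calls in the thread because each Name has a single FC value; that choice, the `half_window` value, and the use of `tempdir()` as the output directory are all assumptions made for this example.

```r
# Toy input copied from the question
input <- data.frame(
  Name     = c("BD10", "BD11", "BD12", "BD13", "BD14", "BD22", "BD23"),
  FC       = c(0.1, 0.1, 0.2, 0.7, 0.4, 0.1, 0.2),
  chr      = c("chr1", "chr2", "chr3", "chr3", "chr3", "chr3", "chr4"),
  Position = c(1000, 1000, 2000, 3000, 4000, 7000, 1000),
  stringsAsFactors = FALSE
)

half_window <- 2500  # assumption: 5000-wide window centered on each seed

# Build one data frame per seed row, keyed by the seed's Name
seeds <- subset(input, FC >= 0.7)
neighbours <- lapply(seq_len(nrow(seeds)), function(i) {
  s <- seeds[i, ]
  input[input$chr == s$chr &
          abs(input$Position - s$Position) <= half_window, ]
})
names(neighbours) <- seeds$Name

out_dir <- tempdir()  # assumption: write the PDFs to a temporary directory
pdf_files <- character(0)
for (nm in names(neighbours)) {
  df <- neighbours[[nm]]
  pdf_file <- file.path(out_dir, sprintf("%s.pdf", nm))
  pdf(pdf_file)
  # One bar per row, labelled with the row's Name
  barplot(df$FC, names.arg = df$Name, main = nm, xlab = "Name", ylab = "FC")
  dev.off()
  pdf_files <- c(pdf_files, pdf_file)
}
```

Replacing `out_dir` with a project directory would keep the files after the R session ends.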