Plotting fold change (FC) inside a genomic interval
9.9 years ago
viniciushs88 ▴ 50

I would like to select specific rows of a dataframe based on the values in a seed row. The selected rows (plus the seed row itself) should form a new dataframe, and the name of that dataframe should be the $Name of the seed row.

The logic:

  1. The seed row must have $FC >= 0.7.
  2. The selected rows must have the same $chr as the seed row.
  3. The selected rows must have a $Position inside a 5000 window around the $Position of the seed row.

In this example, the row with $Name = BD22 cannot be included in the BD13 dataframe because its $Position (7000) is outside the window: a 5000 window around $Position = 3000 runs from $Position = 500 to $Position = 5500.

A simplified example follows.

My input dataframe:

 Name   FC   chr   Position
 BD10   0.1  chr1    1000
 BD11   0.1  chr2    1000
 BD12   0.2  chr3    2000
 BD13   0.7  chr3    3000
 BD14   0.4  chr3    4000
 BD22   0.1  chr3    7000
 BD23   0.2  chr4    1000

As output I expect a dataframe named after the seed row, in this example BD13:

Name   FC   chr   Position
BD12   0.2  chr3   2000
BD13   0.7  chr3   3000
BD14   0.4  chr3   4000

Afterwards, I would like to plot each resulting dataframe like this:

pdf("BD13.pdf")
plot(BD13$Name, BD13$FC, main="BD13",
   xlab="Name", ylab="FC")
dev.off()

Thank you!

interval r plot • 2.2k views

Can you rewrite what you mean in "3a"? It doesn't parse in English.

BTW, what have you tried so far? This is mostly just a matter of using which() and then subsetting the dataframe.
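
As a rough illustration of the which()-and-subset idea for a single seed row, here is a minimal sketch. It rebuilds the example table from the question; the half-window of 2500 (a 5000 window centred on the seed position) is an assumption taken from the window described in the question, and the names `seed`, `keep` and `half_window` are made up for this example.

```r
# Example table from the question
input <- data.frame(
  Name     = c("BD10", "BD11", "BD12", "BD13", "BD14", "BD22", "BD23"),
  FC       = c(0.1, 0.1, 0.2, 0.7, 0.4, 0.1, 0.2),
  chr      = c("chr1", "chr2", "chr3", "chr3", "chr3", "chr3", "chr4"),
  Position = c(1000, 1000, 2000, 3000, 4000, 7000, 1000),
  stringsAsFactors = FALSE
)

half_window <- 2500  # assumption: 5000 window centred on the seed position

# First row that passes the FC threshold (the seed row)
seed <- input[which(input$FC >= 0.7)[1], ]

# Rows on the same chromosome and within the window around the seed
keep <- which(input$chr == seed$chr &
              abs(input$Position - seed$Position) <= half_window)

BD13 <- input[keep, ]
print(BD13)
```

With the example data this keeps BD12, BD13 and BD14 and leaves out BD22, whose position (7000) is beyond 5500.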


I have tried:

out <- subset(input, FC >= 0.7)
out$startw <- (out$Position - 2500)
out$endw <- (out$Position + 2500)

library(plyr)
lvl <- dlply(out, .(Name))

for (i in 1:length(lvl)) {
  Neigh1 <- subset(input, input$Position >= lvl[i]$startw & lvl[i]$chr == input$chr)
  Neigh2 <- subset(input, input$Position <= lvl[i]$endw & lvl[i]$chr == input$chr)
}

pdf(sprintf("%s.pdf", [i]))
boxplot(Neigh$Name, Neigh$FC, xlab=[i], ylab="FC", main="[i]")
dev.off()}

But Neigh1 and Neigh2 are empty...
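
A likely reason for the empty results is the single-bracket indexing: `lvl[i]` returns a list of length one rather than the dataframe inside it, so `lvl[i]$startw` is NULL and the comparison has length zero. The sketch below shows the difference. It uses base `split()` as a stand-in for `dlply()`, on the assumption that both give a list of dataframes keyed by Name, and it combines the two window conditions into one subset.

```r
# Example table from the question
input <- data.frame(
  Name     = c("BD10", "BD11", "BD12", "BD13", "BD14", "BD22", "BD23"),
  FC       = c(0.1, 0.1, 0.2, 0.7, 0.4, 0.1, 0.2),
  chr      = c("chr1", "chr2", "chr3", "chr3", "chr3", "chr3", "chr4"),
  Position = c(1000, 1000, 2000, 3000, 4000, 7000, 1000),
  stringsAsFactors = FALSE
)

out <- subset(input, FC >= 0.7)
out$startw <- out$Position - 2500
out$endw   <- out$Position + 2500

# Stand-in for dlply(out, .(Name)): a list of one-row dataframes
lvl <- split(out, out$Name)

print(lvl[1]$startw)    # NULL: lvl[1] is still a list, not a dataframe
print(lvl[[1]]$startw)  # 500:  lvl[[1]] is the dataframe itself

# Comparing against NULL gives a zero-length condition, hence zero rows
empty <- subset(input, Position >= lvl[1]$startw & lvl[1]$chr == chr)

# Double brackets plus a single combined condition give the neighbours
seed  <- lvl[[1]]
neigh <- subset(input, chr == seed$chr &
                       Position >= seed$startw &
                       Position <= seed$endw)
print(neigh)
```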


Just change the structure such that it iterates through out:

for(i in c(1:nrow(out)) {

That way you needn't have a dataframe that changes names.
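
To make the idea concrete, here is one possible sketch of iterating over the rows of `out` and collecting each neighbourhood in a named list instead of creating separately named dataframes. The names `result`, `half_window` and `out_dir` are invented for this example, the ±2500 window is an assumption based on the attempt above, and `barplot()` is swapped in for the plot call in the question. The PDFs go to a temporary directory here; set `out_dir` to "." to write them next to the script.

```r
# Example table from the question
input <- data.frame(
  Name     = c("BD10", "BD11", "BD12", "BD13", "BD14", "BD22", "BD23"),
  FC       = c(0.1, 0.1, 0.2, 0.7, 0.4, 0.1, 0.2),
  chr      = c("chr1", "chr2", "chr3", "chr3", "chr3", "chr3", "chr4"),
  Position = c(1000, 1000, 2000, 3000, 4000, 7000, 1000),
  stringsAsFactors = FALSE
)

half_window <- 2500      # assumption: 5000 window centred on the seed
out_dir     <- tempdir() # assumption: where the PDFs are written

out    <- input[input$FC >= 0.7, ]  # seed rows
result <- list()                    # one dataframe per seed, keyed by Name

for (i in seq_len(nrow(out))) {
  seed <- out[i, ]

  # Same chromosome, and position within the window around the seed
  near  <- input$chr == seed$chr &
           abs(input$Position - seed$Position) <= half_window
  neigh <- input[near, ]
  result[[seed$Name]] <- neigh

  # One PDF per seed, named after the seed row
  pdf(file.path(out_dir, sprintf("%s.pdf", seed$Name)))
  barplot(neigh$FC, names.arg = neigh$Name,
          main = seed$Name, xlab = "Name", ylab = "FC")
  dev.off()
}

print(result$BD13)
```

Using `seq_len(nrow(out))` rather than `1:nrow(out)` also means the loop body is skipped cleanly when no row passes the FC threshold.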

## Error: unexpected '{' in "for(i in c(1:nrow(out))

I didn't give you the entire solution, just the idea of a different way to structure things to get what you want. Literally just throwing that line in there wouldn't work; I'm expecting you to understand and apply the concept.
