Question: How to select values based on specific condition from the matrix in R
MAPK1.6k wrote:

Hi Guys,

I have a large matrix, `mymatrix`, shown below. For each position, I would like to get a list or matrix containing only the nucleotides that have values (i.e., the ones that are not NA), sorted in decreasing order. For example, I want the result in one of these formats:

In the form of a matrix:

`pos 1611111     T(17)  C(1)`

`pos 99022222        G(24)      A(3)`

or in the form of a list:

`pos 1611111`

`T                    C`

`17                   1`

`pos 99022222`

`G                    A`

`24                   3`

and so forth. Thank you.

`mymatrix`

`pos          A    C    G    T    N`

`1611111      NA   1    NA   17   NA`

`99022222     3    NA   24   NA   NA`

`99092333     NA   5    NA   91   NA`

`233232333    2    22   NA   NA   NA`


R • 1.4k views
modified 5.3 years ago by Steven Lakin1.5k • written 5.3 years ago by MAPK1.6k

How large of a matrix are we talking here?  An efficient solution might be needed if it is too large.  Otherwise, this problem is relatively easy and I will answer it when you reply.

It's a fairly large matrix. Thank you!

What dimensions?

dim(mymatrix)

Right now my matrix is 6023 by 8.

Ok, that's not too bad.  I will write a quick answer for it in a sec.

Thank you, I would really appreciate that!

Steven Lakin1.5k wrote:

This is by no means pretty code, but it should work for you. It writes the result in tab-delimited format to the output file you name. You can then read that file back into R using read.table() with "sep" set to "\t". I didn't include the decreasing order since it is late here, but with a little imagination, you could probably add it.

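```
transformMyMatrix <- function(mymatrix, outputFile) {
  for (i in seq_len(nrow(mymatrix))) {
    # Start each output row with the position label, e.g. "pos(1611111)"
    temp <- paste0("pos(", mymatrix[i, "pos"], ")")
    for (j in 2:ncol(mymatrix)) {
      # Append "nucleotide(count)" for every non-NA entry in this row
      if (!is.na(mymatrix[i, j])) {
        temp <- c(temp, paste0(colnames(mymatrix)[j], "(", mymatrix[i, j], ")"))
      }
    }
    # append=TRUE adds one line per position, so start from a fresh output file
    write.table(t(as.matrix(temp)), file=outputFile, sep="\t", append=TRUE,
                quote=FALSE, row.names=FALSE, col.names=FALSE)
  }
}
```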

Then call the function:

`transformMyMatrix(mymatrix, "outputFile.txt")`

For example, here is what I get with that:

```
> mymatrix
        pos  A  C  G  T  N
1   1611111 NA  1 NA 17 NA
2  99022222  3 NA 24 NA NA
3  99092333 NA  5 NA 91 NA
4 233232333  2 22 NA NA NA
```

```
> transformMyMatrix(mymatrix, outputFile="newMatrix.txt")
> newMatrix <- read.table("newMatrix.txt", sep="\t")
> newMatrix
              V1   V2    V3
1   pos(1611111) C(1) T(17)
2  pos(99022222) A(3) G(24)
3  pos(99092333) C(5) T(91)
4 pos(233232333) A(2) C(22)
```
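The decreasing order that the function leaves out could be added by sorting each row's non-NA counts before pasting them together. Here is a minimal sketch of that idea, assuming `mymatrix` is a data frame (or a matrix with column names) that has a `pos` column. The helper name `sortedCounts` is made up for illustration and is not part of the function above.

```r
# Hypothetical helper: for each row, keep the non-NA nucleotide counts
# and sort them in decreasing order. Returns a list named by position.
sortedCounts <- function(mymatrix) {
  nucleotides <- setdiff(colnames(mymatrix), "pos")
  result <- lapply(seq_len(nrow(mymatrix)), function(i) {
    # unlist() turns a one-row data frame slice into a named vector
    counts <- unlist(mymatrix[i, nucleotides])
    sort(counts[!is.na(counts)], decreasing = TRUE)
  })
  names(result) <- paste0("pos", mymatrix[, "pos"])
  result
}

# Example data copied from the question
mymatrix <- data.frame(
  pos = c(1611111L, 99022222L, 99092333L, 233232333L),
  A = c(NA, 3, NA, 2),
  C = c(1, NA, 5, 22),
  G = c(NA, 24, NA, NA),
  T = c(17, NA, 91, NA),
  N = c(NA, NA, NA, NA)
)

# List form: one named, sorted vector per position
res <- sortedCounts(mymatrix)

# Matrix-style form: one "T(17) C(1)"-like string per position
formatted <- vapply(res,
                    function(x) paste0(names(x), "(", x, ")", collapse = " "),
                    character(1))
```

`res` corresponds to the list form asked for in the question, and `formatted` to the one-line-per-position form. A plain loop over 6023 rows like this should be fast enough; for much larger inputs a vectorized approach might be worth considering.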

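Be aware that if the rows end up with different numbers of columns, reading the file back into R as a data frame is not straightforward: read.table() needs fill=TRUE to accept the uneven rows, and the shorter rows are then padded with empty or NA entries, so you might have to address that.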
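To make the uneven-row case concrete, here is a small sketch that reads such a file back in. The file contents are made up for illustration (the third row, with three nucleotides, is not from the data above). It uses `count.fields()` to find the widest row and `fill = TRUE` to pad the shorter ones.

```r
# Write a small tab-delimited file with rows of unequal length.
# The contents are illustrative only, not taken from the original data.
tmp <- tempfile(fileext = ".txt")
writeLines(c("pos(1611111)\tT(17)\tC(1)",
             "pos(99022222)\tG(24)\tA(3)",
             "pos(5)\tA(9)\tC(4)\tG(2)"), tmp)

# Number of tab-separated fields on each line
nFields <- count.fields(tmp, sep = "\t")

# fill = TRUE pads short rows; na.strings = "" turns the padded blanks into NA.
# Giving col.names explicitly makes the column count match the widest row.
newMatrix <- read.table(tmp, sep = "\t", fill = TRUE,
                        col.names = paste0("V", seq_len(max(nFields))),
                        na.strings = "", stringsAsFactors = FALSE)
```

Passing `col.names` based on the widest row is a deliberate choice here: without it, read.table() guesses the number of columns from only the first few lines of the file, which can go wrong when a wider row appears later.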