Question: How to select values from a matrix based on a specific condition in R
MAPK (United States) wrote 4.8 years ago:

Hi Guys,

I have a large matrix, mymatrix, shown below. Is there a way to get, for each position, only the nucleotides that have values (i.e. the ones that are not NA), in decreasing order, as either a list or a matrix? For example, I want the result in one of these formats:

In the form of a matrix:

pos 1611111     T(17)  C(1)

pos 99022222    G(24)  A(3)

or in the form of a list:

pos 1611111
T    C
17   1

pos 99022222
G    A
24   3

and so forth. Thank you.

 

mymatrix

pos        A   C   G   T   N
1611111    NA  1   NA  17  NA
99022222   3   NA  24  NA  NA
99092333   NA  5   NA  91  NA
233232333  2   22  NA  NA  NA

  


How large a matrix are we talking about here? If it is too large, an efficient solution might be needed. Otherwise, this problem is relatively easy and I will answer it when you reply.

written 4.8 years ago by Steven Lakin

It's a fairly large matrix. Thank you!

written 4.8 years ago by MAPK

What dimensions?

dim(mymatrix)

written 4.8 years ago by Steven Lakin

Right now my matrix is 6023 by 8.

written 4.8 years ago by MAPK

OK, that's not too bad. I will write a quick answer for it in a sec.

written 4.8 years ago by Steven Lakin

Thank you, I would really appreciate that!

written 4.8 years ago by MAPK
Steven Lakin (Fort Collins, CO, USA) wrote 4.8 years ago:

This is not pretty code by any means, but it should work for you. It writes the result in tab-delimited format to the file you name in outputFile. You can then read that file back into R using read.table() with "sep" set to "\t". I didn't include the decreasing order since it is late here, but with a little imagination you could probably add it.

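transformMyMatrix <- function(mymatrix, outputFile) {
        # Start from an empty file, because rows are appended one at a time below
        if (file.exists(outputFile)) file.remove(outputFile)
        for (i in seq_len(nrow(mymatrix))) {
                # First field: the position, e.g. "pos(1611111)"
                temp <- paste0("pos(", mymatrix[i, "pos"], ")")
                for (j in 2:ncol(mymatrix)) {
                        # Keep only the nucleotides that have a count
                        if (!is.na(mymatrix[i, j])) {
                                # colnames() works for both a matrix and a data frame
                                temp <- c(temp, paste0(colnames(mymatrix)[j], "(", mymatrix[i, j], ")"))
                        }
                }
                # Write this position as one tab-delimited line
                write.table(t(as.matrix(temp)), file = outputFile, sep = "\t", append = TRUE, quote = FALSE, row.names = FALSE, col.names = FALSE)
        }
}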

 

Then call the function:

transformMyMatrix(mymatrix, "outputFile.txt")

 

 

For example, here is what I get with that:

> mymatrix
        pos  A  C  G  T  N
1   1611111 NA  1 NA 17 NA
2  99022222  3 NA 24 NA NA
3  99092333 NA  5 NA 91 NA
4 233232333  2 22 NA NA NA

 

> transformMyMatrix(mymatrix, outputFile="newMatrix.txt")
> newMatrix <- read.table(file="newMatrix.txt", sep="\t")
> newMatrix
              V1   V2    V3
1   pos(1611111) C(1) T(17)
2  pos(99022222) A(3) G(24)
3  pos(99092333) C(5) T(91)
4 pos(233232333) A(2) C(22)
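
The decreasing order left out above could be added along these lines. This is only a sketch, not the code from the answer: sortedCounts() is a hypothetical helper name, the data frame is the four example rows from this thread, and it assumes "pos" is the first column and every other column is a nucleotide count.

```r
# Example data copied from the thread; 'pos' is assumed to be the first column.
mymatrix <- data.frame(
  pos = c(1611111, 99022222, 99092333, 233232333),
  A = c(NA, 3, NA, 2),
  C = c(1, NA, 5, 22),
  G = c(NA, 24, NA, NA),
  T = c(17, NA, 91, NA),
  N = rep(NA_real_, 4)
)

# sortedCounts() is a hypothetical helper, not part of the original answer.
# It returns one named vector per position: NAs dropped, largest count first.
sortedCounts <- function(x) {
  counts <- as.matrix(x[, -1, drop = FALSE])
  result <- lapply(seq_len(nrow(counts)), function(i) {
    row <- counts[i, ]
    sort(row[!is.na(row)], decreasing = TRUE)
  })
  # format() avoids scientific notation for large positions
  names(result) <- paste0("pos", format(x[, "pos"], scientific = FALSE, trim = TRUE))
  result
}

countsList <- sortedCounts(mymatrix)

# Collapse each element into the "T(17) C(1)" style asked for in the question.
countsText <- vapply(countsList, function(v) {
  if (length(v) == 0) return("")  # position with no counts at all
  paste0(names(v), "(", v, ")", collapse = " ")
}, character(1))

print(countsList[["pos1611111"]])
print(countsText)
```

countsList is the list form from the question and countsText is the one-line-per-position form. With 6023 rows, either should run quickly.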

 

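Be aware that the rows can end up with different numbers of fields, because each position keeps only its non-NA nucleotides. When you read the file back into R as a data frame, read.table() only pads the short rows if you pass fill=TRUE, and it decides how many columns to create from the first five lines of the file, so you might have to address that.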
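
One way to handle rows of different lengths when reading the file back is sketched here. The file contents are made up for illustration (the third row is not from the thread), and the V1, V2, ... column names are just an assumption that mirrors read.table()'s defaults. count.fields() finds the widest row so that col.names can size the table, and fill = TRUE pads the shorter rows.

```r
# Made-up file for illustration: rows with 3, 2 and 4 fields.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "pos(1611111)\tC(1)\tT(17)",
  "pos(99022222)\tA(3)",
  "pos(5)\tA(1)\tC(2)\tG(9)"
), tmp)

# Widest row in the whole file, not just in the first five lines
nFields <- max(count.fields(tmp, sep = "\t"))

newMatrix <- read.table(
  tmp,
  sep = "\t",
  fill = TRUE,                                 # pad short rows
  col.names = paste0("V", seq_len(nFields)),   # force enough columns
  na.strings = c("NA", ""),                    # treat padded blanks as NA
  stringsAsFactors = FALSE
)

print(newMatrix)
```

Passing col.names matters for a file of several thousand lines, since a row wider than anything in the first five lines would otherwise not fit the table.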


Thank you so much!

written 4.8 years ago by MAPK