Question: Store a scRNA-seq gene expression "matrix.txt" as a "matrix.mtx" in R
choijamtsm wrote, 7 months ago:

Hello everyone!

I am new here. I downloaded a very large scRNA-seq .txt file from a paper, and I want to analyze this data in R with the "Seurat" package. But in order to do that, I have to convert this .txt file into an .mtx file using the R Matrix package.

I read the scRNA-seq data like this:

library(Matrix)

micedata <- read.table(file = "mice.txt", header = TRUE, sep = "")

typeof(micedata)
#"list"

dim(micedata)
#[1] 14699 37070

And micedata looks like this:

head(micedata[1:5, 1:3])

    GENE Aging_mouse_brain_portal_data_6_AAACCTGAGGCCCTTG Aging_mouse_brain_portal_data_6_AAACGGGAGAGACGAA
1  Sox17                                         0.000000                                                0
2 Mrpl15                                         1.340271                                                0
3 Lypla1                                         0.000000                                                0

The problem is that when I try to create a sparse matrix, I get the following error:

# save sparse matrix
sparse.micedata <- Matrix(micedata, sparse = T )

Error in storage.mode(from) <- "double" : 
  (list) object cannot be coerced to type 'double'

I have already written genes.tsv and barcodes.tsv:

# save genes and cells names
write(x = rownames(micedata), file = "genes.tsv") 
write(x = colnames(micedata), file = "barcodes.tsv")

How can I convert micedata to a sparse matrix? Thank you.

rna-seq next-gen R • 1.1k views

I want to analyze this seq data with R ~ "Seurat" package. But in order to do that, I have to convert this .txt format into .mtx file using R Matrix() package!!

You actually don't have to convert. Seurat will take any matrix. Read10X is a helper function if you are starting with 10x data. Otherwise, just skip to the CreateSeuratObject function.
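
A minimal sketch of that route, assuming the Seurat package is installed and using a small made-up table in place of mice.txt (the cell names and values here are hypothetical, and the project name is arbitrary):

```r
library(Matrix)

# Toy stand-in for the table read from mice.txt (values are made up)
micedata <- data.frame(
  GENE = c("Sox17", "Mrpl15", "Lypla1"),
  cell_A = c(0, 1.340271, 0),
  cell_B = c(0, 0, 2.5)
)

# Keep only the numeric columns and use the gene names as rownames
counts <- as.matrix(micedata[, -1])
rownames(counts) <- micedata$GENE
counts <- Matrix(counts, sparse = TRUE)

# Hand the matrix straight to Seurat; no .mtx file is written
if (requireNamespace("Seurat", quietly = TRUE)) {
  mice <- Seurat::CreateSeuratObject(counts = counts, project = "mice")
}
```

The .mtx conversion is only needed if the data has to be stored or shared in 10x format.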

modified 7 months ago • written 7 months ago by igor

Great point, Igor.

written 7 months ago by Kevin Blighe
Kevin Blighe wrote, 7 months ago:

Hey,

You should set the GENE column as the rownames, then remove that column from the main data, and then try again. Possibly, you'll have to additionally coerce it via as.matrix() (see below).

rownames(micedata) <- micedata$GENE
micedata <- micedata[,-1]
Matrix(as.matrix(micedata))
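
To see why the original call failed and what the fix changes, here is a small sketch on a made-up three-gene table (toy values, hypothetical cell names). A data frame is a list of columns, and the character GENE column blocks coercion to a numeric matrix until it is moved into the rownames.

```r
library(Matrix)

toy <- data.frame(
  GENE = c("Sox17", "Mrpl15", "Lypla1"),
  cell_A = c(0, 1.340271, 0),
  cell_B = c(0, 0, 0),
  stringsAsFactors = FALSE
)

# The GENE column is character, so the table is not purely numeric
col_classes <- sapply(toy, class)

# Move gene names to rownames, drop the column, then coerce
rownames(toy) <- toy$GENE
toy <- toy[, -1]
toy_sparse <- Matrix(as.matrix(toy), sparse = TRUE)

# Fraction of entries that are zero, which is what the sparse format exploits
zero_fraction <- 1 - nnzero(toy_sparse) / prod(dim(toy_sparse))
```

Checking class(toy_sparse) and dim(toy_sparse) afterwards is a quick way to confirm that genes are still rows and cells are still columns.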

I have had to do this, too, quite recently. Basically, you can store your matrix as a sparse matrix and then write it out as the 10x MTX format via DropletUtils. Something along these lines:

library(DropletUtils)
library(Matrix)

# micedata: the numeric table from above, with genes as rownames
x <- Matrix(as.matrix(micedata), sparse = TRUE)
message('--writing 10x output')
write10xCounts(
  path = 'mystudy/out',
  x = x,
  overwrite = FALSE)
message('--done')

It will assume that the barcodes are the colnames (and genes the rownames), but I think that you can specify these separately, too.
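
A sketch of passing the names separately, assuming DropletUtils is installed. The barcodes, gene.id and gene.symbol arguments belong to write10xCounts; the toy matrix, the identifiers and the output directory below are made up for illustration.

```r
library(Matrix)

# Toy sparse matrix without dimnames (values are made up)
x <- Matrix(c(0, 1, 0, 0, 0, 2), nrow = 3, sparse = TRUE)

# Hypothetical identifiers, supplied separately from the matrix
genes <- c("Sox17", "Mrpl15", "Lypla1")
cells <- c("AAACCTGAGGCCCTTG", "AAACGGGAGAGACGAA")

# Temporary output directory, used here only for the demonstration
out_dir <- file.path(tempdir(), "toy10x")

if (requireNamespace("DropletUtils", quietly = TRUE)) {
  DropletUtils::write10xCounts(
    path = out_dir,
    x = x,
    barcodes = cells,
    gene.id = genes,
    gene.symbol = genes,
    overwrite = TRUE
  )
}
```

The lengths of genes and cells have to match nrow(x) and ncol(x) respectively.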

Kevin

modified 7 months ago • written 7 months ago by Kevin Blighe

Thanks for your response.

It works well until this point.

x <- Matrix(as.matrix(micedata), sparse = TRUE)

But can you explain the code below? When I run it, I get the following error:

message('--writing 10x output to: file_location ', paste0('mystudy/', gsmid))
write10xCounts(
  path = paste0('mystudy/', gsmid),
  x = x,
  overwrite = FALSE)
message('--done')

Error in paste0("mystudy/", gsmid) : object 'gsmid' not found

What is gsmid? I did load the package with require(DropletUtils).

Alternatively, I also tried writing the .mtx file as follows:

writeMM(obj = x, file="matrix.mtx")

After it completed, I had a 1.4 GB matrix.mtx file.

Then I tried to load the data using:

mice <- Read10X(data.dir = "file_location", gene.column = 1)
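
For this manual route, here is a sketch of writing the matrix and its two name files into one directory and reading them back, using only the Matrix package. The toy values and the temporary directory are assumptions. As far as I know, Read10X looks for files named matrix.mtx, genes.tsv (or features.tsv) and barcodes.tsv in data.dir, so the names have to be written after the gene names have been moved into the rownames.

```r
library(Matrix)

# Toy sparse matrix with genes as rownames and barcodes as colnames
x <- Matrix(c(0, 1, 0, 0, 0, 2), nrow = 3, sparse = TRUE,
            dimnames = list(c("Sox17", "Mrpl15", "Lypla1"),
                            c("AAACCTGAGGCCCTTG", "AAACGGGAGAGACGAA")))

out_dir <- file.path(tempdir(), "mice_mtx")
dir.create(out_dir, showWarnings = FALSE)

# writeMM stores only the values; the names go into the two side files
writeMM(obj = x, file = file.path(out_dir, "matrix.mtx"))
write(x = rownames(x), file = file.path(out_dir, "genes.tsv"))
write(x = colnames(x), file = file.path(out_dir, "barcodes.tsv"))

# Read everything back to check that dimensions and names still line up
m <- readMM(file.path(out_dir, "matrix.mtx"))
genes <- readLines(file.path(out_dir, "genes.tsv"))
barcodes <- readLines(file.path(out_dir, "barcodes.tsv"))
```

With these three files in place, a call like Read10X(data.dir = out_dir, gene.column = 1) should pick them up, assuming Seurat is installed.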

Thank you

modified 7 months ago • written 7 months ago by choijamtsm

Great, so, it worked fine? gsmid is just a sample ID (from GEO) that I was working with. I have edited the code so it will run smoothly for others.

Edit: in fact, I just removed gsmid from my answer, to avoid any other problems.

modified 7 months ago • written 7 months ago by Kevin Blighe

It worked well. Thank you so much.

written 7 months ago by choijamtsm