Question: Storing a matrix.mtx file as a gene expression matrix in CSV/TXT format
prgrmmr70 wrote:

Hi,

I have single-cell RNA-seq data in matrix.mtx format, downloaded from 10x Genomics. I want to store it as a sparse (with a lot of zeros) read-count file in txt format. How can I do that?

rna-seq
written 7 months ago by prgrmmr70
Mensur Dlakic (USA) wrote:

Your question is unclear. If you want to store the matrix in a sparse format, that would be the one without any explicitly stored zeros. I am assuming that you already have a matrix in sparse (MatrixMarket) format, but want to convert it into dense format. You can clear this up by showing us the first line of your file:

head -1 matrix.mtx

Your matrix is already sparse if the screen output is something like this:

%%MatrixMarket matrix coordinate integer general

If so, this thread explains the conversion. If you actually have a dense matrix (with lots of zeros) and want to convert it into sparse format, this thread will show you how. If needed, I have a custom Python script for the dense -> sparse task as well.
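The same header check can be done from Python. This is only a rough sketch, assuming SciPy is installed: `scipy.io.mminfo` reads the header without loading the entries. The helper name `describe_mtx`, the 4-bytes-per-value estimate, and the tiny demo matrix are made up for illustration and are not part of any 10x tooling.

```python
import os
import tempfile

import numpy as np
from scipy import io, sparse


def describe_mtx(path, bytes_per_value=4):
    """Summarise a MatrixMarket file from its header, without loading entries."""
    rows, cols, stored, fmt, field, symmetry = io.mminfo(path)
    return {
        "rows": rows,
        "cols": cols,
        "stored_entries": stored,
        "format": fmt,  # "coordinate" means sparse, "array" means dense
        "field": field,
        "fraction_stored": stored / (rows * cols),
        # Rough size of the dense version, assuming bytes_per_value per entry.
        "dense_bytes": rows * cols * bytes_per_value,
    }


# Tiny made-up matrix (4 genes x 3 cells) standing in for a real matrix.mtx.
demo = sparse.coo_matrix(np.array([[0, 2, 0], [0, 0, 0], [5, 0, 1], [0, 0, 0]]))
demo_path = os.path.join(tempfile.mkdtemp(), "matrix.mtx")
io.mmwrite(demo_path, demo)

info = describe_mtx(demo_path)
print(info)
```

Comparing `dense_bytes` with the available RAM gives a quick idea of whether a dense conversion is realistic at all.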

written 7 months ago by Mensur Dlakic

Hi Mensur, thank you for your response. Yes, I mean that I need dense data (by your definition, data with a lot of zeros) to use in deep learning algorithms. I followed the instructions in "this thread" that you shared, and there were two options. One uses Python, but it only reads the data and does not do any conversion. The other option is using Cell Ranger, right? If so, should I install it on Unix/Linux?

reply written 7 months ago by prgrmmr70

SciPy can read and write MatrixMarket files, and the page I referenced earlier already has an example of how to read the matrix. Once loaded, SciPy can also convert it to a dense matrix or a NumPy array. Note that you will need a very large amount of memory for this conversion, and it may in fact be impossible depending on your computer's RAM. Assuming the conversion works, you'll probably want to save the result as a compressed array because the file will be huge.
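A minimal sketch of that read, densify, and save sequence, assuming SciPy and NumPy are available. The function name `mtx_to_dense_npz`, the file names, and the 4 x 3 demo matrix are placeholders for illustration; a real 10x matrix may not fit in memory at the `toarray()` step.

```python
import os
import tempfile

import numpy as np
from scipy import io, sparse


def mtx_to_dense_npz(mtx_path, out_path):
    """Read a MatrixMarket file, densify it, and save it as compressed .npz."""
    counts = io.mmread(mtx_path)  # sparse COO matrix for "coordinate" files
    # toarray() allocates rows * cols values; this is the memory-hungry step.
    dense = counts.toarray() if sparse.issparse(counts) else np.asarray(counts)
    np.savez_compressed(out_path, counts=dense)
    return dense


# Tiny made-up matrix (4 genes x 3 cells) standing in for a real matrix.mtx.
workdir = tempfile.mkdtemp()
mtx_path = os.path.join(workdir, "matrix.mtx")
demo = sparse.coo_matrix(np.array([[0, 2, 0], [0, 0, 0], [5, 0, 1], [0, 0, 0]]))
io.mmwrite(mtx_path, demo)

npz_path = os.path.join(workdir, "counts_dense.npz")
dense = mtx_to_dense_npz(mtx_path, npz_path)

# Load the compressed array back to confirm the round trip.
reloaded = np.load(npz_path)["counts"]
print(dense.shape, reloaded.sum())
```

The compressed `.npz` file stays small because long runs of zeros compress well, but the array still expands to its full dense size once loaded.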

I am explaining how to do this because you asked, but I recommend against it. All machine learning (ML) tools will struggle with dense datasets of this size, especially given its level of sparsity. Almost all types of modern ML models - (extreme) gradient boosting, random forests, even support vector machines - work with sparse matrices without any conversion. If you absolutely require dense data, I suggest truncated SVD for converting the sparse matrix into low-dimensional dense data.
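As an illustration of the truncated SVD suggestion, here is a sketch that uses `scipy.sparse.linalg.svds` directly on the sparse matrix. The helper name `truncated_svd_embedding`, the choice of 5 components, and the random demo counts are assumptions made for the example. No centering or normalisation is applied, so this is plain truncated SVD rather than PCA.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import svds


def truncated_svd_embedding(samples_by_features, n_components):
    """Project a sparse samples x features matrix onto its top singular vectors."""
    matrix = sparse.csr_matrix(samples_by_features, dtype=np.float64)
    # With the default solver, svds needs 1 <= n_components < min(matrix.shape).
    u, s, _vt = svds(matrix, k=n_components)
    order = np.argsort(s)[::-1]  # put the largest singular value first
    return u[:, order] * s[order], s[order]


# Made-up counts: 50 genes x 200 cells, mostly zeros.
rng = np.random.default_rng(0)
genes_by_cells = sparse.csr_matrix(rng.poisson(0.1, size=(50, 200)))

# Transpose so that each row is a cell (one sample per row) before reducing.
embedding, singular_values = truncated_svd_embedding(genes_by_cells.T, n_components=5)
print(embedding.shape)
```

The result is a small dense array with one row per cell, which is usually far easier to feed to a model than the full dense count matrix.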

reply written 7 months ago by Mensur Dlakic

The reason I need a dense matrix is that I need to have genes in the rows and cells in the columns. So, what would you suggest for getting the data in this layout?

reply written 7 months ago by prgrmmr70
Kevin Blighe (Republic of Ireland) wrote:

Hi, if using R, DropletUtils is what you need: https://www.bioconductor.org/packages/release/bioc/html/DropletUtils.html

Kevin

written 7 months ago by Kevin Blighe
swbarnes2 (United States) wrote:

A "sparse" matrix format does not store a lot of zeros. 10X data in the three-file output format is already sparse. If you want not-sparse data, cellranger has a mat2csv function.
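For anyone without Cell Ranger installed, a similar genes-in-rows, cells-in-columns CSV can be sketched in Python. This is not how `mat2csv` itself works; it is an illustrative alternative that assumes the gene and barcode labels have already been read into lists (for 10x data they come from the two TSV files that sit next to matrix.mtx). The function name `write_dense_csv`, the `chunk_rows` parameter, and the demo labels are made up.

```python
import csv
import os
import tempfile

import numpy as np
from scipy import io, sparse


def write_dense_csv(mtx_path, genes, barcodes, out_path, chunk_rows=1000):
    """Write a genes x cells CSV, densifying only chunk_rows genes at a time."""
    counts = sparse.csr_matrix(io.mmread(mtx_path))  # CSR slices rows cheaply
    if counts.shape != (len(genes), len(barcodes)):
        raise ValueError("labels do not match the matrix dimensions")
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["gene"] + list(barcodes))
        for start in range(0, counts.shape[0], chunk_rows):
            # Only this block of rows is ever held in dense form.
            block = counts[start:start + chunk_rows].toarray()
            for gene, row in zip(genes[start:start + chunk_rows], block):
                writer.writerow([gene] + row.tolist())


# Tiny made-up dataset (4 genes x 3 cells) with placeholder labels.
workdir = tempfile.mkdtemp()
mtx_path = os.path.join(workdir, "matrix.mtx")
demo = sparse.coo_matrix(np.array([[0, 2, 0], [0, 0, 0], [5, 0, 1], [0, 0, 0]]))
io.mmwrite(mtx_path, demo)
genes = ["GeneA", "GeneB", "GeneC", "GeneD"]
barcodes = ["CELL1-1", "CELL2-1", "CELL3-1"]

csv_path = os.path.join(workdir, "counts_dense.csv")
write_dense_csv(mtx_path, genes, barcodes, csv_path, chunk_rows=2)

with open(csv_path, newline="") as handle:
    rows = list(csv.reader(handle))
print(rows[0])
```

Writing in row chunks keeps peak memory low, but the CSV on disk will still be very large for a real dataset because every zero is written out.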

written 7 months ago by swbarnes2