Question: Storing a matrix.mtx as a gene expression matrix in CSV/TXT format


prgrmmr70 • **0** wrote:

Hi,

I have single-cell RNA-seq data in matrix.mtx format, downloaded from 10x Genomics. I want to store it as a sparse (with a lot of zeros) read-count file in txt format. How can I do that?


Mensur Dlakic • **8.2k** wrote:

Your question is unclear. If you want to store the matrix in a sparse format, that would be the one without any zeros, because only the non-zero entries are written. I am assuming that you already have a matrix in sparse (MatrixMarket) format, but want to convert it into dense format. You can clear this up by showing us the first line of your file:

```
head -1 matrix.mtx
```

Your matrix is already sparse if the screen output is something like this:

```
%%MatrixMarket matrix coordinate integer general
```

If so, **this thread** explains the conversion. If you actually have a dense matrix (with lots of zeros) and want to convert it into sparse format, **this thread** will show you how. If needed, I have a custom Python script for the `dense -> sparse` task as well.

Hi Mensur, thank you for your response. Yes, I mean I need dense data (by your definition, data with a lot of zeros) to use in deep learning algorithms. I followed the instructions in the thread you shared with me, and there were two options. One uses Python, but it only reads the data and does not do any conversion. The other option uses Cell Ranger, right? If so, should I install it on Unix/Linux?

SciPy can **read** and **write** MatrixMarket files, and the page I referenced before already has an example of how to read the matrix. Once it is loaded, SciPy can also convert it to a **dense matrix** or a **numpy array**. Note that you will need a very large amount of memory for this conversion, and in fact it may be impossible depending on your computer's RAM. Assuming the conversion works, you'll probably want to save the result as a **compressed array** because the file will be huge.
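To make the steps above concrete, here is a minimal sketch, not a tested pipeline. It uses `scipy.io.mminfo` to estimate how much RAM a dense copy would need before anything is loaded, then reads the file, densifies it, and saves it with `numpy.savez_compressed`. The function names, the `int32` dtype, and the tiny demo matrix are assumptions made for illustration; replace the demo file with your own `matrix.mtx`.

```python
import os
import tempfile

import numpy as np
from scipy import sparse
from scipy.io import mminfo, mmread, mmwrite


def dense_size_gb(mtx_path, dtype=np.int32):
    """Estimate the RAM a dense copy would need, reading only the file header."""
    rows, cols, entries, fmt, field, symmetry = mminfo(mtx_path)
    return rows * cols * np.dtype(dtype).itemsize / 1e9


def mtx_to_dense_npz(mtx_path, out_path, dtype=np.int32):
    """Read a MatrixMarket file, densify it, and save it as a compressed array."""
    counts = mmread(mtx_path)                  # coordinate files load as a SciPy sparse matrix
    dense = counts.toarray().astype(dtype)     # this is the memory-hungry step
    np.savez_compressed(out_path, counts=dense)
    return dense


# Tiny stand-in for matrix.mtx so the sketch runs on its own (hypothetical data).
workdir = tempfile.mkdtemp()
demo_mtx = os.path.join(workdir, "matrix.mtx")
mmwrite(demo_mtx, sparse.coo_matrix(np.array([[0, 3, 0], [1, 0, 0]])))

npz_path = os.path.join(workdir, "counts.npz")
print("estimated dense size (GB):", dense_size_gb(demo_mtx))
dense = mtx_to_dense_npz(demo_mtx, npz_path)
print(dense)
```

The saved array can be loaded back with `np.load(npz_path)["counts"]`. If the size estimate is anywhere near your available RAM, the conversion is unlikely to succeed.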

I am explaining how to do this because you asked, but I recommend against it. All machine learning (ML) tools will struggle with dense datasets of this size, especially given its level of sparsity. Almost all types of modern ML models - (extreme) gradient boosting, random forests, even support vector machines - work with sparse matrices without any conversion. If you absolutely require dense data, I suggest **truncated SVD** for converting the sparse matrix into low-dimension dense data.
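As a rough illustration of the truncated SVD suggestion, the sketch below reduces a sparse matrix to a small dense one with `scipy.sparse.linalg.svds`. This is one possible way to do it and an assumption on my part; scikit-learn's `TruncatedSVD` is another option. The function name, the random stand-in data, and the choice of 10 components are all hypothetical.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import svds


def reduce_with_truncated_svd(matrix, n_components=10):
    """Project the rows of a sparse matrix onto its top singular vectors."""
    # svds needs floating-point input and k < min(matrix.shape).
    u, s, vt = svds(matrix.astype(np.float64), k=n_components)
    order = np.argsort(s)[::-1]        # sort components by decreasing singular value
    return u[:, order] * s[order]      # dense array, shape (n_rows, n_components)


# Random stand-in for a cells x genes count matrix (hypothetical data).
cells_by_genes = sparse.random(200, 50, density=0.05, format="csr", random_state=0)
embedding = reduce_with_truncated_svd(cells_by_genes, n_components=10)
print(embedding.shape)
```

A 10x `matrix.mtx` stores genes as rows and cells as columns, so for a per-cell embedding you would pass the transpose, for example `mmread("matrix.mtx").T.tocsr()`. The sparse matrix is never densified here; only the small embedding is dense.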


Kevin Blighe ♦ **69k** wrote:

Hi, if using R, *DropletUtils* is what you need:
https://www.bioconductor.org/packages/release/bioc/html/DropletUtils.html

Kevin


swbarnes2 ♦ **9.4k** wrote:

A matrix stored in "sparse" format does **not** contain a lot of zeros, because the zeros are simply not written. 10x data in the three-file output format is already sparse. If you want non-sparse data, Cell Ranger has a `mat2csv` command.
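If installing Cell Ranger is not an option, something similar can be approximated in Python. The sketch below is not Cell Ranger's implementation and makes no claim to match its output format. It assumes an uncompressed three-file directory (`matrix.mtx`, `features.tsv`, `barcodes.tsv`) and writes a genes x cells CSV one gene at a time, so the full dense matrix is never held in memory. The function names and the demo gene IDs and barcodes are hypothetical.

```python
import csv
import os
import tempfile

import numpy as np
from scipy import sparse
from scipy.io import mmread, mmwrite


def read_first_column(tsv_path):
    """Return the first tab-separated field of every line (gene ID or barcode)."""
    with open(tsv_path, newline="") as handle:
        return [row[0] for row in csv.reader(handle, delimiter="\t")]


def tenx_to_csv(matrix_dir, out_csv):
    """Write a genes x cells CSV, densifying one gene (row) at a time."""
    counts = mmread(os.path.join(matrix_dir, "matrix.mtx")).tocsr()
    genes = read_first_column(os.path.join(matrix_dir, "features.tsv"))
    barcodes = read_first_column(os.path.join(matrix_dir, "barcodes.tsv"))
    with open(out_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["gene"] + barcodes)
        for i, gene in enumerate(genes):
            row = counts[i:i + 1].toarray().ravel()   # one dense row only
            writer.writerow([gene] + row.tolist())


# Tiny hypothetical dataset: 3 genes x 2 cells.
demo_dir = tempfile.mkdtemp()
mmwrite(os.path.join(demo_dir, "matrix.mtx"),
        sparse.coo_matrix(np.array([[0, 2], [5, 0], [0, 0]])))
with open(os.path.join(demo_dir, "features.tsv"), "w") as handle:
    handle.write("ENSG01\tGeneA\tGene Expression\n"
                 "ENSG02\tGeneB\tGene Expression\n"
                 "ENSG03\tGeneC\tGene Expression\n")
with open(os.path.join(demo_dir, "barcodes.tsv"), "w") as handle:
    handle.write("AAAC-1\nTTTG-1\n")

out_csv = os.path.join(demo_dir, "counts.csv")
tenx_to_csv(demo_dir, out_csv)
with open(out_csv, newline="") as handle:
    print(handle.read())
```

For a real dataset the resulting CSV will be very large, since every zero is written out explicitly.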
