7 weeks ago
Nico • 0

Hi everyone, I am having a problem reading a 10x matrix file that I obtained online.

When I tried to read it with Seurat's Read10X or with readMM, I got the following error:

Error in scan(file, nmax = 1, what = what, quiet = TRUE, ...) :
scan() expected 'an integer', got '2319108599'


Below is the output when I looked at the head of the file.

%%MatrixMarket matrix coordinate real general
%
1462702 27943 2319108599
1 18558 6.2000000e+01
1 18565 8.0000000e+00
1 18564 2.9000000e+01
1 18562 9.6000000e+01


Below is the output when I looked at the head of another 10x matrix file where I was able to use Read10X function

%%MatrixMarket matrix coordinate integer general
33538 85144 127119627
33509 1 150
33507 1 16
33506 1 122


I noticed two things:
1. The first file is in "real" and not "integer" format.
2. The matrix seems to be transposed. The second file has 33538 rows and 85144 columns, which in the usual 10x layout are features and cells respectively. In the first file, however, the 1462702 cells I am expecting appear as rows.
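
For anyone comparing files the same way, here is a minimal Python sketch that pulls the field type and the three numbers on the size line out of a MatrixMarket header. The function name `parse_mm_header` is made up for illustration, and the sketch assumes a coordinate-format file with the banner on the first line.

```python
def parse_mm_header(lines):
    """Return (field, n_rows, n_cols, n_entries) from the first lines of a file."""
    lines = iter(lines)
    banner = next(lines).split()
    # Banner layout: %%MatrixMarket matrix coordinate <field> <symmetry>
    if banner[0] != "%%MatrixMarket" or banner[2] != "coordinate":
        raise ValueError("not a MatrixMarket coordinate file")
    field = banner[3]
    for line in lines:
        # Skip comment lines (starting with '%') and blank lines.
        if line.startswith("%") or not line.strip():
            continue
        # The first remaining line is the size line: rows, columns, entries.
        n_rows, n_cols, n_entries = (int(tok) for tok in line.split())
        return field, n_rows, n_cols, n_entries
    raise ValueError("size line not found")


# Headers copied from the two files shown above.
first = parse_mm_header([
    "%%MatrixMarket matrix coordinate real general",
    "%",
    "1462702 27943 2319108599",
])
second = parse_mm_header([
    "%%MatrixMarket matrix coordinate integer general",
    "33538 85144 127119627",
])
print(first)
print(second)
```

The first file has far more rows than columns while the second has the opposite shape, which is what suggests the transposition.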

Does anyone know whether it is fine to transpose the matrix and convert the data type from real to integer, and if so, how?
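
As a rough illustration of what such a conversion would involve per entry, the Python sketch below swaps the row and column index and writes the value as an integer, refusing any value that is not a whole number. `transpose_entry` is a hypothetical helper. The banner and size line would also have to be rewritten (dimensions swapped, `real` changed to `integer`), and whether the conversion is appropriate depends on the values really being raw counts.

```python
def transpose_entry(entry_line):
    """Swap row/column in one coordinate entry and store the value as an integer."""
    row, col, value = entry_line.split()
    number = float(value)
    # Only whole numbers can be converted without losing information.
    if not number.is_integer():
        raise ValueError(f"non-integer value: {value}")
    return f"{col} {row} {int(number)}"


# First entry of the file shown above.
converted = transpose_entry("1 18558 6.2000000e+01")
print(converted)
```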

Thanks,
Nico

RNA-Seq 10x Seurat • 280 views
7 weeks ago
Nico • 0

Fixed by following the code here: https://github.com/satijalab/seurat/issues/4030

It seems the problem was that the matrix was too large; dividing the file into two fixed it.
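
The code in the linked issue is not reproduced here. As a sketch of the general idea only, the Python function below streams a MatrixMarket coordinate file into several smaller files, each with its own size line and at most `max_entries` entries. `split_mm` and the output naming are hypothetical, and the sketch assumes there are no comment or blank lines after the size line.

```python
import os
import tempfile


def split_mm(path, out_prefix, max_entries):
    """Split a coordinate file into parts holding at most max_entries entries each."""
    out_paths = []
    with open(path) as src:
        banner = src.readline()
        line = src.readline()
        # Skip comment lines between the banner and the size line.
        while line.startswith("%"):
            line = src.readline()
        n_rows, n_cols, n_entries = (int(tok) for tok in line.split())
        remaining = n_entries
        while remaining > 0:
            chunk = min(max_entries, remaining)
            out_path = f"{out_prefix}_{len(out_paths) + 1}.mtx"
            with open(out_path, "w") as dst:
                # Each part keeps the original dimensions but its own entry count.
                dst.write(banner)
                dst.write(f"{n_rows} {n_cols} {chunk}\n")
                for _ in range(chunk):
                    dst.write(src.readline())
            out_paths.append(out_path)
            remaining -= chunk
    return out_paths


# Small demonstration on a made-up 4 x 3 matrix with 5 entries.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "matrix.mtx")
    with open(src_path, "w") as fh:
        fh.write("%%MatrixMarket matrix coordinate real general\n%\n")
        fh.write("4 3 5\n1 1 6.2e+01\n2 1 8.0e+00\n3 2 2.9e+01\n")
        fh.write("4 2 9.6e+01\n4 3 1.0e+00\n")
    parts = split_mm(src_path, os.path.join(tmp, "part"), max_entries=3)
    size_lines = []
    for part in parts:
        with open(part) as fh:
            size_lines.append(fh.read().splitlines()[1])
    print(size_lines)
```

For a file like the one above, `max_entries` would be set to something below 2147483647, for example 2000000000. Each part keeps the original dimensions and holds a disjoint subset of the entries; how the parts are combined afterwards depends on the downstream tool.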

7 weeks ago
Gordon Smyth ★ 2.5k

The problem is that R's integer type is 32-bit, so it cannot represent integers larger than 2147483647 (about 2.1e9), and the size line of the file declares 2319108599 entries. The developers of readMM could fix that by reading the number as a real instead of as an integer, but the large number of matrix elements would still cause problems elsewhere. No doubt neither the Seurat nor the readMM developers expected to see files with as many cells as you seem to have.
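
The arithmetic can be checked directly. The short Python sketch below compares the entry count from the size line with the largest 32-bit signed integer (the same value R reports as `.Machine$integer.max`) and confirms that a double-precision real holds the number exactly.

```python
INT32_MAX = 2**31 - 1        # largest 32-bit signed integer
n_entries = 2319108599       # entry count from the size line of the file

exceeds_int32 = n_entries > INT32_MAX
# Doubles represent every whole number up to 2**53 exactly.
fits_in_double = n_entries < 2**53 and float(n_entries) == n_entries

print(INT32_MAX)             # 2147483647
print(exceeds_int32)         # True
print(fits_in_double)        # True
```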